This repository is an official implementation of StreamPETR.
- [2023/07/14] StreamPETR is accepted by ICCV 2023.
- [2023/05/03] StreamPETR-Large is the first online multi-view method to achieve performance (62.0 mAP, 67.6 NDS and 65.3 AMOTA) comparable to the lidar-based baseline.
Please follow our documentation step by step. If you like our work, please recommend it to your colleagues and friends.
Model | Setting | Pretrain | Lr Schd | Training Time | NDS | mAP | FPS-pytorch | Config | Download |
---|---|---|---|---|---|---|---|---|---|
RepDETR3D | EVA02-L - 900q | EVA02-L | 24ep | 12 hours (A100) | 60.8 | 52.1 | - | config | model |
StreamPETR | V2-99 - 900q | FCOS3D | 24ep | 13 hours | 57.1 | 48.2 | 12.5 | config | model/log |
RepDETR3D | V2-99 - 900q | FCOS3D | 24ep | 13 hours | 58.4 | 50.1 | 13.1 | config | model/log |
StreamPETR | R50 - 900q | ImageNet | 90ep | 36 hours | 53.7 | 43.2 | 26.7 | config | model/log |
StreamPETR | R50 - 428q | NuImg | 60ep | 26 hours | 54.6 | 44.9 | 31.7 | config | model/log |
The detailed results can be found in the training log. For other results on the nuScenes val set, please see here.

Notes:
- FPS is measured on an NVIDIA RTX 3090 GPU in FP32 with a batch size of 1 (one sample containing 6 view images), without using flash attention.
- Training time is measured on 8x RTX 2080 Ti GPUs, except where a row names a different GPU.
- RepDETR3D uses deformable attention, which is inspired by DETR3D and Sparse4D.
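
The FPS protocol in the notes above (batch size 1, timed forward passes) can be approximated with a timing loop like the one below. This is a minimal sketch, not the script used to produce the table: `measure_fps`, `step`, `warmup`, `iters` and `sync` are hypothetical names introduced here for illustration. On a GPU, kernels are launched asynchronously, so a synchronization callable (for example `torch.cuda.synchronize`) would typically be passed as `sync`.

```python
import time
from typing import Callable, Optional


def measure_fps(
    step: Callable[[], object],
    warmup: int = 10,
    iters: int = 50,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return iterations per second for `step`, excluding warm-up runs."""
    if iters <= 0:
        raise ValueError("iters must be positive")

    # Warm-up runs are not timed; they absorb lazy initialization and caching.
    for _ in range(warmup):
        step()
    if sync is not None:
        sync()  # let queued asynchronous work finish before the clock starts

    start = time.perf_counter()
    for _ in range(iters):
        step()
    if sync is not None:
        sync()  # wait for queued work before the clock stops
    elapsed = time.perf_counter() - start
    return iters / elapsed


if __name__ == "__main__":
    # Stand-in workload (an assumption); replace it with one forward pass
    # of the model on a single sample.
    fps = measure_fps(lambda: sum(i * i for i in range(10_000)))
    print(f"{fps:.1f} iterations/s")
```

With batch size 1, one iteration corresponds to one sample (6 view images), so iterations per second can be read directly as FPS.
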
Model | Setting | Pretrain | NDS | mAP | AMOTA | AMOTP |
---|---|---|---|---|---|---|
StreamPETR | V2-99 - 900q | DD3D | 63.6 | 55.0 | - | - |
StreamPETR | ViT-Large - 900q | - | 67.6 | 62.0 | 65.3 | 87.6 |
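
For readers unfamiliar with the NDS column, the sketch below shows how the nuScenes Detection Score combines mAP with the five mean true-positive errors, following the nuScenes benchmark definition: NDS = (5 · mAP + Σ (1 − min(1, mTP))) / 10. The function name `nds` and the example inputs are hypothetical; the inputs are illustrative values, not numbers from the tables above. Note that the tables report NDS and mAP as percentages, while the function works on fractions in [0, 1].

```python
from typing import Mapping

# The five mean true-positive error metrics used by the nuScenes benchmark:
# translation, scale, orientation, velocity and attribute error.
TP_METRICS = ("mATE", "mASE", "mAOE", "mAVE", "mAAE")


def nds(mean_ap: float, tp_errors: Mapping[str, float]) -> float:
    """nuScenes Detection Score from mAP and the five mean TP errors."""
    # Each error is clipped at 1 and converted to a score (1 - error),
    # so an error of 1 or more contributes nothing.
    tp_scores = [1.0 - min(1.0, tp_errors[name]) for name in TP_METRICS]
    # mAP carries weight 5; the five TP scores carry weight 1 each.
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not taken from this repository).
    example_errors = {name: 0.5 for name in TP_METRICS}
    print(nds(0.5, example_errors))
```

Because mAP accounts for half of the score, two models with the same mAP can still differ in NDS through their localization, scale, orientation, velocity and attribute errors.
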
- StreamPETR code (also including PETR and Focal-PETR)
- Flash attention
- Deformable attention (RepDETR3D)
- Checkpoints
- Sliding window training
- Efficient training in streaming video
- TensorRT inference
- 3D object tracking
We thank these great works and open-source codebases:
- 3D Detection. MMDetection3d, DETR3D, PETR, BEVFormer, SOLOFusion, Sparse4D.
- Multi-object tracking. MOTR, PF-Track.
If you find StreamPETR useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.
@article{wang2023exploring,
title={Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection},
author={Wang, Shihao and Liu, Yingfei and Wang, Tiancai and Li, Ying and Zhang, Xiangyu},
journal={arXiv preprint arXiv:2303.11926},
year={2023}
}