Divide and Conquer: Improving Multi-Camera 3D Perception With 2D Semantic-Depth Priors and Input-Dependent Queries [TIP 2024]

Introduction

This repository is the official implementation of SDTR.

We propose SDTR, a transformer-based framework that incorporates 2D semantic and depth priors to improve the inference of both semantic categories and 3D positions. SDTR has been accepted by IEEE Transactions on Image Processing (TIP).

Preparation

Environment

This implementation is built upon PETR; follow the environment preparation steps in the PETR repository (or see install.md).

Pretrained weights

We provide the pre-trained backbone weights and our SDTR checkpoints at gdrive. These weights are already organized into the ckpts and work_dirs folders.

Training Data

Auxiliary Supervision Data

Edit the scripts/configs.py file: first set TASK_Flag to choose the type of labels you want to generate, then set DATA_DIR and SAVE_DIR to the location of the nuScenes dataset and the desired ground-truth output folder, respectively.
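
For reference, the relevant part of scripts/configs.py looks roughly like the sketch below. Only TASK_Flag, DATA_DIR, and SAVE_DIR are named in this README; the example values are illustrative, so check the file itself:

```python
# Hypothetical excerpt of scripts/configs.py. Only TASK_Flag, DATA_DIR and
# SAVE_DIR are named in this README; the example values are illustrative.
TASK_Flag = 'depth'                    # which labels to generate ('depth', 'seg', ...)
DATA_DIR = 'data/nuscenes'             # root of the nuScenes dataset
SAVE_DIR = 'data/nuscenes/pv_labels'   # output folder for the generated ground truth
```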

  • Prepare depth labels (a sketch of the underlying LiDAR-to-camera projection follows this list):
    python scripts/gen_depth_gt.py

  • Prepare segmentation labels:
    python scripts/gen_pvseg_gt.py

After preparation, the directory structure will be as follows:

SDTR
├── mmdetection3d
├── projects
│   ├── configs
│   ├── mmdet3d_plugin
├── tools
├── data
│   ├── nuscenes
│   │   ├── pv_labels
│   │   │   ├── det
│   │   │   ├── depth
│   │   │   ├── ...
│   │   ├── HDmaps-nocover
│   │   ├── ...
├── ckpts
├── README.md

Train & inference

You can train the model as follows:

tools/dist_train.sh projects/configs/sdtr/sdtr_r50dcn_gridmask_p4.py 8 --work-dir work_dirs/sdtr_r50dcn_gridmask_p4/

You can evaluate the model as follows:

tools/dist_test.sh projects/configs/sdtr/sdtr_r50dcn_gridmask_p4.py work_dirs/sdtr_r50dcn_gridmask_p4/latest.pth 8 --eval bbox

Visualize

You can generate the result json as follows:

./tools/dist_test.sh projects/configs/sdtr/sdtr_r50dcn_gridmask_p4.py work_dirs/sdtr_r50dcn_gridmask_p4/latest.pth 8 --out work_dirs/pp-nus/results_eval.pkl --format-only --eval-options 'jsonfile_prefix=work_dirs/pp-nus/results_eval'
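
The exported file follows the standard nuScenes submission format ("meta" plus a "results" dict mapping sample tokens to box lists). A minimal sanity check, assuming the default mmdetection3d output location under the prefix directory (the exact file name can vary by version, so adjust the path as needed):

```python
# Hypothetical sanity check of the exported detections; the path below is an
# assumption based on common mmdetection3d layouts.
import json

with open('work_dirs/pp-nus/results_eval/pts_bbox/results_nusc.json') as f:
    res = json.load(f)

print(len(res['results']), 'samples')
sample_token, boxes = next(iter(res['results'].items()))
for box in boxes[:3]:
    # each box carries a class name, confidence score, and 3D center
    print(box['detection_name'], round(box['detection_score'], 3),
          box['translation'])
```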

You can visualize the 3D object detection results as follows:

python3 tools/visualize.py

Main Results

Config                 NDS     mAP
SDTR-r50-p4-1408x512   38.4%   33.1%
SDTR-vov-p4-1408x512   48.2%   43.0%
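
For context, NDS is the nuScenes Detection Score, defined by the benchmark as NDS = (5·mAP + Σ(1 − min(1, mTP))) / 10 over the five true-positive error metrics. A quick illustration (the error values below are made up):

```python
# NDS combines mAP with the five true-positive error metrics (translation,
# scale, orientation, velocity, attribute), as defined by nuScenes.
def nds(mAP, tp_errors):
    return (5 * mAP + sum(1 - min(1.0, e) for e in tp_errors)) / 10

print(nds(0.331, [0.8, 0.3, 0.5, 0.9, 0.2]))  # hypothetical TP errors
```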

Acknowledgement

Many thanks to the authors of mmdetection3d and PETR.

Citation

If you find this project useful for your research, please consider citing:

@article{song2024divide,
  title={Divide and Conquer: Improving Multi-Camera 3D Perception With 2D Semantic-Depth Priors and Input-Dependent Queries},
  author={Song, Qi and Hu, Qingyong and Zhang, Chi and Chen, Yongquan and Huang, Rui},
  journal={IEEE Transactions on Image Processing},
  year={2024},
  publisher={IEEE}
}

Contact

If you have any questions, feel free to open an issue or contact us at qisong@link.cuhk.edu.cn.
