Official Repo of the Project - RM3D: Robust Data-Efficient 3D Scene Parsing via Traditional and Learnt 3D Descriptors-based Semantic Region Merging
This work presents a general and simple framework for point cloud understanding when labels are limited. The first contribution is an extensive comparison of traditional and learnt 3D descriptors for weakly supervised 3D scene understanding, which validates that our adapted traditional PFH-based 3D descriptors generalize well across different domains. The second contribution is a learning-based region merging strategy driven by the affinity provided by both the traditional/learnt 3D descriptors and the learnt semantics, so the merging process takes both low-level geometric and high-level semantic feature correlations into account. Experimental results demonstrate that our framework achieves the best performance on the three most important weakly supervised point cloud understanding tasks: semantic segmentation, instance segmentation, and object detection.
For the task of 3D Object Detection, please refer to RM3D_Det.
For the task of 3D Semantic Segmentation, please refer to RM3D_Sem_Seg.
For the task of 3D Instance Segmentation, please refer to RM3D_Ins_Seg.
Please refer to INSTALL.md for the installation of OpenPCDet.
Our 3D object detection codebase is built on OpenPCDet.
OpenPCDet is a clear, simple, self-contained open source project for LiDAR-based 3D object detection.
It is also the official code release of [PointRCNN], [Part-A^2 net], [PV-RCNN] and [Voxel R-CNN].
- Support both one-stage and two-stage 3D object detection frameworks
- Support distributed training & testing with multiple GPUs and multiple machines
- Support multiple heads on different scales to detect different classes
- Support stacked version set abstraction to encode varying numbers of points in different scenes
- Support Adaptive Training Sample Selection (ATSS) for target assignment
- Support RoI-aware point cloud pooling & RoI-grid point cloud pooling
- Support GPU version 3D IoU calculation and rotated NMS
Selected supported methods are shown in the table below. Here we provide the pretrained models, which achieve state-of-the-art 3D detection performance on the val set of the KITTI dataset.
- All models are trained with 4 RTX 2080 Ti GPUs and are available for download.
- The training time is measured with 4 2080 Ti GPUs and PyTorch 1.5.
Data Efficient Learning with 3% labels
Method | training time | Car@R11 | Pedestrian@R11 | Cyclist@R11 | download |
---|---|---|---|---|---|
PointPillar | ~2.66 hours | 65.43 | 45.08 | 51.88 | model_PointPillar |
SECOND | ~2.75 hours | 69.56 | 43.29 | 56.66 | model_SECOND |
SECOND-IoU | - | 68.28 | 45.39 | 57.29 | model_SECOND-IoU |
PointRCNN | ~5.67 hours | 64.70 | 46.62 | 62.16 | model_PointRCNN |
PointRCNN-IoU | ~6.12 hours | 67.54 | 47.19 | 60.25 | model_PointRCNN-IoU |
Part-A^2-Free | ~5.98 hours | 65.92 | 57.83 | 63.18 | model_Part-A^2-Free |
Part-A^2-Anchor | ~7.87 hours | 69.22 | 50.79 | 58.17 | model_Part-A^2-Anchor |
PV-RCNN | ~8.78 hours | 74.24 | 47.65 | 60.23 | model_PV-RCNN |
Voxel R-CNN (Car) | ~3.87 hours | 76.23 | - | - | model_Voxel_R-CNN |
CaDDN | ~19.83 hours | 19.34 | 11.86 | 8.17 | model_CaDDN |
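The `@R11` suffix in the column headers is read here as average precision computed over 11 recall positions, as in the original KITTI protocol. The following is a minimal sketch of that metric under that assumption; the function name `ap_11_point` and its list-based inputs are hypothetical and are not part of this repository or of OpenPCDet.

```python
def ap_11_point(recalls, precisions):
    """11-point interpolated AP from parallel lists of recall and precision.

    Hypothetical helper for illustration only: each (recall, precision)
    pair is one operating point of the detector.
    """
    if len(recalls) != len(precisions):
        raise ValueError("recalls and precisions must have the same length")

    total = 0.0
    for i in range(11):
        threshold = i / 10.0  # recall positions 0.0, 0.1, ..., 1.0
        # Interpolated precision: the best precision among operating points
        # whose recall reaches the threshold (0 if none does).
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        total += max(candidates) if candidates else 0.0
    return total / 11.0


if __name__ == "__main__":
    # Toy precision/recall curve, not taken from the table above.
    print(ap_11_point([0.2, 0.5, 0.8], [1.0, 0.8, 0.6]))
```

The table values themselves come from the official evaluation code, so this sketch is only meant to make the column name easier to interpret.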
We provide the DATA_CONFIG.SAMPLED_INTERVAL setting on the Waymo Open Dataset (WOD) to subsample the frames used for training and evaluation, so you can experiment with WOD by adjusting DATA_CONFIG.SAMPLED_INTERVAL even if you only have limited GPU resources.
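As a rough illustration of what an interval-based setting like this does, the sketch below assumes the interval is applied as a simple stride over the ordered list of frames. The function `subsample_infos` and the toy frame list are hypothetical and only illustrate the idea; the actual dataset loader may differ.

```python
def subsample_infos(infos, sampled_interval):
    """Keep every `sampled_interval`-th entry of an ordered frame list.

    Hypothetical helper: assumes the interval acts as a plain stride.
    """
    if sampled_interval < 1:
        raise ValueError("sampled_interval must be a positive integer")
    return infos[::sampled_interval]


if __name__ == "__main__":
    frames = [f"frame_{i:04d}" for i in range(100)]  # toy stand-in for frame infos
    subset = subsample_infos(frames, 5)
    print(len(subset), subset[:3])
```

Under that assumption, an interval of 5 keeps every fifth frame (about 20% of the data), and an interval of 1 keeps all frames.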
By default, all models are trained with 3% data (~4.8k frames) of all the training samples on 4 2080 Ti GPUs, and the results of each cell here are mAP/mAPH calculated by the official Waymo evaluation metrics on the whole validation set (version 1.2).
Method | Vec_L1 | Vec_L2 | Ped_L1 | Ped_L2 | Cyc_L1 | Cyc_L2 |
---|---|---|---|---|---|---|
SECOND | 57.15/66.38 | 49.42/48.67 | 50.75/40.39 | 42.18/36.65 | 47.65/43.98 | 41.22/40.92 |
Part-A^2-Anchor | 59.82/58.29 | 53.33/52.82 | 52.15/44.76 | 45.12/40.29 | 56.67/55.21 | 50.29/50.88 |
PV-RCNN | 59.06/55.38 | 55.67/63.38 | 54.23/43.76 | 44.89/38.28 | 53.15/50.94 | 49.87/48.69 |
We cannot provide the above pretrained models due to the Waymo Dataset License Agreement, but you can achieve similar performance by training with the default configs.
More datasets are on the way.
Please refer to DEMO.md for a quick demo to test with a pretrained model and visualize the predicted results on your custom data or the original KITTI data.
Please refer to GETTING_STARTED.md to learn more about using this project.
RM3D is released under the MIT license.
For questions regarding the 3D semantic segmentation, 3D instance segmentation, and 3D object detection code of RM3D, please contact us by email (kcliuntu@gmail.com or kcliu@gmail.com).