From f5c558b2c959aabe3220acdde59246e7693cb0a2 Mon Sep 17 00:00:00 2001
From: Sun Jiahao <72679458+sunjiahao1999@users.noreply.github.com>
Date: Mon, 5 Jun 2023 14:21:51 +0800
Subject: [PATCH 1/2] [Docs] Update the link of minkunet (#2590)

* update link

* fix coord
---
 configs/minkunet/README.md | 2 +-
 docs/en/user_guides/coord_sys_tutorial.md | 20 ++++++++++----------
 docs/zh_cn/user_guides/coord_sys_tutorial.md | 20 ++++++++++----------
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/configs/minkunet/README.md b/configs/minkunet/README.md
index c889f7c26e..6efe23387e 100644
--- a/configs/minkunet/README.md
+++ b/configs/minkunet/README.md
@@ -27,7 +27,7 @@ We implement MinkUNet with [TorchSparse](https://github.com/mit-han-lab/torchspa
 | [MinkUNet18-W32](./minkunet18_w32_torchsparse_8xb2-amp-15e_semantickitti.py) | torchsparse | 15e | ✔ | ✗ | 4.9 | - | - | 63.1 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet_w32_8xb2-15e_semantickitti/minkunet_w32_8xb2-15e_semantickitti_20230309_160710-7fa0a6f1.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet_w32_8xb2-15e_semantickitti/minkunet_w32_8xb2-15e_semantickitti_20230309_160710.log) |
 | [MinkUNet34-W32](./minkunet34_w32_minkowski_8xb2-laser-polar-mix-3x_semantickitti.py) | minkowski engine | 3x | ✗ | ✔ | 11.5 | 6.5 | 12.2 | 69.2 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_minkowski_8xb2-laser-polar-mix-3x_semantickitti_20230514_202236-839847a8.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_minkowski_8xb2-laser-polar-mix-3x_semantickitti_20230514_202236.log) |
 | [MinkUNet34-W32](./minkunet34_w32_spconv_8xb2-amp-laser-polar-mix-3x_semantickitti.py) | spconv | 3x | ✔ | ✔ | 6.7 | 2 | 14.6\* | 68.3 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_spconv_8xb2-amp-laser-polar-mix-3x_semantickitti_20230512_233152-e0698a0f.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_spconv_8xb2-amp-laser-polar-mix-3x_semantickitti_20230512_233152.log) |
-| [MinkUNet34-W32](./minkunet34_w32_spconv_8xb2-laser-polar-mix-3x_semantickitti.py) | spconv | 3x | ✗ | ✔ | 10.5 | 6 | 14.5 | 3 | 69.3 |
+| [MinkUNet34-W32](./minkunet34_w32_spconv_8xb2-laser-polar-mix-3x_semantickitti.py) | spconv | 3x | ✗ | ✔ | 10.5 | 6 | 14.5 | 69.3 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_spconv_8xb2-laser-polar-mix-3x_semantickitti_20230512_233817-72b200d8.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_spconv_8xb2-laser-polar-mix-3x_semantickitti_20230512_233817.log) |
 | [MinkUNet34-W32](./minkunet34_w32_torchsparse_8xb2-amp-laser-polar-mix-3x_semantickitti.py) | torchsparse | 3x | ✔ | ✔ | 6.6 | 3 | 12.8 | 69.3 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_torchsparse_8xb2-amp-laser-polar-mix-3x_semantickitti_20230512_233511-bef6cad0.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_torchsparse_8xb2-amp-laser-polar-mix-3x_semantickitti_20230512_233511.log) |
 | [MinkUNet34-W32](./minkunet34_w32_torchsparse_8xb2-laser-polar-mix-3x_semantickitti.py) | torchsparse | 3x | ✗ | ✔ | 11.8 | 5.5 | 15.9 | 68.7 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_torchsparse_8xb2-laser-polar-mix-3x_semantickitti_20230512_233601-2b61b0ab.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34_w32_torchsparse_8xb2-laser-polar-mix-3x_semantickitti_20230512_233601.log) |
 | [MinkUNet34v2-W32](minkunet34v2_w32_torchsparse_8xb2-amp-laser-polar-mix-3x_semantickitti.py) | torchsparse | 3x | ✔ | ✔ | 8.9 | - | - | 70.3 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34v2_w32_torchsparse_8xb2-amp-laser-polar-mix-3x_semantickitti_20230510_221853-b14a68b3.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet34v2_w32_torchsparse_8xb2-amp-laser-polar-mix-3x_semantickitti_20230510_221853.log) |
diff --git a/docs/en/user_guides/coord_sys_tutorial.md b/docs/en/user_guides/coord_sys_tutorial.md
index d78c0cd934..7c104a17ac 100644
--- a/docs/en/user_guides/coord_sys_tutorial.md
+++ b/docs/en/user_guides/coord_sys_tutorial.md
@@ -60,13 +60,13 @@ We will stick to the three coordinate systems defined in this tutorial in the fu

 ## Definition of the yaw angle

-Please refer to [wikipedia](https://en.wikipedia.org/wiki/Euler_angles#Tait%E2%80%93Bryan_angles) for the standard definition of the yaw angle. In object detection, we choose an axis as the gravity axis, and a reference direction on the plane $\Pi$ perpendicular to the gravity axis, then the reference direction has a yaw angle of 0, and other directions on $\Pi$ have non-zero yaw angles depending on its angle with the reference direction.
+Please refer to [wikipedia](https://en.wikipedia.org/wiki/Euler_angles#Tait%E2%80%93Bryan_angles) for the standard definition of the yaw angle. In object detection, we choose an axis as the gravity axis, and a reference direction on the plane $\\Pi$ perpendicular to the gravity axis, then the reference direction has a yaw angle of 0, and other directions on $\\Pi$ have non-zero yaw angles depending on its angle with the reference direction.

 Currently, for all supported datasets, annotations do not include pitch angle and roll angle, which means we need only consider the yaw angle when predicting boxes and calculating overlap between boxes.

 In MMDetection3D, all three coordinate systems are right-handed coordinate systems, which means the ascending direction of the yaw angle is counter-clockwise if viewed from the negative direction of the gravity axis (the axis is pointing at one's eyes).

-The figure below shows that, in this right-handed coordinate system, if we set the positive direction of the x-axis as a reference direction, then the positive direction of the y-axis has a yaw angle of $\frac{\pi}{2}$.
+The figure below shows that, in this right-handed coordinate system, if we set the positive direction of the x-axis as a reference direction, then the positive direction of the y-axis has a yaw angle of $\\frac{\\pi}{2}$.

 ```
 z up y front (yaw=0.5*pi)
@@ -200,7 +200,7 @@ Then, the box dimensions before and after the conversion satisfy the following r

 Finally, the yaw angle should also be converted:

-- $r\_{LiDAR}=-\frac{\pi}{2}-r\_{camera}$
+- $r\_{LiDAR}=-\\frac{\\pi}{2}-r\_{camera}$

 See the code [here](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/core/bbox/structures/box_3d_mode.py) for more details.

@@ -228,18 +228,18 @@ For each box related op, we have marked the type of boxes to which we can apply

 No. For example, in KITTI, we need a calibration matrix when converting from Camera coordinate system to LiDAR coordinate system.

-#### Q3: How does a phase difference of $2\pi$ in the yaw angle of a box affect evaluation?
+#### Q3: How does a phase difference of $2\\pi$ in the yaw angle of a box affect evaluation?

-For IoU calculation, a phase difference of $2\pi$ in the yaw angle will result in the same box, thus not affecting evaluation.
+For IoU calculation, a phase difference of $2\\pi$ in the yaw angle will result in the same box, thus not affecting evaluation.

-For angle prediction evaluation such as the NDS metric in NuScenes and the AOS metric in KITTI, the angle of predicted boxes will be first standardized, so the phase difference of $2\pi$ will not change the result.
+For angle prediction evaluation such as the NDS metric in NuScenes and the AOS metric in KITTI, the angle of predicted boxes will be first standardized, so the phase difference of $2\\pi$ will not change the result.

-#### Q4: How does a phase difference of $\pi$ in the yaw angle of a box affect evaluation?
+#### Q4: How does a phase difference of $\\pi$ in the yaw angle of a box affect evaluation?

-For IoU calculation, a phase difference of $\pi$ in the yaw angle will result in the same box, thus not affecting evaluation.
+For IoU calculation, a phase difference of $\\pi$ in the yaw angle will result in the same box, thus not affecting evaluation.

 However, for angle prediction evaluation, this will result in the exact opposite direction.

-Just think about a car. The yaw angle is the angle between the direction of the car front and the positive direction of the x-axis. If we add $\pi$ to this angle, the car front will become the car rear.
+Just think about a car. The yaw angle is the angle between the direction of the car front and the positive direction of the x-axis. If we add $\\pi$ to this angle, the car front will become the car rear.

-For categories such as barrier, the front and the rear have no difference, therefore a phase difference of $\pi$ will not affect the angle prediction score.
+For categories such as barrier, the front and the rear have no difference, therefore a phase difference of $\\pi$ will not affect the angle prediction score.
diff --git a/docs/zh_cn/user_guides/coord_sys_tutorial.md b/docs/zh_cn/user_guides/coord_sys_tutorial.md
index f4949925e8..d666ba7573 100644
--- a/docs/zh_cn/user_guides/coord_sys_tutorial.md
+++ b/docs/zh_cn/user_guides/coord_sys_tutorial.md
@@ -60,13 +60,13 @@ MMDetection3D 使用 3 种不同的坐标系。3D 目标检测领域中不同坐

 ## 转向角 (yaw) 的定义

-请参考[维基百科](https://en.wikipedia.org/wiki/Euler_angles#Tait%E2%80%93Bryan_angles)了解转向角的标准定义。在目标检测中,我们选择一个轴作为重力轴,并在垂直于重力轴的平面 $\Pi$ 上选取一个参考方向,那么参考方向的转向角为 0,在 $\Pi$ 上的其他方向有非零的转向角,其角度取决于其与参考方向的角度。
+请参考[维基百科](https://en.wikipedia.org/wiki/Euler_angles#Tait%E2%80%93Bryan_angles)了解转向角的标准定义。在目标检测中,我们选择一个轴作为重力轴,并在垂直于重力轴的平面 $\\Pi$ 上选取一个参考方向,那么参考方向的转向角为 0,在 $\\Pi$ 上的其他方向有非零的转向角,其角度取决于其与参考方向的角度。

 目前,对于所有支持的数据集,标注不包括俯仰角 (pitch) 和滚动角 (roll),这意味着我们在预测框和计算框之间的重叠时只需考虑转向角 (yaw)。

 在 MMDetection3D 中,所有坐标系都是右手坐标系,这意味着如果从重力轴的负方向(轴的正方向指向人眼)看,转向角 (yaw) 沿着逆时针方向增加。

-下图显示,在右手坐标系中,如果我们设定 x 轴正方向为参考方向,那么 y 轴正方向的转向角 (yaw) 为 $\frac{\pi}{2}$。
+下图显示,在右手坐标系中,如果我们设定 x 轴正方向为参考方向,那么 y 轴正方向的转向角 (yaw) 为 $\\frac{\\pi}{2}$。

 ```
 z 上 y 前 (yaw=0.5*pi)
@@ -200,7 +200,7 @@ SUN RGB-D 的原始数据不是点云而是 RGB-D 图像。我们通过反投影

 最后,转向角 (yaw) 也应该被转换:

-- $r\_{LiDAR}=-\frac{\pi}{2}-r\_{camera}$
+- $r\_{LiDAR}=-\\frac{\\pi}{2}-r\_{camera}$

 详见[此处](https://github.com/open-mmlab/mmdetection3d/blob/master/mmdet3d/core/bbox/structures/box_3d_mode.py)代码了解更多细节。

@@ -228,18 +228,18 @@ SUN RGB-D 的原始数据不是点云而是 RGB-D 图像。我们通过反投影

 否。例如在 KITTI 中,从相机坐标系转换为激光雷达坐标系时,我们需要一个校准矩阵。

-#### Q3: 框中转向角 (yaw) $2\pi$ 的相位差如何影响评估?
+#### Q3: 框中转向角 (yaw) $2\\pi$ 的相位差如何影响评估?

-对于交并比 (IoU) 计算,转向角 (yaw) 有 $2\pi$ 的相位差的两个框是相同的,所以不会影响评估。
+对于交并比 (IoU) 计算,转向角 (yaw) 有 $2\\pi$ 的相位差的两个框是相同的,所以不会影响评估。

-对于角度预测评估,例如 NuScenes 中的 NDS 指标和 KITTI 中的 AOS 指标,会先对预测框的角度进行标准化,因此 $2\pi$ 的相位差不会改变结果。
+对于角度预测评估,例如 NuScenes 中的 NDS 指标和 KITTI 中的 AOS 指标,会先对预测框的角度进行标准化,因此 $2\\pi$ 的相位差不会改变结果。

-#### Q4: 框中转向角 (yaw) $\pi$ 的相位差如何影响评估?
+#### Q4: 框中转向角 (yaw) $\\pi$ 的相位差如何影响评估?
-对于交并比 (IoU) 计算,转向角 (yaw) 有 $\pi$ 的相位差的两个框是相同的,所以不会影响评估。
+对于交并比 (IoU) 计算,转向角 (yaw) 有 $\\pi$ 的相位差的两个框是相同的,所以不会影响评估。

 然而,对于角度预测评估,这会导致完全相反的方向。

-考虑一辆汽车,转向角 (yaw) 是汽车前部方向与 x 轴正方向之间的夹角。如果我们将该角度增加 $\pi$,车前部将变成车后部。
+考虑一辆汽车,转向角 (yaw) 是汽车前部方向与 x 轴正方向之间的夹角。如果我们将该角度增加 $\\pi$,车前部将变成车后部。

-对于某些类别,例如障碍物,前后没有区别,因此 $\pi$ 的相位差不会对角度预测分数产生影响。
+对于某些类别,例如障碍物,前后没有区别,因此 $\\pi$ 的相位差不会对角度预测分数产生影响。

From d27e65aac658e8e104dbbfd6a55119704010ea79 Mon Sep 17 00:00:00 2001
From: JingweiZhang12
Date: Thu, 15 Jun 2023 11:55:31 +0800
Subject: [PATCH 2/2] [Fix] Fix typo

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 94a13eb0e0..b8645a76d8 100644
--- a/README.md
+++ b/README.md
@@ -54,7 +54,7 @@ users to migrate to the latest version, though it comes with some cost. Please r

 **v1.1.1** was released in 30/5/2023

-We have constructed a comprehensive LiDAR semantic segmentation benchmark on SemanticKITTI, including Cylinder3D, MinkUNet and SPVCNN methods. Noteworthy, the improved MinkUNetv2 can achieve 70.3 mIoU on the validation set of SemanticKITTI. We have also supported the training of BEVFusion and an occupancy prediction method, TPVFomrer, in our `projects`. More new features about 3D perception are on the way. Please stay tuned!
+We have constructed a comprehensive LiDAR semantic segmentation benchmark on SemanticKITTI, including Cylinder3D, MinkUNet and SPVCNN methods. Noteworthy, the improved MinkUNetv2 can achieve 70.3 mIoU on the validation set of SemanticKITTI. We have also supported the training of BEVFusion and an occupancy prediction method, TPVFormer, in our `projects`. More new features about 3D perception are on the way. Please stay tuned!

 ## Introduction