This repository contains the source code and pretrained models from the paper Indoor Scene Recognition in 3D.
Indoor Scene Recognition in 3D
Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler
To the best of our knowledge, we are the first to study the task of indoor scene recognition in 3D.
- 2020-11-26 Updates on pretrained weights and a small data pre-processing script
- 2019-02-28 initial release
If you find our work useful, please consider citing
@article{huang2020indoor,
title={Indoor Scene Recognition in 3D},
author={Huang, Shengyu and Usvyatsov, Mikhail and Schindler, Konrad},
journal={IROS},
year={2020}
}
The required libraries can be installed by running
pip3 install -r requirements.txt
We use MinkowskiEngine (v0.4.2) as our 3D sparse convolution framework. If you have problems compiling it, please refer to MinkowskiEngine for more details.
We evaluate our model on the ScanNet benchmark. The dataset is released under the ScanNet Terms of Use; please contact the ScanNet team for access.
We preprocess the raw data into pth files for efficient access. We use torch_cluster for efficient GPU-based farthest point sampling; you can find a sample under the folder tmp. The train/val/test split can be found under the folder split.
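For readers unfamiliar with farthest point sampling, the idea can be sketched in plain Python. This is a slow CPU reference for illustration only, not the torch_cluster GPU routine used in this repository; the function name `farthest_point_sampling` and its `start` parameter are assumptions made for this sketch.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def farthest_point_sampling(points: List[Point], num_samples: int, start: int = 0) -> List[int]:
    """Greedily pick indices of `num_samples` points that are far apart.

    Illustrative reference only (O(N * num_samples)); hypothetical helper,
    not part of this repository or of torch_cluster.
    """
    n = len(points)
    if not 0 < num_samples <= n:
        raise ValueError("num_samples must be between 1 and len(points)")
    if not 0 <= start < n:
        raise ValueError("start must be a valid point index")

    selected = [start]
    # min_dist[i] holds the squared distance from point i to its nearest selected point.
    min_dist = [math.inf] * n
    for _ in range(num_samples - 1):
        last = points[selected[-1]]
        for i, p in enumerate(points):
            d = sum((a - b) ** 2 for a, b in zip(p, last))
            if d < min_dist[i]:
                min_dist[i] = d
        # The next sample is the point farthest from the current selected set.
        selected.append(max(range(n), key=min_dist.__getitem__))
    return selected
```

Each iteration only updates distances against the most recently selected point, which is why the running minimum is kept in `min_dist` rather than recomputed from scratch.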
You can download the pretrained models for testing from here.
Please change base_train and base_val to your data folder, then run
python3 main.py
Please change base_train and base_val to your data folder, then run
python3 scene_classification.py --add_color True --num_points 4096
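The command above passes a boolean as the string True. How scene_classification.py parses it is not shown here. If you write your own wrapper script that accepts the same flags, note that argparse with `type=bool` treats any non-empty string (including "False") as true. A minimal sketch of one way around this, where `str2bool`, `build_parser`, and the default values are assumptions for illustration:

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical helper: map 'True'/'False'-style strings to booleans."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Parser for a wrapper script; not the repository's own parser."""
    parser = argparse.ArgumentParser(description="wrapper sketch")
    # Flag names mirror the command above; the defaults are assumed.
    parser.add_argument("--add_color", type=str2bool, default=False)
    parser.add_argument("--num_points", type=int, default=4096)
    return parser
```

With this parser, `--add_color False` yields `False` instead of silently enabling color.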
- For Resnet14, change path_train and path_val, then run
python3 main.py --num_points 4096 --use_color True
- To train the multi-task learner, follow these three steps. The three parts differ only slightly in our implementation; please refer to the three folders respectively for more details:
  - train the sparse encoder and the semantic segmentation decoder
  - freeze the encoder, then train the sparse classification decoder
  - fine-tune both encoder and decoder with a small learning rate
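The three-stage schedule above can be summarized as data: which components are updated in each stage and at what learning rate. This is only a sketch. The component names, the learning-rate values, and the reading that the decoder fine-tuned in the last stage is the classification decoder are all assumptions, not taken from the repository.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

# Hypothetical component names for the three parts of the multi-task model.
COMPONENTS = ("encoder", "seg_decoder", "cls_decoder")
BASE_LR = 1e-1  # placeholder value, not the authors' setting


@dataclass(frozen=True)
class Stage:
    name: str
    trainable: FrozenSet[str]  # components that receive gradient updates
    lr: float                  # placeholder learning rate for this stage


STAGES: List[Stage] = [
    # 1) train the sparse encoder and the semantic segmentation decoder
    Stage("segmentation", frozenset({"encoder", "seg_decoder"}), BASE_LR),
    # 2) freeze the encoder, train only the classification decoder
    Stage("classification", frozenset({"cls_decoder"}), BASE_LR),
    # 3) fine-tune encoder and (assumed: classification) decoder with a small lr
    Stage("finetune", frozenset({"encoder", "cls_decoder"}), BASE_LR * 0.1),
]


def trainable_flags(stage: Stage) -> Dict[str, bool]:
    """Return, for every component, whether it is updated in this stage."""
    return {name: name in stage.trainable for name in COMPONENTS}
```

In PyTorch, flags like these are typically applied by setting `requires_grad` on each module's parameters before building the optimizer for that stage.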
Here are some great resources we benefited from:
I personally also recommend these three repositories on sparse convolution: