This is the code for our ICCV 2023 paper:
Multi-Object Discovery by Low-Dimensional Object Motion
Sadra Safadoust and Fatma Güney
- Create the conda environment and install the requirements:
conda create -n mos python=3.8
conda activate mos
conda install -y pytorch=1.12.1 torchvision=0.13.1 cudatoolkit=11.3 -c pytorch
conda install -y kornia jupyter tensorboard timm einops scikit-learn scikit-image openexr-python tqdm -c conda-forge
conda install -y gcc_linux-64=7 gxx_linux-64=7 fontconfig matplotlib
pip install cvbase opencv-python filelock
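Optionally, you can verify that PyTorch and CUDA were installed correctly:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"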
- Install Mask2Former:
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
cd src/mask2former/modeling/pixel_decoder/ops
sh make.sh
If you face any problems while installing Mask2Former, please refer to the installation steps in the Mask2Former and Detectron2 repositories.
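If the ops build succeeded, the compiled deformable-attention extension should be importable (run this from the repository root, e.g. after cd ../../../../..; the module name below is what make.sh installs in our experience, so treat it as an assumption):
python -c "import MultiScaleDeformableAttention"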
Please follow the steps here for the CLEVR, ClevrTex, and MOVi datasets.
- Download and extract the DAVIS2017 dataset into src/data/DAVIS2017.
- Download the motion annotations from here and extract them into src/data/DAVIS2017/Annotations_unsupervised_motion.
- Use the motion-grouping repository to create the flows for this dataset (an illustrative sketch follows below).
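As a rough illustration of what gap-4 flow computation looks like, here is a minimal sketch using torchvision's RAFT. The actual flows should come from the motion-grouping repository; the model choice, file paths, and gap convention below are assumptions for the sketch only.
import torch
import torchvision.transforms.functional as TF
from torchvision.io import read_image
from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

# Illustrative sketch only: the flows used by this repo come from motion-grouping.
weights = Raft_Large_Weights.DEFAULT
model = raft_large(weights=weights).eval()
transforms = weights.transforms()

# Frames t and t+4 of a DAVIS sequence (paths are assumptions).
img1 = read_image("src/data/DAVIS2017/JPEGImages/480p/bear/00000.jpg")
img2 = read_image("src/data/DAVIS2017/JPEGImages/480p/bear/00004.jpg")
img1 = TF.resize(img1, [480, 856])  # RAFT needs H and W divisible by 8
img2 = TF.resize(img2, [480, 856])
batch1, batch2 = transforms(img1.unsqueeze(0), img2.unsqueeze(0))
with torch.no_grad():
    flow = model(batch1, batch2)[-1]  # (1, 2, H, W), last refinement iteration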
- Download the KITTI raw dataset into src/data/KITTI/KITTI-Raw.
- Calculate the flows using RAFT and save them into src/data/KITTI/RAFT_FLOWS. Note that we use png flows for KITTI (see the decoding sketch below).
- Download the segmentation labels from this repository.
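A minimal sketch for reading one of the png flow files, assuming they follow the standard 16-bit KITTI devkit encoding (u and v offset by 2**15 and scaled by 64, with the third channel as a validity mask); the file path is illustrative:
import cv2
import numpy as np

# Sketch assuming the KITTI devkit flow encoding; the path is an assumption.
raw = cv2.imread(
    "src/data/KITTI/RAFT_FLOWS/2011_09_26/example_flow.png",
    cv2.IMREAD_UNCHANGED,
).astype(np.float32)                  # 16-bit png, read in BGR channel order
valid = raw[:, :, 0] > 0              # B channel: validity mask
u = (raw[:, :, 2] - 2 ** 15) / 64.0   # R channel: horizontal flow
v = (raw[:, :, 1] - 2 ** 15) / 64.0   # G channel: vertical flow
flow = np.stack([u, v], axis=-1) * valid[..., None]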
The data directory structure should look like this:
├── src
│   ├── data
│   │   ├── movi_a
│   │   │   ├── train
│   │   │   ├── validation
│   │   ├── movi_c
│   │   │   ├── train
│   │   │   ├── validation
│   │   ├── movi_d
│   │   │   ├── train
│   │   │   ├── validation
│   │   ├── movi_e
│   │   │   ├── train
│   │   │   ├── validation
│   │   ├── moving_clevr
│   │   │   ├── tar
│   │   │   │   ├── CLEVRMOV_clevr_v2_*.tar
│   │   │   │   ├── ...
│   │   ├── moving_clevrtex
│   │   │   ├── tar
│   │   │   │   ├── CLEVRMOV_full_old_ten_slow_short_*.tar
│   │   │   │   ├── ...
│   │   ├── DAVIS2017
│   │   │   ├── JPEGImages
│   │   │   ├── Annotations_unsupervised_motion
│   │   │   ├── Flows_gap4
│   │   │   ├── ...
│   │   ├── KITTI
│   │   │   ├── KITTI-Raw
│   │   │   │   ├── 2011_09_26
│   │   │   │   ├── ...
│   │   │   ├── RAFT_FLOWS
│   │   │   │   ├── 2011_09_26
│   │   │   │   ├── ...
│   │   │   ├── KITTI_test
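A quick, optional sanity check of the layout (a sketch; trim the list to the datasets you actually downloaded):
from pathlib import Path

root = Path("src/data")
expected = ["movi_a", "movi_c", "movi_d", "movi_e", "moving_clevr",
            "moving_clevrtex", "DAVIS2017", "KITTI"]
for name in expected:
    status = "ok" if (root / name).is_dir() else "missing"
    print(f"{name:18s} {status}")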
You can train the model on the datasets by running the corresponding scripts.
E.g. for movi_c, run ./scripts/train_movi_c.sh
You can evaluate the model on the synthetic datasets by running the corresponding scripts.
E.g. for movi_c, run ./scripts/eval_movi_c.sh
after setting the model weights path in the script to the path of the trained segmentation model.
You can download the trained models from here.
The structure of this code is largely based on the probable-motion repository. Many thanks to the authors for sharing their code.