We provide scripts to download and prepare the evaluation datasets: Sintel, Bonn, KITTI, NYU-v2, TUM-dynamics, ScanNetv2, and DAVIS.
Note: The scripts provided here are for reference only. Please ensure you have obtained the necessary licenses from the original dataset providers before proceeding.
To download and prepare the Sintel dataset, execute:

```bash
cd data
bash download_sintel.sh
cd ..

# (optional) generate the GT dynamic mask
python datasets_preprocess/sintel_get_dynamics.py --threshold 0.1 --save_dir dynamic_label_perfect
```
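The `--threshold` flag controls how aggressively pixels are labeled dynamic. For intuition, a mask of this kind can be obtained by comparing the observed optical flow against the flow that camera motion alone would induce, and marking pixels whose normalized discrepancy exceeds the threshold. Below is a minimal sketch of that idea; the function name, inputs, and normalization are illustrative assumptions, not the script's exact logic:

```python
import numpy as np

def dynamic_mask(flow_full, flow_camera, threshold=0.1):
    """Label pixels whose observed flow disagrees with camera-induced flow.

    flow_full, flow_camera: (H, W, 2) optical-flow fields in pixels
    (hypothetical inputs; the actual script derives them from Sintel's
    ground-truth flow, depth, and camera poses).
    Returns a boolean (H, W) mask, True = dynamic.
    """
    err = np.linalg.norm(flow_full - flow_camera, axis=-1)
    # Normalize by flow magnitude so the threshold is roughly scale-free
    # (an assumption; the repo may normalize differently).
    scale = np.linalg.norm(flow_full, axis=-1) + np.linalg.norm(flow_camera, axis=-1) + 1e-6
    return err / scale > threshold
```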
To download and prepare the Bonn dataset, execute:

```bash
cd data
bash download_bonn.sh
cd ..

# create the subset for video depth evaluation, following DepthCrafter
cd datasets_preprocess
python prepare_bonn.py
cd ..
```
To download and prepare the KITTI dataset, execute:

```bash
cd data
bash download_kitti.sh
cd ..

# create the subset for video depth evaluation, following DepthCrafter
cd datasets_preprocess
python prepare_kitti.py
cd ..
```
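As a sanity check on the prepared data: KITTI ground-truth depth maps are conventionally stored as 16-bit PNGs where depth in meters equals the pixel value divided by 256, with 0 marking invalid pixels. A minimal reader sketch, assuming `prepare_kitti.py` keeps this native encoding (verify against the actual output files):

```python
import numpy as np
from PIL import Image

def read_kitti_depth(path):
    """Read a KITTI-style 16-bit depth PNG; returns meters, NaN where invalid."""
    depth_png = np.array(Image.open(path), dtype=np.float64)
    assert depth_png.max() > 255, "expected a 16-bit depth PNG"
    depth = depth_png / 256.0
    depth[depth_png == 0] = np.nan
    return depth
```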
To download and prepare the NYU-v2 dataset, execute:

```bash
cd data
bash download_nyuv2.sh
cd ..

# prepare the dataset for depth evaluation
cd datasets_preprocess
python prepare_nyuv2.py
cd ..
```
To download and prepare the TUM-dynamics dataset, execute:

```bash
cd data
bash download_tum.sh
cd ..

# prepare the dataset for pose evaluation
cd datasets_preprocess
python prepare_tum.py
cd ..
```
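For reference, TUM ground-truth trajectories are plain-text files with one pose per line in the format `timestamp tx ty tz qx qy qz qw` (quaternion in x, y, z, w order), plus `#` comment lines. A minimal parser, should you want to inspect the poses the preprocessing works from:

```python
import numpy as np

def load_tum_trajectory(path):
    """Parse a TUM trajectory file into (timestamps, positions, quaternions).

    Each non-comment line reads: timestamp tx ty tz qx qy qz qw.
    """
    rows = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            rows.append([float(v) for v in line.split()])
    data = np.array(rows)
    return data[:, 0], data[:, 1:4], data[:, 4:8]
```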
To download and prepare the ScanNetv2 dataset, execute:

```bash
cd data
bash download_scannetv2.sh
cd ..

# prepare the dataset for pose evaluation
cd datasets_preprocess
python prepare_scannet.py
cd ..
```
To download and prepare the DAVIS dataset, execute:

```bash
cd data
python download_davis.py
cd ..
```
To evaluate video depth on Sintel, execute:

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
    --pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
    --eval_dataset=sintel --output_dir="results/sintel_video_depth" --full_seq --no_crop
```

The results will be saved in the `results/sintel_video_depth` folder. You can then run the corresponding code block in `depth_metric.ipynb` to evaluate the results.
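For reference, the two numbers reported for video depth are the absolute relative error (Abs Rel) and the δ<1.25 inlier ratio, computed over valid pixels after aligning predictions to the ground truth. A minimal sketch of these metrics with median scaling; the notebook's exact alignment protocol (e.g. a per-sequence scale/shift fit) may differ:

```python
import numpy as np

def depth_metrics(pred, gt):
    """Abs Rel and delta<1.25 over valid pixels, after median scaling.

    pred, gt: depth maps of the same shape; gt <= 0 marks invalid pixels.
    Median scaling is one common alignment choice -- the evaluation
    notebook may align differently.
    """
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]
    pred = pred * (np.median(gt) / np.median(pred))  # scale alignment
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    ratio = np.maximum(pred / gt, gt / pred)
    delta_1 = np.mean(ratio < 1.25)
    return abs_rel, delta_1
```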
To evaluate video depth on Bonn, execute:

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
    --pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
    --eval_dataset=bonn --output_dir="results/bonn_video_depth" --no_crop
```

The results will be saved in the `results/bonn_video_depth` folder. You can then run the corresponding code block in `depth_metric.ipynb` to evaluate the results.
To evaluate video depth on KITTI, execute:

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
    --pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
    --eval_dataset=kitti --output_dir="results/kitti_video_depth" --no_crop --flow_loss_weight 0 --translation_weight 1e-3
# the flow loss weight and translation weight are adjusted because flow
# prediction is poor and translations are large on KITTI; these updated
# hyperparameters should give better results: Abs Rel = 0.089, δ<1.25 = 91.11
```

The results will be saved in the `results/kitti_video_depth` folder. You can then run the corresponding code block in `depth_metric.ipynb` to evaluate the results.
To evaluate camera pose on Sintel, execute:

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
    --pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
    --eval_dataset=sintel --output_dir="results/sintel_pose"
# to use the ground-truth dynamic mask, add: --use_gt_mask
```

The evaluation results will be saved in `results/sintel_pose/_error_log.txt`.
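For reference, camera-pose accuracy on these benchmarks is typically summarized by the Absolute Trajectory Error (ATE): the estimated trajectory is aligned to the ground truth with a similarity transform (Umeyama alignment) and the RMSE of the residual positions is reported. A minimal sketch of that computation (the error log may additionally contain relative pose errors):

```python
import numpy as np

def ate_rmse(est, gt):
    """ATE RMSE between (N, 3) estimated and ground-truth camera positions,
    after Umeyama similarity (scale + rotation + translation) alignment."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    e, g = est - mu_e, gt - mu_g
    cov = g.T @ e / len(est)                      # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:  # guard against reflections
        S[2, 2] = -1.0
    R = U @ S @ Vt                                # optimal rotation
    scale = np.trace(np.diag(D) @ S) / (e ** 2).sum(axis=1).mean()
    aligned = scale * est @ R.T + (mu_g - scale * (R @ mu_e))
    return float(np.sqrt(((aligned - gt) ** 2).sum(axis=1).mean()))
```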
To evaluate camera pose on TUM-dynamics, execute:

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
    --pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
    --eval_dataset=tum --output_dir="results/tum_pose"
```

The evaluation results will be saved in `results/tum_pose/_error_log.txt`.
To evaluate camera pose on ScanNetv2, execute:

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_pose \
    --pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
    --eval_dataset=scannet --output_dir="results/scannet_pose"
```

The evaluation results will be saved in `results/scannet_pose/_error_log.txt`.
To evaluate depth on NYU-v2, execute:

```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node=1 --master_port=29604 launch.py --mode=eval_depth \
    --pretrained="checkpoints/MonST3R_PO-TA-S-W_ViTLarge_BaseDecoder_512_dpt.pth" \
    --eval_dataset=nyu --output_dir="results/nyuv2_depth" --no_crop
```

The results will be saved in the `results/nyuv2_depth` folder. You can then run the corresponding code block in `depth_metric.ipynb` to evaluate the results.