This repository contains the PyTorch implementation of the paper "Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation" (arXiv). Our approach recovers the 6D pose and size of unseen objects from an RGB-D image and reconstructs their complete 3D models.
- Python 3.6
- PyTorch 1.0.1
- CUDA 9.0
ROOT=/path/to/object-deformnet
cd $ROOT/lib/nn_distance
python setup.py install --user
Download camera_train, camera_val, real_train, real_test, ground-truth annotations, and mesh models provided by NOCS.
Unzip and organize these files in $ROOT/data as follows:
data
├── CAMERA
│   ├── train
│   └── val
├── Real
│   ├── train
│   └── test
├── gts
│   ├── val
│   └── real_test
└── obj_models
    ├── train
    ├── val
    ├── real_train
    └── real_test
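Before running the preprocessing scripts, it can help to confirm that the layout matches the tree above. The following is a minimal sketch and not part of this repository; the helper name `missing_data_dirs` is hypothetical, and it checks only that the listed directories exist, not their contents.

```python
from pathlib import Path

# Sub-directories taken from the tree above; nothing beyond it is assumed.
EXPECTED_DIRS = [
    "CAMERA/train", "CAMERA/val",
    "Real/train", "Real/test",
    "gts/val", "gts/real_test",
    "obj_models/train", "obj_models/val",
    "obj_models/real_train", "obj_models/real_test",
]


def missing_data_dirs(data_root):
    """Return the expected sub-directories that do not exist under data_root.

    Hypothetical helper for illustration; it is not provided by the repo.
    """
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_data_dirs("/path/to/object-deformnet/data")` returns an empty list when every directory is in place, and otherwise the relative paths that still need to be created or unzipped.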
Run the Python scripts to prepare the datasets.
cd $ROOT/preprocess
python shape_data.py
python pose_data.py
Note that running the scripts additionally shifts and re-scales the models of the mug category (without modifying the original files) so that the origin of the object coordinate frame lies on the axis of symmetry. This step was implemented for one of our early experiments and turned out to be unnecessary. Skipping it should make no difference to the performance of our approach. We keep it in this repo for reproducibility.
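To illustrate what a shift and re-scale of this kind does, here is a minimal sketch. It is not the code used by the preprocessing scripts: the function name `shift_and_rescale`, the choice of `offset`, and the normalization to a unit bounding-box diagonal are all assumptions made for the example.

```python
import math


def shift_and_rescale(vertices, offset):
    """Translate vertices by -offset, then scale the tight bounding-box diagonal to 1.

    Hypothetical illustration only. `offset` is assumed to be a point on the
    symmetry axis, so that the origin lies on that axis after the shift.
    The input list is left unmodified; a new list is returned.
    """
    # Move the chosen point to the origin.
    shifted = [tuple(c - o for c, o in zip(v, offset)) for v in vertices]

    # Tight axis-aligned bounding box of the shifted model.
    mins = [min(v[i] for v in shifted) for i in range(3)]
    maxs = [max(v[i] for v in shifted) for i in range(3)]
    diagonal = math.sqrt(sum((hi - lo) ** 2 for lo, hi in zip(mins, maxs)))
    if diagonal == 0:
        raise ValueError("degenerate model: all vertices coincide")

    # Uniform scaling keeps the origin fixed, so it stays on the axis.
    return [tuple(c / diagonal for c in v) for v in shifted]
```

Because the scaling is uniform and applied after the shift, the origin remains where the shift placed it, which is the property the paragraph above describes.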
# optional - train an Autoencoder from scratch and prepare the shape priors
python train_ae.py
python mean_shape.py
# train DeformNet
python train_deform.py
Download the pre-trained models, segmentation results from Mask R-CNN, and predictions of NOCS from here.
unzip -q deformnet_eval.zip
mv deformnet_eval/* $ROOT/results
rmdir deformnet_eval
cd $ROOT
python evaluate.py
If you find our work helpful, please consider citing:
@InProceedings{Tian_2020_ECCV,
author = {Tian, Meng and Ang Jr, Marcelo H and Lee, Gim Hee},
title = {Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
month = {August},
year = {2020}
}