Original implementation of the paper Yilin Wen, Xiangyu Li, Hao Pan, Lei Yang, Zheng Wang, Taku Komura and Wenping Wang, "DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation", ECCV, 2022. [paper|supplementary]
The code is tested with the following environment:
- Ubuntu 16.04
- python 3.6 or 3.7
- tensorflow 1.15.0
- sonnet 1.23
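For reference, a minimal setup might look like the following sketch; the exact package names and the matching CUDA/cuDNN versions are assumptions to verify against your system (Sonnet 1.x is distributed on PyPI as `dm-sonnet`):

```
# A sketch of the tested environment; adjust CUDA/cuDNN to your GPU setup.
conda create -n disp6d python=3.7
conda activate disp6d
pip install tensorflow-gpu==1.15.0   # or tensorflow==1.15.0 for CPU-only
pip install dm-sonnet==1.23          # Sonnet 1.x; assumed pip package name
```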
Our pretrained checkpoint files for the different settings, together with other data needed to run the demo code for the inference stage, can be downloaded via the following link: [Inference Data]
which includes:
- `./ckpts/`: the pretrained ckpt files for Ours-per (Setting I) and Ours-all (Setting III), trained on the synthetic CAMERA dataset, and for Setting II, trained on synthetic images of the first 18 T-LESS objects.
- `./demo_data/`: demo test images and their 2D detection results from T-LESS and REAL275.
- `./embeddings/`: reference rotations for the inference stage of the three settings, with the pose codebook and 2D bounding boxes for all 30 T-LESS objects of Setting II.
- `./real275_curve/`: `.pkl` files for visualizing our average precisions at different rotation/translation error and 3D IoU thresholds on REAL275.
Keep the downloaded `ws` folder under the root directory of this git repository.
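A quick sanity check that the data landed where the demo scripts expect it (assuming the four subfolders listed above sit directly under `ws/`):

```
ls ws/ckpts ws/demo_data ws/embeddings ws/real275_curve
```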
Run:
```
python demo_real275.py --trained_category <test_category> --test_category <test_category> --demo_img <img_id>
```
to estimate the pose of objects of the specified category `<test_category>` on the specified demo image `<img_id>`. Note that in this setting, we use the model trained solely on the specified category.
Run with `--trained_category` set to `all`:
```
python demo_real275.py --trained_category all --test_category <test_category> --demo_img <img_id>
```
to estimate the pose of objects of the specified category `<test_category>` on the specified demo image `<img_id>`. Here we use the model trained jointly on all six categories covered by the CAMERA and REAL275 datasets.
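For concreteness, hypothetical invocations for both modes might look as follows; the category names are the six CAMERA/REAL275 categories, while the image id is an assumed example value:

```
# Hypothetical example values; REAL275 covers six categories:
# bottle, bowl, camera, can, laptop, mug. The image id format is an
# assumption -- check the files under ./demo_data/ for valid ids.
python demo_real275.py --trained_category laptop --test_category laptop --demo_img 0000
python demo_real275.py --trained_category all --test_category mug --demo_img 0000
```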
Run:
```
python draw_curves_real275.py
```
to plot the pose-evaluation curves for Ours-per and Ours-all on REAL275, with respect to the rotation/translation error and 3D IoU metrics.
Run:
```
python demo_tless.py --test_obj_id <test_obj_id>
```
to estimate the pose of the T-LESS object with the specified id `<test_obj_id>` on the demo image. Our model is trained only on the first 18 T-LESS objects.
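For example (the ids below are hypothetical picks; since only the first 18 objects are used for training, while embeddings cover all 30, higher ids exercise objects unseen during training):

```
# Object seen during training (ids 1-18):
python demo_tless.py --test_obj_id 5
# Object unseen during training (ids 19-30):
python demo_tless.py --test_obj_id 23
```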
Run `python train.py` with the appropriate command-line arguments to train a network on your own training data.
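The accepted arguments are defined in `train.py` itself; assuming a standard argparse interface, you can list them with:

```
python train.py --help
```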
We relied on code from AAE and StyleGAN for the autoencoder framework, and on Multipath-AAE for data processing and augmentation. We also adopted utilities from the SIXD Toolkit.
If you find this work helpful, please consider citing:
@inproceedings{wen2022disp6d,
  title={DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation},
  author={Wen, Yilin and Li, Xiangyu and Pan, Hao and Yang, Lei and Wang, Zheng and Komura, Taku and Wang, Wenping},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2022},
}