This is the official repository for NVIDIA's Deep Object Pose Estimation, which performs detection and 6-DoF pose estimation of known objects from an RGB camera. For full details, see our CoRL 2018 paper and video.
This repository contains complete code for training, inference, numerical evaluation of results, and synthetic data generation using either NVISII or BlenderProc. We also provide a ROS1 Noetic package that performs inference on images from a USB camera.
Hardware-accelerated ROS2 inference can be done with the Isaac ROS DOPE project.
- train2 contains the original training code, used to generate the results in the CoRL paper. There have been some minor bug fixes, but this code will remain largely untouched in the future.
  - Similarly, the synthetic data generation code in data_generation/nvisii_data_gen/ was used for the paper, but depends on a rendering library that is no longer maintained.
- train contains new training code that is intended to be simpler and easier for users to understand and modify. This code will be maintained, and any new features will be added here.
  - The synthetic data generation code in data_generation/blenderproc is a replacement for the NVISII code, using a different rendering engine that is still actively maintained.
We have tested our standalone training, inference and evaluation scripts on Ubuntu 20.04 and 22.04 with Python 3.8+, using an NVIDIA Titan X, 2080Ti, and Titan RTX.
The ROS1 node has been tested with ROS Noetic using Python 3.10. The Isaac ROS2 DOPE node has been tested with ROS2 Foxy on Jetson AGX Xavier with JetPack 4.6, and on x86/Ubuntu 20.04 with an NVIDIA Titan X, 2080Ti, and Titan RTX.
We have trained and tested DOPE with two publicly available datasets: YCB and HOPE. The trained weights can be downloaded from Google Drive.
YCB models can be downloaded from the YCB website, or by using NVDU (see the nvdu_ycb command).
The HOPE dataset is a collection of RGBD images and video sequences with labeled 6-DoF poses for 28 toy grocery objects. The 3D models can be downloaded here. The folders are organized in the style of the YCB 3D models.
The physical objects can be purchased online (details and links to Amazon can be found in the HOPE repository README).
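For readers new to the term, a 6-DoF pose (as labeled in datasets such as HOPE) combines a 3D translation with a 3D rotation. The sketch below is illustrative only and is not taken from this repository's code: the function names, the (x, y, z, w) quaternion ordering, and the use of meters are all assumptions made for the example. It shows one common way to turn such a pose into a 4x4 homogeneous transform and apply it to a point on an object model.

```python
import math


def quaternion_to_rotation_matrix(qx, qy, qz, qw):
    """Convert a quaternion into a 3x3 rotation matrix (nested lists).

    Assumes (x, y, z, w) ordering; this is a convention chosen for the
    example, not a documented DOPE output format.
    """
    norm = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    if norm == 0.0:
        raise ValueError("quaternion must be non-zero")
    # Normalize so the result is a proper rotation.
    x, y, z, w = qx / norm, qy / norm, qz / norm, qw / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def pose_to_matrix(translation, quaternion):
    """Build a 4x4 homogeneous transform from a translation and a quaternion."""
    r = quaternion_to_rotation_matrix(*quaternion)
    tx, ty, tz = translation
    return [
        r[0] + [tx],
        r[1] + [ty],
        r[2] + [tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(matrix, point):
    """Map a 3D point from the object frame into the camera frame."""
    px, py, pz = point
    return tuple(
        row[0] * px + row[1] * py + row[2] * pz + row[3] for row in matrix[:3]
    )


# Hypothetical pose: object 0.5 (assumed meters) in front of the camera,
# rotated 90 degrees about the z axis.
half_angle = math.pi / 4
example_pose = pose_to_matrix(
    translation=(0.1, 0.2, 0.5),
    quaternion=(0.0, 0.0, math.sin(half_angle), math.cos(half_angle)),
)
camera_point = transform_point(example_pose, (1.0, 0.0, 0.0))
```

Three translation values plus three independent rotation parameters give the six degrees of freedom. The quaternion carries four numbers but is constrained to unit length, which is why the sketch normalizes it first.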
If you use this tool in a research project, please cite as follows:
@inproceedings{tremblay2018corl:dope,
author = {Jonathan Tremblay and Thang To and Balakumar Sundaralingam and Yu Xiang and Dieter Fox and Stan Birchfield},
title = {Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects},
booktitle = {Conference on Robot Learning (CoRL)},
url = "https://arxiv.org/abs/1809.10790",
year = 2018
}
Copyright (C) 2018-2024 NVIDIA Corporation. All rights reserved. This code is licensed under the NVIDIA Source Code License.
Thanks to Jeff Smith (jeffreys@nvidia.com) for help maintaining the repo and software. Thanks also to Martin Günther for his code contributions and fixes.
Jonathan Tremblay (jtremblay@nvidia.com), Stan Birchfield (sbirchfield@nvidia.com)