A detection/segmentation dataset with class names characterized by intricate and flexible expressions
This repo is the toolbox for $D^3$.
[Doc 📚]
[Paper (DOD) 📄]
[Paper (GRES) 📄]
[Awesome-DOD 🕶️]
In the Description Detection Dataset ($D^3$), class names are flexible descriptions such as `a dog not being held by a leash`.
For each image in the dataset, any object that matches the description is annotated.
The dataset provides annotations such as bounding boxes and finely crafted instance masks.
We believe it will contribute to the computer vision and vision-language communities.
- [02/14/2024] Evaluation results for several SOTA methods (SPHINX (the first MLLM evaluated!), G-DINO, UNINEXT, etc.) have been released, together with a leaderboard for $D^3$. 🔥🔥
- [10/12/2023] We released an awesome-described-object-detection list to collect and track related works.
- [09/22/2023] Our DOD paper was accepted by NeurIPS 2023! 🔥
- [07/25/2023] This toolkit is now available on PyPI. You can install it with `pip install ddd-dataset`.
- [07/25/2023] The paper preprint introducing the DOD task and the $D^3$ dataset is available on arXiv. Check it out!
- [07/18/2023] We have released our Description Detection Dataset ($D^3$) and the first version of the $D^3$ toolbox. You can download it now for your project.
- [07/14/2023] Our GRES paper has been accepted by ICCV 2023.
For more information on the characteristics of this dataset, please refer to our paper.
Currently we host the $D^3$ dataset for download.
After downloading `d3_images.zip` (images in the dataset), `d3_pkl.zip` (dataset information for this toolkit) and `d3_json.zip` (annotations for evaluation), extract these three zip files to your own `IMG_ROOT`, `PKL_ANNO_PATH` and `JSON_ANNO_PATH` directories, respectively. These paths are used when you perform inference or evaluation on this dataset.
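The extraction step can be scripted. The following is a minimal sketch using Python's standard `zipfile` module. It assumes the three archives sit in a single download directory. The helper `extract_d3_archives` and every path value are hypothetical placeholders, not part of the toolkit.

```python
import zipfile
from pathlib import Path


def extract_d3_archives(download_dir, targets):
    """Extract each archive named in `targets` into its target directory.

    `targets` maps an archive file name to the directory it should be
    extracted into. Returns the names of archives that were not found.
    """
    download_dir = Path(download_dir)
    missing = []
    for archive_name, target_dir in targets.items():
        archive_path = download_dir / archive_name
        if not archive_path.is_file():
            missing.append(archive_name)
            continue
        Path(target_dir).mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(target_dir)
    return missing


# Hypothetical locations; replace them with your own paths.
IMG_ROOT = "data/d3/images"      # directory for d3_images.zip
PKL_ANNO_PATH = "data/d3/pkl"    # directory for d3_pkl.zip
JSON_ANNO_PATH = "data/d3/json"  # directory for d3_json.zip

if __name__ == "__main__":
    not_found = extract_d3_archives(
        "downloads",  # hypothetical download directory
        {
            "d3_images.zip": IMG_ROOT,
            "d3_pkl.zip": PKL_ANNO_PATH,
            "d3_json.zip": JSON_ANNO_PATH,
        },
    )
    if not_found:
        print("Archives not found:", ", ".join(not_found))
```

Whether each archive unpacks into a nested folder is not covered here, so check the resulting layout before passing the directories to the toolkit.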
This toolkit requires a few Python packages such as `numpy` and `pycocotools`. Other packages such as `matplotlib` and `opencv-python` may also be required if you want to use the visualization scripts.
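To see which of these packages are already present in your environment, a small check like the one below may help. It is only a sketch built on the standard library's `importlib.util.find_spec`. The helper name `find_missing` is hypothetical, and the only mapping that differs between pip name and import name is `opencv-python`, which is imported as `cv2`.

```python
import importlib.util

# Maps the pip package name to the module name used in `import` statements.
REQUIRED = {"numpy": "numpy", "pycocotools": "pycocotools"}
OPTIONAL = {"matplotlib": "matplotlib", "opencv-python": "cv2"}


def find_missing(packages):
    """Return the pip names of packages whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in packages.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    print("Missing required packages:", find_missing(REQUIRED) or "none")
    print("Missing visualization packages:", find_missing(OPTIONAL) or "none")
```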
There are multiple ways to install the $D^3$ toolbox.

Install with pip:

```bash
pip install ddd-dataset
```

Install from source:

```bash
git clone https://github.com/shikras/d-cube.git
# option 1: install it as a python package
cd d-cube
python -m pip install .
# done
# option 2: just put the d-cube/d_cube directory in the root directory of your local repository
```
Please refer to the documentation 📚 for more details. Our toolbox is similar to cocoapi in style.
Here is a quick example of how to use $D^3$:

```python
from d_cube import D3

# IMG_ROOT and PKL_ANNO_PATH are the directories holding the extracted
# d3_images.zip and d3_pkl.zip contents
d3 = D3(IMG_ROOT, PKL_ANNO_PATH)
all_img_ids = d3.get_img_ids()  # get the image ids in the dataset
all_img_info = d3.load_imgs(all_img_ids)  # load image info by passing a list of image ids
img_path = all_img_info[0]["file_name"]  # obtain one image path so you can load the image and run inference
```
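Building on the calls above, the sketch below turns the loaded image records into full paths. It assumes, without confirmation from the toolkit documentation, that each `file_name` is stored relative to `IMG_ROOT`. `resolve_image_paths` and `demo` are hypothetical helpers, not part of `d_cube`.

```python
import os


def resolve_image_paths(img_infos, img_root):
    """Join each record's "file_name" onto img_root.

    Assumption: "file_name" is stored relative to img_root.
    """
    return [os.path.join(img_root, info["file_name"]) for info in img_infos]


def demo(img_root, pkl_anno_path, limit=5):
    """Print the first few image paths; requires the d_cube package and data."""
    from d_cube import D3  # imported lazily so the helper above works without it

    d3 = D3(img_root, pkl_anno_path)
    img_ids = list(d3.get_img_ids())[:limit]
    for path in resolve_image_paths(d3.load_imgs(img_ids), img_root):
        status = "found" if os.path.isfile(path) else "missing"
        print(f"{path}: {status}")
```

Calling `demo(IMG_ROOT, PKL_ANNO_PATH)` after extracting the dataset gives a quick check that the directories are wired up as expected. If every path is reported as missing, the relative-path assumption probably does not hold for your layout.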
Some frequently asked questions are answered in this Q&A file.
If you use our $D^3$ dataset or this toolbox, please cite:
@inproceedings{xie2023DOD,
title={Described Object Detection: Liberating Object Detection with Flexible Expressions},
author={Xie, Chi and Zhang, Zhao and Wu, Yixuan and Zhu, Feng and Zhao, Rui and Liang, Shuang},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS)},
year={2023}
}
@inproceedings{wu2023gres,
title={Advancing Referring Expression Segmentation Beyond Single Image},
author={Wu, Yixuan and Zhang, Zhao and Xie, Chi and Zhu, Feng and Zhao, Rui},
booktitle={International Conference on Computer Vision (ICCV)},
year={2023}
}
More works related to Described Object Detection are tracked in this list: awesome-described-object-detection.