Real-time and accurate open-vocabulary end-to-end object detection
Updated Sep 6, 2024 (Python)
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
An open, modular framework for zero-shot, language conditioned pick-and-drop tasks in arbitrary homes.
(CVPR 2023) PLA: Language-Driven Open-Vocabulary 3D Scene Understanding & (CVPR2024) RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
[RSS2024] Official implementation of "Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation"
[ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)
[CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection
[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection
Official implementation of "Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer".
Recognize Any Regions
[IJCV 2024] MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
(NeurIPS2023) CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection
PyTorch implementation of ICML 2023 paper "SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation"
Open3DIS: Open-vocabulary 3D Instance Segmentation with 2D Mask Guidance (CVPR 2024)
Our OpenYOLO3D model achieves state-of-the-art performance in open-vocabulary 3D instance segmentation on the ScanNet200 and Replica datasets, with up to ∼16x speedup compared to the best existing method in the literature.
[CVPR 2024] Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships
[CVPR 2024] Physical Property Understanding from Language-Embedded Feature Fields
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023)