Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.
A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, intended as a preliminary overview.
[AAAI2021] Code for “Similarity Reasoning and Filtration for Image-Text Matching”
Code for "Learning the Best Pooling Strategy for Visual Semantic Embedding", CVPR 2021 (Oral)
Offline semantic text-to-image and image-to-image search on Android, powered by a quantized, state-of-the-art vision-language pretrained CLIP model and the ONNX Runtime inference engine
Code for the journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.
A non-JIT reimplementation of OpenAI's CLIP in PyTorch
Unofficial, partial implementation of the paper "Improving description-based person re-identification by multi-granularity image-text alignment" by Niu et al.
A dead-simple image search and image-text matching system for Bangla using CLIP
CLIP (Contrastive Language–Image Pre-training) for Bangla.
[TIP2023] Code for “Plug-and-Play Regulators for Image-Text Matching”
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
[ICML 2024] Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
Image-Text Matching Model Zoo
Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)
The 3rd place solution code for the Wikipedia - Image/Caption Matching Competition on Kaggle
Easy wrapper for inserting LoRA layers in CLIP.
Implementation of the paper "SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval" (ACM TOMM 2024).
Implementation of the "Learn No to Say Yes Better" paper.
Cross-modal retrieval using Transformer Encoder Reasoning Networks (TERN), with metric learning and FAISS for fast similarity search on GPU.