Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
causality
clip
svo
slip
vision-and-language
compositionality
flickr8k-dataset
image-text-matching
flickr30k
image-text-retrieval
winoground
blip2
Updated Aug 18, 2024 - Python