Comparative analysis of image captioning models using RNN, BiLSTM, and Transformer architectures on the Flickr8K dataset, with InceptionV3 for image feature extraction.
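A minimal sketch of the InceptionV3 feature-extraction step these captioning repos describe, assuming TensorFlow/Keras is installed. `weights=None` is used here only so the sketch runs without downloading weights; a real pipeline would pass `weights="imagenet"` and a real photo instead of the random array.

```python
# Sketch: encode an image into a single 2048-d feature vector with InceptionV3,
# the typical input to an RNN/BiLSTM/Transformer caption decoder.
import numpy as np
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input

# include_top=False drops the classifier head; pooling="avg" collapses the
# final convolutional feature maps into one 2048-d vector per image.
# Assumption: weights=None (random init) just to avoid a download in this sketch.
encoder = InceptionV3(weights=None, include_top=False, pooling="avg")

image = np.random.rand(1, 299, 299, 3).astype("float32") * 255.0  # stand-in photo
features = encoder.predict(preprocess_input(image), verbose=0)
print(features.shape)  # one 2048-d vector per input image
```

The caption decoder then conditions on this vector (e.g. as the initial RNN state or as a prepended token embedding) while generating words.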
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
Text-Image-Text is a bidirectional system that enables seamless retrieval of images based on text descriptions, and vice versa. It leverages state-of-the-art language and vision models to bridge the gap between textual and visual representations.
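The bidirectional retrieval described above can be sketched as nearest-neighbor search in a shared embedding space. This is a minimal illustration with random vectors standing in for real embeddings (e.g. from CLIP); the dimension 512 and the index size are arbitrary assumptions.

```python
# Sketch: text -> image retrieval by cosine similarity in a shared embedding
# space. Image -> text retrieval is the same computation with roles swapped.
import numpy as np

def cosine_sim(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

rng = np.random.default_rng(0)
image_embs = rng.normal(size=(5, 512))  # 5 indexed images (stand-in embeddings)
text_emb = rng.normal(size=(1, 512))    # one query caption (stand-in embedding)

scores = cosine_sim(text_emb, image_embs)  # shape (1, 5): query vs. each image
best_image = int(scores.argmax())          # index of the best-matching image
```

Because both modalities live in one space, the same similarity matrix serves both retrieval directions: rank columns for text-to-image, rank rows for image-to-text.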
Generate captions from images
Yet another im2txt (show and tell: A Neural Image Caption Generator)