Repositories list
14 repositories
anetqa (Public template)
rosita (Public)
  ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
- Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".
bst (Public)
xmchat (Public)
- A PyTorch reimplementation of bottom-up-attention models
- A lightweight, scalable, and general framework for visual question answering research
mcan-vqa (Public)
  Deep Modular Co-Attention Networks for Visual Question Answering
activitynet-qa (Public)
mmnas (Public)
mt-captioning (Public)
  A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning