Official repository of the paper "VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding"
chatbot
clip
image-encoder
video-encoder
multimodal
dual-encoder
vision-language
vicuna
gpt4
vision-language-pretraining
llava
video-conversation
video-chatbot
llama3
gpt4o
phi-3-mini
Updated Aug 11, 2024 - Python