Emu Series: Generative Multimodal Models from BAAI
A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, for preliminary insight.
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Pre-training Dataset and Benchmarks
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)