- [2024.9.1] 📈 Adding new datasets and supervision types.
- [2024.7.20] 👀 Added Accelerate and batched-inference support for faster generation.
- [2024.7.9] 🚀 Codebase released!
- [2024.7.9] 💫 VSTaR-1M released!
- [2024.7.9] 📄 arXiv preprint released.
- [2024.6.20] 🤗 Hugging Face demo released; you are welcome to explore the VSTaR-1M dataset.
- [2024.6.17] 🔥 README released.
Video-STaR can adapt LVLMs to diverse downstream tasks and datasets.
Models trained with Video-STaR show improved performance on visual understanding benchmarks such as Temporal Compass.
- Python >= 3.10
- PyTorch == 2.0.1
- CUDA >= 11.7
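A minimal sketch (assuming `torch` is already installed) to confirm your environment matches the requirements above:

```python
# Quick environment check against the requirements listed above.
# This is an optional sanity check, not part of the Video-STaR codebase.
import sys

import torch

assert sys.version_info >= (3, 10), "Python >= 3.10 is required"
assert torch.__version__.startswith("2.0.1"), f"Expected PyTorch 2.0.1, got {torch.__version__}"
assert torch.cuda.is_available(), "CUDA is not available"
print("CUDA version:", torch.version.cuda)  # should report >= 11.7
```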
- Install required packages:
git clone https://github.com/orrzohar/Video-STaR
cd Video-STaR
conda create -n videostar python=3.10 -y
conda activate videostar
pip install --upgrade pip # enable PEP 660 support
pip install -e .
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
pip install decord opencv-python git+https://github.com/facebookresearch/pytorchvideo.git@28fe037d212663c6a24f373b94cc5d478c8c1a1d
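After installation, a quick sanity check (a minimal sketch, assuming the commands above completed without errors) is to confirm the key dependencies import cleanly:

```python
# Optional post-install check: verifies imports only, not training or inference.
import cv2          # installed as opencv-python
import decord       # video decoding backend
import flash_attn   # requires a compatible GPU/CUDA setup
import pytorchvideo
import torch

print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)
print("decord:", decord.__version__)
print("flash-attn:", flash_attn.__version__)
print("pytorchvideo:", pytorchvideo.__version__)
```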
- LLaVA: the codebase we built upon.
- Video-ChatGPT: contributed the evaluation benchmark and the VI-100K dataset.
- Video-LLaVA: base model.
- LLaMA-VID: base model.
- More coming soon...
- Video-Agent
- The majority of this project is released under the Apache 2.0 license as found in the LICENSE file.
- The service is a research preview intended for non-commercial use only, subject to the model license of LLaMA, the Terms of Use of the data generated by OpenAI, and the Privacy Practices of ShareGPT. Please contact us if you find any potential violation.
If you find our paper and code useful in your research, please consider giving us a star ⭐ and a citation 📝.
@article{zohar2024videostar,
    title   = {Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision},
    author  = {Zohar, Orr and Wang, Xiaohan and Bitton, Yonatan and Szpektor, Idan and Yeung-Levy, Serena},
    journal = {arXiv preprint arXiv:2407.06189},
    year    = {2024},
}