T-MoENet

This repository accompanies our paper "Temporal-guided Mixture-of-Experts for Zero-Shot Video Question Answering" (submitted to TCSVT).

Evaluation

The pre-trained models and processed test/val data have been uploaded to Google Drive.

To test zero-shot performance on the open-ended VideoQA datasets (e.g., MSRVTT-QA, MSVD-QA, TGIF-QA, and iVQA), change the relevant file paths in scripts/eval_oe.sh and then run it.

python inference_oe.py --dataset_path <dataset root>/test.csv \
--feat_path <dataset root>/clipvitl14.pth \
--vocab_path <dataset root>/vocab1000.json \
--model_path <pretrained checkpoint> \
--batch_size 12
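
Because the command above depends on four files being in place, it can help to check them before launching inference. The following is a minimal sketch, not part of the repository: the helper name `build_oe_command` is hypothetical, and it assumes the file names shown in the command (test.csv, clipvitl14.pth, vocab1000.json) sit directly under the dataset root.

```python
from pathlib import Path


def build_oe_command(dataset_root, checkpoint, batch_size=12):
    """Assemble the inference_oe.py arguments and list any missing input files.

    Hypothetical helper: it only mirrors the flags shown in the README command.
    """
    root = Path(dataset_root)
    inputs = {
        "--dataset_path": root / "test.csv",
        "--feat_path": root / "clipvitl14.pth",
        "--vocab_path": root / "vocab1000.json",
        "--model_path": Path(checkpoint),
    }
    # Collect every expected input that is not present on disk.
    missing = [str(path) for path in inputs.values() if not path.is_file()]

    command = ["python", "inference_oe.py"]
    for flag, path in inputs.items():
        command += [flag, str(path)]
    command += ["--batch_size", str(batch_size)]
    return command, missing
```

If `missing` comes back empty, the returned list can be passed to `subprocess.run(command, check=True)` from the repository root; otherwise the listed paths show which downloads from Google Drive still need to be placed.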

To test zero-shot performance on the multiple-choice VideoQA datasets (e.g., NExT-QA and STAR), change the relevant file paths in scripts/eval_mc.sh and then run it. Note that for STAR we use the test set test.csv, while for NExT-QA we use val.csv. When running inference on STAR, we recommend adding --save_result and specifying --save_dir in the script, then uploading the resulting file to eval.ai for evaluation.

python inference_mc.py --dataset_path <dataset root>/test.csv \
--feat_path <dataset root>/clipvitl14.pth \
--model_path <pretrained checkpoint> \
--save_result --save_dir <the directory to save the resulting file> \
--batch_size 12
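
Since the split file differs between the two multiple-choice datasets, a small wrapper can keep that choice in one place. This is a sketch under stated assumptions rather than repository code: `MC_SPLIT_FILES` and `build_mc_command` are hypothetical names, and the dataset keys are labels chosen here for illustration. Only the flags shown in the command above are used.

```python
from pathlib import Path

# Split file per dataset, as described in the text: STAR uses test.csv,
# NExT-QA uses val.csv. The dictionary keys are illustrative labels.
MC_SPLIT_FILES = {"STAR": "test.csv", "NExT-QA": "val.csv"}


def build_mc_command(dataset, dataset_root, checkpoint, save_dir=None, batch_size=12):
    """Assemble the inference_mc.py arguments for one multiple-choice dataset."""
    root = Path(dataset_root)
    command = [
        "python", "inference_mc.py",
        "--dataset_path", str(root / MC_SPLIT_FILES[dataset]),
        "--feat_path", str(root / "clipvitl14.pth"),
        "--model_path", str(checkpoint),
    ]
    # Saving results is optional; it is recommended for STAR so that the
    # output file can be uploaded to eval.ai.
    if save_dir is not None:
        command += ["--save_result", "--save_dir", str(save_dir)]
    command += ["--batch_size", str(batch_size)]
    return command
```

Calling `build_mc_command("STAR", <dataset root>, <pretrained checkpoint>, save_dir=<output directory>)` yields an argument list equivalent to the command above, while passing "NExT-QA" switches the dataset path to val.csv.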

Train

Coming soon.
