video-description

Here are 12 public repositories matching this topic...

jssprz / video_captioning_datasets

Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*

review video-captioning state-of-the-art vision-and-language charades video-to-text msvd video-dataset video-description activitynet-captions trecvid tgif-dataset msr-vtt vatex

Updated Oct 27, 2023
Jupyter Notebook

willyfh / awesome-video-text-datasets

Star

A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.

dataset video-captioning video-to-text video-retrieval video-description vision-language video-text video-language

Updated Feb 18, 2024

jssprz / visual_syntactic_embedding_video_captioning

Star

Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*

deep-learning representation-learning pos-tagging encoder-decoder video-captioning video-to-text msvd video-description msr-vtt wacv2021 syntactic-representations

Updated Apr 16, 2021
Python

dialogtekgeek / AVSD-DSTC10_Official

Star

Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)

qa dialog audio-visual video-description scene-aware

Updated Aug 19, 2022

jssprz / attentive_specialized_network_video_captioning

Star

Source code of the paper titled *Attentive Visual Semantic Specialized Network for Video Captioning*

deep-learning video-captioning video-to-text msvd video-description msr-vtt icpr2020

Updated Apr 6, 2021
Python

AmrHendy / video-content-description

Star

Video content description technique for generating descriptions for unconstrained videos.

deep-learning video-processing feature-extraction encoder-decoder video-captioning video-tagging attention-model visual-deep-learning video-description

Updated Aug 14, 2019
Jupyter Notebook

OwenEdwards / videojs-speak-descriptions-track

Star

A Video.js 7 middleware that uses browser speech synthesis to speak descriptions contained in a description text track

middleware text-to-speech video accessibility speech-synthesis videojs video-description

Updated Aug 10, 2021
JavaScript

willyfh / msvd-indonesian

Star

MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian (Bahasa Indonesia).

deep-learning neural-network bahasa-indonesia video-captioning msvd video-retrieval video-description multimodal-dataset video-text indonesian-dataset msvd-indonesian

Updated Aug 4, 2023

AmrHendy / multimedia_question_answering

Star

A simple attention deep learning model to answer questions about a given video with the most relevant video intervals as answers.

deep-learning cnn video-processing feature-extraction attention-model glove-embeddings attention-seq2seq multimedia-retrieval visual-deep-learning video-description video-question-answering

Updated Jul 6, 2019
Python

Ayeshaaaaaaaaa / Video-Description-and-Summarization-Using-BLIP-and-BART-Models

Star

This project processes videos by extracting frames, generating detailed visual descriptions for each frame using the BLIP model, and then summarizing these descriptions with the BART model.

bart blip video-description