
Transcript talks #452

Open
cbldev opened this issue Nov 25, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@cbldev

cbldev commented Nov 25, 2024

I discovered yt2doc, which transcribes online videos and audio into readable Markdown documents.

I tried it on this talk, which has already been transcribed: https://www.rubyvideo.dev/talks/leveling-up-developer-tooling-for-the-modern-rails-hotwire-era-ruby-turkiye-meetup

  1. I found the YouTube URL: https://www.youtube.com/watch?v=g2bVdaO8s7s
  2. I transcribed it with a Docker setup and an Ollama endpoint:
docker run --network="host" --mount type=bind,source=/home/debian/yt2doc,target=/app ghcr.io/shun-liang/yt2doc --video https://www.youtube.com/watch?v=g2bVdaO8s7s --timestamp-paragraphs --add-table-of-contents --llm-server http://host.docker.internal:11434/v1 --llm-model qwen2.5:14b -o .
  3. The table of contents (--add-table-of-contents) didn't work, but the result is pretty nice IMO: https://gist.github.com/cbldev/143ad5b9fd4d750436d1b244a85d3490
@adrienpoly
Owner

Thanks @cbldev, that sounds promising. Currently the quality of the transcripts is not great; the raw version we get from YouTube is crap. We try to improve it with OpenAI, and while the results are better, they are sometimes incomplete and the timings are often wrong. The video you used wasn't improved; it is the raw transcript from YouTube.

The output from yt2doc looks much better. It is not perfect (some terms are incorrect), but overall it seems far superior to what we have.

Is this something easy to install locally? Can we run it locally and then seed the results in prod?
Is it possible to give some context to the transcription engine so that we can help it with the technical terms that could be used in the talk?

@cbldev
Author

cbldev commented Nov 26, 2024

> Is this something easy to install locally?

Yes!

I already had Ollama on my host with qwen2.5:14b pulled.

I followed the Run in Docker section of their README:

docker pull ghcr.io/shun-liang/yt2doc

docker run --network="host" --mount type=bind,source=<directory-on-host>,target=/app ghcr.io/shun-liang/yt2doc --video <video-url> --timestamp-paragraphs --llm-server http://host.docker.internal:11434/v1 --llm-model <llm-model> -o .

A few minutes later I had the transcript in a Markdown file, and that's all!

> Can we run it locally and then seed the results in prod?

Yes. With a customized output file (-o some_dir/transcription.md), it would be easy to script this and push the transcript to an API endpoint, for example.
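A minimal sketch of that "run locally, seed in prod" idea: once yt2doc has written the Markdown file, a small script could POST it to an ingestion endpoint. Note the endpoint URL and JSON field names below are assumptions for illustration, not part of rubyvideo's actual API.

```python
# Hypothetical sketch: push a yt2doc Markdown transcript to a prod API.
# The endpoint URL and payload field names are assumptions, not a real API.
import json
import urllib.request


def build_upload_request(endpoint: str, talk_slug: str, markdown_path: str) -> urllib.request.Request:
    """Wrap the Markdown transcript in a JSON POST request."""
    with open(markdown_path, encoding="utf-8") as f:
        transcript = f.read()
    payload = json.dumps({"talk_slug": talk_slug, "transcript": transcript}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (network call left commented out; run only against a real endpoint):
# req = build_upload_request(
#     "https://example.com/api/transcripts",
#     "leveling-up-developer-tooling-for-the-modern-rails-hotwire-era",
#     "some_dir/transcription.md",
# )
# urllib.request.urlopen(req)
```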

> Is it possible to give some context to the transcription engine so that we can help it with the technical terms that could be used in the talk?

Good question. A first step might be changing the Whisper configuration. Another idea would be to fork the project, customize the system prompt, and maybe try it with a code-specialized LLM like qwen2.5-coder.
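On the Whisper configuration idea: Whisper models accept an "initial prompt" that biases decoding toward expected vocabulary, so if yt2doc's Whisper setup exposes it, a per-talk glossary could be injected there. A hedged sketch (build_glossary_prompt is a hypothetical helper, not a yt2doc API; the faster-whisper call is left commented out because it downloads a large model):

```python
# Hypothetical sketch: prime Whisper with talk-specific technical terms.
# build_glossary_prompt is an illustrative helper, not part of yt2doc.

def build_glossary_prompt(terms: list[str]) -> str:
    """Join technical terms into a short priming prompt for Whisper."""
    return "Technical terms that may appear: " + ", ".join(terms) + "."


prompt = build_glossary_prompt(["Hotwire", "Turbo Streams", "Stimulus", "Rails"])

# faster-whisper's transcribe() does take an initial_prompt parameter,
# but loading a model is heavy, so the call is left commented out:
# from faster_whisper import WhisperModel
# model = WhisperModel("base")
# segments, info = model.transcribe("talk.wav", initial_prompt=prompt)
```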

@marcoroth marcoroth added the enhancement New feature or request label Dec 10, 2024