
Transcript talks #452

Open
cbldev opened this issue Nov 25, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@cbldev

cbldev commented Nov 25, 2024

I discovered yt2doc, which transcribes online videos and audio into readable Markdown documents.

I tried it on this talk, which has already been transcribed: https://www.rubyvideo.dev/talks/leveling-up-developer-tooling-for-the-modern-rails-hotwire-era-ruby-turkiye-meetup

  1. I found the YouTube URL: https://www.youtube.com/watch?v=g2bVdaO8s7s
  2. I transcribed it with a Docker setup and an Ollama endpoint:
docker run --network="host" --mount type=bind,source=/home/debian/yt2doc,target=/app ghcr.io/shun-liang/yt2doc --video https://www.youtube.com/watch?v=g2bVdaO8s7s --timestamp-paragraphs --add-table-of-contents --llm-server http://host.docker.internal:11434/v1 --llm-model qwen2.5:14b -o .
  3. The table of contents (--add-table-of-contents) didn't work, but the result is pretty nice IMO: https://gist.github.com/cbldev/143ad5b9fd4d750436d1b244a85d3490
@adrienpoly
Owner

Thanks @cbldev, that sounds promising. Currently the quality of the transcripts is not great; the raw version we get from YouTube is crap. We try to improve it with OpenAI, and while the results are better, they are sometimes incomplete and the timings are often wrong. The video you used wasn't improved; it is the raw transcript from YouTube.

The output from yt2doc looks much better. It is not perfect (some terms are incorrect), but overall it seems far superior to what we have.

Is this something easy to install locally? Can we run it locally and then seed the results in prod?
Is it possible to give some context to the transcription engine so that we can help it with the technical terms that could be used in the talk?

@cbldev
Author

cbldev commented Nov 26, 2024

> Is this something easy to install locally?

Yes!

I already had Ollama on my host with qwen2.5:14b pulled.

I followed the Run in Docker section of their README:

docker pull ghcr.io/shun-liang/yt2doc

docker run --network="host" --mount type=bind,source=<directory-on-host>,target=/app ghcr.io/shun-liang/yt2doc --video <video-url> --timestamp-paragraphs --llm-server http://host.docker.internal:11434/v1 --llm-model <llm-model> -o .

A few minutes later I had the transcript in a Markdown file, and that's all!

> Can we run it locally and then seed the results in prod?

Yes. With a customized output file (-o some_dir/transcription.md), it would be easy to script this and push the transcript to an API endpoint, for example.
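A minimal sketch of that "run locally, seed in prod" idea: once yt2doc has written the Markdown file, a small script could POST it to an ingestion endpoint. Note the endpoint URL and JSON field names below are assumptions for illustration, not part of rubyvideo's actual API.

```python
# Hypothetical sketch: push a yt2doc Markdown transcript to a prod API.
# The endpoint URL and payload field names are assumptions, not a real API.
import json
import urllib.request


def build_upload_request(endpoint: str, talk_slug: str, markdown_path: str) -> urllib.request.Request:
    """Wrap the Markdown transcript in a JSON POST request."""
    with open(markdown_path, encoding="utf-8") as f:
        transcript = f.read()
    payload = json.dumps({"talk_slug": talk_slug, "transcript": transcript}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (network call left commented out; run only against a real endpoint):
# req = build_upload_request(
#     "https://example.com/api/transcripts",
#     "leveling-up-developer-tooling-for-the-modern-rails-hotwire-era",
#     "some_dir/transcription.md",
# )
# urllib.request.urlopen(req)
```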

> Is it possible to give some context to the transcription engine so that we can help it with the technical terms that could be used in the talk?

Good question. A first step might be changing the Whisper configuration. Another idea would be to fork the project, customize the system prompt, and maybe try it with a code-specialized LLM like qwen2.5-coder.
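On the Whisper configuration idea: Whisper models accept an "initial prompt" that biases decoding toward expected vocabulary, so if yt2doc's Whisper setup exposes it, a per-talk glossary could be injected there. A hedged sketch (build_glossary_prompt is a hypothetical helper, not a yt2doc API; the faster-whisper call is left commented out because it downloads a large model):

```python
# Hypothetical sketch: prime Whisper with talk-specific technical terms.
# build_glossary_prompt is an illustrative helper, not part of yt2doc.

def build_glossary_prompt(terms: list[str]) -> str:
    """Join technical terms into a short priming prompt for Whisper."""
    return "Technical terms that may appear: " + ", ".join(terms) + "."


prompt = build_glossary_prompt(["Hotwire", "Turbo Streams", "Stimulus", "Rails"])

# faster-whisper's transcribe() does take an initial_prompt parameter,
# but loading a model is heavy, so the call is left commented out:
# from faster_whisper import WhisperModel
# model = WhisperModel("base")
# segments, info = model.transcribe("talk.wav", initial_prompt=prompt)
```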

@marcoroth marcoroth added the enhancement New feature or request label Dec 10, 2024