Synchronization offset #14

Closed
skittlesvampir opened this issue Nov 10, 2023 · 4 comments

Comments

@skittlesvampir

Problem description: openai/whisper#1770 (comment)

I've uploaded the data at: https://ben.ist-toll.xyz/k/whisper-test-files/

@EtienneAb3d
Owner

EtienneAb3d commented Nov 10, 2023

@skittlesvampir
Bug fixed: when the accurate text was also in SRT format, both sets of timestamps ended up in the output.
🙃

@skittlesvampir
Author

Oh my god, now it works!! Thank you so much.

Just two small details:

  1. The Whisper subtitles are often quite long compared to the original subtitles, which are broken into smaller pieces. (Screenshot 1) Is there a way to get shorter segments?
  2. Sometimes, lots of consecutive subtitles are combined into a single subtitle (Screenshot 2), because Whisper didn't detect anything for a while. Would it be possible to approximate the subtitles in between using frame rates or something like that?

Screenshot 1 (image: mpv-shot0002)

Screenshot 2 (image: mpv-shot0001)

@EtienneAb3d
Owner

I think it would be very hard to do a good job of guessing timestamps by interpolation. The in-between text could be spoken partly fast and partly slow, and may include stretches with no speech at all.
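For illustration only, here is a minimal sketch of what such a naive interpolation might look like. It is not part of this tool: the function name `interpolate_cues` and the choice to weight each subtitle by its character count are assumptions made for the example. It spreads the time span of one merged subtitle across the original texts, which is exactly the approach that breaks down when the speech rate varies or when there are silent gaps.

```python
def interpolate_cues(start, end, texts):
    """Split [start, end] (seconds) across texts, proportionally to text length.

    Hypothetical helper: assumes a constant speaking rate and no pauses,
    which real dialogue rarely satisfies.
    """
    # Weight each subtitle by its character count (at least 1 to avoid zero-length cues).
    weights = [max(len(t.strip()), 1) for t in texts]
    total = sum(weights)
    span = end - start

    cues = []
    cursor = start
    consumed = 0
    for text, weight in zip(texts, weights):
        consumed += weight
        # Compute each boundary from the running total so the last cue ends exactly at `end`.
        cue_end = start + span * consumed / total
        cues.append((cursor, cue_end, text))
        cursor = cue_end
    return cues


if __name__ == "__main__":
    for cue in interpolate_cues(10.0, 20.0, ["ab", "abcdef", "ab"]):
        print(cue)
```

A short line followed by a long pause would be given far too little screen time by this scheme, so the result can only be a rough approximation.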

For point 2: the real solution is to improve the Whisper recognition. This can be obtained with WhisperHallu.
https://github.com/EtienneAb3d/WhisperHallu

For both points 1 and 2: I'm currently working on a solution that uses word-level timestamps and some complementary pre-/post-processing around WhisperHallu. I don't plan to release it fully open-source. We can discuss it if you have a budget.
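As a rough illustration of how word-level timestamps can help with point 1, the sketch below regroups timed words into shorter cues. This is not the unreleased solution mentioned above. The function names and the 42-character limit are hypothetical, and the input is assumed to be a list of dicts with `word`, `start`, and `end` keys, similar to what Whisper returns per segment when word timestamps are enabled.

```python
def _make_cue(words):
    """Build a (start, end, text) tuple from a non-empty list of timed words."""
    text = " ".join(w["word"] for w in words)
    return (words[0]["start"], words[-1]["end"], text)


def split_segment(words, max_chars=42):
    """Greedily regroup timed words into cues of at most max_chars characters.

    Hypothetical helper: `words` is assumed to be a list of dicts with
    'word', 'start' and 'end' keys. A single word longer than max_chars
    still gets its own cue.
    """
    cues = []
    current = []
    length = 0
    for w in words:
        token = w["word"].strip()
        if not token:
            continue
        # Account for the joining space when the cue already has words.
        added = len(token) + (1 if current else 0)
        if current and length + added > max_chars:
            cues.append(_make_cue(current))
            current = []
            length = 0
            added = len(token)
        current.append({"word": token, "start": w["start"], "end": w["end"]})
        length += added
    if current:
        cues.append(_make_cue(current))
    return cues


if __name__ == "__main__":
    demo = [
        {"word": " Hello", "start": 0.0, "end": 0.4},
        {"word": " there", "start": 0.4, "end": 0.8},
        {"word": " general", "start": 0.9, "end": 1.5},
        {"word": " Kenobi", "start": 1.5, "end": 2.1},
    ]
    for cue in split_segment(demo, max_chars=12):
        print(cue)
```

Splitting purely on length ignores sentence boundaries and pauses, so a real solution would likely also look at punctuation and at the gaps between words.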

@skittlesvampir
Author

I will check WhisperHallu out, it seems cool.

Unfortunately, I don't have a budget, I'm just synchronizing my own shows so I can understand them better.

Anyways, I think the errors are acceptable, so thank you for your work! I wish your business much success in the future!


2 participants