Problem with time stamp and audio size #124

titusfx · 2022-09-25T15:37:11Z


I have an audio file that is 9 seconds long, but Whisper returns a transcription whose timestamps run to 34 seconds.

The audio contains 4 phrases.
The audio is an mp3 file, and the result comes from the output txt file (json file).


jongwook (Maintainer) · 2022-09-26T12:17:55Z

We could add a post-processing step so that the timestamps are upper-bounded by the audio length. This wasn't my priority because the timestamps tend to become more accurate with the larger models and a prolonged last timestamp didn't hurt much practically.
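A minimal sketch of what such a post-processing step might look like, written outside of Whisper itself. It assumes the segments are dicts with `start` and `end` in seconds (as in the JSON output quoted in this thread) and that the caller already knows the audio duration; the function name `clamp_segments` is hypothetical and not part of the library.

```python
def clamp_segments(segments, audio_duration):
    """Return copies of the segments with start/end limited to [0, audio_duration].

    Assumption: each segment is a dict with "start" and "end" in seconds.
    Hypothetical helper, not part of Whisper.
    """
    clamped = []
    for seg in segments:
        fixed = dict(seg)  # copy so the original result is left untouched
        fixed["start"] = min(max(seg["start"], 0.0), audio_duration)
        # The end may not precede the start or exceed the audio length.
        fixed["end"] = min(max(seg["end"], fixed["start"]), audio_duration)
        clamped.append(fixed)
    return clamped


# Illustrative values: a 9-second file whose last segment runs to 34 seconds.
example = [
    {"start": 0.0, "end": 4.6},
    {"start": 4.6, "end": 34.0},
]
print(clamp_segments(example, audio_duration=9.0))
```

Clamping only bounds the values; it does not make the timestamps inside the audio any more accurate, and a segment that starts past the end of the file collapses to zero length.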

2 replies

kmk2018 Feb 1, 2023

Hello,
This problem can also occur within an utterance when Whisper breaks it into multiple phrases, so it is not limited to the final end time. For example, this is from an 11.09 s file with 4 phrases, where the 3rd and 4th phrases have the wrong timings:

["cand", [{"id": 0, "seek": 0, "start": 0.0, "end": 4.6000000000000005, "text": ...}, {"id": 1, "seek": 0, "start": 4.6000000000000005, "end": 8.64, "text": ...}, {"id": 2, "seek": 864, "start": 8.64, "end": 14.66, "text": ...}, {"id": 3, "seek": 864, "start": 14.66, "end": 34.36, "text": ...}]],

FYI, the input was a .wav file.
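To spot this kind of result automatically, one option is to flag every segment whose timestamps fall past the end of the file. The sketch below uses the timings quoted above (rounded, with the elided `text` fields left out) and the stated 11.09 s duration; `find_out_of_range` is a hypothetical helper name, not a Whisper function.

```python
def find_out_of_range(segments, audio_duration):
    """Return the ids of segments that start or end after the audio ends.

    Assumption: each segment is a dict with "id", "start" and "end",
    with times in seconds. Hypothetical helper, not part of Whisper.
    """
    return [
        seg["id"]
        for seg in segments
        if seg["start"] > audio_duration or seg["end"] > audio_duration
    ]


# Timings from the output quoted above; "text" is omitted because it was elided.
segments = [
    {"id": 0, "seek": 0, "start": 0.0, "end": 4.6},
    {"id": 1, "seek": 0, "start": 4.6, "end": 8.64},
    {"id": 2, "seek": 864, "start": 8.64, "end": 14.66},
    {"id": 3, "seek": 864, "start": 14.66, "end": 34.36},
]
print(find_out_of_range(segments, audio_duration=11.09))  # [2, 3]
```

Both flagged segments share `seek` 864, which matches the observation that the 3rd and 4th phrases are the ones with wrong timings.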

Pricer187 Sep 28, 2024

We could add a post-processing step so that the timestamps are upper-bounded by the audio length. This wasn't my priority because the timestamps tend to become more accurate with the larger models and a prolonged last timestamp didn't hurt much practically.
