Replies: 7 comments
-
I see that the "Hello, welcome to my lecture." initial prompt you're using is from their example. Perhaps it is confusing the model because it's not related to the content? Maybe try a more general initial prompt like, "This is a transcript of a video, and it may cover a variety of topics." I'm totally guessing though. Also I wonder if the extra space at the beginning of your prompt string is affecting it.
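On the leading-space point: it's easy to rule out by normalizing the string before passing it in. I'm not sure whether the CLI strips whitespace itself, so this is just a quick sanity check (the prompt string is the one from the command in this thread):

```python
# The prompt as passed on the command line, with its leading space.
prompt = " Hello, welcome to my lecture."

# Normalize before handing it to --initial_prompt / transcribe(initial_prompt=...)
# so a stray space can't change how the prompt is tokenized.
cleaned = prompt.strip()

print(repr(cleaned))  # 'Hello, welcome to my lecture.'
```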
-
Same issue.
-
This keeps happening to me too. Whole chunks of audio, 10–15 seconds from the first window, go missing. If I eliminate the initial_prompt, then the whole audio is transcribed properly :(
-
If you want any answers, you should share an audio sample that exhibits the issue and the command to reproduce it.
-
I also get this issue: when I use initial_prompt, some sentences go missing. If I do not use initial_prompt, the audio is transcribed properly and no sentences are missed.
-
My command is:
whisper 0.flac --model small.en --word_timestamps True --initial_prompt " Hello, welcome to my lecture."
The result is:
python3.10/site-packages/whisper/timing.py:58: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def backtrace(trace: np.ndarray):
python3.10/site-packages/whisper/transcribe.py:114: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
[00:24.740 --> 00:28.700] The Jerogan Experience.
[01:00.180 --> 01:05.420] Maybe he's annoyed with them. Maybe the maybe the kayaks are fucking up there fishing. Look at that.
[01:05.420 --> 01:11.940] Bro. That shit just broke your back. California Beach. Oh, yeah, easily could snap your legs in half.
[01:12.580 --> 01:17.700] Easily could snap your neck. But they don't eat meat, right? No. So yeah. But I mean just the power alone.
[01:17.860 --> 01:22.400] How does it know what it's doing? It's not gentle. I mean, hopefully you can you still got hoping to spit you up.
[01:23.220 --> 01:25.260] Hoping. Yeah, hoping.
...
The first transcribed sentence begins at 24 s, and that timestamp is not correct either.
But if I remove the 'initial_prompt' param, the output is correct, like this. It begins from 0 s:
python3.10/site-packages/whisper/timing.py:58: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def backtrace(trace: np.ndarray):
python3.10/site-packages/whisper/transcribe.py:114: UserWarning: FP16 is not supported on CPU; using FP32 instead
  warnings.warn("FP16 is not supported on CPU; using FP32 instead")
[00:01.600 --> 00:02.320] The Jerogan Experience.
[00:02.780 --> 00:04.860] Wasn't it you telling me about the whales that learn how to-
[00:04.860 --> 00:07.060] Orcas. Yeah, they've learned how to fuck people's boats up.
[00:07.320 --> 00:09.020] That's so funny to me.
[00:09.160 --> 00:10.780] It's crazy. It's kind of hilarious.
[00:11.380 --> 00:14.020] Because for all these years, we've mistreated them and finally they're like,
[00:14.420 --> 00:14.760] Enough.
[00:15.440 --> 00:15.440] Yeah.
...
I want to know how the prompt works in this case. And is there any way to avoid missing sentences while still using a prompt?
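My rough understanding, which may be wrong: initial_prompt is prepended to the decoder's context for the first 30-second window, as if it were text that had already been transcribed, so an unrelated prompt can make the model suppress or skip early segments. While debugging, a small hypothetical helper like this can flag when a prompt is eating leading audio; the segment dicts mimic the shape of whisper's output, and the two start times are taken from the outputs posted above:

```python
def leading_gap(segments, tolerance=5.0):
    """Return the start time of the first segment if it looks suspiciously
    late (more than `tolerance` seconds into the audio), else None."""
    if not segments:
        return None
    first_start = segments[0]["start"]
    return first_start if first_start > tolerance else None

# Start times from the two runs in this thread.
with_prompt = [{"start": 24.74, "end": 28.70, "text": "The Jerogan Experience."}]
without_prompt = [{"start": 1.60, "end": 2.32, "text": "The Jerogan Experience."}]

print(leading_gap(with_prompt))     # 24.74 -> roughly 25 s of audio went missing
print(leading_gap(without_prompt))  # None  -> transcript starts near 0 s
```

Running the same file with and without the prompt and comparing like this at least tells you quickly whether a given prompt is triggering the bug on your audio.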