decoding error about padding #117

Closed
Macsim2 opened this issue Mar 28, 2023 · 2 comments

Macsim2 commented Mar 28, 2023

First of all, thank you @jianfch for the timestamped Whisper.
However, I ran into the error below while decoding:

Traceback (most recent call last):
  File "test.py", line 23, in <module>
    result = model.transcribe({file_path})
  File "{some_path}/python3.8/site-packages/stable_whisper/whisper_word_level.py", line 351, in transcribe_stable
    mel_segment = log_mel_spectrogram(audio_segment)
  File "{some_path}/lib/python3.8/site-packages/whisper/audio.py", line 138, in log_mel_spectrogram
    stft = torch.stft(audio, N_FFT, HOP_LENGTH, window=window, return_complex=True)
  File "{some_path}/lib/python3.8/site-packages/torch/functional.py", line 604, in stft
    input = F.pad(input.view(extended_shape), [pad, pad], pad_mode)
RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (200, 200) at dimension 2 of input [1, 1, 8]

If anyone recognizes this error, please give me a hint. Thank you.

Owner

jianfch commented Mar 28, 2023

Should be fixed in the latest version.
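
Some context on the traceback, for anyone who cannot upgrade yet: `torch.stft` defaults to `center=True` with `pad_mode="reflect"`, so it pads `N_FFT // 2` samples on each side, which is 200 for Whisper's `N_FFT = 400`. Reflect padding only works when the input is longer than the pad, and the failing audio segment had just 8 samples. The sketch below is not the fix that shipped in stable_whisper. It is a minimal illustration of one possible guard, using plain Python lists instead of tensors, and the helper names `min_samples_for_reflect_pad` and `pad_short_segment` are hypothetical.

```python
from typing import List

# Whisper's FFT window size; torch.stft (center=True) reflect-pads N_FFT // 2 per side.
N_FFT = 400


def min_samples_for_reflect_pad(n_fft: int = N_FFT) -> int:
    """Smallest input length that a reflect pad of n_fft // 2 accepts."""
    # Reflect padding requires pad < input length, so length must be pad + 1 or more.
    return n_fft // 2 + 1


def pad_short_segment(samples: List[float], n_fft: int = N_FFT) -> List[float]:
    """Hypothetical helper: zero-pad a too-short segment on the right."""
    missing = min_samples_for_reflect_pad(n_fft) - len(samples)
    if missing <= 0:
        # Already long enough; return an unchanged copy.
        return list(samples)
    return list(samples) + [0.0] * missing


segment = [0.1] * 8  # same length as the failing input in the traceback
padded = pad_short_segment(segment)
print(len(segment), "->", len(padded))  # 8 -> 201
```

With tensors, the same length check could run before calling `log_mel_spectrogram`. Skipping a segment this short instead of padding it is another reasonable choice, since 8 samples at 16 kHz is half a millisecond of audio.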

Author

Macsim2 commented Mar 29, 2023

@jianfch Thank you, I solved this problem.
