Transcribe from a Tensor is not working. #1291

ruliworst · 2023-04-27T18:46:16Z

Hello, I am trying to transcribe audio from a Tensor that I obtained with the torchaudio library, but it is not working. I am using Flask to receive the audio through an endpoint. Is there a solution? Here is the code:

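import io

import torchaudio
import whisper
from flask import Flask, request

app = Flask(__name__)
MODEL = whisper.load_model('base')

@app.route('/uploader', methods=['POST'])
def upload_audio():
    # read the uploaded file into an in-memory buffer
    audio_file = request.files['audio']
    audio_file = io.BytesIO(audio_file.read())

    # waveform has shape (channels, samples); sr is the file's own sample rate
    waveform, sr = torchaudio.load(audio_file)

    result = MODEL.transcribe(waveform)

    # return the recognized text
    return result["text"]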

The error, raised inside the transcribe function, is:
    decode_options["language"] = max(probs, key=probs.get)
AttributeError: 'list' object has no attribute 'get'

Thanks in advance.

mitchsayre · 2023-04-27T22:42:33Z

I think we are having the same issue. It seems that the shape of the audio tensor returned by torchaudio.load() is different from what whisper.transcribe() expects. I worked around it, but I am not sure whether there is a better solution. Here is my code:

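# audio_path and model are defined elsewhere
with open(audio_path, 'rb') as file:
    waveform, sample_rate = torchaudio.load(file)
# squeeze drops size-1 dimensions, so a mono (1, samples) tensor becomes (samples,)
waveform = waveform.squeeze()
result = model.transcribe(waveform)
print(result["text"])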

tensor squeeze: https://pytorch.org/docs/stable/generated/torch.squeeze.html
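
A note on why the shape would matter, as far as I can tell from the whisper source: transcribe() picks the language with max(probs, key=probs.get), where probs comes from the model's language detection. For a single spectrogram that is one dict of language probabilities; for a batched input, which is what a (channels, samples) waveform turns into, it is a list of dicts, one per batch item. The snippet below is only a plain-Python stand-in for that behaviour. fake_detect_language and its probability values are made up for illustration and are not part of whisper.

```python
def fake_detect_language(batched):
    """Hypothetical stand-in for language detection, not whisper's real code."""
    probs = {"en": 0.9, "es": 0.1}  # made-up probabilities
    # Batched input -> a list with one dict per item; single input -> one dict.
    return [probs] if batched else probs


def pick_language(probs):
    # Same expression as the line shown in the traceback.
    return max(probs, key=probs.get)


# 1-D waveform -> single spectrogram -> dict: works.
single_choice = pick_language(fake_detect_language(batched=False))

# (1, samples) waveform -> batched spectrogram -> list: fails like the traceback.
try:
    pick_language(fake_detect_language(batched=True))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

If that reading is right, it would explain why squeezing a mono tensor down to one dimension makes the AttributeError go away.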

ruliworst Apr 30, 2023
Author

Hi, first of all, thanks for your response.

I tried that solution, but when the transcribe method runs, it returns what looks like an array as the text result:
3, 2, 1. 3, 4, 1. 3, 4, 1. 4, 4, 4, 5. 4, 5, 5. 4, 5, 5. 4, 5, 5. 4, 5, 6. 4, 5, 6. 4, 5. 4, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5. 5, 5.
So it still does not give a proper result, because the audio is not actually transcribed to text.
Anyway, thanks again for your response. I am still trying to find a solution.

RealHandy · 2024-09-10T20:47:57Z

If this is still relevant to anyone: I got this error while using the "large" model and got past it by specifying language="en" in my call to model.transcribe(), i.e.
model.transcribe(audio=waveform, verbose=True, language="en")
