404 Client Error for speaker-embedding.onnx #666

Open
eliich opened this issue Jan 17, 2024 · 5 comments

Comments

@eliich

eliich commented Jan 17, 2024

requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/pyannote/wespeaker-voxceleb-resnet34-LM/resolve/main/speaker-embedding.onnx

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "my_whisperx_script.py", line 33, in <module>
    diarize_model = whisperx.DiarizationPipeline(use_auth_token="hf_#####", device=device)
  ... [additional traceback lines] ...
  File "huggingface_hub/file_download.py", line 1631, in get_hf_file_metadata
    r = _request_wrapper( ...
  File "huggingface_hub/utils/_errors.py", line 296, in hf_raise_for_status
    raise EntryNotFoundError(message, response) from e
huggingface_hub.utils._errors.EntryNotFoundError: 404 Client Error. (Request ID: ...)

Is the speaker-embedding.onnx file still available at the specified URL?
Could there be a permission issue or has the file been moved?

@ulfkemmsies

I also have this error. This seems like a simple fix!

@Lucidology

I have this error as well

@Lucidology

Lucidology commented May 1, 2024

Fix:
This bug was caused by the workaround for a different bug, described in #499.

That workaround said to run:

pip install pyannote.audio==3.0.1
pip uninstall onnxruntime
pip install --force-reinstall onnxruntime-gpu

Instead, I did:

pip uninstall pyannote.audio
pip install pyannote.audio

Now it works: I can fully transcribe and diarize a sound file. I have not tested whether this brings back the problem from #499, where diarization is slow and runs on the CPU. There is probably a way to pass in the device from Python code, or the code could simply be patched.
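On the device question: the call in the original report already takes a device argument, so one way to rule out a silent CPU fallback is to pick the device explicitly before building the pipeline. This is only a sketch; pick_device is a hypothetical helper name, and it assumes PyTorch is what reports GPU availability in your setup.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch still runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    device = pick_device()
    print(f"diarization device: {device}")
    # The call from the original report would then become
    # (hf_token is a placeholder for your own Hugging Face token):
    # diarize_model = whisperx.DiarizationPipeline(use_auth_token=hf_token, device=device)
```

If this prints cpu on a machine with a GPU, the problem is more likely the PyTorch/CUDA install than whisperx itself.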

@SeeknnDestroy

hi @eliich, your pyannote.audio version should match the diarization pipeline model. please check it via pip show pyannote.audio. if it shows:

Name: pyannote.audio
Version: 3.0.1

then use:

diarize_model = whisperx.DiarizationPipeline(model_name="pyannote/speaker-diarization-3.0", use_auth_token=hf_token, device=device)
diarize_segments = diarize_model(audio)
result = whisperx.assign_word_speakers(diarize_segments, result)

let me know if it helps!
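The version check can also be done from Python instead of reading pip show by hand. The sketch below is a hedged illustration: pick_diarization_model is a hypothetical helper, and the only pairing it encodes is the one stated in this thread (pyannote.audio 3.0.x with pyannote/speaker-diarization-3.0). For any other version it returns None, meaning "leave model_name unset and use whisperx's default".

```python
from importlib import metadata
from typing import Optional


def installed_pyannote_version() -> Optional[str]:
    """Return the installed pyannote.audio version, or None if it is not installed."""
    try:
        return metadata.version("pyannote.audio")
    except metadata.PackageNotFoundError:
        return None


def pick_diarization_model(version: Optional[str]) -> Optional[str]:
    """Hypothetical helper: map a pyannote.audio version to a pipeline name.

    Only the 3.0.x -> speaker-diarization-3.0 pairing comes from this thread.
    Any other version yields None so the caller can fall back to the default.
    """
    if version is not None and (version == "3.0" or version.startswith("3.0.")):
        return "pyannote/speaker-diarization-3.0"
    return None


if __name__ == "__main__":
    version = installed_pyannote_version()
    model_name = pick_diarization_model(version)
    print(f"pyannote.audio {version} -> model_name={model_name}")
```

When the helper returns a name, pass it as model_name to whisperx.DiarizationPipeline as in the snippet above; when it returns None, call the pipeline without model_name.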

@mshakirDr
Contributor

I commented out line 218 in transcribe.py and replaced it with the line mentioned above. Diarization then ran quickly, with no 404 error.

        # diarize_model = DiarizationPipeline(use_auth_token=hf_token, device=device)
        diarize_model = DiarizationPipeline(model_name="pyannote/speaker-diarization-3.0", use_auth_token=hf_token, device=device)
