datasets: Handle converting `int16` audio data in `VoiceSample`. #26

shaper · 2024-06-12T20:52:27Z

We saw `VoiceSample` fail its assert that audio data is `float32` when we submitted an `mp3` file through the Gradio infer app. We didn't dig deeper into Gradio (I'm sure it's possible to alter/convert there as well), but it seems potentially useful for `VoiceSample` to handle `int16` audio data on top of what it already handles.
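For readers following along, here is a minimal sketch of the kind of conversion being discussed. It is not the code from `ultravox/data/datasets.py`: the helper name `to_float32_audio` and the choice to scale by the integer type's full-scale magnitude are assumptions made for illustration.

```python
import numpy as np
from numpy.typing import NDArray


def to_float32_audio(audio: np.ndarray) -> NDArray[np.float32]:
    """Hypothetical helper: return audio as float32 samples in [-1.0, 1.0]."""
    if audio.dtype == np.float32:
        # Already in the expected format; return unchanged.
        return audio
    if audio.dtype == np.float64:
        return audio.astype(np.float32)
    if audio.dtype in (np.int16, np.int32):
        # Assumed scaling: divide by the magnitude of the most negative
        # value (32768 for int16) so full-scale input maps into [-1.0, 1.0].
        scale = float(np.iinfo(audio.dtype).max) + 1.0
        return (audio / scale).astype(np.float32)
    raise TypeError(f"Unsupported audio dtype: {audio.dtype}")
```

Calling `to_float32_audio(np.array([0, 16384], dtype=np.int16))` would give a `float32` array, so a downstream `float32` assert no longer trips on integer PCM input.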

farzadab

LGTM, thanks!

ultravox/data/datasets.py

juberti

If it's easy, suggest adding a test to `datasets_test.py`, a la the existing `create_sample`.

ultravox/data/datasets.py

shaper · 2024-06-13T20:52:18Z

> If it's easy, suggest adding a test to `datasets_test.py`, a la the existing `create_sample`.

Added tests. Using some type hints in the tests required tightening the `VoiceSample.audio` type from `audio: Optional[np.ndarray] = None` to `audio: Optional[NDArray[np.float32]] = None`, which I think is beneficial (and didn't create any other type issues elsewhere).

Will wait a bit to merge in case there's feedback on the tests.
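As a rough illustration of the tests and the tightened type hint described in this comment, the sketch below uses a hypothetical stand-in class, `SampleSketch`, not the real `VoiceSample`. The `__post_init__` conversion and the `/ 32768.0` scaling are assumptions, and the test names are made up for this example.

```python
import dataclasses
from typing import Optional

import numpy as np
from numpy.typing import NDArray


@dataclasses.dataclass
class SampleSketch:
    """Hypothetical stand-in for VoiceSample; only the audio field is modeled."""

    audio: Optional[NDArray[np.float32]] = None

    def __post_init__(self) -> None:
        if self.audio is None:
            return
        if self.audio.dtype == np.int16:
            # Assumed scaling: int16 full scale maps into [-1.0, 1.0).
            self.audio = (self.audio / 32768.0).astype(np.float32)
        assert self.audio.dtype == np.float32, self.audio.dtype


def test_int16_audio_is_converted() -> None:
    raw = np.array([0, 16384, -32768], dtype=np.int16)
    sample = SampleSketch(audio=raw)
    assert sample.audio is not None
    assert sample.audio.dtype == np.float32
    np.testing.assert_allclose(sample.audio, [0.0, 0.5, -1.0])


def test_float32_audio_is_unchanged() -> None:
    raw = np.array([0.0, 0.25], dtype=np.float32)
    sample = SampleSketch(audio=raw)
    # float32 input should pass through without being copied or rescaled.
    assert sample.audio is raw
```

Both functions follow the pytest naming convention, so they would be collected automatically if dropped into a test module.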

We saw `VoiceSample` failing the assert on `float32` audio data when playing around with the Gradio infer app and submitting an `mp3` file. We didn't dig deeper into Gradio (I'm sure it's possible to alter/convert there as well), but it seems potentially useful for `VoiceSample` to handle `int16` and `int32` audio data on top of what it already handles.

shaper · 2024-06-13T22:04:05Z

@juberti need you or @farzadab to land as I don't have write access. Excitement! Thanks for the onboarding guidance.

farzadab · 2024-06-13T22:28:33Z

Merged and gave you write access. Thanks!

farzadab approved these changes Jun 12, 2024

farzadab reviewed Jun 13, 2024

juberti approved these changes Jun 13, 2024

shaper force-pushed the `pr/sample-int16` branch from `edc551c` to `293c2ca` on June 13, 2024 20:49

shaper force-pushed the `pr/sample-int16` branch from `293c2ca` to `50cc4ac` on June 13, 2024 20:57

juberti approved these changes Jun 13, 2024

farzadab merged commit 1aa11bf into fixie-ai:main Jun 13, 2024
1 check passed
