I'm trying to use tfio.IOTensor.from_ffmpeg to build a simple data input pipeline that reads MP3s, but it segfaults after processing a certain number of files. I ran several tests to verify that the trigger really is the number of files and not a corrupt file.
I'm working with the German part of the Mozilla CommonVoice dataset. For me the segfault happens after 1020 files. My machine has 20GB RAM, and the memory consumption of the process looks OK to me; it does not look like a memory leak.
I'm running Ubuntu 18.04.3 LTS and ffmpeg 7:3.4.6-0ubuntu0.18.04.1.
@jjedele Sorry for the late reply; debugging with ffmpeg is a little challenging. Still looking into it.
In the meantime, if you only need to read MP3 files, I added an MP3 decoder based on minimp3, which does not require ffmpeg. It also works on Linux/macOS/Windows (ffmpeg only works on Ubuntu 16.04/18.04). You can give it a try with:
tfio.IOTensor.from_audio('audio.mp3')
Note that from_audio actually processes wav/flac/oggvorbis/mp3 implicitly.
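As a rough illustration of how the from_audio call above might replace from_ffmpeg in a loop over many files, here is a minimal sketch. The helper names iter_audio and _tfio_decoder and the decoder parameter are hypothetical and not part of tensorflow-io; only the tfio.IOTensor.from_audio call comes from this thread. Whether this avoids the segfault reported above has not been verified here.

```python
from typing import Any, Callable, Iterable, Iterator, Optional, Tuple


def _tfio_decoder(path: str) -> Any:
    # Imported lazily so the helper can be used (and tested) with a
    # stand-in decoder on machines without tensorflow-io installed.
    import tensorflow_io as tfio
    return tfio.IOTensor.from_audio(path)


def iter_audio(
    paths: Iterable[str],
    decoder: Optional[Callable[[str], Any]] = None,
) -> Iterator[Tuple[str, Any]]:
    """Lazily yield (path, decoded_audio) pairs, one file at a time.

    `decoder` is a hypothetical hook; by default it wraps
    tfio.IOTensor.from_audio as suggested in this thread.
    """
    decode = decoder if decoder is not None else _tfio_decoder
    for path in paths:
        yield path, decode(path)
```

Something like `for path, audio in iter_audio(mp3_paths): ...` would then decode one file per iteration, which makes it easier to log the path of the file being processed if a crash does occur.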
@yongtang No worries, I can completely see how that's a pain. Thank you for looking into it.
That's awesome! Amusingly, I started exactly the same project (an MP3 read operator based on minimp3), but I'm not there yet since I have worked with neither MP3 decoding nor custom TF operators before. Will have a look at your code to get some inspiration ;)
@jjedele Nice to see interest in minimp3. I also created another PR, #805, which attempts to add mp4a support with minimp4 + AVFoundation on macOS. The plan is to use system-native APIs (e.g., on Windows and macOS) when possible, and fall back to FFmpeg on Linux for very specific codecs only.