Weird STOI Output #20
Comments
I didn't check your files, but are you sure they are completely synced with each other?
Hi, thanks for replying! Thank you again for reviewing my issue!
That's counter-intuitive. Do you have Matlab by any chance? The code is unit tested, but maybe something weird is happening, I don't know.
I just ran the Matlab test code and got 0.1973 for the signal with background music and 0.1105 for the signal without background music. Thank you!
Thanks a lot for running the tests in Matlab! I don't have any intuition as to why this would be the case, sorry.
Your scores are below 0.4, which basically means STOI says the speech is not intelligible. Have a look at Fig. 4 in http://cas.et.tudelft.nl/pubs/Taal2011_1.pdf, where you can see real listening test scores vs. STOI predictions. You have to call STOI with a clean signal and a distorted (less intelligible) version of the SAME speech signal. The signals have to be time-aligned. Based on the file size, it seems that 'refSpeech.wav' might not be the same time-aligned speech signal as the one used in audio_withBGM.wav. I think it would make more sense to use audio_withoutBGM.wav as the reference signal (assuming it's 100% intelligible) and audio_withBGM.wav as the distorted version.
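As a rough illustration of the time-alignment requirement above, here is a minimal sketch that estimates a constant offset between two recordings by cross-correlation, trims both to their overlapping part, and only then calls STOI. The helper names (`estimate_lag`, `align`, `aligned_stoi`) are hypothetical and not part of pystoi; the sketch assumes both arrays are mono, share the same sample rate, and contain the same utterance. It corrects a fixed delay only, not clock drift.

```python
import numpy as np


def estimate_lag(ref, deg):
    """Estimate how many samples `deg` is delayed relative to `ref`.

    Positive result: `deg` starts late. Negative result: `deg` starts early.
    Uses FFT-based cross-correlation so it stays usable on long recordings.
    """
    ref = np.asarray(ref, dtype=float)
    deg = np.asarray(deg, dtype=float)
    n = len(ref) + len(deg) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two >= n (no wrap-around)
    spectrum = np.fft.rfft(deg, nfft) * np.conj(np.fft.rfft(ref, nfft))
    cc = np.fft.irfft(spectrum, nfft)
    neg = cc[nfft - (len(ref) - 1):]  # lags -(len(ref) - 1) .. -1
    pos = cc[:len(deg)]               # lags 0 .. len(deg) - 1
    return int(np.argmax(np.concatenate([neg, pos]))) - (len(ref) - 1)


def align(ref, deg):
    """Remove the estimated offset and trim both signals to equal length."""
    ref = np.asarray(ref, dtype=float)
    deg = np.asarray(deg, dtype=float)
    lag = estimate_lag(ref, deg)
    if lag > 0:
        deg = deg[lag:]    # drop the extra lead-in of the degraded signal
    elif lag < 0:
        ref = ref[-lag:]   # drop the extra lead-in of the reference signal
    n = min(len(ref), len(deg))
    return ref[:n], deg[:n], lag


def aligned_stoi(ref, deg, fs, extended=False):
    """Hypothetical wrapper: align first, then score with pystoi."""
    from pystoi import stoi  # third-party package, imported lazily
    ref_a, deg_a, _ = align(ref, deg)
    return stoi(ref_a, deg_a, fs, extended=extended)
```

If the estimated lag turns out to be large, or changes a lot when you repeat the estimate on different excerpts, that would suggest the two files are not recordings of the same utterance, and the STOI value is probably not meaningful.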
Hi @chtaal! Thank you for replying! I've tried evaluating STOI with a couple of my processed audio files as the processed signal and audio_withoutBGM.wav as the reference, as you suggested, and the trend is now much closer to the PESQ results (thank you again for this helpful suggestion!). Still, this feels odd to me, since audio_withoutBGM.wav is the raw signal I want to test my beamforming algorithm on. If this file, rather than refSpeech.wav, is taken as the reference signal, how can I assess the audio quality of my beamforming algorithm with the raw microphone signal as the reference? Thank you again!
Hi,
Recently I was trying to evaluate some signals by calculating the STOI of each signal with this package. I used the pystoi.stoi.stoi function to calculate STOI. When I passed two identical signals as ref_signal and processed_signal, it returned 1, as expected. However, when I replaced the processed signal with microphone signals I recorded with and without background music playing, the STOI of the signal recorded while background music was present was always higher, which made no sense. I'm wondering whether I'm using the function the wrong way, or whether there is something wrong with my audio files or with my understanding of STOI.
I've uploaded my audio files, along with the code I use to evaluate STOI, to the following repository:
https://github.com/nanaChang/stoiCheckFile
Thank you!
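For the "is there anything wrong with my audio file" part of the question, a quick header comparison can catch obvious mismatches before any STOI call. The sketch below uses only Python's standard `wave` module and assumes uncompressed PCM WAV files; `wav_info` and `compare_wavs` are hypothetical helper names, and the file names in the usage note are taken from this thread. It does not check time alignment, only sample rate, channel count, and length.

```python
import wave


def wav_info(path):
    """Return (sample_rate_hz, n_channels, n_frames, duration_s) of a PCM WAV."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        channels = w.getnchannels()
        frames = w.getnframes()
    return rate, channels, frames, frames / rate


def compare_wavs(ref_path, deg_path):
    """List header mismatches that would make a STOI comparison questionable."""
    ref_rate, ref_ch, ref_frames, _ = wav_info(ref_path)
    deg_rate, deg_ch, deg_frames, _ = wav_info(deg_path)
    problems = []
    if ref_rate != deg_rate:
        problems.append(f"sample rate differs: {ref_rate} vs {deg_rate} Hz")
    if ref_ch != deg_ch:
        problems.append(f"channel count differs: {ref_ch} vs {deg_ch}")
    if ref_frames != deg_frames:
        problems.append(f"length differs: {ref_frames} vs {deg_frames} frames")
    return problems
```

For example, `compare_wavs("refSpeech.wav", "audio_withBGM.wav")` returning a non-empty list would be a hint that the two files are not a matched clean/degraded pair. An empty list is not proof that they are, since two files can match in every header field and still be offset in time.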