mme: don't restrict the host buffer to 16-bit #796
Conversation
PortAudio is still intended to work just fine on pre-Vista systems and also on WINE and ReactOS. Especially in the case of the legacy WMME API this compatibility rather than performance should be the highest priority. What I mean is, supporting output beyond 16-bit is a good improvement, but doing it in a way that assumes Vista+ or is optimised for Vista+ is not an improvement since it degrades backward compatibility. At this point, backward compatibility is the main reason to continue maintaining the MME back end.
Do we have a reliable reference for this claim? I seem to remember even Windows 98 had some floating point support. But I am (fairly) confident that even if Windows converted to float, it would provide transparent passthrough up to 24-bit.
Please keep in mind that MME is the default PortAudio Host API on Windows. Which means that any defects in MME will impact anyone who just uses the defaults. That is presumably a large portion of PortAudio users. In this particular case, this means any PortAudio user who just uses default settings will get their 24-bit or 32-bit float audio lossily downconverted to 16-bit instead of being passed through as-is to the Windows audio engine. This is surprising, inefficient, user-hostile, and completely unnecessary.
There is no evidence that it degrades backward compatibility. Can we please only block PRs when there is clear, actual, substantiated concern, as opposed to speculation? Rejecting PRs (or demanding an increase in code complexity) based on highly hypothetical "what ifs" (which, even if confirmed, would only affect extremely dubious use cases) is extremely demotivating and not conducive to progress.
I disagree. Besides compatibility, users might still want to use MME today because:
Based on the above, I don't think it's fair to assume that users only use MME for compatibility reasons.
I'm going to copy-paste from my comment on the equivalent DS PR: MME hands off the samples to the Windows Audio Engine (through WASAPI internally), where the signal is filtered by APOs, resampled (if necessary), mixed, possibly gain-adjusted if software volume control is in use, passed through CAudioLimiter (which does "soft" clipping), and finally converted to integer (with dithering) for handoff to the hardware. (See Microsoft docs e.g. Windows Audio Architecture, Audio Processing Object Architecture, my APO notes, etc.)
If you want a 24-bit sample to make its way untouched throughout the entire shared Windows audio pipeline, that is technically possible since 24-bit and 32-bit float can be converted losslessly to each other. However, in practice that kind of setup would be extremely brittle as you will need to get all the stars to align just right:
If you do all the above, then you can, with difficulty, get a bit-perfect pipeline at 24-bit. If you try the same thing at 16-bit, Windows will also do dithering on top of the above, and that cannot be disabled (as far as I know). Needless to say, none of the above is particularly user-friendly. For this kind of use case it makes way more sense to use an exclusive backend such as WASAPI Exclusive or WDM-KS, which are precisely designed to be bit-perfect and are the right tools for this job.
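The premise that 24-bit integer and 32-bit float convert losslessly to each other follows from IEEE 754 single precision having a 24-bit significand (23 stored bits plus an implicit leading one), so every 24-bit integer value is exactly representable as a float. A minimal self-contained check (illustrative, not PortAudio code):

```c
#include <stdint.h>

/* A 24-bit signed sample (range [-8388608, 8388607]) survives a round trip
   through float exactly, because float's significand is 24 bits wide. */
static int32_t roundtrip_through_float(int32_t sample24)
{
    float f = (float)sample24;  /* exact: |sample24| <= 2^23 < 2^24 */
    return (int32_t)f;          /* exact back-conversion */
}
```

The same property is why the shared-mode engine's internal float format is transparent for 24-bit content, but not for anything wider.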
Sorry for the delay, here's our position statement. There are two issues:
In the case of this patch, the 2nd issue is further complicated by the fact that any float passthrough is better than downscaling to 16-bit. Phil and I agree on this. That said, I'd prefer the conversions to be done in PA (even integer <-> integer conversions) and Phil would prefer to pass floats through to the OS if the OS accepts them.
I've created #825 to track the broader policy issue.
I have been writing code against the MME API since 1995. I remember when float was not supported, and when it was added.
The current code seems dependent on WAVEFORMATEX.
Sure. You said earlier in this thread that you believe float was supported all the way back to Windows 98. I have no reason to doubt you. Especially since MSDN concurs (Windows 98 SE to be precise). I'm not sure what point you are trying to make by saying that, though. If you're saying that my statement "There is no evidence that it degrades backward compatibility" is technically incorrect, then sure, no problem, let me rephrase my statement right away and correct the record: There is no evidence that this change degrades backward compatibility on any OS that any user could possibly care about in 2023. There. Can we agree on this? If not, can you please cite one, even just one, user who would reasonably expect code written in 2023 to run out-of-the-box on Windows 95? I certainly can't. In fact, I don't know, nor know people who know, anyone who even still runs Windows 95 in any capacity. I can speculate that there might be an extremely small number of people out there who are forced to use Windows 95 because they want to run very old software or very old hardware. Assuming these people exist, what reason could they possibly have to want to run a 2023 version of PortAudio on these systems??
Honestly I am extremely reluctant to add a fallback, which in itself is a non-trivial error handling mechanism with its own complexity and potential for bugs, to guard against a situation that will never occur in practice outside of systems that (1) are extremely rare, if they even still exist and (2) have no reason to use a modern version of PortAudio in the first place. It's just completely pointless code, and I don't like writing pointless code. Right now I'm torn between two options: (1) dropping this patch and telling my users they can't use 24-bit with MME or (2) forking PortAudio for my own project so that I can fix these bugs without having to spend precious time and energy running around in circles in threads like this one. Option (3), doing what you asked, i.e. writing pointless code (under protest) that is never going to run on any platform that any reasonable user could possibly want to use, well… that option comes in a very distant third.
MSDN says
src/hostapi/wmme/pa_win_wmme.c
@@ -1796,14 +1796,13 @@ static void InitializeSingleDirectionHandlesAndBuffers( PaWinMmeSingleDirectionH
 static PaError InitializeWaveHandles( PaWinMmeHostApiRepresentation *winMmeHostApi,
                                       PaWinMmeSingleDirectionHandlesAndBuffers *handlesAndBuffers,
                                       unsigned long winMmeSpecificFlags,
-                                      unsigned long bytesPerHostSample,
+                                      PaSampleFormat sampleFormat,
Please change this parameter to be named hostSampleFormat. I think that would be clearer.
I agree with this. This is the main reason I didn't get around to implementing anything more complex than hardcoding
It seems Ross is worried about breaking apps on old OS versions that do not support FLOAT. I propose that we make this change conditional on OS version. Something like this pseudo-code:
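A sketch of the idea (illustrative only; the concrete version-check mechanism is left open):

```
if the OS is Vista or later:
    open the wave device using the user's sample format
else:
    keep the old behavior: always use Int16 host buffers
```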
I don't know how to check the version on Windows. Ross and I looked at this and it is non-trivial.
Phil's proposal sounds ok to me, but...
If we could just use https://learn.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-getversion
So when you wrote this code, you had to make a choice between only passing
Thanks Phil. I can get behind this. I still believe this is pointless, and it will still result in unnecessary conversions for people using PortAudio on, say, XP. But it's not like I really care about XP, and the extra code complexity is trivial, so I'm fine with this.
There is precedent for this; see portaudio/src/hostapi/dsound/pa_win_ds.c, line 348 (at 929e2e8).
In practice I don't think using
I suspect this is purely because
I'll update the PR to use
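For reference, the dynamic-loading precedent mentioned above amounts to resolving GetVersion through GetProcAddress rather than importing it statically, so the binary still loads on targets where the import is unavailable. A rough Windows-only sketch of the technique (not the actual pa_win_ds.c code; the function name is illustrative):

```c
#include <windows.h>

typedef DWORD (WINAPI *GetVersionFunc)(void);

/* Resolve GetVersion at run time; returns 0 if it cannot be resolved
   (e.g. on platforms where the export is not available). */
static DWORD QueryWindowsVersion(void)
{
    HMODULE kernel32 = GetModuleHandleA("kernel32.dll");
    if (kernel32 != NULL)
    {
        GetVersionFunc getVersion =
            (GetVersionFunc)GetProcAddress(kernel32, "GetVersion");
        if (getVersion != NULL)
            return getVersion();
    }
    return 0;
}
```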
Currently, the MME Host API code only creates 16-bit integer MME buffers. All audio data provided by the user is therefore converted by PortAudio to and from 16-bit, regardless of the user buffer format. This basically makes it impossible to pass 24-bit audio through the MME Host API.

If the user buffer format is not 16-bit, this also causes pointless conversions to take place, *even if the hardware is running at 16-bit*, because modern Windows versions (Vista+) convert the data to floating point behind the scenes before it is handed off to the hardware. This can lead to silly situations where 32-bit float samples from the user are (lossily) converted to 16-bit by PortAudio, then handed off to Windows via MME, only to be converted back to 32-bit float again, before finally being converted to the format the hardware device is configured to use. This can easily lead to two layers of 16-bit dithering (one from PortAudio, and one from Windows) being piled on top of each other, resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the same format as the user buffer. This should stop PortAudio from converting samples in all cases except paInt8, which is not supported by WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as PortAudio#774.

The new code assumes that MME will accept whatever format we throw at it. I feel confident that this is always true on Vista+ regardless of hardware, because starting from Vista MME does not access hardware directly - it always goes through the Windows Audio Engine which supports all formats in both directions (I verified this using paloopback). On pre-Vista Windows, this should still work all the way back to Windows 98 SE, because MME goes through KMixer which supports all formats [1]. Nevertheless, it's difficult to be sure, so this code checks the Windows version it's running on and preserves the old behavior (i.e. always use Int16) on pre-Vista Windows.

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes PortAudio#139
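For context, passing the user's format through to MME essentially means describing it in a WAVEFORMATEXTENSIBLE before opening the wave device. A Windows-only sketch for the 32-bit float stereo case (illustrative only, not the PR's actual code; the sample rate and channel count are arbitrary):

```c
#include <windows.h>
#include <mmreg.h>
#include <ks.h>
#include <ksmedia.h>

/* Build a WAVEFORMATEXTENSIBLE describing 32-bit IEEE float, stereo,
   48 kHz, as could be passed to waveOutOpen(). */
static WAVEFORMATEXTENSIBLE MakeFloat32Format(void)
{
    WAVEFORMATEXTENSIBLE wfx;
    ZeroMemory(&wfx, sizeof(wfx));
    wfx.Format.wFormatTag = WAVE_FORMAT_EXTENSIBLE;
    wfx.Format.nChannels = 2;
    wfx.Format.nSamplesPerSec = 48000;
    wfx.Format.wBitsPerSample = 32;
    wfx.Format.nBlockAlign = (WORD)(wfx.Format.nChannels * wfx.Format.wBitsPerSample / 8);
    wfx.Format.nAvgBytesPerSec = wfx.Format.nSamplesPerSec * wfx.Format.nBlockAlign;
    wfx.Format.cbSize = sizeof(WAVEFORMATEXTENSIBLE) - sizeof(WAVEFORMATEX);
    wfx.Samples.wValidBitsPerSample = 32;
    wfx.dwChannelMask = SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT;
    wfx.SubFormat = KSDATAFORMAT_SUBTYPE_IEEE_FLOAT;
    return wfx;
}
```

For 24-bit integer data the same structure would use wBitsPerSample = 24 (or 32 with wValidBitsPerSample = 24) and KSDATAFORMAT_SUBTYPE_PCM.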
Done.
This looks great! I'll let Ross give final approval.
WASAPI's implementation does not attempt to dynamically load GetVersion when compiled for UWP; that part of the code is excluded. See portaudio/src/hostapi/wasapi/pa_win_wasapi.c, lines 1348 to 1352 (at 929e2e8).
The reason for the complex implementation of WASAPI's
Indeed, only WASAPI. I noticed from the discussion that the DS and now MME host implementations will use it too (see portaudio/src/hostapi/wasapi/pa_win_wasapi.c, line 1233 at 929e2e8). What about introducing a shared src/os/win/pa_win_version.c file which would provide portable and reliable Windows version detection for all host APIs?
If this proposal is acceptable I could take this task and propose the PR for that covering WASAPI, MME and DS host APIs. Please let me know your opinion.
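The proposed shared helper's interface might look something like this (a sketch of the proposal only; all names here are hypothetical, not the eventual implementation):

```c
/* Hypothetical interface for the proposed shared src/os/win/pa_win_version.c */
typedef enum PaOsVersion
{
    paOsVersionWindows9x = 0,
    paOsVersionWindowsXp,
    paOsVersionWindowsVista,
    paOsVersionWindows7,
    paOsVersionWindows8OrGreater,
    paOsVersionUnknown
} PaOsVersion;

/* Detect the running Windows version once and cache the result.
   Implemented without a static import of GetVersion so that it also
   works for UWP builds; used by the WASAPI, MME and DS host APIs. */
PaOsVersion PaWinUtil_GetOsVersion(void);
```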
@dmitrykos - I think it would be great to have a portable GetWindowsVersion() available as a utility.
Yes, of course. I will introduce separate PR after this one is merged. |
Can you explain why you think
@dechamps you could have a look at the implementation in portaudio/src/hostapi/wasapi/pa_win_wasapi.c (line 1233 at 929e2e8); it is quite self-explanatory. My previous comment about portability issues was not related to
Thanks for your patience. We're going to merge this, but two comments:
1. A minor detail: I'm not sure how we arrived at Windows Vista as the target, I would have thought 7 was fine. Not a big deal, but it does seem strange to me.
2. https://learn.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-getversion Hopefully Dmitry's version will help with this issue.
Thanks for finally merging this. Progress!
Vista is the first Windows version to include the modern Windows audio stack, so it seems natural to use it as the boundary.
I did notice, and I did respond to your point in a previous comment. I strongly suspect that sentence in the Microsoft docs is just poorly phrased - Microsoft almost certainly means to say that