Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mme: don't restrict the host buffer to 16-bit #796

Merged
merged 1 commit into from
Aug 4, 2023

Conversation

dechamps
Copy link
Contributor

@dechamps dechamps commented Mar 18, 2023

Currently, the MME Host API code only creates 16-bit integer MME buffers. All audio data provided by the user is therefore converted by PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the MME Host API.

If the user buffer format is not 16-bit, this also causes pointless conversions to take place, even if the hardware is running at 16-bit, because modern Windows versions (Vista+) convert the data to floating point behind the scenes before it is handed off to the hardware. This can lead to silly situations where 32-bit float samples from the user are (lossily) converted to 16-bit by PortAudio, then ended off to Windows via MME, only to be converted back to 32-bit float again, before finally being converted to the format the hardware device is configured to use. This can easily lead to two layers of 16-bit dithering (one from PortAudio, and one from Windows) being piled on top of each other, resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the same format as the user buffer. This should stop PortAudio from converting samples in all cases except paInt8, which is not supported by WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as #774.

The new code assumes that MME will accept whatever format we throw at it. I feel confident that this is always true on Vista+ regardless of hardware, because starting from Vista MME does not access hardware directly - it always goes through the Windows Audio Engine which supports all formats in both directions (I verified this using paloopback).

On pre-Vista Windows, this should still work all the way back to Windows 98 SE, because MME goes through KMixer which supports all formats.

Fixes #139

src/hostapi/wmme/pa_win_wmme.c Show resolved Hide resolved
src/hostapi/wmme/pa_win_wmme.c Outdated Show resolved Hide resolved
src/hostapi/wmme/pa_win_wmme.c Outdated Show resolved Hide resolved
@RossBencina
Copy link
Collaborator

PortAudio is still intended to work just fine on pre-Vista systems and also on WINE and ReactOS. Especially in the case of the legacy WMME API this compatibility rather than performance should be the highest priority. What I mean is, supporting output beyond 16-bit is a good improvement, but doing it in a way that assumes Vista+ or is optimised for Vista+ is not an improvement since it degrades backward compatibility. At this point, backward compatibility is the main reason to continue maintaining the MME back end.

because modern Windows versions (Vista+) convert the data to floating point behind the scenes before it is handed off to the hardware.

Do we have a reliable reference for this claim? I seem to remember even Windows 98 had some floating point support. But I am (fairly) confident that even if Windows converted to float, it would provide transparent passthrough up to 24-bit.

@RossBencina RossBencina added the src-wmme MS WMME Host API /src/hostapi/wmme label Mar 20, 2023
@dechamps
Copy link
Contributor Author

dechamps commented Mar 21, 2023

Especially in the case of the legacy WMME API this compatibility rather than performance should be the highest priority.

Please keep in mind that MME is the default PortAudio Host API on Windows. Which means that any defects in MME will impact anyone who just uses the defaults. That is presumably a large portion of PortAudio users.

In this particular case, this means any PortAudio user who just uses default settings will get their 24-bit or 32-bit float audio lossily downconverted to 16-bit instead of being passed through as-is to the Windows audio engine. This is surprising, inefficient, user-hostile, and completely unnecessary.

it degrades backward compatibility

There is no evidence that it degrades backward compatiblity.

Can we please only block PRs when there is clear, actual, substantiated concern, as opposed to speculation? Rejecting PRs (or demanding an increase in code complexity) based on highly hypothetical "what ifs" (which, even if confirmed, would only affect extremely dubious use cases) is extremely demotivating and not conducive to progress.

At this point, backward compatibility is the main reason to continue maintaining the MME back end.

I disagree. Besides compatibility, users might still want to use MME today because:

  1. It is the default and they don't feel the need to change it, or don't know what they're missing out on.
  2. In my experience it is by far the most reliable PortAudio Windows Host API:
  3. The PortAudio MME Host API is the only one to support outputting to multiple devices at the same time.
  4. If the user changes their default audio device in Windows, the default MME "wave mapper" device will automatically and transparently redirect the audio, making for a very neat user experience. This is also true for DS (which you also seem to consider "legacy"), but it is not true for WASAPI nor KS. PortAudio users might go for MME or DS to get that improved user experience when switching devices in Windows (I certainly would).

Based on the above, I don't think it's fair to assume that users only use MME for compatibility reasons.

Do we have a reliable reference for this claim? I seem to remember even Windows 98 had some floating point support.

I'm going to copy-paste from my comment on the equivalent DS PR:

MME hands off the samples to the Windows Audio Engine (through WASAPI internally), where the signal is filtered by APOs, resampled (if necessary), mixed, possibly gain-adjusted if software volume control is in use, passed through CAudioLimiter (which does "soft" clipping), and finally converted to integer (with dithering) for handoff to the hardware. (See Microsoft docs e.g. Windows Audio Architecture, Audio Processing Object Architecture, my APO notes, etc.)

But I am (fairly) confident that even if Windows converted to float, it would provide transparent passthrough up to 24-bit.

If you want a 24-bit sample to make its way untouched throughout the entire shared Windows audio pipeline, that is technically possible since 24-bit and 32-bit float can be converted losslessly to each other. However, in practice that kind of setup would be extremely brittle as you will need to get all the stars to align just right:

  • In the Windows audio device settings, the device output format must be set to 24-bit and the sample rate and channel count have to match, otherwise automatic conversions will take place.
  • No mixing (i.e. no other application playing audio).
  • All sAPOs that come with the driver should be disabled.
  • If your device ends up relying on Windows software volume control, that will need to be set to 100%.
  • If your samples get very close to 0 dBFS, they will get altered by CAudioLimiter, which can only be disabled by hacking around in the registry.

If you do all the above, then you can, with difficulty, get a bit-perfect pipeline at 24-bit. If you try the same thing at 16-bit, Windows will also do dithering on top of the above, and that cannot be disabled (as far as I know).

Needless to say, none of the above is particularly user-friendly. For this kind of use case it makes way more sense to use an exclusive backend such as WASAPI Exclusive or WDM-KS, which are precisely designed to be bit-perfect and are the right tools for this job.

@RossBencina
Copy link
Collaborator

Sorry for the delay, here's our position statement. There are two issues:

  1. Given that MME is the legacy API it should always work. As it stands, this patch will lead to failures if the OS can't handle floats (or whatever >16bit format the user passes to PortAudio). We should implement error handling with a fallback to 16-bit. Phil and I agree on this.

  2. There is a second issue, which goes way beyond this PR and which Phil and I have different opinions about, which is whether PortAudio should always be responsible for float<->integer conversions. As it stands this patch lets the OS perform these conversions, and I would like PortAudio to always do these conversions so that the scaling factors, dithering and clipping are controlled and predictable across all PortAudio host APIs. Phil thinks it's best to pass through floats to the OS.

In the case of this patch, the 2nd issue is further complicated by the fact that any float passthrough is better than downscaling to 16bit. Phil and I agree on this. That said, I'd prefer the conversions to be done in PA (even integer <-> integer conversions) and Phil would prefer to pass floats through to the OS if the OS accepts them.

@RossBencina
Copy link
Collaborator

I've created #825 to track the broader policy issue.

@RossBencina
Copy link
Collaborator

RossBencina commented Jul 14, 2023

There is no evidence that it degrades backward compatiblity.

I have been writing code against the MME API since 1995. I remember when float was not supported, and when it was added.

@philburk
Copy link
Collaborator

The current code seems dependent on WAVEFORMATEX.
it will not run on systems prior to WAVEFORMATEX.
So an essential question is whether all systems that support WAVEFORMATEX support FLOAT.
If so then this code will not cause a regression.

@dechamps
Copy link
Contributor Author

There is no evidence that it degrades backward compatiblity.

I have been writing code against the MME API since 1995. I remember when float was not supported, and when it was added.

Sure. You said earlier in this thread that you believe float was supported all the way back to Windows 98. I have no reason to doubt you. Especially since MSDN concurs (Windows 98 SE to be precise).

I'm not sure what point you are trying to make by saying that, though. If you're saying that my statement "There is no evidence that it degrades backward compatibility" is technically incorrect, then sure, no problem, let me rephrase my statement right away and correct the record:

There is no evidence that this change degrades backward compatibility on any OS that any user could possibly care about in 2023.

There. Can we agree on this?

If not, can you please cite one, even just one, user who would reasonably expect code written in 2023 to run out-of-the-box on Windows 95? I certainly can't. In fact, I don't know, nor know people who know, anyone who even still runs Windows 95 in any capacity. I can speculate that there might be an extremely small number of people out there who are forced to use Windows 95 because they want to run very old software or very old hardware. Assuming these people exist, what reason could they possibly have to want to run a 2023 version of PortAudio on these systems??

Given that MME is the legacy API it should always work. As it stands, this patch will lead to failures if the OS can't handle floats (or whatever >16bit format the user passes to PortAudio). We should implement error handling with a fallback to 16-bit.

Honestly I am extremely reluctant to add a fallback, which in itself is a non-trivial error handling mechanism with its own complexity and potential for bugs, to guard against a situation that will never occur in practice outside of systems that (1) are extremely rare, if they even still exist and (2) have no reason to use a modern version of PortAudio in the first place. It's just completely pointless code, and I don't like writing pointless code.

Right now I'm torn between two options: (1) dropping this patch and telling my users they can't use 24-bit with MME or (2) forking PortAudio for my own project so that I can fix these bugs without having to spend precious time and energy running around in circles in threads like this one. Option (3), doing what you asked, i.e. writing pointless code (under protest) that is never going to run on any platform that any reasonable user could possibly want to use, well… that option comes in a very distant third.

So an essential question is whether all systems that support WAVEFORMATEX support FLOAT.

MSDN says WAVEFORMATEX was introduced with Windows 2000. MSDN also says that floating-point support was added to KMixer in Windows 98 SE and Windows 2000. So I believe the answer to your question is yes.

@@ -1796,14 +1796,13 @@ static void InitializeSingleDirectionHandlesAndBuffers( PaWinMmeSingleDirectionH
static PaError InitializeWaveHandles( PaWinMmeHostApiRepresentation *winMmeHostApi,
PaWinMmeSingleDirectionHandlesAndBuffers *handlesAndBuffers,
unsigned long winMmeSpecificFlags,
unsigned long bytesPerHostSample,
PaSampleFormat sampleFormat,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change this parameter to be named hostSampleFormat. I think that would be clearer.

@RossBencina
Copy link
Collaborator

Honestly I am extremely reluctant to add a fallback, which in itself is a non-trivial error handling mechanism with its own complexity and potential for bugs

I agree with this. This is the main reason I didn't get around to implementing anything more complex than hardcoding paInt16.

@philburk
Copy link
Collaborator

It seems Ross is worried about breaking apps on old OS versions that do not support FLOAT.
And Etiene is worried about apps on modern OSs not being able to take advantage of FLOAT when it is available.

I propose that we make this change conditional on OS version. Something like this pseudo-code:

if (version < WINDOWS_VISTA)  {
    kNativeFormats = paInt16;
} else {
    kNativeFormats = paUInt8 | paInt16 | paInt24 | paInt32 | paFloat32;
}

I don't know how to check the version on Windows. Ross and I looked at this and it is non-trivial.
See "pa_win_wasapi.c".

@RossBencina
Copy link
Collaborator

Phil's proposal sounds ok to me, but...

I don't know how to check the version on Windows. Ross and I looked at this and it is non-trivial.

If we could just use GetVersion() it would be a simple change, but right now Microsoft seems to be strongly advising against it, and Dmitry is dynamically loading the symbol in pa_win_wasapi.c.

https://learn.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-getversion

@dechamps
Copy link
Contributor Author

dechamps commented Jul 29, 2023

I agree with this. This is the main reason I didn't get around to implementing anything more complex than hardcoding paInt16.

So when you wrote this code, you had to make a choice between only passing paInt16 or passing all possible formats. You chose the former for compatibility reasons. I agree this likely made sense when you made this decision 16+ years ago. It doesn't make sense today.

I propose that we make this change conditional on OS version.

Thanks Phil. I can get behind this. I still believe this is pointless, and it will still result in unnecessary conversions for people using PortAudio on, say, XP. But it's not like I really care about XP, and the extra code complexity is trivial, so I'm fine with this.

If we could just use GetVersion() it would be a simple change, but right now Microsoft seems to be strongly advising against it

There is precedent for using GetVersionEx() in DirectSound:

GetVersionEx( &osvi );

In practice I don't think using GetVersion() is really a problem for this use case. Microsoft warns about using GetVersion() because it cannot be used to detect Windows versions greater than Windows 8, but that's perfectly fine here. This other MSDN doc explicitly states (emphasis mine):

In Windows 8.1 and later, the GetVersion and GetVersionEx functions have been deprecated. As of Windows 10, the VerifyVersionInfo function is also deprecated. You can still call the deprecated functions, but if your application doesn't specifically target Windows 8.1 or later, then the functions will return the Windows 8 version (6.2).

Dmitry is dynamically loading the symbol in pa_win_wasapi.c.

I suspect this is purely because GetVersion() and GetVersionEx() can only be used in "desktop" apps, not in "Universal Windows apps" (a.k.a WinRT). I don't think this is a problem for MME or DS because PortAudio never promised these would be usable in UWP apps. In fact, I suspect MME itself cannot be used in UWP apps (MSDN says "desktop apps only" for MME functions).

I'll update the PR to use GetVersion() when I get the chance.

Currently, the MME Host API code only creates 16-bit integer MME
buffers. All audio data provided by the user is therefore converted by
PortAudio to and from 16-bit, regardless of the user buffer format.

This basically makes it impossible to pass 24-bit audio through the
MME Host API.

If the user buffer format is not 16-bit, this also causes pointless
conversions to take place, *even if the hardware is running at 16-bit*,
because modern Windows versions (Vista+) convert the data to floating
point behind the scenes before it is handed off to the hardware. This
can lead to silly situations where 32-bit float samples from the user
are (lossily) converted to 16-bit by PortAudio, then ended off to
Windows via MME, only to be converted back to 32-bit float again, before
finally being converted to the format the hardware device is configured
to use. This can easily lead to two layers of 16-bit dithering (one from
PortAudio, and one from Windows) being piled on top of each other,
resulting in an elevated noise floor.

This commit fixes this problem by configuring the MME buffers to use the
same format as the user buffer. This should stop PortAudio from
converting samples in all cases except paInt8, which is not supported by
WAVEFORMATEX (only paUInt8 is). This is pretty much the same idea as
PortAudio#774.

The new code assumes that MME will accept whatever format we throw at
it. I feel confident that this is always true on Vista+ regardless of
hardware, because starting from Vista MME does not access hardware
directly - it always goes through the Windows Audio Engine which
supports all formats in both directions (I verified this using
paloopback).

On pre-Vista Windows, this should still work all the way back to Windows
98 SE, because MME goes through KMixer which supports all formats [1].
Nevertheless, it's difficult to be sure, so this code checks the Windows
version it's running on and preserves the old behavior (i.e. always use
Int16) on pre-Vista Windows.

[1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/audio/background-of-non-pcm-support#waveout-api

Fixes PortAudio#139
@dechamps
Copy link
Contributor Author

I'll update the PR to use GetVersion() when I get the chance.

Done.

Copy link
Collaborator

@philburk philburk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I'll let Ross give final approval.

@dmitrykos
Copy link
Collaborator

dmitrykos commented Aug 1, 2023

Dmitry is dynamically loading the symbol in pa_win_wasapi.c.

I suspect this is purely because GetVersion() and GetVersionEx() can only be used in "desktop" apps, not in "Universal Windows apps" (a.k.a WinRT).

WASAPI's implementation does not attempt to load dynamically GetVersion when compiled for UWP, that part of the code is excluded with ifndef and is hardcoded to return either WINDOWS_10_SERVER2016 or WINDOWS_8_SERVER2012:

#if (_WIN32_WINNT >= _WIN32_WINNT_WIN10)
return WINDOWS_10_SERVER2016;
#else
return WINDOWS_8_SERVER2012;
#endif

The reason of the complex implementation of WASAPI's GetWindowsVersion which is loading dynamically RtlGetVersion, or GetVersion, or VerifyVersionInfoA is to be able to detect Windows Vista, Windows 7, Windows 8, Windows 10 when application is compiled as Win32. Unfortunately Microsoft made it really fragmented by deprecating functions and introducing new.

MME itself cannot be used in UWP apps

Indeed, only WASAPI.

I noticed from discussion that DS and now MME host implementations will use GetVersion which is very unportable. My proposal is to extract WASPI's GetWindowsVersion

static EWindowsVersion GetWindowsVersion()
and move it to the new /os/win/pa_win_version.c file which would serve portable and reliable Windows version detection for all host APIs?

If this proposal is acceptable I could take this task and propose the PR for that covering WASAPI, MME and DS host APIs. Please let me know your opinion.

@philburk
Copy link
Collaborator

philburk commented Aug 1, 2023

@dmitrykos - I think it would be great to have a portable GetWindowsVersion() available as a utility.
Thank you for offering. We will probably go ahead and merge this PR. We can then change it to call your GetWindowsVersion() when it is ready.

@dmitrykos
Copy link
Collaborator

@philburk,

We will probably go ahead and merge this PR. We can then change it to call your GetWindowsVersion() when it is ready.

Yes, of course. I will introduce separate PR after this one is merged.

@dechamps
Copy link
Contributor Author

dechamps commented Aug 1, 2023

I noticed from discussion that DS and now MME host implementations will use GetVersion which is very unportable.

Can you explain why you think GetVersion is "very unportable"? Keep in mind we don't need to distinguish between pre-Vista and Vista here.

@dmitrykos
Copy link
Collaborator

dmitrykos commented Aug 1, 2023

@dechamps you could have a look at implementation

static EWindowsVersion GetWindowsVersion()

it is quite self explanatory.

My previous comment about portability issues was not related to GetVersion usage particularly for this PR, portability issues start from Windows Vista and up and proposal is to unify getting Windows version by using single API wrapped by PA, no matter whether this or that PR will hit portability issues or not.

@RossBencina
Copy link
Collaborator

Thanks for your patience. We're going to merge this, but two comments:

1 A minor detail: I'm not sure how we arrived at windows Vista as the target, I would have thought 7 was fine. Not a big deal, but it does seem strange to me.
2. I did raise this earlier, but maybe I was not clear enough, the latest Microsoft docs contain this caveat on GetVersion:

GetVersion may be altered or unavailable for releases after Windows 8.1. Instead, use the Version Helper functions. For Windows 10 apps, please see Targeting your applications for Windows.

https://learn.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-getversion

Hopefully Dmitry's version will help with this issue.

@RossBencina RossBencina merged commit 7e68ca3 into PortAudio:master Aug 4, 2023
@dechamps
Copy link
Contributor Author

dechamps commented Aug 4, 2023

Thanks for finally merging this. Progress!

1 A minor detail: I'm not sure how we arrived at windows Vista as the target, I would have thought 7 was fine. Not a big deal, but it does seem strange to me.

Vista is the first Windows version to include the modern Windows audio stack, so it seems natural to use it as the boundary.

  1. I did raise this earlier, but maybe I was not clear enough, the latest Microsoft docs contain this caveat on GetVersion:

GetVersion may be altered or unavailable for releases after Windows 8.1. Instead, use the Version Helper functions. For Windows 10 apps, please see Targeting your applications for Windows.

I did notice, and I did respond to your point in a previous comment. I strongly suspect that sentence in the Microsoft docs is just poorly phrased - Microsoft almost certainly means to say that GetVersion cannot be used to distinguish between post-8.1 Windows versions, not that GetVersion itself would suddenly stop working. It would make no sense for them to break the existing behaviour of GetVersion as it would compromise application compatibility for zero benefit to them. Historically, Microsoft tends to take Windows app compatibility very seriously.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
src-wmme MS WMME Host API /src/hostapi/wmme
Projects
None yet
Development

Successfully merging this pull request may close these issues.

enable >16 bit output with WMME api
4 participants