Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: replace use of DevCreateObjectQuery with CM_Register_Notifcation… #7

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dblohm7
Copy link

@dblohm7 dblohm7 commented Mar 23, 2023

… and ensure that the device interface list is explicitly queried immediately after notification registration

Tailscale has several open issues concerning WinTun adapter installations that hang indefinitely, stuck waiting on an event that never triggers. Based on analysis of Tailscale logs and examination of WinTun's adapter installation code, I believe we are seeing a race condition that only manifests under specific timing scenarios that are relatively infrequent for most users, but are common for some.

It is hard to reason about DevCreateObjectQuery because it is an undocumented API. Based upon my reading of the existing code and examination of existing documentation, it looks like CM_Register_Notification is the documented counterpart. Its documentation makes it pretty clear[1] that these notifications are essentially edge-triggered; we need to call CM_Get_Device_Interface_List immediately after registering the notification to ensure that the arrival event has not already happened.

This patch replaces DevCreateObjectQuery with CM_Register_Notification because using a documented API gives us a better understanding of the semantics involved and makes this code easier to reason about. Then we add a call to AdapterGetDeviceObjectFileName to handle the aforementioned explicit check, thus ensuring that we are not waiting on an event that might have occurred prior to registering for event notifications.

[1] https://learn.microsoft.com/en-us/windows/win32/api/cfgmgr32/nf-cfgmgr32-cm_register_notification#remarks

… and ensure that the device interface list is explicitly queried immediately after notification registration

Tailscale has several open issues concerning WinTun adapter
installations that hang indefinitely, stuck waiting on an event that
never triggers. Based on analysis of Tailscale logs and examination of WinTun's
adapter installation code, I believe we are seeing a race condition that only
manifests under specific timing scenarios that are relatively infrequent for
most users, but are common for some.

It is hard to reason about DevCreateObjectQuery because it is an
undocumented API. Based upon my reading of the existing code and examination of
existing documentation, it looks like CM_Register_Notification is the documented
counterpart. Its documentation makes it pretty clear[1] that these
notifications are essentially edge-triggered; we need to call
CM_Get_Device_Interface_List immediately after registering the
notification to ensure that the arrival event has not already happened.

This patch replaces DevCreateObjectQuery with CM_Register_Notification
because using a documented API gives us a better understanding of the semantics
involved and makes this code easier to reason about. Then we add a call to
AdapterGetDeviceObjectFileName to handle the aforementioned explicit
check, thus ensuring that we are not waiting on an event that might have
occurred prior to registering for event notifications.

[1] https://learn.microsoft.com/en-us/windows/win32/api/cfgmgr32/nf-cfgmgr32-cm_register_notification#remarks

Signed-off-by: Aaron Klotz <aaron@tailscale.com>
@lezzi
Copy link

lezzi commented Aug 31, 2024

Hey @dblohm7 👋 . Were you ever able to somehow reproduce this scenario?

@xwwang
Copy link

xwwang commented Oct 25, 2024

I started to use WireGuard thoes months, and calling WireGuardCreateAdapter might fail by accident. The error message is: "Timed out waiting for device query". It seems like the device create event is missing and result in WaitForSingleObject interface timeout.

Most times recall WireGuardCreateAdapter will success, but rarely recall didn't work and I should restart the OS.

It's hard to reproduce this problem Cause I don't know how it happened.....

@dblohm7 Am I caught in the same situation which you're trying to fix? If so I'd like to pull this commit and see whether it works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants