-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add DHCP IP Retries in PrepareHNSNetwork #5819
Conversation
/test-windows-containerd-e2e |
a765a81
to
3aafc50
Compare
/test-windows-containerd-e2e |
1 similar comment
/test-windows-containerd-e2e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a e2e test to verify that vNIC migration is not supposed to modify the original properties.
7c919ee
to
801139a
Compare
/test-windows-containerd-e2e |
4b57901
to
e912ea2
Compare
dcccd93
to
5973c2f
Compare
f2b4ca2
to
4bf6142
Compare
/test-windows-containerd-e2e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-windows-containerd-e2e |
/test-windows-containerd-e2e |
/test-windows-containerd-e2e |
pkg/agent/util/net_windows.go
Outdated
if err != nil { | ||
klog.ErrorS(err, "Failed to get Ipv4 DHCP status on the network adapter", "adapter", uplinkAdapter.Name) | ||
} | ||
klog.Warningf("Timeout acquiring IP for the adapter, DHCP status: %t", dhcpStatus) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Use structured logging for new logs, use InfoS if this is something expected and ErrorS if not expected to happen.
- Logging dhcpStatus could be confusing when it fails to get its status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the logic, now we only print dhcpStatus
if we successfully retrieve its value from adapter. @tnqn
// Therefore, we set the timeout limit to triple of that value, allowing a maximum wait of 6 seconds here. | ||
err = wait.PollImmediate(1*time.Second, 6*time.Second, func() (bool, error) { | ||
var checkErr error | ||
adapter, ipFound, checkErr = adapterIPExists(nodeIPNet.IP, uplinkAdapter.HardwareAddr, ContainerVNICPrefix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does Windows support both DHCP and static case? If yes, these retries would add unnecessary initialization delay for static IP case.
I think it should first check if this is DHCP case, and only expects it to get IP from DHCP server when DHCP is enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my understanding, the "IP delay available issue" occurs during the creation of a new uplink interface, and this delay is consistent regardless of whether DHCP is enabled on the interface. Regarding the static IP case, we still expect to get available IP from adapter timely. Otherwise it may indicate an issue with the adapter itself. @wenyingd please correct me if I am wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For static IP configurations, we still expect Windows OS can automatically migrate the IP address from pnic to vnic. We would like to use this check to ensure that Windows OS has performed expected behavior, and gives warning logs if not.
f2f6c7a
to
7d125ac
Compare
/test-windows-all |
a737bab
to
ee8222f
Compare
/test-windows-all |
pkg/agent/util/net_windows.go
Outdated
if err == wait.ErrWaitTimeout { | ||
dhcpStatus, err := InterfaceIPv4DhcpEnabled(uplinkAdapter.Name) | ||
if err != nil { | ||
klog.ErrorS(err, "Failed to get Ipv4 DHCP status on the network adapter", "adapter", uplinkAdapter.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
klog.ErrorS(err, "Failed to get Ipv4 DHCP status on the network adapter", "adapter", uplinkAdapter.Name) | |
klog.ErrorS(err, "Failed to get IPv4 DHCP status on the network adapter", "adapter", uplinkAdapter.Name) |
pkg/agent/util/net_windows.go
Outdated
@@ -647,6 +670,16 @@ func HostInterfaceExists(ifaceName string) bool { | |||
return true | |||
} | |||
|
|||
// InterfaceIPv4DhcpEnabled returns the Ipv4 DHCP status on the specified interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
In the commit message: s/a warning message will be returned/an error will be logged/ |
No need to rerun tests after addressing the typos |
To address the potential race condition issue where acquiring a DHCP IP address may fail after CreateHNSNetwork, we added a retry mechanism to wait for an available IP. If the DHCP IP cannot be acquired within six seconds, an error will be logged. Signed-off-by: Shuyang Xin <gavinx@vmware.com>
Got it, PR has been updated. |
/skip-all |
To address the potential race condition issue where acquiring a DHCP IP address may fail after CreateHNSNetwork,
we added a retry mechanism to wait for an available IP. If the DHCP IP cannot be acquired within six seconds,
an error will be logged.