net: TCP RST handling is unreliable on OSX #50254
Comments
TCP RST handling is flaky on OSX, see golang/go#50254. We can keep this test from failing randomly by using QUIC instead.
* don't run the reconnect test using TCP on OSX, due to golang/go#50254.
* don't check the error return value when closing stream in reconnect test
This looks like the same macOS bug as described here:
@seankhliao I see the "WaitingForInfo" label was applied here. I'm happy to provide more info, if that helps us get closer to a fix. What else do you need?
Does it reproduce on 1.19 (which should include the workaround for #37795)?
Yes, it does. Running my test case with
I'm running Go 1.19 on an M1 Macbook Pro:
Here's a new pcap for the failing test case above: dump.pcapng.zip
Unfortunately, this is a macOS bug, not a Go bug. (See #37795 (comment), which reproduces the problem entirely in C.) Any fix is going to have to come from Apple. Perhaps there's a workaround that can be applied in the Go runtime, but I can't think of one.
FYI, the workaround for #37795 is just a change to the flaky test which makes it avoid triggering the macOS bug. It doesn't address the underlying problem. (Which is a kernel bug triggered--I think--by receiving a RST while in the process of accepting a connection.)
Makes sense, thanks for investigating @neild.
I reported this through Apple's "Feedback Assistant", which appears to be the only official way to report issues these days. It did not give me any issue number. (Used to be you'd get a radar number, but no way to track activity on the issue.)
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes, tested with go1.18beta1.
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
I'm seeing occasional TCP connection timeouts, even though the other side has sent (and we have received) the TCP RST packet.
I managed to reproduce the failure using the following minimal working example. It reliably fails at least once when run 1000 times (go test -run TestLinger -count 1000 -v -failfast).
What did you expect to see?
I expect the net.TCPConn to be reliably closed / reset when we receive the TCP RST packet.
This works reliably on Linux (incl. the code I posted above), but not on OSX.
What did you see instead?
Occasionally (the test case above fails in maybe 1 out of 100 runs), the TCP RST doesn't seem to have any effect on the connection at all. The connection stays open, and eventually (after a long time) runs into a connection timeout.
Here's a pcap of this transport: tcprst.pcapng.gz, showing that the RST was actually sent (and received).