Test: System.Net.Sockets.Tests.SendReceiveEap.SendToRecvFrom_Datagram_UDP(loopbackAddress: ::1) failed in CI #1712

Closed
KristinXie1 opened this issue Mar 10, 2017 · 20 comments · Fixed by #44591
Labels: area-System.Net.Sockets, disabled-test, test-bug, test-run-core

Comments

@KristinXie1

Failed test: System.Net.Sockets.Tests.SendReceiveEap.SendToRecvFrom_Datagram_UDP(loopbackAddress: ::1)
Configuration: OuterLoop_CentOS7.1_debug (build#123)

Message:

System.Net.Sockets.Tests.SendReceiveEap.SendToRecvFrom_Datagram_UDP(loopbackAddress: ::1) [FAIL]
        Assert.True() Failure
        Expected: True
        Actual:   False

Stack Trace:

/mnt/resource/j/workspace/dotnet_corefx/master/outerloop_centos7.1_debug/src/System.Net.Sockets/tests/FunctionalTests/SendReceive.cs(148,0): at System.Net.Sockets.Tests.SendReceive.<SendToRecvFrom_Datagram_UDP>d__17.MoveNext()
           --- End of stack trace from previous location where exception was thrown ---
              at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
              at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
           --- End of stack trace from previous location where exception was thrown ---
              at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
              at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
           --- End of stack trace from previous location where exception was thrown ---
              at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
              at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)

Detail: https://ci.dot.net/job/dotnet_corefx/job/master/job/outerloop_centos7.1_debug/123/consoleText

@karelz
Member

karelz commented Mar 14, 2017

cc @Priya91 @ianhays @steveharter

@steveharter
Member

@stephentoub this test was added about three weeks ago. Any thoughts? We should probably disable it.

  int sent = await SendToAsync(right, new ArraySegment<byte>(sendBuffer), leftEndpoint);
>>Assert.True(await receiverAck.WaitAsync(AckTimeout));
  senderAck.Release();

Also, there are two Assert.True calls here without a userMessage argument, so one should be added to each.

@stephentoub
Member

> this test was added about three weeks ago

The test itself has actually been in the repo since 2015:
dotnet/corefx@09932a4#diff-5d77ee23eb2ea596e218f9c1ef09d793R22

What changed a few weeks ago was allowing the test to work with the various Async APIs on Socket, e.g. allowing it to work with the EAP, Task, and APM methods (and the sync ones), rather than just the APM ones. It's possible an issue was introduced as part of that conversion.

@KristinXie1
Author

Failed again on build 20170324.01

@stephentoub
Member

> What changed a few weeks ago was allowing the test to work with the various Async APIs on Socket

@steveharter, actually, FYI, it looks like the test has failed not just on EAP but also on APM, which was the one that previously existed.

@stephentoub
Member

@steveharter, this looks like the same problem as was fixed for some other tests in https://github.com/dotnet/corefx/issues/5185. The test is sending 10 packets over UDP and expecting all 10 to be received, which isn't guaranteed. There's even a TODO in the test stating that it needs to be hardened against such loss:
https://github.com/dotnet/corefx/blob/ed823ad9470f2ecf412d5089fe36cd6958fc5834/src/System.Net.Sockets/tests/FunctionalTests/SendReceive.cs#L71
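
To illustrate why "send 10, expect 10" is fragile, here is a minimal sketch of a loss-tolerant receive loop. It is not the code from the corefx test suite: the names `UdpLossSketch` and `SendAndCollect` are hypothetical, and it assumes IPv4 loopback and synchronous socket calls. Each datagram carries a sequence number, and the receiver collects whatever arrives before a timeout instead of assuming every datagram was delivered.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;

// Hypothetical helper, for illustration only.
public static class UdpLossSketch
{
    // Sends `count` datagrams, each carrying its sequence number, over IPv4 loopback.
    // Returns the set of sequence numbers that arrived before a receive timed out.
    public static HashSet<int> SendAndCollect(int count, int receiveTimeoutMs)
    {
        using var receiver = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
        using var sender = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);

        // Port 0 asks the OS for a free ephemeral port.
        receiver.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        receiver.ReceiveTimeout = receiveTimeoutMs;
        EndPoint target = receiver.LocalEndPoint
            ?? throw new InvalidOperationException("Receiver socket is not bound.");

        for (int i = 0; i < count; i++)
        {
            sender.SendTo(BitConverter.GetBytes(i), target);
        }

        var received = new HashSet<int>();
        var buffer = new byte[sizeof(int)];
        EndPoint from = new IPEndPoint(IPAddress.Any, 0);

        while (received.Count < count)
        {
            try
            {
                int n = receiver.ReceiveFrom(buffer, ref from);
                if (n == sizeof(int))
                {
                    received.Add(BitConverter.ToInt32(buffer, 0));
                }
            }
            catch (SocketException e) when (e.SocketErrorCode == SocketError.TimedOut)
            {
                // UDP gives no delivery guarantee: stop waiting and report what arrived.
                break;
            }
        }

        return received;
    }
}
```

A test built on this idea would assert on a tolerance (for example, that at least some datagrams arrived and that none were corrupted) rather than on an exact count.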

@steveharter
Member

> What changed a few weeks ago was allowing the test to work with the various Async APIs on Socket
>
> @steveharter, actually, FYI, it looks like the test has failed not just on EAP but also on APM, which was the one that previously existed.

FWIW according to jdash, the tests started failing on 3/9/2017. The refactoring work was on 2/26/2017 - dotnet/corefx@ca392ca#diff-5d77ee23eb2ea596e218f9c1ef09d793

Test Case: System.Net.Sockets.Tests.SendReceiveEap.SendToRecvFrom_Datagram_UDP(loopbackAddress: ::1)

Failed Jenkins Jobs

| Job | Build Number | Machine Name | Date |
| --- | --- | --- | --- |
| dotnet_corefx/master/outerloop_centos7.1_debug | 123 | centos71-20170216-outer1fbe70 | 03/09 07:52 AM |
| dotnet_corefx/master/outerloop_centos7.1_debug | 125 | centos71-20170216-outer125790 | 03/11 07:52 AM |
| dotnet_corefx/master/outerloop_centos7.1_release | 123 | centos71-20170216-outer6ffb00 | 03/12 03:32 PM |

@steveharter
Member

steveharter commented Mar 28, 2017

I believe many of the timeouts, or the supposed UDP packet loss, are caused by interference from other tests that happen to listen on another test's port. Some tests do a ReceiveFrom on a port that they did not bind to (the IPv4\IPv6 dual mode tests), so the 'bad' test receives the data, causing the 'good' test to time out or miss some data. This is mostly a Linux issue due to the randomness in port assignment (vs. Windows, where assignment is incremental).

So if the refactoring isn't to blame, perhaps a new or modified test is (one that was added\modified on or a few days before 3/9/2017).
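
If cross-test interference of this kind were the cause, one defensive pattern is for the receiver to check where each datagram came from and to skip anything not sent by its own peer. The sketch below is an assumption about how that could look, not code from the repository; `SenderFilterSketch` and `ReceiveFromExpected` are hypothetical names.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical helper, for illustration only.
public static class SenderFilterSketch
{
    // Blocks until a datagram from `expectedSender` arrives and returns its length.
    // Datagrams from any other endpoint are read and discarded.
    // If receiver.ReceiveTimeout is set, a SocketException is thrown on timeout.
    public static int ReceiveFromExpected(Socket receiver, EndPoint expectedSender, byte[] buffer)
    {
        // The template endpoint must match the receiver's address family.
        IPAddress any = receiver.AddressFamily == AddressFamily.InterNetworkV6
            ? IPAddress.IPv6Any
            : IPAddress.Any;

        while (true)
        {
            EndPoint from = new IPEndPoint(any, 0);
            int n = receiver.ReceiveFrom(buffer, ref from);
            if (from.Equals(expectedSender))
            {
                return n;
            }
            // Otherwise: a stray datagram from some other socket; ignore it and keep waiting.
        }
    }
}
```

This only protects the receiving side. It does not help when another socket consumes a datagram that was meant for the test.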

@stephentoub stephentoub removed their assignment Apr 12, 2017
@steveharter
Member

steveharter commented Apr 15, 2017

Stress testing Windows 10 resulted in this test starting but not finishing (so not considered a failure in CI reports), unlike Linux which does fail in CI. For the Windows 10 repro, a background exception was reported that may have originated from that test, or from the other test that was currently running: System.Net.Sockets.Tests.DualModeConnectToIPAddressArray.DualModeConnect_IPAddressListToHost_Throws

System.Net.Sockets.Tests.SendReceiveApm.SendToRecvFrom_Datagram_UDP(loopbackAddress: 127.0.0.1) [STARTING]
...
System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host

For Linux, the failure is possibly interference from the DualMode tests, which mix\match IPv4\v6 addresses and expect failures. These may randomly hit port collisions on Linux, so those tests should be disabled.

@karelz
Member

karelz commented Apr 16, 2017

Overall it looks like a 1/week failure rate ... borderline for addressing it in 2.0. Given that this is most likely just a bad test and we're tight on workforce for 2.0, we will keep it in Future - I marked it as 'wishlist' as it should be on top of our Future backlog.

@wfurt
Member

wfurt commented Dec 7, 2017

I could not find any recent failure. Please reopen if this fails again.

@wfurt wfurt closed this as completed Dec 7, 2017
@krwq
Member

krwq commented Apr 4, 2018

There are no failures because the test case is disabled:

System.Net.Sockets/tests/FunctionalTests/SendReceive.cs:41

@krwq krwq reopened this Apr 4, 2018
@wfurt wfurt self-assigned this Mar 7, 2019
@wfurt wfurt removed their assignment Aug 6, 2019
@antonfirsov antonfirsov transferred this issue from dotnet/corefx Jan 14, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.Net.Sockets untriaged New issue has not been triaged by the area owner labels Jan 14, 2020
@antonfirsov
Member

This is a major blocker for implementing #938, since we need a robust way to test UDP if we want to cover those changes.

@antonfirsov antonfirsov added disabled-test The test is disabled in source code against the issue test-bug Problem in test source code (most likely) labels Jan 14, 2020
@antonfirsov
Member

antonfirsov commented Nov 6, 2020

This failure is likely not about port stealing or any other direct interference with other tests.

The test tends to fail when CPU load is high. I can reproduce the failure by running the Theory's cases on a 2 VCPU Linux system in parallel with one single other test case that calculates PI.
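
A rough sketch of how such CPU pressure could be generated while a timing-sensitive operation runs. This is not the actual repro harness; `CpuLoadSketch`, `EstimatePi`, and `RunUnderLoad` are hypothetical names, and the Leibniz series stands in for whatever pi calculation was used.

```csharp
using System;
using System.Threading;

// Hypothetical helper, for illustration only.
public static class CpuLoadSketch
{
    // CPU-bound busy work: Leibniz series for pi, 4 * sum((-1)^k / (2k + 1)).
    public static double EstimatePi(long terms)
    {
        double sum = 0.0;
        double sign = 1.0;
        for (long k = 0; k < terms; k++)
        {
            sum += sign / (2 * k + 1);
            sign = -sign;
        }
        return 4.0 * sum;
    }

    // Runs `body` while `workers` background threads keep the CPU busy,
    // then stops the workers and returns the result of `body`.
    public static T RunUnderLoad<T>(int workers, Func<T> body)
    {
        using var cts = new CancellationTokenSource();
        CancellationToken token = cts.Token;
        var threads = new Thread[workers];

        for (int i = 0; i < workers; i++)
        {
            threads[i] = new Thread(() =>
            {
                while (!token.IsCancellationRequested)
                {
                    GC.KeepAlive(EstimatePi(100_000));
                }
            })
            { IsBackground = true };
            threads[i].Start();
        }

        try
        {
            return body();
        }
        finally
        {
            // Always stop and join the workers, even if `body` throws.
            cts.Cancel();
            foreach (Thread t in threads)
            {
                t.Join();
            }
        }
    }
}
```

With the worker count set to the number of cores or higher, a receive with a short acknowledgement timeout inside `body` becomes more likely to time out, which is the kind of failure described above.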

antonfirsov added a commit that referenced this issue Nov 16, 2020
…44591)

Some `SendReceive` socket tests may be prone to timing issues on CI. This seems to be the root cause of #1712. We need a reliable way to run such tests to unblock the work on new UDP socket APIs in #33418.

This PR defines a new `SendReceiveNonParallel` test group, moving `SendToRecvFrom_Datagram_UDP` into that group. Since this is already a significant reorganization, it seemed reasonable to also:
- Harmonize naming: all SendReceive test classes are now named either `SendReceive_[SubVariant]` or `SendReceiveNonParallel_[SubVariant]`
- Split `SendReceive.cs` into multiple files:
    - `SendReceive.cs` for the parallel variants
    - `SendReceiveNonParallel.cs` for the new, non-parallel variants
    - Rename the non-generic class `SendReceive` to `SendReceiveMisc` (to avoid name collision and confusion with `SendReceive<T>`) and move it to `SendReceiveMisc.cs` 
    - Move `SendReceiveListener` and `SendReceiveUdpClient` to separate files, rename `SendReceiveListener` to `SendReceiveTcpClient`
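
For readers unfamiliar with how a non-parallel test group can be expressed in xUnit, here is a minimal sketch. It assumes xUnit 2.3 or later (the `Xunit` package) and uses hypothetical class and collection names; it is not the code from the PR. Marking a collection definition with `DisableParallelization = true` keeps its tests from running in parallel with any other test collection.

```csharp
using Xunit;

// Hypothetical collection name. Tests in this collection run one at a time
// and never alongside tests from other collections.
[CollectionDefinition(nameof(TimingSensitiveSocketTests), DisableParallelization = true)]
public class TimingSensitiveSocketTests { }

// Hypothetical test class that opts in to the non-parallel collection.
[Collection(nameof(TimingSensitiveSocketTests))]
public class SendReceiveNonParallel_Sketch
{
    [Theory]
    [InlineData("127.0.0.1")]
    [InlineData("::1")]
    public void SendToRecvFrom_Datagram_UDP(string loopbackAddress)
    {
        // Placeholder body: the real send/receive logic would go here.
        Assert.False(string.IsNullOrEmpty(loopbackAddress));
    }
}
```

The trade-off is longer total run time, since these tests no longer overlap with the rest of the suite.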
@ghost ghost locked as resolved and limited conversation to collaborators Dec 25, 2020
MichalStrehovsky added a commit to MichalStrehovsky/runtime that referenced this issue Dec 9, 2021
Fixes dotnet#1712.

Some RVA data blobs within the compiler are special and contain other dependencies the compiler needs to look at during scanning phase.

This fixes an issue where the `<Module>` type wasn't having its metadata generated in optimized builds because the scanning phase never saw the `<Module>` type being allocated and didn't predict it as needing metadata. The p/invoke fixup blob references the `<Module>` type as "a type from the assembly that contained the p/invoke". We need to scan the fixup blob during the scanning phase so that the type is seen.
8 participants