[linux-arm64] Random and rare runtime crash System.ArgumentOutOfRangeException (System.Net.Sockets) #72365
Comments
Tagging subscribers to this area: @dotnet/ncl
|
@NQ-Brewir are you working on getting a repro or some more actionable information? I would recommend closing the issue until there is actionable info. |
I'm also getting the same error. It looks like it's also appearing here: aws/aws-lambda-dotnet#1244. My config: This happens (again, intermittently) when I'm running/debugging a few microservices (on Kestrel - was unsure if this was a Kestrel issue, but saw this reported here). One additional piece of info is that in the framework method that throws the exception:
The argument parameter is always (int) 40. Just like the linked AWS issue, none of my exception handlers seem to be catching the error. Any ideas on next steps? I can't seem to isolate the exception for a repro... |
I decided to run exactly the same code in the same way on a Windows machine: Dotnet runtime version: v7.0.100-preview.7 (x64) ...it's been running now for 24hr without error. I'll keep it running, but I'd normally get that ^ exception thrown within a couple of hours on ARM64/macOS - so more of a platform issue? |
The issue is still happening, but way less often since we removed all ValueTask from our codebase. |
@NQ-Brewir, could you try catching the unhandled exception via the AppDomain event handler and dumping the full exception object to the logger (with the inner exception)? Note that it can get too noisy and costly in a production environment, so you may want to filter which exception objects to dump. The call stack in the top post resembles the lower part of the exception @BrennanConroy logged here: aspnet/SignalR#1703 (comment). I'm not sure if it is the same (mysterious) issue. If it is, then going by SignalR's call stack, the inner exception of runtime/src/libraries/System.Net.Sockets/src/System/Net/Sockets/Socket.Tasks.cs Lines 1266 to 1283 in 3d74b00
|
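A minimal sketch of the AppDomain suggestion above, in case it helps anyone trying to capture more detail. The names `CrashLogging`, `Describe`, `ShouldDump` and the `log` delegate are hypothetical, and filtering on `ArgumentOutOfRangeException` is only an assumption based on the exception type in the issue title. Note that `AppDomain.CurrentDomain.UnhandledException` is a notification only: it does not stop the process from terminating.

```csharp
using System;
using System.Text;

public static class CrashLogging
{
    // Flattens an exception and its InnerException chain into one string.
    public static string Describe(Exception ex)
    {
        var sb = new StringBuilder();
        for (var e = ex; e != null; e = e.InnerException)
        {
            sb.AppendLine($"{e.GetType().FullName}: {e.Message}");
            sb.AppendLine(e.StackTrace); // may be null if the exception was never thrown
        }
        return sb.ToString();
    }

    // Hypothetical filter (an assumption): only dump chains that contain
    // an ArgumentOutOfRangeException, to keep production logs quiet.
    public static bool ShouldDump(Exception ex)
    {
        for (var e = ex; e != null; e = e.InnerException)
        {
            if (e is ArgumentOutOfRangeException) return true;
        }
        return false;
    }

    // Registers the handler; 'log' stands in for whatever logger the app uses.
    public static void Install(Action<string> log)
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, args) =>
        {
            if (args.ExceptionObject is Exception ex && ShouldDump(ex))
            {
                log(Describe(ex));
            }
        };
    }
}
```

Calling `CrashLogging.Install(Console.Error.WriteLine)` early in `Main` would be one way to wire it up. `Exception.ToString()` already includes inner exceptions, so `log(ex.ToString())` is a simpler alternative if no custom formatting is needed.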
I saw what looks like this issue in a CI pipeline run on osx/arm64:
note: it looks like a crash dump was created |
This can be a bug, we should investigate.
Unfortunately, I see the following:
@dotnet/dnceng any chance the limit can be increased? @paulquinn it's been a while, but any chance you can produce a dump for us? |
Unfortunately this limitation comes about from our support of on-premises machines; these tend to cost us lots of time and money uploading large dumps which are often ignored despite this time/financial cost. If you need to check out a machine with the same specifications as the test one, that can likely be arranged, we'd just need to know the specific queue that this work item ran on (or have its full log linked, etc) |
I just noticed I missed the start of the conversation and the fact that this is technically a duplicate of #70486. It might be worth keeping it open because of the number of reports we see. |
We really should compress the dumps @MattGal. They are often full of zeros and we can probably get a 10:1 gain |
This was discussed in dotnet/dnceng#1219, feel free to reopen it and make your case. |
@am11 I tried logging more info using the AppDomain event handler, but the crash does not seem to go through it. |
We had to migrate back to amd64 due to some other reasons, and the server is not crashing anymore. This issue is thus really due to ARM, and not to any wrongly used ValueTask. |
Thanks for the update. I'll close this and we can reopen if it reoccurs and we're able to get more information for debugging. |
Description
Random and rare crashes with this exception:
It seems to happen only in applications under load.
Exit signal: Abort (6)
Reproduction Steps
We don't have a reproduction yet. We probably need to heavily stress the network! It seems to be a race condition.
Expected behavior
don't crash the runtime when we are using sockets...
Actual behavior
random and rare crashes of the runtime
Regression?
No response
Known Workarounds
No response
Configuration
Dotnet runtime version: 6.0.6
OS: GNU/Linux Debian 11 Bullseye
CPU: ARM64 Graviton 2 (AWS)
We are using Orleans with this application
Other information
Follow-up of #70486.
We triple-checked all usages of ValueTask and removed every one of them, just to be sure.
This time, this is not caused by some ValueTasks being awaited twice.
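For readers unfamiliar with the "awaited twice" pitfall ruled out here, a minimal sketch follows. `ValueTaskUsage` and `ReadAsync` are hypothetical names standing in for any API that returns a `ValueTask<T>`; this is not code from the affected application. A `ValueTask<T>` should be consumed at most once, and converting it with `AsTask()` is the documented way to get something that can be awaited repeatedly.

```csharp
using System.Threading.Tasks;

public static class ValueTaskUsage
{
    // Hypothetical producer; real ones may be backed by a pooled
    // IValueTaskSource that is recycled after the first await.
    public static ValueTask<int> ReadAsync() => new ValueTask<int>(1);

    // Misuse (do not do this): awaiting the same ValueTask instance twice.
    //   var vt = ReadAsync();
    //   var a = await vt;
    //   var b = await vt; // behavior is undefined for pooled sources

    // Correct: consume the ValueTask exactly once.
    public static async Task<int> ConsumeOnceAsync()
    {
        ValueTask<int> pending = ReadAsync();
        return await pending;
    }

    // Correct: convert to a Task first if the result is needed more than once.
    public static async Task<int> ConsumeTwiceAsync()
    {
        Task<int> task = ReadAsync().AsTask();
        int first = await task;
        int second = await task; // safe: a Task can be awaited repeatedly
        return first + second;
    }
}
```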