[1.37.0] Stack overflow #10749

Open
Tracked by #10376
posvyatokum opened this issue Mar 11, 2024 · 0 comments
Users are observing stack overflow crashes that correlate with high network load.
Related issue: #10604

The stack overflow problem was flagged and investigated internally before the 1.37.0 release. We understood that the issue was connected to the network actor, but we couldn't reproduce it in the debug environment. It was judged rare enough not to disrupt the network.

After the 1.37.0 release to mainnet, validators started seeing this issue far more frequently than anticipated.
At the same time, a similar issue was discovered and debugged in statelessnet (#10663). We didn't have enough confidence in that solution to ship a release with it before the resharding, so we came up with an ad-hoc patch (915aea7) that sacrifices debug data in the network actor to avoid the stack overflow without any major refactoring.
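
For readers unfamiliar with this kind of trade-off, here is a minimal Rust sketch of why dropping debug data can reduce stack pressure. It is an illustration only and does not reproduce the contents of 915aea7. The types `DebugInfo`, `MessageWithDebug`, and `MessageLean`, the 4 KiB buffer size, and the helper `sizes` are all hypothetical. The assumption is that a value carrying a large inline debug payload takes up that much space in every stack frame that holds it by value, so removing the payload shrinks those frames.

```rust
use std::mem::size_of;

// Hypothetical debug payload stored inline (not on the heap).
#[allow(dead_code)]
struct DebugInfo {
    trace: [u8; 4096],
}

// Hypothetical message that carries the debug payload by value.
// Every stack frame holding one of these needs room for the whole buffer.
#[allow(dead_code)]
struct MessageWithDebug {
    id: u64,
    debug: DebugInfo,
}

// The same hypothetical message with the debug payload dropped.
#[allow(dead_code)]
struct MessageLean {
    id: u64,
}

// Returns (size with debug data, size without) in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<MessageWithDebug>(), size_of::<MessageLean>())
}

fn main() {
    let (with_debug, lean) = sizes();
    println!("with debug: {with_debug} bytes, lean: {lean} bytes");
}
```

The cost is the same one described above: the information in the debug payload is no longer available when investigating problems.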

Since the 1.37.1 release with that patch, we have not seen any complaints from validators, so we assume that the issue is solved in this release.
