"sigaltstack too small in native thread" in detach_signal test #6880

Open
derekbruening opened this issue Jul 12, 2024 · 3 comments

@derekbruening
Contributor

This happened once on the aarch64-sve-precommit-256 test:

https://github.com/DynamoRIO/dynamorio/actions/runs/9914012718/job/27392240520?pr=6879

2024-07-12T21:08:37.5756527Z 358: detaching
2024-07-12T21:08:37.5756915Z 358: signal count post-detach: 66757
2024-07-12T21:08:37.5757319Z 358: native signals delivered: 59542
2024-07-12T21:08:37.5758963Z 358: <Application /opt/actions-runner/_work/dynamorio/dynamorio/build/build_debug-internal-64/suite/tests/bin/api.detach_signal (20510). Cannot correctly handle received signal 12 in thread 20513: sigaltstack too small in native thread.>

This code was recently changed by #6815 so it is possible this is an introduced regression.

@abhinav92003
Contributor

> This code was recently changed by #6815 so it is possible this is an introduced regression.

The change in #6815 avoids running out of stack space by reusing the existing frame during native signal delivery for an almost-detached thread (one that has only the removal of DR's main_signal_handler left). That should actually make "sigaltstack too small in native thread" less likely, since native signal delivery now uses less stack space.

We'll need more info on the exact sequence of events happening here.
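
For anyone gathering that info, here is a minimal standalone sketch (not DynamoRIO code; every function and variable name in it is hypothetical) of how an alternate signal stack is sized and used on Linux. It assumes glibc on Linux, where signal 12 on AArch64 is SIGUSR2. It also assumes that the kernel may report a larger minimum through the AT_MINSIGSTKSZ auxiliary vector entry than the compile-time MINSIGSTKSZ, which could matter on an SVE machine; whether that is related to this failure is unknown.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/auxv.h>
#include <unistd.h>

/* All names below are hypothetical; this is not DynamoRIO code. */
static stack_t alt_stack;
static volatile sig_atomic_t ran_on_alt_stack = -1;

/* Smallest alternate stack the platform reports as usable for delivery. */
static size_t
min_sigstack_size(void)
{
    size_t min_size = (size_t)MINSIGSTKSZ;
#ifdef AT_MINSIGSTKSZ
    /* Assumption: the kernel may report a larger value than the constant. */
    unsigned long from_kernel = getauxval(AT_MINSIGSTKSZ);
    if (from_kernel > min_size)
        min_size = (size_t)from_kernel;
#endif
    return min_size;
}

/* Records whether the handler's frame lies inside the alternate stack. */
static void
on_sigusr2(int sig, siginfo_t *info, void *ucxt)
{
    char marker;
    uintptr_t sp = (uintptr_t)&marker;
    uintptr_t base = (uintptr_t)alt_stack.ss_sp;
    (void)sig;
    (void)info;
    (void)ucxt;
    ran_on_alt_stack = (sp >= base && sp < base + alt_stack.ss_size);
}

/* Returns 1 if the handler ran on the alternate stack, 0 if it ran
 * elsewhere, and -1 if the size was rejected or setup failed. */
static int
deliver_on_alt_stack(size_t size)
{
    struct sigaction act;
    if (size < min_sigstack_size())
        return -1; /* Too small to hold a signal frame. */
    alt_stack.ss_sp = malloc(size);
    if (alt_stack.ss_sp == NULL)
        return -1;
    alt_stack.ss_size = size;
    alt_stack.ss_flags = 0;
    if (sigaltstack(&alt_stack, NULL) != 0) {
        free(alt_stack.ss_sp);
        return -1;
    }
    memset(&act, 0, sizeof(act));
    act.sa_sigaction = on_sigusr2;
    act.sa_flags = SA_SIGINFO | SA_ONSTACK; /* Request the alternate stack. */
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGUSR2, &act, NULL) != 0)
        return -1;
    ran_on_alt_stack = -1;
    if (raise(SIGUSR2) != 0)
        return -1;
    return ran_on_alt_stack;
}
```

Calling `deliver_on_alt_stack(min_sigstack_size() + 64 * 1024)` from a test `main` should exercise the normal path, while a size below `min_sigstack_size()` is rejected up front. Printing `min_sigstack_size()` on both the aarch64-precommit and aarch64-sve-precommit machines might show whether the two differ.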

@AssadHashmi
Contributor

Could this regression have something to do with #6868 merged 3 days ago?

@derekbruening
Contributor Author

I logged into the aarch64-precommit machine but can't reproduce it there:

derek@dynamorio:~/dr/build$ ctest --repeat-until-fail 500 -R detach_signal
Test project /home/derek/dr/build
    Start 351: code_api|api.detach_signal
    Test #351: code_api|api.detach_signal .......   Passed    0.22 sec
    Start 351: code_api|api.detach_signal
    Test #351: code_api|api.detach_signal .......   Passed    0.21 sec
...
1/1 Test #351: code_api|api.detach_signal .......   Passed    0.19 sec

100% tests passed, 0 tests failed out of 1

Total Test time (real) = 104.28 sec

I don't have access to the aarch64-sve-precommit machine which is where it failed. @AssadHashmi maybe you could run it on that machine 1000x and see if it reproduces? If so maybe removing #6868 and repeating would show whether that is the culprit?
