-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix stack overflows when compiling high-recursion_limit
programs
#93056
Fix stack overflows when compiling high-recursion_limit
programs
#93056
Conversation
r? @wesleywiser (rust-highfive has picked a reviewer for you, use r? to override) |
recursion_limit
programsrecursion_limit
programs
☔ The latest upstream changes (presumably #93006) made this pull request unmergeable. Please resolve the merge conflicts. |
f12763d
to
1e81a60
Compare
☔ The latest upstream changes (presumably #93154) made this pull request unmergeable. Please resolve the merge conflicts. |
1e81a60
to
74bf68b
Compare
☔ The latest upstream changes (presumably #92007) made this pull request unmergeable. Please resolve the merge conflicts. |
74bf68b
to
c37d401
Compare
☔ The latest upstream changes (presumably #93836) made this pull request unmergeable. Please resolve the merge conflicts. |
c37d401
to
43bdd27
Compare
☔ The latest upstream changes (presumably #93893) made this pull request unmergeable. Please resolve the merge conflicts. |
43bdd27
to
2269bd2
Compare
This comment has been minimized.
This comment has been minimized.
2269bd2
to
4f7b12c
Compare
☔ The latest upstream changes (presumably #93148) made this pull request unmergeable. Please resolve the merge conflicts. |
4f7b12c
to
4537a1f
Compare
☔ The latest upstream changes (presumably #94103) made this pull request unmergeable. Please resolve the merge conflicts. |
4537a1f
to
e85e1f2
Compare
This comment has been minimized.
This comment has been minimized.
e85e1f2
to
af949ca
Compare
☔ The latest upstream changes (presumably #94134) made this pull request unmergeable. Please resolve the merge conflicts. |
af949ca
to
ee03962
Compare
☔ The latest upstream changes (presumably #94174) made this pull request unmergeable. Please resolve the merge conflicts. |
ee03962
to
a55ec9c
Compare
3bf123c
to
240c6da
Compare
@LegionMammal978, this is the PR you reached to me about, right? I'll try to have look at it next week (especially wrt the debuginfo problem). |
Yes, this is the PR I was referring to. I've since gotten the hack to work with your changes; annoyingly, a few of the |
☔ The latest upstream changes (presumably #94081) made this pull request unmergeable. Please resolve the merge conflicts. |
240c6da
to
3d18ca2
Compare
@LegionMammal978, I've finally gotten around to taking a closer look at this. Regarding the debuginfo part, I have to say I'm skeptical about going to such lengths on our side, just to make sure that LLVM doesn't run into a stack overflow for what are most likely unrealistic programs I suggest splitting the non-controversial parts of this PR out into a separate PR and get that landed. About debuginfo, I'm not so sure what the best approach is.
Do we know if this is a problem for the default recursion limit? I think it would be fine if explicitly upping the recursion limit implied increased memory consumption.
I'll need to think about that approach a bit. Maybe it can be implemented cheaply and mostly transparently as part of the |
Really, the optimal solution here would be for LLVM to provide a config option to let the user pass in their own stack-growing callback. But I have no clue how to begin submitting a patch to upstream LLVM, especially one of such magnitude, and I've heard horror stories of patches for the big compilers languishing for years with no response. For that reason, I'm not too optimistic about a solution on the LLVM side. It would be a shame if we couldn't get this in on either side, though, since the main motivation is that programs setting the
The annoying part is, the controversial debuginfo fix is necessary for most of the type-based test cases to work. I'm currently using 8 different test cases, all of which create some deeply nested program structure using a declarative macro:
Test cases 1, 2, and 3 require this fix to work in debug builds. Notably, this fix is also necessary to compile the "dumb program" posted in the original issue. I guess it would be possible to extract out the other fixes, but first I'd want to find a way to reduce the number of re-indented functions.
The problem comes from programs that are manually specifying the recursion limit. Currently, in my test cases, I specify |
☔ The latest upstream changes (presumably #95524) made this pull request unmergeable. Please resolve the merge conflicts. |
3d18ca2
to
122e4de
Compare
☔ The latest upstream changes (presumably #96316) made this pull request unmergeable. Please resolve the merge conflicts. |
122e4de
to
fd24604
Compare
This comment has been minimized.
This comment has been minimized.
fd24604
to
d190707
Compare
☔ The latest upstream changes (presumably #91557) made this pull request unmergeable. Please resolve the merge conflicts. |
The ensure_sufficient_stack function executes its callback in a dynamically allocated stack when the current stack is low on space. Using it in recursive algorithms such as visitors prevents the compiler from overflowing its stack when compiling modules with a high recursion_limit.
In rustc_ast, an Expr contains an ExprKind, which can contain a P<Expr>. To prevent the stack from overflowing when dropping an Expr, one of the types must use ManuallyDrop for its fields. Since both Expr and ExprKind are frequently deconstructed into their fields using match expressions, I chose to attach it to P::ptr.
This code is a proof-of-concept more than anything else. It is not ready to be merged in its current state. Any version of this fix must somehow measure the depth of the metadata nodes (or a rough upper bound), and it must somehow send that depth to the LLVM codegen entry point. There is likely a tradeoff to be made between performance and readability here.
d190707
to
a68b894
Compare
☔ The latest upstream changes (presumably #96495) made this pull request unmergeable. Please resolve the merge conflicts. |
Since there was a review and following activity, I'd flag this as s-waiting-on-author until a new review can be requested @rustbot -S-waiting-on-review +S-waiting-on-author |
@LegionMammal978 |
@JohnCSimon My apologies, I didn't receive the ping. I've been pretty busy IRL for the past few months, so I haven't had much time to fix this up. Right now, I'll probably want to wait for #97447 to land, given how much of the type-folding logic it touches. Also, I've been meaning in general to find a way to enumerate the recursive visitors in the compiler, since I'm certain I haven't caught all of them. Regarding the debuginfo depth, I have some ideas for how to independently compute it from the type graph without too much churn, but I'm still not exactly sure what form it will take. I might be able to cook something up in 2 or 3 weeks. |
@LegionMammal978 FYI: when a PR is ready for review, send a message containing |
@LegionMammal978 @rustbot label: +S-inactive |
This should close #75577; I've added
ensure_sufficient_stack
to all of the recursive paths used by the original dumb program, as well as more I've found while testing trivial variations. (Some of the variations are listed in my comment there.) However, there are a few problems left with these fixes.While I tried to avoid modifying indentation of large blocks, this was not possible for all functions. In its current state, upstream changes are causing frequent merge conflicts with the modified functions. There might be no solution to this. It would be nice to have something like an
#[ensure_sufficient_stack]
attribute on functions that wraps their entire body, but rustc currently only seems to be using proc macros for deriving traits.The final commit is extremely unpolished; I mainly wanted to have it work at all. The
llvm::Verifier
class (defined insrc/llvm-project/llvm/lib/IR/Verifier.cpp
) finds all references to metadata nodes (mostly debuginfo) in the LLVM module, then traverses them in a recursive function. Fixing stack overflows here is necessary for all of the variations involving deeply nested types. Since the function is written in C++, I can't simply insertensure_sufficient_stack
; the only option is to allocate the entire additional stack space ahead of time. However, to do that, the compiler must compute the depth of the metadata node graph (which can contain loops), and then pass that down to the backend in the codegen thread for the module. There are several ways to achieve the first:HashMap
somewhere. This is what my commit does; I added a call to every invocation of aDIBuilder
method.llvm::DIWhatever
references, always pass(llvm::DIWhatever, usize)
pairs with the depth of the node and its operands.llvm::Module
after the fact to compute the depth. This would effectively involve replicating all ofVerifier.cpp
.recursion_limit
. This is the simplest possible solution, but it creates lots of memory overhead when the limit is not reached and would likely cause issues in currently existing crates.After that, there's the issue of collecting the data and passing it to the backend. In my commit, I add a
RefCell<FxHashMap<&'ll llvm::Metadata, usize>>
to theCrateDebugContext
(as well as a method to add to it), and a staticMutex<FxHashMap<usize, usize>>
to associatellvm::Module
pointers to their maximum depths. I'm not familiar with the compiler, and I simply have no idea how to idiomatically pass anything but thellvm::Module
itself to the backend. Overall, I could really use some advice on this part of the fix.There are undoubtedly many recursive paths which my limited testing has not revealed. Eliminating all stack overflows would require inspecting the entire call graph of the compiler, including its backends. I recall seeing a couple tools offering call graph generation for regular crates, but I doubt they'd work well with rustc's custom build system.