-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure all iterations in Rayon iterators run in the presence of panics #68171
Conversation
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems not great that we have to do this, but makes sense too. Overall I'm not sure about the approach. (see followup comment)
src/librustc_data_structures/sync.rs
Outdated
@@ -181,46 +199,40 @@ cfg_if! { | |||
($($blocks:tt),*) => { | |||
// We catch panics here ensuring that all the blocks execute. | |||
// This makes behavior consistent with the parallel compiler. | |||
let mut panic = None; | |||
let panic = ::rustc_data_structures::sync::Lock::new(None); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use $crate
here? That should make it work anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
I'm not sure that this is quite the right approach to take, at least in the parallel case. I would appreciate @cuviper taking a look -- is there some better way to get this behavior (all "pieces" of a parallel iterator running to conclusion) in rayon? I'm also a bit worried that we might not actually get the right behavior even with this patch, as I'm not sure if Rayon will ever loose work when a worker thread panics. We should also be careful more generally in that regard, as I would want us to avoid losing track of jobserver tokens and eventually stalling out entirely due to panics in worker threads (Cc @alexcrichton re:lazy spawn, too, in case this has effects there). |
In terms of robustness of jobserver tokens it sort of depends on the rayon integration with jobserver, but I don't think it's necessary to catch panics to be robust, we'd just need to audit |
Well, I guess it is true that presumably whenever we do panic we're going to end up failing the build, so it doesn't matter too much in practice whether we're leaking a jobserver token. |
The way to do this is to not use panics for error reporting in rustc, but just for actual bugs. That is easier said than done though. I also don't think there is a way to get the behavior we want here from Rayon iterators.
The primitives exposed by |
☔ The latest upstream changes (presumably #68272) made this pull request unmergeable. Please resolve the merge conflicts. |
We discussed this a bit in our last parallel meeting, and I felt that we probably don't want to do this, at least not yet. It's not clear that we care about determinism in the error path (which is the only case that this affects, I believe) -- any tests that are currently different between master and parallel can be tagged as This feels like it may have a nontrivial performance cost, and regardless I would not myself want us to commit to making sure things continue to work (e.g., we don't ICE) after the first fatal error panic in the compiler. As such, I would not be willing to accept this PR as-is, as I believe it goes in the wrong direction. Happy to hear arguments against that position though! |
It is very clear we care about this. Running the compiler multiple times on the same input should produce the same set of compiler errors.
No. All tests already run with
I don't think that is likely, but we can measure that.
This is a property we already committed to by having any parallelism at all because work can happen in parallel with the the first fatal error being raised and that work cannot ICE. |
I don't personally think this is true; I would be fine with nondeterministic (and different) output from the compiler when it errors. To be clear, I think the performance cost isn't critical here, or at least I wouldn't worry about it. We're unlikely to use parallelism at a granularity where it would matter in practice, I suspect.
I think that's slightly different -- right? In the sense that today, the first fatal error will propagate out and exit the compiler, even if ICEs occur on other threads. Whereas this PR would instead propagate the ICE, I think, if I follow it correctly. |
That's not true, we'll unwind with either the ICE or the fatal error. Being first doesn't matter. That's an unrelated thing we should fix. If there's both an ICE and a fatal error, we should unwind with the ICE essentially giving fatal errors less priority. Currently we could have a scenario where the compiler ICEs, but exits like there was a regular fatal error. It will still print the ICE to stderr though, as that happens before unwinding. |
I don't follow. In the situation where an ICE only occurs if an error has been encountered (i.e. we have stored a Ty::Err or something in "global" state), then in previous parallel and non-parallel code, we would never hit such an ICE. In the current PR, we would consistently hit that ICE. "If there's both an ICE and a fatal error, we should unwind with the ICE essentially giving fatal errors less priority." -- I disagree; I think if we do have a fatal error already there's no need to print ICEs that occur (possibly as a result of that error). In practice of course I think we can't 100% catch this in parallel code... but that seems like a bad reason to not even try. I believe that we have an underlying disagreement that goes beyond the specifics of this PR -- I don't think it's true that we need to be deterministic when we're going to fail to compile. You, I believe, think we should be. I don't know how to resolve that disagreement. |
It seems like you want to suppress panics that occur after the first fatal error. That seems like a horrible idea to me as it masks compiler bugs, but anyway that is ortogonal to this PR. The effect of this PR would be to make the set of error messages deterministic for the parallel compiler and also make test output consistent between the parallel and non-parallel compiler with the downside of a probably insignificant performance regression. Let's see what @rust-lang/compiler has to say. @estebank and @michaelwoerister in particular might have opinions. |
This PR also has the effect that, in both cases, we always evaluate all elements of some iterators (those that are "par_iter", but the current trajectory is towards many such iterators I believe), regardless of fatal errors or ICEs during that evaluation. |
@Mark-Simulacrum Yes, for the non-parallel compiler previous users of |
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 934abf2 with merge 91caf4340bf16936868be675355f00a5e07ccc36... |
☀️ Try build successful - checks-azure |
Queued 91caf4340bf16936868be675355f00a5e07ccc36 with parent 34700c1, future comparison URL. |
Finished benchmarking try commit 91caf4340bf16936868be675355f00a5e07ccc36, comparison URL. |
I'm ok with this as long as we don't kill performance. |
I don't really have a lot of time to look into this. Is this something that would warrant a design meeting? That would certainly make it easier for me personally to schedule time for it. |
I don't know that this individually has the weight needed for a design meeting. Perhaps it does. I am not sure that I can drive such a discussion. I'm going to try and get some folks from the parallel compiler WG to weigh in as well, though I think (hope!) I've faithfully represented those discussions here as well. |
@rustbot modify labels to +T-compiler |
Closing this pull request as Zoxc is stepping back from compiler development; see rust-lang/team#316. |
use `par_for_each_in` in `par_body_owners` and `collect_crate_mono_items` Using `par_iter` in non-parallel mode will cause the entire process to abort when any iteration panics. So we can use `par_for_each_in` instead to make the error message consistent with parallel mode. This means that the compiler will output more error messages in some cases. This fixes the following ui tests when set `parallel-compiler = true`: ``` [ui] src/test\ui\privacy\privacy2.rs [ui] src/test\ui\privacy\privacy3.rs [ui] src/test\ui\type_length_limit.rs ``` This refers to rust-lang#68171 Updates rust-lang#75760
This ensures that fatal errors cannot non-deterministically hide errors that occur later.
r? @Mark-Simulacrum