Move alignment checks to codegen #117473
@bors try @rust-timer queue |
…=<try>
Move alignment checks to codegen

Implementing UB checks entirely in a MIR transform is quite limiting: we don't know for sure what all our types are, so we need to make a lot of sacrifices. For example, in here we used to emit MIR to compute the alignment mask at runtime, because the pointee type could be generic. Implementing the checks in codegen frees us from that requirement, because we get to deal with monomorphized types.

But I don't think we can move these checks entirely into codegen, because inserting the check needs to insert a new terminator into a basic block, which splits the previous basic block into two. We can't add control flow like this in codegen, but we can in MIR. So now the MIR transform just inserts a `TerminatorKind::UbCheck`, which is effectively a `Goto` that also reads an `Operand` (because it either goes to the target block or terminates), and codegen expands that new terminator into the actual check.

---

Also I'm writing this with the expectation that I implement the niche checks in the same manner, because they have the same problem with polymorphic MIR, possibly worse.

r? `@ghost`
☀️ Try build successful - checks-actions |
Finished benchmarking commit (8d257b9): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never

Instruction count
This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Bootstrap: 674.671s -> 673.178s (-0.22%)
r? oli-obk |
This PR changes MIR
cc @oli-obk, @RalfJung, @JakobDegen, @davidtwco, @celinval, @vakaras

The Miri subtree was changed
cc @rust-lang/miri

Some changes occurred to MIR optimizations
cc @rust-lang/wg-mir-opt

Some changes occurred in compiler/rustc_codegen_cranelift
cc @bjorn3

This PR changes Stable MIR
cc @oli-obk, @celinval, @ouz-a

Some changes occurred in compiler/rustc_codegen_gcc
@@ -246,7 +246,6 @@ pub enum AssertMessage {
    RemainderByZero(Operand),
    ResumedAfterReturn(CoroutineKind),
    ResumedAfterPanic(CoroutineKind),
    MisalignedPointerDereference { required: Operand, found: Operand },
Can you please just mark this as deprecated for now instead of removing it? Thanks
☔ The latest upstream changes (presumably #124972) made this pull request unmergeable. Please resolve the merge conflicts. |
Has anyone considered creating MIR passes on monomorphic MIR? I see a pattern of pushing things to codegen that should really be implemented as instrumentation passes. The code generator shouldn't be creating new basic blocks. |
Yes. Many times. Nobody is happy with the amount of cleverness in codegen. MIR is monomorphized on-the-fly as an optimization, because otherwise we'd have to clone all MIR bodies at codegen so that we can mutate them. Or we could probably have a really complicated accessor for the MIR like |
Do you know what the overhead would be if we clone the bodies lazily, just for functions that need transformation? For example, I'm assuming these checks would only be required in functions that perform unsafe operations. |
I think you have a bit of an optimistic view of the situation based on only looking at the changes in this PR. Consider also #121174. And also, if we had such a change I would like to use it to do SimplifyCfg on monomorphic MIR, to clean up the result of this traversal strategy:

rust/compiler/rustc_codegen_ssa/src/mir/mod.rs
Lines 270 to 281 in 6e1d947

I know from looking at the IR we produce that the mono-reachable traversal produces goto chains. Like everything else here, that optimization is possible to implement in a lazy fashion without some MIR to mutate, but it would be complicated. |
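To make the goto-chain point concrete, here is a toy sketch of collapsing such chains on a mutable body. `Terminator`, `Block`, `resolve`, and `collapse_goto_chains` are hypothetical names made up for this illustration; they are not rustc's MIR types and this is not how SimplifyCfg is implemented.

```rust
/// Toy stand-in for a MIR terminator (hypothetical, not rustc's type).
#[derive(Clone, Debug, PartialEq)]
enum Terminator {
    Goto(usize),
    Return,
}

/// Toy basic block: we only track whether it has any statements.
struct Block {
    has_statements: bool,
    terminator: Terminator,
}

/// Follow empty, goto-only blocks starting at `target` until we reach a
/// block that does real work. The hop limit guards against goto cycles.
fn resolve(blocks: &[Block], mut target: usize) -> usize {
    let mut hops = 0;
    while hops < blocks.len() {
        let block = &blocks[target];
        match block.terminator {
            Terminator::Goto(next) if !block.has_statements && next != target => {
                target = next;
                hops += 1;
            }
            _ => break,
        }
    }
    target
}

/// Rewrite every `Goto` so it jumps straight to the end of its chain.
fn collapse_goto_chains(blocks: &mut [Block]) {
    for i in 0..blocks.len() {
        if let Terminator::Goto(target) = blocks[i].terminator {
            let resolved = resolve(blocks, target);
            blocks[i].terminator = Terminator::Goto(resolved);
        }
    }
}

fn main() {
    // bb0 -> bb1 -> bb2 -> bb3, where bb1 and bb2 are empty goto-only blocks.
    let mut blocks = vec![
        Block { has_statements: true, terminator: Terminator::Goto(1) },
        Block { has_statements: false, terminator: Terminator::Goto(2) },
        Block { has_statements: false, terminator: Terminator::Goto(3) },
        Block { has_statements: true, terminator: Terminator::Return },
    ];
    collapse_goto_chains(&mut blocks);
    println!("bb0 now ends in {:?}", blocks[0].terminator);
}
```

The rewrite needs a body it can mutate, which is the constraint being discussed: with on-the-fly monomorphization there is no owned copy to edit.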
Based on what I'm seeing in #125025, maybe cloning all the MIR is not too expensive. |
That's similar to our findings when we migrated to using StableMIR in Kani. StableMIR supports monomorphic bodies for instances. |
FWIW, MIR also supports monomorphic bodies -- in the MIR-to-MiniRust translation, we monomorphize the entire MIR body before translating it. |
Yes, that's how StableMIR is implemented too, but I believe you still need to clone the body. |
pub fn pointers_to_check<F>(
    statement: &mir::Statement<'_>,
    required_align_of: F,
Why is the API design here so that a large part of the logic lies with the caller? Is this expected to be used differently from different places in the future? Or was it just to avoid passing `self`?
Please add a codegen test that shows how we now also have alignment checks in generic code that didn't have them before |
☔ The latest upstream changes (presumably #128002) made this pull request unmergeable. Please resolve the merge conflicts. |
It turns out that since we're only checking that reads and writes are done to a place based on an aligned pointer, we actually don't run into the unsized pointee case anymore. Previously this pass was designed to check all derefs, not just reads and writes. That means that some of the pass logic can be cleaned up, though it still needs a carve-out for unsized locals. |
r? mir-opt I'm going on leave next week |
Implementing UB checks entirely in a MIR transform is quite limiting. Since MIR transforms work on polymorphic MIR, we don't know for sure what all our types are, and sometimes we just have to give up on inserting a check. For example, we used to emit MIR to compute the alignment mask at runtime, because the pointee type could be generic, and we used to skip alignment checks where we weren't sure the pointee was sized. Implementing the checks in codegen frees us from those problems, because we get to deal with monomorphized types.
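As a rough illustration of what an alignment-mask check computes, here is a minimal sketch in ordinary Rust. The helper names `is_aligned_to` and `is_aligned_for` are hypothetical and exist only for this example; the compiler emits the equivalent logic directly rather than calling functions like these.

```rust
use std::mem::align_of;

/// Hypothetical helper: is `addr` a multiple of `align`?
/// `align` must be a power of two, which holds for every Rust type's alignment.
fn is_aligned_to(addr: usize, align: usize) -> bool {
    debug_assert!(align.is_power_of_two());
    let mask = align - 1; // the low bits that must all be zero
    addr & mask == 0
}

/// Once `T` is a concrete type, `align_of::<T>()` is a constant, so the
/// mask does not have to be computed at runtime.
fn is_aligned_for<T>(ptr: *const T) -> bool {
    is_aligned_to(ptr as usize, align_of::<T>())
}

fn main() {
    let words = [0u32; 2];
    let aligned: *const u32 = words.as_ptr();
    // One byte past an aligned `u32` address: still in bounds of `words`,
    // and never dereferenced here.
    let misaligned: *const u32 = unsafe { aligned.cast::<u8>().add(1).cast::<u32>() };

    println!("aligned pointer passes: {}", is_aligned_for(aligned));
    println!("misaligned pointer passes: {}", is_aligned_for(misaligned));
}
```

In polymorphic MIR the `T` in `is_aligned_for` is still a type parameter, which is why the old pass had to compute the mask at runtime; after monomorphization it is a known constant.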
Initially I implemented this by stripping down the MIR pass to insert a new terminator, which codegen would lower to a check if it saw fit. That's the perf run that has no regression: #117473 (comment).

Since then, I've decided that the better strategy is to do this entirely in codegen. Only touching codegen dramatically reduces the amount of code in the compiler that this needs to touch, and it means we will insert checks into functions from the standard library which get codegenned in a crate compiled with debug assertions. Previously, `*misaligned_ptr` would be checked, but `misaligned_ptr.read()` would not. With this PR it is, because we get checks in `ptr::read`. That's this perf run: #117473 (comment)

The only thing that jumps out at me about this codegen change is that between any two statements, codegen can change which backend block it is generating code for without changing the current MIR block. We already do insert blocks on the fly for panics, but in that case we don't stay in the new block.
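For reference, the two access forms contrasted above look like this in user code. This is only a sketch: the function name `reads` is hypothetical, and the example reads through an aligned pointer so that it stays well-defined. The misaligned pointer is only read with `read_unaligned`, which has no alignment requirement.

```rust
use std::mem::align_of;

/// Hypothetical demo: read a `u32` via a place expression and via
/// `read()`, then read from a misaligned address the sound way.
fn reads() -> (u32, u32, u32) {
    let words: [u32; 2] = [0x1122_3344, 0x5566_7788];
    let aligned: *const u32 = words.as_ptr();

    // Form 1: a dereference written directly in this function.
    let via_deref = unsafe { *aligned };
    // Form 2: a call into the standard library, which performs the read.
    let via_read = unsafe { aligned.read() };

    // One byte into the array: in bounds, but not aligned for `u32`.
    let misaligned: *const u32 = unsafe { aligned.cast::<u8>().add(1).cast::<u32>() };
    assert_ne!(misaligned as usize % align_of::<u32>(), 0);

    // `*misaligned` or `misaligned.read()` would be undefined behavior;
    // `read_unaligned` is the API intended for this case.
    let via_unaligned = unsafe { misaligned.read_unaligned() };

    (via_deref, via_read, via_unaligned)
}

fn main() {
    let (a, b, c) = reads();
    println!("deref: {a:#x}, read: {b:#x}, read_unaligned: {c:#x}");
}
```

As described above, the difference between the two forms is where the access is codegenned: form 1 lives in the user's function, while form 2 lives in a standard library function that gets instantiated in the user's crate.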
I'm writing this with the expectation that I implement the niche checks in the same manner, because they have the same problem with polymorphic MIR, possibly worse.
I did a GitHub code search and the only users of the old opt-out, which was `-Zmir-enable-passes=-CheckAlignment`, were turning it off because of the problem with `i686-pc-windows-msvc`, but that shouldn't be a problem anymore because we don't emit alignment checks on that target. Note that `-Zmir-enable-passes=-CheckAlignment` will silently stop doing anything. We never check that the passes given to `-Zmir-enable-passes` actually match the names of any actual passes.

If the new checks cause issues, users now have the opt-out from #123411: `-Zub-checks=no`.