Memory unsafety problem in safe Rust #69225
Comments
cc @rust-lang/compiler The release team is considering making a point release for Rust 1.41 (we briefly discussed it in last week's meeting), and I'd love for this to be included in it if we can get a PR up soon. |
Hey LLVM ICE-breakers! This bug has been identified as a good cc @comex @DutchGhost @hanna-kruppe @hdhoang @heyrutvik @JOE1994 @jryans @mmilenko @nagisa @nikic @Noah-Kennedy @SiavoshZarrasvand @spastorino @vertexclique @vgxbj |
I cannot reproduce this on playground. The program works fine there on 1.41.0 in release mode. EDIT: Ah, you already said that. |
Just to add a data point, I can reproduce this on Linux with the latest nightly:
I was able to reproduce the above with the exact same output with Rust 1.41 stable. Rust 1.40 stable does not exhibit the problem:
I think this is all consistent with @dfyz's report, except this at least confirms that it isn't macOS specific. |
This is expected. 1.40.0 was released on 2019-12-19 based on what was then the |
If I had to guess, I'd say #67015 is the likely culprit. This fixed 3 codegen issues, so it touches codegen-critical code. |
I managed to reproduce the segfault using rustc nightly on Linux, but not with my build generated with this config.toml: config.toml
Using rustc nightly I checked MIRs before and after ConstProp and the MIRs are identical. So if this is caused by ConstProp then it's because of a difference in generated code of a library, not of this program. |
I think I solved it! Reverting a983e05 fixes the issue. It seemed like the culprit to me all day, but I couldn't test it at work :) Can someone help me figure out how best to make a PR for this issue? I think the best way would be to add a test case that must fail, but this seems very platform-specific, so maybe that is not a good idea after all. Any ideas here? Thanks! |
(This was already bisected by the OP) I managed to reproduce this with a local build with Some of the PRs in the rollup cause conflicts when reverted because we have since then edit: it seems we found this at the same time @shahn :) |
To add another data point, the problem disappears when |
The call to |
So, I checked out and built a stage 1 This means I now have two compilers which only differ in the standard library they use. Let's call these compilers I then noticed that the 4 unoptimized Indeed, if I pass any of Does it look likely that LTO might be at fault here? |
This fixes a segfault in safe code, a stable regression. Reported in rust-lang#69225. This reverts commit a983e05. Also adds a test for the expected behaviour.
I think I just found a way to reproduce the issue even after reverting #67174. Here's a slightly longer, but still safe program that reliably segfaults on Windows, Linux and macOS using the latest nightly with #67174 reverted:

fn do_test(x: usize) {
    let mut arr = vec![vec![0u8; 3]];
    let mut z = vec![0];
    for arr_ref in arr.iter_mut() {
        for y in 0..x {
            for _ in 0..1 {
                z.reserve_exact(x);
                let iterator = std::iter::repeat(0).take(x);
                let mut cnt = 0;
                iterator.for_each(|_| {
                    z[0] = 0;
                    cnt += 1;
                });
                let a = y * x;
                let b = (y + 1) * x - 1;
                let slice = &mut arr_ref[a..b];
                slice[1 << 24] += 1;
            }
        }
    }
}

fn main() {
    do_test(1);
    do_test(2);
}

Windows:
Linux:
macOS:
This program doesn't depend on the number of codegen units, so it segfaults in the Playground, too (on stable, beta, and nightly). I also reproduced this by compiling The underlying LLVM bug is still the same, so upgrading to LLVM 10 or cherry-picking the LLVM fix makes the segfault go away. I really wish I understood what's going on better. It does look like the bounds checks are elided because of the extra |
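For reference, the behaviour that the bounds check is supposed to guarantee can be sketched in isolation. This is only an illustration, not part of the reproducer: `slice_len` and `oob_write_panics` are hypothetical helpers that repeat the slicing arithmetic from `do_test` on a single 3-byte vector, without the surrounding loops that appear to be needed to trigger the miscompilation. With a correctly working compiler, the slice is at most one element long and the write at index `1 << 24` panics instead of touching memory.

```rust
use std::panic;

// Hypothetical helper (not from the issue): repeats the reproducer's slicing
// arithmetic on a 3-byte vector and reports the length of the resulting slice.
fn slice_len(x: usize, y: usize) -> usize {
    let arr = vec![0u8; 3];
    let a = y * x;
    let b = (y + 1) * x - 1;
    arr[a..b].len()
}

// Hypothetical helper: returns true if the out-of-bounds write panics,
// which is what safe Rust requires here.
fn oob_write_panics(x: usize, y: usize) -> bool {
    panic::catch_unwind(|| {
        let mut arr = vec![0u8; 3];
        let a = y * x;
        let b = (y + 1) * x - 1;
        let slice = &mut arr[a..b];
        // The slice has at most one element, so this index is far out of bounds.
        slice[1 << 24] += 1;
    })
    .is_err()
}

fn main() {
    // Silence the default panic message so that only the summary is printed.
    panic::set_hook(Box::new(|_| {}));
    for &(x, y) in &[(1usize, 0usize), (2, 0), (2, 1)] {
        println!(
            "x={} y={} len={} panics={}",
            x,
            y,
            slice_len(x, y),
            oob_write_panics(x, y)
        );
    }
}
```

A segfault in the full reproducer means the generated code skipped the check that makes `oob_write_panics` return `true` here.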
My impression is that 1.41.1 has just been finalized in #69359 (bad timing on my part), so there is not much that can be done at this point. Is it at least a good idea to update the comment in |
If the patch we included in 1.41.1 doesn't actually fix the problem we should reconsider whether we want to backport the new fix and rebuild the release. There was consensus in the release team meeting not to backport the LLVM fix, but I personally think cc @Mark-Simulacrum @rust-lang/release |
@dfyz we'll try to get another build of 1.41.1 with the LLVM fix backported, while we wait for consensus on actually shipping that. |
FWIW, for me the new reproducer works as expected ( |
I (accidentally) found out that setting Disregard that, 1.37.0 uses LLVM 8.
- %71 = icmp eq {}* %70, null
+ %71 = icmp ule {}* %70, null
|
That's using LLVM 8, so the blamed SCEV change shouldn't exist at all. |
My bad, sorry for the confusion (I was so happy to reduce it to a one-line diff I didn't even bother checking the LLVM version). |
We prepared new 1.41.1 artifacts with the LLVM fix cherry-picked in it. You can test them locally with:
|
Cherry-pick the LLVM fix for #69225 An additional reproducer was provided in #69225 -- the new testcase here -- which still crashes even after #69241 reverted #67174. Now this pull request updates LLVM with the cherry-picked reversion of its own. This is also going to stable in #69444. I have not tried to reapply #67174 yet -- cc @kraai @shahn
[triagebot] The issue was successfully resolved without any involvement from the pinged compiler team. |
1.41.1 is out, I guess it's time to finally close this issue. |
[beta] backports This backports the following PRs: * ci: switch macOS builders to 10.15 #68863 * Backport release notes of 1.41.1 #69468 * Cherry-pick the LLVM fix for #69225 #69450 * `lit_to_const`: gracefully bubble up type errors. #69330 * [beta] bootstrap from 1.41.1 stable #69518 * bootstrap: Configure cmake when building sanitizer runtimes #69104 r? @ghost
I have a small program (a simplification of a test function from a larger project) that slices a small array and tries to access an out-of-bounds element of the slice. Running it with `cargo run --release` using the stable `1.41.0` release prints something like this (tested on macOS 10.15 and Ubuntu 19.10):

It looks like the resulting slice somehow has length `2**64 - 1`, so the bounds checking is omitted, which predictably results in a segfault. On `1.39.0` and `1.40.0` the very same program prints what I would expect:

The problem goes away if I do any of the following:
* `do_test(...);` calls in `main()`;
* `for _ in 0..1 {` loop;
* `for y in 0..x {` loop with `for y in 0..1 {`;
* `z.extend(std::iter::repeat(0).take(x));` line or replace it with `z.extend(std::iter::repeat(0).take(1));`;
* `for arr_ref in arr {` loop with `let arr_ref = &arr[0];`;
* `RUSTFLAGS="-C opt-level=2"`;
* `RUSTFLAGS="-C codegen-units=1"`.

My best guess is `-C opt-level=3` enables a problematic optimization pass in LLVM, which results in miscompilation. This is corroborated by the fact that MIR (`--emit mir`) and LLVM IR before optimizations (`--emit llvm-ir -C no-prepopulate-passes`) are the same for both `-C opt-level=2` and `-C opt-level=3`.

Some additional info that might be helpful:
* `codegen-units = 1`);
* `1.41.0` release (no idea what makes it different);
* `cargo-bisect-rustc` says the regression first happened in the `2019-12-12` nightly, specifically in this commit. This seems suspicious to me, given that `1.40.0`, which does not exhibit the problem, was released after this date.

I'm attaching the program inline in case the GitHub repo doesn't work (if you want to compile it without Cargo, use `rustc -C opt-level=3 main.rs`):