Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miscompilation in unoptimized builds on aarch64 with 1.57.0 #92786

Closed
saethlin opened this issue Jan 11, 2022 · 17 comments · Fixed by #93081
Closed

Miscompilation in unoptimized builds on aarch64 with 1.57.0 #92786

saethlin opened this issue Jan 11, 2022 · 17 comments · Fixed by #93081
Assignees
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. P-critical Critical priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@saethlin
Copy link
Member

saethlin commented Jan 11, 2022

Minimized from a program @jfro was trying to debug.
I tried this code:

fn main() {
    for (i, _) in [0u8].iter().enumerate() {
        let test = i as f32;
        println!("i: {}", test);
        let _wtf = i as u32;
    }
}

I expected to see this happen:

i: 0

Instead, this happened:

i: 536870900

Meta

I see this on aarch64 and not on x86_64. I don't have access to any other architectures. Output is as expected with --release. I can also observe this bug on the previous stable (1.56.1), current beta (1.58.0-beta.3) and current nightly (89b9f7b 2022-01-10).

rustc --version --verbose:

rustc 1.57.0 (f1edd0429 2021-11-29)
binary: rustc
commit-hash: f1edd0429582dd29cccacaf50fd134b05593bd9c
commit-date: 2021-11-29
host: aarch64-unknown-linux-gnu
release: 1.57.0
LLVM version: 13.0.0
@saethlin saethlin added the C-bug Category: This is a bug. label Jan 11, 2022
@jfro
Copy link

jfro commented Jan 11, 2022

Was able to also test Rust 1.53 & 1.55, both perform as expected and show i: 0

@nikic nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. labels Jan 11, 2022
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jan 11, 2022
@inquisitivecrystal
Copy link
Contributor

@jfro Are the numbers printed the same every time or do they vary?

@jfro
Copy link

jfro commented Jan 11, 2022

Consistently 536870900 on the problem arch & compilers

@inquisitivecrystal
Copy link
Contributor

Consistently 536870900 on the problem arch & compilers

Thanks. The reason I was asking is that, if you're reading invalid memory, the result is usually random. Since this result is consistent, something else is probably going on.

@nikic
Copy link
Contributor

nikic commented Jan 11, 2022

For the sake of convenience, could you please provide the --emit=llvm-ir output for this program?

@jfro
Copy link

jfro commented Jan 11, 2022

here you go:
main.ll.zip

@kjvalencik
Copy link

kjvalencik commented Jan 11, 2022

It might be a coincidence but, 536870900 is i32::MAX / 4 (4 bytes in 32-bits) and rounded to the f32. Seems like maybe the bits are being inverted?

@kjvalencik
Copy link

kjvalencik commented Jan 11, 2022

I'm able to reproduce with cargo run, but if I emit the IR and then compile it with clang, I do not see the issue. I get different IR if I compile with rustc --emit=llvm-ir src/main.rs vs. the entire cargo command from cargo build -v. Only the 2nd has the bug.

Edit: It's caused by debuginfo=2. Lowering it to debuginfo=1 removes the problem.

@nikic
Copy link
Contributor

nikic commented Jan 12, 2022

When I use the llc from the rustup toolchain, I see LLVM going wild with mrs nzcv clobbers:

 Depth=1
.Ltmp169:
  .loc  25 2 9 is_stmt 1                // test.rs:2:9
  ldr d0, [sp, #104]
  mrs s0, NZCV
  str s0, [sp, #12]                   // 4-byte Folded Spill
  mrs d1, NZCV
  str d1, [sp, #128]
.Ltmp170:
  .loc  25 2 9 is_stmt 0                // test.rs:2:9
  mrs d1, NZCV
  str d1, [sp, #136]
.Ltmp171:
  .loc  25 2 14                         // test.rs:2:14
  mrs d1, NZCV
  str d1, [sp, #144]
.Ltmp172:
  .loc  25 3 20 is_stmt 1               // test.rs:3:20
  mrs x8, NZCV
  ucvtf s0, x8
  str s0, [sp, #152]

Note that ucvtf also gets passed mrs NZCV, so we're actually converting some condition flags to float, which makes about zero sense.

Weirdly I don't see this when using the llc from my local rust build :/

@nikic
Copy link
Contributor

nikic commented Jan 12, 2022

target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-gnu"

define void @test(i64* %p1, float* %p2, i32* %p3) {
  br label %loop

loop:
  %val = load i64, i64* %p1
  %test = uitofp i64 %val to float
  store float %test, float* %p2
  %trunc = trunc i64 %val to i32
  store i32 %trunc, i32* %p3
  br label %loop
}

For this example, the llc distributed by rustup produces (with -O0):

// %bb.0:
	sub	sp, sp, #32                     // =32
	.cfi_def_cfa_offset 32
	str	x0, [sp, #8]                    // 8-byte Folded Spill
	str	x1, [sp, #16]                   // 8-byte Folded Spill
	str	x2, [sp, #24]                   // 8-byte Folded Spill
.LBB0_1:                                // %loop
                                        // =>This Inner Loop Header: Depth=1
	ldr	x8, [sp, #24]                   // 8-byte Folded Reload
	ldr	x9, [sp, #16]                   // 8-byte Folded Reload
	ldr	x10, [sp, #8]                   // 8-byte Folded Reload
	ldr	d0, [x10]
	mrs	s0, NZCV
	mrs	x10, NZCV
	ucvtf	s1, x10
	str	s1, [x9]
	str	s0, [x8]
	b	.LBB0_1

While my own build produces:

// %bb.0:
	sub	sp, sp, #32                     // =32
	.cfi_def_cfa_offset 32
	str	x2, [sp, #8]                    // 8-byte Folded Spill
	str	x1, [sp, #16]                   // 8-byte Folded Spill
	str	x0, [sp, #24]                   // 8-byte Folded Spill
	b	.LBB0_1
.LBB0_1:                                // %loop
                                        // =>This Inner Loop Header: Depth=1
	ldr	x9, [sp, #8]                    // 8-byte Folded Reload
	ldr	x10, [sp, #16]                  // 8-byte Folded Reload
	ldr	x8, [sp, #24]                   // 8-byte Folded Reload
	ldr	x8, [x8]
	ucvtf	s0, x8
	str	s0, [x10]
                                        // kill: def $w8 killed $w8 killed $x8
	str	w8, [x9]
	b	.LBB0_1

@nikic
Copy link
Contributor

nikic commented Jan 12, 2022

Looks like it works if LLVM is compiled with assertions and breaks when compiled without assertions :(

@kjvalencik
Copy link

kjvalencik commented Jan 12, 2022

@nikic Are you saying that you can reproduce two different results using the same IR? I was seeing the issue with llvm@13 distributed from Homebrew. Not sure if that's compiled with or without assertions. I only see the problem with IR generated with debuginfo=2 and compiling with -O0.

This is the line of IR that is causing the issues. It is only present with debuginfo=2 and removing it fixes the bug:

  store i32 %_wtf, i32* %_wtf.dbg.spill, align 4, !dbg !727

Additionally, it's only a problem if I assign _wtf to the spill directly. If I store a constant or use a temporary, the issue goes away.

  %21 = add i32 %_wtf, 0
  store i32 %21, i32* %_wtf.dbg.spill, align 4, !dbg !727

@kjvalencik
Copy link

This llvm bug seems like it may be relevant. Switching to -fast-isel fixes the miscompilation.

@nikic
Copy link
Contributor

nikic commented Jan 12, 2022

Bisect shows this is fixed by llvm/llvm-project@67bf3ac on main.

Edit: It looks that shortly after that llvm/llvm-project@04fb9b7 landed and the change was reverted again.

@kjvalencik
Copy link

@nikic The revert comment suggests that it should be fixed by this commit. Are you able to confirm?

@nikic
Copy link
Contributor

nikic commented Jan 12, 2022

I've filed llvm/llvm-project#53162 to check whether we can still get this in 13.0.1, though probably not. In that case I'll cherry-pick into our fork.

@apiraino
Copy link
Contributor

Assigning priority as discussed in the Zulip thread of the Prioritization Working Group.

@rustbot label -I-prioritize +P-critical

@rustbot rustbot added P-critical Critical priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jan 13, 2022
@nikic nikic added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jan 14, 2022
@nikic nikic self-assigned this Jan 14, 2022
@bors bors closed this as completed in 563250a Jan 27, 2022
ehuss pushed a commit to ehuss/rust that referenced this issue Feb 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. P-critical Critical priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants