Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adapt data layout to match LLVM's #420

Merged
merged 8 commits into from
Feb 16, 2024
Merged

adapt data layout to match LLVM's #420

merged 8 commits into from
Feb 16, 2024

Conversation

tsatke
Copy link
Contributor

@tsatke tsatke commented Feb 14, 2024

Update required because of rust-lang/rust#116672, which became effective once rust-lang/rust#120055 was merged.

@tsatke
Copy link
Contributor Author

tsatke commented Feb 14, 2024

I'm not sure why the those last two tests fail... any help is appreciated.

It seems like the _test_sentinel is passed through correctly and we're also reaching the line where we write the exit_code to the qemu exit device.

@phil-opp
Copy link
Member

Thanks a lot for the PR! I'll look into the test failures.

@phil-opp
Copy link
Member

Ok, this is very strange. The error happens in the BIOS build when we call Rsdp::search_for_on_bios: https://github.com/tsatke/bootloader/blob/090f0309fb41c2ff19b1a79b5567cecc3ca084df/bios/stage-4/src/main.rs#L283 . This function now triggers an undefined instruction exception. When we look at the disassembly, we see that the function is compiled to the ud2 instruction for some reason:

0000000000139150 <_ZN4rsdp4Rsdp18search_for_on_bios17hb3292c3a8455d663E>:
  139150:	0f 0b                	ud2    
  139152:	cc                   	int3   
  139153:	cc                   	int3   
  139154:	cc                   	int3   
  139155:	cc                   	int3   
  139156:	cc                   	int3   
  139157:	cc                   	int3   
  139158:	cc                   	int3   
  139159:	cc                   	int3   
  13915a:	cc                   	int3   
  13915b:	cc                   	int3   
  13915c:	cc                   	int3   
  13915d:	cc                   	int3   
  13915e:	cc                   	int3   
  13915f:	cc                   	int3  

The ud2 instruction explicitly causes an undefined instruction error. So it looks like something has changed in LLVM 18 that causes it to miscompile this function...

@Freax13
Copy link
Member

Freax13 commented Feb 14, 2024

Here's a hacky workaround:

diff --git a/bios/stage-4/src/main.rs b/bios/stage-4/src/main.rs
index 9ade45f..439b1fe 100644
--- a/bios/stage-4/src/main.rs
+++ b/bios/stage-4/src/main.rs
@@ -262,6 +262,7 @@ fn detect_rsdp() -> Option<PhysAddr> {
     #[derive(Clone)]
     struct IdentityMapped;
     impl AcpiHandler for IdentityMapped {
+        #[inline(never)]
         unsafe fn map_physical_region<T>(
             &self,
             physical_address: usize,

ud2 is generally a sign of UB, so we (or the acpi devs) might still want to figure out what's happening and if we're doing something wrong.

@Freax13
Copy link
Member

Freax13 commented Feb 15, 2024

I tried compiling stage-4 with --emit=llvm-ir and search_for_on_bios doesn't even show up in the generated IR, but it does once I apply my workaround. This leads me to believe that this is likely not caused by LLVM, but by rustc itself.

@tsatke
Copy link
Contributor Author

tsatke commented Feb 15, 2024

I agree with @Freax13, this should be investigated further. However, to unblock people (including myself), I've added the workaround, @phil-opp obv it depends on you whether you want this merged for now. I also don't think I'm experienced enough with rustc to investigate this efficiently by myself.

I'm not sure what the Semver Checks do, and I'm not sure why they still fail with the data layout mismatch.

@phil-opp
Copy link
Member

I tried compiling stage-4 with --emit=llvm-ir and search_for_on_bios doesn't even show up in the generated IR, but it does once I apply my workaround. This leads me to believe that this is likely not caused by LLVM, but by rustc itself.

Interesting! I guess a good next step could be to use the cargo-bisect-rustc tool to try to narrow down the rustc change that triggered this. The CI logs look like the regression happened between the following commits: rust-lang/rust@b381d3a...a84bb95

(Note that this might not be a bug caused by rustc. It could also be some undefined behavior in the bootloader crate that is just triggered by some changes in rustc. )

@phil-opp
Copy link
Member

@tsatke Thanks for implementing the workaround! I'm still debating whether I want to merge it before understanding the actual issue though...

I'm not sure what the Semver Checks do, and I'm not sure why they still fail with the data layout mismatch.

This check runs the cargo-semver-checks tool to detect accidental breaking changes. It works by comparing the latest crates.io version with the PR code, which won't work in this case because the crates.io release doesn't compile anymore on the latest nightly. So the failure is expected here.

@lordofdestiny
Copy link

Does this mean that the whole tutorial path is broken right now? Is there a workaround we can do to progress with the blog?

@Freax13
Copy link
Member

Freax13 commented Feb 15, 2024

searched nightlies: from nightly-2024-02-13 to nightly-2024-02-15
regressed nightly: nightly-2024-02-14
searched commit range: rust-lang/rust@b381d3a...a84bb95
regressed commit: rust-lang/rust@eaff1af

bisected with cargo-bisect-rustc v0.6.8

Host triple: x86_64-unknown-linux-gnu
Reproduce with:

cargo bisect-rustc --start 2024-02-13 --end 2024-02-15 --with-src --target x86_64-unknown-none --component llvm-tools -vvv -- test --target x86_64-unknown-linux-gnu check_boot_info 

@Freax13
Copy link
Member

Freax13 commented Feb 15, 2024

It turns out that the IR emitted by --emit=llvm-ir already has some optimizations run on it. If I additionally enable -C no-prepopulate-passes, I can find search_for_on_bios in the IR.

@phil-opp phil-opp added the nightly-breakage Code is broken on the latest Rust nightly label Feb 16, 2024
@phil-opp
Copy link
Member

It turns out that the IR emitted by --emit=llvm-ir already has some optimizations run on it. If I additionally enable -C no-prepopulate-passes, I can find search_for_on_bios in the IR.

So for some reason LLVM decides to optimize this whole function out, right? Could you upload the LLVM here or tell us the exact build options that you used for it?

@phil-opp
Copy link
Member

Let's get this merged for now to unblock people. But we should really figure out the root cause of this. I opened #425 to track this.

Thanks again for the PR @tsatke and thanks for helping debugging this @Freax13!

@phil-opp phil-opp enabled auto-merge February 16, 2024 15:23
@phil-opp phil-opp merged commit bc8c554 into rust-osdev:main Feb 16, 2024
7 of 8 checks passed
@phil-opp phil-opp mentioned this pull request Feb 16, 2024
@tsatke tsatke deleted the patch-1 branch February 19, 2024 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
nightly-breakage Code is broken on the latest Rust nightly
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants