Skip to content
This repository has been archived by the owner on Mar 7, 2021. It is now read-only.

Fixed invalid module format & recent nightly support #67

Closed
wants to merge 29 commits into from

Conversation

TheDan64
Copy link
Contributor

@TheDan64 TheDan64 commented Feb 1, 2019

Hi there, we've been looking into writing a kernel module in rust, and your repo has been super helpful, thanks!

It seems like you guys had some issues with R_X86_64_GOTPCREL being introduced in rust nightly late last year causing those Invalid module format error messages

@ahomescu was able to narrow it down to this commit: rust-lang/rust@6009da0

And so the fix seems to be adding "needs-plt": true to the target json file. We've also updated the code to work on more recent nightlies.

This works locally but for some reason travis' test sysctl-tests still fails with Invalid module format for some reason. Is this test using a separate target info perhaps that needs to be update too? Or maybe is something being cached?

@alex
Copy link
Member

alex commented Feb 2, 2019

@safarir
Copy link

safarir commented Mar 12, 2019

I gave this a try on 1.35.0-nightly, 2019-03-09 and I am still getting some R_X86_64_GOTPCREL.

On ubuntu 18.04, 4.15.0-34-generic

@ahomescu
Copy link
Contributor

I'm investigating the issue that @safarir pointed out, but am currently running into other problems:

  • The module fails to build with nightly-2019-02-04 or newer when building liballoc, most likely due to rust-lang/rust@4f4f4a4 which bumps the Rust edition to 2018 (among other changes)
  • For rustc older than that, I still get a build error from bindgen because of changes in Linux 5.0, due to torvalds/linux@8bd66d1. I'll revert to an older kernel and try again.

@alex
Copy link
Member

alex commented Mar 12, 2019

Ugh, frustrating. What's the bindgen error, is it something that plausibly upstream should fix?

@ahomescu
Copy link
Contributor

ahomescu commented Mar 13, 2019

With kernel 4.19.28-1-lts (on Arch Linux, haven't tried Ubuntu) and Rust toolchain nightly-2019-02-03, I can confirm that the module builds and loads for me.

@alex The issue is that bindgen does not support "asm gotos", I get errors about that. I'll check if the bindgen repo has an issue for that, and open one myself if not. I can't really estimate how much effort it would take to implement this.

EDIT: Added a snippet of the bindgen errors I'm getting:

/usr/lib/modules/5.0.0-arch1-1-ARCH/build/./arch/x86/include/asm/bitops.h:220:9: error: 'asm goto' constructs are not supported yet
/usr/lib/modules/5.0.0-arch1-1-ARCH/build/./arch/x86/include/asm/bitops.h:266:9: error: 'asm goto' constructs are not supported yet
/usr/lib/modules/5.0.0-arch1-1-ARCH/build/./arch/x86/include/asm/bitops.h:319:9: error: 'asm goto' constructs are not supported yet
/usr/lib/modules/5.0.0-arch1-1-ARCH/build/./arch/x86/include/asm/jump_label.h:23:2: error: 'asm goto' constructs are not supported yet
/usr/lib/modules/5.0.0-arch1-1-ARCH/build/./arch/x86/include/asm/jump_label.h:39:2: error: 'asm goto' constructs are not supported yet

@alex
Copy link
Member

alex commented Mar 13, 2019 via email

@ahomescu
Copy link
Contributor

ahomescu commented Mar 13, 2019

Looks like this isn't even a bindgen issue, it's a clang issue: https://bugs.llvm.org/show_bug.cgi?id=9295, fixed in https://reviews.llvm.org/D56571 and landing in LLVM 9??? Edit: basically, clang exits during parsing because it can't parse asm gotos.

@ahomescu
Copy link
Contributor

Update on the liballoc issue: I fixed that by updating cargo-xbuild to 0.5.5, which fixed this exact issue in commit rust-osdev/cargo-xbuild@a345833. Their README mentions this now: https://github.com/rust-osdev/cargo-xbuild#installation

@ahomescu
Copy link
Contributor

Another issue I just ran into: updating bindgen from 0.47.3 to 0.48.1 broke the build with this error:

error[E0587]: type has conflicting packed and align representation hints
    --> .../out/bindings.rs:3894:1
     |
3894 | / pub struct xregs_state {
3895 | |     pub i387: fxregs_state,
3896 | |     pub header: xstate_header,
3897 | |     pub extended_state_area: __IncompleteArrayField<u8>,
3898 | | }
     | |_^

due to the output of bindgen switching from

#[repr(C)]
pub struct xregs_state {
    pub _bindgen_opaque_blob: [u8; 576usize],
}

to

#[repr(C, packed(64))]
#[repr(align(64))]
pub struct xregs_state {
    pub i387: fxregs_state,
    pub header: xstate_header,
    pub extended_state_area: __IncompleteArrayField<u8>,
}

The build works with bindgen 0.47.3, so I'm wondering if the top-level Cargo.toml should require that version.

@alex
Copy link
Member

alex commented Mar 13, 2019

I'm ok pinning -- but let's file a bug with bindgen first, I don't want us to be stuck on old versions forever :-)

@ahomescu
Copy link
Contributor

Current status: my build works with the following versions:

  • kernel version 4.19.28-1-lts
  • latest Rust nightly (from today)
  • cargo-xbuild 0.5.5
  • bindgen 0.47.3

@ahomescu
Copy link
Contributor

@emilio
Copy link

emilio commented Mar 13, 2019

FWIW, you don't need to pin. You can just mark the xreg_state as opaque and it'd generate a blob with the right layout, which is what it used to do before.

@alex
Copy link
Member

alex commented Mar 13, 2019

Good point.

@jason-ni
Copy link

Hi @TheDan64 , the target json switch for PLT is "needs-plt": true. Underscore doesn't work.

@TheDan64
Copy link
Contributor Author

@jason-ni Sorry, that is correct in the modified json file, but not in my initial comment. Will edit.

@TheDan64
Copy link
Contributor Author

TheDan64 commented Mar 27, 2019

@alex I wonder if these tests are failing because travis seems to be on kernel 4.4? Building the module worked in 4.19 for me

@alex
Copy link
Member

alex commented Mar 27, 2019

@TheDan64 Can you try adapting the makefile changes you did in hello-world to the makefile in tests/? That seems likely to be an issue to me.

tests/run_tests.py Outdated Show resolved Hide resolved
@alex
Copy link
Member

alex commented Mar 27, 2019

type has conflicting packed and align representation hints

this is the issue where we need to exclude some struct, right?

@TheDan64
Copy link
Contributor Author

Yeah. How do I go about marking xreg_state as opaque?

@alex
Copy link
Member

alex commented Mar 27, 2019

In build.rs add a call to https://docs.rs/bindgen/0.23.1/bindgen/struct.Builder.html#method.opaque_type -- probably from a sequence at the top like we do for INCLUDED_*. Ideally with a comment containing a link to the bindgen + rust issues that require it to be opaque :-)

@TheDan64
Copy link
Contributor Author

The next error I see when testing locally is:

error: #[derive] can't be used on a #[repr(packed)] struct that does not derive Copy (error E0133)
Which seems to be coming from this struct:

#[repr(C, packed)]
#[derive(Debug, Default)]
pub struct desc_struct {
    pub limit0: u16,
    pub base0: u16,
    pub _bitfield_1: __BindgenBitfieldUnit<[u8; 4usize], u8>,
}

Should this be made opaque too? I'm guessing __BindgenBitfieldUnit isn't Copy, which is why this struct isn't either?

@alex
Copy link
Member

alex commented Mar 27, 2019

__BindgenBitfieldUnit is Copy (https://github.com/rust-lang/rust-bindgen/blob/master/src/codegen/bitfield_unit.rs#L1-L9). I think the issue is that desc_struct itself isn't Copy?

@TheDan64
Copy link
Contributor Author

Oh yeah; good point. Does bindgen have a way to mark structs as Copy?

@alex
Copy link
Member

alex commented Mar 27, 2019

Not only does it have a way, I think it does that by default! We have .no_copy(".*") in build.rs... I can't remember why, I'll have to check if git blame illuminates anything.

I don't have an off the cuff solution for that -- we don't need that struct AFAIK, so maybe we can make it (or the type containing it?) opaque. Or maybe we can explicitly add Copy to it.

@alex
Copy link
Member

alex commented Mar 27, 2019

The commit that introduces it is utterly useless unfortunately: 01a7774

Bad Alex... no cookie.

@alex
Copy link
Member

alex commented Mar 27, 2019

You can explicitly mark it Copy with https://docs.rs/bindgen/0.49.0/bindgen/struct.Builder.html#method.derive_copy. You could also see what happens if you remove the no_copy() line.

@TheDan64
Copy link
Contributor Author

I've rebased as you requested - however there were a ton of merge conflicts so hopefully I got it right..

@alex
Copy link
Member

alex commented Apr 30, 2019

Thanks! Looks like there's something going wrong in the build process -- some issue with how the makefile changes interact #68 ?

@TheDan64
Copy link
Contributor Author

TheDan64 commented May 3, 2019

I fixed the diff so that it uses KDIR like on master; but still failing and I'm not really sure why

Edit: Oh, the else clause maybe. hmm

src/lib.rs Outdated Show resolved Hide resolved
Co-Authored-By: TheDan64 <ThaDan64@gmail.com>
@alex
Copy link
Member

alex commented May 4, 2019

IT WORKS ON XENIAL!

I think we should just disable trusty, and review this!!!!

src/lib.rs Show resolved Hide resolved
.travis.yml Outdated Show resolved Hide resolved
@alex
Copy link
Member

alex commented May 4, 2019

Thanks so much for your persistence on this! I'm excited to be able to merge this once @geofft has a look.

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! If it looks good to @geofft, we'll land! Thanks again!

@geofft
Copy link
Collaborator

geofft commented May 5, 2019

I'd like to know what precisely makes this build, mostly with the goal of making sure we don't accidentally regress it in the future again. I have two test branches ("test-without-ld" and "geofft-test-2"), the first one one based on this PR minus some things and the second one based on @alex's recent work on master plus some things, and I'm still unsure why the first one passes Travis and the second one doesn't. (There are very few changes between them; notably they both have the PLT change. Note GitHub's compare view is a by-commit comparison, you probably want to fetch the branches locally and diff them if you want to see the difference.) I'm hoping I can figure this out tonight or early tomorrow.

Also I want to resolve the discussion in #75 before we merge this.

Definitely happy to drop Trusty so we can get back to building productive things on new kernels and figure out the relocation problem with older kernels later. :-) Thanks for your work on this!

@geofft
Copy link
Collaborator

geofft commented May 6, 2019

OK, after some cherry-picking, it looks like those are the same question - the as_bytes() form works and the constant &str doesn't. Take a look at the history of http://github.com/alex/linux-kernel-module-rust/commits/geofft-test-3 which is based on master: the most recent commit passes, and the previous ones (which cherry-pick only the PLT changes from this PR) do not. Can we explain why??

Besides that I think there's enough try/revert/try-again on this branch that I'd like to get rid of that, because everything's already confusing enough today and I'm worried it'll just be more confusing in the future - if everyone's okay with me doing so, I'd like to rebase this down to the semantic changes (and make sure we keep credit for everyone!). I can prep that by tomorrow morning. In particular I'd like to split out the Makefile changes and review those in a separate PR, since those don't seem to be necessary to get things to work.

@geofft
Copy link
Collaborator

geofft commented May 6, 2019

@TheDan64 @ahomescu - I'd like to force-push just the three commits in #79 to immunant:master and land that to resolve this PR, and then split out everything else into separate PRs - is that fine with you?

@geofft
Copy link
Collaborator

geofft commented May 6, 2019

Actually I forgot we're in rebase mode and not merge mode so it doesn't matter where they're merged from... #79 is merged and master is passing CI, thank you 🎉

I'll open PRs for the remaining commits on this branch and close this.

@geofft
Copy link
Collaborator

geofft commented May 6, 2019

OK, I think everything's been split out:

Besides that the only thing in my local diff is a stray newline, so I'm closing this. But please open a new PR if I missed something!

Thanks again for getting this project green :-D

@geofft geofft closed this May 6, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants