rustc and cargo binaries crash on startup on ARM with glibc, because libpthread is loaded too late #35843

Closed
edef1c opened this issue Aug 20, 2016 · 28 comments · Fixed by rust-lang-deprecated/rust-buildbot#123


@edef1c
Contributor

edef1c commented Aug 20, 2016

[edef@alarm ~]$ curl -sSf https://static.rust-lang.org/rustup.sh | sh -s -- --channel=nightly 
rustup: gpg available. signatures will be verified
rustup: downloading manifest for 'nightly'
rustup: downloading toolchain for 'nightly'
######################################################################## 100.0%
gpg: assuming signed data in '/home/edef/.rustup/dl/53065fae72c98b6e8b0d/rust-nightly-armv7-unknown-linux-gnueabihf.tar.gz'
gpg: Signature made Fri Aug 19 14:19:07 2016 UTC using RSA key ID 5CB4A9347B3B09DC
gpg: Good signature from "Rust Language (Tag and Release Signing Key) <rust-key@rust-lang.org>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 108F 6620 5EAE B0AA A8DD  5E1C 85AB 96E6 FA1B E5FE
     Subkey fingerprint: C134 66B7 E169 A085 1886  3216 5CB4 A934 7B3B 09DC
rustup: installing toolchain for 'nightly'
rustup: extracting installer
install: uninstalling component 'rustc'
install: uninstalling component 'rust-std-armv7-unknown-linux-gnueabihf'
install: uninstalling component 'rust-docs'
install: uninstalling component 'cargo'
install: creating uninstall script at /usr/local/lib/rustlib/uninstall.sh
install: installing component 'rustc'
install: installing component 'rust-std-armv7-unknown-linux-gnueabihf'
install: installing component 'rust-docs'
install: installing component 'cargo'

    Rust is ready to roll.

[edef@alarm ~]$ rustc
Segmentation fault (core dumped)
[edef@alarm ~]$ cargo
Segmentation fault (core dumped)

related: https://sourceware.org/bugzilla/show_bug.cgi?id=16628

workaround:

[edef@alarm ~]$ sudo patchelf --add-needed libpthread.so.0 $(which rustc)
warning: working around a Linux kernel bug by creating a hole of 61440 bytes in ‘/usr/local/bin/rustc’
[edef@alarm ~]$ rustc
Usage: rustc [OPTIONS] INPUT

Options:
    -h --help           Display this message
    --cfg SPEC          Configure the compilation environment
    -L [KIND=]PATH      Add a directory to the library search path. The
                        optional KIND can be one of dependency, crate, native,
                        framework or all (the default).
    -l [KIND=]NAME      Link the generated crate(s) to the specified native
                        library NAME. The optional KIND can be one of static,
                        dylib, or framework. If omitted, dylib is assumed.
    --crate-type [bin|lib|rlib|dylib|cdylib|staticlib]
                        Comma separated list of types of crates for the
                        compiler to emit
    --crate-name NAME   Specify the name of the crate being built
    --emit [asm|llvm-bc|llvm-ir|obj|link|dep-info]
                        Comma separated list of types of output for the
                        compiler to emit
    --print [crate-name|file-names|sysroot|cfg|target-list|target-cpus|target-features|relocation-models|code-models]
                        Comma separated list of compiler information to print
                        on stdout
    -g                  Equivalent to -C debuginfo=2
    -O                  Equivalent to -C opt-level=2
    -o FILENAME         Write output to <filename>
    --out-dir DIR       Write output to compiler-chosen filename in <dir>
    --explain OPT       Provide a detailed explanation of an error message
    --test              Build a test harness
    --target TARGET     Target triple for which the code is compiled
    -W --warn OPT       Set lint warnings
    -A --allow OPT      Set lint allowed
    -D --deny OPT       Set lint denied
    -F --forbid OPT     Set lint forbidden
    --cap-lints LEVEL   Set the most restrictive lint level. More restrictive
                        lints are capped at this level
    -C --codegen OPT[=VALUE]
                        Set a codegen option
    -V --version        Print version info and exit
    -v --verbose        Use verbose output

Additional help:
    -C help             Print codegen options
    -W help             Print 'lint' options and default settings
    -Z help             Print internal options for debugging rustc
    --help -v           Print the full set of options rustc accepts

[edef@alarm ~]$ sudo patchelf --add-needed /lib/ld-linux-armhf.so.3 $(which cargo)
[edef@alarm ~]$ cargo
Rust's package manager

Usage:
    cargo <command> [<args>...]
    cargo [options]

Options:
    -h, --help          Display this message
    -V, --version       Print version info and exit
    --list              List installed commands
    --explain CODE      Run `rustc --explain CODE`
    -v, --verbose ...   Use verbose output
    -q, --quiet         No output printed to stdout
    --color WHEN        Coloring: auto, always, never
    --frozen            Require Cargo.lock and cache are up to date
    --locked            Require Cargo.lock is up to date

Some common cargo commands are (see all commands with --list):
    build       Compile the current project
    clean       Remove the target directory
    doc         Build this project's and its dependencies' documentation
    new         Create a new cargo project
    init        Create a new cargo project in an existing directory
    run         Build and execute src/main.rs
    test        Run the tests
    bench       Run the benchmarks
    update      Update dependencies listed in Cargo.lock
    search      Search registry for crates
    publish     Package and upload this project to the registry
    install     Install a Rust binary

See 'cargo help <command>' for more information on a specific command.
[edef@alarm ~]$  

I'm not sure why rustc works at all on x86.

@edef1c edef1c changed the title rustc and related binaries crash on startup on ARM, because libpthread is loaded too late rustc and related binaries crash on startup on ARM with glibc, because libpthread is loaded too late Aug 20, 2016
@edef1c
Contributor Author

edef1c commented Aug 20, 2016

all of this happens before main is even called, this is entirely glibc shooting itself in the foot

[edef@alarm ~]$ gdb rustc
(gdb) break main   
Breakpoint 1 at 0x890
(gdb) run   
Starting program: /usr/local/bin/rustc 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb) 

@MagaTailor

MagaTailor commented Aug 20, 2016

Which glibc version do the binaries need to be built against for this crash to happen? I never had this problem but others sure did:

#33483 (comment)
#33928

You could probably find out the difference if you used one of my uploads.

@nagisa
Member

nagisa commented Aug 20, 2016

Are we sure it's pthread and not

warning: working around a Linux kernel bug by creating a hole of 61440 bytes in ‘/usr/local/bin/rustc’

?

@eddyb
Member

eddyb commented Aug 20, 2016

@nagisa It would be very interesting if it was that, but I've seen that all the time when using patchelf on more or less any binary (the first time, anyway).

@MagaTailor

@nagisa It's probably more complicated, a quick recap:

  • official, buildbot binaries crash on some systems, but not mine (glibc 2.19)
  • my own builds don't crash on affected people's systems, including another glibc 2.19-based one, go figure
  • kernel versions seem to range from 3.4 to 4.x

@edef1c
Contributor Author

edef1c commented Aug 20, 2016

@nagisa Yes, I'm fairly certain it's glibc / pthread. Take a look at the linked glibc bugzilla page, and note how it crashes in the same function, doing the same things. Removing the DT_NEEDED entry again with patchelf brings the problem back, while leaving the kernel bug workaround in place.
This basically boils down to glibc splitting pthread being an awful idea.
On x86, it always sets the (setjmp/longjmp) exception handlers up in the TLS, as opposed to a separate area for the threadless runtime, which is why things do work on x86.
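
To check which libraries a binary requests directly, and in what order, one can read its DT_NEEDED entries. The following is a minimal sketch in Python using only the standard library; `needed_libraries` is a hypothetical helper written for illustration, not part of patchelf or glibc, and it assumes a well-formed ELF file whose dynamic string table lies inside a PT_LOAD segment.

```python
import struct

PT_LOAD, PT_DYNAMIC = 1, 2
DT_NULL, DT_NEEDED, DT_STRTAB = 0, 1, 5


def needed_libraries(elf):
    """Return the DT_NEEDED names of an ELF image, in the order they appear."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = elf[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if elf[5] == 1 else ">"     # EI_DATA: 1 = little-endian

    # Locate the program header table from the ELF header.
    if is64:
        (phoff,) = struct.unpack_from(end + "Q", elf, 32)
        phentsize, phnum = struct.unpack_from(end + "HH", elf, 54)
    else:
        (phoff,) = struct.unpack_from(end + "I", elf, 28)
        phentsize, phnum = struct.unpack_from(end + "HH", elf, 42)

    loads, dynamic = [], None
    for i in range(phnum):
        base = phoff + i * phentsize
        if is64:
            p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = \
                struct.unpack_from(end + "IIQQQQ", elf, base)
        else:
            p_type, p_offset, p_vaddr, _paddr, p_filesz = \
                struct.unpack_from(end + "IIIII", elf, base)
        if p_type == PT_LOAD:
            loads.append((p_offset, p_vaddr, p_filesz))
        elif p_type == PT_DYNAMIC:
            dynamic = (p_offset, p_filesz)
    if dynamic is None:
        return []                         # statically linked: nothing needed

    # Walk the dynamic section until DT_NULL.
    fmt = end + ("qQ" if is64 else "iI")
    entsize = struct.calcsize(fmt)
    name_offsets, strtab_vaddr = [], None
    for pos in range(dynamic[0], dynamic[0] + dynamic[1], entsize):
        tag, val = struct.unpack_from(fmt, elf, pos)
        if tag == DT_NULL:
            break
        if tag == DT_NEEDED:
            name_offsets.append(val)      # offset into the string table
        elif tag == DT_STRTAB:
            strtab_vaddr = val            # virtual address, not a file offset
    if strtab_vaddr is None:
        raise ValueError("dynamic section has no DT_STRTAB")

    # Translate the string table's virtual address to a file offset.
    for p_offset, p_vaddr, p_filesz in loads:
        if p_vaddr <= strtab_vaddr < p_vaddr + p_filesz:
            strtab = p_offset + (strtab_vaddr - p_vaddr)
            break
    else:
        raise ValueError("DT_STRTAB is not inside any PT_LOAD segment")

    names = []
    for off in name_offsets:
        start = strtab + off
        names.append(elf[start:elf.index(b"\0", start)].decode())
    return names
```

Calling `needed_libraries(open("/usr/local/bin/rustc", "rb").read())` before and after the patchelf step should show whether libpthread.so.0 is listed directly; `patchelf --print-needed` (in versions that have it) or `readelf -d` report the same information.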

@edef1c
Contributor Author

edef1c commented Aug 20, 2016

whoops, seems I didn't post my useful gdb logs, and assumed I'd already posted them for all to read.
I don't have them right now, but it boils down to __pthread_initialize_minimal_internal being the thing that jumps to address 0, specifically somewhere in these two lines, afaik the first statement, resolving the PLT entry

#ifdef SHARED
  /* Transfer the old value from the dynamic linker's internal location.  */
  *__libc_dl_error_tsd () = *(*GL(dl_error_catch_tsd)) ();
  GL(dl_error_catch_tsd) = &__libc_dl_error_tsd;

@nagisa
Member

nagisa commented Aug 20, 2016

Oh, so yeah, buildbot, as always, uses quite old versions of glibc and binutils/gcc. I’m more inclined to blame this bug on the old version of binutils used on buildbot, but it is very possible glibc is also a cause, especially given it provides no forward stability guarantees whatsoever. A jump to NULL from the PLT is one of the more prevalent symptoms of library version mismatches, too, though the fact that loading pthread earlier fixes this makes that point irrelevant.

@edef1c
Contributor Author

edef1c commented Aug 21, 2016

This is before main hits, it's glibc initialisation. The underlying bug exists in just about any version of glibc from the past decade, and no code outside glibc is invoked the entire time.

@ghost

ghost commented Aug 22, 2016

The cargo installed via this method also produces a segfault.

(gdb) run
Starting program: /usr/local/bin/cargo 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb) bt full
#0  0x00000000 in ?? ()
No symbol table info available.
#1  0x76f7e7b4 in __pthread_initialize_minimal_internal () from /usr/lib/libpthread.so.0
No symbol table info available.
#2  0x76f7dd54 in _init () from /usr/lib/libpthread.so.0
No symbol table info available.
#3  0x76fdf2e8 in call_init.part () from /lib/ld-linux-armhf.so.3
No symbol table info available.
#4  0x76fdf4b8 in _dl_init () from /lib/ld-linux-armhf.so.3
No symbol table info available.
#5  0x76fcfbc4 in _dl_start_user () from /lib/ld-linux-armhf.so.3
No symbol table info available.
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

However the workaround above (when applied to cargo) does not fix this.

edit: should have read my own backtrace, running sudo patchelf --add-needed /lib/ld-linux-armhf.so.3 $(which cargo) cleared things up.

@edef1c
Contributor Author

edef1c commented Aug 22, 2016

@ghost Awesome! I hadn't gotten cargo working yet, so I'm very glad you've figured this out ❤️
I've added it to the OP.
I think this might have to do with cargo loading libdl?

@edef1c edef1c changed the title rustc and related binaries crash on startup on ARM with glibc, because libpthread is loaded too late rustc and cargo binaries crash on startup on ARM with glibc, because libpthread is loaded too late Aug 22, 2016
@ghost

ghost commented Aug 22, 2016

Just for clarity, I also ran the previous patchelf against cargo prior to doing that. So I can't say for sure whether or not it was required. The backtrace is from the initial crash, perhaps I should have posted an intermediate one.

edit: nvm, I see that it worked for you.

@edef1c
Contributor Author

edef1c commented Aug 22, 2016

@ghost I tested both with and without the pthread patchelf, but cargo already has libpthread in its DT_NEEDED, and the duplicate entry doesn't seem to be necessary. I'm now happily porting things of my own to ARM [=

@MagaTailor

Any idea if #35982 could be related?

@japaric
Member

japaric commented Sep 14, 2016

(a) Does this effectively mean that our binary releases of rustc for ARM crash everywhere? Or is it just on systems with newer glibc?

(b) Is there any workaround for this? Perhaps compiling against glibc-2.19 or doing the patchelf trick on the buildbots. Would rustc still work on systems with older glibcs if patched like that?

cc @alexcrichton

@edef1c
Contributor Author

edef1c commented Sep 14, 2016

@japaric
a) no idea, I could give it a go (what distros use old glibc?)
b) yes, the patchelf trick wouldn't break anything on systems that already work

@japaric
Member

japaric commented Sep 14, 2016

> a) no idea, I could give it a go (what distros use old glibc?)

Thanks, it'd be very helpful if you could. I think we should test Debian Wheezy (2.13), Ubuntu Precise (2.15), Ubuntu Trusty (2.19) and perhaps Raspbian (is that Debian Wheezy based?). In theory we support glibc-2.14 and newer, so it's okay if Wheezy doesn't work.
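
As a rough illustration of that support floor, here is a small Python sketch that checks the glibc versions listed above against the 2.14 minimum. The helper names (`parse_version`, `is_supported`) are hypothetical, and the table holds only the versions quoted in this comment.

```python
# Minimum glibc version, per "we support glibc-2.14 and newer".
MIN_GLIBC = (2, 14)

# Versions as quoted in this comment; not an exhaustive or authoritative list.
DISTRO_GLIBC = {
    "Debian Wheezy": "2.13",
    "Ubuntu Precise": "2.15",
    "Ubuntu Trusty": "2.19",
}


def parse_version(text):
    """Turn '2.19' into (2, 19) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_supported(version, minimum=MIN_GLIBC):
    """True if the given glibc version string meets the minimum."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    for distro, version in DISTRO_GLIBC.items():
        status = "supported" if is_supported(version) else "below the minimum"
        print(f"{distro}: glibc {version} is {status}")
```

Comparing tuples of integers rather than raw strings matters here: as strings, "2.9" would sort after "2.14".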

> b) yes, the patchelf trick wouldn't break anything on systems that already work

Ah, I forgot to ask: can the patchelf trick be used on an x86_64 host to patch the ARM binary? (We produce our ARM binaries via cross compilation.) I've only tested patching the binary on the ARM host itself.

@edef1c
Contributor Author

edef1c commented Sep 15, 2016

@japaric patchelf cares not for the target, it just manipulates ELF files.
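
The reason a tool can edit a foreign binary is that an ELF file describes itself: word size, byte order and target machine are all recorded in its header, so nothing depends on the host. A minimal Python sketch of reading those fields follows; `describe_elf` and the small `MACHINES` table are hypothetical names for illustration and cover only a few e_machine values.

```python
import struct

# A few e_machine values from the ELF specification (not exhaustive).
MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header):
    """Return (word size in bits, machine name) from the start of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[header[4]]          # EI_CLASS
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    # e_machine is a 16-bit field at offset 18 in both ELF32 and ELF64.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return bits, MACHINES.get(machine, f"unknown ({machine})")
```

For example, `describe_elf(open("rustc", "rb").read(20))` inspects only the first 20 bytes of the file and runs the same way on any host.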

@japaric
Member

japaric commented Sep 15, 2016

@edef1c

Trying to patchelf the ARM binary on an x86_64 host results in:

$ file rustc
rustc: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux 3.2.72, not stripped

$ file $(which patchelf)
/usr/bin/patchelf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=bba1e8ce02627bc87d0b313ba0a9d7ed1bfda81c, stripped

$ patchelf --add-needed libpthread.so.0 rustc
stat: No such file or directory

but the stat command is there:

$ which stat
/usr/bin/stat

Could I be missing some other package/command?

@TimNN
Contributor

TimNN commented Sep 15, 2016

@japaric: I think stat: No such file or directory means that the stat command / function was executed with a path that does not exist. I think you may need to specify an absolute path to libpthread.

@japaric
Member

japaric commented Sep 15, 2016

@TimNN

If I use the path where libpthread.so.0 resides on the target ARM device (which is not where libpthread lives on the x86_64 host):

$ ls /usr/lib/libpthread.so.0
ls: cannot access '/usr/lib/libpthread.so.0': No such file or directory

$ patchelf --debug --add-needed /usr/lib/libpthread.so.0 rustc
patching ELF file `--add-needed'
Kernel page size is 4096 bytes
stat: No such file or directory

If I instead use the path where libpthread resides on the host:

$ ls /lib/x86_64-linux-gnu/libpthread.so.0
/lib/x86_64-linux-gnu/libpthread.so.0

$ patchelf --debug --add-needed /lib/x86_64-linux-gnu/libpthread.so.0 rustc
patching ELF file `--add-needed'
Kernel page size is 4096 bytes
stat: No such file or directory

😕

@TimNN
Contributor

TimNN commented Sep 15, 2016

😕 So much for helpful error messages...

Then maybe it needs an absolute path to rustc?

You can also try running with strace, something like strace -fe trace=file patchelf ... should print all file related system calls I think, this should show which file exactly cannot be found.

@sanmai-NL

strace -e stat patchelf --add-needed libpthread.so.0 rustc

should work indeed.

@japaric
Member

japaric commented Sep 15, 2016

$ strace -e stat patchelf --add-needed libpthread.so.0 rustc
stat("--add-needed", 0x7fffbcb94610)    = -1 ENOENT (No such file or directory)
stat: No such file or directory
+++ exited with 1 +++

lol
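
When the trace is longer than a few lines, the failing call can be picked out mechanically. This is a minimal Python sketch, assuming strace's usual `name(args) = -1 ENOENT (...)` line format as seen in the output above; `missing_paths` is a hypothetical helper, and it reports only the first quoted argument of each failing call.

```python
import re

# Matches e.g.  stat("--add-needed", 0x7fff...) = -1 ENOENT (No such file ...)
# and also calls whose path is not the first argument, such as openat(AT_FDCWD, "x", ...).
FAILED_CALL = re.compile(r'(\w+)\([^"]*"([^"]*)".*=\s*-1\s+ENOENT')


def missing_paths(strace_output):
    """Return (syscall, path) pairs for every call that failed with ENOENT."""
    found = []
    for line in strace_output.splitlines():
        match = FAILED_CALL.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found
```

Fed the three lines of output above, this returns a single pair naming stat and the string "--add-needed", which is the clue that the flag was being treated as a filename.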

Ubuntu 16.04 ships with patchelf 0.8, and that version doesn't have the --add-needed flag:

$ patchelf --version
patchelf 0.8

$ patchelf --help
syntax: patchelf
  [--set-interpreter FILENAME]
  [--print-interpreter]
  [--set-rpath RPATH]
  [--shrink-rpath]
  [--print-rpath]
  [--force-rpath]
  [--remove-needed LIBRARY]
  [--debug]
  [--version]
  FILENAME

I have patchelf 0.9 on the ARM device, which does have the --add-needed flag:

$ patchelf --version
patchelf 0.9

$ patchelf --help
syntax: patchelf
  [--set-interpreter FILENAME]
  [--page-size SIZE]
  [--print-interpreter]
  [--print-soname]              Prints 'DT_SONAME' entry of .dynamic section. Raises an error if DT_SONAME doesn't exist
  [--set-soname SONAME]         Sets 'DT_SONAME' entry to SONAME.
  [--set-rpath RPATH]
  [--remove-rpath]
  [--shrink-rpath]
  [--print-rpath]
  [--force-rpath]
  [--add-needed LIBRARY]
  [--remove-needed LIBRARY]
  [--replace-needed LIBRARY NEW_LIBRARY]
  [--print-needed]
  [--no-default-lib]
  [--debug]
  [--version]
  FILENAME

I'll compile a newer patchelf from source on the Ubuntu docker container I'm using and try that newer version.
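
Since an older patchelf silently treats an unknown flag as the file to patch, a build script could check the help text before relying on --add-needed. Below is a minimal Python sketch; `supported_flags`, `supports_flag` and `patchelf_help` are hypothetical helpers, and reading both stdout and stderr is an assumption made because it is not confirmed here which stream patchelf writes its help to.

```python
import re
import subprocess


def supported_flags(help_text):
    """Collect every '--flag' token that appears in the help text."""
    return set(re.findall(r"--[a-z][a-z-]*", help_text))


def supports_flag(help_text, flag):
    """True if the given flag is listed in the help text."""
    return flag in supported_flags(help_text)


def patchelf_help(binary="patchelf"):
    """Run `patchelf --help` and return whatever it printed (assumption: either stream)."""
    result = subprocess.run([binary, "--help"], capture_output=True, text=True)
    return result.stdout + result.stderr


# Help text of patchelf 0.8, copied from the output above.
HELP_0_8 = """syntax: patchelf
  [--set-interpreter FILENAME]
  [--print-interpreter]
  [--set-rpath RPATH]
  [--shrink-rpath]
  [--print-rpath]
  [--force-rpath]
  [--remove-needed LIBRARY]
  [--debug]
  [--version]
  FILENAME
"""
```

With the 0.8 text above, `supports_flag(HELP_0_8, "--add-needed")` is false while `supports_flag(HELP_0_8, "--remove-needed")` is true; on a live system one would pass `patchelf_help()` instead of the constant.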

Thanks for the help @TimNN and @sanmai-NL

@japaric
Member

japaric commented Sep 15, 2016

OK, using patchelf 0.9, I can fix rustc with patchelf --add-needed libpthread.so.0 rustc. We can replicate this trick in the buildbots but this appears to affect rustc, cargo and rustup (all those are built using the same docker image and against glibc-2.14) so all those would have to be patchelfed ...

@alexcrichton How do you feel about that ^ "fix"?

Alternatively, we could explore bumping the glibc version that we use in the buildbots (2.19 appears to produce working binaries). We are currently using glibc-2.14 to support systems with (very) old glibcs but, right now, our binaries segfault on systems with recent glibcs because of this bug.

@sanmai-NL

@japaric: Bump the glibc dep please. Working around an issue in known flawed, old software seems less preferable to me.

@japaric
Member

japaric commented Sep 18, 2016

> Bump the glibc dep please

Using a glibc-2.16/gcc-4.9 cross toolchain I can produce good rustc binaries. That seems like the smallest bump we can make because I can't get crosstool-ng to produce a glibc-2.15/gcc-4.9 toolchain.

@alexcrichton
Member

@japaric I'd prefer to bump the glibc dep, but I've seen you've done that, so I'll merge!
