"File too big" on s390x #2632

Closed
cuviper opened this issue Jan 8, 2021 · 9 comments · Fixed by #2707

@cuviper (Member) commented Jan 8, 2021

Problem
On s390x-unknown-linux-gnu, I saw an error while installing:

info: profile set to 'default'
info: default host triple is s390x-unknown-linux-gnu
info: syncing channel updates for 'stable-s390x-unknown-linux-gnu'
info: latest update on 2020-12-31, rust version 1.49.0 (e1884a8e3 2020-12-29)
info: downloading component 'cargo'
info: downloading component 'clippy'
info: downloading component 'rust-std'
info: downloading component 'rustc'
info: downloading component 'rustfmt'
info: installing component 'cargo'
info: using up to 500.0 MiB of RAM to unpack components
info: installing component 'clippy'
info: installing component 'rust-std'
 21.9 MiB /  21.9 MiB (100 %)  11.5 MiB/s in  1s
info: installing component 'rustc'
error: error: 'File too big rustc-1.49.0-s390x-unknown-linux-gnu/rustc/lib/librustc_driver-d57ba40a74a3b9c1.so 220354152'
 53.4 MiB /  53.4 MiB (100 %)  11.0 MiB/s in  4s
info: installing component 'rustfmt'
info: default toolchain set to 'stable-s390x-unknown-linux-gnu'

  stable-s390x-unknown-linux-gnu installed - rustc 1.49.0 (e1884a8e3 2020-12-29)


Rust is installed now. Great!

It seems to work fine anyway.

Steps

  1. Normal install on s390x: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Possible Solution(s)
Increase MAX_FILE_SIZE? It seems #2363 was the most recent time...
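For context, the check in question compares each unpacked file against a size constant. A hypothetical sketch of that kind of guard (the constant name matches the one mentioned above, but the value, function name, and surrounding code are illustrative, not rustup's actual source):

```rust
// Illustrative only: rustup's real MAX_FILE_SIZE value and error plumbing differ.
const MAX_FILE_SIZE: u64 = 200 * 1024 * 1024; // 200 MiB, an assumed threshold

// Returns a complaint for oversized files; since rustup 1.23 the install
// continues anyway, so a message like this is report-only.
fn check_file_size(path: &str, size: u64) -> Option<String> {
    if size > MAX_FILE_SIZE {
        Some(format!("File too big {} {}", path, size))
    } else {
        None
    }
}
```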

Notes

Output of rustup --version:

rustup 1.23.1 (3df2264a9 2020-11-30)
info: This is the version for the rustup toolchain manager, not the rustc compiler.
info: The currently active `rustc` version is `rustc 1.49.0 (e1884a8e3 2020-12-29)`

Output of rustup show:

Default host: s390x-unknown-linux-gnu
rustup home:  /root/.rustup

installed toolchains
--------------------

stable-s390x-unknown-linux-gnu (default)
nightly-2020-10-01-s390x-unknown-linux-gnu
nightly-2020-11-28-s390x-unknown-linux-gnu
nightly-2020-12-03-s390x-unknown-linux-gnu
nightly-2020-12-04-s390x-unknown-linux-gnu
nightly-s390x-unknown-linux-gnu

active toolchain
----------------

stable-s390x-unknown-linux-gnu (default)
rustc 1.49.0 (e1884a8e3 2020-12-29)

Every one of those nightlies gave a similar error on librustc_driver.

@cuviper cuviper added the bug label Jan 8, 2021
@kinnison (Contributor) commented Jan 9, 2021

As you noted, it worked anyway. In 1.23 we changed this from rejecting "too large" files outright to simply reporting them and carrying on. It worries me that we're reaching above 220 megabytes for the compiler driver file, but I suppose it's not a specific problem.

I'd suggest we change the message to a warning noting that there's a very large file which may cause problems, rather than showing it as an error. Would you agree with that? I suppose we could also just bump the max file size check a bit higher, but that feels like simply accepting that the compiler continues to bloat on some targets.

@cuviper (Member, author) commented Jan 11, 2021

As you noted, it worked anyway.

The problem with an "error" is that I wasn't sure what had happened. It seemed to work, but I had no idea what the consequences really were. I was trying to debug a regression on that target, so this raised extra doubt about whether I had a working toolchain at all.

It worries me that we're reaching above 220 megabytes for the compiler driver file, but I suppose it's not a specific problem.

FWIW, a big part of that file is static LLVM, which may be dynamically linked on other targets.

To be honest, while it does seem like a useful sanity check, I'm not sure it belongs in rustup at all. The large file has already been built and distributed, out of the control of the rustup user. Maybe this should instead be something we check and/or assert as a limit in Rust CI. (cc @Mark-Simulacrum, @pietroalbini)

I'd suggest we should change the message so it's a warning, and warns that there's a very large file which may cause problems, rather than showing as an error. Would you agree with that?

Even as a "warning", what would the user do about this? What sort of problems are you supposing?

@kinnison (Contributor)
Unpacking is (currently) done by allocating enough RAM to hold each file decompressed from the tarball, then writing it to disk in a thread. This permits interleaving of IOs, which massively improves unpacking speed on many platforms. Unfortunately it means that severely RAM-limited targets can, without the large-file check, end up OOM-killed, which Rust's default allocator does not handle terribly gracefully (from a user's perspective at least). Previously we simply refused to install in these circumstances; these days we permit the install to continue, but at least if the install crashes due to OOM we can see what caused it.
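The tradeoff described above can be sketched as two copy strategies (a minimal illustration, not rustup's actual code; the function names are made up):

```rust
use std::io::{self, Read, Write};

// Fast path: buffer the whole decompressed file in RAM so a writer thread
// can overlap disk IO with further unpacking. A 220 MiB entry like
// librustc_driver means a 220 MiB allocation, which can OOM a
// RAM-limited target.
fn unpack_buffered<R: Read, W: Write>(src: &mut R, dst: &mut W, size: usize) -> io::Result<()> {
    let mut buf = vec![0u8; size];
    src.read_exact(&mut buf)?;
    dst.write_all(&buf)
}

// Safe path: stream through a small fixed buffer; memory use is bounded,
// at the cost of less opportunity to hide write latency.
fn unpack_streamed<R: Read, W: Write>(src: &mut R, dst: &mut W) -> io::Result<u64> {
    io::copy(src, dst)
}
```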

We intend to switch such large elements to being streamed out (slower but low risk), but that work has yet to be done, as there's essentially one part-time rustup maintainer currently and there are other, somewhat higher-priority problems to deal with :(

I'd be very pleased if we were tracking the installation artifact sizes usefully in rust CI, because if we intend to increase support for targets such as raspberry pi etc. we'll need to be thinking more seriously about RAM footprint of the tooling.

@rbtcollins (Contributor)
Concrete problems:

Users can fix some of these themselves of course.

Longer term we'll introduce a parallel writing chunked IO driver (io_uring will help, but isn't sufficient because it only helps Linux - we need to do async CloseHandle calls on Windows - 'it's complicated') - at which point rust can write as much data as it wants to and rustup won't care.

@cuviper (Member, author) commented Jan 12, 2021

We intend to switch such large elements to being streamed out (slower but low risk) but that work has yet to be done as there's essentially one part time rustup maintainer currently, and there's other somewhat higher priority problems to deal with :(

A safer slow path sounds like the right call, so the fast path doesn't have to force limitations.
I definitely sympathize with maintainer time pressure though.

@rbtcollins (Contributor)
We have such a safe path, see RUSTUP_IO_THREADS and RUSTUP_UNPACK_RAM in the config docs - https://rust-lang.github.io/rustup/environment-variables.html
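For anyone hitting this on a constrained machine, those knobs are set as environment variables before running rustup. The values here are illustrative; see the linked docs for what each variable accepts:

```shell
# Cap the RAM rustup will use for unpack buffers, in bytes (209715200 = 200 MiB).
export RUSTUP_UNPACK_RAM=209715200
# Reduce the IO worker thread count.
export RUSTUP_IO_THREADS=1
# Then install as usual, e.g.:
# rustup toolchain install stable
```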

@Mark-Simulacrum (Member)
I think setting a limit during Rust's CI might be complicated, since we no longer package gzip'd tarballs in CI and recompress at a later point. I think modern rustup is probably always using the xz tarballs, though, so for that purpose it may not matter. If limits on just the xz file size make things better, I think that would be relatively easy to add.

Would adding such a check allow us to ensure end users don't see this error? Either way, it sounds like it might be good to add some text to the error indicating what to do (e.g., filing an issue so we know).

@cuviper (Member, author) commented Jan 12, 2021

@rbtcollins those are just knobs controlling the current behavior, right? I see a slow path as more of an automatic fallback, so you can zip along in the happy fast path for most of the process, but switch to slower streaming when a large file is encountered.

@rbtcollins (Contributor)
Ah, no, the streaming implementation we have designed won't be slower; it will just be more memory-efficient. I've made some headway on implementing it, but I get a few hours a month to hack on rustup, so it's going to be a while before it comes together.

rbtcollins added a commit to rbtcollins/rustup.rs that referenced this issue Apr 4, 2021
Fixes rust-lang#2632, rust-lang#2145, rust-lang#2564

Files over 16M are now written incrementally in chunks rather than buffered
in memory in one full linear buffer. This chunk size is not
configurable.

For threaded unpacking, the entire memory buffer will be used to buffer
chunks and a single worker thread will dispatch IO operations from the
buffer, so minimal performance impact should be anticipated (file
size/16M round trips at worst, and most network file systems will
latency hide linear writes).

For immediate unpacking, each chunk is dispatched directly to disk,
which may impact performance as less latency hiding is possible - but
for immediate unpacking clarity of behaviour is the priority.
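The chunked write the commit message describes can be sketched like this. The 16 MiB figure comes from the message; the names and structure here are illustrative, not rustup's actual implementation:

```rust
use std::io::{self, Read, Write};

// Chunk size from the commit message; not configurable there either.
const CHUNK_SIZE: usize = 16 * 1024 * 1024;

// Copy a file in fixed-size chunks instead of one full linear buffer,
// so peak memory per file is bounded by CHUNK_SIZE. Each chunk becomes
// its own IO operation, which a worker thread could dispatch.
fn write_chunked<R: Read, W: Write>(src: &mut R, dst: &mut W) -> io::Result<u64> {
    let mut chunk = vec![0u8; CHUNK_SIZE];
    let mut total = 0u64;
    loop {
        let n = src.read(&mut chunk)?;
        if n == 0 {
            return Ok(total);
        }
        dst.write_all(&chunk[..n])?;
        total += n as u64;
    }
}
```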
rbtcollins added a commit to rbtcollins/rustup.rs that referenced this issue Apr 5, 2021 (same commit message)
rbtcollins added a commit to rbtcollins/rustup.rs that referenced this issue Apr 27, 2021 (same commit message)
rbtcollins added a commit to rbtcollins/rustup.rs that referenced this issue Jul 11, 2021 (same commit message)