Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using include_bytes! on large binary blobs compiles more slowly than expected #65818

Open
alexcrichton opened this issue Oct 25, 2019 · 10 comments
Labels
I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Copy link
Member

I've been playing around recently with compiling a crate that simply contains:

pub const BYTES: &[u8] = include_bytes!("large-binary-blob");

and I've been surprised that the compile time of this crate is pretty nontrivial!

For example this crate:

pub const BYTES: &[u8] = &[];

takes 0.060 seconds to compile on nightly for me. If a multi-megabyte binary (such as cargo itself) is included:

pub const BYTES: &[u8] = include_bytes!("/path/to/.cargo/bin/cargo");

this crate takes 0.729 seconds to compile!

There appear to be at least two major sources of slowdown:

  1. In the expansion of include_bytes! the expr_lit function calls from_lit_kind which calls to_lit_token that runs an operation per-byte, which is pretty expensive for multi-megabyte files.
  2. Later in compilation while emitting metadata when we're processing constants we'll end up pretty printing them and pretty printing the byte string literal for a multi-megabyte file takes quite some time!

There may be a few other sources of slowdown, but ideally usage of include_bytes! and large byte blobs in general should never iterate over the bytes and perform expensive operations, but rather the bytes should just be transferred as a whole in various contexts if absolutely necessary.

@alexcrichton alexcrichton added the I-compiletime Issue: Problems and improvements with respect to compile times. label Oct 25, 2019
@jonas-schievink
Copy link
Contributor

Similar to #56900 (which kind of went under the radar)

@jonas-schievink jonas-schievink added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Oct 25, 2019
@nagisa
Copy link
Member

nagisa commented Oct 26, 2019

Most ideally this would result in rustc telling linker to link that file into whatever symbol. Possible but super non-trivial to achieve. (At least as far as runtime behaviour of this item is concerned)

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Oct 26, 2019

Most ideally this would result in rustc telling linker to link that file into whatever symbol. Possible but super non-trivial to achieve. (At least as far as runtime behaviour of this item is concerned)

I'm sure you're aware but just to spell it out for completeness: this would not work in general since const eval may need the file contents, it would be a fast path for the special where the bytes are only needed at runtime.

Depending on the details, I am also a bit concerned about robustness. For example, just storing the file path in the rlib could break if the included file is moved or deleted between the rlib being created and it being linked into a binary or other final artifact. It also wouldn't extend to staticlibs, I believe, which are probably a common crate type for projects where this optimization would help.

OTOH, using objcopy to turn the included file into a thin object file would allow including it in rlibs and staticlibs, resolving those issues, at the cost of some more file I/O (but still side-stepping all work that's due to rustc internals).

@alexcrichton
Copy link
Member Author

After seeing the const-ness of needing this I tried out a few different formulations as well:

pub static FOO: &[u8] = include_bytes!("...");

or...

pub fn foo() -> &'static [u8] {
    include_bytes!("...")
}

unfortunately though while they were slightly faster they were still quite slow (on the order of hundreds of milliseconds for a 10MB file or so)

@Mark-Simulacrum
Copy link
Member

Later in compilation while emitting metadata when we're processing constants we'll end up pretty printing them and pretty printing the byte string literal for a multi-megabyte file takes quite some time!

I believe the only reason we pretty print constants is for rustdoc -- though I could be wrong -- I imagine maybe the solution then is to impose some sort of limit? And/or if we can somehow store that the constant came from include_bytes, I wonder if displaying include_bytes!(...) makes more sense than the raw contents in rustdoc...

@Twey
Copy link

Twey commented Jan 2, 2020

In my tests, a 256 MiB file (dd if=/dev/zero of=big-file.dat bs=1M count=256) takes over 10 GiB of RAM and 25 s to compile.
Interestingly, a file created from /dev/urandom takes significantly less RAM and time (about 2 GiB and 10 s).

@kennykerr
Copy link
Contributor

I just came across this with the windows crate where I embed the default metadata describing the Windows API amounting to around 20MB. Are there any plans to fix this or should I write my own implementation?

@petrochenkov
Copy link
Contributor

I tried to experiment with this last year, there are very real benefits from not escaping bytes from include_bytes and not turning them into a Symbol, even if nothing else is done (maybe 2x-10x speed/memory by order of magnitude, I don't remember exactly).

The main issues:

  • We need to do this without increasing size of Token because it regresses other cases. This excludes some naïve approaches.
  • We may want to introduce a new literal kind for the result of include_bytes (which doesn't match regular byte strings in macros) otherwise we get subtle correctness issues when comparing literals from include_bytes and other sources.

IIRC, one viable implementation candidate was supporting interning non-UTF-8 byte arrays in string interner and representing them as Symbols (they would occupy a range of Symbols not overlapping with regular string Symbols, e.g. in the upper half of u32 range).
In other words, this issue is not a fundamental problem, but it's not entirely trivial either, and requires some care.

@makr11st
Copy link

makr11st commented Sep 8, 2022

Hi All, I was wondering if there was any progress in the recent months with this issue?

bors added a commit to rust-lang-ci/rust that referenced this issue Nov 12, 2022
…etrochenkov

Delay `include_bytes` to AST lowering

Hopefully addresses rust-lang#65818.
This PR introduces a new `ExprKind::IncludedBytes` which stores the path and bytes of a file included with `include_bytes!()`. We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.
Aaron1011 pushed a commit to Aaron1011/rust that referenced this issue Jan 6, 2023
…etrochenkov

Delay `include_bytes` to AST lowering

Hopefully addresses rust-lang#65818.
This PR introduces a new `ExprKind::IncludedBytes` which stores the path and bytes of a file included with `include_bytes!()`. We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.
@clubby789
Copy link
Contributor

Hi All, I was wondering if there was any progress in the recent months with this issue?

#103812 improved this considerably (small benchmark). There's still some slowness on the LLVM side however

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants