Delay `include_bytes` to AST lowering #103812

clubby789 · 2022-10-31T18:45:57Z

Hopefully addresses #65818.
This PR introduces a new ExprKind::IncludedBytes which stores the path and bytes of a file included with include_bytes!(). We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.

rustbot · 2022-10-31T18:46:06Z

r? @TaKO8Ki

(rustbot has picked a reviewer for you, use r? to override)

rustbot · 2022-10-31T21:45:35Z

Some changes occurred in src/tools/clippy

cc @rust-lang/clippy

Some changes occurred in src/tools/rustfmt

cc @rust-lang/rustfmt

compiler/rustc_builtin_macros/src/source_util.rs

compiler/rustc_ast_pretty/src/pprust/state.rs

jyn514 · 2022-10-31T23:30:50Z

@bors try @rust-timer queue

rust-timer · 2022-10-31T23:30:52Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-10-31T23:30:59Z

⌛ Trying commit 261c30cc4c962434dc5608ba767ed7b3cd7a35be with merge f3fc6066fb6e47915c9c0f888fbb7a66f620e14f...

bors · 2022-11-01T01:52:43Z

☀️ Try build successful - checks-actions
Build commit: f3fc6066fb6e47915c9c0f888fbb7a66f620e14f (f3fc6066fb6e47915c9c0f888fbb7a66f620e14f)

rust-timer · 2022-11-01T01:52:46Z

Queued f3fc6066fb6e47915c9c0f888fbb7a66f620e14f with parent 95a3a72, future comparison URL.

rust-timer · 2022-11-01T03:11:30Z

Finished benchmarking commit (f3fc6066fb6e47915c9c0f888fbb7a66f620e14f): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	4.2%	[4.2%, 4.2%]	1
Improvements ✅ (primary)	-3.8%	[-3.8%, -3.8%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-3.8%	[-3.8%, -3.8%]	1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.5%	[3.5%, 3.5%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Kobzol · 2022-11-01T18:39:40Z

This looks great! Could you please try to benchmark some code that includes a large binary file with and without this PR? We don't have a benchmark for this present in the rustc-perf suite at the moment.

clubby789 · 2022-11-01T20:17:56Z

With this program:

static BYTES: &[u8] = include_bytes!("out");

fn main() {
    println!("{}", BYTES.len());
}

Using GNU time to record the time and max RSS:

> dd if=/dev/urandom of=src/out bs=500M count=1
...
> cargo clean && /usr/bin/time -f 'mem %M, user %Us' cargo +nightly build
   Compiling big_inc v0.1.0 (/tmp/big_inc)
    Finished dev [unoptimized + debuginfo] target(s) in 10.52s
mem 4265648, user 9.55s
> cargo clean && /usr/bin/time -f 'mem %M, user %Us' cargo +stage1 build
   Compiling big_inc v0.1.0 (/tmp/big_inc)
    Finished dev [unoptimized + debuginfo] target(s) in 3.39s
mem 2755012, user 2.70s

clubby789 · 2022-11-06T17:36:55Z

Looking at this a bit further, around 50% of the remaining compile time after this patch is spent in https://github.com/llvm/llvm-project/blob/244331ae833aaf33503bbd36890e704afb66a237/llvm/lib/IR/Constants.cpp#L2970-L2971 - not sure if that could be improved at all.

EDIT: To the best of my knowledge, this is due to LLVM hashing the entire constant byte by byte. This is also done on the Rust side at some points (hashing query results), but the impact is less severe.

est31 · 2022-11-07T06:50:06Z

@clubby789 people have sent LLVM patches in the past to improve things for Rust (edit: in fact this occurs all the time). If you want to change llvm too and need help, you could maybe ask the llvm working group on zulip.

Kobzol · 2022-11-07T14:41:38Z

Hopefully the LLVM code could be improved (e.g. a faster hash function), but I wouldn't block this PR on this now, as this PR already seems to improve the situation for the Rust side a lot.

petrochenkov · 2022-11-10T15:39:23Z

This is very similar to adding a new AST node for format_args!, see the discussion in https://rust-lang.zulipchat.com/#narrow/stream/233931-t-compiler.2Fmajor-changes/topic/Make.20format_args!.28.29.20its.20own.20AST.20node.20.28ast.E2.80.A6.20compiler-team.23541/near/295952151 and below.

What this node needs to have is a "canonical token representation", into which it turns one any attempt at tokenization, which includes pretty printing.
Thankfully, in this case the token representation is pretty simple.

Also, every use of ast::ExprKind::Lit needs to be audited and equivalent logic should be added for ast::ExprKind::IncludedBytes if relevant. It's hard to verify whether it was done or not from the diff.

compiler/rustc_ast_pretty/src/pprust/state.rs

compiler/rustc_passes/src/hir_stats.rs

petrochenkov · 2022-11-10T15:50:06Z

r? @petrochenkov
@rustbot author

clubby789 · 2022-11-10T17:02:32Z

Also, every use of ast::ExprKind::Lit needs to be audited and equivalent logic should be added for ast::ExprKind::IncludedBytes if relevant. It's hard to verify whether it was done or not from the diff.

The initial diff just changed some matches until it compiled - I've now gone through all references to ExprKind::Lit and made sure IncludedBytes is handled (like a ByteStr) where appropriate

compiler/rustc_ast/src/ast.rs

compiler/rustc_ast/src/mut_visit.rs

compiler/rustc_ast_lowering/src/expr.rs

src/tools/rustfmt/src/expr.rs

compiler/rustc_expand/src/build.rs

compiler/rustc_ast_pretty/src/pprust/state.rs

compiler/rustc_expand/src/proc_macro_server.rs

petrochenkov · 2022-11-11T16:02:32Z

Thanks!
r=me after squashing commits and rebasing.
@rustbot author

petrochenkov · 2022-11-11T16:48:38Z

@bors r+

bors · 2022-11-11T16:48:40Z

📌 Commit b2da155 has been approved by petrochenkov

It is now in the queue for this repository.

bors · 2022-11-12T14:30:37Z

⌛ Testing commit b2da155 with merge 8ef2485...

bors · 2022-11-12T17:39:04Z

☀️ Test successful - checks-actions
Approved by: petrochenkov
Pushing 8ef2485 to master...

rust-timer · 2022-11-12T18:56:45Z

Finished benchmarking commit (8ef2485): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.7%	[3.7%, 3.7%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.4%	[-2.4%, -2.4%]	1
All ❌✅ (primary)	-	-	0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.3%	[2.3%, 2.3%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.3%	[2.3%, 2.3%]	1

…etrochenkov Delay `include_bytes` to AST lowering Hopefully addresses rust-lang#65818. This PR introduces a new `ExprKind::IncludedBytes` which stores the path and bytes of a file included with `include_bytes!()`. We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.

Kobzol · 2023-07-01T12:28:25Z

FWIW, I tried to replace djbHash with xxHash on the benchmark posted above, here are the results:

xxHash
Benchmark 1: cargo clean && cargo +local build
  Time (mean ± σ):      3.864 s ±  0.042 s    [User: 1.898 s, System: 2.269 s]
  Range (min … max):    3.749 s …  3.902 s    10 runs

djbHash
Benchmark 1: cargo clean && cargo +local build
  Time (mean ± σ):      4.307 s ±  0.025 s    [User: 2.244 s, System: 2.366 s]
  Range (min … max):    4.273 s …  4.351 s    10 runs

(I shouldn't also measure the clean step, but w/e, this is unscientific anyway :) ).

There is a clear improvement, but it's nothing really big (given the fact that we hash a stress test - 500 MiB file!). So I'm not sure if it's worth it to switch it.

rustbot assigned TaKO8Ki Oct 31, 2022

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 31, 2022

This comment has been minimized.

Sign in to view

camsteffen reviewed Oct 31, 2022

View reviewed changes

compiler/rustc_builtin_macros/src/source_util.rs Outdated Show resolved Hide resolved

est31 reviewed Oct 31, 2022

View reviewed changes

compiler/rustc_ast_pretty/src/pprust/state.rs Outdated Show resolved Hide resolved

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 31, 2022

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 1, 2022

est31 mentioned this pull request Nov 4, 2022

Error with multiple large include_bytes! uses. #103979

Closed

petrochenkov self-assigned this Nov 5, 2022

clubby789 mentioned this pull request Nov 7, 2022

Poor performance with hashing large constant strings llvm/llvm-project#58860

Open

petrochenkov reviewed Nov 10, 2022

View reviewed changes

compiler/rustc_ast_pretty/src/pprust/state.rs Outdated Show resolved Hide resolved

compiler/rustc_passes/src/hir_stats.rs Show resolved Hide resolved

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 10, 2022

petrochenkov added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 10, 2022

petrochenkov reviewed Nov 11, 2022

View reviewed changes

petrochenkov added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 11, 2022

Introduce ExprKind::IncludedBytes

b2da155

clubby789 force-pushed the improve-include-bytes branch from e3d8936 to b2da155 Compare November 11, 2022 16:32

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 11, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 12, 2022

bors merged commit 8ef2485 into rust-lang:master Nov 12, 2022

rustbot added this to the 1.67.0 milestone Nov 12, 2022

trinity-1686a mentioned this pull request Nov 14, 2022

non-pattern macro in pattern position: include_bytes #104414

Closed

Yosshi999 mentioned this pull request Jan 3, 2023

モデルの数が多いとビルドできない VOICEVOX/voicevox_core#362

Closed

clubby789 mentioned this pull request Jan 7, 2023

Using include_bytes! on large binary blobs compiles more slowly than expected #65818

Open

clubby789 deleted the improve-include-bytes branch February 11, 2023 14:44

jkelleyrtp mentioned this pull request Aug 6, 2024

Support relative assets like include_str!() DioxusLabs/manganis#59

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Delay `include_bytes` to AST lowering #103812

Delay `include_bytes` to AST lowering #103812

clubby789 commented Oct 31, 2022

rustbot commented Oct 31, 2022

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

rustbot commented Oct 31, 2022

jyn514 commented Oct 31, 2022

rust-timer commented Oct 31, 2022

bors commented Oct 31, 2022

bors commented Nov 1, 2022

rust-timer commented Nov 1, 2022

rust-timer commented Nov 1, 2022

Kobzol commented Nov 1, 2022

clubby789 commented Nov 1, 2022

clubby789 commented Nov 6, 2022 •

edited

Loading

est31 commented Nov 7, 2022 •

edited

Loading

Kobzol commented Nov 7, 2022 •

edited

Loading

petrochenkov commented Nov 10, 2022

petrochenkov commented Nov 10, 2022

clubby789 commented Nov 10, 2022 •

edited

Loading

petrochenkov commented Nov 11, 2022

petrochenkov commented Nov 11, 2022

bors commented Nov 11, 2022

bors commented Nov 12, 2022

bors commented Nov 12, 2022

rust-timer commented Nov 12, 2022

Kobzol commented Jul 1, 2023

Delay include_bytes to AST lowering #103812

Delay include_bytes to AST lowering #103812

Conversation

clubby789 commented Oct 31, 2022

rustbot commented Oct 31, 2022

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

rustbot commented Oct 31, 2022

jyn514 commented Oct 31, 2022

rust-timer commented Oct 31, 2022

bors commented Oct 31, 2022

bors commented Nov 1, 2022

rust-timer commented Nov 1, 2022

rust-timer commented Nov 1, 2022

Overall result: no relevant changes - no action needed

Instruction count

Max RSS (memory usage)

Cycles

Kobzol commented Nov 1, 2022

clubby789 commented Nov 1, 2022

clubby789 commented Nov 6, 2022 • edited Loading

est31 commented Nov 7, 2022 • edited Loading

Kobzol commented Nov 7, 2022 • edited Loading

petrochenkov commented Nov 10, 2022

petrochenkov commented Nov 10, 2022

clubby789 commented Nov 10, 2022 • edited Loading

petrochenkov commented Nov 11, 2022

petrochenkov commented Nov 11, 2022

bors commented Nov 11, 2022

bors commented Nov 12, 2022

bors commented Nov 12, 2022

rust-timer commented Nov 12, 2022

Overall result: no relevant changes - no action needed

Instruction count

Max RSS (memory usage)

Cycles

Kobzol commented Jul 1, 2023

Delay `include_bytes` to AST lowering #103812

Delay `include_bytes` to AST lowering #103812

clubby789 commented Nov 6, 2022 •

edited

Loading

est31 commented Nov 7, 2022 •

edited

Loading

Kobzol commented Nov 7, 2022 •

edited

Loading

clubby789 commented Nov 10, 2022 •

edited

Loading