Directly save a byte representation of the dep-graph and work-product index #83322

cjgillot · 2021-03-20T15:03:48Z

Those files are internal to the incremental engine. They are not meant to be portable.

r? @ghost

cjgillot · 2021-03-20T15:04:58Z

@bors try @rust-timer queue

rust-timer · 2021-03-20T15:04:59Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-03-20T15:05:06Z

⌛ Trying commit de6aaf996d41dddc68ba0f1e5702c42ecd316236 with merge 6a12039846406dd55e03e6296ebaaab7cd258012...

bors · 2021-03-20T15:55:31Z

☀️ Try build successful - checks-actions
Build commit: 6a12039846406dd55e03e6296ebaaab7cd258012 (6a12039846406dd55e03e6296ebaaab7cd258012)

rust-timer · 2021-03-20T15:55:33Z

Queued 6a12039846406dd55e03e6296ebaaab7cd258012 with parent 41b315a, future comparison URL.

rust-timer · 2021-03-20T20:27:57Z

Finished benchmarking try commit (6a12039846406dd55e03e6296ebaaab7cd258012): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

cjgillot · 2021-03-20T20:49:42Z

@bors try @rust-timer queue

rust-timer · 2021-03-20T20:49:43Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2021-03-20T20:49:50Z

⌛ Trying commit a1ea84fbd3fd67d65505013e7d031cf30c6f5d10 with merge 13d82bed22285425f21da9852b33173df9108b4e...

bors · 2021-03-20T21:50:12Z

☀️ Try build successful - checks-actions
Build commit: 13d82bed22285425f21da9852b33173df9108b4e (13d82bed22285425f21da9852b33173df9108b4e)

rust-timer · 2021-03-20T21:50:14Z

Queued 13d82bed22285425f21da9852b33173df9108b4e with parent 61edfd5, future comparison URL.

rust-timer · 2021-03-21T01:37:47Z

Finished benchmarking try commit (13d82bed22285425f21da9852b33173df9108b4e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

cjgillot · 2021-03-21T11:30:55Z

r? @michaelwoerister

michaelwoerister · 2021-03-22T10:29:05Z

Thanks for the PR, @cjgillot!

What's the difference between "raw" and "opaque"? The opaque encoder/decoder are already meant to be the "raw" serialization framework. Is there anything that the "raw" version does that the opaque version could not be made to do?

cjgillot · 2021-03-22T10:47:29Z

The opaque serialization uses LEB128 encoding to keep the file architecture-independent. The raw encoding dumps bytes as they are.
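To make the distinction concrete, here is a minimal sketch of the two strategies for a single u32. The function names are made up for illustration and are not the ones used in rustc_serialize.

```rust
/// Append `value` as unsigned LEB128: 7 payload bits per byte,
/// with the high bit set on every byte except the last one.
fn write_leb128_u32(out: &mut Vec<u8>, mut value: u32) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            // Last byte: continuation bit stays clear.
            out.push(byte);
            return;
        }
        // More bytes follow: set the continuation bit.
        out.push(byte | 0x80);
    }
}

/// Append `value` as its in-memory bytes: fixed width, host endianness.
fn write_raw_u32(out: &mut Vec<u8>, value: u32) {
    out.extend_from_slice(&value.to_ne_bytes());
}
```

With these helpers, 300 becomes the two bytes 0xAC 0x02 under LEB128 on every host, while the raw form is always four bytes whose order depends on the machine that wrote them.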

michaelwoerister · 2021-03-23T11:07:40Z

Ah, that makes sense. How does that affect the size of the dep-graph file?

cjgillot · 2021-03-23T12:10:28Z

It's difficult to say. LEB128 is variable-length, so encoding small numbers takes less room. For a 32-bit integer, LEB128 reduces the size by 75% in the best case (1 byte instead of 4), but adds 25% overhead in the worst case (5 bytes instead of 4).
I expect the dep-graph file to get slightly larger with this change. I will run some benchmarks to get a relevant size estimate.
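The size trade-off can be estimated without touching the compiler. The sketch below is only an illustration with hypothetical helper names: it counts how many bytes a batch of u32 values would take in each encoding.

```rust
/// Number of bytes unsigned LEB128 needs for `value` (7 payload bits per byte).
fn leb128_len(value: u64) -> usize {
    // Significant bits in `value`; 0 when `value` is 0.
    let bits = 64 - value.leading_zeros() as usize;
    // Round up to whole 7-bit groups, with a minimum of one byte.
    std::cmp::max(1, (bits + 6) / 7)
}

/// Total encoded size of `values` as (fixed-width raw bytes, LEB128 bytes).
fn encoded_sizes(values: &[u32]) -> (usize, usize) {
    let raw = values.len() * std::mem::size_of::<u32>();
    let leb: usize = values.iter().map(|&v| leb128_len(u64::from(v))).sum();
    (raw, leb)
}
```

Four small values such as 1, 2, 3, 4 take 16 bytes raw and 4 bytes in LEB128, while four copies of u32::MAX take 16 bytes raw and 20 bytes in LEB128. Which side wins in practice depends on the distribution of the values actually stored.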

bjorn3 · 2021-03-24T18:44:04Z

compiler/rustc_serialize/src/raw.rs

+
+macro_rules! write_raw {
+    ($enc:expr, $value:expr, $int_ty:ty) => {{
+        let bytes = $value.to_ne_bytes();


Can you please use to_le_bytes instead? This is just as fast on little-endian systems, but it makes it possible, for example, to move the incr cache to a big-endian system and then use --target.

cjgillot · 2021-03-24

Is this an actual use case? This PR makes the file format unportable (mainly because of isize/usize), with the objective of memmapping part of it. If portability is required, I need to change the implementation to guarantee it.

bjorn3 · 2021-03-26

Right, I can't name any. I do remember some talk about maybe changing crate metadata to be an incr comp snapshot or something like that in the future. I can't remember the details. In that case portability is very important for cross-compilation.
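For reference, a portable variant would have to fix both the byte order and the width of usize. The following is a rough sketch under those assumptions, not the code in this PR. The helper names are hypothetical, and it assumes hosts where usize is at most 64 bits wide.

```rust
/// Write a `usize` portably: widen to `u64` and emit little-endian bytes,
/// so the on-disk width and byte order do not depend on the host.
fn write_usize_portable(out: &mut Vec<u8>, value: usize) {
    out.extend_from_slice(&(value as u64).to_le_bytes());
}

/// Read the value back. Returns `None` if it does not fit in the
/// reading host's `usize` (for example on a 32-bit target).
fn read_usize_portable(bytes: [u8; 8]) -> Option<usize> {
    let wide = u64::from_le_bytes(bytes);
    if wide <= usize::MAX as u64 {
        Some(wide as usize)
    } else {
        None
    }
}
```

The cost is that the in-memory layout no longer matches the file on big-endian or 32-bit hosts, which works against memmapping the data and reading it in place.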

michaelwoerister · 2021-03-26T10:24:15Z

@rust-lang/wg-compiler-performance How far are we away from collecting file sizes via self-profiling and perf.rlo? It should be quite easy to add the recording part to measureme.

rylev · 2021-03-26T11:01:36Z

> How far are we away from collecting file sizes via self-profiling and perf.rlo? It should be quite easy to add the recording part to measureme.

No one is currently working on this as far as I know. However, I'm also not aware of any reason why this would not be possible to add.

bors · 2021-11-20T17:52:35Z

⌛ Trying commit bf32e3f8ff612fba7ad7e860a661575a1d91b293 with merge c16d991f9b3d8ee4b2b0f31533933924006b013e...

bors · 2021-11-20T19:31:37Z

☀️ Try build successful - checks-actions
Build commit: c16d991f9b3d8ee4b2b0f31533933924006b013e (c16d991f9b3d8ee4b2b0f31533933924006b013e)

rust-timer · 2021-11-20T19:31:39Z

Queued c16d991f9b3d8ee4b2b0f31533933924006b013e with parent 93542a8, future comparison URL.

rust-timer · 2021-11-20T22:05:17Z

Finished benchmarking commit (c16d991f9b3d8ee4b2b0f31533933924006b013e): comparison url.

Summary: This change led to very large relevant mixed results 🤷 in compiler performance.

Very large improvement in instruction counts (up to -5.1% on incr-unchanged builds of stm32f4)
Moderate regression in instruction counts (up to 0.6% on full builds of diesel)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

cjgillot · 2021-11-21T19:55:19Z

As expected, there is a ~30% hit on file sizes. This is sizeable. What should be the decision process?

Pros:

- reduction of instruction count;
- this is a prerequisite for memmapping those files.

Cons:

- larger disk usage;
- slower load time from disk.

We can also wait for a preliminary implementation of memmap dep-graph to benchmark it.

Mark-Simulacrum · 2021-11-21T21:09:17Z

A few questions:

Do we have a sense of whether the disk usage is caused by some types/queries in particular? I'm primarily wondering if we can apply some optimizations to shrink the file size impact, since it seems somewhat non-obvious that the impact is strictly necessary to get the mmap-ability.

Do we know what the instruction count improvements are primarily coming from? Can we get those benefits without the disk space increase, or at least, at smaller cost? (e.g., is it due to avoiding variable-sized integer encoding/decoding?)

Also, is the impact limited just to incremental artifacts (I presume so?) -- if so, then the 30-40% impact is likely to be less for smaller workspaces, since their target directory size is probably dominated by compiled dependencies, not the leaf crate's incremental data. For rustc and other large workspaces (where incremental is applied to many crates), though, a 30-40% increase is going to be rather painful -- I'm worried that it may make it harder for folks to use incremental.

michaelwoerister · 2021-11-25T13:55:08Z

> Do we have a sense of whether the disk usage is caused by some types/queries in particular?

If I'm reading the code correctly this is only about the DepGraph in particular. So what we encode here are mostly node indices, fingerprints, and DepNodes (which internally are a pair of discriminant and fingerprint).

> slower load time from disk.

Is this observable somewhere?

Some ideas on reducing the file size:

Do we know how DepNode discriminants are encoded? Two bytes should certainly be enough, but worst case they might be encoded as eight bytes.
The performance data seems to suggest that we encode about 3.2 million DepNodes for the largest crates (see incr_comp_encode_dep_graph execution count of incr-full build). We might get away with encoding DepNode indices with just 3 bytes, allowing for up to ~16 million nodes.
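The 3-byte idea could look roughly like the sketch below. The names are hypothetical and the little-endian byte order is an assumption made for the example. The point is only that 24 bits cover 2^24 = 16,777,216 distinct indices.

```rust
/// Largest index representable in 3 bytes: 2^24 - 1 = 16_777_215.
const MAX_INDEX_U24: u32 = (1 << 24) - 1;

/// Write `index` using only its 3 low-order bytes (little-endian).
/// Returns the rejected index as an error if it does not fit in 24 bits.
fn write_index_u24(out: &mut Vec<u8>, index: u32) -> Result<(), u32> {
    if index > MAX_INDEX_U24 {
        return Err(index);
    }
    let bytes = index.to_le_bytes();
    // The fourth (most significant) byte is known to be zero, so drop it.
    out.extend_from_slice(&bytes[..3]);
    Ok(())
}

/// Rebuild the index by restoring the implicit zero high byte.
fn read_index_u24(bytes: [u8; 3]) -> u32 {
    u32::from_le_bytes([bytes[0], bytes[1], bytes[2], 0])
}
```

An index around 3.2 million round-trips through 3 bytes, while anything from 1 << 24 upwards is rejected, so a real implementation would need a fallback for crates that exceed that limit.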

Regarding the decision process: I don't know :) My thoughts:

The instruction count reduction is nice. But it is in a range where it's not entirely clear if it is worth the increased disk space usage.
The increased disk space usage is significant. But it isn't so big that it's completely out of the question that we accept it (IMO).
The instruction count reduction doesn't really translate to wall-time improvements because we already load the graph in the background.

Having a memmapped implementation that doesn't require any up-front decoding might indeed be helpful for making a decision here.
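As a rough illustration of why fixed-width records matter here: with a fixed record size, entry i can be read straight out of a byte buffer (for instance a memory-mapped file) without decoding anything before it. The type below is hypothetical and uses little-endian u32 records purely as an example.

```rust
/// Read-only view over a byte buffer holding consecutive
/// little-endian `u32` records. The buffer could come from a
/// memory-mapped file; here it is just a borrowed slice.
struct FixedU32Table<'a> {
    bytes: &'a [u8],
}

impl<'a> FixedU32Table<'a> {
    /// Accept the buffer only if it holds a whole number of records.
    fn new(bytes: &'a [u8]) -> Option<Self> {
        if bytes.len() % 4 == 0 {
            Some(Self { bytes })
        } else {
            None
        }
    }

    /// Number of records in the buffer.
    fn len(&self) -> usize {
        self.bytes.len() / 4
    }

    /// Fetch record `i` in constant time by computing its byte offset.
    fn get(&self, i: usize) -> Option<u32> {
        let start = i.checked_mul(4)?;
        let end = start.checked_add(4)?;
        let chunk = self.bytes.get(start..end)?;
        Some(u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
    }
}
```

A LEB128 stream offers no such shortcut, because the offset of record i depends on the encoded length of every record before it.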

bors · 2022-02-20T05:27:11Z

☔ The latest upstream changes (presumably #94174) made this pull request unmergeable. Please resolve the merge conflicts.

rust-log-analyzer · 2022-03-05T16:07:37Z

The job mingw-check failed!

Possible cause of the failure (guessed by this bot):

    Checking rustc_serialize v0.0.0 (/checkout/compiler/rustc_serialize)
    Checking petgraph v0.5.1
    Checking object v0.28.1
    Checking gimli v0.26.1
error: expected `;`, found keyword `self`
   --> compiler/rustc_serialize/src/raw.rs:468:42
    |
468 |         self.emit_raw_bytes(v.as_bytes())
    |                                          ^ help: add `;` here
469 |         self.emit_u8(STR_SENTINEL)
    |         ---- unexpected token
    Checking rustc_macros v0.1.0 (/checkout/compiler/rustc_macros)
error[E0308]: mismatched types
   --> compiler/rustc_serialize/src/raw.rs:468:9
    |
    |
468 |         self.emit_raw_bytes(v.as_bytes())
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^- help: consider using a semicolon here
    |         expected `()`, found enum `Result`
    |
    = note: expected unit type `()`
                    found enum `Result<(), std::io::Error>`
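Both errors point at the same missing statement terminator. The sketch below shows one plausible shape of the fix. It assumes the enclosing method returns io::Result<()>, which the log does not show. The encoder type and the sentinel value are stand-ins, not the actual definitions in raw.rs.

```rust
use std::io::{self, Write};

/// Stand-in sentinel byte; the value used in raw.rs is not shown in the log.
const STR_SENTINEL: u8 = 0xC1;

/// Hypothetical encoder wrapping any writer.
struct RawEncoder<W: Write> {
    out: W,
}

impl<W: Write> RawEncoder<W> {
    fn emit_raw_bytes(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.out.write_all(bytes)
    }

    fn emit_u8(&mut self, value: u8) -> io::Result<()> {
        self.out.write_all(&[value])
    }

    /// Write the string bytes followed by the sentinel.
    fn emit_str(&mut self, v: &str) -> io::Result<()> {
        // `?;` propagates the error and terminates the statement,
        // which addresses both diagnostics at once.
        self.emit_raw_bytes(v.as_bytes())?;
        self.emit_u8(STR_SENTINEL)
    }
}
```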

bors · 2022-05-20T23:12:53Z

☔ The latest upstream changes (presumably #95418) made this pull request unmergeable. Please resolve the merge conflicts.

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 20, 2021

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Mar 20, 2021

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 20, 2021

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 21, 2021

cjgillot marked this pull request as ready for review March 21, 2021 11:30

rust-highfive assigned michaelwoerister Mar 21, 2021

cjgillot force-pushed the rawencoder branch from a1ea84f to 7c131bd Compare March 22, 2021 18:13

bjorn3 reviewed Mar 24, 2021

rustbot added perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Nov 20, 2021

JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 12, 2021

michaelwoerister added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 16, 2021

cjgillot force-pushed the rawencoder branch from bf32e3f to cd692ea Compare January 29, 2022 15:44

cjgillot force-pushed the rawencoder branch from cd692ea to 37c1ba5 Compare March 5, 2022 09:07

cjgillot added 6 commits March 5, 2022 17:03

Introduce raw encoder and decoder.

c7ea77d

Add test.

f03955f

Move IntEncodedWithFixedSize.

5fef727

Inline more in serialization.

c6643f7

Use raw encoder for dep-graph.

5f21be5

Use raw encoder for work-product index.

b51974f

cjgillot force-pushed the rawencoder branch from 37c1ba5 to b51974f Compare March 5, 2022 16:03

cjgillot mentioned this pull request Apr 1, 2022

Memory-map the dep-graph instead of reading it up front #95543

Closed

cjgillot closed this Jul 30, 2022
