
Stream files into the local store while capturing them #12563

Merged 4 commits into pantsbuild:main from stuhood/streaming-store on Aug 14, 2021

Conversation

@stuhood (Member) commented Aug 13, 2021

A private repository experienced OOM issues in CI with a large number of tests. Allocation profiling highlighted that a large batch of `32MB`/`64MB` allocations was occurring around the same time, all caused by the fact that `PosixFS::read_file` slurps an entire file into memory as `FileContent`. Because local process execution uses `PosixFS::read_file` to capture sandbox outputs, a bunch of processes finishing at once could lead to memory usage spikes.

To fix that, this change removes `PosixFS::read_file` (which was only ever used to digest files), and replaces it with `{Store,local::ByteStore,ShardedLmdb}::store*` methods which make two-ish passes over a `Read` instance in order to digest and capture it.

The methods take an `immutable_data` parameter so that callers can indicate that they know that data will not change, which allows for avoiding hashing it on the second pass. Captures from process sandboxes are treated as immutable, while memoized captures from the workspace are not.
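The two-pass idea can be sketched roughly as follows. This is a hypothetical illustration, not the engine's actual code: std's `DefaultHasher` stands in for the real content digest, the "store" is just a `Vec`, and `digest_then_capture` is an invented name. The point is that the first pass hashes over a bounded buffer, and the bytes are only materialized (or, in the real implementation, written to LMDB) on the second pass.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{Cursor, Read, Seek, SeekFrom};

// Pass 1: hash the stream in fixed-size chunks, so peak memory during
// hashing stays bounded regardless of file size. Pass 2: rewind and capture
// the bytes (into a Vec here for brevity; the real store writes to LMDB
// keyed by the digest, and can skip re-hashing when the data is immutable).
fn digest_then_capture<R: Read + Seek>(reader: &mut R) -> std::io::Result<(u64, Vec<u8>)> {
    let mut hasher = DefaultHasher::new();
    let mut buf = [0u8; 8192];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.write(&buf[..n]);
    }
    let digest = hasher.finish();

    // Second pass: rewind and capture.
    reader.seek(SeekFrom::Start(0))?;
    let mut captured = Vec::new();
    reader.read_to_end(&mut captured)?;
    Ok((digest, captured))
}

fn main() -> std::io::Result<()> {
    let mut input = Cursor::new(b"sandbox output bytes".to_vec());
    let (digest, captured) = digest_then_capture(&mut input)?;
    println!("digest={:x}, captured {} bytes", digest, captured.len());
    Ok(())
}
```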


#12569 adds relevant benchmarks. Profiling allocation during those benchmarks shows peak memory usage of 945MB on main, and 362MB on the branch. Additionally, the benchmarks all run faster (likely due to reduced allocation):

```
snapshot_capture/snapshot_capture((100, 100, false, 100))
                        time:   [1.8909 s 1.9000 s 1.9109 s]
                        change: [-27.459% -25.841% -24.085%] (p = 0.00 < 0.05)
                        Performance has improved.
snapshot_capture/snapshot_capture((20, 10000000, true, 10))
                        time:   [1.2539 s 1.2655 s 1.2782 s]
                        change: [-62.138% -61.318% -60.505%] (p = 0.00 < 0.05)
                        Performance has improved.
snapshot_capture/snapshot_capture((1, 200000000, true, 10))
                        time:   [3.5281 s 3.5773 s 3.6299 s]
                        change: [-13.420% -11.985% -10.434%] (p = 0.00 < 0.05)
                        Performance has improved.
```

@stuhood stuhood force-pushed the stuhood/streaming-store branch 2 times, most recently from 918e4ce to 585cae4 Compare August 13, 2021 15:27
@stuhood stuhood marked this pull request as ready for review August 14, 2021 00:56
@Eric-Arellano (Contributor) left a comment:

Exciting results! Is this cherry-pickable to 2.6? Seems fine to not cherry-pick the benchmarks portion.

@stuhood (Member, Author) commented Aug 14, 2021

> Exciting results! Is this cherry-pickable to 2.6? Seems fine to not cherry-pick the benchmarks portion.

Should be, yea.

@illicitonion (Contributor) left a comment:

Nice optimisation!

Review comments were left on:
- src/rust/engine/hashing/src/lib.rs
- src/rust/engine/sharded_lmdb/src/tests.rs
- src/rust/engine/sharded_lmdb/src/lib.rs (several threads)
@illicitonion (Contributor) commented:
Also FWIW, if the problem is the number of allocations, rather than size, we could re-use a pool of `Vec`s to do the reads, rather than allocate a new one for each read, but that would still hit the same issue in the face of large files (i.e. reading a 1GB file would still need 1GB of RAM allocated, which this solution nicely avoids :))
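The pooled-buffer alternative described above could look roughly like this (a hypothetical sketch, not code from this PR; `read_with_reused_buf` is an invented name). A single scratch buffer is reused across reads, which bounds the number of allocations, but the destination still grows to the full file size, which is exactly the large-file cost the streaming approach avoids:

```rust
use std::io::Read;

// Hypothetical sketch of the reused-buffer idea: `scratch` is sized once and
// reused across calls, so the allocation count stays constant. But `out`
// still grows to the full file size, so a 1GB file still needs 1GB of RAM.
fn read_with_reused_buf<R: Read>(
    mut src: R,
    scratch: &mut Vec<u8>,
    out: &mut Vec<u8>,
) -> std::io::Result<()> {
    scratch.resize(64 * 1024, 0);
    loop {
        let n = src.read(scratch)?;
        if n == 0 {
            return Ok(());
        }
        out.extend_from_slice(&scratch[..n]);
    }
}

fn main() -> std::io::Result<()> {
    let mut scratch = Vec::new();
    let mut out = Vec::new();
    // Two "files" read back-to-back; the scratch buffer is reused for both.
    read_with_reused_buf(&b"abc"[..], &mut scratch, &mut out)?;
    read_with_reused_buf(&b"def"[..], &mut scratch, &mut out)?;
    println!("{}", String::from_utf8_lossy(&out));
    Ok(())
}
```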

stuhood added a commit that referenced this pull request Aug 14, 2021
Refactor the test harness to allow for creating larger files (without allocating their full contents in memory), and add benchmarks for a series of file counts, sizes, and repetitions.

Preparation for #12563.

[ci skip-build-wheels]
@stuhood stuhood enabled auto-merge (squash) August 14, 2021 22:54
@stuhood stuhood merged commit f4155dd into pantsbuild:main Aug 14, 2021
@stuhood stuhood deleted the stuhood/streaming-store branch August 14, 2021 23:28
@stuhood stuhood added this to the 2.6.x milestone Aug 15, 2021
stuhood added a commit to stuhood/pants that referenced this pull request Aug 15, 2021
stuhood added a commit (#12572) that referenced this pull request Aug 15, 2021
```rust
pub async fn store<F, R>(
    &self,
    initial_lease: bool,
    data_is_immutable: bool,
```
A reviewer (Contributor) commented:

This is the only scary part of the review. The eyeball glazes over the bool literals at the final call sites where this is plumbed. I like a bool enum to force spelling out what choice is being made more clearly, but maybe you consider that overkill here:

```rust
enum Negate {
    True,
    False,
}
```


That said, if you're modifying this code and not reading closely in the 1st place, I'm sympathetic to the idea you'll get what you deserve.
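Applied to the `data_is_immutable` parameter, the bool-enum pattern might look roughly like this (a hypothetical sketch; `DataMutability` and this `store` stub are invented names for illustration, not the actual change):

```rust
// Hypothetical sketch: a two-variant enum in place of a bare `bool` makes
// the choice legible at call sites. All names here are invented.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum DataMutability {
    Immutable,
    Mutable,
}

fn store(data: &[u8], mutability: DataMutability) -> usize {
    match mutability {
        // Immutable data need not be re-hashed on the second pass.
        DataMutability::Immutable => data.len(),
        // Mutable data would be re-verified before committing.
        DataMutability::Mutable => data.len(),
    }
}

fn main() {
    // Compare `store(output, true)` with the self-describing version:
    let n = store(b"sandbox output", DataMutability::Immutable);
    println!("stored {} bytes", n);
}
```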
