Stream large files during unpacking #2707

Merged (2 commits) on Apr 6, 2021


rbtcollins

Fixes #2632, #2145, #2564

Files over 16M are now written incrementally in chunks rather than being
buffered in memory in one full linear buffer. The chunk size is not
configurable.
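
To illustrate the idea, here is a minimal sketch of chunked writing. It is not the code from this PR: the function name `write_in_chunks` and its `chunk_size` parameter are hypothetical, and it uses only `std::io` so it stays independent of the real disk IO abstractions.

```rust
use std::io::{self, Read, Write};

/// Copy `src` to `dst` in pieces of at most `chunk_size` bytes and return
/// the number of chunks written. Hypothetical helper, not rustup's API.
fn write_in_chunks<R: Read, W: Write>(
    mut src: R,
    mut dst: W,
    chunk_size: usize,
) -> io::Result<usize> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    // One reusable buffer of chunk_size bytes, instead of a buffer
    // large enough to hold the whole file.
    let mut buf = vec![0u8; chunk_size];
    let mut chunks = 0;
    loop {
        // Fill the buffer as far as possible; a short read is not EOF.
        let mut filled = 0;
        while filled < chunk_size {
            match src.read(&mut buf[filled..]) {
                Ok(0) => break, // end of input
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        if filled == 0 {
            break; // nothing left to write
        }
        dst.write_all(&buf[..filled])?;
        chunks += 1;
        if filled < chunk_size {
            break; // the last, partial chunk has been written
        }
    }
    Ok(chunks)
}
```

Called with a `chunk_size` of `16 * 1024 * 1024`, peak memory for the copy is bounded by one chunk regardless of file size, which is the property the description above relies on.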

For threaded unpacking, the entire memory buffer will be used to buffer
chunks, and a single worker thread will dispatch IO operations from the
buffer, so minimal performance impact should be anticipated (at worst
file size / 16M round trips, and most network file systems will hide
the latency of linear writes).
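
A rough sketch of the threaded shape, under stated assumptions: a bounded `std::sync::mpsc::sync_channel` stands in for the memory budget (it counts queued chunks, not bytes, which is a simplification), and `threaded_write` and `max_buffered` are hypothetical names, not identifiers from this PR.

```rust
use std::io::{self, Write};
use std::sync::mpsc::sync_channel;
use std::thread;

/// Queue chunks for a single worker thread that writes them in order.
/// At most `max_buffered` chunks wait in the queue at any time.
/// Hypothetical helper, not rustup's API.
fn threaded_write<W: Write + Send + 'static>(
    mut dst: W,
    chunks: Vec<Vec<u8>>,
    max_buffered: usize,
) -> io::Result<W> {
    let (tx, rx) = sync_channel::<Vec<u8>>(max_buffered);

    // The single worker: drains the queue and performs the actual IO.
    let worker = thread::spawn(move || -> io::Result<W> {
        for chunk in rx {
            dst.write_all(&chunk)?;
        }
        Ok(dst)
    });

    for chunk in chunks {
        // send() blocks while the queue is full, so the producer can
        // never run further ahead than the buffer allows.
        if tx.send(chunk).is_err() {
            break; // worker stopped early, its error surfaces via join()
        }
    }
    drop(tx); // close the channel so the worker's loop terminates

    worker.join().expect("worker thread panicked")
}
```

Because the producer keeps filling the queue while the worker is blocked on a write, the cost of each round trip overlaps with preparing later chunks. Writing each chunk directly, with no queue, gives up that overlap.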

For immediate unpacking, each chunk is dispatched directly to disk,
which may impact performance as less latency hiding is possible - but
for immediate unpacking clarity of behaviour is the priority.


@kinnison kinnison left a comment

For the most part I follow this given the conversations we have had before. I like the CompletedIO approach to the budget reclaim, that refactor pleases me.

A couple of nit-picking whinges, and I'll be pretty happy. I want to play a bit with this locally though too, so even if you do the cleanups now, don't merge yet.

src/diskio/mod.rs (resolved)
src/dist/component/package.rs (resolved)

@kinnison kinnison left a comment

I feel like I want to try this locally a bit, but I'm okay with merging this otherwise.

Successfully merging this pull request may close these issues.

"File too big" on s390x