-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify/Optimize FileEncoder #115542
Simplify/Optimize FileEncoder #115542
Conversation
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit a3671fd3993f906b9a9879b1465a918b9d80cac3 with merge 9bd418951e5c1d9a80f4d5ff524fde43a43149bf... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (9bd418951e5c1d9a80f4d5ff524fde43a43149bf): comparison URL. Overall result: ❌ regressions - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 627.22s -> 627.249s (0.00%) |
This comment has been minimized.
This comment has been minimized.
That's pretty slow but the regressions are almost exclusively confined to I have to wonder if one of my other PRs will obviate this. Ah well, let's see what happens if we optimize the leb128 path a bunch... |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit f16b987d8162c60ac2f91c185451b1182cff1acf with merge a0f5ca19b76ebc68ddad77a862195f68e03eec62... |
This comment has been minimized.
This comment has been minimized.
💔 Test failed - checks-actions |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit f16b987d8162c60ac2f91c185451b1182cff1acf with merge 230afd4b68f6d1786901abded18b5dc9ea431540... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
@bors r=WaffleLapkin rollup=never |
☀️ Test successful - checks-actions |
1 similar comment
☀️ Test successful - checks-actions |
Finished benchmarking commit (3223b0b): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 632.887s -> 633.664s (0.12%) |
@saethlin @WaffleLapkin looks like a small regression snuck back in. The regressions seem real, but perhaps it's not worth it given the much larger amount of improvements. Thoughts? |
You may be looking at the relative graph instead of the absolute one? |
(yeah, agreed.) |
Clamp instead of asserting in FileEncoder::write_with r? `@WaffleLapkin` If this isn't the regression mentioned in rust-lang#115542 (comment) I'd have to actually look into it.
If #116188 doesn't recoup the regression, I'll mark this as triaged on the basis that the regression is very small and in a change that's overwhelmingly positive. As you suggest, it's not worth our time to look into a tiny regression in an overwhelmingly green report unless there's an easy fix. |
The cachegrind diff for the regressed benchmark indicates that the regression is somehow in LLVM. I'm not really sure how that's possible, but between this cachegrind diff and the linked PR I think I've done the amount of investigation that this merits and also come up with nothing that points towards fixing the lone regression.
|
FileEncoder is basically a BufWriter except that it exposes access to the not-written-to-yet region of the buffer so that some users can write directly to the buffer. This strategy is awesome because it lets us avoid calling memcpy for small copies, but the previous strategy was based on the writer accessing a
&mut [MaybeUninit<u8>; N]
and returning a&[u8]
which is an API which currently mandates the use of unsafe code, making that interface in general not that appealing.So this PR cleans up the FileEncoder implementation and builds on that general idea of direct buffer access in order to prevent
memcpy
calls in a few key places when encoding the dep graph and rmeta tables. The interface used here is now 100% safe, but with the caveat that internally we need to avoid trusting the number of bytes that the provided function claims to have written.The original primary objective of this PR was to clean up the FileEncoder implementation so that the fix for the following issues would be easy to implement. The fix for these issues is to correctly update self.buffered even when writes fail, which I think it's easy to verify manually is now done, because all the FileEncoder methods are small.
Fixes #115298
Fixes #114671
Fixes #114045
Fixes #108100
Fixes #106787