-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make io::Error use 64 bits on targets with 64 bit pointers. #87869
Conversation
r? @yaahc (rust-highfive has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only read the comments in repr_bitpacked
carefully, and modulo some small comments that all sounds very sensible. :)
☔ The latest upstream changes (presumably #83342) made this pull request unmergeable. Please resolve the merge conflicts. |
Ping from triage: @thomcc Can you please address the merge conflict? |
|
1c66cff
to
decba7f
Compare
☔ The latest upstream changes (presumably #90846) made this pull request unmergeable. Please resolve the merge conflicts. |
Maybe to help with the merge conflicts, first land a PR that only wraps the |
I think this was actually waiting on the reviewer — they hadn't updated since oct. The merge errors are because it modified all creation sites of errors in libstd, so they accumulate very easy, even though they're mostly mechanical. Had I gotten any indication it would be reviewed I'd have updated it. Anyway, I'll try and rebase it in a week or so. |
Oops, sorry about that 😅, no idea why I swapped it to waiting on author, I'll take a look at this next week. |
Co-authored-by: Ralf Jung <post@ralfj.de>
Co-authored-by: the8472 <the8472@users.noreply.github.com>
0bb5b76
to
1c1286e
Compare
Updated to address tests, and address some comments that came outside of this issue. Also, I noticed the I also added a few more assertions to ensure that none of the tags use bits outside TAG_MASK, and replace some 4's with |
…n't quite accurate anymore
1c1286e
to
9cbe994
Compare
📌 Commit 9cbe994 has been approved by |
☀️ Test successful - checks-actions |
Finished benchmarking commit (734368a): comparison url. Summary: This benchmark run shows 22 relevant improvements 🎉 but 13 relevant regressions 😿 to instruction counts.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression |
Hmm, interesting. The regressions seem to be in layout computation ( Hmmmmm... |
Looking more closely, the regressions are... almost all versions of Thankfully, the benches with perf deltas that seem to more realistic crates, seem to be more consistently green -- it's not 100% green in the real usage, but its a heck of a lot more consistent, and for the most part seems like the code that uses (Honestly... if you ignore the Anyway, it kinda seems fine to regress a synthetic benchmark like that in this case (where there are runtime benefits in very commonly used types), especially given how many of the non-synthetic benchmarks had a compile-time improvement. But it may be unsurprising that I'm in favor of my own patch. (That said, I aslo I can't quite wrap my head around what I'd poke in the compiler improve this anyway (not more specifically than ✨ layout ✨, at least), so I'd appreciate pointers if it turns out I'm responsible for un-regressing it). |
I'm going to go ahead and mark this as triaged -- I tend to agree that this is likely not something to be directly addressed, and is limited to just one benchmark. |
I've wanted this for a long time, but didn't see a good way to do it without having extra allocation. When looking at it yesterday, it was more clear what to do for some reason.
This approach avoids any additional allocations, and reduces the size by half (8 bytes, down from 16). AFAICT it doesn't come additional runtime cost, and the compiler seems to do a better job with code using it.
Additionally, this
io::Error
has a niche (still), soio::Result<()>
is also 64 bits (8 bytes, down from 16), andio::Result<usize>
(used for lots of io trait functions) is 2x64 bits (16 bytes, down from 24 — this means on x86_64 it can use the nice rax/rdx 2-reg struct return). More generally, it shaves a whole 64 bit integer register off of the size of basically anyio::Result<()>
.(For clarity: Improving
io::Result
(rather than io::Error) was most of the motivation for this)On 32 bit (or other non-64bit) targets we still use something equivalent the old repr — I don't think think there's improving it, since one of the fields it stores is a
i32
, so we can't get below that, and it's already about as close as we can get to it.Isn't Pointer Tagging Dodgy?
The details of the layout, and why its implemented the way it is, are explained in the header comment of library/std/src/io/error/repr_bitpacked.rs. There's probably more details than there need to be, but I didn't trim it down that much, since there's a lot of stuff I did deliberately, that might have not seemed that way.
There's actually only one variant holding a pointer which gets tagged. This one is the (holder for the) user-provided error.
I believe the scheme used to tag it is not UB, and that it preserves pointer provenance (even though often pointer tagging does not) because the tagging operation is just
core::ptr::add
, and untagging iscore::ptr::sub
. The result of both operations lands inside the original allocation, so it would follow the safety contract ofcore::ptr::{add,sub}
.The other pointer this had to encode is not tagged — or rather, the tagged repr is equivalent to untagged (it's tagged with 0b00, and has >=4b alignment, so we can reuse the bottom bits). And the other variants we encode are just integers, which (which can be untagged using bitwise operations without worry — they're integers).
CC @RalfJung for the stuff in repr_bitpacked.rs, as my comments are informed by a lot of the UCG work, but it's possible I missed something or got it wrong (even if the implementation is okay, there are parts of the header comment that says things like "We can't do $x" which could be false).
Why So Many Changes?
The repr change was mostly internal, but changed one widely used API: I had to switch how
io::Error::new_const
works.This required switching
io::Error::new_const
to take the full message data (including the kind) as a&'static
, rather than just the string. This would have been really tedious, but I made a macro that made it much simpler, but it was a wide change sinceio::Error::new_const
is used everywhere.This included changing files for a lot of targets I don't have easy access to (SGX? Haiku? Windows? Who has heard of these things), so I expect there to be spottiness in CI initially, unless luck is on my side.
Anyway this large only tangentially-related change is all in the first commit (although that commit also pulls the previous repr out into its own file), whereas the packing stuff is all in commit 2.
P.S. I haven't looked at all of this since writing it, and will do a pass over it again later, sorry for any obvious typos or w/e. I also definitely repeat myself in comments and such.
(It probably could use more tests too. I did some basic testing, and made it so we
debug_assert!
in cases the decode isn't what we encoded, but I don't know the degree which I can assume libstd's testing of IO would exercise this. That is: it wouldn't be surprising to me if libstds IO testing were minimal, especially around error cases, although I have no idea).