-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
struct field reordering and optimization #37429
struct field reordering and optimization #37429
Conversation
} | ||
|
||
/// Extend the Struct with more fields. | ||
pub fn extend<I>(&mut self, dl: &TargetDataLayout, | ||
fn fill_in_fields<I>(&mut self, dl: &TargetDataLayout, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably move the whole body of fill_in_fields
into fn new
. Also maybe rename it to from_fields
.
EDIT: Oh is it because of if fields.len() == 0 {return Ok(())};
below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't think of it that way, but you're right in that that's a good reason. I just preferred to separate the overview from the steps, if you will.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that this function shouldn't be possible to call at all, this is really outlining, and there's no usefulness gained by having self
- that is, I'd expect the Struct
be created at the very end.
// We have to fix the last element of path here as only we know the right value. | ||
let mut i = *path.last().unwrap(); | ||
i = st.field_index[i as usize]; | ||
*path.last_mut().unwrap() = i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can use let i = path.last_mut().unwrap();
and then read and write *i
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That wouldn't be an improvement because the borrow lasts too long. It would have to be wrapped in braces and indented further or I'd have to use mem::forget. We can't push the 0 onto the Vec while i
exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh whoops. Only braces would work, btw, borrows can't even be terminated manually.
let mut fields = fields.into_iter().map(|field| { | ||
field.layout(infcx) | ||
}).collect::<Vec<_>>(); | ||
fields.insert(0, Ok(&discr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of these changes seem to be really needed - or are you planning to move the ZSTs here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because we can no longer iterate over the fields in order until after we have the struct.
@@ -511,41 +512,70 @@ pub struct Struct { | |||
/// If true, the size is exact, otherwise it's only a lower bound. | |||
pub sized: bool, | |||
|
|||
/// Offsets for the first byte of each field. | |||
/// Offsets for the first byte of each field, ordered in increasing order. | |||
/// The ith element of this vector is not always the ith field of the struct. | |||
/// FIXME(eddyb) use small vector optimization for the common case. | |||
pub offsets: Vec<Size>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To have the increasing order offsets
(instead of declaration order - is it really necessary?) you need to rename the field to permuted_offsets
or ordered_offsets
and (re-)add a field_offset
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I just make it private and have field_offset
and num_fields
functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds okay - but is it really necessary to have the offsets permuted?
472a158
to
9553ddd
Compare
Sub. I'm sure we're gonna see some interesting stuff in the crater run, hopefully all fixable with a #[repr(C)] |
@arthurprs I'm waiting on #36622 at the moment. Then we shall see. Probably that can be worked around, but I don't have enough experience yet to know what I should or shouldn't be logging and also there's another project I can work on that's not Rust. If you see places that still assume fields are in order, tell me. I'm more than willing to accept help finding them. |
9553ddd
to
af058d6
Compare
ping @eddyb Looks like some updates have happened since you last looked, want to take another peek? |
also tagging with |
@alexcrichton |
Ah ok, no worries! Just checking up |
All but 18 of the Debuginfo still needs conversion. |
Some thoughts (long term, no immediate action required): It would be good to have two things related to field reordering:
Also, reordering accidentally uses stable sorting algorithm now, just because sorting in libstd is stable. |
@petrochenkov I like the idea of Both of these should be almost absurdly trivial to actually add. I could work out how to do it without leaning on @eddyb, I think. |
376ac0c
to
408bed9
Compare
☔ The latest upstream changes (presumably #37675) made this pull request unmergeable. Please resolve the merge conflicts. |
408bed9
to
a565851
Compare
} | ||
|
||
/// Extend the Struct with more fields. | ||
pub fn extend<I>(&mut self, dl: &TargetDataLayout, | ||
fn fill_in_fields<I>(&mut self, dl: &TargetDataLayout, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that this function shouldn't be possible to call at all, this is really outlining, and there's no usefulness gained by having self
- that is, I'd expect the Struct
be created at the very end.
-> Result<(), LayoutError<'gcx>> | ||
where I: Iterator<Item=Result<&'a Layout, LayoutError<'gcx>>> { | ||
self.offsets.reserve(fields.size_hint().0); | ||
let fields = fields.collect::<Result<Vec<_>, LayoutError<'gcx>>>()?; | ||
if fields.len() == 0 {return Ok(())}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unnecessary, as a zero-length Vec
is free.
|
||
self.offsets = vec![Size::from_bytes(0); fields.len()]; | ||
let mut inverse_gep_index: Vec<u32> = Vec::with_capacity(fields.len()); | ||
inverse_gep_index.extend(0..fields.len() as u32); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be written as let mut inverse_gep_index: Vec<u32> = (0..fields.len() as u32).collect();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the same naming problem. laid_to_def_order
might work? Still ugly.
inverse_gep_index.extend(0..fields.len() as u32); | ||
|
||
if repr == attr::ReprAny { | ||
let start: usize = if is_enum_variant {1} else {0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the usize
type here necessary?
@@ -511,41 +512,76 @@ pub struct Struct { | |||
/// If true, the size is exact, otherwise it's only a lower bound. | |||
pub sized: bool, | |||
|
|||
/// Offsets for the first byte of each field. | |||
/// Offsets for the first byte of each field, ordered to match the tys. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you refer to "the definition" or "definition order" instead of "types"? That is, the order the fields come in is the source definition order, the types just happen to be associated with each field so the most straight-forward way to get them is in source definition order, but they're secondary to the order.
/// FIXME(eddyb) use small vector optimization for the common case. | ||
pub offsets: Vec<Size>, | ||
|
||
/// Maps field indices to GEP indices, depending how fields were permuted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Maps from definition order to the order the fields were laid out in", maybe? Still not perfect though.
GEP should really not be mentioned if avoided. def_to_laid_order
might work, although jarring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up calling these source order and memory order, I don't think there's anything better. The gep_index
vec is now memory_index
.
If there's better nomenclature, I don't know what it is.
}).collect::<Vec<_>>(); | ||
fields.insert(0, Ok(&discr)); | ||
let st = Struct::new(dl, | ||
fields.iter().cloned(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if you make Struct::new
take the Vec<Layout>
? Maybe that way the double allocating can be avoided.
@@ -26,8 +26,9 @@ fn main() { | |||
assert_eq!(size_of::<E>(), 1); | |||
assert_eq!(size_of::<Option<E>>(), 1); | |||
assert_eq!(size_of::<Result<E, ()>>(), 1); | |||
assert_eq!(size_of::<S>(), 4); | |||
assert_eq!(size_of::<Option<S>>(), 4); | |||
// The next asserts are correct given the currently dumb field reordering algorithm, which actually makes this struct larger. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would sorting by descending alignment work in more cases than ascending?
To anyone watching, all the run-pass tests now pass. |
1fbefa0
to
825ec31
Compare
@@ -1305,6 +1313,8 @@ impl<'tcx> EnumMemberDescriptionFactory<'tcx> { | |||
|
|||
// Creates MemberDescriptions for the fields of a single enum variant. | |||
struct VariantMemberDescriptionFactory<'tcx> { | |||
// Cloned from the layout::Struct describing the variant. | |||
offsets: Vec<layout::Size>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The layouts are interned so you can just use &'tcx [layout::Size]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still need to address this.
9bf17ce
to
b0c6c53
Compare
Crater report has 3 legitimate regressions, one of them a LLVM assertion, the others are a bug with packed: pub const SIZEOF_QUERY: usize = 21;
#[repr(C,packed)]
pub struct p0f_api_query {
pub magic: u32,
pub addr_type: u8,
pub addr: [u8; 16],
} The correct size is indeed 21, but this PR changes it to 24, presumably some logic went missing. |
Will there be another repr attribute? Something like |
} | ||
} | ||
|
||
let at_least = if let Some(i) = min_from_extern { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be written min_from_extern.unwrap_or(min_default)
.
can_optimize = !reprs.iter().any(|r| { | ||
match *r { | ||
attr::ReprAny => false, | ||
attr::ReprInt(_) => false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use |
between these patterns to avoid repeating false
(and true
below).
I do not think this should be in the next beta, should the next beta be soon. I do not have the schedules memorized yet. This code breaks other people's broken unsafe code in fun ways, but it may also break the compiler. I am not confident enough that this code doesn't create compiler bugs to wish to shove it into a beta as quickly as possible. If @eddyb or @nikomatsakis thinks we're fine, then go for it. But I'd hold off, just in case. As for breaking unsafe code, there were at least 2 places in the compiler with missing |
To clarify, the 1.15 beta branch has not been forked off yet. That was scheduled to happen some time during the next week. @camlorn is there perhaps a "boolean flag" style switch somewhere we could turn this optimization off on beta easily? If so we could leave this on master, easily backport a "turn off this optimization" to beta, and let this ride the trains to 1.16 stable. |
@alexcrichton There's a check for |
Ok, so we have a few options now. I'm assuming that this PR itself is going into the 1.15 beta no matter what.
|
I can open another PR which just turns it off. All you have to do is short-circuit one condition in |
I was wrong. Turning this off is a little complicated. If we do, some tests are going to fail, so we'll have to restore the old versions. |
@camlorn can't we add a compiler flag ( In retrospect I wish I had raised this to the core team to discuss proper handling. |
@camlorn it looks like this may have regressed a test on Windows AppVeyor, specifically:
That wouldn't happen to look familiar off-hand, would it? |
@nikomatsakis I'm going to see if I can figure out how to reproduce this locally. |
Ok turns out we're accidentally not running debuginfo tests on the bots. That may mean that this is trivially reproducible, though. |
Upon following the readme's directions verbatim, I get this. Consequently, I can't reproduce. If I don't use rustbuild, it also happens, but later in the process. I checked against WSL: the broken test passes there. Taking a cursory look at the test, I don't see anything that could be different based on platform. These are also both using gdb. |
@camlorn the |
@gnzlbg |
@camlorn go for it, you already did the work, those are just the 3-5 easy lines of code to make sure nobody else screws it up, totally worth it |
@gnzlbg |
What a pleasant surprise! @camlorn Can you read over the comments in #28951 and confirm that this PR addresses any concerns that were raised there? If it does, can you give me the go-ahead to close it? |
Disable field reordering This was decided via IRC and needs a backport to beta. Basically, #37429 broke servo, and probably needs an announcement and opt-in flag. I didn't run all tests locally but think I've already reverted all the ones that need to be reverted. r? @nikomatsakis
This is work in progress. The goal is to divorce the order of fields in source code from the order of fields in the LLVM IR, then optimize structs (and tuples/enum variants)by always ordering fields from least to most aligned. It does not work yet. I intend to check compiler memory usage as a benchmark, and a crater run will probably be required.
I don't know enough of the compiler to complete this work unaided. If you see places that still need updating, please mention them. The only one I know of currently is debuginfo, which I'm putting off intentionally until a bit later.
r? @eddyb