Enum layout optimizations #19765

Merged
17 commits merged on Dec 29, 2014

Conversation

luqmana
Member

@luqmana luqmana commented Dec 12, 2014

This extends the nullable enum optimization to look beyond the first level of fields when searching for a field to use as the discriminant, so it now works through structs, tuples, and fixed-size arrays. It also introduces a new lang item, NonZero, which wraps a raw pointer or integral type to tell rustc that the underlying value is never 0/NULL. Vec, Rc and Arc now use it, so they benefit from the nullable enum optimization as well.

As per rust-lang/rfcs#499 NonZero is not exposed via the libstd facade.

x86_64 Linux:
                        T       Option<T> (Before)      Option<T> (After)
----------------------------------------------------------------------------------
Vec<int>                24          32                      24
String                  24          32                      24
Rc<int>                 8           16                      8
Arc<int>                8           16                      8
[Box<int>, ..2]         16          24                      16
(String, uint)          32          40                      32
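
The "After" column can be checked on a current toolchain with a small sketch like the following. It is an illustration, not part of this PR: it uses today's spellings (`isize`/`usize` for `int`/`uint`, `[T; 2]` for `[T, ..2]`), the helper `option_is_free` is a hypothetical name, and apart from `Option<Box<T>>` these size equalities are observed compiler behaviour rather than a documented guarantee.

```rust
use std::mem::size_of;
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical helper: true when wrapping `T` in `Option` costs no extra space.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // Modern equivalents of the rows in the table above.
    assert!(option_is_free::<Vec<isize>>());
    assert!(option_is_free::<String>());
    assert!(option_is_free::<Rc<isize>>());
    assert!(option_is_free::<Arc<isize>>());
    assert!(option_is_free::<[Box<isize>; 2]>());
    assert!(option_is_free::<(String, usize)>());

    // A plain integer has no forbidden bit pattern, so `Option` needs room
    // for a separate discriminant.
    assert!(!option_is_free::<u64>());

    println!("Option<T> is the same size as T for all of the table's rows");
}
```

Comparing against `size_of::<T>()` instead of hard-coding 24 or 8 keeps the check independent of the target's pointer width.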

Fixes #19419.
Fixes #13194.
Fixes #9378.
Fixes #7576.

@jdm
Contributor

jdm commented Dec 12, 2014

👯

@cgaebel
Contributor

cgaebel commented Dec 12, 2014

\o/

@pythonesque
Contributor

Woo!

@flaper87
Contributor

awesome 🍰

@nikomatsakis
Contributor

f? @jld


impl<T> RawPtr<T> for NonZero<*const T> {
    #[inline]
    fn null() -> NonZero<*const T> { NonZero(null()) }
Member

I don't think this is safe at all, even if right now I can't come up with a memory safety violation in safe code using it.
Though someone more motivated could potentially use it to transmute arbitrary values.

Member

The existence of NonZero::null does seem like a bit of an oxymoron in any case.
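
For comparison, here is a sketch of how a null-rejecting pointer wrapper can behave. It uses `std::ptr::NonNull`, a later standard-library type that is not part of this PR; the helper name `wrap` is hypothetical. The point is only that a checked constructor can refuse a null pointer rather than offer a `null()` constructor for a type that promises non-nullness.

```rust
use std::ptr::{self, NonNull};

/// Hypothetical helper: wrap a raw pointer, rejecting null.
fn wrap(p: *mut i32) -> Option<NonNull<i32>> {
    // `NonNull::new` returns `None` for a null pointer.
    NonNull::new(p)
}

fn main() {
    // A null pointer cannot be wrapped.
    assert!(wrap(ptr::null_mut()).is_none());

    // The address of a live local is never null, so wrapping succeeds.
    let mut value = 7;
    let p = wrap(&mut value as *mut i32).expect("address of a local is non-null");

    // SAFETY: `p` points at `value`, which is alive and not otherwise
    // borrowed while we read through the pointer.
    assert_eq!(unsafe { *p.as_ptr() }, 7);
}
```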

@nikomatsakis nikomatsakis self-assigned this Dec 15, 2014
// &T/&mut T/Box<T> could either be a thin or fat pointer depending on T
ty::ty_rptr(_, ty::mt { ty, .. }) | ty::ty_uniq(ty) => match ty.sty {
    // &[T] and &str are a pointer and length pair
    ty::ty_vec(_, None) | ty::ty_str => Some(vec![FAT_PTR_ADDR]),
Contributor

Why special-case these things? Why not just call type_is_sized all the time? It seems like this just makes the code less DRY.
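
The thin/fat distinction the quoted code comment describes can be observed from user code. The sketch below runs on a current toolchain and is not taken from the PR; `is_thin` and `is_fat` are hypothetical helper names, and the sizes are what current compilers produce, not a statement about how rustc's layout code is written.

```rust
use std::mem::size_of;

/// Hypothetical helper: true when `P` is one machine word (a thin pointer).
fn is_thin<P>() -> bool {
    size_of::<P>() == size_of::<usize>()
}

/// Hypothetical helper: true when `P` is two machine words (address plus length).
fn is_fat<P>() -> bool {
    size_of::<P>() == 2 * size_of::<usize>()
}

fn main() {
    // Pointers to sized types are a single address.
    assert!(is_thin::<&u8>());
    assert!(is_thin::<Box<u8>>());

    // Pointers to slices and `str` carry a length next to the address.
    assert!(is_fat::<&[u8]>());
    assert!(is_fat::<&str>());
    assert!(is_fat::<Box<[u8]>>());

    // In both cases the address part is never null, so `Option` adds nothing.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
    assert_eq!(size_of::<Option<&[u8]>>(), size_of::<&[u8]>());
    assert_eq!(size_of::<Option<&str>>(), size_of::<&str>());
}
```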

@nikomatsakis
Contributor

So, this looks basically good to me, the big thing that seems to be missing are tests that do matches and so forth on values that contain NonZero instances. There are some tests that check the size, but none that actually use the values in various mysterious ways that I can see. (Admittedly, using this in Rc and Vec etc provides a certain amount of testing, but I'd prefer to see targeted unit tests, probably just as #[test] fns in the NonZero module.)

@luqmana
Member Author

luqmana commented Dec 22, 2014

@nikomatsakis Updated.

@nikomatsakis
Contributor

@luqmana I'm assuming there were only minor changes here? (r+ under that assumption)

@nikomatsakis
Contributor

Actually, revoking r+. The main thing I wanted added was unit tests that test matching against a Option<Foo<T>> for each type Foo that uses NonZero (Vec, Rc, etc). I see some tests checking the size of an Option<Rc<T>>, but no tests that actually check that we correctly distinguish None from Some (and in particular, testing empty vectors Some(Vec::new()) vs None). If I'm wrong and the tests are there, sorry about that.
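
A minimal sketch of the kind of test being requested, written in current Rust syntax and not copied from the tests that were eventually added to the PR. The helper `describe` is a hypothetical name; the checks match on the option and confirm that an empty container inside `Some` is not confused with `None`.

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical helper: report which variant a value is by matching on it.
fn describe<T>(value: &Option<T>) -> &'static str {
    match value {
        Some(_) => "some",
        None => "none",
    }
}

fn main() {
    // An empty vector holds no heap allocation, yet it must still read as `Some`.
    let empty_vec: Option<Vec<i32>> = Some(Vec::new());
    let no_vec: Option<Vec<i32>> = None;
    assert_eq!(describe(&empty_vec), "some");
    assert_eq!(describe(&no_vec), "none");

    // Same check for the other types that rely on the optimization.
    assert_eq!(describe(&Some(String::new())), "some");
    assert_eq!(describe(&None::<String>), "none");
    assert_eq!(describe(&Some(Rc::new(0))), "some");
    assert_eq!(describe(&None::<Rc<i32>>), "none");
    assert_eq!(describe(&Some(Arc::new(0))), "some");
    assert_eq!(describe(&None::<Arc<i32>>), "none");

    // The payload survives the round trip through `Option`.
    assert_eq!(empty_vec.map(|v| v.len()), Some(0));
}
```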


fn find_discr_field_candidate<'tcx>(tcx: &ty::ctxt<'tcx>,
                                    ty: Ty<'tcx>,
                                    path: &mut DiscrField) -> bool {
Contributor

Nit: could this return Option<DiscrField>? (And likewise for find_ptr.) That seems to be what it's doing anyway, and it would be a little clearer to read.

Member Author

@jld It did originally, but I changed it to work this way to get rid of all the extraneous allocation.

Contributor

I think I expressed that badly: I meant returning an optional vector in reverse order, like it's doing now; the recursive cases would move the Vec, push onto it, and then return it. It's not a big deal in any case.

Member Author

Yep, that makes sense. I've updated the code.
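
The shape suggested in this thread can be sketched on a toy type tree. Everything here is hypothetical and for illustration only: the `Ty` enum, its variants, and `discr_path` are not rustc's data structures, and the function body is not the code in the PR. It shows only the pattern of returning an optional path built in reverse, where each recursive case takes the vector it got back, pushes its own index, and returns it, so a single allocation is reused.

```rust
/// Toy stand-in for a type's layout (hypothetical, not rustc's `Ty`).
enum Ty {
    Int,                   // may be zero, so it cannot signal `None`
    NonNullPtr,            // never zero, usable as the discriminant
    Struct(Vec<Ty>),       // also models tuples: fields searched in order
    Array(Box<Ty>, usize), // fixed-size array: element type and length
}

/// Returns the path to a never-zero field, innermost index first (reversed).
fn find_discr_field_candidate(ty: &Ty) -> Option<Vec<usize>> {
    match ty {
        Ty::Int => None,
        // Base case: found a candidate, start with an empty path.
        Ty::NonNullPtr => Some(Vec::new()),
        Ty::Struct(fields) => {
            for (i, field) in fields.iter().enumerate() {
                if let Some(mut path) = find_discr_field_candidate(field) {
                    // Reuse the vector from the recursive call.
                    path.push(i);
                    return Some(path);
                }
            }
            None
        }
        Ty::Array(elem, len) => {
            if *len == 0 {
                return None; // no element exists to inspect
            }
            let mut path = find_discr_field_candidate(elem)?;
            path.push(0); // the first element is enough
            Some(path)
        }
    }
}

/// Outermost-first path, obtained by reversing once at the end.
fn discr_path(ty: &Ty) -> Option<Vec<usize>> {
    let mut path = find_discr_field_candidate(ty)?;
    path.reverse();
    Some(path)
}

fn main() {
    // struct { int, int, struct { ptr } }: field 2, then field 0.
    let ty = Ty::Struct(vec![Ty::Int, Ty::Int, Ty::Struct(vec![Ty::NonNullPtr])]);
    assert_eq!(find_discr_field_candidate(&ty), Some(vec![0, 2]));
    assert_eq!(discr_path(&ty), Some(vec![2, 0]));
    assert_eq!(discr_path(&Ty::Struct(vec![Ty::Int])), None);
}
```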

@luqmana luqmana force-pushed the nonzero-lang-item branch 4 times, most recently from 49773d8 to c0badcd on December 28, 2014 20:19
bors added a commit that referenced this pull request Dec 29, 2014
@bors bors merged commit 766a719 into rust-lang:master Dec 29, 2014
@luqmana luqmana deleted the nonzero-lang-item branch December 29, 2014 20:37
@brson
Contributor

brson commented Jan 5, 2015

Nice wins.

#[lang="non_zero"]
#[deriving(Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Show)]
#[experimental]
pub struct NonZero<T: Zeroable>(T);
Contributor

@luqmana: Is there any reason this couldn't have been pub struct NonZero<T: Zeroable>(pub T)?
The way it's currently defined, NonZero cannot be used to initialize statics and consts. :-(

Member

Because we don't have unsafe fields, this has the same issue as UnsafeCell (public safely modifiable fields can lead to unsafety).

Contributor

Is this issue explained anywhere? I must have missed that discussion.

Contributor

@vadimcn I don't think there's any extended documentation or discussion of this written anywhere, but it falls under the safety guarantees. As @eddyb mentioned, there's no support for unsafe fields, so we have no way to tell the user that accessing a certain field is considered an unsafe operation.

In the case of UnsafeCell, the field is public but, as the docstring in the file says, it shouldn't be.

Now, I wonder if it'd be fair to allow calls to constructors on static items by requiring them to be in an unsafe block.

Contributor

Why is accessing the inner value of NonZero unsafe?

Member

I think creating, not accessing, the value is unsafe in this case: NonZero(0) is bad!

Contributor

In addition to what @huonw said: NonZero guarantees the wrapped raw pointer will never be NULL or 0. If public access to the wrapped pointer is allowed, it would be possible to zero the value out.

Contributor

@huonw, @flaper87: I can still do these things, since NonZero::new() does not perform any input validation. So what's gained? That I have to wrap it in an unsafe {} block?
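
One way to picture the trade-off discussed in this thread is the sketch below. The module `nz` and the type `NonZeroLike` are hypothetical, not the `NonZero` from this PR, and without the lang item the type gets no layout optimization. With a private field, the only ways to build a value are an `unsafe` constructor, where the caller takes on the non-zero obligation explicitly, and a checked constructor that validates its input. A public field would let safe code write `NonZeroLike(0)` directly.

```rust
mod nz {
    /// Hypothetical wrapper whose invariant is "the inner value is never 0".
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct NonZeroLike(u32); // private field: safe code cannot set it to 0

    impl NonZeroLike {
        /// Unchecked constructor.
        ///
        /// # Safety
        /// The caller must guarantee `value != 0`.
        pub unsafe fn new_unchecked(value: u32) -> NonZeroLike {
            NonZeroLike(value)
        }

        /// Checked constructor: validates the input instead of trusting it.
        pub fn new(value: u32) -> Option<NonZeroLike> {
            if value == 0 {
                None
            } else {
                Some(NonZeroLike(value))
            }
        }

        /// Reading the value is always safe; only creation can break the invariant.
        pub fn get(self) -> u32 {
            self.0
        }
    }
}

fn main() {
    // The checked path rejects zero.
    assert!(nz::NonZeroLike::new(0).is_none());
    assert_eq!(nz::NonZeroLike::new(5).map(|n| n.get()), Some(5));

    // SAFETY: 7 is not zero.
    let seven = unsafe { nz::NonZeroLike::new_unchecked(7) };
    assert_eq!(seven.get(), 7);

    // `nz::NonZeroLike(0)` would not compile here because the field is private.
}
```

The `unsafe` block does not validate anything by itself; what it adds is a visible marker at each call site where the invariant is asserted by hand. The integer types that later landed in `std::num`, such as `NonZeroU32`, offer both a checked `new` returning `Option` and an unsafe `new_unchecked`.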
