Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amend 0809 to reduce the number of traits. #1401

Closed
wants to merge 1 commit into from

Conversation

Stebalien
Copy link
Contributor

Use the following trait structure:

trait Placer<Data: ?Sized> {
    type Place: Place<Data>;
    fn make_place(&mut self) -> Self::Place;
}

trait Boxer<Data: ?Sized>: Sized {
    type Place: Place<Data, Owner=Self>;
    fn make_place() -> Self::Place;
}

unsafe trait Place<Data: ?Sized> {
    type Owner;
    fn pointer(&mut self) -> *mut Data;
    unsafe fn finalize(self) -> Self::Owner;
}

The key differences are:

  1. The Place trait must be unsafe because the compiler will
    blindly write to the pointer returned by Place::pointer.
  2. Merge InPlace into Place.
  3. Make the boxing side of the hierarchy mirror the "in (PLACE)"
    side of hierarchy.

Note: I'm submitting this as an amendment because it only modifies an appendix. I'd be happy to submit a separate RFC if wanted.

@Stebalien
Copy link
Contributor Author

Implementation: rust-lang/rust@master...Stebalien:simpler-place

@pnkfelix
Copy link
Member

I do not have any immediate problem with this amendment.

I wll note that another acceptable variant might be to make the Placer and Boxer traits unsafe (rather than making Place unsafe as suggested here); but I don't have an opinion as to which variant is better.

@Stebalien
Copy link
Contributor Author

What's the closest approximation of the non-existent plan for DST support?

For sized objects, it makes sense to make Place unsafe because one can say: any implementation of Place::pointer() must return a pointer to a valid region of memory large enough to store Data. We can ensure that Placers and Boxers return valid Places because the Place trait type signature completely specifies the behavior of the implementation.

However, if pointer can point to a DST, it makes more sense to make Placer/Boxer unsafe because returning something that implements Place<DST> doesn't guarantee that the place will have enough space to store the DST.

Does this make sense?

Use the following trait structure:

    unsafe trait Placer<Data: ?Sized> {
        type Place: Place<Data>;
        fn make_place(&mut self) -> Self::Place;
    }

    unsafe trait Boxer<Data: ?Sized>: Sized {
        type Place: Place<Data, Owner=Self>;
        fn make_place() -> Self::Place;
    }

    trait Place<Data: ?Sized> {
        type Owner;
        fn pointer(&mut self) -> *mut Data;
        unsafe fn finalize(self) -> Self::Owner;
    }

The key differences are:

1. The `Placer` and `Boxer` traits must be unsafe because the compiler will
   blindly write to the pointers returned by
   `Placer::make_place(self).pointer()` and
   `Boxer::make_place().pointer()`. Note: we could have made `Place`
   itself unsafe but, if we ever decide to eventually support DST
   placement new, the `DstBoxer` and `DstPlacer` traits would need to be
   unsafe so there's no point in making Place unsafe.

2. Merge `InPlace` into `Place`.

3. Make the boxing side of the hierarchy mirror the "in (PLACE)"
   side of hierarchy.
@Ericson2314
Copy link
Contributor

Can we do

trait Boxer<Data: ?Sized>: Sized + Placer
    where <Self as Placer>::Place: Place<Data, Owner=Self> { }

With the corresponding blanket impl?

@Stebalien
Copy link
Contributor Author

@Ericson2314,

No, Placer and Boxer are significantly different. Boxer::make_place() doesn't take a instance of a Boxer while Placer::make_place(&mut self) takes a reference to an instance of Placer and we can't just cook up this reference out of thin air.

@Ericson2314
Copy link
Contributor

Ah, I didn't notice the presence/lack of that arguments, whoops.

Well I suppose with allocators, now boxers would take an argument---the allocator, so maybe they could be unified after all, but that can be dealt with separately.

@Ericson2314
Copy link
Contributor

What's the closest approximation of the non-existent plan for DST support?
However, if pointer can point to a DST, it makes more sense to make Placer/Boxer unsafe...

Inspired by the allocator RFC, add a Kind (size and alignment), argument, and make a wrapper function for the Sized case?

fn make_place_unsize(&mut self, kind: Kind) -> Self::Place;

@pnkfelix
Copy link
Member

Well I suppose with allocators, now boxers would take an argument---the allocator, so maybe they could be unified after all, but that can be dealt with separately.

I have been assuming that if you needed to pass a concrete allocator value in, then you would use the placement-in form, even if the final type is a Box<T, ArenaAlloc>


Admittedly, such an approach looked somewhat better (IMO) when it was written as

let handle = in arena { boxed_value };

rather than the newer:

let handle = arena <- boxed_value;

Anyway, my point is really just that the main use case for box expr(and thus the Boxer trait) is when the allocator in question has no concrete value and can be constructed out of thin air after its type has been supplied (e.g. via type inference). When you need to supply an allocator expression, the box syntax is not appropriate.

@pnkfelix pnkfelix self-assigned this Dec 13, 2015
@pnkfelix pnkfelix added T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC. labels Dec 13, 2015
@pnkfelix
Copy link
Member

(this cross-cuts lang and libs, i think, so i am tagging with both, but i hope we can resolve at lang team mtg)

@Ericson2314
Copy link
Contributor

@pnkfelix Maybe add a default bound?

trait Boxer<Data: ?Sized>: Sized + Placer + Default
    where <Self as Placer>::Place: Place<Data, Owner=Self> { }

So there is still only want trait to worry about?

Basically my concern is every boxer can be a placer too, so it would be nice to enforce that relationship without adding much of a burden.

@pnkfelix
Copy link
Member

Basically my concern is every boxer can be a placer too

I don't understand this claim.

Are you anticipating code that is generic over all Placers, where a client would want to pass in a boxer? That is the only interpretation that I can work out of the claim above. (In which case I guess one would have to pass in a sentinel representative of the Boxer, just to get the macro expansion to work out. Implementing Default isn't going to solve that problem.)

@Stebalien
Copy link
Contributor Author

There's a bit of a problem with both this and the original trait hierarchies. The Placer trait can't handle the following case because, given a Placer<T>, there is exactly one associated Owner type (<Placer::<T>::Place as Place<T>>::Owner).

let a: Box<_> = HEAP <- 1; // Owner = Box<T>
let b: Rc<_> = HEAP <- 1; // Owner = Rc<T>

@Ericson2314
Copy link
Contributor

Ah, @pnkfelix I missed that Boxer would be implemented by Box, Rc, etc themselves, rather than e.g. HEAP.

This almost seems good, except type inference is still confused.

pub mod protocol {
    use std::default::Default;

    trait Placer<Data: ?Sized> {
        type Place: Place<Data>;
        fn make_place(&mut self) -> Self::Place;
    }

    trait Boxer<Data: ?Sized>: Sized
        where <Self::Placer as Placer<Data>>::Place: Place<Data, Owner=Self>
    {
        type Placer: Placer<Data> + Default;
    }

    fn box_make_place<D, B: Boxer<D>>() -> <B::Placer as Placer<D>>::Place 
        where <B::Placer as Placer<D>>::Place: Place<D, Owner=B>
    {
        let mut p: B::Placer = Default::default();
        p.make_place()
    }

    unsafe trait Place<Data: ?Sized> {
        type Owner;
        fn pointer(&mut self) -> *mut Data;
        unsafe fn finalize(self) -> Self::Owner;
    }

}

macro_rules! box_ {
    ($value:expr) => { {
        let mut place = ::protocol::box_make_place();
        let raw_place = ::protocol::Place::pointer(&mut place);
        let value = $value;
        unsafe {
            ::std::ptr::write(raw_place, value);
            ::protocol::Place::finalize(place)
        }
    } }
}

fn main() {
    let b: Box<usize> = box_!(1);
}

@pnkfelix
Copy link
Member

@Stebalien It sounds like you are anticipating some future use case where you want to (simultaneously) pass in a placer (a la A <- B) and use the context to decide what the resulting type of the overall expression is (a la box C) ?

Am I interpreting that correctly?

I am not sure there exists a good way today to encode such flexibility.


(Digression that risks derailing the comment thread of this amendment PR follows)

E.g. this won't work:

pub trait InPlace<Data> where Data: ?Sized {
    fn pointer(&mut self) -> *mut Data;
    unsafe fn finalize<Owner>(self) -> Owner;
}

(because there's no way to provide an implementation of that finalize method).

We could do something like this:

pub trait InPlace<Data> where Data: ?Sized {
    fn pointer(&mut self) -> *mut Data;
    unsafe fn finalize<O:Owner>(self) -> O;
}
trait Owner {
    ...
}

but then we'd have to design a generic Owner trait.

@pnkfelix
Copy link
Member

@Ericson2314

This almost seems good, except type inference is still confused.

yes, type inference is what continues to block us landing the box desugaring. @eddyb has a patch that gets it going, but I am not sure if we'll actually land it or not.

@Ericson2314
Copy link
Contributor

For @Stebalien 's latest concern, here is a variation on my last one.

pub mod protocol {
    use std::default::Default;

    trait Placer<Data: ?Sized, Owner> {
        type Place: Place<Data, Owner=Owner>;
        fn make_place(&mut self) -> Self::Place;
    }

    trait Boxer<Data: ?Sized>: Sized
    {
        type Placer: Placer<Data, Self> + Default;
    }

    fn box_make_place<D, B>() -> <B::Placer as Placer<D, B>>::Place
        where B: Boxer<D>
    {
        let mut p: B::Placer = Default::default();
        p.make_place()
    }

    unsafe trait Place<Data: ?Sized> {
        type Owner;
        fn pointer(&mut self) -> *mut Data;
        unsafe fn finalize(self) -> Self::Owner;
    }

}

macro_rules! box_ {
    ($value:expr) => { {
        let mut place = ::protocol::box_make_place();
        let raw_place = ::protocol::Place::pointer(&mut place);
        let value = $value;
        unsafe {
            ::std::ptr::write(raw_place, value);
            ::protocol::Place::finalize(place)
        }
    } }
}

fn main() {
    let b: Box<usize> = box_!(1);
}

@Ericson2314
Copy link
Contributor

@pnkfelix I have an idea of using "phantom args" to aid dispatch. Will try to write up.

@Ericson2314
Copy link
Contributor

Ok, successfully get "trait not implemented" rather than "not enough information" error with this.

pub mod protocol {
    use std::default::Default;
    use std::marker::PhantomData;

    trait Placer<Data: ?Sized, Owner> {
        type Place: Place<Data, Owner=Owner>;
        fn make_place(&mut self) -> Self::Place;
    }

    trait Boxer<Data: ?Sized>: Sized
    {
        type Placer: Placer<Data, Self> + Default;
    }

    fn box_make_place<D, B>(_: PhantomData<B>) 
            -> <B::Placer as Placer<D, B>>::Place
        where B: Boxer<D>
    {
        let mut p: B::Placer = Default::default();
        p.make_place()
    }

    unsafe trait Place<Data: ?Sized> {
        type Owner;
        fn pointer(&mut self) -> *mut Data;
        unsafe fn finalize(self) -> Self::Owner;
    }

    type BoxerHelp<T> = (T, PhantomData<T>);
}

macro_rules! box_ {
    ($value:expr) => { {
        let (a, b): ::protocol::BoxerHelp<_>;
        b = ::std::marker::PhantomData;
        let mut place = ::protocol::box_make_place(b);
        let raw_place = ::protocol::Place::pointer(&mut place);
        let value = $value;
        unsafe {
            ::std::ptr::write(raw_place, value);
            a = ::protocol::Place::finalize(place)
        }
        a
    } }
}

fn main() {
    let b: Box<usize> = box_!(1);
}

@Stebalien
Copy link
Contributor Author

@pnkfelix,

@Stebalien It sounds like you are anticipating some future use case where you want to (simultaneously) pass in a placer (a la A <- B) and use the context to decide what the resulting type of the overall expression is (a la box C) ?

Am I interpreting that correctly?

I'm trying to use a custom allocator (in this case, a not so custom HEAP) for both Box and Rc. Am I missing something? How is one supposed to do this.


@Ericson2314, that may work although I'm not sure why Boxer needs to use a Placer. That just seems to complicate things.

@eddyb
Copy link
Member

eddyb commented Dec 14, 2015

@Ericson2314 Could you check if your scheme solves the test case from rust-lang/rust#27292?

I guess it might be a different way of solving the problem I solved by having a single (safe) method for writing the value and finalizing the placer, but not the main issue (related to DSTs).

@Ericson2314
Copy link
Contributor

@Stebalien I'll concede that once I correctly understood Boxer, it became less of a repeat of Placer than I thought. But considering we do have a Placer: Default, HEAP, for every Boxer I can think of, might as well use it to bootstrap the impl. I like box notation always being a shorthand for some <- notation too.

@eddyb In the PR description right? working on it.

@pnkfelix
Copy link
Member

@Stebalien

I'm trying to use a custom allocator (in this case, a not so custom HEAP) for both Box and Rc. Am I missing something? How is one supposed to do this.

I don't know yet how we'll do it.

But at least now I understand what you are trying to do. :)

One obvious and perhaps ugly hack is to not plug your allocator directly in as the LHS of <-, but rather wrap it in a newtype struct that indicates the kind of Owner to create:

struct BoxIn<A:Allocator>(A);
struct RcIn<A:Allocator>(A);
let a = BoxIn(alloc) <- EXPR;
let b = RcIn(alloc) <- EXPR;
impl<A> Placer for BoxIn<A> { ... }
impl<A> Placer for RcIn<A> { ... }

@Ericson2314
Copy link
Contributor

Ok, here is both appendices modified to my proposal.
https://gist.github.com/Ericson2314/fecfa774335786613e46

@eddyb It doesn't work for DSTs. My diagnosis is the compiler is confused because it can't rule out there isn't no, e.g. Rc<T> => Box<DST> coersion assuming T => DST. I am not sure if that different/better than before.

@Stebalien this includes HEAP working with Box and Rc without newtype wrappers too.

@Ericson2314
Copy link
Contributor

One last benefit of associating Boxer with a placer is moving HEAP out of liballoc (for motivation see #1398 (comment)). The old way, it must be there for coherence to implement Boxer. But with the new way, Boxer can be implement for all allocators (assuming allocator->placer blanket impl), in liballoc, and HEAP defined downstream.

edit missing word.

@Stebalien
Copy link
Contributor Author

@Ericson2314, a Placer really isn't supposed to be thrown away like that; that's why it takes an &mut self.

One last benefit of associating Boxer with a placer is HEAP out of liballoc (for motivation see #1398 (comment)).

Sorry, I don't understand this statement.

But with the new way, Boxer can be implement for all allocators (assuming allocator->placer blanket impl), in liballoc, and HEAP defined downstream.

You can't define a blanket impl of Boxer or Placer outside of libcore (coherence).

@Ericson2314
Copy link
Contributor

You are right :):

I forgot the Default bound in my previous post, oops [it's now editted]. I assume you also forgot because you worked from mine. But don't forget the <Self::Boxer>::Place::Owner=Self--there is a one to one relationship between boxers and places. That means we need to add an allocate parameter to IntermediateBox. Finally, the allocator (&mut self) is needed to allocator any memory, so using HEAP even if it could type check wouldn't make sense.

None of these error are inescapable. Your second impl fixed looks like:

impl<A: Allocator + Default, T> Boxer<T> for Box<A, T> {
    type Place = IntermediateBox<A, T>;
    fn make_place() -> Self::Place {
        <A as Default>::default().make_place()
    }
}

Which indeed shows my changes to Boxer are not necessary. [This also keeps HEAP associated just with the "system allocator", which I prefer.]

I guess the question is when would one write a Boxer instance that doesn't look like this (as in using a default allocator)?

@Stebalien
Copy link
Contributor Author

I forgot the Default bound in my previous post, oops [it's now editted]. I assume you also forgot because you worked from mine.

and

Finally, the allocator (&mut self) is needed to allocator any memory, so using HEAP even if it could type check wouldn't make sense.

Boxer isn't supposed to take a custom allocator anyways (edited).

But don't forget the Self::Boxer::Place::Owner=Self--there is a one to one relationship between boxers and places. That means we need to add an allocate parameter to IntermediateBox.

Added (defaults to Heap like Box).

@Stebalien
Copy link
Contributor Author

I guess the question is when would one write a Boxer instance that doesn't look like this (as in using a default allocator)?

impl Boxer<[T]> for Vec<T> {}

let v: Vec<_> = box [1, 2, 3];

@Stebalien
Copy link
Contributor Author

Also, I'd expect the Heap::default() to construct a new heap with the default parameters, not return the default heap.

@Ericson2314
Copy link
Contributor

let v: Vec<_> = box [1, 2, 3];`

Masterstroke :). If the placer for Vec was by-value, this would actually fall into my pattern with Vec's existing default(), but that would be limiting in other, more important, circumstances.

Also, I'd expect the Heap::default() to construct a new heap with the default parameters, not return the default heap.

Hmm, I don't think system allocators let on do that. I think of HEAP as a handle to the global heap, and in general placers should be handles (e.g. &mut Vec<T>) [See the allocator vs. pool distinction in the allocator RFC]. Maybe it should get a different name?

...your edit...

I prefer my fixed version of your counter-example, because I'd still like any allocator implementing default to get box syntax. But that is a definitely matter of taste.

@eddyb
Copy link
Member

eddyb commented Dec 18, 2015

@Ericson2314 I meant the testcase added in the PR, with the safe and inference-friendly API.

Also, memcpy optimization is ensured by abusing &mut's noalias.

@Stebalien
Copy link
Contributor Author

@eddyb, this would also allow Place to be safe to implement (because Unique has an unsafe constructor and requires that the inner pointer points to allocated memory).

@Ericson2314
Copy link
Contributor

@eddyb No luck. https://gist.github.com/Ericson2314/d6e3f9c4c37eb2d4a19b first revision just gets macro definition in function, second changes some T parameters to associated types to reduce number of inference variables, third adds some ?Size which, frankly, I am shocked weren't needed in the original (unless my param->assoc conversion added more Size-bounds accidentally).

I tried adding as _ and as Box<_> in various places, and your traits fared worse than the ones from this PR (with my tweaks), as as... did do the trick with those. I am not sure why that is.

@Ericson2314
Copy link
Contributor

*mut problems

Mm, we'd need to magically add those attributes to Unique, and expose new primitives so the information isn't lost, right? Do all that, and one might as well add a lifetime parameter for stack usage, and woah now we have a new reference type.

tl;dr: I rather live with the unsafely or &mut misuse for now, and file that under &in &out motivation.

@Stebalien
Copy link
Contributor Author

@eddyb in my tests, the only thing that affects RVO is inlining, not &mut T versus *mut T. Are you seeing something else?

use std::{mem, ptr};

struct Thing([usize; 256]);
impl Thing {
    #[inline(never)]
    fn new() -> Self {
        Thing([0usize; 256])
    }
    // #[inline(never)]
    fn pointer(&mut self) -> *mut [usize; 256] {
        &mut self.0 as *mut _
    }
}

#[inline(never)]
fn new_inner() -> [usize; 256] {
    let mut out = [0usize; 256];
    for (i, v) in out.iter_mut().enumerate() {
        *v = i;
    }
    out
}

fn main() {
    let mut thing = Thing::new();
    let ptr = thing.pointer();
    unsafe {
        ptr::write(ptr, new_inner());
    }
}

@eddyb
Copy link
Member

eddyb commented Dec 20, 2015

@Stebalien I was seeing noalias from &mut T remove all redundant memcpy's.

@Stebalien
Copy link
Contributor Author

Do you have a test case I can play with?

Steven Allen

@eddyb
Copy link
Member

eddyb commented Dec 20, 2015

Well, this might explain why I can't seem to reproduce with just my prototype testcase:

declare noalias i8* @__rust_allocate(i64, i64) unnamed_addr #1
...
  %0 = tail call i8* @__rust_allocate(i64 20, i64 4) #1, !noalias !0

So it seems we've started putting noalias metadata on the allocator functions, and after inlining everything is nicely optimized..

With #[inline(never)] on the BoxPlace::make implementation for Box, we get:
One memcpy and one memmove with *mut T.
The same thing with &mut T.

Well, isn't it wonderful when changes somewhere else break your "solution"?
It appears that &mut T had noalias at some point and that was removed (or I'm imagining things).

Making Place::pointer #[inline(never)] as well produces the optimization, leaving only one memcpy, but it appears dereferenceable(20), not noalias, is responsible for the optimization, and it's being lost in inlining.

Or... not? Even *mut T optimizes to one memcpy with both #[inline(never)] in place.

I give up. cc @dotdash @Aatch What is going on here?!

@Stebalien
Copy link
Contributor Author

I'm mostly wondering if we can just transmute to &mut T when desugaring.

@eddyb
Copy link
Member

eddyb commented Dec 20, 2015

@Stebalien as opposed to unsafe { &mut *ptr }?

@Stebalien
Copy link
Contributor Author

@eddyb Either way. My point is, does it really matter what Place::pointer returns or can we just transmute/cast it into something that has the properties we need.

@eddyb
Copy link
Member

eddyb commented Dec 20, 2015

@Stebalien at this point, my original testcase doesn't show differences between &mut T and *mut T anymore and I'm waiting for an expert opinion.

Stebalien added a commit to Stebalien/rfcs that referenced this pull request Dec 21, 2015
@Stebalien
Copy link
Contributor Author

@Ericson2314, I have replacement (full) RFC draft that includes many of your suggestions (and other changes that have come out of discussions here and elsewhere) here. I'll probably submit a pull request in a few days

@eddyb
Copy link
Member

eddyb commented Dec 22, 2015

@Stebalien I still prefer my safe desugaring, although it technically can be built on top of the main traits.

@eddyb
Copy link
Member

eddyb commented Dec 22, 2015

@Stebalien No, I mean the safe write_and_fin which writes to the place and calls the (unsafe) fin method on it, consuming the place in the process.

@Stebalien
Copy link
Contributor Author

Sorry, I wish GitHub would show usernames in emails instead of names...

How does that interact with Placer? One of the goals of the current design is to define the finalize once (on the Place) instead of on a Placer/Boxer independently. Also, putting finalize on Boxer/Placer would break the blanket impl of Placer for Allocators.

However, I considered using the following safe method but wasn't sure how well it would optimize:

trait Place<Data> {
    type Owner;
    fn emplace(self, data: Data) -> Self::Owner;
}

Where the implementer would be responsible for emplacing the data. The downside is that this gives implementers the ability to shoot themselves in the foot by allocating the place inside emplace instead of inside Placer::make_place/Boxer::make_place. Another downside is that it's not really compatible with DST placement.

@eddyb
Copy link
Member

eddyb commented Dec 22, 2015

@Stebalien I just have it a default method next to finalize. Or it can be a free function generic on the traits being used.
The previous ramblings are about optimizations. Currently inlining seems to do the job for Box, at least, so I'm not too worried.

@Stebalien
Copy link
Contributor Author

So something like:

fn emplace<D, P: Place<D>>(place: P, data: D) -> P::Owner {
    unsafe {
        ptr::write(place.pointer(), value);
        place.finalize()
    }
}

The one problem with this is that Place::pointer is called after evaluating data so a sufficiently stupid programmer could shoot him or herself in the foot by doing the allocation inside Place::pointer. However, I don't know if this is something we really care about because Place::pointer should be side-effect free.

@eddyb
Copy link
Member

eddyb commented Dec 22, 2015

Just realized that a MIR desugaring wouldn't be affected by the unsafe/inference issues I was trying to sidestep, as long as HIR was somewhere in-between (maybe going as far as having write_and_finalize as a special kind of HIR expression).

@Stebalien
Copy link
Contributor Author

Incorporated feedback and moved into a new RFC (#1426). It was growing a bit large for a patch.

@Stebalien Stebalien closed this Dec 23, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants