Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove _zeroed variants and pass a flag instead #44

Closed
TimDiekmann opened this issue Mar 8, 2020 · 18 comments · Fixed by rust-lang/rust#70362
Closed

Remove _zeroed variants and pass a flag instead #44

TimDiekmann opened this issue Mar 8, 2020 · 18 comments · Fixed by rust-lang/rust#70362
Labels
A-Alloc Trait Issues regarding the Alloc trait Proposal

Comments

@TimDiekmann
Copy link
Member

TimDiekmann commented Mar 8, 2020

Currently, for almost all methods (except dealloc and shrink(_in_place)) in AllocRef, there is a _zeroed variant, which blows up the trait.

I propose to remove those variants and add a zeroed: bool to the signatures. E.g.:

fn alloc(&self, layout: Layout, zeroed: bool) -> Result<...>;
fn realloc(&self, ptr: NonNull<u8>, layout: Layout, new_size: usize, zeroed: bool) -> Result<...>;
@TimDiekmann TimDiekmann added A-Alloc Trait Issues regarding the Alloc trait Proposal labels Mar 8, 2020
@Amanieu
Copy link
Member

Amanieu commented Mar 8, 2020

I would prefer an enum over a bool: http://blakesmith.me/2019/05/07/rust-patterns-enums-instead-of-booleans.html

Otherwise this seems like a good idea.

Do we want to do something similar for grow/grow_in_place and shrink/shrink_in_place?

@TimDiekmann
Copy link
Member Author

I'm aware of that post, but I wasn't able to find a good enum name so far.

Maybe something like

enum ??? {
    Zeroed,
    Unspecified
}

Yes for grow/grow_in_place, for shrink/shrink_in_place it doesn't make sense, as no new memory is returned.

@Amanieu
Copy link
Member

Amanieu commented Mar 8, 2020

I mean having an InPlace flag instead of having two separate methods.

@TimDiekmann
Copy link
Member Author

Oh, yeah, that's also a good idea. So basically like grow and shrink in rust-lang/rust#69824? 😄
https://github.com/rust-lang/rust/blob/32c0b1761d0cbef051408c6c28f78302935d8188/src/liballoc/raw_vec.rs#L465

@Amanieu
Copy link
Member

Amanieu commented Mar 8, 2020

Tentative names:

enum AllocContents {
    Zero,
    Uninitialized, // With a big warning that reading this is UB if you haven't initialized it
}

enum AllocConstraints {
    None,
    InPlace,
}

Note that I am less sure about the InPlace thing, so we might want to defer on that for now.

@Lokathor
Copy link

Lokathor commented Mar 8, 2020

I'd call it NewMemory, because a realloc doesn't also zeroed the previous data.

@TimDiekmann
Copy link
Member Author

As this will be in core::alloc::, I'd omit the Alloc prefix.
I just came up with enum Init { Zero, Uninitialized/Unspecified }. enum Constraint sounds like a bitmask with the abillity to add more variants later.

@Lokathor
Copy link

Lokathor commented Mar 8, 2020

Names should generally make sense on their own, without knowing the module path to the item.

@TimDiekmann
Copy link
Member Author

TimDiekmann commented Mar 9, 2020

Can we agree on this?

/// Passed to an allocating method to determine how newly requested memory is 
/// initialized.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum AllocInit {
    /// Depending on the used allocator, newly allocated memory is not 
    /// necessarily initialized. Reading uninitialized memory is 
    /// [undefined behavior]!
    ///
    /// [undefined behavior]: https://doc.rust-lang.org/reference/behavior-considered-undefined.html
    Unspecified,
    /// Newly allocated memory returned by an allocator is guaranteed to be 
    /// zeroed.
    Zero,
}

/// Passed to an allocator to determine how newly requested memory is 
/// allocated when growing or shrinking an existing memory block.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum AllocPlacement {
    /// The location is not specified. The allocator is allowed to make a 
    /// byte-by-byte copy of the memory.
    Unspecified,
    /// When allocating or releasing memory, the contents of the memory must not 
    /// be copied.
    InPlace,
}

@Wodann
Copy link

Wodann commented Mar 9, 2020

Sounds good to me. Nice work on the RawVec PR too 👏🏻

@TimDiekmann
Copy link
Member Author

Thank you!

@Lokathor
Copy link

I feel, dubious about those definitions.

  • Unspecified is a poor name because that's not used anywhere else in Rust. Just say Uninitialized.
  • Zero should be changed to Zeroed, so that the tag names are both adjectives.
  • AllocPlacement only makes sense with realloc, so we can call it ReAllocPlacement (bikeshed the capitalization there).
  • The comments on the placement variants are kinda fuzzy and could probably be improved.

Consider this revision:

/// A desired initial state for allocated memory.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum AllocInit {
    /// The new memory is not initialized to any particular value.
    /// 
    /// Remember that reading uninitialized memory is Undefined Behavior.
    Uninitialized,
    /// The new memory is guaranteed to be zeroed.
    Zeroed,
}

/// A reallocation constraint.
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
enum ReAllocPlacement {
    /// The new address of the memory can be any heap location.
    ///
    /// If the allocation _does_ move, it's the responsibility of the allocator
    /// to also move the data from the previous location to the new location.
    Unspecified,
    /// The address of the new memory must not change.
    ///
    /// If the allocation would have to be moved to a new location to fit, the
    /// reallocation request should instead fail.
    InPlace,
}

@TimDiekmann
Copy link
Member Author

TimDiekmann commented Mar 10, 2020

Unspecified is a poor name because that's not used anywhere else in Rust. Just say Uninitialized.

Fair point. Additionally, AllocInit::Uninitialized feels fluently, and it's possible to use core::alloc::{AllocInit::*, ReAllocPlacement::*};, which might comes in handy when using the API

Zero should be changed to Zeroed, so that the tag names are both adjectives.

👍

AllocPlacement only makes sense with realloc, so we can call it ReAllocPlacement (bikeshed the capitalization there).

I'd use ReallocPlacement as other allocation APIs also uses the term realloc, not re_alloc.

The comments on the placement variants are kinda fuzzy and could probably be improved.

Thanks, sounds better!

The new address of the memory can be any heap location.

Is that true? Is this necessarily a heap location? I can imagine, that this could also be a global buffer or something, which has been allocated on the stack.

@Lokathor
Copy link

Good point on the end there....

Well, so as far as I know, LLVM has the concept of the call stack memory, and then it also has the concept that memory can come "from places that aren't the call stack memory", and that's all it thinks about. And memory "not in the call stack memory space" can come from all sorts of places that you might want an allocator for, and if you've got an OS under you then ultimately you're asking the OS to tell the memory management unit to do the right thing. And if you don't have an OS under you then you're doing whatever the device's data sheets say you can do because you are the OS and you have that phenomenal cosmic power. And we need to allow for all these possibilities.

So... What we probably want to do is to say "The new address of the memory can be any valid location." and then the words "valid location" are a link to a section on the top level of some module that describes Rust's rules (or lack of rules) for what is even a legal allocator.

Such a section would be similar to how the core::ptr module has a section that describes the pointer validity rules, and how the core::sync::atomic and std::thread modules describe that Rust has some sort of system for threads and atomics and you should go read the C++ specs to learn about it because right now we just do "what llvm does" and that's "what C++ does".

@Lokathor
Copy link

The main rules would be things that "make sense anyway", like you can't start allocating from a location just past the end of the stack and then call a function because when you push stuff onto the stack it starts writing into your allocation pool and the program gets very sad.

@TimDiekmann
Copy link
Member Author

Adding ReallocPlacement isn't probably a that good decision I initially thought. Lying out the current methods:

fn grow(&mut self, ptr: NonNull<T>, layout: Layout, new_size: usize, init: AllocInit) -> Result<(NonNull<T>, usize), AllocErr>;
fn grow_in_place(&mut self, ptr: NonNull<T>, layout: Layout, new_size: usize, init: AllocInit) -> Result<usize, AllocErr>;
fn shrink(&mut self, ptr: NonNull<T>, layout: Layout, new_size: usize) -> Result<(NonNull<T>, usize), AllocErr>;
fn shrink_in_place(&mut self, ptr: NonNull<T>, layout: Layout, new_size: usize) -> Result<usize, AllocErr>;

grow and grow_in_place / shrink and shrink_in_place has different return values. _in_place means, that the pointer does not change, so returning a pointer is ... pointless, literally. Returning an excess in shrink_in_place results in a trivial default implementation of

fn shrink_in_place(&mut self, ptr: NonNull<T>, layout: Layout, new_size: usize) -> Result<usize, AllocErr> {
    Ok((ptr, layout.size())
}

which cannot fail.

So I'd come to this conclusion:

  • Add AllocInit,
  • remove Result from shrink_in_place, and
  • keep _in_place-variants.

@Amanieu
Copy link
Member

Amanieu commented Mar 18, 2020

We just agreed in #42 that we can't remove the Result from shrink since success implies that a different (smaller) Layout can now be used for dealloc.

@TimDiekmann
Copy link
Member Author

I was just about to say that 😄
The only solution to this would be to remove the excess API from shrink(_in_place). This would also fix a soundness hole in the current RawVec::into_box implementation.

bors added a commit to rust-lang-ci/rust that referenced this issue Apr 2, 2020
Overhaul of the `AllocRef` trait to match allocator-wg's latest consens; Take 2

GitHub won't let me reopen rust-lang#69889 so I make a new PR.

In addition to rust-lang#69889 this fixes the unsoundness of `RawVec::into_box` when using allocators supporting overallocating. Also it uses `MemoryBlock` in `AllocRef` to unify `_in_place` methods by passing `&mut MemoryBlock`. Additionally, `RawVec` now checks for `size_of::<T>()` again and ignore every ZST. The internal capacity of `RawVec` isn't used by ZSTs anymore, as `into_box` now requires a length to be specified.

r? @Amanieu

fixes rust-lang/wg-allocators#38
fixes rust-lang/wg-allocators#41
fixes rust-lang/wg-allocators#44
fixes rust-lang/wg-allocators#51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Alloc Trait Issues regarding the Alloc trait Proposal
Projects
None yet
4 participants