[RFC]: Standardize methods for leaking containers #2969
A bit unsure about the addition of both This wouldn't actually change whether something is safe or not, it would just make null checks explicit and forgo the need for an extra method. It would also reduce the amount of unsafe code used in implementing these methods as you don't need to duplicate the methods. |
@clarfon We already have The pointers returned from the The goal of adding new explicit methods like |
@KodrAus I get that, my main concern is that if the goal is to prefer If someone receives a |
That's a fair question. I'd figured that since these We could look at deprecating I should add these to the alternatives 🙂 |
To clarify, I'm not the kind of person who'd be using these APIs so I think it's super fair to have folks weigh in on whether they'd prefer the from/into methods in addition to leak/unleak. I mostly bring this up because the goal of this RFC is to align everything with where we want to be, and "we did it this other way already" isn't a strong enough justification for doing stuff IMHO. ;)
I’ve started updating the text with some better motivating examples for |
One drawback of using |
I think what this RFC needs for the |
Hi, looking through this doc it appears that there is some codification of standard methods across multiple types. Would it make sense to codify this into a trait? For example |
text/0000-container-leak.md
For multi-value containers like `Vec<T>` and `String`, any subset of the following method pairs should be added to work with their raw representations:

- `leak`: shrink the container to its initialized length, leak it and return an arbitrarily long-lived shared or mutable reference to its allocated content.
> shrink the container to its initialized length
Is shrinking it first the best option here? I'd personally expect `Vec::leak` to not shrink the contents. I think of `leak` as just something like "hey this Vec you got here, never deallocate it", like `std::mem::forget`, except we can still access the leaked contents. Shrinking it first might involve a lot of copying, just to save some unused capacity. Without shrinking, it does what I'd expect from a `leak` function: it's basically a no-op other than stopping deallocation from happening.
If I want to not waste any unused capacity, I could still easily call `shrink_to_fit` first (or leak the `.into_boxed_slice()` instead).
If `leak` does shrink, I'd have to use unsafe code to leak it 'efficiently' (i.e. without copying/shrinking, but wasting some unused heap space).
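A minimal sketch of the two explicit opt-in routes mentioned in this comment, using only stable `std` calls. The helper names `shrink_then_leak` and `leak_boxed` are hypothetical and exist only for illustration; neither is proposed by the RFC.

```rust
/// Hypothetical helper: drop spare capacity explicitly, then leak.
fn shrink_then_leak<T: 'static>(mut v: Vec<T>) -> &'static mut [T] {
    // May reallocate and copy the elements into a smaller buffer.
    v.shrink_to_fit();
    v.leak()
}

/// Hypothetical helper: convert to a boxed slice (which drops spare
/// capacity), then leak the box.
fn leak_boxed<T: 'static>(v: Vec<T>) -> &'static mut [T] {
    Box::leak(v.into_boxed_slice())
}
```

Either way the cost of shrinking is visible at the call site rather than hidden inside `leak`.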
Do you have an example use-case where this inefficiency would matter? The best one I can think of is a startup latency sensitive embedded system, that might want to quickly leak a lot of data at startup. However that use-case might be better off using a separate "pointer-bump" allocator for data it wants to leak. Not possible yet. But with custom allocators it will be.
Other than that situation, I'd expect `shrink_to_fit` to almost always be the right choice, because the inefficiency of something done once at startup would be outweighed by the long-term gain of having a bit less memory pressure.
It doesn't feel very much like Rust to hide a potentially complex operation (shrinking/reallocating) inside a function that appears to be cheap or even free (`leak()`). It might very well be the case that there are not many programs where it would matter much, but it'd be good if functions do what their name suggests. The name `leak` suggests it just leaks the memory allocation of the container, not that it copies all the data over to a new allocation first and leaks that instead while deallocating the original allocation.
If `leak()` doesn't shrink, that's easily explained in the documentation, and shrink-leaking is still easy to achieve by calling `shrink_to_fit()` first (or using `into_boxed_*().leak()`). (Or maybe a `shrink_and_leak()` can be added?)
If `leak()` doesn't just leak but also shrinks, then just leaking is hard to achieve. It'd require `unsafe {}` and a raw pointer.
I also think `leak()` should be consistent with `leak_raw`, which only leaks and doesn't shrink.
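To make the "`unsafe {}` and a raw pointer" point concrete, here is a rough sketch of what a caller would have to write if `leak()` were specified to shrink. The function name `leak_without_shrinking` is hypothetical; the `std` calls it uses (`ManuallyDrop::new`, `slice::from_raw_parts_mut`) are stable.

```rust
use std::mem::ManuallyDrop;
use std::slice;

/// Hypothetical helper: leak a `Vec` without touching its allocation.
fn leak_without_shrinking<T: 'static>(v: Vec<T>) -> &'static mut [T] {
    // Suppress the destructor so the buffer is never freed.
    let mut v = ManuallyDrop::new(v);
    // SAFETY: the pointer and length come from a live `Vec` that will
    // never be dropped, so the first `len` elements stay initialized and
    // this is the only reference to them.
    unsafe { slice::from_raw_parts_mut(v.as_mut_ptr(), v.len()) }
}
```

The spare capacity is simply abandoned along with the rest of the buffer; no copy takes place.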
> Is shrinking it first the best option here? I'd personally expect `Vec::leak` to not shrink the contents.
Maybe whether or not these methods drop extra capacity should be left unspecified? I think `Vec::leak` only shrinks as a consequence of being implemented through `Vec::into_boxed_slice` so that it can use `Box::leak`.
> I think of leak as just something like "hey this Vec you got here, never deallocate it"
Current leak implementations can return a lifetime shorter than `&'static`, which means at the end of its lifetime it would be legal to reconstitute the original allocation. For that one needs to know the capacity.
> Current leak implementations can return a lifetime shorter than `&'static`, which means at the end of its lifetime it would be legal to reconstitute the original allocation.
As far as I know, the only reason it can return lifetimes shorter than `'static` is to make it possible to leak `Vec<Something<'notstatic>>`. Turning a leaked Vec back into the original Vec doesn't really seem like an intended use case for `leak()`. `into_raw_parts` seems like the right function in that case.
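As an illustration of the decompose/recompose round trip, the sketch below spells the decomposition out by hand with `ManuallyDrop` instead of assuming `Vec::into_raw_parts` is available on a given toolchain. The names `decompose` and `recompose` are hypothetical; `Vec::from_raw_parts` is the stable constructor, and it needs the capacity as well as the length.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical stand-in for `Vec::into_raw_parts`:
/// returns (pointer, length, capacity) and gives up ownership.
fn decompose<T>(v: Vec<T>) -> (*mut T, usize, usize) {
    let mut v = ManuallyDrop::new(v);
    (v.as_mut_ptr(), v.len(), v.capacity())
}

/// Rebuild the `Vec` from its three parts.
///
/// # Safety
/// The parts must come from a single `decompose` call and must not have
/// been used to rebuild or free the allocation already.
unsafe fn recompose<T>(ptr: *mut T, len: usize, cap: usize) -> Vec<T> {
    unsafe { Vec::from_raw_parts(ptr, len, cap) }
}
```

Because the capacity travels with the pointer and length, the rebuilt `Vec` frees the buffer with the same layout it was allocated with.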
Preferably the term leak should only be used in a context where the intent is to leak the vector, and it should not shrink the vector.
Different versions of `into_raw(_parts)`/`from_raw(_parts)` should generally be used to decompose these types if you later intend to recompose them.
Except that sadly, for historical reasons, some of the `into_raw` methods do not return `NonNull` and as such require an otherwise unnecessary unsafe block (or `.unwrap()`) to create a `NonNull`.
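A small sketch of the two workarounds this refers to, shown for `Box::into_raw`. The function names are hypothetical; `NonNull::new` and `NonNull::new_unchecked` are the stable `std::ptr` constructors.

```rust
use std::ptr::NonNull;

/// Checked route: needs `.unwrap()` even though `Box::into_raw`
/// never returns a null pointer.
fn into_non_null_checked<T>(b: Box<T>) -> NonNull<T> {
    NonNull::new(Box::into_raw(b)).unwrap()
}

/// Unchecked route: needs an `unsafe` block instead.
fn into_non_null_unchecked<T>(b: Box<T>) -> NonNull<T> {
    // SAFETY: `Box::into_raw` always yields a non-null pointer.
    unsafe { NonNull::new_unchecked(Box::into_raw(b)) }
}
```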
With regard to `shrink_to_fit()`, we should warn with bold text in the leak documentation that reconstructing a `Vec` for which you only have the `len` and `data_ptr` will lead to undefined behavior when dropping it, as the `Layout` for `alloc` and `dealloc` do not match. That is why `Vec::into_boxed_slice()` does call `Vec::shrink_to_fit()`, as otherwise dropping the boxed slice would cause undefined behavior whose consequences depend on the allocator.
Using `leak()` to create a pointer to the underlying slice and then passing it to a C FFI as `ptr` + `len` was a bug I just ran into...
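One way to avoid that mismatch when only a pointer and a length cross the boundary is to go through a boxed slice, so the allocation is exactly `len` elements long. This is a sketch under that assumption, not the commenter's actual code; `into_ptr_len` and `free_ptr_len` are hypothetical names.

```rust
use std::ptr;

/// Hypothetical helper: hand out a (pointer, length) pair whose
/// allocation holds exactly `len` bytes.
fn into_ptr_len(v: Vec<u8>) -> (*mut u8, usize) {
    // `into_boxed_slice` drops any spare capacity first.
    let boxed: Box<[u8]> = v.into_boxed_slice();
    let len = boxed.len();
    (Box::into_raw(boxed) as *mut u8, len)
}

/// Free a buffer previously returned by `into_ptr_len`.
///
/// # Safety
/// `ptr` and `len` must come from one `into_ptr_len` call and the buffer
/// must not have been freed already.
unsafe fn free_ptr_len(ptr: *mut u8, len: usize) {
    unsafe { drop(Box::from_raw(ptr::slice_from_raw_parts_mut(ptr, len))) }
}
```

Since the length alone describes the whole allocation, the layout used on deallocation matches the one used on allocation.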
cc @SimonSapin you've done some work on |
I no longer have the bandwidth to carry this forward but if somebody else would like to pick it up sometime in the future then please feel free!
Specifies a standard set of methods with return types and semantics for leaking containers like `Box<T>` and `String`.
Rendered
Links
- `into_raw_non_null` methods were deprecated.
- `Vec::leak` rust#62195 we recently stabilized `Vec::leak` based on `Box::leak`.
- `vec_into_raw_parts` rust#65816 we've been discussing whether `Vec::into_raw_parts` should return a `*mut u8` or a `NonNull`.