-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BTreeMap: tag and explain unsafe internal functions or assert preconditions #68468
Conversation
Benchmark comparison should be the same as in #68418, I suppose - just some more unsafe tags and comments. These do seem to shift the focus of the optimizer somewhat away from set::is_subset, regardless of the usual noise.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Left a few comments.
@@ -369,7 +369,7 @@ impl<BorrowType, K, V, Type> NodeRef<BorrowType, K, V, Type> { | |||
/// If the node is a leaf, this function simply opens up its data. | |||
/// If the node is an internal node, so not a leaf, it does have all the data a leaf has | |||
/// (header, keys and values), and this function exposes that. | |||
/// See `NodeRef` on why the node may not be a shared root. | |||
/// Unsafe because the node may not be a shared root, see why in the `NodeRef` comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Unsafe because the node may not be a shared root, see why in the `NodeRef` comments. | |
/// The node must not be a shared root. For more information, see the `NodeRef` comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard to believe I fell in the old must-not/may-not trap (a Dutch thing). But now, is it clear why the function is unsafe? I edited the age-old comments on new_kv
and new_edge
just to get that across.
@@ -647,7 +637,7 @@ impl<'a, K, V> NodeRef<marker::Mut<'a>, K, V, marker::Internal> { | |||
|
|||
fn correct_childrens_parent_links(&mut self, first: usize, after_last: usize) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be annotated, I think, as the first/after_last must be <= self.len()
I believe.
(Or we can assert that condition).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we can't simply assert it like that. split_off
invokes move_suffix
which invokes move_edges
with a fixed dest_offset
of 1, which is out of bounds if right_new_len
is zero. Nothing is moved, no children's parents are corrected.
One may condemn split_off
for investing time to move an empty suffix but I'm opting for a stranger debug_assert right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed my mind, added an if-test on right_new_len
I also resolved your comments, as I agree that neither need be addressed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully a final round and then we can get this landed.
@@ -240,7 +240,7 @@ impl<K, V> Root<K, V> { | |||
/// Adds a new internal node with a single edge, pointing to the previous root, and make that | |||
/// new node the root. This increases the height by 1 and is the opposite of `pop_level`. | |||
pub fn push_level(&mut self) -> NodeRef<marker::Mut<'_>, K, V, marker::Internal> { | |||
debug_assert!(!self.is_shared_root()); | |||
debug_assert!(!self.is_shared_root()); // &mut to a static is impossible anyway |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see us drop the debug asserts that should be statically impossible (e.g. like here, where we have unique access).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So actually, not committing that (yet). Seems I should retract the comment instead. And perhaps mark unsafe. Or do you imply that &mut self
logically implies that the caller should know that the shared root is unacceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should retract this. I missed that we don't actually "own" the root here necessarily (i.e., we may have a NonNull or perhaps not)
pub fn new_kv(node: NodeRef<BorrowType, K, V, NodeType>, idx: usize) -> Self { | ||
// Necessary for correctness, but in a private module | ||
/// Creates a new handle to a key/value pair in `node`. | ||
/// Unsafe because the caller must ensure that `idx` be less than `node.len()`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Unsafe because the caller must ensure that `idx` be less than `node.len()`. | |
/// Unsafe because the caller must ensure that `idx < node.len()`. | |
/// Note that `node.len()` is always safe to call. |
Please verify that the second line is actually true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that node.len()
is safe. Everything I learnt over the past months tells me that it reads through a pointer that is always well aligned from memory that is always allocated, storing the value in a register or stack exclusive to the caller. But in the course of #58431, @RalfJung wrote in into_slices_mut
that len
wreaks havoc on things elsewhere and had to be avoided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, it is true that under stacked borrows a call to len() can invalidate some other reference, but that doesn't really make it unsafe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm reading up on stacked borrows, and two hours later I might have understood something, i.e. what you're asking. I guess invalidating is a lot more subtle than killing. Still bad enough that it prompted the rewrite of into_slices_mut
, but that was only to satisfy Miri and to avoid future trouble - there was no genuine UB. "Verify" means to check that Miri is happy executing the expressions in the debug_assert (while it normally skips those).
If that's right, then yes, the second line is actually true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also applied that comment style to new_edge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling
len
has no preconditon other than the argument being well-typed
Sure, but I was thinking of a precondition (or restriction, if precondition is the wrong term) on the unsafe new_kv
function. Though it does seem that if you start documenting that, you'd have to do that in a lot of functions for a lot of "interactions" with memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a precondition, I'd say it's part of the side-effects of this function. It's part of the side-effect of all &self
functions that calling it kills exclusive aliases to self
. This one here is special only insofar as the NodeRef
type is used instead of a "normal" reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I understand. "Note that node.len()
is always safe to call" does not go into what side effects there might be.
My answer to "that the second line is actually true" is simply: yes, NodeRef::len()
is not marked unsafe.
And Mark's point is that it is merely a hint to the caller, not a justification of the debug_assert, so there's not real reason to keep it nor to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let's drop it from here perhaps and make an explicit note on NodeRef::len()
? Unfortunately given the unsafety landscape today the "is not unsafe" part of NodeRef::len()
doesn't really actually play out to being necessarily indicative of lacking preconditions :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dropped it and recycled the bytes to somewhat wishy-washy comments in NodeRef::len()
. And not on NodeRef::height()
or any other &self
method...
Please also squash the commits into one while addressing the latest feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I transplanted the earlier demo into a commit on top of the current commits here, which gives branch btreemap_prefer_middle_demo. Testing btree::map::test_basic_small
clearly demonstrates to me that there is no protection against a &mut self
to the shared root, apart from the debug_assert inside into_key_slice_mut
. Even Miri doesn't object (though of course a little later Miri does object to transmuting the shared root to a leaf pointer).
Perhaps what you based on is that the node
field of a NodeRef
cannot be turned into a '&mut` on its own?
Okay, I think this is good to go modulo the |
I've become quite fond of the idea that marker::Mut should imply exclusivity. But alas, pretending to have a marker::Mut reference to the shared root is exactly what |
So really what I just wrote is that |
ee8f7a2
to
5aecf55
Compare
I tried split up the search code such that it never returns baloney handles, but it's not the only place where this happens: even the Drop implementation constructs a So such handles are not baloney but unavoidable, and marker::Mut cannot imply exclusivity. |
…itions Co-Authored-By: Mark Rousskov <mark.simulacrum@gmail.com>
5aecf55
to
ba87a50
Compare
I couldn't let it be, could I... Turns out it's relatively easy to get rid of marker::Mut/Owned |
@bors r+ Let's land this so we can make some progress, and we can continue iterating in future PRs if necessary. |
📌 Commit ba87a50 has been approved by |
…ark-Simulacrum BTreeMap: tag and explain unsafe internal functions or assert preconditions rust-lang#68418 concluded that it's not desirable to tag all internal functions with preconditions as being unsafe. This PR does it to some functions, documents why, and elsewhere enforces the preconditions with asserts.
Rollup of 6 pull requests Successful merges: - #66648 (Implement clone_from for BTreeMap and BTreeSet) - #68468 (BTreeMap: tag and explain unsafe internal functions or assert preconditions) - #68626 (Use termize instead of term_size) - #68640 (Document remaining undocumented `From` implementations for IPs) - #68651 (Document `From` implementation for NonZero nums) - #68655 (Fix revision annotations in borrowck-feature-nll-overrides-migrate) Failed merges: r? @ghost
#68418 concluded that it's not desirable to tag all internal functions with preconditions as being unsafe. This PR does it to some functions, documents why, and elsewhere enforces the preconditions with asserts.