Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BTreeMap: consider leaf nodes to have empty edges #77025

Closed
wants to merge 1 commit into from

Conversation

ssomers
Copy link
Contributor

@ssomers ssomers commented Sep 21, 2020

Tried to make internal and leaf nodes more alike, because:

  • The btree code is ambiguous about leaf edges (i.e., edges within leaf nodes). Iteration relies on them heavily, but some of the comments in node.rs suggest there are no leaf edges. In my definition of edges, internal edges contain a pointer to a child node, leaf edges don't.
  • The btree code splits up / repeats some methods according to whether the node is leaf or internal, while the only difference is the kind of leaf edge contents.

r? @Mark-Simulacrum

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 21, 2020
@ssomers
Copy link
Contributor Author

ssomers commented Sep 21, 2020

Some of the stuff split off into #77005. The rest is a little disappointing. The only way I found to make gdb_providers work is obnoxious. I couldn't python to access the InternalNode type, no matter how I try to wrap it.

Performance of insertion may be better for reasonably sized key/values.

 name                                           r3 ns/iter  r4 ns/iter  diff ns/iter   diff %  speedup
 btree::map::clone_fat_100                      45,305      52,676             7,371   16.27%   x 0.86
 btree::map::clone_fat_100_and_clear            45,755      51,880             6,125   13.39%   x 0.88
 btree::map::clone_fat_val_100                  23,963      27,126             3,163   13.20%   x 0.88
 btree::map::clone_fat_val_100_and_clear        23,618      27,295             3,677   15.57%   x 0.87
 btree::map::clone_fat_val_100_and_into_iter    34,394      38,174             3,780   10.99%   x 0.90
 btree::map::clone_slim_10k                     231,170     207,235          -23,935  -10.35%   x 1.12
 btree::map::clone_slim_10k_and_into_iter       230,603     207,045          -23,558  -10.22%   x 1.11
 btree::map::first_and_last_10k                 68          59                    -9  -13.24%   x 1.15
 btree::map::range_included_included            473,273     418,305          -54,968  -11.61%   x 1.13
 btree::set::difference_staggered_100_vs_100    694         905                  211   30.40%   x 0.77
 btree::set::difference_staggered_10k_vs_10k    68,588      85,926            17,338   25.28%   x 0.80
 btree::set::intersection_staggered_10k_vs_10k  61,094      68,490             7,396   12.11%   x 0.89
 btree::set::is_subset_100_vs_10k               1,197       1,338                141   11.78%   x 0.89

@ssomers ssomers force-pushed the btree_NodeTypeTrait branch from ec77ae3 to 0a96e95 Compare September 24, 2020 11:57
@ssomers
Copy link
Contributor Author

ssomers commented Sep 24, 2020

Found tricks to simplify the gdb introspection. The result is still a bit disappointing but acceptable to me.

@ssomers ssomers marked this pull request as ready for review September 24, 2020 12:07
@Mark-Simulacrum
Copy link
Member

I'll wait for #77005 to land before going ahead to review this.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 24, 2020
@ssomers
Copy link
Contributor Author

ssomers commented Sep 25, 2020

And I'll wait for #77005 to land to rebase.

@ssomers ssomers force-pushed the btree_NodeTypeTrait branch 2 times, most recently from 3479bf6 to 3ed17fb Compare September 26, 2020 11:07
impl<K, V, E> UniversalNode<K, V, E> {
/// Creates a new `Node`. Unsafe because all nodes should really be hidden behind
/// `BoxedNode`, preventing accidental dropping of uninitialized keys and values.
unsafe fn new() -> Box<Self> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the unsafe (that I transplanted from LeafNode::new) is silly. The function does nothing unsafe. The value returned does not own any keys or values yet. I'm pretty sure that after wrapping it in BoxedNode, you could still wreak havoc without invoking any unsafe function.

@ssomers ssomers force-pushed the btree_NodeTypeTrait branch from 3ed17fb to 553d129 Compare September 26, 2020 12:36
@Mark-Simulacrum
Copy link
Member

Hm, so I'm not sure this is a win. The node code is already pretty scattered (at least in my mind) -- and introducing another type parameter that readers need to figure out feels pretty bad, especially because it doesn't look like it really cleans anything up in a major way, just introduces additional indirection (at the type-level, at least).

Can you say more about the motivation for this? Perhaps there's an end goal in mind here that I'm not yet seeing?

@ssomers
Copy link
Contributor Author

ssomers commented Sep 26, 2020

The motivation is in this PR's description. The first point can be covered by adapting comments, the second point is somewhat addressed by the leafy_ methods, but they complicate matters too because the length is adjusted before the edges are treated. I think the methods look better here, but the type overhead is more than I hoped.

@ssomers
Copy link
Contributor Author

ssomers commented Sep 26, 2020

Or very concretely, if you try to treat edges in the split of internal nodes just like leafy_split treats the other arrays:

            ptr::copy_nonoverlapping(
                self.node.edge_at(self.idx + 1),
                MaybeUninit::slice_as_mut_ptr(&mut new_node.edges),
                new_len + 1,
            );

you end up realizing that the debug_assert in edge_at fails because length was already reduced. And it's not trivial to find out because any assert causes panic in drops, without a trace of the original problem in the stack. But that's another issue.

@ssomers ssomers marked this pull request as draft September 28, 2020 11:49
@ssomers
Copy link
Contributor Author

ssomers commented Oct 1, 2020

Failing to find ways to make this more appealing

@ssomers ssomers closed this Oct 1, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Oct 5, 2020
…mments, r=Mark-Simulacrum

BTreeMap: admit the existence of leaf edges in comments

The btree code is ambiguous about leaf edges (i.e., edges within leaf nodes). Iteration relies on them heavily, but some of the comments suggest there are no leaf edges (extracted from rust-lang#77025)

r? @Mark-Simulacrum
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants