Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix OrdMap removal/range logic #154

Merged
merged 3 commits into from
Apr 29, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion benches/native.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ extern crate im;
extern crate rand;
extern crate test;

use rand::{rngs::SmallRng, SeedableRng, Rng};
use rand::{rngs::SmallRng, Rng, SeedableRng};
use std::collections::{BTreeMap, HashMap, VecDeque};
use std::iter::FromIterator;
use test::Bencher;
Expand Down
2 changes: 1 addition & 1 deletion src/hash/set.rs
Original file line number Diff line number Diff line change
Expand Up @@ -33,8 +33,8 @@ use std::ops::{Add, Deref, Mul};

use crate::nodes::hamt::{hash_key, Drain as NodeDrain, HashValue, Iter as NodeIter, Node};
use crate::ordset::OrdSet;
use crate::Vector;
use crate::util::{Pool, PoolRef, Ref};
use crate::Vector;

/// Construct a set from a sequence of values.
///
Expand Down
154 changes: 89 additions & 65 deletions src/nodes/btree.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ use crate::util::{Pool, PoolClone, PoolDefault, PoolRef};
use self::Insert::*;
use self::InsertAction::*;

const NODE_SIZE: usize = NodeSize::USIZE;
pub(crate) const NODE_SIZE: usize = NodeSize::USIZE;
const MEDIAN: usize = (NODE_SIZE + 1) >> 1;

pub trait BTreeValue {
Expand Down Expand Up @@ -92,9 +92,14 @@ pub(crate) enum Remove<A> {
Update(A, Node<A>),
}

enum Boundary {
Lowest,
Highest,
}

enum RemoveAction {
DeleteAt(usize),
PullUp(usize, usize, usize),
PullUp(Boundary, usize, usize),
Merge(usize),
StealFromLeft(usize),
StealFromRight(usize),
Expand Down Expand Up @@ -395,7 +400,17 @@ impl<A: BTreeValue> Node<A> {
path.push((self, index));
path
}
None => Vec::new(),
None => {
// go back up to find next
while let Some((node, idx)) = path.last() {
if node.keys.len() == *idx {
path.pop();
} else {
break;
}
}
path
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix seems to be needed in lookup_prev and lookup_next as well. Currently, OrdMap::get_prev and OrdMap::get_next may return None when they are not supposed to.

Copy link

@xu-cheng xu-cheng Feb 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I described above can be demonstrated using a test case similar to the one in the below:

assert_eq!(omap.range(..=i).next_back(), omap.get_prev(&i));
assert_eq!(omap.range(i..).next(), omap.get_next(&i));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Good observation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arthurprs Any chance to update this PR to address it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not right now.

The author seemingly abandoned this project so eventually I switched to https://github.com/orium/rpds instead. It has an arguably worse API but at least it's correct/maintained.

},
Some(ref node) => {
path.push((self, index));
Expand Down Expand Up @@ -424,14 +439,22 @@ impl<A: BTreeValue> Node<A> {
path
}
Err(index) => match self.children[index] {
None if index == 0 => Vec::new(),
None => match self.keys.get(index - 1) {
Some(_) => {
path.push((self, index - 1));
path
None if index == 0 => {
// go back up to find prev
while let Some((_, idx)) = path.last_mut() {
if *idx == 0 {
path.pop();
} else {
*idx -= 1;
break;
}
}
None => Vec::new(),
},
path
}
None => {
path.push((self, index - 1));
path
}
Some(ref node) => {
path.push((self, index));
node.path_prev(key, path)
Expand Down Expand Up @@ -618,14 +641,32 @@ impl<A: BTreeValue> Node<A> {
A::Key: Borrow<BK>,
{
let index = A::search_key(&self.keys, key);
self.remove_index(pool, index, key)
self.remove_index(pool, index, Ok(key))
}

fn remove_target<BK>(
&mut self,
pool: &Pool<Node<A>>,
target: Result<&BK, Boundary>,
) -> Remove<A>
where
A: Clone,
BK: Ord + ?Sized,
A::Key: Borrow<BK>,
{
let index = match target {
Ok(key) => A::search_key(&self.keys, key),
Err(Boundary::Lowest) => Err(0),
Err(Boundary::Highest) => Err(self.keys.len()),
};
self.remove_index(pool, index, target)
}

fn remove_index<BK>(
&mut self,
pool: &Pool<Node<A>>,
index: Result<usize, usize>,
key: &BK,
target: Result<&BK, Boundary>,
) -> Remove<A>
where
A: Clone,
Expand All @@ -638,41 +679,30 @@ impl<A: BTreeValue> Node<A> {
match (&self.children[index], &self.children[index + 1]) {
// If we're a leaf, just delete the entry.
(&None, &None) => RemoveAction::DeleteAt(index),
// Right is empty. Attempt to steal from left if enough capacity,
// otherwise pull the predecessor up.
(&Some(ref left), &None) => {
if !left.too_small() {
RemoveAction::StealFromLeft(index + 1)
} else {
RemoveAction::PullUp(left.keys.len() - 1, index, index)
}
}
// Left is empty. Attempt to steal from right if enough capacity,
// otherwise pull the successor up.
(&None, &Some(ref right)) => {
if !right.too_small() {
RemoveAction::StealFromRight(index)
} else {
RemoveAction::PullUp(0, index, index + 1)
}
}
// Both left and right are non empty. Attempt to steal from left or
// right if enough capacity, otherwise just merge the children.
// First consider pulling either predecessor (from left) or successor (from right).
// otherwise just merge the two small children.
(&Some(ref left), &Some(ref right)) => {
if left.has_room() && !right.too_small() {
RemoveAction::StealFromRight(index)
} else if right.has_room() && !left.too_small() {
RemoveAction::StealFromLeft(index + 1)
if !left.too_small() {
RemoveAction::PullUp(Boundary::Highest, index, index)
} else if !right.too_small() {
RemoveAction::PullUp(Boundary::Lowest, index, index + 1)
} else {
RemoveAction::Merge(index)
}
}
_ => unreachable!("Branch missing children"),
}
}
// Key is adjacent to some key in node
// Target is adjacent to some key in node
Err(index) => match self.children[index] {
// No child at location means key isn't in map.
None => return Remove::NoChange,
// We're deading with a leaf node
None => match target {
// No child at location means key isn't in map.
Ok(_key) => return Remove::NoChange,
// Looking for the lowest or highest key
Err(Boundary::Lowest) => RemoveAction::DeleteAt(0),
Err(Boundary::Highest) => RemoveAction::DeleteAt(self.keys.len() - 1),
},
// Child at location, but it's at minimum capacity.
Some(ref child) if child.too_small() => {
let left = if index > 0 {
Expand Down Expand Up @@ -708,13 +738,13 @@ impl<A: BTreeValue> Node<A> {
self.children.remove(index);
Remove::Removed(pair)
}
RemoveAction::PullUp(target_index, pull_to, child_index) => {
RemoveAction::PullUp(boundary, pull_to, child_index) => {
let children = &mut self.children;
let mut update = None;
let value;
if let Some(&mut Some(ref mut child_ref)) = children.get_mut(child_index) {
let child = PoolRef::make_mut(pool, child_ref);
match child.remove_index(pool, Ok(target_index), key) {
match child.remove_target(pool, Err(boundary)) {
Remove::NoChange => unreachable!(),
Remove::Removed(pulled_value) => {
value = self.keys.set(pull_to, pulled_value);
Expand All @@ -741,7 +771,7 @@ impl<A: BTreeValue> Node<A> {
PoolRef::unwrap_or_clone(left),
PoolRef::unwrap_or_clone(right),
);
let (removed, new_child) = match merged_child.remove(pool, key) {
let (removed, new_child) = match merged_child.remove_target(pool, target) {
Remove::NoChange => unreachable!(),
Remove::Removed(removed) => (removed, merged_child),
Remove::Update(removed, updated_child) => (removed, updated_child),
Expand All @@ -760,21 +790,15 @@ impl<A: BTreeValue> Node<A> {
{
let mut children = self.children.as_mut_slice()[index - 1..=index]
.iter_mut()
.map(|n| {
if let Some(ref mut o) = *n {
o
} else {
unreachable!()
}
});
.map(|n| n.as_mut().unwrap());
let left = PoolRef::make_mut(pool, children.next().unwrap());
let child = PoolRef::make_mut(pool, children.next().unwrap());
// Prepare the rebalanced node.
child.push_min(
left.children.last().unwrap().clone(),
self.keys[index - 1].clone(),
);
match child.remove(pool, key) {
match child.remove_target(pool, target) {
Remove::NoChange => {
// Key wasn't there, we need to revert the steal.
child.pop_min();
Expand Down Expand Up @@ -806,18 +830,12 @@ impl<A: BTreeValue> Node<A> {
{
let mut children = self.children.as_mut_slice()[index..index + 2]
.iter_mut()
.map(|n| {
if let Some(ref mut o) = *n {
o
} else {
unreachable!()
}
});
.map(|n| n.as_mut().unwrap());
let child = PoolRef::make_mut(pool, children.next().unwrap());
let right = PoolRef::make_mut(pool, children.next().unwrap());
// Prepare the rebalanced node.
child.push_max(right.children[0].clone(), self.keys[index].clone());
match child.remove(pool, key) {
match child.remove_target(pool, target) {
Remove::NoChange => {
// Key wasn't there, we need to revert the steal.
child.pop_max();
Expand All @@ -844,11 +862,17 @@ impl<A: BTreeValue> Node<A> {
Remove::Removed(out_value)
}
RemoveAction::MergeFirst(index) => {
if self.keys[index].cmp_keys(key) != Ordering::Equal
&& !self.child_contains(index, key)
&& !self.child_contains(index + 1, key)
{
return Remove::NoChange;
if let Ok(key) = target {
// Bail early if we're looking for a not existing key
match self.keys[index].cmp_keys(key) {
Ordering::Less if !self.child_contains(index + 1, key) => {
return Remove::NoChange
}
Ordering::Greater if !self.child_contains(index, key) => {
return Remove::NoChange
}
_ => (),
}
}
let left = self.children.remove(index).unwrap();
let right = mem::replace(&mut self.children[index], None).unwrap();
Expand All @@ -860,7 +884,7 @@ impl<A: BTreeValue> Node<A> {
);
let update;
let out_value;
match merged.remove(pool, key) {
match merged.remove_target(pool, target) {
Remove::NoChange => {
panic!("nodes::btree::Node::remove: caught an absent key too late while merging");
}
Expand All @@ -887,7 +911,7 @@ impl<A: BTreeValue> Node<A> {
let out_value;
if let Some(&mut Some(ref mut child_ref)) = self.children.get_mut(index) {
let child = PoolRef::make_mut(pool, child_ref);
match child.remove(pool, key) {
match child.remove_target(pool, target) {
Remove::NoChange => return Remove::NoChange,
Remove::Removed(value) => {
out_value = value;
Expand Down
24 changes: 24 additions & 0 deletions src/ord/map.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2381,6 +2381,30 @@ mod test {
assert_eq!(vec![(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)], range);
}

#[test]
fn range_iter_big() {
use crate::nodes::btree::NODE_SIZE;
use std::ops::Bound::Included;
const N: usize = NODE_SIZE * NODE_SIZE * 5; // enough for a sizeable 3 level tree

let data = (1usize..N).filter(|i| i % 2 == 0).map(|i| (i, ()));
let bmap = data
.clone()
.collect::<std::collections::BTreeMap<usize, ()>>();
let omap = data.collect::<OrdMap<usize, ()>>();

for i in (0..NODE_SIZE * 5).chain(N - NODE_SIZE * 5..=N + 1) {
assert_eq!(omap.range(i..).count(), bmap.range(i..).count());
assert_eq!(omap.range(..i).count(), bmap.range(..i).count());
assert_eq!(omap.range(i..i + 7).count(), bmap.range(i..i + 7).count());
assert_eq!(omap.range(i..=i + 7).count(), bmap.range(i..=i + 7).count());
assert_eq!(
omap.range((Included(i), Included(i + 7))).count(),
bmap.range((Included(i), Included(i + 7))).count(),
);
}
}

#[test]
fn issue_124() {
let mut map = OrdMap::new();
Expand Down