Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for {BTreeMap,BTreeSet}::extract_if #70530

Open
3 of 5 tasks
ssomers opened this issue Mar 29, 2020 · 26 comments
Open
3 of 5 tasks

Tracking Issue for {BTreeMap,BTreeSet}::extract_if #70530

ssomers opened this issue Mar 29, 2020 · 26 comments
Labels
A-collections Area: `std::collection` B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@ssomers
Copy link
Contributor

ssomers commented Mar 29, 2020

This is a tracking issue for the Implementation of a extract_if method on BTreeMap and BTreeSet, similar to the one in LinkedList and in Vec (#43244).
The feature gate for the issue is #![feature(btree_extract_if)] (previously btree_drain_filter)

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also uses as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

  • Implement suboptimally, relying on an Ord bound: efficient for simple and common cases, but falling back on a coarse restart after complicated removals.
  • Possibly adjust the underlying tree representation (using Cells).
  • Implement all cases without relying on an Ord bound, tracking every adjustment
  • Adjust documentation (see instructions on rustc-dev-guide)
  • Stabilization PR (see instructions on rustc-dev-guide)

Unresolved Questions

Implementation history

@ssomers ssomers added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Mar 29, 2020
@jonas-schievink jonas-schievink added A-collections Area: `std::collection` B-unstable Blocker: Implemented in the nightly compiler and unstable. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Mar 29, 2020
@droundy
Copy link
Contributor

droundy commented Jul 8, 2020

Is there another issue for adding these to HashSet and HashMap?

@mbrubeck
Copy link
Contributor

mbrubeck commented Jul 8, 2020

Is there another issue for adding these to HashSet and HashMap?

That is #59618.

@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Jul 29, 2020
@simias
Copy link

simias commented Nov 13, 2020

I stumbled upon this issue from the docs and I must say that I'm a bit confused: I wanted to replace a HashMap with a BTreeMap in my code, but then I noticed that it didn't have a retain method unlike the hash counterpart. Then I noticed this experimental drain_filter method.

As far as I can tell isn't this method the same thing as HashMap's retain, just with the boolean result of the predicate complemented?

Looking at the APIs of BTreeMap, Vec and HashMap I find the following methods that all deal with filtering the contents of the collection:

Vec::retain: predicate that gets passed an &T
HashMap::retain: predicate that gets passed an &K, &mut V

Vec::drain_filter: predicate that gets passed an &mut T
BTreeMap::drain_filter (this issue): predicate that gets passed a &K, &mut V

LinkedList has drain_filter (taking a &mut T) but no retain. And then there's this old RFC from 2015 that proposes BTreeMap::retain: rust-lang/rfcs#1338

I genuinely don't get the logic behind these choices of mutability and naming.

In particular I really think that the APIs of HashMap and BTreeMap should match as much as practically possible since they are so conceptually similar and it's not uncommon to want to replace one by the other in hindsight (in particular if your ordering requirements change).

@simias
Copy link

simias commented Nov 13, 2020

Oh, I noticed an obvious difference between retain and drain_filter: the later returns an iterator. So now I understand the need for both, but I'm still surprised by the lack of consistency between collections when it comes to mutability and the lack of drain_filter for HashMaps.

@ssomers
Copy link
Contributor Author

ssomers commented Nov 13, 2020

When I first looked into it, I remember a drain_filter where the predicate had the opposite meaning. I liked the current one better: true means "yes, drain".

now I understand the need for both

If you really do, please explain, but I think you mean you understand why someone made retain and later someone made a more general drain_filter, probably for another container. At least that's how I hope it went.

@fogti
Copy link
Contributor

fogti commented Nov 13, 2020

I think if we could replace retain with an implementation solely based upon drain_filter, this would prove that retain is therefore redundant.

@simias
Copy link

simias commented Nov 13, 2020

Right, I meant that I understand why drain_filter is effectively a more powerful retain. I assumed that either retain came first and was kept for compatibility or that it was somehow more efficient.

I also found this issue in the meantime: #59618 so apparently HashMap is going to get a drain_filter too, which means that eventually the APIs will be less confusing and less weirdly incompatible between the two *Maps.

In the meantime it's a real hard scratcher, for me at least.

@mbrubeck
Copy link
Contributor

HashMap::drain_filter is already available in the nightly toolchain. It's behind a feature flag, like the other drain_filter methods. On stable Rust, you can use HashMap::drain_filter via hashbrown.

For the status of drain_filter in all the standard collection types, see rust-lang/rfcs#2140. For discussion of the naming and API design for drain_filter, see #43244, especially this comment.

@simias
Copy link

simias commented Nov 13, 2020

Thank you @mbrubeck, this was the context I lacked.

@mbrubeck
Copy link
Contributor

I submitted #79026 to add BTreeMap::retain and BTreeSet::retain, implemented in terms of drain_filter as suggested above.

@ssomers
Copy link
Contributor Author

ssomers commented Nov 13, 2020

if we could replace retain with an implementation solely based upon drain_filter

I think retain is the same as drain_filter with the predicate inverted and the result dropped, as in this playground demo, or even better, in a playground demo that isn't spectacularly wrong. But in the hashbrown code they are unrelated. Maybe that's to improve performance, because drain_filter does add some overhead for setting up and using an iterator. Either way, it's pretty easy to implement retain in BTreeMap/Set.

@ssomers
Copy link
Contributor Author

ssomers commented Nov 13, 2020

Note that this (nightly only) drain_filter in BTreeMap/Set is as optimal as I can imagine when you drain only a few elements, but definitely not when you drain a lot of elements. Of course, if you typically drain all but a few elements, you shouldn't be using drain_filter at all - better into_iter, into_keys or into_values the whole lot and divert the few ones you want to retain to a new map/set. But if you drain say half the elements, it's nowhere near as efficient as something like split_off. So I'm not sure this is good enough to become stable at all.

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Nov 13, 2020
Implement BTreeMap::retain and BTreeSet::retain

Adds new methods `BTreeMap::retain` and `BTreeSet::retain`.  These are implemented on top of `drain_filter` (rust-lang#70530).

The API of these methods is identical to `HashMap::retain` and `HashSet::retain`, which were implemented in rust-lang#39560 and stabilized in rust-lang#36648.  The docs and tests are also copied from HashMap/HashSet.

The new methods are unstable, behind the `btree_retain` feature gate, with tracking issue rust-lang#79025.  See also rust-lang/rfcs#1338.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Nov 13, 2020
Implement BTreeMap::retain and BTreeSet::retain

Adds new methods `BTreeMap::retain` and `BTreeSet::retain`.  These are implemented on top of `drain_filter` (rust-lang#70530).

The API of these methods is identical to `HashMap::retain` and `HashSet::retain`, which were implemented in rust-lang#39560 and stabilized in rust-lang#36648.  The docs and tests are also copied from HashMap/HashSet.

The new methods are unstable, behind the `btree_retain` feature gate, with tracking issue rust-lang#79025.  See also rust-lang/rfcs#1338.
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Nov 14, 2020
Implement BTreeMap::retain and BTreeSet::retain

Adds new methods `BTreeMap::retain` and `BTreeSet::retain`.  These are implemented on top of `drain_filter` (rust-lang#70530).

The API of these methods is identical to `HashMap::retain` and `HashSet::retain`, which were implemented in rust-lang#39560 and stabilized in rust-lang#36648.  The docs and tests are also copied from HashMap/HashSet.

The new methods are unstable, behind the `btree_retain` feature gate, with tracking issue rust-lang#79025.  See also rust-lang/rfcs#1338.
@maboesanman
Copy link

As it is now the documentation makes no claims about the order of the events returned from the iterator. Is this intended? having this iterator return in sorted order, or even returning a double ended iterator similar to range could be pretty useful.

@ssomers
Copy link
Contributor Author

ssomers commented Jan 24, 2021

It's not intended not to be made, this claim… I mean, I forgot to mention that, because it's guaranteed to come in ascending order. PS ⇒ #81434

@maboesanman
Copy link

it's certainly worth documenting that, but I also think it's worth making this into a double ended iterator.

@ssomers
Copy link
Contributor Author

ssomers commented Jan 24, 2021

It's definitely not easy to make drain_filter double-ended, and almost definitely possible. I wonder why though. Unlike iterators and range, drain_filter does not allow shortcutting, stopping when you've reached the elements you were interested in. You have to see the whole collection through, front to back. So by querying the length up front, it seems to me you can easily do the same thing you could do with a double-ended iterator.

@allada
Copy link

allada commented Nov 25, 2021

I just came across a use case for this method but found it could not handle the case where I need to run my predicate function on the map in reversed order. There are work arounds to this problem, but thought I'd notate potential pain pointers of users.

@kornelski
Copy link
Contributor

kornelski commented Feb 2, 2022

I've just had a use-case for something like this, but I needed to drain and filter a specific range of keys (remove certain values that are after a given key in the btreemap). drain on Vec takes a range. Maybe this one could too?

@vkrasnov
Copy link

+1 for @kornelski I most definitely have a use-case for a drain with range in btreemap

@kaimast
Copy link

kaimast commented Sep 6, 2022

I also use this to remove all entries within a specific range. I probably would would not need drain_filter if there was a drain function that took a range...

@the8472
Copy link
Member

the8472 commented Nov 9, 2022

If the iterator is only partially consumed or not consumed at all, each of the remaining elements is still subjected to the closure, which may change its value and, by returning true, have the element removed and dropped.

For the same reasons this is problematic in Vec and LinkedLists this is also problematic on the BTree collections.

#43244 (comment)
rust-lang/rfcs#3288

@Kixiron
Copy link
Member

Kixiron commented Mar 21, 2023

What's the status of this feature?

bors added a commit to rust-lang-ci/rust that referenced this issue Jun 15, 2023
Don't drain-on-drop in DrainFilter impls of various collections.

This removes drain-on-drop behavior from various unstable DrainFilter impls (not yet for HashSet/Map) because that behavior [is problematic](rust-lang#43244 (comment)) (because it can lead to panic-in-drop when user closures panic) and may become forbidden if [this draft RFC passes](rust-lang/rfcs#3288).

closes rust-lang#101122

[ACP](rust-lang/libs-team#136)

affected tracking issues
* rust-lang#43244
* rust-lang#70530
* rust-lang#59618

Related hashbrown update: rust-lang/hashbrown#374
@Nemo157 Nemo157 changed the title Tracking Issue for {BTreeMap,BTreeSet}::drain_filter Tracking Issue for {BTreeMap,BTreeSet}::extract_if Jun 16, 2023
@KonradHoeffner
Copy link
Contributor

This feature stopped working for me with the newest nightly version as of 2023-06-16.

@Nemo157
Copy link
Member

Nemo157 commented Jun 16, 2023

The feature was renamed in #104455.

@KonradHoeffner
Copy link
Contributor

Thanks! After replacing "btree_drain_filter" with "btree_extract_if" and "drain_filter" with "extract_if" the code compiles as before.

rwestphal added a commit to holo-routing/holo that referenced this issue Jun 18, 2023
sync w/ upstream

References:
* rust-lang/rust#104455
* rust-lang/rust#70530

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
andrefs added a commit to andrefs/chilon_rs that referenced this issue Sep 12, 2023
rwestphal added a commit to holo-routing/holo that referenced this issue Jan 25, 2024
sync w/ upstream

References:
* rust-lang/rust#104455
* rust-lang/rust#70530

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
@QuineDot
Copy link

QuineDot commented Jun 2, 2024

The ExtractIf definitions have lifetime bounds on the F parameter which, as far as I can tell, are unnecessary.

// BTreeMap version
pub struct ExtractIf<'a, K, V, F, A: Allocator + Clone = Global> where
    F: 'a + FnMut(&K, &mut V) -> bool,

Which causes this to fail without extra bounds, for example.

Edit: Eh, that was a bad example since it type-erased F and would thus need the bound anyway. (The outer covariance of &mut _ makes it hard to observe the bound in an uncontrived way.)

The T: 'a on BTreeSet can also probably be removed (it's implied on the extract_if method anyway).

The same is probably also true for linked_list::ExtractIf.

Inspired by this conversation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-collections Area: `std::collection` B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests