Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entry API equivalent for Sets #1490

Closed
Gankra opened this issue Feb 5, 2016 · 16 comments · Fixed by rust-lang/rust#120077 or rust-lang/rust#133548
Closed

Entry API equivalent for Sets #1490

Gankra opened this issue Feb 5, 2016 · 16 comments · Fixed by rust-lang/rust#120077 or rust-lang/rust#133548
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.

Comments

@Gankra
Copy link
Contributor

Gankra commented Feb 5, 2016

By merging RFC 1194, set recovery we have acknowledged that the values of keys "matter". That is, it's reasonable to have an equal key, but want to know about the details of the stored key.

That RFC added fn get(&T) -> Option<&T>, take(&T) -> Option<T>, and replace(T) -> Option<T>.

However, what if I have an entry-like situation?

Today, this is the best we can do:

fn get_or_insert(set: &mut HashSet<Key>, key: Key) -> &Key {
  let dupe = key.clone();
  if !set.contains(&key) {
    set.insert(key)
  }
  set.get(&dupe).unwrap();
}

Not only do we incur double-lookup (triple-lookup in the insertion case!), we also incur an unconditional Clone even though we already had a by-value key!

Optimally, we could write

fn get_or_insert(set: &mut HashSet<Key>, key: Key) -> &Key {
  set.entry(key).into_ref()
}

What's the entry API for sets? Well, a heck of a lot simpler. The entry API on maps is all about deferred value handling, and that doesn't make sense for sets.

  • Vacant::insert and Occupied::insert don't make sense because we already have the key
  • Occupied::get_mut and into_mut don't make sense because we don't acknowledge key mutation
  • Occupied::get and into_ref (to mirror into_mut), and remove are the only ones that make sense
  • It may also make sense to provide something like replace() to explicitly overwrite the old key... or something..?

So basically it would be something like entry(K) -> WasVacant(Entry) | WasOccupied(Entry). Critically, you get the same interface no matter what state the world was in, because there's nothing to do in the Vacant case but insert what was already given.

Supporting this would probably mean expanding the Entry API to "care about keys".

I haven't thought about the full implications here, and I don't have the bandwidth to write a full RFC at the moment.

@ticki
Copy link
Contributor

ticki commented Feb 5, 2016

👍 Needed this today.

@apasel422
Copy link
Contributor

+1

@Robbepop
Copy link
Contributor

+1 and thanks apasel422 for linking my PR and pointing me to this RFC! ;)

@nrc nrc added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Aug 18, 2016
@Centril
Copy link
Contributor

Centril commented Dec 28, 2016

Also needed this today, specifically:

let set: HashSet<String> = HashSet::new();
let ... = set.entry("the_key").or_insert(|| String::new("the_key"));

@jplatte
Copy link
Contributor

jplatte commented Mar 28, 2017

I had a discussion about this on reddit today, and assumed that because replace is a thing, insert was meant to not replace an existing key. The current implementation, AFAICT, doesn't replace the key as expected. However given the "best we can do" scenario @gankro wrote, I'm now unsure about this. Is key replacement in insert deliberately left unspecified? Or is there something else that I am missing that makes the "best we can do" code behave differently than the following:

fn get_or_insert(set: &mut HashSet<Key>, key: Key) -> &Key {
    let dupe = key.clone();
    set.insert(key);
    set.get(&dupe).unwrap()
}

@izderadicka
Copy link

+1

@fschutt
Copy link

fschutt commented Dec 16, 2017

I needed this today. It's a shame Rust doesn't have this. Please add it to sets.

@resilar
Copy link

resilar commented Mar 10, 2018

+1, this would allow a safe zero-copy implementation of my makeuniq.rs script.

@strega-nil
Copy link

For string interning, this is a very useful feature, and I hit into it today.

@Lucretiel
Copy link

Lucretiel commented Oct 12, 2018

Would definitely like to see this! I have a use case where even if the keys don't "matter", it's still useful to "insert a value and get a reference to either the existing value or inserted value". I'm working on an iterator adapter that filters out duplicates, and without an entry API, there's either an unnecessary lookup or an unnecessary clone:

struct Dedupe<I: Iterator>
    where I::Item: Eq + Hash + Clone {
    iter: I,
    seen: HashSet<I::Item>
}

impl<I: Iterator> Iterator for Dedupe<I>
    where I::Item: Eq + Hash + Clone {
    fn next(&mut self) -> Option<Self::Item> {
        loop {
            let item = self.iter.next()?;
            // Alternatively, do a contains() followed by insert()
            if self.seen.insert(item.clone()) {
                break Some(item);
            }
        }
    }
}

With the entry API, you could do:

fn next(&mut self) -> Option<Self::Item> {
    loop {
        if let WasVacant(item) = self.seen.entry(self.iter.next()?) {
            break Some(item.clone());   // Clone only on a cache miss
        }
    }
}

Essentially, there's a class of use case where you want to check if an T is present, insert it if not, then continue working with it as an &T without having to duplicate the lookup. This use case exists even if the "matteringness" of a particular key vs an equal key doesn't exist.

@clayrab
Copy link

clayrab commented Apr 25, 2022

+1

@SUPERCILEX
Copy link

This happened: rust-lang/hashbrown#342

@amunra
Copy link

amunra commented Dec 29, 2023

Needed this today for BTreeSet, specifically to avoid a .clone() of the key.

For my specific case, the workaround is:

if !set.contains(key) {  // N.B., `key` here is a `Borrow<..>` ref.
    set.insert(key.clone());
}

but this is suboptimal.

@cuviper
Copy link
Member

cuviper commented Dec 29, 2023

The regular Entry API doesn't help avoid that clone, because it always takes the key by value. You would need something more like HashMap::raw_entry_mut (rust-lang/rust#56167) or BTreeMap cursors (rust-lang/rust#107540).

@programmerjake
Copy link
Member

tracking issue for hashset: rust-lang/rust#60896

also, can this issue be reopened, BTreeSet::entry is still missing.

@cuviper
Copy link
Member

cuviper commented Nov 27, 2024

See rust-lang/rust#133548 for BTreeSet.

rust-timer added a commit to rust-lang-ci/rust that referenced this issue Nov 30, 2024
Rollup merge of rust-lang#133548 - cuviper:btreeset-entry-api, r=Mark-Simulacrum

Add `BTreeSet` entry APIs to match `HashSet`

The following methods are added, along with the corresponding `Entry` implementation.

```rust
impl<T, A: Allocator + Clone> BTreeSet<T, A> {
    pub fn get_or_insert(&mut self, value: T) -> &T
    where
        T: Ord,
    {...}
    pub fn get_or_insert_with<Q: ?Sized, F>(&mut self, value: &Q, f: F) -> &T
    where
        T: Borrow<Q> + Ord,
        Q: Ord,
        F: FnOnce(&Q) -> T,
    {...}

    pub fn entry(&mut self, value: T) -> Entry<'_, T, A>
    where
        T: Ord,
    {...}
}
```

Tracking issue rust-lang#133549
Closes rust-lang/rfcs#1490
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging a pull request may close this issue.