Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TreeMap: find_equiv(_mut) for str-like keys #15220

Closed
wants to merge 1 commit into from

Conversation

vhbit
Copy link
Contributor

@vhbit vhbit commented Jun 27, 2014

  • it allows to lookup using any str-equiv object, making TreeMaps finally usable (for example, it is much easier to work with JSON with lookup values being static strs)
  • actually provides pretty flexible solution which could be extended to other equivalent types (although it might be not that performant)

@alexcrichton
Copy link
Member

This seems to be taking issues like #12135 to the extreme by having a super-specialized implementation for just strings on keys that generally want Equiv. I'm not sure that Equiv is the right interface for a TreeMap, and it's true that this is certainly solving a problem that cannot otherwise be done today.

I would personally be more in favor of finding a better TreeMap-specific solution or sorting through the Equiv problem. For example, maybe a TreeMap could expose some form of binary search interface?

For previous discussion, see #13205, #11074, #11691, #12135, and #14968.

@vhbit
Copy link
Contributor Author

vhbit commented Jun 29, 2014

There is nothing could be done through Equiv.

One of solutions is to modify Ord to accept instead of type itself possibly a AutoCoercion trait which will describe how one value could be "converted" to "familiar" type so they could be compared easily.

@alexcrichton
Copy link
Member

That is correct that Equiv won't work, but it's akin to that were a specialized subset of keys have some fancy functionality, and not even all the functionality you may want. For finding operations it seems exposing the BST-nature of the data structure would be useful?

@vhbit
Copy link
Contributor Author

vhbit commented Jun 30, 2014

It seems I'm missing something, the essence of BST-nature is Ord, isn't it? It is impossible to avoid it, so the way to go must be changing what is compared.

So there could be a couple of cases:

  • wrapped data - when the same data is wrapped and to compare we have to strip it down to essence. It is the Str case
  • data with no "internal" similarities - in this case it the problem could be solved by mapping value to a type, which conforms to Ord and use it for comparison instead of original one (actually wrapped-data example above is just a special case when value is mapped to internal representation). For example,
trait ToFastHash {
    fn to_fast_hash(&self) -> u64;
}

or more general

trait ToOrdValue<T:Ord> {
    fn to_ord_value(&self) -> T;
}

although I haven't tested yet if the second one will work in Rust now.

@alexcrichton
Copy link
Member

I was thinking something more along the lines of:

fn find_closure<'a>(&'a self, f: |&K| -> Ordering) -> Option<&'a V>;
fn find_mut_closure<'a>(&'a mut self, f: |&K| -> Ordering) -> Option<&'a mut V>;

That way you tell the tree what you're doing and with a closure you can compare against anything you may want.

@vhbit
Copy link
Contributor Author

vhbit commented Jun 30, 2014

I'm a bit confused here as actually it is exactly what this patch does (although my naming is not that good):

  • absolutely general tree_find_(mut)?_with(node: &'r Option<Box<TreeNode<K, V>>>, f: |&K| -> Ordering)
  • specialized find_equiv for Str which is just provides corresponding closure
  • and there is still place for providing find_equiv for any other trait (for example, ToOrdValue) using the same tree_find_with as foundation.

@alexcrichton
Copy link
Member

I was more talking about the public-facing API. The patch specializes string-like keys, and no others. I was thinking that the finding functions would be exposed for all keys so if I had a map of strings I could possibly do something like a case-insensitive lookup.

@vhbit
Copy link
Contributor Author

vhbit commented Jun 30, 2014

Aha, got it, sorry for misunderstanding :-)

Sounds good. Although I still vote for providing both ways:

  • find_equiv: for the most required stuff (possible criteria - what is exposed through stdlib, like TreeMap<String,...> in Json)
  • find_with: for everything else as a last but extremely powerful resort.

@alexcrichton
Copy link
Member

I'd prefer to not single-out string-like keys specifically, which may mean axing find_equiv, but find_with I would be ok with.

@vhbit
Copy link
Contributor Author

vhbit commented Jul 4, 2014

Sorry for a delay, here is update.

I'm still haven't axed find_equiv as for simple use cases like everyday JSON querying through find_closure is too heavy. While current implementation favors Str it still allows to introduce find_equiv method for other types later meanwhile providing a practical solution.

One more related issue: #14549

@alexcrichton
Copy link
Member

The TreeMap is a general data structure useful for more types other than just json. For this reason it seems odd to add string-specific functionality for the one json use case while it doesn't benefit other possible use cases.

I would be more in favor of enhancing the json module with any helper functions while only having the closure-based function in the treemap module. Additionally, I think these changes should also be made:

  • Most closure-based function on hashmap use the suffix with rather than closure
  • The closure-based functions should be public
  • The closure-based functions should be documented

@vhbit
Copy link
Contributor Author

vhbit commented Jul 6, 2014

The TreeMap is a general data structure useful for more types other than just json.

It's exactly my point. I can't imagine why someone will need to re-implement those json helper functions in their own modules.

For this reason it seems odd to add string-specific functionality for the one json use case while it doesn't benefit other possible use cases.

While I mention json a lot, it is actually much broader issue - I believe str is one of primitive types and I as developer expect it to be fully supported (no matter of allocation) in stdlib, especially in all kinds of maps as one of the most important collection primitives. I have quite a lot of string "indexing" in my own data structures and while TreeMap fits use case better, right now I'm teased to use HashMap just because find_equiv.

So it is all about opportunity to compose a lot of "primitives" (strings, maps) together out of box without writing any additional boilerplate code.

In my opinion, it is all about libcollections. It might be renaming find_equiv to find_str for TreeMap<Str, V> if equiv part is too confusing. It also might be
creating a separate StringMap, but it looks for me like a bit of overkill in terms of how much code is required to provide the same interface as TreeMap does.

@thestinger
Copy link
Contributor

These containers should just expose methods taking a closure instead of using the trait. It doesn't make sense to have a trait like Equiv or special-case hacks.

While I mention json a lot, it is actually much broader issue - I believe str is one of primitive types and I as developer expect it to be fully supported (no matter of allocation) in stdlib, especially in all kinds of maps as one of the most important collection primitives. I have quite a lot of string "indexing" in my own data structures and while TreeMap fits use case better, right now I'm teased to use HashMap just because find_equiv.

Strings aren't meant to be special beyond the support for string literals. Special cases for certain primitives types is bad language design. A rope or small string type should be just as flexible, and anything preventing that is a bug.

@vhbit
Copy link
Contributor Author

vhbit commented Jul 8, 2014

@thestinger,

These containers should just expose methods taking a closure instead of using the trait.

Yep, definitely this will work, but it actually addresses only the second part of "Simple things should be simple, complex things should be possible". Consider the following examples:

/// Current way
// key1, key2: String;
let mut t = TreeMap::new()
t.insert(key1, 34);
t.insert(key2, 45);
let v = t.find(&"test".to_string()); // Ohh, allocating?!

/// Closures 
// key1, key2: String;
let mut t = TreeMap::new()
t.insert(key1, 34);
t.insert(key2, 45);
let v = t.find_with(|k| "test".cmp(&k.as_slice()); // Ohh, and that for every key to request?

/// Proposed 
//  key1, key2: CustomType<Str>;
let mut t = TreeMap::new()
t.insert(key1, 34);
t.insert(key2, 45);
let v = t.find_equiv(&"test");

It could be much easier if there was a way to return a closure, in this case it could be done as:

// str_eq<T:Str, V: Str>(s: &T) -> |v: &V| -> Ordering
t.find_with(str_eq("test"))

which is not that simple, but still is quite nice. But I've failed to get anything like this to work using current closures. Any hints are appreciated.

Special cases for certain primitives types is bad language design. A rope or small string type should be just as flexible, and anything preventing that is a bug.

It seems I wasn't clear. While I've mentioned str the code is actually about Str, which is just another trait. So if user's custom rope or small string type implements that trait (and it probably will if he wants it to co-operate with stdlib) - it just works. Again, it is all about stdlib, not about language.

@vhbit
Copy link
Contributor Author

vhbit commented Jul 8, 2014

@alexcrichton, documented, squashed, will appreciate comments on wording as writing docs in English sometimes blows my mind.

And also waiting for the final decision regarding find_equiv for Str.

///
/// assert_eq!(ua.unwrap().as_slice(), "Curl-Rust/0.1");
/// ```
#[allow(dead_code)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dead_code allow shouldn't be necessary.

@vhbit
Copy link
Contributor Author

vhbit commented Jul 9, 2014

Axed Str, updated samples/tests

@alexcrichton
Copy link
Member

Needs a rebase against #15540

@alexcrichton
Copy link
Member

But otherwise looks good to me!

find_with/find_mut_with which use provided closure for navigating tree
and searching as flexible as possible
@vhbit
Copy link
Contributor Author

vhbit commented Jul 9, 2014

Rebased

bors added a commit that referenced this pull request Jul 9, 2014
- it allows to lookup using any str-equiv object, making TreeMaps finally usable (for example, it is much easier to work with JSON with lookup values being static strs)
- actually provides pretty flexible solution which could be extended to other equivalent types (although it might be not that performant)
@bors bors closed this Jul 9, 2014
//
// The only difference is that non-mutable version uses loop instead
// of recursion (performance considerations)
// It seems to be impossible to avoid recursion with mutability
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer the case. The mutable version is almost identical now, except for a temporary variable to appease borrowck

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, fixed in #15749

@vhbit vhbit deleted the treemap-str-equiv branch July 11, 2014 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants