Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

maximum on dictionary returns max key instead of max val #14672

Closed
skariel opened this issue Jan 14, 2016 · 10 comments
Closed

maximum on dictionary returns max key instead of max val #14672

skariel opened this issue Jan 14, 2016 · 10 comments
Assignees
Labels
collections Data structures holding multiple items, e.g. sets
Milestone

Comments

@skariel
Copy link
Contributor

skariel commented Jan 14, 2016

I don't really know how this works in Matlab, but I find the current situation extremely confusing and error prone:

d = Dict{Int,Int}(1=>2,3=>6,5=>10,7=>8)
maximum(d)    # gives pair 7,8 when I was expecting 5,10

finding the maximum of the keys is easy with maximum(keys(d)). Finding the key with the maximum value on the other hand currently requires writing a small function. Unless I'm missing something of course :)

So, is this behaviour intentional?

@skariel skariel changed the title maximum on dictionary return max key instead of max val maximum on dictionary returns max key instead of max val Jan 14, 2016
@skariel
Copy link
Contributor Author

skariel commented Jan 14, 2016

here's a possible function that does what I would like the default maximum to do on a dict:

function mymax{K,V}(d::Dict{K,V})
    @assert length(d) > 0
    maxfound = false
    maxkey = zero(K)
    maxval = zero(V)
    for (k,v) in d 
        if !maxfound
            maxfound = true
            maxkey = k
            maxval = d[k]
        elseif d[k] > maxval
            maxkey = k
            maxval = d[k]
        end
    end
    Pair(maxkey,maxval)
end

@tkelman
Copy link
Contributor

tkelman commented Jan 14, 2016

That won't give the right behavior if the dict only contains negative values. nevermind sorry

Pairs are ordered lexicographically, but reductions on Associative could maybe be special-cased. For now you can use something like mapfoldl(Base.IdFun(), (x,y) -> x.second > y.second ? x : y, d)

@StefanKarpinski
Copy link
Sponsor Member

This is a decent argument for making pairs compare their RHS first and LHS second, since the primary application is to represent the data in an associative mapping and getting the maximum key is, as you point out, simple, whereas there's no obvious way to get the key associated with the maximum value.

Trying this out seems pretty decent:

julia> Base.isless(p::Pair, q::Pair) =
           ifelse(!isequal(p.second,q.second),
               isless(p.second,q.second),
               isless(p.first,q.first)
           )
isless (generic function with 44 methods)

julia> d = Dict{Int,Int}(1=>2,3=>6,5=>10,7=>8)
Dict(7=>8,3=>6,5=>10,1=>2)

julia> maximum(values(d))
10

julia> maximum(keys(d))
7

julia> maximum(d)
5=>10

julia> maximum(d)[1]
5

It's kind of an unintuitive ordering though, and getting the key for the maximum value this way does have the side effect that it compares keys as well as values, which you don't necessarily care about – you just want some key that maps to the maximum value, and may not even care about the keys being comparable. Of course, in that case, you can just find all the keys equal to the maximum value.

@nalimilan
Copy link
Member

It's kind of an unintuitive ordering though, and getting the key for the maximum value this way does have the side effect that it compares keys as well as values, which you don't necessarily care about – you just want some key that maps to the maximum value, and may not even care about the keys being comparable. Of course, in that case, you can just find all the keys equal to the maximum value.

Indeed, it doesn't look like this ordering for pairs is really worth it. It would make more sense to special-case Associative as @tkelman suggested.

@GunnarFarneback
Copy link
Contributor

I just ran into this as well, from the findmax direction.

julia> d = Dict{Int,Int}(1=>2,3=>6,5=>10,7=>8)
Dict{Int64,Int64} with 4 entries:
  7 => 8
  3 => 6
  5 => 10
  1 => 2

julia> findmax(d)
(7=>8,1)

Not only does it find the largest key, which is of questionable value, but it also returns the index with respect to an arbitrary internal order, which seems completely useless.

I'm in favor of specialcasing reductions on associatives, focusing on values rather than keys.

@djsegal
Copy link

djsegal commented Jul 16, 2017

bump. i agree it's a little weird that findmax checks keys instead of values

@JeffBezanson
Copy link
Sponsor Member

I've been thinking about Dict iteration a bunch lately, and how dealing with Pairs is generally difficult. See:

#5794
#17886
#20402 (comment)
#22194 (comment)

@JeffBezanson
Copy link
Sponsor Member

JeffBezanson commented Jul 16, 2017

For the specific case of finding the key with the largest value, it seems findmax and indmax should work (but currently don't, nor do they work for e.g. OffsetArrays).

@JeffBezanson JeffBezanson added the collections Data structures holding multiple items, e.g. sets label Aug 17, 2017
@JeffBezanson
Copy link
Sponsor Member

Part of #20402

@JeffBezanson JeffBezanson added this to the 1.0 milestone Aug 17, 2017
@JeffBezanson JeffBezanson self-assigned this Aug 17, 2017
@JeffBezanson
Copy link
Sponsor Member

findmax and indmax now do what the OP requests. If you want the maximum value, maximum(values(d)) works. From there, I think it's simplest for maximum just to operate on iterators like it does now, so the behavior with pairs is strange but it's what falls out. I think the only possible further change from there is to change how Dicts iterate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets
Projects
None yet
Development

No branches or pull requests

7 participants