Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unintuitive findmin and findmax #39203

Closed
Moelf opened this issue Jan 11, 2021 · 15 comments · Fixed by #41076
Closed

Unintuitive findmin and findmax #39203

Moelf opened this issue Jan 11, 2021 · 15 comments · Fixed by #41076
Assignees
Labels
breaking This change will break code search & find The find* family of functions

Comments

@Moelf
Copy link
Contributor

Moelf commented Jan 11, 2021

argmin(collection) will return the index of the smallest element in the collection, however, with the newly added:

argmin(f, collection)

it returns the value of the minimal element (with evaluation) instead:

julia> argmin([1,2,-3,5])
3

julia> argmin(identity, [1,2,-3,5])
-3

And this is not even useful since it's just taking the second element from the returned value of findmin(f, collection):

argmin(f, domain) = findmin(f, domain)[2]

We should have argmin/argmax to return index instead.

@aplavin
Copy link
Contributor

aplavin commented Jan 11, 2021

For the record, argmin(f, collection) and friends are added in #35316.

@JeffBezanson
Copy link
Member

This was discussed quite a bit. What's implemented is the mathematical definition of argmin/argmax. It's slightly more general than returning an index, since not all collections have indices, e.g. sets or iterators.

@Moelf
Copy link
Contributor Author

Moelf commented Jan 12, 2021

I think it's very unintuitive that in one case it returns the index of max/min, in another case, it returns the value of evaluation.

And currently, we don't have a way to get the index of max/min within an array giving a by function (in 1st argument).

For consistency and mathematical definition, I would suggest deprecate argmax(X) (2.0) since it won't work if X is an iterator/set by the original argument for it? Then make a new idxmax/idxmin which can also take a by function as 1st argument. (can be added in 1.7)

@nalimilan
Copy link
Member

The discussion happened at #27613. I agree it's confusing. For consistency with findfirst/findall/etc., ideally findmin and findmax would return a single index as discussed at #24865, replacing argmax(x), and other functions would be used to get both the maximum and its index. But that would break a lot of code.

@nalimilan nalimilan added breaking This change will break code search & find The find* family of functions labels Jan 12, 2021
@aplavin
Copy link
Contributor

aplavin commented Jan 12, 2021

As I understand, there are multiple subissues here, and not all of them are about consistency by itself.

  1. First, obtaining index of min/max/first/etc value in array is more fundamental than obtaining the corresponding value: one can go from index to value by arr[ix], but not the other way around. So findmin/findmax functions that return both are basically just optimizations when one needs them.
    Currently (both before Add 2-arg versions of findmax/min, argmax/min #35316 and after it) it is possible to obtain index of maximal element with argmax(A), minimal with argmin(A), and first matching item by findfirst(A)/findfirst(f, A). However, it is not possible to get index of the element with maximal f(a), as argmax(f, A) returns a and not its index.

  2. Second, there is consistency. Finding max and min is related to both finding first matching element (findfirst()) and to sorting (sort, partialsort, etc).

2.1. As for comparison with findfirst(A): it is similar to argmax(A) as both return the index. So, it's just the naming scheme that differs here. There is no analogy to findfirst(f, A), and this is not just the lack of functionality as I wrote above, but inconsistency as well:

julia> findfirst([false, false, true])
3

julia> findfirst(identity, [false, false, true])
3  # same as above

julia> argmax([false, false, true])
3

julia> argmax(identity, [false, false, true])
true  # ...

2.2. Now let's compare finding maximum and sorting. They are both similar as both require that ordering is specified, and maximum is just a common and optimized special case for partial sort.
Both sort/partialsort and their cousins that return indices sortperm/... allow to specify ordering in two ways: either with key function by=..., or with comparator function lt=... (the latter is likely way way less commonly used).
For max/min everything is very different: there are actually three functions to find max in different ways - maximum, argmax, and findmax. And none of them let the user specify the custom ordering in any way!

I don't have a perfect solution for these issues. But it seems reasonable to do some of the following (non-breaking!):

  • Add by= argument to maximum()/argmax/findmax.
  • Add function that returns index i of a in a = A[i] that gives the maximal f(a). This could be called argmax(f, A), but seems like there is a certain ambiguity with what it should mean. So, maybe indmax(f, A)? Reads very clear.
  • Deprecate/remove argmax(f, A) that is to be introduced in julia 1.7. I personally would never guess apriori that it returns a from A and not its index. If talking about mathematical definition of argmax for functions, does it mean that optimization packages are encouraged to overload Base.argmax as e.g. argmax(f) for global maximum? And maybe argmax just isn't the best function for finding index in an array? indmax again...

@chelate
Copy link

chelate commented Apr 17, 2021

+1 for just by keyword. A findmin(set, by = f) returning (min, index) would match the single argument form while being a valid orthogonal form of the findmin(f, set), which has a valid mathematical meaning.

A by keyword breaks nothing, and matches sort(set, by = f) and somehow makes the whole mess make sense as a very normal 1-argument, 2-argument dispatch.

@lassepe
Copy link
Contributor

lassepe commented Apr 22, 2021

Unpopular opinion: maybe the single-argument version of argmin should not exist in the first place? I think that it makes sense to have argmin behave as close as possible to the mathematical definition. If one wants to find the minimizing index one can always do armin(i -> itr[i], eachindex(itr)).

@haampie
Copy link
Contributor

haampie commented Apr 28, 2021

argmax needs to accept two things: (1) a set and (2) a partial order on that set. So I would expect an API similar to sort!, which allows the user to specify the partial order with lt, rev, by and order kwargs. Another issue is that sort!'s api is terrible.

Edit: oh, I realize in my mind argmax was supposed to act on the function i -> xs[i], so it would return an index and take xs & an order on xs. It's very confusing indeed.

@antoine-levitt
Copy link
Contributor

The problem here is that reducers use reduce(f, arr) for reduce(fx for x in arr) and this breaks that pattern. There are clearly two different things one could reasonably expect argmin(f, x) depending on one's priors and so it seems unwise to define it at all. The small added convenience isn't worth the large potential confusion.

@BioTurboNick
Copy link
Contributor

BioTurboNick commented May 6, 2021

I'll just add my voice here that it would probably be worthwhile to do something to prevent the surprise that these forms break the established pattern for when there are array first, and function first/array second forms of the same function.

Perhaps another solution is for the function domain forms to return different values.

For the array forms:
argmin/argmax returns the index into the array.
findmin/findmax returns the array value and the index into the array.

What if the function domain forms did this?
argmin/argmax returns the array value and the index into the array. (Current: the array value)
findmin/findmax returns the output value, the array value, and index into the array. (Current: the output value and array value)

That way the output is a different type and mistakes would be more obvious. And easy enough to discard the other outputs.

That is, the outputs are the equivalent of:

argmin(array) = argmin(array)
argmin(f, array) = array[argmin(f.(array))], argmin(f.(array))
findmin(array) = minimum(array), argmin(array)
findmin(f, array) = minimum(f.(array)), argmin(f, array)...

argmin([1, -0.3, 3, 4])
# 2

argmin(x -> x ^ 2, [1, -0.3, 3, 4])
# (-0.3, 2)

findmin([1, -0.3, 3, 4])
# (-0.3, 2)

findmin(x -> x ^ 2, [1, -0.3, 3, 4])
# (0.09, -0.3, 2)

Still a window for confusion if people are extracting values from findmin without inspecting its result, but perhaps a lot clearer what's happening?

For those domains for which there are no valid indexes, perhaps nothing would suffice for that position? Or it could just be omitted I suppose, since there would be no confusion if one is using a non-indexable collection as input.

@BioTurboNick
Copy link
Contributor

BioTurboNick commented May 7, 2021

Or, seen another way:
argmin(A)
is shorthand for
argmin(x -> getindex(A, x), eachindex(A))
and one could imagine a generalization of argmin to
argmin(f, g, h, domain) == argmin(x -> f(g(h(x))), domain)
such that the natural, expected extension is
argmin(f, x -> getindex(A, x), eachindex(A))
or
argmin(x -> f(getindex(A, x)), eachindex(A))

By that logic, maybe the function/array forms should always return the index into the array. OR, it could return a value for each step (f, g, h), leading to a generalization of the model I suggested in the last post.

@oscardssmith
Copy link
Member

triage thinks 1 arg argmax is correct, but findmin with a function is broken and needs to be fixed before 1.7.

@oscardssmith oscardssmith added this to the 1.7 milestone May 27, 2021
@JeffBezanson JeffBezanson changed the title Unintuitive argmin and argmax Unintuitive findmin and findmax May 27, 2021
@vtjnash
Copy link
Member

vtjnash commented Jun 3, 2021

Who's "assigned" to work on this for the milestone? Do we need to rebase #7327 for this?

@aplavin
Copy link
Contributor

aplavin commented Jun 23, 2021

This issue is marked "closed", but is fixed only partially (by @JeffBezanson's commit): for findmax + findmin, not for argmax + argmin.
The triage comment by @oscardssmith does not address the argmax(f, A) issue at all.
It would be really unfortunate if Julia is stuck with this error-prone behaviour "forever":

val, ix = findmax(A)
ix = argmax(A)

val, ix = findmax(f, A)
ix = argmax(f, A)  # No! Can easily lead to errors that are hard to find.

Why introduce argmax(f, A) at all if there is a conflict between two potential meanings? Currently this is a simple function argmax(A) to find index of the maximum. There are lots of users already surprised and confused by argmax(f, A) being defined but not equivalent to argmax(f.(A)): see this thread, https://discourse.julialang.org/t/findmax-and-friends-confusing-behaviour-to-be-introduced-in-1-7, and there were multiple questions on slack.

@aplavin
Copy link
Contributor

aplavin commented Jun 23, 2021

Wow, even a cursory search on JuliaHub and GitHub revealed many packages that effectively define argmax(f, A) to correspond to argmax(f.(A))! There are no packages (excluding Compat.jl) that define it to mean what is planned for 1.7.
I believe this makes a very strong argument not to introduce argmax(f, A) as currently planned.
For reference, putting the links below.

Most widely-used package: https://github.com/JuliaData/SentinelArrays.jl/blob/v1.3.3/src/chainedvector.jl#L771
Others:
https://github.com/tkluck/PolynomialRings.jl/blob/v0.7.3/src/MonomialOrderings.jl#L101
https://github.com/SebastianAment/CompressedSensing.jl/blob/e2de401f6e7be531603090cb55f9b7e06ba02468/src/util.jl#L179
https://github.com/lstagner/ConcaveHull.jl/blob/70840d3670df62e258a4994e874ac9c24d6e4fb6/src/concave_hull.jl#L39
https://github.com/wsshin/SALTBase.jl/blob/0add870b8df4105c63bfcc589ddc806a8b39d712/src/base.jl#L2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code search & find The find* family of functions
Projects
None yet
Development

Successfully merging a pull request may close this issue.