Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make findn and findnz behavior match nnz #24724

Merged
merged 1 commit into from
Nov 29, 2017
Merged

Make findn and findnz behavior match nnz #24724

merged 1 commit into from
Nov 29, 2017

Conversation

andreasnoack
Copy link
Member

such that length of returned vectors is nnz.

This is a minimal fix of #23121.

Longer term we should consider the names of these functions. It has previously been proposed that the n and nz part should be something like filled/stored. However, there is also an inconsistency in the find part because some find functions only return a (linear) index (even it doesn't makes sense such as for dictionaries) while others return both indices and values.

@nalimilan
Copy link
Member

Longer term we should consider the names of these functions. It has previously been proposed that the n and nz part should be something like filled/stored. However, there is also an inconsistency in the find part because some find functions only return a (linear) index (even it doesn't makes sense such as for dictionaries) while others return both indices and values.

Longer term is pretty short these days. :-)

I agree it would make sense to restrict find* functions to return indices, and that findnz and findmin/findmax are outliers. The latter should be equivalent to indmin/indmax since that's more in line with other find functions.

Maybe findnz could be renamed to stored or filled and return a NamedTuple? nonzeros could then be stored(x).values. nnz could even be length(stored(x).values) if we want, but that's optional.

@andreasnoack
Copy link
Member Author

The latter should be equivalent to indmin/indmax since that's more in line with other find functions

Why not the other way around? Having the value seems useful.

Maybe findnz could be renamed to stored or filled and return a NamedTuple?

I don't understand the reason for this suggestion. Could you explain?

nnz could even be length(stored(x).values) if we want

Probably not. This would be much more expensive than the current nnz that basically just looks up two elements.

@nalimilan
Copy link
Member

The latter should be equivalent to indmin/indmax since that's more in line with other find functions

Why not the other way around? Having the value seems useful.

That's useful, but other find functions return only the index, so we'd better find a different name for this operation.

Maybe findnz could be renamed to stored or filled and return a NamedTuple?

I don't understand the reason for this suggestion. Could you explain?

The goal was in the next sentence: "nonzeros could then be stored(x).values. " And renaming findnz would again be more consistent with other find* methods returning only linear indices, not values.

@andreasnoack
Copy link
Member Author

That's useful, but other find functions return only the index, so we'd better find a different name for this operation.

Since it is currently not consistent with findmin, findmax, and findnz returning the values and not just the indices I don't think it would be crazy to go the other way and adjust the other find methods. No big deal though. However, I think the linear index part should be reconsidered since it is from a time when everything was an Array. E.g. linear index for dictionaries doesn't make a lot of sense. I think the current behavior of findmin/max(::Dict) is the right solution, i.e. the key is the index

julia> d = Dict('A' => 1, 'B' => -1)
Dict{Char,Int64} with 2 entries:
  'A' => 1
  'B' => -1

julia> findmax(d)
(1, 'A')

and similar return cartesian indices array types that have IndexCartesian.

@nalimilan
Copy link
Member

Since it is currently not consistent with findmin, findmax, and findnz returning the values and not just the indices I don't think it would be crazy to go the other way and adjust the other find methods. No big deal though.

Well, at least we have a consistent proposal to use find* functions in the Search & Find Julep, so I find it simpler to rename other functions.

However, I think the linear index part should be reconsidered since it is from a time when everything was an Array. E.g. linear index for dictionaries doesn't make a lot of sense. I think the current behavior of findmin/max(::Dict) is the right solution, i.e. the key is the index

See JuliaLang/Juleps#27 and #20684. I think we need to discuss this seriously before 0.7.

@andreasnoack
Copy link
Member Author

I'll merge this one since the renaming can happen independently from this change which just makes the behavior more consistent.

@andreasnoack andreasnoack merged commit 0335175 into master Nov 29, 2017
@andreasnoack andreasnoack deleted the anj/findnz branch November 29, 2017 11:47
@nalimilan nalimilan added the search & find The find* family of functions label Nov 29, 2017
@Sacha0
Copy link
Member

Sacha0 commented Nov 29, 2017

Have we a tracking issue for the tangential topics? And if so, should we apply a triage label? Best!

@nalimilan
Copy link
Member

See #24865 about findmin/findmax.

@nalimilan
Copy link
Member

This PR should have updated the SparseVector method too. See #24910 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
search & find The find* family of functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants