Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement sortslices, deprecate sortrows/sortcols #28332

Merged
merged 1 commit into from
Jul 30, 2018
Merged

Conversation

Keno
Copy link
Member

@Keno Keno commented Jul 28, 2018

As discussed on triage, sortslices is the higher dimensional extension
of sortrows/sortcols. The dimensions being specified are the dimensions
(and for higher dimensions the order of the dimensions) to slice along. See
the help text for an example of the higher dimensional behavior. Deprecate
sortrows/sortcols in favor of sortslices.

@Keno Keno requested a review from mbauman July 28, 2018 22:56
@ararslan ararslan added arrays [a, r, r, a, y, s] deprecation This change introduces or involves a deprecation labels Jul 28, 2018
base/sort.jl Outdated
dims == :cols ? 2 :
error("Only :rows or :cols are allowed as symbolic dimensions")
end
_sortslices(A, Val{dims}(); kws...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this file is included in the system image before Val is defined, which is causing the CI errors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not entirely convinced that it should live here. I think it might make more sense to go with multidimensional, esp since we now use this file from the compiler. This code has very little to do with sorting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 That does seem like a more appropriate place for it, IMO.

base/sort.jl Outdated
Note that the default comparison function on one dimensional slices sorts
lexicographically.

For convenience, `dims=:rows` and `dims=:cols` are supported as alternative
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't do this for anything else, why is this function special in that regard?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, it was requested for convenience, but I think the behavior is consistent enough that we can do without.

Note that the default comparison function on one dimensional slices sorts
lexicographically.

For convenience, `dims=:rows` and `dims=:cols` are supported as alternative
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get rid of these awkward special cases?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed; this makes supporting named dimensions a breaking change, so we shouldn't have this now.

@Keno
Copy link
Member Author

Keno commented Jul 29, 2018

One thing to note for those looking at this, is that in some ways this is incompatible with mapslices, e.g.:

julia> mapslices(println, [1 2; 3 4]; dims=1)
[1, 3]
[2, 4]

julia> sortslices([1 2; 3 4]; by=println, dims=1)
[3, 4]
[1, 2]

I.e. the dims keyword has opposite meaning here.

@JeffBezanson
Copy link
Sponsor Member

There was some discussion of the meaning of the dims argument to mapslices in #18326.

# Higher dimensions

`sortslices` extends naturally to higher dimensions. E.g., if `A` is a
a 2x2x2 array, `sortslices(A, dims=2)` will slices along the 3rd dimension,
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will slices => will slice
Did you mean dims=3 here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes


If `dims` is a tuple, the order of the dimensions in `dims` is
relevant and specifies the linear order of the slices. E.g., if `A` is three
dimensional and `dims` is `(1,2)`, we will take slices in the dimension three,
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra the

sortslices(A; dims, alg::Algorithm=DEFAULT_UNSTABLE, lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)

Sort slices of an array `A`. The keyword argument `dims` must be either
an integer or a tuple of integers specifies along which dimension to take the
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"take the slices" sounds like it's where the colon goes to me. I think a slightly better wording might be:

dims … specifies the dimension(s) over which the slices are sorted.

# Higher dimensions

`sortslices` extends naturally to higher dimensions. E.g., if `A` is a
a 2x2x2 array, `sortslices(A, dims=3)` will slice along the 3rd dimension,
Copy link
Sponsor Member

@mbauman mbauman Jul 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/slice along/sort slices within/?

Copy link
Sponsor Member

@mbauman mbauman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just minor doc comments

@mbauman
Copy link
Sponsor Member

mbauman commented Jul 30, 2018

With regards to the meaning of dims: one of the best arguments for this meaning is how an array with named dimensions might extend this method. There, you'd might want to name the first dimension "rows" and the second dimension "cols" (or something more meaningful), and then sortslices(A, dims=:rows) could be implemented to mean sortrows. Perhaps I should have been more clear that this was a thought experiment/extension in our initial discussion — sorry, Keno! (cf.
#28332 (comment))

I also think that this is aligned with the meaning of dims in sort itself. There, too, we're sorting "within" the named dimension — it's just individual elements instead of entire slices.

@JeffBezanson JeffBezanson added this to the 0.7 milestone Jul 30, 2018
As discussed on triage, `sortslices` is the higher dimensional extension
of `sortrows`/`sortcols`. The dimensions being specified are the dimensions
(and for higher dimensions the order of the dimensions) to slice along. See
the help text for an example of the higher dimensional behavior. Deprecate
sortrows/sortcols in favor of sortslices.
@Keno Keno merged commit 33e8b62 into master Jul 30, 2018
@mbauman mbauman deleted the kf/sortslices branch July 30, 2018 23:09
KristofferC pushed a commit that referenced this pull request Feb 11, 2019
As discussed on triage, `sortslices` is the higher dimensional extension
of `sortrows`/`sortcols`. The dimensions being specified are the dimensions
(and for higher dimensions the order of the dimensions) to slice along. See
the help text for an example of the higher dimensional behavior. Deprecate
sortrows/sortcols in favor of sortslices.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] deprecation This change introduces or involves a deprecation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants