Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to stable sorting algorithm #47309

Merged

Conversation

petvana
Copy link
Member

@petvana petvana commented Oct 24, 2022

@LilithHafner Since you changed the docs to guarantee the stability, I propose to switch to DEFAULT_STABLE (by removing other cases) in the same PR #47303.

Warning: This PR is against your branch, not master.

@petvana petvana added the sorting Put things in order label Oct 24, 2022
@petvana
Copy link
Member Author

petvana commented Oct 24, 2022

I guess, we can deprecate defalg(v::AbstractArray) as well, right?

@LilithHafner
Copy link
Member

Unfortunately, folks outside of Base are still using defalg. According to my search, 4 packages add methods to defalg, two of which (DataFrames.jl and InlineStrings.jl) would have their dispatch mechanisms disabled by this change:

# DataFrames.jl
Sort.defalg(df::AbstractDataFrame) =
    size(df, 1) < 8192 ? Sort.MergeSort : SortingAlgorithms.TimSort

# InlineStrings.jl
Base.Sort.defalg(::AbstractArray{<:Union{SmallInlineStrings, Missing}}) = InlineStringSort

# StaticArrays.jl
defalg(a::StaticVector) =
    isimmutable(a) && length(a) <= 20 ? BitonicSort : QuickSort

# Catalyst.jl
Base.Sort.defalg(::ReactionComplex) = Base.DEFAULT_UNSTABLE

Eventually, it may be possible to remove & replace defalg (I think the construct is a bit ugly, myself) but I don't think we can do it quite yet.

@oscardssmith
Copy link
Member

to be fair, you could probably remove those uses. Between the stable quicksort and radix sort that you've added, I doubt that any of these are still optimal heuristics.

@petvana
Copy link
Member Author

petvana commented Oct 26, 2022

Unfortunately, folks outside of Base are still using defalg.

It makes some sense to allow user to select the default sorting algorithm for user-defined types. Thus, I'll revert changes replacint defalg.

The only remaining question is why sort for multidimensional arrays has used unstable algorithm (alg::Algorithm=DEFAULT_UNSTABLE) in past. I guess it is not important now, and we can use defalg as well.

sort(A; dims::Integer, alg::Algorithm=DEFAULT_UNSTABLE, lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)

@LilithHafner
Copy link
Member

Mergesort creates a temporary array with size lo-hi+1, and multidimensional array sorting creates a large array and sorts it a bit at a time using lo/hi rather than views. Before scratch space buffers were supported and passed around, this resulted in a quadratic number of unnecessary allocations. It was easy to avoid the problem by using QuickSort instead.

Multidimensional arrays should use defalg now, thanks.

This is making me think that perhaps defalg should take additional arguments if we choose to keep it (not this PR):

function sortperm(A::AbstractArray;
                  alg=nothing,
                  lt=isless,
                  by=identity,
                  rev::Union{Bool,Nothing}=nothing,
                  order::Ordering=Forward,
                  buffer::Union{AbstractVector{<:Integer}, Nothing}=nothing,
                  dims...)
    ordr = ord(lt,by,rev,order)
    _alg::Algorithm = alg === nothing ? defalg(sortperm, A, ordr; buffer, dims...) : alg
    ...

@petvana
Copy link
Member Author

petvana commented Oct 27, 2022

Multidimensional arrays should use defalg now, thanks.

@LilithHafner If you agree on that changes, you can merge this into your branch now ... and we can polish it there if necessary.

@LilithHafner LilithHafner merged commit a7c3e71 into JuliaLang:LilithHafner-patch-5 Oct 28, 2022
@LilithHafner LilithHafner mentioned this pull request Dec 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sorting Put things in order
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants