Sorting documentation fixups for 1.9 (#48440)
- Fix typos
- Clarify that ! means mutation, not "in-place-ness". This should be backported because sort! is even less in place in 1.9 than it already was in 1.8.
- Rewrite the section on default policy to reflect the new default policy
- Move examples and extended description of previously default sorting algorithms out of sort.md and into their respective docstrings (still rendered in sort.md)

Co-authored-by: Jeremie Knuesel <knuesel@gmail.com>
LilithHafner and knuesel authored Jan 30, 2023
1 parent 7e8515c commit a1c4d85
Showing 2 changed files with 39 additions and 57 deletions.
26 changes: 24 additions & 2 deletions base/sort.jl
@@ -524,7 +524,7 @@ Base.size(v::WithoutMissingVector) = size(v.data)
send_to_end!(f::Function, v::AbstractVector; [lo, hi])
Send every element of `v` for which `f` returns `true` to the end of the vector and return
the index of the last element which for which `f` returns `false`.
the index of the last element for which `f` returns `false`.
`send_to_end!(f, v, lo, hi)` is equivalent to `send_to_end!(f, view(v, lo:hi))+lo-1`
@@ -1242,7 +1242,7 @@ Otherwise, we dispatch to [`InsertionSort`](@ref) for inputs with `length <= 40`
perform a presorted check ([`CheckSorted`](@ref)).
We check for short inputs before performing the presorted check to avoid the overhead of the
check for small inputs. Because the alternate dispatch is to [`InseritonSort`](@ref) which
check for small inputs. Because the alternate dispatch is to [`InsertionSort`](@ref) which
has efficient `O(n)` runtime on presorted inputs, the check is not necessary for small
inputs.
@@ -1891,6 +1891,26 @@ Characteristics:
ignores case).
* *in-place* in memory.
* *divide-and-conquer*: sort strategy similar to [`MergeSort`](@ref).
Note that `PartialQuickSort(k)` does not necessarily sort the whole array. For example,
```jldoctest
julia> x = rand(100);
julia> k = 50:100;
julia> s1 = sort(x; alg=QuickSort);
julia> s2 = sort(x; alg=PartialQuickSort(k));
julia> map(issorted, (s1, s2))
(true, false)
julia> map(x->issorted(x[k]), (s1, s2))
(true, true)
julia> s1[k] == s2[k]
true
"""
struct PartialQuickSort{T <: Union{Integer,OrdinalRange}} <: Algorithm
k::T
@@ -1927,6 +1947,8 @@ Characteristics:
case).
* *not in-place* in memory.
* *divide-and-conquer* sort strategy.
* *good performance* for large collections but typically not quite as
fast as [`QuickSort`](@ref).
"""
const MergeSort = MergeSortAlg()

70 changes: 15 additions & 55 deletions doc/src/base/sort.md
@@ -21,7 +21,8 @@ julia> sort([2,3,1], rev=true)
1
```

To sort an array in-place, use the "bang" version of the sort function:
`sort` constructs a sorted copy leaving its input unchanged. Use the "bang" version of
the sort function to mutate an existing array:

```jldoctest
julia> a = [2,3,1];
@@ -134,74 +135,33 @@ Base.Sort.partialsortperm!

## Sorting Algorithms

There are currently four sorting algorithms available in base Julia:
There are currently four sorting algorithms publicly available in base Julia:

* [`InsertionSort`](@ref)
* [`QuickSort`](@ref)
* [`PartialQuickSort(k)`](@ref)
* [`MergeSort`](@ref)

`InsertionSort` is an O(n²) stable sorting algorithm. It is efficient for very small `n`,
and is used internally by `QuickSort`.
By default, the `sort` family of functions uses stable sorting algorithms that are fast
on most inputs. The exact algorithm choice is an implementation detail to allow for
future performance improvements. Currently, a hybrid of `RadixSort`, `ScratchQuickSort`,
`InsertionSort`, and `CountingSort` is used based on input type, size, and composition.
Implementation details are subject to change but currently available in the extended help
of `??Base.DEFAULT_STABLE` and the docstrings of internal sorting algorithms listed there.

`QuickSort` is a very fast sorting algorithm with an average-case time complexity of
O(n log n). `QuickSort` is stable, i.e., elements considered equal will remain in the same
order. Notice that O(n²) is worst-case complexity, but it gets vanishingly unlikely as the
pivot selection is randomized.

`PartialQuickSort(k::OrdinalRange)` is similar to `QuickSort`, but the output array is only
sorted in the range of `k`. For example:

```jldoctest
julia> x = rand(1:500, 100);
julia> k = 50:100;
julia> s1 = sort(x; alg=QuickSort);
julia> s2 = sort(x; alg=PartialQuickSort(k));
julia> map(issorted, (s1, s2))
(true, false)
julia> map(x->issorted(x[k]), (s1, s2))
(true, true)
julia> s1[k] == s2[k]
true
```

!!! compat "Julia 1.9"
The `QuickSort` and `PartialQuickSort` algorithms are stable since Julia 1.9.

`MergeSort` is an O(n log n) stable sorting algorithm but is not in-place – it requires a temporary
array of half the size of the input array – and is typically not quite as fast as `QuickSort`.
It is the default algorithm for non-numeric data.

The default sorting algorithms are chosen on the basis that they are fast and stable.
Usually, `QuickSort` is selected, but `InsertionSort` is preferred for small data.
You can also explicitly specify your preferred algorithm, e.g.
`sort!(v, alg=PartialQuickSort(10:20))`.

The mechanism by which Julia picks default sorting algorithms is implemented via the
`Base.Sort.defalg` function. It allows a particular algorithm to be registered as the
default in all sorting functions for specific arrays. For example, here is the default
method from [`sort.jl`](https://github.com/JuliaLang/julia/blob/master/base/sort.jl):

```julia
defalg(v::AbstractArray) = DEFAULT_STABLE
```

You may change the default behavior for specific types by defining new methods for `defalg`.
You can explicitly specify your preferred algorithm with the `alg` keyword
(e.g. `sort!(v, alg=PartialQuickSort(10:20))`) or reconfigure the default sorting algorithm
for custom types by adding a specialized method to the `Base.Sort.defalg` function.
For example, [InlineStrings.jl](https://github.com/JuliaStrings/InlineStrings.jl/blob/v1.3.2/src/InlineStrings.jl#L903)
defines the following method:
```julia
Base.Sort.defalg(::AbstractArray{<:Union{SmallInlineStrings, Missing}}) = InlineStringSort
```
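
As a rough sketch of both options, the snippet below first selects an algorithm per call with the `alg` keyword and then adds a `defalg` method for a made-up element type. `Release` and its fields are hypothetical names invented for this illustration (they are not part of Base or InlineStrings.jl), and routing them to `MergeSort` is an arbitrary choice rather than a recommendation.

```julia
# Option 1: choose the algorithm per call with the `alg` keyword.
x = [5, 3, 8, 1, 9, 2, 7, 4, 6]
k = 3:5
partial = sort(x; alg=PartialQuickSort(k))  # only positions k are guaranteed correct
partial[k] == sort(x)[k]                    # true

# Option 2: change the default for arrays of a custom element type.
# `Release` is a hypothetical type used only for this illustration.
struct Release
    major::Int
    minor::Int
end
Base.isless(a::Release, b::Release) =
    isless((a.major, a.minor), (b.major, b.minor))

# Arrays of Release now sort with MergeSort unless `alg` says otherwise.
Base.Sort.defalg(::AbstractArray{<:Release}) = MergeSort

v = [Release(1, 2), Release(0, 9), Release(1, 0)]
sorted_default  = sort(v)                     # picks up the defalg method above
sorted_explicit = sort(v; alg=InsertionSort)  # `alg` takes precedence over defalg
```

Both calls produce the same ordering here; the difference is only in which algorithm runs. Defining `defalg` for types you do not own is type piracy, so this pattern is best kept to element types defined in your own package.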

!!! compat "Julia 1.9"
The default sorting algorithm (returned by `Base.Sort.defalg`) is guaranteed
to be stable since Julia 1.9. Previous versions had unstable edge cases when sorting numeric arrays.
The default sorting algorithm (returned by `Base.Sort.defalg`) is guaranteed to
be stable since Julia 1.9. Previous versions had unstable edge cases when
sorting numeric arrays.
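
To make the stability guarantee concrete, here is a small sketch. The `records` data is made up for illustration; sorting by key only should leave records with equal keys in their input order. The second half illustrates the distinction drawn earlier between `sort`, which returns a copy, and `sort!`, which mutates its argument.

```julia
# (key, label) pairs; several records share a key.
records = [(2, "a"), (1, "b"), (2, "c"), (1, "d"), (2, "e")]

# Compare by key only. A stable sort keeps equal-key records in input order.
by_key = sort(records; by=first)
# by_key == [(1, "b"), (1, "d"), (2, "a"), (2, "c"), (2, "e")]

# `sort` builds a new array and leaves its input unchanged ...
a = [2, 3, 1]
b = sort(a)      # a is still [2, 3, 1]

# ... while `sort!` mutates its argument and returns that same array.
c = sort!(a)     # a is now [1, 2, 3] and c === a
```

Note that `c === a` says only that `sort!` mutated and returned `a`; it does not imply that no scratch memory was allocated along the way.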

## Alternate orderings


2 comments on commit a1c4d85

@nanosoldier

Executing the daily package evaluation, I will reply here when finished:

@nanosoldier runtests(isdaily = true)

@nanosoldier

Your package evaluation job has completed - possible new issues were detected.
A full report can be found here.
