Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add non-allocating sort function to libcore #790

Closed
steveklabnik opened this issue Feb 2, 2015 · 11 comments
Closed

Add non-allocating sort function to libcore #790

steveklabnik opened this issue Feb 2, 2015 · 11 comments
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.

Comments

@steveklabnik
Copy link
Member

Issue by mahkoh
Saturday Nov 22, 2014 at 22:45 GMT

For earlier discussion, see rust-lang/rust#19221

This issue was labelled with: A-collections, A-libs, I-enhancement in the Rust repository


@Morwenn
Copy link

Morwenn commented Jan 4, 2016

I have no experience with Rust, but I am currently maintaining a C++ sorting library. I couldn't help but notice this issue about sorting. My library includes adapted versions of the GrailSort and WikiSort mentioned in the previous issue, which are the only O(n log n) in-place stable algorithms I know that can work without allocating memory (even though allocating a fixed buffer is better for both of them).

You know how it is: benchmarks are full of surprises, and I couldn't reproduce the patterns for which GrailSort is better. In all my test cases, WikiSort is always better than GrailSort (it might come from my C++ port of GrailSort though), even when bigger buffers are given to GrailSort for the merge operations.

Now, as mentioned in the previous issue, WikiSort is complex. However, it can be made a bit simpler: I managed to replace some of the operations with some algorithms from the C++ standard library (I don't know whether Rust has such algorithms), the embedded sort for 3 values can be replaced by a call to a vanilla insertion sort and IME sorting networks are only worth it with integers and when carefully implemented. Those in WikiSort are even more complex to maintain the stability. I didn't try it yet, but I believe that all the sorting networks part can be replaced by an insertion sort too. Moreover, if the plan is to use no additional memory at all, even a fixed-sized buffer, the algorithm can even be simplified a bit.

That said, even with those modifications, I think that it's still around ~600 LOC and it really loses a bit of performance when it doesn't have even a fixed-size buffer. If you ever decide that you don't care about stability, or choose to provide separate stable and unstable sorting algorithms, then the pattern-defeating quicksort is the best in-place sorting algorithm I know of that does not use any additional memory, and it's around ~300 LOC. It's generally faster than introsort and may make its way into libc++ and libstdc++.

Update: a new version of pattern-defeating quicksort beats branch prediction and can subsequently be ~twice as fast as introsort, but the algorithm is more complex (borrowing ideas from BlockQuicksort: How Branch Mispredictions don't affect Quicksort).

Hope that will help you to choose the appropriate sorting algorithm :)

@pczarn
Copy link

pczarn commented Feb 24, 2016

The algorithm should make sure destructors run once if the comparator panics. The current algorithm in libcollections accounts for this.

@ahicks92
Copy link

ahicks92 commented Dec 1, 2016

I independently came to the conclusion that it would be nice to be able to do this. Is this something that is still open for discussion, and does anyone know why the current algorithm was chosen?

@sfackler
Copy link
Member

sfackler commented Dec 1, 2016

IIRC we wanted something stable and without an n^2 worst case.

@keeperofdakeys
Copy link

The grailsort and wikisort algorithms talked about in the previous issue seem to match those criteria.

@pczarn
Copy link

pczarn commented Dec 2, 2016

The pattern-defeating quicksort matches those criteria as well. I have a mostly complete implementation, but it's unpublished.

@Morwenn
Copy link

Morwenn commented Dec 2, 2016

It isn't stable though.

@burdges
Copy link

burdges commented Dec 2, 2016

CS.SE Q&A here with good background.

@ahicks92
Copy link

ahicks92 commented Dec 2, 2016

If we can't do it as sort_xxx, we could do it as sort_unstable_xxx or sort_noalloc_xxx.

Stability is obviously important, in both the sorting and backwards compatibility senses. But there's lots of things for which you don't care about stability, and our current implementations will grab a lot of ram for large arrays (at least, per the documentation). You could not feasibly sort a 1 GB vector with stdlib, unless you're operating on an atypically powerful machine.

If you do and it fails, it panics; you can't even catch the error. If we can't add an allocating one, we should at least add one which returns Result or something, so that you can do something appropriate besides completely crash.

Perhaps allocating is not actually as big a deal as I think, but adding these methods to stdlib seems to have very little long-term costs. Obviously, the best approach is sort, etc, not allocating.

@ghost
Copy link

ghost commented Jan 26, 2017

To put some numbers into perspective, I did my best at implementing both stable and unstable sorts in Rust.

The stable sort is a variant of timsort, which recently got merged into libstd.
The unstable sort is pattern-defeating quicksort, which I published as a standalone crate.

The differences in performance are pretty interesting.

@ghost
Copy link

ghost commented Jul 10, 2017

Non-allocating slice::sort_unstable is now in libcore and will be stable in Rust 1.20.

Tracking issue

@petrochenkov petrochenkov added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Jan 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests

8 participants