Add enforcement for np.sort and np.argsort #918

Merged
merged 59 commits into from
Nov 14, 2024

yuema137
Collaborator

This PR enforces deterministic sorting behavior by implementing the sort_enforcement module. Specifically, it:

  1. Makes mergesort the default sorting algorithm
  2. Disables the unstable sorting algorithms (quicksort and heapsort), which do not guarantee a reproducible order for equal elements
  3. Wraps np.sort and np.argsort so that both rules above are enforced

The implementation is robust to import order: the enforcement takes effect whenever strax is imported, regardless of whether numpy is imported before or after it. This ensures consistent sorting behavior across all strax operations and addresses the non-deterministic sorting issues reported in #916.

A unit test covering this behavior is also added.
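
For readers who want a concrete picture of the three points above, here is a minimal sketch of what such wrappers could look like. It is not the code in strax/sort_enforcement.py: the names `UnstableSortError`, `stable_sort_sketch`, and `stable_argsort_sketch` are hypothetical, and the sketch only assumes NumPy's documented `kind` argument for `np.sort` and `np.argsort`.

```python
import numpy as np

# Kinds that NumPy documents as stable. Anything else is rejected in this sketch.
STABLE_KINDS = ("mergesort", "stable")


class UnstableSortError(ValueError):
    """Hypothetical error raised when an unstable sorting kind is requested."""


def _check_kind(kind):
    # Reject quicksort, heapsort, and None (None means quicksort in NumPy).
    if kind not in STABLE_KINDS:
        raise UnstableSortError(
            f"kind={kind!r} is not allowed; use one of {STABLE_KINDS}"
        )


def stable_sort_sketch(arr, kind="mergesort", **kwargs):
    """Call np.sort with mergesort by default and refuse unstable kinds."""
    _check_kind(kind)
    return np.sort(arr, kind=kind, **kwargs)


def stable_argsort_sketch(arr, kind="mergesort", **kwargs):
    """Call np.argsort with mergesort by default and refuse unstable kinds."""
    _check_kind(kind)
    return np.argsort(arr, kind=kind, **kwargs)
```

With a stable kind, equal keys keep their original relative order, so `stable_argsort_sketch(np.array([2, 1, 2, 1, 1]))` lists the indices of the three `1`s in input order before those of the two `2`s. Passing `kind="quicksort"` raises the hypothetical `UnstableSortError` instead of sorting.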

@yuema137 yuema137 requested a review from dachengx October 23, 2024 22:48
@yuema137 yuema137 marked this pull request as draft October 23, 2024 22:48
@coveralls

coveralls commented Oct 23, 2024

Coverage Status

coverage: 90.074% (+0.009%) from 90.065%
when pulling 248a2a8 on set_default_as_mergesort
into c3dd2e1 on master.

strax/sort_enforcement.py (review thread: outdated, resolved)
@yuema137
Collaborator Author

Unfortunately, the kind for np.sort is currently hard-coded to quicksort in numba:
https://github.com/numba/numba/blob/0f363d1b2dd19f2aa1a8cec5f0a99c3dd95512f8/numba/np/arrayobj.py#L6524

For np.argsort, mergesort and quicksort are both supported though.

So we have two options:

  1. Make a PR to numba to enable mergesort for np.sort (need to check if it's easy)
  2. Move all the sorting in strax outside of the numba decorators (need to check performance)

I'm looking into the feasibility of both options.
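
As a rough illustration of option 2, the sketch below sorts in plain NumPy with `kind="mergesort"` and hands the already-sorted array to a jitted kernel. The kernel and the `process_sketch` function are hypothetical and not taken from strax. The no-op fallback for `njit` is only there so the example runs without numba installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs when numba is not available.
    def njit(func):
        return func


@njit
def _running_total_kernel(sorted_values):
    # Hypothetical kernel: it assumes the caller has already sorted the input,
    # so no np.sort call is needed inside the jitted code.
    out = np.empty_like(sorted_values)
    total = 0
    for i in range(len(sorted_values)):
        total += sorted_values[i]
        out[i] = total
    return out


def process_sketch(values):
    # Stable sort happens here, outside of the numba-compiled function.
    sorted_values = np.sort(values, kind="mergesort")
    return _running_total_kernel(sorted_values)
```

Whether this split costs performance in practice depends on the workload, which is the open question noted in option 2.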

@dachengx dachengx changed the title Add enforcement for np.sort and np.argsort Add enforcement for np.sort and np.argsort Nov 13, 2024
@yuema137 yuema137 requested a review from dachengx November 13, 2024 16:04
@dachengx dachengx marked this pull request as ready for review November 13, 2024 16:05
strax/processing/statistics.py (review thread: outdated, resolved)
@yuema137 yuema137 requested a review from dachengx November 14, 2024 05:34
Collaborator

@dachengx dachengx left a comment

@dachengx
Collaborator

For future reference, I should mention that this PR follows up on XENONnT/straxen#1176.

@dachengx dachengx merged commit 8489aa2 into master Nov 14, 2024
8 checks passed
@dachengx dachengx deleted the set_default_as_mergesort branch November 14, 2024 07:37
3 participants