bucket_sort aggregation misplace the first bucket #36322
Comments
Pinging @elastic/es-analytics-geo |
The sort order is based on count, and there is no guarantee that equal elements will keep their relative order after the sort. We could switch to a stable sort or update the documentation to explain how the sort works, but why are you using the |
This is a simplified version of my real use case. I need to aggregate 3 different fields together. I tried with What seems strange to me is that in every attempt I made, everything is sorted as expected except the first bucket, which is misplaced at the end. It is as if the internal Elasticsearch "loop" that sorts the buckets does the right thing for every item except the first. |
I didn't look carefully at the implementation of the priority queue we use to perform the sort, but as I said in my previous comment, the output of the sort is correct. Though I wonder why we use a priority queue rather than a list and then |
@jimczi Indeed, a priority queue was used because it seemed suitable for also dealing with trimmed buckets efficiently. I vaguely recall discussing this and deciding it was a desirable trade-off against sort stability. |
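To illustrate the trade-off described above, here is a minimal sketch of why a priority queue is attractive when only the top buckets are kept. It is not the Elasticsearch implementation: the `TopNBuckets` class, the `Bucket` type, and the `topN` method are hypothetical names, and it uses `java.util.PriorityQueue` rather than the queue Elasticsearch uses internally. A bounded min-heap keeps at most `n` buckets in memory, but it makes no promise about the relative order of buckets with equal counts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not Elasticsearch code.
final class TopNBuckets {

    /** Illustrative stand-in for an aggregation bucket: a key and its doc count. */
    static final class Bucket {
        final String key;
        final long docCount;

        Bucket(String key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }
    }

    /** Returns the n buckets with the highest doc count, highest first. */
    static List<Bucket> topN(List<Bucket> buckets, int n) {
        Comparator<Bucket> byCount = Comparator.comparingLong((Bucket b) -> b.docCount);

        // Min-heap: the root is always the weakest bucket kept so far.
        PriorityQueue<Bucket> heap = new PriorityQueue<>(byCount);
        for (Bucket b : buckets) {
            heap.offer(b);
            if (heap.size() > n) {
                heap.poll(); // evict the bucket with the smallest count
            }
        }

        // The heap's iteration order is unspecified, so buckets with equal
        // counts may come out in any relative order after this step.
        List<Bucket> result = new ArrayList<>(heap);
        Collections.sort(result, byCount.reversed());
        return result;
    }
}
```

With distinct counts the result is fully determined; with ties, which bucket is evicted and where the survivors land depends on the heap's internal layout, which is the instability reported in this issue.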
We discussed this offline and we agreed that we should switch to a List and use |
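The approach agreed on here can be sketched as follows. The exact call named in the comment above is cut off, so the use of `Collections.sort` is an assumption; it is chosen because its documentation guarantees a stable sort. The `StableBucketSort` class, the `Bucket` type, and the `sortedKeys` method are hypothetical and only illustrate the idea of sorting a list stably and then trimming it with `from` and `size`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not the change made in Elasticsearch.
final class StableBucketSort {

    /** Illustrative stand-in for an aggregation bucket: a key and its doc count. */
    static final class Bucket {
        final String key;
        final long docCount;

        Bucket(String key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }
    }

    /** Sorts by doc count (descending), then keeps the window [from, from + size). */
    static List<String> sortedKeys(List<Bucket> buckets, int from, int size) {
        List<Bucket> sorted = new ArrayList<>(buckets);

        // Collections.sort is stable: buckets with equal counts keep
        // the relative order they had in the input list.
        Collections.sort(sorted, Comparator.comparingLong((Bucket b) -> b.docCount).reversed());

        // Trim after sorting, clamping the window to the list bounds.
        int start = Math.min(from, sorted.size());
        int end = Math.min(start + size, sorted.size());

        List<String> keys = new ArrayList<>();
        for (Bucket b : sorted.subList(start, end)) {
            keys.add(b.key);
        }
        return keys;
    }
}
```

For four buckets `a`, `b`, `c`, `d` that all have the same count, this keeps them in input order, which is the behavior the reporter expected. The cost is that the whole list is sorted before trimming, instead of keeping only the top entries in a heap.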
I would like to work on it. |
I would like to work on this issue if @govi20 is not working on it. |
Looks like someone just sent a PR for this (#36748). If you're still interested in contributing, you can generally just leave a note that says "I'm going to start working on this" and then raise a PR when you are ready. No need to get approval first. Thanks for helping out! We have more |
Can we still work on an issue or raise a PR if a PR has already been raised for it? |
I would avoid working on issues that have an active PR. Sometimes a PR goes dormant (the contributor gets busy and abandons it, technical issues cause it to be closed, etc.), in which case we'll close the PR and the issue can be worked on again. But if the PR is active and being worked on, it's probably best to choose a different issue. |
Elasticsearch version 6.5.1
Plugins installed: []
JVM version JVM 1.8.0_192
OS version Debian 8.11
Description of the problem including expected versus actual behavior:
When sorting an aggregation with a bucket_sort based on its _count, if the doc_count values are equal, the item that should be in first position ends up last; the other items are in the right order.
Every item should be in the right order, including the first one.
Steps to reproduce:
user with a different rank each time (from a to d):
user with rank a
user with rank b
user with rank c
user with rank d
rank property and sort by count with a bucket_sort:
Aggregation result is:
We have ranks sorted as: b, c, d, a instead of a, b, c, d.