Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix multi-value sort for unsigned long #16732

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

bugmakerrrrrr
Copy link
Contributor

Description

Fix #16698

Related Issues

Resolves #16698

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@bugmakerrrrrr
Copy link
Contributor Author

@reta ping :)

Copy link
Contributor

❌ Gradle check result for 82c08dd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@reta
Copy link
Collaborator

reta commented Nov 27, 2024

@reta ping :)

My apologies @bugmakerrrrrr , I haven't had time to look at the issue yet, I believe your approach is viable but there could be a simpler option to fall back to doubles for unsigned long, I will check it shortly and get back to you. Thanks !

@bugmakerrrrrr
Copy link
Contributor Author

@reta Thank you for your reply. I actually considered the double method, but I think there are two issues with it. One is that the maximum integer value that double can accurately represent is 2^53, if we use double to handle unsigned long, there may be loss of precision and we get wrong result. The other is that doc values in a document is possible unsorted, so we still need to special handling of min/max/median. Please correct me if I misunderstood something.

Signed-off-by: panguixin <panguixin@bytedance.com>
Copy link
Contributor

❌ Gradle check result for 6550f60: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@@ -110,6 +111,28 @@ protected void doAssert(Object actualValue, Object expectedValue) {
}

if (expectedValue.equals(actualValue) == false) {
if (expectedValue instanceof ArrayList && actualValue instanceof ArrayList) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to workaround the comparison, please use index based comparison (instead of array):

- match: { hits.hits.0.sort: [7] }

Would become

- match: { hits.hits.0.sort.0: 7 }

And works out of the box

@@ -110,6 +111,28 @@ protected void doAssert(Object actualValue, Object expectedValue) {
}

if (expectedValue.equals(actualValue) == false) {
if (expectedValue instanceof ArrayList && actualValue instanceof ArrayList) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to workaround the comparison, please use index based comparison (instead of array):

- match: { hits.hits.0.sort: [7] }

Would become

- match: { hits.hits.0.sort.0: 7 }

And works out of the box

Signed-off-by: panguixin <panguixin@bytedance.com>
Copy link
Contributor

❌ Gradle check result for 92843d0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bugmakerrrrrr
Copy link
Contributor Author

❌ Gradle check result for 92843d0: FAILURE

After executing query phase on data node, the search result will be sent to the coordinator. If the coordinator is on a different node, the result will be serialized before sending, and the sort field will be reduced to the primary type during serialization. But if the coordinator is on the same node, the response will be passed to the handler directly and won't be serialized. In some cases, this may cause different sort field types when merging search results even for one field of one index. When merging top field docs, if the sort field types are not equal, the current implementation uses double to widen sort fields. In this case, the sort values exceeds the MAX_SAFE_INTEGER (2^53 – 1), the sort result is wrong when using double to compare. I have opened #16860 to track this issue.

@bugmakerrrrrr
Copy link
Contributor Author

❌ Gradle check result for 92843d0: FAILURE

@reta I believe that this can be fixed with #16881, would you mind taking a look at #16881 if you get a chance?

@reta
Copy link
Collaborator

reta commented Dec 19, 2024

@reta I believe that this can be fixed with #16881, would you mind taking a look at #16881 if you get a chance?

Apologies @bugmakerrrrrr , I won't be able to find the time for it during the holidays, just finished with this change, would appreciate your feedback. The MultiValueMode now properly supports unsigned_long, the dedicated XxxMultiValueMode classes are not needed, I think it is aligned with other types of doc values.

@reta reta added v3.0.0 Issues and PRs related to version 3.0.0 backport 2.x Backport to 2.x branch v2.19.0 Issues and PRs related to version 2.19.0 labels Dec 19, 2024
Copy link
Contributor

❌ Gradle check result for 49ef878: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Copy link
Contributor

❌ Gradle check result for 29c7e46: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 257ee07: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 257ee07: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bugmakerrrrrr
Copy link
Contributor Author

@reta Sorry for disturbing you during the holidays. Thank you for sharing this update. I like the idea, it's more clean and consistent. I'll review the latest change soon. Hope you have a wonderful holiday.

* @opensearch.internal
*/

final class SingletonSortedNumericUnsignedLongValues extends SortedNumericUnsignedLongValues {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be a singleton values, but simply a wrapper around LongValues, perhaps renamed LongToNumericUnsignedLongValues?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is a wrapper, I was "inspired" by equivalent construct for doubles, but I will think more about your suggestion, perhaps we could simplify things as well, thank you

@reta
Copy link
Collaborator

reta commented Dec 20, 2024

@reta Sorry for disturbing you during the holidays.

Thanks @bugmakerrrrrr , no worries at all, just a bit difficult to find the time, happy holidays to you as well!

@rishabh6788
Copy link
Contributor

@bugmakerrrrrr Do you think it is a good idea to run a quick benchmark as well to check if there is any impact on the query performance as well? If yes, check out https://github.com/opensearch-project/OpenSearch/blob/main/PERFORMANCE_BENCHMARKS.md, id_5 will be good candidate.

reta added 2 commits December 20, 2024 17:02
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Copy link
Contributor

❌ Gradle check result for cfbf241: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch bug Something isn't working Search Search query, autocomplete ...etc v2.19.0 Issues and PRs related to version 2.19.0 v3.0.0 Issues and PRs related to version 3.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sort on multi-value unsigned long field is incorrect
3 participants