Add advance(int) for numeric values in order to allow point based optimization to kick in #12089

Merged: 2 commits, Feb 1, 2024

reta
Collaborator

@reta reta commented Jan 30, 2024

Description

Add advance(int) for numeric values so that the point-based optimization can kick in. Huge thanks to @jed326 for investigating the cause of the issue (quoted below):

It looks like we encounter the problem with sort for, double_value, and unsigned_long_value as well.

The problem seems to be related to this change: apache/lucene#12405
Specifically: apache/lucene@d910990#diff-79c6a57519ecd1ef504629e62e13d17859a4ffedc58f4602e583ce758a15adc8R291-R295

The comparator is getting set to NumericComparator::NumericLeafComparator. In the non-concurrent search case we don't go into the if statement, so the competitiveIterator is not updated.

The if statement seems to depend on the number of documents in the segment, though, so it seems like we could still see this in the non-concurrent search case.

There are multiple ways to fix the issue:

  • disable the optimization (not desirable)
  • add an implementation of advance(int) to NumericDoubleValues and SortedNumericDoubleValues, which is the approach taken in this pull request; since in most cases we delegate to NumericDocValues, advance(int) is already available there
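
The delegation idea in the second option can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this pull request: LongDocValues, ArrayLongDocValues and DoubleView are hypothetical stand-ins for the Lucene/OpenSearch types, and Double.longBitsToDouble is used as a simplified decoding step. The point it shows is that a double-valued wrapper can expose advance(int) simply by forwarding to the long-valued iterator it wraps.

```java
import java.util.Arrays;

public class AdvanceDelegationSketch {
    // Sentinel returned when the iterator is exhausted (assumed convention for this sketch).
    public static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a long-valued doc-values iterator. */
    public interface LongDocValues {
        /** Moves to the first document >= target and returns it, or NO_MORE_DOCS. */
        int advance(int target);

        /** Value of the current document; only valid after a successful advance. */
        long longValue();
    }

    /** Hypothetical array-backed implementation over sorted, unique doc IDs. */
    public static final class ArrayLongDocValues implements LongDocValues {
        private final int[] docs;
        private final long[] values;
        private int idx = -1;

        public ArrayLongDocValues(int[] docs, long[] values) {
            this.docs = docs;
            this.values = values;
        }

        @Override
        public int advance(int target) {
            // binarySearch returns -(insertionPoint) - 1 when the target is absent.
            int pos = Arrays.binarySearch(docs, target);
            idx = pos >= 0 ? pos : -pos - 1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        @Override
        public long longValue() {
            return values[idx];
        }
    }

    /** Hypothetical double-valued view that delegates positioning to the wrapped iterator. */
    public static final class DoubleView {
        private final LongDocValues in;

        public DoubleView(LongDocValues in) {
            this.in = in;
        }

        /** The wrapper adds no positioning logic of its own; it just forwards the call. */
        public int advance(int target) {
            return in.advance(target);
        }

        public double doubleValue() {
            return Double.longBitsToDouble(in.longValue());
        }
    }

    public static void main(String[] args) {
        int[] docs = {2, 5, 9};
        long[] bits = {
            Double.doubleToLongBits(1.5),
            Double.doubleToLongBits(2.5),
            Double.doubleToLongBits(3.5)
        };
        DoubleView view = new DoubleView(new ArrayLongDocValues(docs, bits));
        int doc = view.advance(3); // skips doc 2 and lands on doc 5
        System.out.println(doc + " -> " + view.doubleValue());
    }
}
```

Because the wrapper only forwards the call, any skipping the underlying iterator can do is preserved, which is what a caller needs in order to jump ahead instead of visiting every document.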

Related Issues

Closes #11875

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@reta reta added the bug label Jan 30, 2024
@github-actions github-actions bot added the flaky-test and Search labels Jan 30, 2024
Contributor

github-actions bot commented Jan 30, 2024

Compatibility status:

Checks if related components are compatible with change b682278

Incompatible components

Incompatible components: [https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/performance-analyzer-rca.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git]

Contributor

❌ Gradle check result for 0d47358: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❕ Gradle check result for 491da59: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.search.SearchWeightedRoutingIT.testShardRoutingWithNetworkDisruption_FailOpenEnabled

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.


codecov bot commented Jan 30, 2024

Codecov Report

Attention: 27 lines in your changes are missing coverage. Please review.

Comparison is base (bf5e628) 71.40% compared to head (b682278) 71.38%.
Report is 2 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| ...java/org/opensearch/index/fielddata/FieldData.java | 0.00% | 8 Missing ⚠️ |
| ...x/fielddata/plain/SortedNumericIndexFieldData.java | 0.00% | 4 Missing ⚠️ |
| ...pensearch/index/fielddata/NumericDoubleValues.java | 0.00% | 3 Missing ⚠️ |
| ...ain/java/org/opensearch/search/MultiValueMode.java | 0.00% | 2 Missing ⚠️ |
| .../fielddata/SingletonSortedNumericDoubleValues.java | 0.00% | 1 Missing ⚠️ |
| ...ex/fielddata/SortableLongBitsNumericDocValues.java | 0.00% | 1 Missing ⚠️ |
| ...elddata/SortableLongBitsToNumericDoubleValues.java | 0.00% | 1 Missing ⚠️ |
| ...a/SortableLongBitsToSortedNumericDoubleValues.java | 0.00% | 1 Missing ⚠️ |
| ...rch/index/fielddata/SortedNumericDoubleValues.java | 0.00% | 1 Missing ⚠️ |
| ...x/fielddata/UnsignedLongToNumericDoubleValues.java | 0.00% | 1 Missing ⚠️ |

... and 4 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #12089      +/-   ##
============================================
- Coverage     71.40%   71.38%   -0.02%     
- Complexity    59470    59512      +42     
============================================
  Files          4925     4925              
  Lines        279513   279540      +27     
  Branches      40646    40646              
============================================
- Hits         199577   199545      -32     
- Misses        63318    63392      +74     
+ Partials      16618    16603      -15     


…imization to kick in

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Contributor

❕ Gradle check result for 1434f65: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.cluster.coordination.AwarenessAttributeDecommissionIT.testConcurrentDecommissionAction

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Contributor

❌ Gradle check result for 7e24abd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
@reta
Collaborator Author

reta commented Jan 31, 2024

❌ Gradle check result for 7e24abd: FAILURE

#10006
#10193

@neetikasinghal
Contributor

Thanks for picking this change @reta. Looks good.

Contributor

✅ Gradle check result for b682278: SUCCESS

Collaborator

@msfroh msfroh left a comment

It looks good. Thank you!

I'm wondering, though, if there's anything we can do to ensure that future implementations remember to implement advance?

@reta
Collaborator Author

reta commented Feb 1, 2024

I'm wondering, though, if there's anything we can do to ensure that future implementations remember to implement advance?

Thanks @msfroh. For numeric types, when we add a new one we usually either copy the test cases or extend the existing ones with the new field type, so chances are high that, with the tests we have now, a new type without advance would fail.
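
One way to picture this kind of safety net is a shared contract check that every implementation is run through. The sketch below is an assumption-laden illustration, not the OpenSearch test suite: DocIterator, SequentialOnly, Skipping and checkAdvance are hypothetical names, and the default advance(int) that throws is assumed purely to model an implementation that forgot to provide it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class AdvanceContractSketch {
    public static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical iterator shape; advance(int) is optional here on purpose. */
    public interface DocIterator {
        int nextDoc();

        default int advance(int target) {
            throw new UnsupportedOperationException("advance(int) not implemented");
        }
    }

    /** Hypothetical implementation that only supports sequential iteration. */
    public static class SequentialOnly implements DocIterator {
        private final int[] docs;
        private int idx = -1;

        public SequentialOnly(int[] docs) {
            this.docs = docs;
        }

        @Override
        public int nextDoc() {
            idx++;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }
    }

    /** Hypothetical implementation that also supports skipping (linear scan for brevity). */
    public static class Skipping extends SequentialOnly {
        public Skipping(int[] docs) {
            super(docs);
        }

        @Override
        public int advance(int target) {
            int doc;
            do {
                doc = nextDoc();
            } while (doc < target);
            return doc;
        }
    }

    /**
     * Runs the same advance(int) checks against any implementation.
     * docs must be the sorted, unique doc IDs the iterator is expected to contain.
     * Returns a list of human-readable problems; empty means the contract held.
     */
    public static List<String> checkAdvance(String name, Supplier<DocIterator> factory, int[] docs) {
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            int target = docs[i];
            int expectedNext = i + 1 < docs.length ? docs[i + 1] : NO_MORE_DOCS;
            try {
                // Advancing to an existing doc must land exactly on it.
                int got = factory.get().advance(target);
                if (got != target) {
                    problems.add(name + ": advance(" + target + ") returned " + got);
                }
                // Advancing just past it must land on the following doc (or exhaust).
                int gotNext = factory.get().advance(target + 1);
                if (gotNext != expectedNext) {
                    problems.add(name + ": advance(" + (target + 1) + ") returned " + gotNext);
                }
            } catch (UnsupportedOperationException e) {
                problems.add(name + ": advance(int) is not supported");
                break; // one report per implementation is enough
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        int[] docs = {1, 4, 7};
        System.out.println(checkAdvance("skipping", () -> new Skipping(docs), docs));
        System.out.println(checkAdvance("sequential-only", () -> new SequentialOnly(docs), docs));
    }
}
```

Registering a new type with a check like this means a missing advance(int) shows up as a test failure rather than as a silent performance regression.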

@gashutos
Contributor

gashutos commented Feb 1, 2024

apache/lucene@d910990#diff-79c6a57519ecd1ef504629e62e13d17859a4ffedc58f4602e583ce758a15adc8R291

This seems costly, since it gets invoked a sizable number of times during sorting. Let's see whether it impacts the perf numbers. The regular http_logs or other benchmarks won't be able to catch this, since they never hit this condition.

LGTM , thanks @reta & @jed326

@reta reta merged commit 4471a8d into opensearch-project:main Feb 1, 2024
32 of 35 checks passed
@reta reta added the backport 2.x label Feb 1, 2024
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-12089-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 4471a8d49b3415a78a0d1429c63fc6cda4531235
# Push it to GitHub
git push --set-upstream origin backport/backport-12089-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-12089-to-2.x.

reta added a commit to reta/OpenSearch that referenced this pull request Feb 1, 2024
…imization to kick in (opensearch-project#12089)

* Add advance(int) for numeric values in order to allow point based optimization to kick in

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Address code review comments

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

---------

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
(cherry picked from commit 4471a8d)
andrross pushed a commit that referenced this pull request Feb 2, 2024
…imization to kick in (#12089) (#12129)
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Mar 1, 2024
…imization to kick in (opensearch-project#12089)
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Mar 18, 2024
…imization to kick in (opensearch-project#12089)
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…imization to kick in (opensearch-project#12089)
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
backport 2.x, backport-failed, bug, flaky-test, Search:Relevance, Search
6 participants