-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: reset the filter built at segment level for date histogram optimization #12267
Conversation
Compatibility status:Checks if related components are compatible with change 47f53be Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/performance-analyzer.git] |
❕ Gradle check result for 096d7fa: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #12267 +/- ##
============================================
+ Coverage 71.37% 71.47% +0.09%
- Complexity 59781 59834 +53
============================================
Files 4959 4959
Lines 281116 281110 -6
Branches 40857 40855 -2
============================================
+ Hits 200639 200911 +272
+ Misses 63818 63498 -320
- Partials 16659 16701 +42 ☔ View full report in Codecov by Sentry. |
Can you please clarify more why / how the optimization can "leak" from one copy of a shard to another? The optimization should be computed local to the shard that it's executing on. If we have any state cached beyond the current shard-local In the example you cite above, once the deletions have refreshed on |
Seems this sentence causes confusion
Not saying optimization leak between shard, but between segments. |
Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>
❌ Gradle check result for 7682fe7: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 47f53be: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❕ Gradle check result for 47f53be: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
The backport to
To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-12267-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 197231fe06e35e0b0e9adeaa33b7f0dbde7c0031
# Push it to GitHub
git push --set-upstream origin backport/backport-12267-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x Then, create a pull request where the |
The backport to
To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.12 2.12
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.12
# Create a new branch
git switch --create backport/backport-12267-to-2.12
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 197231fe06e35e0b0e9adeaa33b7f0dbde7c0031
# Push it to GitHub
git push --set-upstream origin backport/backport-12267-to-2.12
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.12 Then, create a pull request where the |
…ization (opensearch-project#12267) --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>
…level (#12279) * Apply fast date histogram optimization at the segment level (#12073) --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com> (cherry picked from commit 9a0a69f) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * Fix: reset the filter built at segment level for date histogram optimization (#12267) --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com> --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…level (#12279) * Apply fast date histogram optimization at the segment level (#12073) --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com> (cherry picked from commit 9a0a69f) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> * Fix: reset the filter built at segment level for date histogram optimization (#12267) --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com> --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> (cherry picked from commit 25c2fde) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…level (#12279) (#12284) * Apply fast date histogram optimization at the segment level (#12073) --------- (cherry picked from commit 9a0a69f) * Fix: reset the filter built at segment level for date histogram optimization (#12267) --------- --------- (cherry picked from commit 25c2fde) Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…ization (opensearch-project#12267) --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>
…ization (opensearch-project#12267) --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com>
…ization (opensearch-project#12267) --------- Signed-off-by: bowenlan-amzn <bowenlan23@gmail.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Description
#12073 handle the segment level match all scenario and build fast filters at segment level. However the built filter from a segment is saved in the FastFilterContext lived through the shard search and stop the subsequent segments from computing their filters. This change reset the filters built at segment level and add integration tests to cover that.
The reason previous flaky test catches this is pretty interesting. It's flaky because it happens at this scenario:
first search hit one shard (s1_1), second search hit another shard (s1_2)
The segments on these 2 shards are different, and the indexRandom method from test framework introduces random dummy document deletions which will stop the fast filter optimization we added in. So imagine s1_1 has multiple segments without deletion and gets the optimization kicks in while the segment level filters never change. s1_2 segments have deletion and optimization never kicks in and fall back to old code path. In the end, the results from these 2 searchs become different.
Related Issues
Resolves #12268
Check List
New functionality has been documented.New functionality has javadoc addedCommit changes are listed out in CHANGELOG.md file (See: Changelog)Public documentation issue/PR createdBy submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.