Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perform buildAggregation concurrently and partially support composite aggs #12697

Merged
merged 1 commit into from
Mar 27, 2024

Conversation

jed326
Copy link
Collaborator

@jed326 jed326 commented Mar 15, 2024

Description

In order to address Lucene AssertingCodec failures related to DocValue and SortedFields objects as well as improve aggregation performance, this PR moves the buildAggregation component into postCollection in order for it to be run concurrently on the index_searcher threadpool for concurrent segment search.

There is a separate issue with source scripting for composite aggregations, so we will re-enable concurrent segment search for composite aggregations however we will disable in case any scripts are present. See item 2: #12331 (comment)

In summary:

  1. Move buildAggregation to processPostCollection. To help do this we save the InternalAggregation object as a class field in Aggregator.
  2. In GlobalOrdinalsStringTermsAggregator defer DocValue creation until when it is used instead of when Aggregator is created.
  3. Make count field in MultiBucketConsumer thread safe to support running buildAggregation concurrently.
  4. Enable concurrent segment search for composite aggs if scripting is not used.
  5. Suppress some more lucene codecs in OpenSearchTestCase to ensure AssertingCodec is used.

Related Issues

Resolves #11673
Relates: #12331 (comment)

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
    - [ ] Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

❌ Gradle check result for 6954880: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Mar 15, 2024

Compatibility status:

Checks if related components are compatible with change 9d7d569

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/performance-analyzer.git]

@jed326
Copy link
Collaborator Author

jed326 commented Mar 15, 2024

Ran all aggs related tests:

Tests with failures:
 - org.opensearch.search.aggregations.CombiIT.testMultipleAggsOnSameField_WithDifferentRequiredValueSourceType {p0={"search.concurrent_segment_search.enabled":"false"}}
 - org.opensearch.search.aggregations.CombiIT.testMultipleAggsOnSameField_WithDifferentRequiredValueSourceType {p0={"search.concurrent_segment_search.enabled":"true"}}
 - org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet {p0={"search.concurrent_segment_search.enabled":"false"}}
 - org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet {p0={"search.concurrent_segment_search.enabled":"true"}}
 - org.opensearch.search.aggregations.EquivalenceIT.testDuelTermsHistogram {p0={"search.concurrent_segment_search.enabled":"false"}}
 - org.opensearch.search.aggregations.EquivalenceIT.classMethod

1619 tests completed, 13 failed, 380 skipped

There's still an NPE issue somewhere in here for when postCollection is not called

Copy link
Contributor

❌ Gradle check result for 36942d9: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for c2a731e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 2c43988: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 2e6381f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 0714bf6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 2f3f541:

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions github-actions bot added bug Something isn't working enhancement Enhancement or improvement to existing feature or request Search:Performance Search:Query Capabilities v2.13.0 Issues and PRs related to version 2.13.0 labels Mar 18, 2024
Copy link
Contributor

❌ Gradle check result for f5f34f2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for ecf87ba: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for e4cf4ba: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for f1e496e: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jed326 jed326 changed the title Parallelize build agg first try Perform buildAggregation concurrently and partially support composite aggs Mar 18, 2024
Copy link
Contributor

❌ Gradle check result for 6b18e9c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 643b38b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@jed326 jed326 force-pushed the composite-agg branch 2 times, most recently from 347acd7 to c9e410b Compare March 26, 2024 20:09
Copy link
Contributor

❕ Gradle check result for 347acd7: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.http.SearchRestCancellationIT.testAutomaticCancellationMultiSearchDuringFetchPhase

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link
Contributor

✅ Gradle check result for c9e410b: SUCCESS

Copy link
Contributor

❕ Gradle check result for 6b8dac0: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testReplicaAlreadyAtCheckpoint

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link
Collaborator

@sohami sohami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes LGTM.

@jed326
Copy link
Collaborator Author

jed326 commented Mar 27, 2024

Thanks! @reta there's been a few small changes since you approved, mind taking another look? (And merging if it looks good?)

@jed326 jed326 added the backport 2.x Backport to 2.x branch label Mar 27, 2024
Copy link
Contributor

✅ Gradle check result for 9d7d569: SUCCESS

@reta reta merged commit 618782d into opensearch-project:main Mar 27, 2024
33 of 34 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-12697-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 618782d8fe8c26298caf912795513b23c33db149
# Push it to GitHub
git push --set-upstream origin backport/backport-12697-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-12697-to-2.x.

@reta
Copy link
Collaborator

reta commented Mar 27, 2024

@jed326 sadly backport to 2.x failed, could you please submit a manual one? Thank you.

jed326 added a commit to jed326/OpenSearch that referenced this pull request Mar 27, 2024
reta pushed a commit that referenced this pull request Mar 27, 2024
…ons (#12697) (#12940)

Signed-off-by: Jay Deng <jayd0104@gmail.com>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…ons (opensearch-project#12697)

Signed-off-by: Jay Deng <jayd0104@gmail.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
harshavamsi pushed a commit to harshavamsi/OpenSearch that referenced this pull request Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed bug Something isn't working enhancement Enhancement or improvement to existing feature or request Search:Performance Search:Query Capabilities v2.14.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Concurrent Segment Search] Perform buildAggregation in parallel
5 participants