Pass parent filter to inner query in nested query #10246

Merged
1 commit merged into opensearch-project:main on Sep 30, 2023

Conversation

heemin32
Contributor

Description

Lucene introduced a new feature for joining child and parent documents in kNN search: apache/lucene#12434

To use the feature, we need to pass the parent filter to the inner query so that the inner query can deduplicate child documents per parent document.

Once this change is made, we can use the parent filter data to call the appropriate query in the k-NN repo: opensearch-project/k-NN#1065
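
For illustration only, here is a minimal sketch of the per-parent deduplication described above. It does not use Lucene or OpenSearch APIs: the class and method names are hypothetical, the hits are assumed to be child documents only, and it assumes the usual block-join layout in which children are indexed immediately before their parent, so a child's parent is the first parent doc ID at or after the child.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the Lucene or k-NN implementation.
class ParentDedupeSketch {
    /**
     * Keeps only the best-scoring child per parent.
     * childDocs and scores are parallel arrays in the order a kNN search
     * returned them (score order, not doc ID order).
     * parentBits marks which doc IDs are parents (the "parent filter").
     * Returns a map of parent doc ID -> best child doc ID.
     */
    static Map<Integer, Integer> bestChildPerParent(int[] childDocs, float[] scores, BitSet parentBits) {
        Map<Integer, Integer> bestChild = new LinkedHashMap<>();
        Map<Integer, Float> bestScore = new HashMap<>();
        for (int i = 0; i < childDocs.length; i++) {
            // Without the parent bits there is no way to tell which block a hit belongs to.
            int parent = parentBits.nextSetBit(childDocs[i]);
            if (parent < 0) {
                continue; // no parent at or after this doc; ignore the hit
            }
            Float current = bestScore.get(parent);
            if (current == null || scores[i] > current) {
                bestScore.put(parent, scores[i]);
                bestChild.put(parent, childDocs[i]);
            }
        }
        return bestChild;
    }

    public static void main(String[] args) {
        BitSet parents = new BitSet();
        parents.set(3); // children 0, 1, 2
        parents.set(7); // children 4, 5, 6
        int[] docs = {5, 1, 4, 2};
        float[] scores = {0.9f, 0.8f, 0.7f, 0.6f};
        System.out.println(bestChildPerParent(docs, scores, parents)); // prints {7=5, 3=1}
    }
}
```

Four child hits collapse to one hit per parent, which is only possible because the inner query can see the parent filter.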

Related Issues

N/A

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Contributor

github-actions bot commented Sep 27, 2023

Compatibility status:

Checks if related components are compatible with change 5ea44e6

Incompatible components

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git]

@heemin32 heemin32 closed this Sep 27, 2023
@heemin32 heemin32 reopened this Sep 27, 2023
@heemin32 heemin32 closed this Sep 27, 2023
@heemin32 heemin32 reopened this Sep 27, 2023

Pass parent filter to inner query so that inner query can utilize the information

Signed-off-by: Heemin Kim <heemin@amazon.com>

Collaborator

@msfroh msfroh left a comment

Looks good to me.

We should think about whether other queries may also benefit from specific knowledge of the parent filter.

Right now, I think most normal DocIdSetIterator-based queries get all the benefit from the existing ToParentBlockJoinQuery logic. Correct me if I'm wrong, but kNN only needs this special behavior because it's not iterating through a monotonically-increasing doc ID set, right?

@msfroh msfroh merged commit e156582 into opensearch-project:main Sep 30, 2023
13 checks passed
@heemin32
Contributor Author

Looks good to me.

We should think about whether other queries may also benefit from specific knowledge of the parent filter.

Right now, I think most normal DocIdSetIterator-based queries get all the benefit from the existing ToParentBlockJoinQuery logic. Correct me if I'm wrong, but kNN only needs this special behavior because it's not iterating through a monotonically-increasing doc ID set, right?

That is correct. Only KNN needs this as of now.
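
As a rough illustration of the point about monotonically increasing doc IDs, the sketch below groups children into parents with a single forward pass. It is a simplification under stated assumptions, not Lucene's ToParentBlockJoinQuery code: the class and method names are hypothetical, and the pass is only correct when child doc IDs arrive in ascending order.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch of a forward-only join; not Lucene's implementation.
class ForwardJoinSketch {
    /**
     * Emits each matching parent once, assuming childDocs is sorted ascending
     * (the order a doc ID iterator would deliver them in).
     */
    static List<Integer> parentsInDocOrder(int[] childDocs, BitSet parentBits) {
        List<Integer> parents = new ArrayList<>();
        int lastParent = -1;
        for (int child : childDocs) {
            if (child <= lastParent) {
                continue; // treated as part of a block that was already emitted
            }
            int parent = parentBits.nextSetBit(child);
            if (parent < 0) {
                break; // no further parents
            }
            parents.add(parent);
            lastParent = parent;
        }
        return parents;
    }

    public static void main(String[] args) {
        BitSet parents = new BitSet();
        parents.set(3); // children 0, 1, 2
        parents.set(7); // children 4, 5, 6
        // Ascending doc IDs: every parent is found.
        System.out.println(parentsInDocOrder(new int[] {1, 2, 4, 5}, parents)); // prints [3, 7]
        // Score-ordered hits, as kNN might return them: parent 3 is missed.
        System.out.println(parentsInDocOrder(new int[] {5, 1, 4, 2}, parents)); // prints [7]
    }
}
```

With score-ordered input the forward pass silently drops a parent, which is one way to see why a kNN inner query benefits from receiving the parent filter directly.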

@heemin32
Contributor Author

@msfroh Could you backport it to 2.x as well?

@msfroh msfroh added the backport 2.x Backport to 2.x branch label Oct 2, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 2, 2023
Pass parent filter to inner query so that inner query can utilize the information

Signed-off-by: Heemin Kim <heemin@amazon.com>
(cherry picked from commit e156582)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@heemin32 heemin32 mentioned this pull request Oct 2, 2023
6 tasks
reta pushed a commit that referenced this pull request Oct 2, 2023
Pass parent filter to inner query so that inner query can utilize the information


(cherry picked from commit e156582)

Signed-off-by: Heemin Kim <heemin@amazon.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@reta reta added v3.0.0 Issues and PRs related to version 3.0.0 v2.11.0 Issues and PRs related to version 2.11.0 labels Oct 2, 2023
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Oct 3, 2023
…#10246)

Pass parent filter to inner query so that inner query can utilize the information

Signed-off-by: Heemin Kim <heemin@amazon.com>
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request Oct 9, 2023
…#10246)

Pass parent filter to inner query so that inner query can utilize the information

Signed-off-by: Heemin Kim <heemin@amazon.com>
vikasvb90 pushed a commit to vikasvb90/OpenSearch that referenced this pull request Oct 10, 2023
…#10246)

Pass parent filter to inner query so that inner query can utilize the information

Signed-off-by: Heemin Kim <heemin@amazon.com>
@heemin32 heemin32 deleted the parent-filter branch October 10, 2023 17:35
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…#10246)

Pass parent filter to inner query so that inner query can utilize the information

Signed-off-by: Heemin Kim <heemin@amazon.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Labels
backport 2.x Backport to 2.x branch v2.11.0 Issues and PRs related to version 2.11.0 v3.0.0 Issues and PRs related to version 3.0.0