-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed Hybrid query for cases when it's wrapped into other compound queries #498
Fixed Hybrid query for cases when it's wrapped into other compound queries #498
Conversation
fdcb5d8
to
31462ee
Compare
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
31462ee
to
cc3bef8
Compare
80b6742
to
233a00f
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #498 +/- ##
============================================
- Coverage 84.65% 84.24% -0.41%
- Complexity 508 529 +21
============================================
Files 40 40
Lines 1505 1549 +44
Branches 234 245 +11
============================================
+ Hits 1274 1305 +31
- Misses 128 129 +1
- Partials 103 115 +12 ☔ View full report in Codecov by Sentry. |
518cf18
to
5717abd
Compare
@martin-gaievski Please add the index mapping and query that you tested via curl or postman in the issue description so that we understand how this was tested. |
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
…vels Signed-off-by: Martin Gaievski <gaievski@amazon.com>
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
@martin-gaievski fix the title of the PR it doesn't provide info around what this PR is about. Other than that code look good to me |
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
src/main/java/org/opensearch/neuralsearch/search/query/HybridQueryPhaseSearcher.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
fixed the title @navneet1v |
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-498-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 3991fe7c04e2f2c104c0be0365428c5747f21f6c
# Push it to GitHub
git push --set-upstream origin backport/backport-498-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x Then, create a pull request where the |
…eries (opensearch-project#498) * Fixed nested field case Signed-off-by: Martin Gaievski <gaievski@amazon.com> (cherry picked from commit 3991fe7)
…eries (opensearch-project#498) * Fixed nested field case Signed-off-by: Martin Gaievski <gaievski@amazon.com> (cherry picked from commit 3991fe7) Signed-off-by: Martin Gaievski <gaievski@amazon.com>
…eries (opensearch-project#498) (opensearch-project#501) * Fixed nested field case (cherry picked from commit 3991fe7) Signed-off-by: Martin Gaievski <gaievski@amazon.com> (cherry picked from commit 27a38cb) Signed-off-by: Martin Gaievski <gaievski@amazon.com>
…eries (opensearch-project#498) * Fixed nested field case Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de>
…eries (opensearch-project#498) * Fixed nested field case Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de>
* Handle case with nested list of objects Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * fix validateEmbeddingsFieldValues Method Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * spotless formatting Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Onboard jenkins prod docker images on github actions (#483) Signed-off-by: Peter Zhu <zhujiaxi@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Update dependency org.json:json to v20231013 (#481) Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * [Backport main manually][bug fix] Fix async actions are left in neural_sparse query (#438) (#479) * [bug fix] Fix async actions are left in neural_sparse query (#438) * add serialization and deserialization Signed-off-by: zhichao-aws <zhichaog@amazon.com> * hash, equals. + UT Signed-off-by: zhichao-aws <zhichaog@amazon.com> * tidy Signed-off-by: zhichao-aws <zhichaog@amazon.com> * add test Signed-off-by: zhichao-aws <zhichaog@amazon.com> --------- Signed-off-by: zhichao-aws <zhichaog@amazon.com> (cherry picked from commit 51e6c00) * rm max_token_score Signed-off-by: zhichao-aws <zhichaog@amazon.com> * add changelog Signed-off-by: zhichao-aws <zhichaog@amazon.com> * tidy Signed-off-by: zhichao-aws <zhichaog@amazon.com> --------- Signed-off-by: zhichao-aws <zhichaog@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Fixed exception for case when Hybrid query being wrapped into bool query (#490) * Adding null check for case when hybrid query wrapped into bool query Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Fixed Hybrid query for cases when it's wrapped into other compound queries (#498) * Fixed nested field case Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Added the github action to copy the attached issues label to PR. (#504) Signed-off-by: Navneet Verma <navneev@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Added support for jdk-21 (#500) * Added support for jdk-21 Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Add unit tests + small fixes Signed-off-by: krishy91 <crgkc.r@gmail.com> * fix indentation Signed-off-by: krishy91 <crgkc.r@gmail.com> * remove unused code + add 2nd level nesting test Signed-off-by: krishy91 <crgkc.r@gmail.com> * add integration test for list of nested objects Signed-off-by: krishy91 <crgkc.r@gmail.com> --------- Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> Signed-off-by: Peter Zhu <zhujiaxi@amazon.com> Signed-off-by: zhichao-aws <zhichaog@amazon.com> Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Navneet Verma <navneev@amazon.com> Signed-off-by: krishy91 <crgkc.r@gmail.com> Co-authored-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> Co-authored-by: Peter Zhu <zhujiaxi@amazon.com> Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> Co-authored-by: zhichao-aws <zhichaog@amazon.com> Co-authored-by: Martin Gaievski <gaievski@amazon.com> Co-authored-by: Navneet Verma <navneev@amazon.com>
* Handle case with nested list of objects Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * fix validateEmbeddingsFieldValues Method Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * spotless formatting Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Onboard jenkins prod docker images on github actions (#483) Signed-off-by: Peter Zhu <zhujiaxi@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Update dependency org.json:json to v20231013 (#481) Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * [Backport main manually][bug fix] Fix async actions are left in neural_sparse query (#438) (#479) * [bug fix] Fix async actions are left in neural_sparse query (#438) * add serialization and deserialization Signed-off-by: zhichao-aws <zhichaog@amazon.com> * hash, equals. + UT Signed-off-by: zhichao-aws <zhichaog@amazon.com> * tidy Signed-off-by: zhichao-aws <zhichaog@amazon.com> * add test Signed-off-by: zhichao-aws <zhichaog@amazon.com> --------- Signed-off-by: zhichao-aws <zhichaog@amazon.com> (cherry picked from commit 51e6c00) * rm max_token_score Signed-off-by: zhichao-aws <zhichaog@amazon.com> * add changelog Signed-off-by: zhichao-aws <zhichaog@amazon.com> * tidy Signed-off-by: zhichao-aws <zhichaog@amazon.com> --------- Signed-off-by: zhichao-aws <zhichaog@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Fixed exception for case when Hybrid query being wrapped into bool query (#490) * Adding null check for case when hybrid query wrapped into bool query Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Fixed Hybrid query for cases when it's wrapped into other compound queries (#498) * Fixed nested field case Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Added the github action to copy the attached issues label to PR. (#504) Signed-off-by: Navneet Verma <navneev@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Added support for jdk-21 (#500) * Added support for jdk-21 Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Add unit tests + small fixes Signed-off-by: krishy91 <crgkc.r@gmail.com> * fix indentation Signed-off-by: krishy91 <crgkc.r@gmail.com> * remove unused code + add 2nd level nesting test Signed-off-by: krishy91 <crgkc.r@gmail.com> * add integration test for list of nested objects Signed-off-by: krishy91 <crgkc.r@gmail.com> --------- Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> Signed-off-by: Peter Zhu <zhujiaxi@amazon.com> Signed-off-by: zhichao-aws <zhichaog@amazon.com> Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Navneet Verma <navneev@amazon.com> Signed-off-by: krishy91 <crgkc.r@gmail.com> Co-authored-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> Co-authored-by: Peter Zhu <zhujiaxi@amazon.com> Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> Co-authored-by: zhichao-aws <zhichaog@amazon.com> Co-authored-by: Martin Gaievski <gaievski@amazon.com> Co-authored-by: Navneet Verma <navneev@amazon.com> (cherry picked from commit ea49d3c)
* Handle case with nested list of objects Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * fix validateEmbeddingsFieldValues Method Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * spotless formatting Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Onboard jenkins prod docker images on github actions (#483) Signed-off-by: Peter Zhu <zhujiaxi@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Update dependency org.json:json to v20231013 (#481) Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * [Backport main manually][bug fix] Fix async actions are left in neural_sparse query (#438) (#479) * [bug fix] Fix async actions are left in neural_sparse query (#438) * add serialization and deserialization Signed-off-by: zhichao-aws <zhichaog@amazon.com> * hash, equals. + UT Signed-off-by: zhichao-aws <zhichaog@amazon.com> * tidy Signed-off-by: zhichao-aws <zhichaog@amazon.com> * add test Signed-off-by: zhichao-aws <zhichaog@amazon.com> --------- Signed-off-by: zhichao-aws <zhichaog@amazon.com> (cherry picked from commit 51e6c00) * rm max_token_score Signed-off-by: zhichao-aws <zhichaog@amazon.com> * add changelog Signed-off-by: zhichao-aws <zhichaog@amazon.com> * tidy Signed-off-by: zhichao-aws <zhichaog@amazon.com> --------- Signed-off-by: zhichao-aws <zhichaog@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Fixed exception for case when Hybrid query being wrapped into bool query (#490) * Adding null check for case when hybrid query wrapped into bool query Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Fixed Hybrid query for cases when it's wrapped into other compound queries (#498) * Fixed nested field case Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Added the github action to copy the attached issues label to PR. (#504) Signed-off-by: Navneet Verma <navneev@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Added support for jdk-21 (#500) * Added support for jdk-21 Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Add unit tests + small fixes Signed-off-by: krishy91 <crgkc.r@gmail.com> * fix indentation Signed-off-by: krishy91 <crgkc.r@gmail.com> * remove unused code + add 2nd level nesting test Signed-off-by: krishy91 <crgkc.r@gmail.com> * add integration test for list of nested objects Signed-off-by: krishy91 <crgkc.r@gmail.com> --------- Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> Signed-off-by: Peter Zhu <zhujiaxi@amazon.com> Signed-off-by: zhichao-aws <zhichaog@amazon.com> Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Navneet Verma <navneev@amazon.com> Signed-off-by: krishy91 <crgkc.r@gmail.com> Co-authored-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> Co-authored-by: Peter Zhu <zhujiaxi@amazon.com> Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> Co-authored-by: zhichao-aws <zhichaog@amazon.com> Co-authored-by: Martin Gaievski <gaievski@amazon.com> Co-authored-by: Navneet Verma <navneev@amazon.com> (cherry picked from commit ea49d3c) Co-authored-by: Gopala-Krishna Char <crgkc.r@gmail.com>
* Handle case with nested list of objects Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * fix validateEmbeddingsFieldValues Method Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * spotless formatting Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Onboard jenkins prod docker images on github actions (opensearch-project#483) Signed-off-by: Peter Zhu <zhujiaxi@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Update dependency org.json:json to v20231013 (opensearch-project#481) Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * [Backport main manually][bug fix] Fix async actions are left in neural_sparse query (opensearch-project#438) (opensearch-project#479) * [bug fix] Fix async actions are left in neural_sparse query (opensearch-project#438) * add serialization and deserialization Signed-off-by: zhichao-aws <zhichaog@amazon.com> * hash, equals. + UT Signed-off-by: zhichao-aws <zhichaog@amazon.com> * tidy Signed-off-by: zhichao-aws <zhichaog@amazon.com> * add test Signed-off-by: zhichao-aws <zhichaog@amazon.com> --------- Signed-off-by: zhichao-aws <zhichaog@amazon.com> (cherry picked from commit 51e6c00) * rm max_token_score Signed-off-by: zhichao-aws <zhichaog@amazon.com> * add changelog Signed-off-by: zhichao-aws <zhichaog@amazon.com> * tidy Signed-off-by: zhichao-aws <zhichaog@amazon.com> --------- Signed-off-by: zhichao-aws <zhichaog@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Fixed exception for case when Hybrid query being wrapped into bool query (opensearch-project#490) * Adding null check for case when hybrid query wrapped into bool query Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Fixed Hybrid query for cases when it's wrapped into other compound queries (opensearch-project#498) * Fixed nested field case Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Added the github action to copy the attached issues label to PR. (opensearch-project#504) Signed-off-by: Navneet Verma <navneev@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Added support for jdk-21 (opensearch-project#500) * Added support for jdk-21 Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> * Add unit tests + small fixes Signed-off-by: krishy91 <crgkc.r@gmail.com> * fix indentation Signed-off-by: krishy91 <crgkc.r@gmail.com> * remove unused code + add 2nd level nesting test Signed-off-by: krishy91 <crgkc.r@gmail.com> * add integration test for list of nested objects Signed-off-by: krishy91 <crgkc.r@gmail.com> --------- Signed-off-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> Signed-off-by: Peter Zhu <zhujiaxi@amazon.com> Signed-off-by: zhichao-aws <zhichaog@amazon.com> Signed-off-by: Martin Gaievski <gaievski@amazon.com> Signed-off-by: Navneet Verma <navneev@amazon.com> Signed-off-by: krishy91 <crgkc.r@gmail.com> Co-authored-by: Gopala-Krishna.Char <gopala-krishna.char@intrafind.de> Co-authored-by: Peter Zhu <zhujiaxi@amazon.com> Co-authored-by: mend-for-github-com[bot] <50673670+mend-for-github-com[bot]@users.noreply.github.com> Co-authored-by: zhichao-aws <zhichaog@amazon.com> Co-authored-by: Martin Gaievski <gaievski@amazon.com> Co-authored-by: Navneet Verma <navneev@amazon.com> Signed-off-by: yuye-aws <yuyezhu@amazon.com>
Description
Nested fields have few problems that we're addressing in this PR:
query is wrapped into the Bool query inside the core code (ref), that happening behind the scenes and depends only on presents of the field in the index mapping. This case should work.
wrapping up hybrid search query into another query in general. That is not how hybrid query were designed, it must be a top level query. We need to track and block such queries.
We're mainly addressing case 1, and trying our best for case 2.
Below is snippet of example from original github issue to illustrate case 1. Without this PR search query returns wrong scores:
index mapping
add doc to index
search request
Before this change this is the search query response that system can return
below is response from system with this fix
Issues Resolved
#466
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.