-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding new rewrite override parameter to terms query #15012
Conversation
Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
❌ Gradle check result for 6617c87: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
❌ Gradle check result for 2ce46c4: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for b3b17a6: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
❌ Gradle check result for 9829e9c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
❌ Gradle check result for c978ea3: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 31e4964: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
❌ Gradle check result for 67425be: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 79ae3ef: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 79ae3ef: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for de9e944: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for ce8a043: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for bb260f4: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 815de3d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
server/src/main/java/org/opensearch/index/query/TermsQueryBuilder.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
It doesn't 'intelligently' decide. At the very minimum the heuristic should take into account how large a doc values field is in terms of values per document instead of making a static '8x penalty' decision. I'm concerned about the proposal in this pull request to add an additional set of parameters to the request. The issue was introduced by an internal optimization attempt that regresses in some scenarios. This is an implementation detail and shouldn't bother the client. Adding an opt-out parameter exposes this complexity to the client and outsources the problem solution. It's a long-term liability looking forward and makes the API even more complex than it already is. Furthermore it adds additional efforts on the Java client side, the documentation, and it breaks compatibility with ES for something that isn't a 'feature' but only a workaround. OpenSearch should solve this internally, and rather opt-out from doing this optimization if it doesn't have solid data that doc values filtering is actually faster for a given request. Considering that the 'optimiziation' was introduced with 2.12 this issue should be treated as a regression and receive bugfixes for all OS versions starting with 2.12 to unblock users migrating from 2.11 or older - without requiring extra work on the client side. Some users may be stuck at this point, or if they did upgrade may suffer from performance degradation and being unable to easily rollback. |
Thanks @alexmm-amzn , I suspect it deserves an issue on Apache Lucene side? Why do you think OpenSearch should solve this internally only? |
Thanks for your comments @alexmm-amzn, I want to throw some light around how This cost model has been working successfully in lucene and OpenSearch well for a while, numeric ranges have had In this particular case, adding the |
I'm fine with creating a ticket to Lucene to improve the heuristics as well. The decision to allow this optimization is currently made in OpenSearch and was introduced for
Yes, but
If the heuristic is fixed this control shouldn't be necessary, but it adds complexity and cannot be easily removed again later without breaking backward compatibility of the API. And until the heuristic is fixed having this control means that clients are forced to modify their applications to work around this performance regression - this is costly and outsources the fix to the client. Instead of investing into such a temporary workaround I'd rather suggest working with the Lucene team to improve the heuristic and get it backported to OS 2.12 and later. |
Signed-off-by: Harsha Vamsi Kalluri <harshavamsi096@gmail.com>
There isn't really a "Lucene team". There's the Lucene community, which can accept changes as they see fit. I did submit a proposed improvement to the cost estimate for multi-term keyword queries back in March, and it was looked at in August. It will come out in Lucene 9.12, which will be released at the end of September, in time to get picked up by OpenSearch 2.18. We could disable the My main concern with that approach is that we may roll back the performance improvement that some folks have seen in 2.12 and later. (It's hard to tell how many people have seen performance improve, since nobody reaches out to say things have become faster.) |
Chatted with @harshavamsi and we're thinking that a cluster setting that goes back to the 2.11 behavior could be an option. It's simpler than the per-clause "expert" setting -- you don't need to modify your application code. That provides an immediate solution for the folks who saw better performance on 2.11. Going forward, we can improve the cost estimation logic to do the right thing. |
So how would this setting work, disable |
Yes, I think it should disable it (when the field is indexed and has doc values, as is the default). The indexed query would be returned instead. The other branches (indexed-only/doc values-only) can still kick in for a field that only supports one or the other. I think disabled by default might be prudent. For now, we have evidence that folks upgrading from 2.11 are seeing regressions. We have not heard from anyone who likes the new behavior (though, again, there could be people who aren't reaching out to say things are better). If someone upgrades from 2.12-2.16 and wants the behavior from that range, they can enable it. Going forward, once we have a better heuristic, we could consider making "enabled" be the default, but leave the setting in place just in case. |
👍 , thanks @msfroh ! |
Closing in favor of #15637 |
Thanks, this makes sense to me. Will unblock upgrades from 2.11 to a later version and also provide a mitigation option for clients that already upgraded to 2.12 or later and only then noticed the regression. |
Description
I've noticed that some users are reporting slowdowns for
MultiTermQueries
on keyword fields as reported in #14755. This PR adds a newRewrite_Override
parameter while running queries that users can set to force certain kinds of behaviors while running the query.Default
falls back to the default approach,INDEX_ONLY
uses only the index structure,DOC_VALUES_ONLY
uses only the doc_values.Related Issues
Resolves #14755
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.