sorted cardinality results don't include the largest bucket #67782
Comments
Pinging @elastic/es-analytics-geo (Team:Analytics)
Here is a copy-and-paste-able bash reproduction:
For those following along at home, the test finds the play with the most distinct speakers in Shakespeare's corpus.
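To make the expected semantics concrete, here is a minimal in-memory sketch of what "the play with the most distinct speakers" means: group lines by play, count distinct speakers per play, sort descending, keep the top `size`. It has nothing to do with how Elasticsearch computes this internally. The class name, method names, and sample rows are all made up for illustration and are not taken from the real data set.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Elasticsearch or Kibana.
class TopPlaysBySpeakers {
    /** Returns up to `size` (play, distinct speaker count) entries, most speakers first. */
    static List<Map.Entry<String, Integer>> topPlays(List<String[]> lines, int size) {
        // Group speakers by play; a Set keeps each speaker once (the "cardinality").
        Map<String, Set<String>> speakersByPlay = new HashMap<>();
        for (String[] line : lines) {
            speakersByPlay.computeIfAbsent(line[0], k -> new HashSet<>()).add(line[1]);
        }
        List<Map.Entry<String, Integer>> buckets = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : speakersByPlay.entrySet()) {
            buckets.add(Map.entry(e.getKey(), e.getValue().size()));
        }
        // Sort by distinct count descending, then by play name so ties are deterministic.
        buckets.sort(
            Map.Entry.<String, Integer>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return new ArrayList<>(buckets.subList(0, Math.min(size, buckets.size())));
    }

    /** Made-up sample rows of {play, speaker}. */
    static List<String[]> sampleLines() {
        List<String[]> lines = new ArrayList<>();
        lines.add(new String[] {"Hamlet", "HAMLET"});
        lines.add(new String[] {"Hamlet", "HORATIO"});
        lines.add(new String[] {"Hamlet", "HAMLET"});
        lines.add(new String[] {"Macbeth", "MACBETH"});
        lines.add(new String[] {"Othello", "IAGO"});
        lines.add(new String[] {"Othello", "OTHELLO"});
        lines.add(new String[] {"Othello", "CASSIO"});
        return lines;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> bucket : topPlays(sampleLines(), 2)) {
            System.out.println(bucket.getKey() + " " + bucket.getValue());
        }
    }
}
```

The point of the sketch is that the largest bucket must appear first for every `size` of at least 1, which is the property the failing test checks.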
For what it is worth you could get failures of this sometimes even without bugs if documents wind up in the wrong spot. But on a single shard index it should always pass because
The cardinality agg delays calculating its results until just before they are needed. Before elastic#64016 it used the `postCollect` phase to do this work, which was perfect for the `terms` agg, but we decided that `postCollect` was dangerous because some aggs, notably the `parent` and `child` aggs, need to know which children to build and they *can't* during `postCollect`. After elastic#64016 we built the cardinality agg results when we built the buckets. But if you sort on the cardinality agg then you need to do the `postCollect` stuff in order to know which buckets to build! So you have a chicken and egg problem. Sort of.

This change splits the difference by running the delayed cardinality agg stuff as soon as you *either* try to build the buckets *or* read the cardinality for use with sorting. This works, but is a little janky and feels wrong. It feels like we could make a structural fix to the way we read metric values from aggs before building the buckets that would make this sort of bug much more difficult to cause. But any sort of solution to this is a larger structural change. So this fixes the bug in the quick and janky way and we hope to do a more structural fix to the way we read metrics soon.

Closes elastic#67782
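The "run the delayed work on whichever path gets there first" idea can be sketched roughly as below. This is only an illustration of the pattern under simplifying assumptions, not the actual Elasticsearch change: `DelayedCardinalitySketch`, `ensureProcessed`, `metric`, `buildResults`, and `topBuckets` are hypothetical names, and a `HashSet` stands in for the real cardinality estimation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a metric agg that defers its work; not Elasticsearch code.
class DelayedCardinalitySketch {
    /** Raw values buffered per bucket ordinal during collection. */
    private final Map<Integer, List<String>> buffered = new HashMap<>();
    /** Distinct counts per bucket ordinal, filled in by the delayed step. */
    private final Map<Integer, Integer> counts = new HashMap<>();
    private boolean processed = false;

    void collect(int bucket, String value) {
        if (processed) {
            throw new IllegalStateException("collect called after the delayed step ran");
        }
        buffered.computeIfAbsent(bucket, b -> new ArrayList<>()).add(value);
    }

    /** The delayed work. Runs at most once, triggered by whichever caller arrives first. */
    private void ensureProcessed() {
        if (processed) {
            return;
        }
        for (Map.Entry<Integer, List<String>> e : buffered.entrySet()) {
            counts.put(e.getKey(), new HashSet<>(e.getValue()).size());
        }
        buffered.clear();
        processed = true;
    }

    /** Read path: a parent agg sorting its buckets on this metric calls this first. */
    int metric(int bucket) {
        ensureProcessed();
        return counts.getOrDefault(bucket, 0);
    }

    /** Build path: assembles results for the buckets the parent decided to keep. */
    Map<Integer, Integer> buildResults(List<Integer> selected) {
        ensureProcessed();
        Map<Integer, Integer> results = new LinkedHashMap<>();
        for (int bucket : selected) {
            results.put(bucket, counts.getOrDefault(bucket, 0));
        }
        return results;
    }

    /** What a sorting parent would do: order buckets by the metric, keep the top `size`. */
    static List<Integer> topBuckets(DelayedCardinalitySketch agg, List<Integer> buckets, int size) {
        List<Integer> sorted = new ArrayList<>(buckets);
        sorted.sort((a, b) -> Integer.compare(agg.metric(b), agg.metric(a)));
        return new ArrayList<>(sorted.subList(0, Math.min(size, sorted.size())));
    }

    /** Made-up data: bucket 0 has 1 distinct value, bucket 1 has 3, bucket 2 has 2. */
    static DelayedCardinalitySketch sample() {
        DelayedCardinalitySketch agg = new DelayedCardinalitySketch();
        agg.collect(0, "a");
        agg.collect(0, "a");
        agg.collect(1, "a");
        agg.collect(1, "b");
        agg.collect(1, "c");
        agg.collect(2, "a");
        agg.collect(2, "b");
        return agg;
    }

    public static void main(String[] args) {
        DelayedCardinalitySketch agg = sample();
        List<Integer> top = topBuckets(agg, List.of(0, 1, 2), 2);
        System.out.println(agg.buildResults(top));
    }
}
```

If `metric` skipped `ensureProcessed()`, every bucket would read as 0 before the build step, and the parent would pick its top buckets in an effectively arbitrary order. That is one plausible way to end up with results that change with `size`.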
Reference to the Kibana issue: elastic/kibana#82206
#67839 makes the right results:
Elasticsearch version (`bin/elasticsearch --version`): 8.0.0, 7.12.0, 7.11.0

Plugins installed: [] none, default distribution

JVM version (`java -version`): built-in JDK

OS version (`uname -a` if on a Unix-like system): all (this is my current master source running but this impacts 7.x and 7.11 branches as well)

Description of the problem including expected versus actual behavior: A cardinality agg with split by terms is no longer returning the term with the largest result count. Results vary based on the "size".
Almost 3 years ago we automated the Shakespeare Kibana getting started tutorial https://www.elastic.co/guide/en/kibana/6.8/tutorial-load-dataset.html
The test has been passing with the same expected results until about Oct 29, 2020 when the results returned by the aggregation changed. Unfortunately the test was skipped to allow Kibana to take the new Elasticsearch snapshot and wasn't investigated until now.
Steps to reproduce:
curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/shakespeare/doc/_bulk?pretty' --data-binary @shakespeare_6.0.json
curl -XGET 'localhost:9200/shakespeare/_count'
"count":111396

The results I get on latest master are incorrect;
If we increase the terms agg size to 12 we get results that show the largest bucket value of 71 which is what the Kibana test has expected since it was written almost 3 years ago and is what 7.10 shows;