Replace AggregatorTestCase#search with AggregatorTestCase#searchAndReduce #60683

Merged: 4 commits, Aug 6, 2020

Contributor

@jimczi jimczi commented Aug 4, 2020

This commit removes the ability to test the top-level result of an aggregator
before it runs the final reduce. All aggregator tests that used AggregatorTestCase#search
are rewritten to use AggregatorTestCase#searchAndReduce, so that we test
the final output (the one sent to the end user) rather than an intermediate result
that could differ from it.
This change also removes spurious commits triggered on top of a random index writer.
These commits slow down the tests and duplicate the commits that the
random index writer already performs.
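
To illustrate why an intermediate result can differ from the final one, here is a minimal, self-contained sketch. It does not use the Elasticsearch test framework: `ReduceSketch`, `collect`, `reduce`, and the `minDocCount` pruning step are hypothetical stand-ins chosen for illustration, not the actual `AggregatorTestCase` API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a terms-like aggregation; not Elasticsearch code.
final class ReduceSketch {

    // Partial result: term counts seen by one aggregator (one slice of the index).
    static Map<String, Long> collect(List<String> docs) {
        Map<String, Long> counts = new HashMap<>();
        for (String term : docs) {
            counts.merge(term, 1L, Long::sum);
        }
        return counts;
    }

    // Final reduce: merge all partial counts, then drop buckets below minDocCount.
    static Map<String, Long> reduce(List<Map<String, Long>> partials, long minDocCount) {
        Map<String, Long> merged = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((term, count) -> merged.merge(term, count, Long::sum));
        }
        merged.values().removeIf(count -> count < minDocCount);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> first = collect(List.of("a", "b"));
        Map<String, Long> second = collect(List.of("a", "c"));
        // The partial result still contains "b"; the final result does not.
        System.out.println("partial: " + first);
        System.out.println("final:   " + reduce(List.of(first, second), 2));
    }
}
```

In this sketch, an assertion written against `first` would pass on a bucket that the end user never sees, which is the kind of mismatch that asserting only on the reduced result avoids.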

@jimczi jimczi added >non-issue >test Issues or PRs that are addressing/adding tests :Analytics/Aggregations Aggregations v8.0.0 v7.10.0 labels Aug 4, 2020
@jimczi jimczi requested review from nik9000 and not-napoleon August 4, 2020 19:26
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Aug 4, 2020
Member

@not-napoleon not-napoleon left a comment

Looks good. Please clean up that one for loop that got missed, otherwise 👍

@@ -451,7 +450,8 @@ private void mergeBucketsWithPlan(List<Bucket> buckets, List<BucketRange> plan,
}
toMerge.add(buckets.get(startIdx)); // Don't remove the startIdx bucket because it will be replaced by the merged bucket

reduceContext.consumeBucketsAndMaybeBreak(- (toMerge.size() - 1));
int toRemove = toMerge.stream().mapToInt(b -> countInnerBucket(b)+1).sum();
Member

Is this an actual bug you found while making this change?

Contributor Author

A minor one, yes. The max bucket count is not accurate when buckets are auto-merged.
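
A small sketch of the counting difference discussed in this thread. The `Bucket` class and the body of `countInnerBucket` are assumptions made for illustration (a guess at what a helper of that name would do); only the stream expression in `deepCount` is taken from the diff above. How the resulting value is passed on to `consumeBucketsAndMaybeBreak` is not shown in the excerpt and is not modelled here.

```java
import java.util.List;

final class BucketCountSketch {

    // Hypothetical bucket: only its nested sub-buckets matter for counting.
    static final class Bucket {
        final List<Bucket> subBuckets;

        Bucket(List<Bucket> subBuckets) {
            this.subBuckets = subBuckets;
        }
    }

    // Assumed behavior: count every bucket nested below b, at any depth (b excluded).
    static int countInnerBucket(Bucket b) {
        int count = 0;
        for (Bucket sub : b.subBuckets) {
            count += 1 + countInnerBucket(sub);
        }
        return count;
    }

    // Flat count: only the top-level buckets being merged.
    static int flatCount(List<Bucket> toMerge) {
        return toMerge.size();
    }

    // Deep count: the expression from the diff, which includes nested buckets.
    static int deepCount(List<Bucket> toMerge) {
        return toMerge.stream().mapToInt(b -> countInnerBucket(b) + 1).sum();
    }

    public static void main(String[] args) {
        Bucket leaf = new Bucket(List.of());
        Bucket parent = new Bucket(List.of(leaf, leaf));
        List<Bucket> toMerge = List.of(parent, leaf);
        System.out.println("flat count: " + flatCount(toMerge));
        System.out.println("deep count: " + deepCount(toMerge));
    }
}
```

Under these assumptions the flat count ignores the two buckets nested under `parent`, so any bookkeeping based on `toMerge.size()` alone drifts once merged buckets carry sub-aggregations.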

Member

@nik9000 nik9000 left a comment

I think for the auto date histo and variable width histo I'll miss being able to assert on things that aren't yet finally reduced, but it is worth it.

@jimczi jimczi merged commit 5de0ed9 into elastic:master Aug 6, 2020
@jimczi jimczi deleted the aggregator_tests_search_and_reduce branch August 6, 2020 12:08
jimczi added a commit to jimczi/elasticsearch that referenced this pull request Aug 6, 2020
This commit fixes the computation of the subset size on empty buckets (doc count of 0).
The aggregator test refactoring in elastic#60683 revealed this bug.
nik9000 added a commit that referenced this pull request Aug 19, 2020
With #60683 we stopped forcing all docs to be aggregated by a single
Aggregator, which made some of our accuracy assumptions about the stats
aggregator incorrect. This adds a test that does the forcing and asserts
the old accuracy, and adds a test without the forcing with much looser
accuracy guarantees.

Closes #61132
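
The commit message does not say why accuracy changes once more than one Aggregator is involved. One plausible contributor, offered here as an assumption, is that floating-point addition is not associative, so summing partial sums can give a slightly different result than a single pass. The sketch below uses plain Java only; `SumOrderSketch` and its methods are hypothetical and unrelated to the actual stats aggregator code.

```java
final class SumOrderSketch {

    // Sums values[from..to) left to right, as a single collector would.
    static double sum(double[] values, int from, int to) {
        double total = 0.0;
        for (int i = from; i < to; i++) {
            total += values[i];
        }
        return total;
    }

    // One pass over all values.
    static double singlePass(double[] values) {
        return sum(values, 0, values.length);
    }

    // Two partial sums, split at the given index, combined afterwards.
    static double splitAt(double[] values, int split) {
        return sum(values, 0, split) + sum(values, split, values.length);
    }

    public static void main(String[] args) {
        double[] values = {0.1, 0.2, 0.3};
        System.out.println("single pass: " + singlePass(values));
        System.out.println("split sums:  " + splitAt(values, 1));
    }
}
```

The two results differ only in the last bits, which is consistent with keeping a strict assertion for the forced single-aggregator case and a looser tolerance otherwise.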
@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
Labels
:Analytics/Aggregations Aggregations >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test Issues or PRs that are addressing/adding tests v7.10.0 v8.0.0-alpha1
5 participants