Pipeline aggregations: Ability to perform computations on aggregations #10568
Conversation
Sub-classes of InternalAggregation now implement doReduce(ReduceContext), which is called from InternalAggregation.reduce(ReduceContext); reduce(ReduceContext) is itself now final.
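A rough sketch of that split, with heavily simplified signatures (this is not the actual Elasticsearch source; the real ReduceContext carries considerably more state):

```java
// Minimal sketch of the final reduce() / abstract doReduce() split described above.
// Simplified signatures; not the actual Elasticsearch source.
public abstract class InternalAggregation {

    /** Simplified stand-in for the real ReduceContext. */
    public static class ReduceContext {
        // the real context also carries the shard-level aggregations to merge,
        // big-array allocators, and the reducers to apply afterwards
    }

    /** Final entry point: every merge of shard-level results funnels through here. */
    public final InternalAggregation reduce(ReduceContext reduceContext) {
        // type-specific merging is delegated to the sub-class
        return doReduce(reduceContext);
    }

    /** Sub-classes implement only their own merge logic. */
    protected abstract InternalAggregation doReduce(ReduceContext reduceContext);
}
```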
The list of reducers is fed through from the AggregatorFactory
These reducers will be passed through from the AggregatorParser
Mostly due to @jpountz's leaf collector changes
…add '.value' to the path
…ation will need it
Uses the depth-first algorithm from http://en.wikipedia.org/wiki/Topological_sorting#Algorithms. Needs some cleaning up.
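For reference, a self-contained sketch of that depth-first topological sort over a generic dependency map; the PR applies the same idea to order dependent aggregations/reducers, but this is an illustration rather than the actual implementation:

```java
import java.util.*;

// Depth-first topological sort. Edges point from a node to the nodes it depends on,
// so the returned list places every dependency before its dependents.
class TopologicalSort {

    static <T> List<T> sort(Map<T, List<T>> dependencies) {
        List<T> sorted = new ArrayList<>();    // dependencies first, dependents last
        Set<T> visited = new HashSet<>();      // nodes whose subtree is fully processed
        Set<T> onPath = new HashSet<>();       // nodes on the current DFS path (cycle check)
        for (T node : dependencies.keySet()) {
            visit(node, dependencies, visited, onPath, sorted);
        }
        return sorted;
    }

    private static <T> void visit(T node, Map<T, List<T>> dependencies,
                                  Set<T> visited, Set<T> onPath, List<T> sorted) {
        if (visited.contains(node)) {
            return;                            // already placed in the output
        }
        if (!onPath.add(node)) {
            throw new IllegalArgumentException("cyclic dependency involving " + node);
        }
        for (T dep : dependencies.getOrDefault(node, Collections.<T>emptyList())) {
            visit(dep, dependencies, visited, onPath, sorted);
        }
        onPath.remove(node);
        visited.add(node);
        sorted.add(node);                      // appended only after all its dependencies
    }
}
```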
Most tests have been marked with @AwaitsFix since they require functionality to be implemented before they will pass
Another InternalHistogram instance can be passed into the method along with the buckets and the name; it will be used to set all the other options such as minDocCount, the formatter, the order, etc.
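A generic, self-contained illustration of that prototype-copying idea (simplified stand-in types, not the real InternalHistogram API): the new instance takes the name and buckets it is given and inherits every other option from the prototype.

```java
import java.util.List;

// Simplified stand-in for InternalHistogram, used only to show the prototype pattern.
final class HistogramSketch {
    final String name;
    final List<Long> buckets;      // stand-in for the real bucket objects
    final long minDocCount;        // options carried by the prototype
    final String format;
    final boolean keyedAscending;  // stand-in for the real Order

    HistogramSketch(String name, List<Long> buckets, long minDocCount,
                    String format, boolean keyedAscending) {
        this.name = name;
        this.buckets = buckets;
        this.minDocCount = minDocCount;
        this.format = format;
        this.keyedAscending = keyedAscending;
    }

    // The new name and buckets come from the caller; every other option is copied
    // from 'this', which plays the role of the prototype instance described above.
    HistogramSketch create(String name, List<Long> buckets) {
        return new HistogramSketch(name, buckets, this.minDocCount, this.format, this.keyedAscending);
    }
}
```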
```
@@ -42,8 +44,9 @@
    private IntArray docCounts;

    public BucketsAggregator(String name, AggregatorFactories factories,
            AggregationContext context, Aggregator parent, Map<String, Object> metaData) throws IOException {
        super(name, factories, context, parent, metaData);
            AggregationContext context, Aggregator parent,
```
indentation issue?
Yep
I like the fact that it's not too invasive and makes an effort to avoid side-effects on other aggs by copying the agg trees instead of modifying them in place. Most of my comments are cosmetic; in particular there are lots of indentation issues in aggregator constructors and anonymous LeafCollector definitions. Regarding documentation, I think we need to move general information to the main page about aggs/reducers instead of having it scattered across individual reducers/aggs.
@jpountz I pushed an update and replied to some of your comments. Any chance you could have another look?
It looks to me like there are still some open TODOs about documentation but other than that LGTM. I pushed commits for some indentation issues I found.
I'm working on the larger doc TODOs (e.g. centralize docs about …)
```
# Conflicts:
#	src/main/java/org/elasticsearch/index/query/CommonTermsQueryBuilder.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregationModule.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorFactories.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorParsers.java
#	src/main/java/org/elasticsearch/search/aggregations/InternalMultiBucketAggregation.java
#	src/main/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregator.java
#	src/main/java/org/elasticsearch/search/aggregations/metrics/InternalNumericMetricsAggregation.java
#	src/test/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregatorTest.java
```
To reproduce the failures use `-Dtests.seed=D9EF60095522804F`
```
# Conflicts:
#	src/main/java/org/elasticsearch/search/builder/SearchSourceBuilder.java
```
Adds a new type of aggregation called 'reducers', which act on the output of other aggregations and compute extra information that they add to the aggregation tree. Reducers look much like any other aggregation in the request but have a `buckets_path` parameter which references the aggregation(s) to use. Internally there are two types of reducer: the first is given the output of its parent aggregation and computes new aggregations to add to the buckets of its parent, and the second (a specialisation of the first) is given a sibling aggregation and outputs an aggregation to be a sibling at the same level as that aggregation.
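A purely illustrative sketch of those two reducer shapes; the interface and method names below are hypothetical and are not the classes this PR introduces:

```java
import java.util.List;

// Illustrative stand-in for a reduced aggregation node in the response tree.
interface Aggregation {
    String getName();
}

// First kind: runs against the output of the aggregation it is nested under and
// produces extra aggregations to attach to each of that parent's buckets.
interface ParentReducer {
    List<Aggregation> reducePerBucket(Aggregation parentOutput);
}

// Second kind (a specialisation of the first): reads the sibling aggregation named
// by buckets_path and emits a new aggregation placed at the same level as it.
interface SiblingReducer extends ParentReducer {
    Aggregation reduceSibling(Aggregation siblingOutput);
}
```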
This PR includes the framework for the reducers, the derivative reducer (#9293), the moving average reducer (#10002), and the maximum bucket reducer (#10000). Not all of these reducer implementations are fully complete yet.
Known work left to do (these points will be done once this PR is merged into the master branch):
Contributes to #9876, #10002, #9293, and #10000