Pipeline aggregations: Ability to perform computations on aggregations #10568

Merged: 78 commits merged into master from feature/aggs_2_0 on Apr 29, 2015

Conversation

colings86

Adds a new type of aggregation called a 'reducer', which acts on the output of other aggregations and computes extra information that it adds to the aggregation tree. Reducers look much like any other aggregation in the request, but have a buckets_path parameter that references the aggregation(s) to use.

Internally there are two types of reducer. The first is given the output of its parent aggregation and computes new aggregations to add to the buckets of its parent. The second (a specialisation of the first) is given a sibling aggregation and outputs an aggregation to be a sibling at the same level as that aggregation.
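
To make the two reducer kinds concrete, here is a minimal sketch in plain Java. It is not the code in this PR: `ReducerKindsSketch`, `derivative` and `maxBucket` are hypothetical names, and buckets are reduced to a plain list of metric values. It assumes the derivative is a first difference between consecutive buckets and that the maximum bucket reducer reports the largest value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; these are not the Elasticsearch classes.
class ReducerKindsSketch {

    // Parent-style reducer: produces one new value per bucket of the parent.
    // The first bucket has no predecessor, so its derivative is null.
    static List<Double> derivative(List<Double> bucketValues) {
        List<Double> out = new ArrayList<>();
        Double previous = null;
        for (Double current : bucketValues) {
            if (previous == null) {
                out.add(null);
            } else {
                out.add(current - previous);
            }
            previous = current;
        }
        return out;
    }

    // Sibling-style reducer: collapses all buckets of a sibling aggregation
    // into a single value that sits next to that aggregation.
    static double maxBucket(List<Double> bucketValues) {
        double max = Double.NEGATIVE_INFINITY;
        for (double value : bucketValues) {
            max = Math.max(max, value);
        }
        return max;
    }
}
```

The parent-style method returns as many values as there are buckets, while the sibling-style method returns a single value for the whole series, which mirrors where each result would be attached in the tree.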

This PR includes the framework for the reducers, the derivative reducer (#9293), the moving average reducer (#10002) and the maximum bucket reducer (#10000). Not all of these reducer implementations are complete yet.

Known work left to do (these points will be done once this PR is merged into the master branch):

  • Add x-axis normalisation to the derivative reducer
  • Add lots more JUnit tests for all reducers

Contributes to #9876, #10002, #9293, and #10000

colings86 and others added 30 commits February 12, 2015 13:23
sub-classes of InternalAggregation now implement doReduce(ReduceContext) that is called from InternalAggregation.reduce(ReduceContext) which is now final
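
The reduce/doReduce split described in this commit message is a template method. A rough sketch with hypothetical stand-in classes (`SketchAggregation`, `SketchSum`) follows. Only the final `reduce` calling an overridable `doReduce` comes from the commit message; applying the reducers inside `reduce` is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for InternalAggregation; not the real class.
abstract class SketchAggregation {
    private final List<UnaryOperator<SketchAggregation>> reducers;

    SketchAggregation(List<UnaryOperator<SketchAggregation>> reducers) {
        this.reducers = reducers;
    }

    // final: subclasses cannot bypass the step that runs after doReduce.
    final SketchAggregation reduce(List<SketchAggregation> shardResults) {
        SketchAggregation merged = doReduce(shardResults);
        // Assumption: reducers are applied here, after the merge.
        for (UnaryOperator<SketchAggregation> reducer : reducers) {
            merged = reducer.apply(merged);
        }
        return merged;
    }

    // Each aggregation type only implements its own merge logic.
    protected abstract SketchAggregation doReduce(List<SketchAggregation> shardResults);
}

// Hypothetical metric aggregation that sums per-shard values.
class SketchSum extends SketchAggregation {
    final double value;

    SketchSum(double value, List<UnaryOperator<SketchAggregation>> reducers) {
        super(reducers);
        this.value = value;
    }

    @Override
    protected SketchAggregation doReduce(List<SketchAggregation> shardResults) {
        double total = 0;
        for (SketchAggregation shardResult : shardResults) {
            total += ((SketchSum) shardResult).value;
        }
        return new SketchSum(total, new ArrayList<>());
    }
}
```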
The list of reducers is fed through from the AggregatorFactory
These reducers will be passed through from the AggregatorParser
Mostly due to @jpountz's leaf collector changes
Most tests have been marked with @AwaitsFix since they require functionality to be implemented before they will pass
Another InternalHistogram instance can be passed into the method with the buckets and the name and will be used to set all the options such as minDocCount, formatter, Order etc.
@@ -42,8 +44,9 @@
private IntArray docCounts;

public BucketsAggregator(String name, AggregatorFactories factories,
AggregationContext context, Aggregator parent, Map<String, Object> metaData) throws IOException {
super(name, factories, context, parent, metaData);
AggregationContext context, Aggregator parent,

indentation issue?

Yep

jpountz commented Apr 24, 2015

I like the fact that it's not too invasive and makes efforts to not have side-effects on other aggs by copying the agg trees instead of modifying them in-place.
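
As an illustration of the copy-instead-of-mutate approach mentioned here, the sketch below adds a moving average to copies of the input buckets and leaves the originals untouched. All names are hypothetical (`CopyOnReduceSketch`, `withMovingAverage`), buckets are simplified to maps from aggregation name to value, and the trailing-window average is an assumed model, not necessarily the one the moving average reducer uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; each bucket is a map of aggregation name -> value.
class CopyOnReduceSketch {

    static List<Map<String, Double>> withMovingAverage(
            List<Map<String, Double>> buckets, String sourceKey, String targetKey, int window) {
        List<Map<String, Double>> result = new ArrayList<>();
        for (int i = 0; i < buckets.size(); i++) {
            // Trailing window that ends at bucket i (shorter at the start).
            int from = Math.max(0, i - window + 1);
            double sum = 0;
            for (int j = from; j <= i; j++) {
                sum += buckets.get(j).get(sourceKey);
            }
            // Copy the bucket before adding to it, so the input is not modified.
            Map<String, Double> copy = new LinkedHashMap<>(buckets.get(i));
            copy.put(targetKey, sum / (i - from + 1));
            result.add(Collections.unmodifiableMap(copy));
        }
        return result;
    }
}
```

Because the input list and its maps are never written to, other aggregations holding references to the same buckets would not observe the added values.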

Most of my comments are cosmetic; in particular, there are lots of indentation issues in aggregator constructors and anonymous LeafCollector definitions.

Regarding documentation, I think we need to move general information to the main page about aggs/reducers instead of having it scattered across individual reducers/aggs.

@colings86

@jpountz I pushed an update and replied to some of your comments. Any chance you could have another look?

jpountz commented Apr 29, 2015

It looks to me like there are still some open TODOs about documentation but other than that LGTM. I pushed commits for some indentation issues I found.

@polyfractal

I'm working on the larger doc TODOs (e.g. centralizing the docs about _count, buckets_path and path syntax, gap_policy, etc.). I don't think that should block merging this, though, especially since I'm restructuring part of the aggs docs as a whole and want some eyeballs on it. I don't want to hold up this PR.

# Conflicts:
#	src/main/java/org/elasticsearch/index/query/CommonTermsQueryBuilder.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregationModule.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorFactories.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorParsers.java
#	src/main/java/org/elasticsearch/search/aggregations/InternalMultiBucketAggregation.java
#	src/main/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregator.java
#	src/main/java/org/elasticsearch/search/aggregations/metrics/InternalNumericMetricsAggregation.java
#	src/test/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregatorTest.java
To reproduce the failures use `-Dtests.seed=D9EF60095522804F`
# Conflicts:
#	src/main/java/org/elasticsearch/search/builder/SearchSourceBuilder.java
@colings86 colings86 merged commit 0589adb into master Apr 29, 2015
@kevinkluge kevinkluge removed the review label Apr 29, 2015
@colings86 colings86 deleted the feature/aggs_2_0 branch May 19, 2015 14:55
@clintongormley clintongormley changed the title from "Ability to perform computations on aggregations" to "Pipeline aggregations: Ability to perform computations on aggregations" Jun 6, 2015