Pipeline aggregations: Ability to perform computations on aggregations #10568

colings86 · 2015-04-13T14:09:49Z

Adds a new type of aggregation called 'reducers' which act on the output of aggregations and compute extra information that they add to the aggregation tree. Reducers look much like any other aggregation in the request but have a buckets_path parameter which references the aggregation(s) to use.

Internally there are two types of reducer; the first is given the output of its parent aggregation and computes new aggregations to add to the buckets of its parent, and the second (a specialisation of the first) is given a sibling aggregation and outputs an aggregation to be a sibling at the same level as that aggregation.

This PR includes the framework for the reducers, the derivative reducer (#9293), the moving average reducer (#10002) and the maximum bucket reducer (#10000). Not all of these reducer implementations are complete yet.
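To make the distinction between the two reducer kinds concrete, here is a minimal sketch that works on plain lists of bucket values rather than on the classes in this PR. `ReducerSketch`, `derivative` and `maxBucket` are hypothetical names chosen for illustration, and returning `null` for the first bucket is an assumption, not something the PR text specifies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: these names are not taken from the PR.
final class ReducerSketch {

    // Parent-style reducer: yields one extra value per bucket of its parent.
    // Assumption: the first bucket has no predecessor, so its entry is null.
    static List<Double> derivative(List<Double> bucketValues) {
        List<Double> result = new ArrayList<>();
        for (int i = 0; i < bucketValues.size(); i++) {
            if (i == 0) {
                result.add(null);
            } else {
                result.add(bucketValues.get(i) - bucketValues.get(i - 1));
            }
        }
        return result;
    }

    // Sibling-style reducer: collapses all buckets of a sibling aggregation
    // into a single value that sits next to that aggregation.
    static double maxBucket(List<Double> bucketValues) {
        double max = Double.NEGATIVE_INFINITY;
        for (double value : bucketValues) {
            max = Math.max(max, value);
        }
        return max;
    }
}
```

The first method returns a value for every bucket, which is what lets a parent-style reducer attach results inside the buckets; the second returns one value for the whole series, which is why its output lives at the same level as the aggregation it reads.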

Known work left to do (these points will be done once this PR is merged into the master branch):

Add x-axis normalisation to the derivative reducer
Add lots more JUnit tests for all reducers

Contributes to #9876, #10002, #9293, and #10000

jpountz · 2015-04-24T08:00:20Z

src/main/java/org/elasticsearch/search/aggregations/bucket/BucketsAggregator.java

@@ -42,8 +44,9 @@
    private IntArray docCounts;

    public BucketsAggregator(String name, AggregatorFactories factories,
-                             AggregationContext context, Aggregator parent, Map<String, Object> metaData) throws IOException {
-        super(name, factories, context, parent, metaData);
+ AggregationContext context, Aggregator parent,


indentation issue?

jpountz · 2015-04-24T08:32:49Z

I like the fact that it's not too invasive and makes efforts to not have side-effects on other aggs by copying the agg trees instead of modifying them in-place.

Most of my comments are cosmetic; in particular, there are lots of indentation issues in aggregator constructors and anonymous LeafCollector definitions.

Regarding documentation, I think we need to move general information to the main page about aggs/reducers instead of having it scattered across individual reducers/aggs.

colings86 · 2015-04-29T08:33:28Z

@jpountz I pushed an update and replied to some of your comments. Any chance you could have another look?

jpountz · 2015-04-29T13:14:46Z

It looks to me like there are still some open TODOs about documentation but other than that LGTM. I pushed commits for some indentation issues I found.

polyfractal · 2015-04-29T13:24:28Z

I'm working on the larger doc TODOs (e.g. centralize docs about _count, buckets_path and path syntax, gap_policy, etc). I don't think it should block merging this though, especially since I'm restructuring part of the aggs docs as a whole and want some eyeballs...don't want to hold up this PR.

# Conflicts:
# src/main/java/org/elasticsearch/index/query/CommonTermsQueryBuilder.java
# src/main/java/org/elasticsearch/search/aggregations/AggregationModule.java
# src/main/java/org/elasticsearch/search/aggregations/AggregatorFactories.java
# src/main/java/org/elasticsearch/search/aggregations/AggregatorParsers.java
# src/main/java/org/elasticsearch/search/aggregations/InternalMultiBucketAggregation.java
# src/main/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregator.java
# src/main/java/org/elasticsearch/search/aggregations/metrics/InternalNumericMetricsAggregation.java
# src/test/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregatorTest.java

colings86 and others added 30 commits February 12, 2015 13:23

make InternalAggregation.reduce(ReduceContext) use template pattern

e2949d7

sub-classes of InternalAggregation now implement doReduce(ReduceContext) that is called from InternalAggregation.reduce(ReduceContext) which is now final
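A rough sketch of the template pattern this commit describes, under simplifying assumptions: `SketchAggregation` and `SketchSum` are hypothetical stand-ins, and the list of shard results takes the place of the real `ReduceContext`, whose contents are not shown in this PR page.

```java
import java.util.List;

// Hypothetical stand-in for InternalAggregation; not the real class.
abstract class SketchAggregation {

    // final: every subclass goes through this entry point, so any shared
    // step added here later applies to all aggregations uniformly.
    public final SketchAggregation reduce(List<? extends SketchAggregation> shardResults) {
        SketchAggregation reduced = doReduce(shardResults);
        // Shared post-processing would be placed here.
        return reduced;
    }

    // Subclasses supply only the type-specific merge logic.
    protected abstract SketchAggregation doReduce(List<? extends SketchAggregation> shardResults);
}

// Hypothetical single-value aggregation that merges by summing.
final class SketchSum extends SketchAggregation {
    final double value;

    SketchSum(double value) {
        this.value = value;
    }

    @Override
    protected SketchAggregation doReduce(List<? extends SketchAggregation> shardResults) {
        double total = 0;
        for (SketchAggregation aggregation : shardResults) {
            total += ((SketchSum) aggregation).value;
        }
        return new SketchSum(total);
    }
}
```

Making the public method `final` is the design choice that matters here: callers keep using `reduce`, while subclasses can only customise the `doReduce` step.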

Adds reducers list to InternalAggregation.reduce()

c60bb4d

The list of reducers is fed through from the AggregatorFactory

AggregatorFactories now stores reducers as well as aggregators

ae76239

These reducers will be passed through from the AggregatorParser

Reducers are now parsed in AggregatorParsers

1e947c8

Reducers are now wired end-to-end into the agg framework

55b82db

Basic derivative reducer

9cfa6c6

Fixing compile issues after rebase with master

d65e9a4

Mostly due to @jpountz's leaf collector changes

fix to the name of the injected aggregation for derivatives

ef4a910

Minor indentation/validation fix in AggregatorParsers.

f00a9b8

derivative reducer now works with both date_histogram and histogram

3a77754

can now reference single value metrics directly instead of having to add '.value' to the path

9805b83

Can now specify a format for the returned derivative values

0f22d7e

Validation of the reducer factories is now called from within the AggregatorFactories

18c2cb6

bucketsPath is now in the Reducer class since every Reducer implementation will need it

9357fc4

Merge branch 'master' into feature/aggs_2_0

63f3281

First (rough) pass at dependency resolution for reducers

3ab3ffa

Uses the depth-first algorithm described at http://en.wikipedia.org/wiki/Topological_sorting#Algorithms (needs some cleaning up)
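For readers unfamiliar with the depth-first approach referenced in this commit, here is a minimal sketch of how reducers could be ordered so that each one runs after the things it reads from. It is not the code from the PR: `ReducerOrderSketch`, `resolveOrder` and the name-to-dependencies map are hypothetical, and raising `IllegalStateException` on a cycle is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of depth-first topological sorting.
final class ReducerOrderSketch {

    // dependsOn maps a name to the names it reads from. Names that appear
    // only as dependencies are treated as having no dependencies themselves.
    static List<String> resolveOrder(Map<String, List<String>> dependsOn) {
        List<String> ordered = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String name : dependsOn.keySet()) {
            visit(name, dependsOn, done, inProgress, ordered);
        }
        return ordered;
    }

    private static void visit(String name, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress, List<String> ordered) {
        if (done.contains(name)) {
            return;
        }
        // Reaching a node that is still being visited means there is a cycle.
        if (!inProgress.add(name)) {
            throw new IllegalStateException("Cyclic dependency involving [" + name + "]");
        }
        List<String> dependencies = dependsOn.get(name);
        if (dependencies != null) {
            for (String dependency : dependencies) {
                visit(dependency, dependsOn, done, inProgress, ordered);
            }
        }
        inProgress.remove(name);
        done.add(name);
        // Appended only after all dependencies, so dependencies come first.
        ordered.add(name);
    }
}
```

Because a name is appended only once all of its dependencies have been appended, the resulting list can be processed front to back regardless of the order in which the names were declared.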

getProperty method in the aggregations framework now throws a specific exception

f20dae8

Derivative Reducer now supports nth order derivatives

58f2cec

removed obsolete NOCOMMIT and leftover sysout call

247b6a7

Added Builder classes for Reducers

e994044

Added Builder for Derivatives Reducer

c97dd84

More update to support Reducer Builders

511e275

Tests for derivative reducer

f68bce5

Most tests have been marked with @AwaitsFix since they require functionality to be implemented before they will pass

InternalHistogram.Factory.create() can now work from prototype

269d4bc

Another InternalHistogram instance can be passed into the method with the buckets and the name and will be used to set all the options such as minDocCount, formatter, Order etc.

DerivativeReducer now copies histogram options from old histogram instance

19cdfe2

Added support for _count and _key as bucketsPaths

3375c02
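As a rough illustration of what resolving a buckets path against a single bucket might look like, the sketch below handles the `_count` and `_key` paths added in this commit, plus the case from an earlier commit where a single-value metric can be referenced with or without a trailing `.value`. `SketchBucket` and `resolve` are hypothetical names, the bucket is reduced to a numeric key, a document count and a map of metric values, and the real path syntax is likely richer than this.

```java
import java.util.Map;

// Hypothetical simplified bucket; not a class from the PR.
final class SketchBucket {
    private static final String VALUE_SUFFIX = ".value";

    final double key;
    final long docCount;
    final Map<String, Double> metrics;

    SketchBucket(double key, long docCount, Map<String, Double> metrics) {
        this.key = key;
        this.docCount = docCount;
        this.metrics = metrics;
    }

    // Resolves a path to a number taken from this bucket.
    double resolve(String path) {
        if ("_count".equals(path)) {
            return docCount;
        }
        if ("_key".equals(path)) {
            return key;
        }
        // Accept both "metric" and "metric.value" for single-value metrics.
        String metricName = path.endsWith(VALUE_SUFFIX)
                ? path.substring(0, path.length() - VALUE_SUFFIX.length())
                : path;
        Double value = metrics.get(metricName);
        if (value == null) {
            throw new IllegalArgumentException("No metric [" + metricName + "] in bucket");
        }
        return value;
    }
}
```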

updated derivative tests to test _count

6c12cfd

Cleaning up NOCOMMITs which are resolved

f03fe5b

Cleaning up NOCOMMITs

7f84466

Added test for second_derivative

5a2c4ab

jpountz reviewed Apr 24, 2015
polyfractal and others added 4 commits April 24, 2015 22:38

Rename helpers to follow naming conventions

26189ee

review comment fixes

31f26ec

review comment fixes

935144a

[DOCS] review comment fixes

bf9739d

jpountz added 2 commits April 29, 2015 15:06

Fix some indentation issues.

891dfee

Other indentation fixes

ccca038

colings86 added 5 commits April 29, 2015 14:55

fixed issue with aggs in percolation request for 1 shard

3bb8ff2

Muted intermittently failing tests

a33e77f

To reproduce the failures use `-Dtests.seed=D9EF60095522804F`

Merge branch 'master' into feature/aggs_2_0

88aa893

Merge branch 'master' into feature/aggs_2_0

0589adb

# Conflicts: # src/main/java/org/elasticsearch/search/builder/SearchSourceBuilder.java

colings86 merged commit 0589adb into master Apr 29, 2015

kevinkluge removed the review label Apr 29, 2015

colings86 deleted the feature/aggs_2_0 branch May 19, 2015 14:55

clintongormley changed the title ~~Ability to perform computations on aggregations~~ Pipeline aggregations: Ability to perform computations on aggregations Jun 6, 2015

eskibars mentioned this pull request Aug 27, 2015

Histogram Forecast elastic/kibana#1381

Closed

colings86 mentioned this pull request Aug 4, 2016

Should we remove/modify some of the experiment tags in the documentation #19798

Closed
