Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rate aggregation #61369

Merged
merged 6 commits into from
Aug 25, 2020
Merged

Add rate aggregation #61369

merged 6 commits into from
Aug 25, 2020

Conversation

imotov
Copy link
Contributor

@imotov imotov commented Aug 20, 2020

Adds a new rate aggregation that can calculate a document rate for buckets
of a date_histogram.

Closes #60674

Adds a new rate aggregation that can calculate a document rate for buckets
of a date_histogram.

Closes elastic#60674
Copy link
Member

@not-napoleon not-napoleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM. The nits are ignorable, but the comments about null values source/create unmapped, and about supporting DATE values should be addressed before merging. Thanks!

MINUTES_OF_HOUR((byte) 7, ChronoField.MINUTE_OF_HOUR) {
final long unitMillis = ChronoField.MINUTE_OF_HOUR.getBaseUnit().getDuration().toMillis();
MINUTES_OF_HOUR((byte) 7, "minute", ChronoField.MINUTE_OF_HOUR, true,
ChronoField.MINUTE_OF_HOUR.getBaseUnit().getDuration().toMillis()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Since we're touching this anyway, we could move to the new formatting standard (which would put each param on a new line here)


public class InternalRate extends InternalNumericMetricsAggregation.SingleValue implements Rate {
final double sum;
final double divider;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think the technical term is divisor not divider.

}

@Override
protected RateAggregatorFactory innerBuild(QueryShardContext queryShardContext, ValuesSourceConfig config,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Formatting. This is a new file, should just run the formatter on the whole thing.

) throws IOException {
super(name, context, parent, metadata);
// TODO: stop expecting nulls here
this.valuesSource = valuesSourceConfig.hasValues() ? (ValuesSource.Numeric) valuesSourceConfig.getValuesSource() : null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be using this work-around in new aggregators. Instead of using a null here, RateAggregatorFactory#createUnmapped should return an aggregator that uses a NO_OP_COLLECTOR, and we should rely on valuesSource being not null in this aggregator.


static void registerAggregators(ValuesSourceRegistry.Builder builder) {
builder.register(RateAggregationBuilder.REGISTRY_KEY,
List.of(CoreValuesSourceType.NUMERIC, CoreValuesSourceType.DATE, CoreValuesSourceType.BOOLEAN),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a use case for running Rate over a Date field? I'm having a hard time imagining one. If there is a use case for it, let's leave a comment so we remember why we want it, and if not let's not support passing in Date fields. It's much easier to add support later if we find a use case than it is to remove support for a supported data type, even if it only generates nonsense results.

@imotov
Copy link
Contributor Author

imotov commented Aug 20, 2020

@elasticmachine run elasticsearch-ci/packaging-sample-windows

@imotov imotov requested a review from csoulios August 20, 2020 21:46
@imotov imotov removed the WIP label Aug 20, 2020
@imotov imotov marked this pull request as ready for review August 20, 2020 23:07
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Aug 20, 2020
Copy link
Contributor

@csoulios csoulios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

As long as the bucket doc_count is retrieved using the BucketsAggregator#getDocCounts() it will seamlessly work with aggregate doc counts!

@imotov
Copy link
Contributor Author

imotov commented Aug 25, 2020

@elasticmachine update branch

@imotov
Copy link
Contributor Author

imotov commented Aug 25, 2020

@elasticmachine run elasticsearch-ci/packaging-sample-windows

@imotov imotov merged commit f107dba into elastic:master Aug 25, 2020
imotov added a commit that referenced this pull request Aug 25, 2020
Adds a new rate aggregation that can calculate a document rate for buckets
of a date_histogram.

Closes #60674
@imotov imotov deleted the issue-60674-add-rate-agg branch August 25, 2020 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations >feature Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v7.10.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add new rate aggregation
5 participants