Add rate aggregation #61369
Adds a new rate aggregation that can calculate a document rate for buckets of a date_histogram. Closes elastic#60674
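As a rough illustration of what "a document rate for buckets" means, the arithmetic can be sketched as follows. This is only a sketch under assumptions: `RateSketch` and its `rate` method are hypothetical names, not classes from this PR, and it assumes the rate is the bucket's document count divided by the number of rate units that fit into one bucket.

```java
import java.time.Duration;

// Hypothetical helper, not part of the PR: it only illustrates the arithmetic.
final class RateSketch {
    private RateSketch() {}

    /**
     * Converts a bucket's document count into a rate per unit.
     * The divisor is the number of units that fit into one bucket
     * (assumed fixed-length here; calendar intervals vary in length).
     */
    static double rate(long docCount, Duration bucketInterval, Duration unit) {
        if (unit.isZero() || unit.isNegative()) {
            throw new IllegalArgumentException("unit must be positive");
        }
        double divisor = (double) bucketInterval.toMillis() / unit.toMillis();
        return docCount / divisor;
    }
}
```

For example, 120 documents in a one-hour bucket with a unit of one minute gives a divisor of 60.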
Overall, LGTM. The nits are ignorable, but the comments about null values source/create unmapped, and about supporting DATE values should be addressed before merging. Thanks!
MINUTES_OF_HOUR((byte) 7, ChronoField.MINUTE_OF_HOUR) {
final long unitMillis = ChronoField.MINUTE_OF_HOUR.getBaseUnit().getDuration().toMillis();
MINUTES_OF_HOUR((byte) 7, "minute", ChronoField.MINUTE_OF_HOUR, true,
ChronoField.MINUTE_OF_HOUR.getBaseUnit().getDuration().toMillis()) {
Nit: Since we're touching this anyway, we could move to the new formatting standard (which would put each param on a new line here)
public class InternalRate extends InternalNumericMetricsAggregation.SingleValue implements Rate {
final double sum;
final double divider;
Nit: I think the technical term is divisor, not divider.
...n/analytics/src/main/java/org/elasticsearch/xpack/analytics/rate/RateAggregationBuilder.java
}

@Override
protected RateAggregatorFactory innerBuild(QueryShardContext queryShardContext, ValuesSourceConfig config,
Nit: Formatting. This is a new file, should just run the formatter on the whole thing.
) throws IOException {
super(name, context, parent, metadata);
// TODO: stop expecting nulls here
this.valuesSource = valuesSourceConfig.hasValues() ? (ValuesSource.Numeric) valuesSourceConfig.getValuesSource() : null;
We shouldn't be using this work-around in new aggregators. Instead of using a null here, RateAggregatorFactory#createUnmapped should return an aggregator that uses a NO_OP_COLLECTOR, and we should rely on valuesSource being not null in this aggregator.
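The suggestion above is essentially the null-object pattern. A minimal sketch, assuming simplified stand-in types: every interface and class below (`SketchCollector`, `SketchAggregator`, `MappedAggregator`, `UnmappedAggregator`, `SketchAggregatorFactory`) is hypothetical and is not an Elasticsearch API; only the names `createUnmapped` and `NO_OP_COLLECTOR` are borrowed from the comment.

```java
import java.util.Objects;

// Hypothetical stand-in for a per-document collector.
interface SketchCollector {
    void collect(int doc);
}

// Hypothetical stand-in for an aggregator.
interface SketchAggregator {
    SketchCollector collector();
    double result();
}

// Used when the field is mapped: the values are required to be non-null.
final class MappedAggregator implements SketchAggregator {
    private final double[] values;
    private double sum;

    MappedAggregator(double[] values) {
        this.values = Objects.requireNonNull(values, "values");
    }

    @Override
    public SketchCollector collector() {
        return doc -> sum += values[doc];
    }

    @Override
    public double result() {
        return sum;
    }
}

// Used when the field is unmapped: collects nothing, so no null checks are needed elsewhere.
final class UnmappedAggregator implements SketchAggregator {
    static final SketchCollector NO_OP_COLLECTOR = doc -> { };

    @Override
    public SketchCollector collector() {
        return NO_OP_COLLECTOR;
    }

    @Override
    public double result() {
        return 0.0;
    }
}

final class SketchAggregatorFactory {
    private SketchAggregatorFactory() {}

    static SketchAggregator create(double[] values) {
        return new MappedAggregator(values);
    }

    static SketchAggregator createUnmapped() {
        return new UnmappedAggregator();
    }
}
```

With this split, the mapped aggregator never branches on a missing values source, and the unmapped case is handled entirely by the factory.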
static void registerAggregators(ValuesSourceRegistry.Builder builder) {
builder.register(RateAggregationBuilder.REGISTRY_KEY,
List.of(CoreValuesSourceType.NUMERIC, CoreValuesSourceType.DATE, CoreValuesSourceType.BOOLEAN),
Is there a use case for running Rate over a Date field? I'm having a hard time imagining one. If there is a use case for it, let's leave a comment so we remember why we want it, and if not let's not support passing in Date fields. It's much easier to add support later if we find a use case than it is to remove support for a supported data type, even if it only generates nonsense results.
@elasticmachine run elasticsearch-ci/packaging-sample-windows
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
LGTM!
As long as the bucket doc_count is retrieved using BucketsAggregator#getDocCounts(), it will seamlessly work with aggregate doc counts!
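To illustrate the point about aggregate doc counts, here is a small sketch. It assumes that a single indexed document may stand for several pre-aggregated originals, so the rate should be derived from the bucket's stored count rather than from the number of collect calls. `DocCountSketch` and its methods are hypothetical and do not reproduce the BucketsAggregator API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a bucket aggregator's per-bucket counters.
final class DocCountSketch {
    private final Map<Long, Long> docCounts = new HashMap<>();

    /** Adds a document to a bucket; a pre-aggregated document carries a count greater than one. */
    void collect(long bucket, long docCount) {
        docCounts.merge(bucket, docCount, Long::sum);
    }

    /** Returns the accumulated count for a bucket, or zero if nothing was collected. */
    long getDocCount(long bucket) {
        return docCounts.getOrDefault(bucket, 0L);
    }

    /** Derives the rate from the stored bucket count, not from the number of collect() calls. */
    double rate(long bucket, double divisor) {
        return getDocCount(bucket) / divisor;
    }
}
```

Here two collect calls with counts 1 and 5 yield a bucket count of 6, which is what the rate is computed from.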
@elasticmachine update branch
@elasticmachine run elasticsearch-ci/packaging-sample-windows
Adds a new rate aggregation that can calculate a document rate for buckets
of a date_histogram.
Closes #60674