Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hasValue() aggregation inspection helpers #36020

Merged
merged 18 commits into from
Jan 22, 2019

Conversation

polyfractal
Copy link
Contributor

This allows consumers of any InternalAggregation to easily determine if the agg "has a value". This is needed because InternalAggs represent "empty" in different manners according to convention. Some use NaN, +/- Inf, 0.0, etc.

The XContent deals with all these nuances and converts them to null, but internal consumers of the objects are stuck decoding the conventions themselves.

By adding hasValue(), the conventions can be handled internally and a simple interface is given for anything using InternalAggs.

The definition of "having a value" varies depending on agg, but generally:

  • If it is a metric agg, the agg has collected at least one value
  • If it is a bucketing agg, at least one of the buckets had doc_count > 0
  • Pipeline aggs behavior is variable and depends on the particular agg. E.g. derivatives always have a value because they are only added to the agg response if a value can be calculated. In contrast, a cumulative_sum will emit 0 even if no values are collected, so hasValue() depends on if any of the buckets it processes have a value themselves.

So big :(

This PR turned out quite large... we have a lot of aggs apparently. About half is just adding assertions to tests. If it's too unmanageable I suppose I could add a concrete placeholder in InternalAggregation and then implement aggs in chunks? Open to suggestions.

Didn't realize it would turn out so large when I started :)

Things not addressed in this PR

There are places where we can use the new hasValue() internally to deduplicate some code. This PR was already large (albeit simple changes), so I think we can tidy those up in followup PRs.

This also doesn't introduce any sort of hasValue() in script contexts, so scripts are unable to determine if the values they are receiving are real values or just empty placeholders. That can also be done in a followup.

Related to #34903

/cc @costin

This allows consumers of any InternalAggregation to easily determine
if the agg "has a value".  This is needed because InternalAggs represent
"empty" in different manners according to convention.  Some use NaN,
+/- Inf, 0.0, etc.

The XContent deals with all these nuances and converts them to `null`,
but internal consumers of the objects are stuck decoding the conventions
themselves.

By adding `hasValue()`, the conventions can be handled internally
and a simple interface is given for anything using InternalAggs.

The definition of "having a value" varies depending on agg,
but generally:
 - If it is a metric agg, the agg has collected at least one value
 - If it is a bucketing agg, at least one of the buckets had
   doc_count > 0
 - Pipeline aggs behavior is variable and depends on the particular agg.
   E.g. derivatives always have a value because they are only added to
   the agg response if a value can be calculated. In contrast, a
   cumulative_sum will emit 0 even if no values are collected, so
   hasValue() depends on if any of the buckets it processes have a
   value themselves.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up @polyfractal ! Fwiw, LGTM!
Cheers!

@@ -150,6 +150,11 @@ MatrixStatsResults getResults() {
return results;
}

@Override
public boolean hasValue() {
return (getDocCount() == 0 && getStats() == null && getResults() == null) == false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe only results is needed since stats is just used internally for the results so the above might be simplified to:
getResults() != null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

@Override
public boolean hasValue() {
// TODO this could be incorrect... e.g. +1 + -1
// Think we'll have to serialize count if we want to fix this...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea which might impact backwards compatibility would be to use an out of range value (Infinity or NaN) and check that instead of the count.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just took a look and that turns out to be tricky (which is a shame, after you mentioned it I agreed that would be best). Sum agg and a few others use Kahan summation for improved numerical stability... but part of that explicitly handles when the sum goes non-finite as part of the compensation scheme. So I think we're stuck with recording counts like the Avg does.

@polyfractal
Copy link
Contributor Author

Jenkins, run the gradle build tests 1

@jpountz
Copy link
Contributor

jpountz commented Dec 4, 2018

Should we address this issue by returning null instead, like the JSON response does?

This used to be a bit tricky backward-compatibility-wise, but as users are progressively migrating to the REST clients, they are now getting nulls anyway and changes to the transport client are getting easier?

@polyfractal
Copy link
Contributor Author

If we're ok with the bwc break that'd probably be cleaner. It'd be a fairly widespread break for any remaining TC users though.

I don't have a strong opinion either way.

@jpountz
Copy link
Contributor

jpountz commented Dec 5, 2018

As much as I rather like cleaning up tech debt sooner than later, maybe the easier path is to change responses to null once we expect users to have completed their migration to the REST clients, eg. maybe for 8.0?

@polyfractal
Copy link
Contributor Author

We chatted about this some more over slack. The main concern is adding to the complexity and technical debt to the agg framework (which is already sizable in places) for a short-ish term fix.

@costin would this work for the SQL team:

  • Fix responses to properly return null or a value in 8.0. It'll still be a bwc break for users, but an easier story since we'll have the HLRC and users should be moving to it anyway.

  • For the duration of 6.x and 7.x, we add a static helper class that takes any Internal* agg object and tells you if it has a value or not.

    Basically all the hasValue() logic condensed into a single helper class. The analytics team can be responsible for keeping it updated and the SQL team can just import/use the helper where needed. This contains the "fix" to a single class which will be easy to cleanup when the agg framework is refactored in 8.0.

?

@costin
Copy link
Member

costin commented Dec 5, 2018

Sounds good. Thanks!

@polyfractal
Copy link
Contributor Author

Coolio. I'll fixup this PR (hopefully much reduced in size) and give you both a ping once the new version is ready for review :)

@polyfractal
Copy link
Contributor Author

Apologies for the delay. I removed the hasValue() method from InternalAggs, and instead created a set of "inspection helper" classes that have a bunch of hasValue() static functions. It's a little more unwieldy, but not too bad.

The main difference is that pipeline aggs are strictly worse now. Most of the pipeline aggs share InternalSimpleValue for their result, which means there's no way for a helper to know what generated the value and what the convention for "empty" is (in the previous iteration we overloaded hasValue() with an anonymous class).

I think we can improve this in the future, but in interest of keeping this PR as small as possible I just went with the coarsest definition of having value (not-NaN basically)

@costin
Copy link
Member

costin commented Dec 13, 2018

Thanks.
It's a step in the right direction and we can iterate in the following releases. For example, it would be nice to have one class that handles hasValue (I thought that was the previous proposal) as oppose to one class per package which forces the consumer to do a bunch of instanceof checks to figure out the internal agg type (and thus be tied to it) instead of just delegating to InternalInspectorHelper which already is aware of them. This would also make things more refactoring friendly since the internal implementation can change without impact to the user while currently one depends on MatrixAggregationInspectionHelper, etc.. and thus is highly impacted by any change to any of them.

@polyfractal
Copy link
Contributor Author

The issue is that Join and MatrixStats are in different plugins, so we can't reference them from a helper in core without adding them as a dependency. And I don't believe there's any central location that has all of them together in terms of dependencies...

:(

@costin
Copy link
Member

costin commented Dec 13, 2018

I see - then the smaller the number of Helper classes the better.

@polyfractal
Copy link
Contributor Author

We could also just move the helper into SQL itself, now that it's written. The three main helpers could be combined (Core, Join, MatrixStats), and the only one that needs to stay in core is the Metrics helper because that access package-private state.

But it's not meant to be consumed anyway, the main AggregationInspectionHelper delegates to MetricsInspectionHelper where needed.

@costin
Copy link
Member

costin commented Dec 14, 2018

That works too - we can do that as a separate PR to not postpone this one.

@jasontedor jasontedor added v6.7.0 and removed v6.6.0 labels Dec 19, 2018
@polyfractal
Copy link
Contributor Author

Jenkins, run the gradle build tests 1

@polyfractal polyfractal requested a review from talevy January 11, 2019 16:04
Copy link
Contributor

@talevy talevy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM, but left a question

Also awaits a matrix test because it fails if results are reduced.
Variances become different, unclear if this is a problem with the
test helper class ("MultiPassStats") or with incremental reduction
of matrix stats.
@polyfractal
Copy link
Contributor Author

Good call, thanks @talevy. Added uses of the helpers in the Matrix and Join tests.

Discovered that the MatrixStats test fails when it reduces due to mismatched variances. I investigated briefly but couldn't tell if this was due to the test-specific class (MultiPassStats) is doing something incorrectly, or if there's a bug with incremental reduction of the matrix stats. The agg uses an odd layout of temporary classes internally too, so I wasn't able to quickly determine what's going on.

Muted the test, will create an issue to investigate and unmute. Didn't want to block this PR on fixing that test though.

@polyfractal
Copy link
Contributor Author

@elasticmachine test this please

@polyfractal
Copy link
Contributor Author

@elasticmachine run the gradle build tests 1
@elasticmachine run the gradle build tests 2

Copy link
Contributor

@talevy talevy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if bugUrl is updated and CI is happy

@polyfractal
Copy link
Contributor Author

@elasticmachine run the gradle build tests 1
@elasticmachine run the gradle build tests 2

2 similar comments
@polyfractal
Copy link
Contributor Author

@elasticmachine run the gradle build tests 1
@elasticmachine run the gradle build tests 2

@polyfractal
Copy link
Contributor Author

@elasticmachine run the gradle build tests 1
@elasticmachine run the gradle build tests 2

@polyfractal polyfractal merged commit 2ba9e36 into elastic:master Jan 22, 2019
@polyfractal polyfractal changed the title Add hasValue() property to InternalAggregation Add hasValue() aggregation inspection helpers Jun 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants