Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Test] Convert Pipeline agg integration tests into AggregatorTestCase #36015

Open
polyfractal opened this issue Nov 28, 2018 · 6 comments
Open
Labels
:Analytics/Aggregations Aggregations help wanted adoptme >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test Issues or PRs that are addressing/adding tests

Comments

@polyfractal
Copy link
Contributor

Pipeline aggs were added before the AggregatorTestCase was created, so most of the pipelines are tested with a few unit tests and a lot of integration tests. We should be able to convert these over to AggregatorTestCase, which should be faster and generally easier to test against.

@polyfractal polyfractal added >non-issue >test Issues or PRs that are addressing/adding tests help wanted adoptme :Analytics/Aggregations Aggregations labels Nov 28, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo

@mirkojotic
Copy link
Contributor

Hey @polyfractal I'm interested in taking this one on. Can you give me some more details around this task - maybe a file or two ;)

@polyfractal
Copy link
Contributor Author

Sure! Thanks for offering to help out!

IIRC I filed this issue while working on #29641. In that PR, I removed CumulativeSumIT and replaced it with CumulativeSumAggregatorTests.

CumulativeSumAggregatorTests extends AggregatorTestCase which is somewhere between an integration test and a unit test. It basically allows tests to index a set of documents into a test Lucene index, then run searches/aggregations on that index and test the results (including multiple reductions to simulate multiple nodes/shards).

In contrast, all the tests that end in "IT" here extend ESIntegTestCase, which spins up one or more actual ES nodes, indexes the documents and runs the tests. Spinning up the whole node + waiting for nodes to discover eachother, etc takes a long time.

In most cases (I think!) we can convert directly from the IT to the AggregatorTestCase version without any loss of testing functionality.

I'd recommend doing individual pipeline aggs one at a time (one per PR), which will make your life easier since you aren't on the hook for all the aggs, and easier for us to review.

Thanks again! Feel free to ping me if you have questions, or you can open a WIP PR and we can iterate there. ❤️

@mirkojotic
Copy link
Contributor

mirkojotic commented Mar 19, 2019

Hi @polyfractal, I was looking to convert a couple of other IT to extend AggregatorTestCase but I have a few questions.
Remaining ITs include a lot of sibling aggs. Can you point me to an example on how I would assemble such aggregation.
A small example of what I'm talking about:
Currently in the IT's there are a lot of
client.addAggregation(...) which if I'm not wrong adds a sibling aggregation.
The new api exposes agg.subAggregation(...) and I haven't found a comparable solution. It's obviously due to my lack of understanding but I wasn't able to mimic the old way.
I've tried several approaches including a list of aggregations passed to the reducer with some success but not 100%. That approach was inspired with one of the tests you've started to convert but it was somewhat different than what I was trying to do. And after I've gone through that I was wondering if i made a wrong turn somewhere 😅

@polyfractal
Copy link
Contributor Author

Heya @mirkojotic, sorry for the delay. Had a quick skim and you're right, I'm not sure there's a way to do sibling aggs right now.

I didn't get a chance to look closely but I think we'll need to do something like you suggested, passing a list of aggregators to the searchAndReduce() method. Or an optional siblingAgg list that can be passed to the method (and default it to null otherwise, so the existing tests don't have to change).

The searchAndReduce() method is modeled on how we reduce things internally. Sibling aggs are extracted from the agg tree, and then reduced during the final reduction after everything else has been reduced. So we could probably add something similar and reduce the sibling pipelines after the child pipelines are reduced.

@mirkojotic
Copy link
Contributor

mirkojotic commented Mar 21, 2019

@polyfractal no problem 😉

I wouldn't mind taking a stab at it. I might open a WIP PR if I get stuck.

Just to confirm AggregatorTestCase is modeled after the actual internal implementation. So I need to follow that closely and add the a list of sibling aggs as an optional argument to searchAndReduce?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations help wanted adoptme >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test Issues or PRs that are addressing/adding tests
Projects
None yet
Development

No branches or pull requests

4 participants