-
Notifications
You must be signed in to change notification settings - Fork 727
Support for aggregation framework #146
Comments
Nobody is working on it currently as far as I know; feel free to take a shot :) |
Ok. I'll do it. |
Hi whats the status of this issue? I checked #148 was not accepted, but i would like to contribute some aggregations support |
This issue is blocked by #160 - which is also in the todo list. |
Blockage cleared, issue is up for grabs! @jtjeferreira Feel free to work on this and please mind our Contribution Guidelines. |
Hi I solved my issue by parsing the json object to get my aggregations. I will try to improve it by making it more generic and contribute back |
Is this still an open issue? Looks like there was some work done previously, but I don't see aggregation support anywhere in the code or in open PRs. I'd be happy to take a crack at it if it's still an open issue. |
Aggregations introduce the notion of nesting. As this is difficult to handle from the Jest side, I'm going to add documentation to the aggregation class saying that parsing for nested aggregations must happen on the user's side. If not we would have to know the specific name of each aggregation while parsing out into the various Aggregation models used greatly complicating the code. Let me know if that sounds ok. That being said I've gone ahead and created aggregation models for all the other aggregations except the ones elasticsearch lists as "experimental", but that is my next step. Progress is in my fork: https://github.com/cfstout/Jest/tree/Aggregation_support |
I have added new models in the io.searchbox.core.search.aggregation package which represent all the base aggregations. I have not added any classes for 'nested' aggregations as these are simply ways of crafting subaggregations on the request side, but should be handled as one of the base models when dealing with search results. TODO: * implement getAggs() in the SearchResults class * Add tests around these aggregations models
I have added new models in the io.searchbox.core.search.aggregation package which represent all the base aggregations. I have not added any classes for 'nested' aggregations as these are simply ways of crafting subaggregations on the request side, but should be handled as one of the base models when dealing with search results. TODO: * implement getAggs() in the SearchResults class * Add tests around these aggregations models
Seems elasticsearch also removed the _type field in aggregations, so we can't follow the same pattern used by facets. I am going to try a different approach by having a single aggregation model, with all of the possible aggregation fields, and a list of sub aggregations. Basically, my plan is to create a java pojo representation of the json object adding some back-end logic to make the aggregations easy to deal with. |
Just pushed a 'hacky' version to my branch that returns aggregations with fields without knowing about type. My plan is to now override the getAggregations method in the SearchResult class to also take a name to type map, which would allow the user to provide the specific type of aggregation requested and the associated name. That way when we encounter the name in the json document, we simply perform a lookup in the table as opposed to looking at the _type metadata field present before. Then we can recursively pass down this map to the aggregations themselves to handle the nested use case also. |
I have create two ways to get aggregation data from a search response. 1) End user can provide a map of names to the aggregation types expected to be returned. This takes the place of the _type metadata returned in faceting. This map does not need to be in any particular order, but each aggregation will search through and see if it has any matching sub-aggregations matching the names present and populate them as needed. It is important to note that using this method we will ONLY return aggregations specified by the user in this map. 2) Type invariant aggregations can be requested. This method will simply return a list of TypeInvariantAggregations. This is recommended if the user is unsure of the Aggregations supplied, or simply wants a representation of all data returned in the aggregation. This module introduces some danger and requires responsibility on the user's part. The intended use is to call the getAggregationFields() method to get a list of all AggregationFields contained. The user is only assured that those particular methods will return non-null results. **TODO** * Polish off these methods-- this is meant to be a rough draft of the Aggregations module in order to get rapid feedback * Add testing over the Aggregation module
I just pushed a draft of this aggregation module. See commit ^. @kramer I'd love some feedback just whenever you have some free time before I continue with polishing and writing tests for this code. |
Changes to reflect suggestions by kramer
Changes to reflect suggestions by kramer
I have create three ways to get aggregation data from a search response. 1) End user can call `get<X>Aggregation(String aggName)` where X is the type of aggregation expected and agg name is the name of the aggregation. 2) User can call `getAggregation(String aggName, Class<T> aggType)` where aggName is the name of the aggregation and aggType is the type expected for that particular aggregation. 3) User can call `getAggregations(Map<String, Class<T>> nameToTypeMap)` which provides a map of names to expected types for this level of aggregation. A list of aggregations of the given types will be returned as a result. In order to handle nested aggregations, the root `Aggregation` class contains all three methods. The `getAggregations()` call in `SearchResult` returns the root level aggregation. Thus to get an average aggregation named 'avg1' the end user would make a call like: `searchResult.getAggregations().getAvgAggregation('avg1')` which would then have the `getAvg` method available. Nested aggregations could then be achieved by nesting these calls to the aggregations returned at each call: `searchResult.getAggregations.getTermsAggregation('term1').getAvgAggregation('avg1')` Tests were added for each of these Aggregation objects with the objective of testing all methods of a particular aggregation object in the case where a user provides a proper name, and in the case where they provide a bad name by which to get the aggregation. Finally, many of these methods have the potential to create null values, as a bad name will lead to no data being returned. I have commented in JavaDoc style these methods with the potential to return null so that end users will be aware of the possibility.
I have create three ways to get aggregation data from a search response. 1) End user can call `get<X>Aggregation(String aggName)` where X is the type of aggregation expected and agg name is the name of the aggregation. 2) User can call `getAggregation(String aggName, Class<T> aggType)` where aggName is the name of the aggregation and aggType is the type expected for that particular aggregation. 3) User can call `getAggregations(Map<String, Class<T>> nameToTypeMap)` which provides a map of names to expected types for this level of aggregation. A list of aggregations of the given types will be returned as a result. In order to handle nested aggregations, the root `Aggregation` class contains all three methods. The `getAggregations()` call in `SearchResult` returns the root level aggregation. Thus to get an average aggregation named 'avg1' the end user would make a call like: `searchResult.getAggregations().getAvgAggregation('avg1')` which would then have the `getAvg` method available. Nested aggregations could then be achieved by nesting these calls to the aggregations returned at each call: `searchResult.getAggregations.getTermsAggregation('term1').getAvgAggregation('avg1')` Tests were added for each of these Aggregation objects with the objective of testing all methods of a particular aggregation object in the case where a user provides a proper name, and in the case where they provide a bad name by which to get the aggregation. Finally, many of these methods have the potential to create null values, as a bad name will lead to no data being returned. I have commented in JavaDoc style these methods with the potential to return null so that end users will be aware of the possibility.
Hi,
I'm wondering if the support of the aggregation framework is under work ?
Thanks
The text was updated successfully, but these errors were encountered: