New GeoHexGrid aggregation #82924

Merged
merged 23 commits into from
Jan 27, 2022
Commits
50d9ce2
New GeoHexGrid aggregation
iverase Jan 24, 2022
16dc137
fix docs
iverase Jan 24, 2022
96bc189
Add include so page publishes
jrodewig Jan 24, 2022
23abea9
Merge branch 'master' into GeoHexGrid
elasticmachine Jan 24, 2022
2330f62
Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asci…
iverase Jan 24, 2022
baa26df
Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asci…
iverase Jan 24, 2022
cf529cf
Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asci…
iverase Jan 24, 2022
2f83ca9
Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asci…
iverase Jan 24, 2022
af890d2
Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asci…
iverase Jan 24, 2022
4aad240
Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asci…
iverase Jan 24, 2022
10356b2
Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asci…
iverase Jan 24, 2022
0e734af
Update docs/reference/aggregations/bucket/geohexgrid-aggregation.asci…
iverase Jan 24, 2022
3a425db
Merge branch 'master' into GeoHexGrid
iverase Jan 26, 2022
98e48cc
iter in docs
iverase Jan 26, 2022
e04baf3
Update docs/changelog/82924.yaml
iverase Jan 26, 2022
0af195d
yaml editing
iverase Jan 26, 2022
5b23ee6
Update docs/changelog/82924.yaml
iverase Jan 26, 2022
157c2a9
fix link
iverase Jan 26, 2022
aef42db
Merge branch 'GeoHexGrid' of github.com:iverase/elasticsearch into Ge…
iverase Jan 26, 2022
1eebba3
Additional edits and fixes
jrodewig Jan 26, 2022
73f09bb
minor edit
jrodewig Jan 26, 2022
c67e4bb
fix test
iverase Jan 26, 2022
9c096d8
Merge branch 'GeoHexGrid' of github.com:iverase/elasticsearch into Ge…
iverase Jan 26, 2022
5 changes: 5 additions & 0 deletions docs/changelog/82924.yaml
@@ -0,0 +1,5 @@
pr: 82924
summary: New `GeoHexGrid` aggregation
area: Geo
type: feature
issues: []
2 changes: 2 additions & 0 deletions docs/reference/aggregations/bucket.asciidoc
@@ -40,6 +40,8 @@ include::bucket/geodistance-aggregation.asciidoc[]

 include::bucket/geohashgrid-aggregation.asciidoc[]

+include::bucket/geohexgrid-aggregation.asciidoc[]
+
 include::bucket/geotilegrid-aggregation.asciidoc[]

 include::bucket/global-aggregation.asciidoc[]
249 changes: 249 additions & 0 deletions docs/reference/aggregations/bucket/geohexgrid-aggregation.asciidoc
@@ -0,0 +1,249 @@
[role="xpack"]
[[search-aggregations-bucket-geohexgrid-aggregation]]
=== Geohex grid aggregation
++++
<titleabbrev>Geohex grid</titleabbrev>
++++

A multi-bucket aggregation that groups <<geo-point,`geo_point`>>
values into buckets that represent a grid.
The resulting grid can be sparse and only
contains cells that have matching data. Each cell corresponds to a
https://h3geo.org/docs/core-library/h3Indexing#h3-cell-indexp[H3 cell index] and is
labeled using the https://h3geo.org/docs/core-library/h3Indexing#h3index-representation[H3Index representation].

See https://h3geo.org/docs/core-library/restable[the table of cell areas for H3
resolutions] for how precision (zoom) correlates with cell size on the ground.
Precision for this aggregation can be between 0 and 15, inclusive.
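
As a rough illustration of how quickly the grid grows with precision, the sketch below computes the number of H3 cells at each resolution (`2 + 120 * 7^resolution`) and the resulting mean cell area. The class name, method names, and the Earth surface area constant are assumptions made for this example. They are not part of Elasticsearch or the H3 library, and the mean over all cells differs slightly from the per-hexagon averages in the H3 table.

```java
final class H3ResolutionTable {
    // Assumed approximate surface area of the Earth, in square meters.
    static final double EARTH_AREA_M2 = 5.10065621724e14;

    // Number of H3 cells at a resolution: 122 base cells, each refined by a factor of 7
    // per level, with the 12 pentagons producing one fewer child than a hexagon.
    static long cellCount(int resolution) {
        if (resolution < 0 || resolution > 15) {
            throw new IllegalArgumentException("resolution must be in [0, 15]: " + resolution);
        }
        long power = 1;
        for (int i = 0; i < resolution; i++) {
            power *= 7;
        }
        return 2 + 120 * power;
    }

    // Mean area per cell in square meters (an approximation, not an H3 API call).
    static double averageCellAreaM2(int resolution) {
        return EARTH_AREA_M2 / cellCount(resolution);
    }

    public static void main(String[] args) {
        for (int resolution = 0; resolution <= 15; resolution++) {
            System.out.printf("%2d  %,d cells  ~%.3f m^2%n",
                resolution, cellCount(resolution), averageCellAreaM2(resolution));
        }
    }
}
```

At precision 15 this gives several hundred trillion possible cells, which is why unfiltered high-precision requests can produce very large results.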

WARNING: High-precision requests can be very expensive in terms of RAM and
result sizes. For example, the highest-precision geohex with a precision of 15
produces cells that cover less than one square meter. We recommend you use a
filter to limit high-precision requests to a smaller geographic area. For an example,
refer to <<geohexgrid-high-precision>>.

[[geohexgrid-low-precision]]
==== Simple low-precision request

[source,console,id=geohexgrid-aggregation-example]
--------------------------------------------------
PUT /museums
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}

POST /museums/_bulk?refresh
{"index":{"_id":1}}
{"location": "52.374081,4.912350", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "52.369219,4.901618", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "52.371667,4.914722", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "51.222900,4.405200", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "48.861111,2.336389", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "48.860000,2.327000", "name": "Musée d'Orsay"}

POST /museums/_search?size=0
{
  "aggregations": {
    "large-grid": {
      "geohex_grid": {
        "field": "location",
        "precision": 4
      }
    }
  }
}
--------------------------------------------------

Response:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "large-grid": {
      "buckets": [
        {
          "key": "841969dffffffff",
          "doc_count": 3
        },
        {
          "key": "841fb47ffffffff",
          "doc_count": 2
        },
        {
          "key": "841fa4dffffffff",
          "doc_count": 1
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
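
The bucket keys are H3 indexes written as hexadecimal strings, so some of their structure can be read directly from the bits. The sketch below assumes the standard H3 index layout (4 mode bits starting at bit 59, 4 resolution bits starting at bit 52, 7 base-cell bits starting at bit 45). The class and method names are hypothetical and are not part of the `libs:elasticsearch-h3` API.

```java
final class H3KeyInspector {
    // Parses a bucket key such as "841969dffffffff" into its 64-bit value.
    static long parse(String key) {
        return Long.parseUnsignedLong(key, 16);
    }

    // Index mode; 1 denotes a cell index in the H3 layout.
    static int mode(long h3) {
        return (int) ((h3 >>> 59) & 0xF);
    }

    // Resolution of the cell, which corresponds to the requested precision.
    static int resolution(long h3) {
        return (int) ((h3 >>> 52) & 0xF);
    }

    // One of the 122 resolution-0 base cells that contains this cell.
    static int baseCell(long h3) {
        return (int) ((h3 >>> 45) & 0x7F);
    }

    public static void main(String[] args) {
        for (String key : new String[] { "841969dffffffff", "8c1969c9b2617ff" }) {
            long h3 = parse(key);
            System.out.println(key + " mode=" + mode(h3)
                + " resolution=" + resolution(h3) + " baseCell=" + baseCell(h3));
        }
    }
}
```

For the keys in the response above, the decoded resolution is 4, matching the requested `precision`.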

[[geohexgrid-high-precision]]
==== High-precision requests

When requesting detailed buckets (typically for displaying a "zoomed in" map),
a filter like <<query-dsl-geo-bounding-box-query,geo_bounding_box>> should be
applied to narrow the subject area. Otherwise, potentially millions of buckets
will be created and returned.

[source,console,id=geohexgrid-high-precision-ex]
--------------------------------------------------
POST /museums/_search?size=0
{
  "aggregations": {
    "zoomed-in": {
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": "52.4, 4.9",
            "bottom_right": "52.3, 5.0"
          }
        }
      },
      "aggregations": {
        "zoom1": {
          "geohex_grid": {
            "field": "location",
            "precision": 12
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

Response:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "zoomed-in": {
      "doc_count": 3,
      "zoom1": {
        "buckets": [
          {
            "key": "8c1969c9b2617ff",
            "doc_count": 1
          },
          {
            "key": "8c1969526d753ff",
            "doc_count": 1
          },
          {
            "key": "8c1969526d26dff",
            "doc_count": 1
          }
        ]
      }
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
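
To see why the filter keeps three documents, the sketch below applies a simplified point-in-box check to the six sample museums, using the same `"lat, lon"` strings as the request. It is only an illustration under stated assumptions: the class and method names are hypothetical, the check ignores boxes that cross the dateline, and it is not how Elasticsearch evaluates `geo_bounding_box`.

```java
final class BoundingBoxSketch {
    // Parses a "lat,lon" string such as "52.4, 4.9" into {lat, lon}.
    static double[] parseLatLon(String value) {
        String[] parts = value.split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected \"lat,lon\" but got: " + value);
        }
        return new double[] { Double.parseDouble(parts[0].trim()), Double.parseDouble(parts[1].trim()) };
    }

    // True if the point lies inside the box given by its top-left and bottom-right corners.
    static boolean contains(String topLeft, String bottomRight, String point) {
        double[] tl = parseLatLon(topLeft);
        double[] br = parseLatLon(bottomRight);
        double[] p = parseLatLon(point);
        boolean latInside = p[0] <= tl[0] && p[0] >= br[0];
        boolean lonInside = p[1] >= tl[1] && p[1] <= br[1];
        return latInside && lonInside;
    }

    // Counts how many of the given points fall inside the box.
    static long countInside(String topLeft, String bottomRight, String... points) {
        long count = 0;
        for (String point : points) {
            if (contains(topLeft, bottomRight, point)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] museums = {
            "52.374081,4.912350", "52.369219,4.901618", "52.371667,4.914722",
            "51.222900,4.405200", "48.861111,2.336389", "48.860000,2.327000"
        };
        // Prints 3: only the three Amsterdam museums are inside the box.
        System.out.println(countInside("52.4, 4.9", "52.3, 5.0", museums));
    }
}
```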

[[geohexgrid-addtl-bounding-box-filtering]]
==== Requests with additional bounding box filtering

The `geohex_grid` aggregation supports an optional `bounds` parameter
that restricts the cells considered to those that intersect the
provided bounds. The `bounds` parameter accepts the same
<<query-dsl-geo-bounding-box-query-accepted-formats,bounding box formats>>
as the geo-bounding box query. This bounding box can be used with or
without an additional `geo_bounding_box` query for filtering the points prior to aggregating.
It is an independent bounding box that can intersect with, be equal to, or be disjoint
from any additional `geo_bounding_box` queries defined in the context of the aggregation.

[source,console,id=geohexgrid-aggregation-with-bounds]
--------------------------------------------------
POST /museums/_search?size=0
{
  "aggregations": {
    "tiles-in-bounds": {
      "geohex_grid": {
        "field": "location",
        "precision": 12,
        "bounds": {
          "top_left": "52.4, 4.9",
          "bottom_right": "52.3, 5.0"
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

Response:

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations": {
    "tiles-in-bounds": {
      "buckets": [
        {
          "key": "8c1969c9b2617ff",
          "doc_count": 1
        },
        {
          "key": "8c1969526d753ff",
          "doc_count": 1
        },
        {
          "key": "8c1969526d26dff",
          "doc_count": 1
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]

[[geohexgrid-options]]
==== Options

[horizontal]
field::
(Required, string) Field containing indexed geo-point values. Must be explicitly
mapped as a <<geo-point,`geo_point`>> field. If the field contains an array,
`geohex_grid` aggregates all array values.

precision::
(Optional, integer) Integer zoom of the key used to define cells/buckets in
the results. Defaults to `6`. Values outside of [`0`,`15`] will be rejected.

bounds::
(Optional, object) Bounding box used to filter the geo-points in each bucket.
Accepts the same bounding box formats as the
<<query-dsl-geo-bounding-box-query-accepted-formats,geo-bounding box query>>.

size::
(Optional, integer) Maximum number of buckets to return. Defaults to 10,000.
When results are trimmed, buckets are prioritized based on the volume of
documents they contain.

shard_size::
(Optional, integer) Number of buckets returned from each shard. Defaults to
`max(10,(size x number-of-shards))` to allow for a more accurate count of the
top cells in the final result.
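
The defaults and limits listed above can be summarized in a small sketch. It only mirrors the documented rules (precision in [`0`,`15`], default precision `6`, default size 10,000, and the `shard_size` formula). The class and method names are hypothetical and do not reflect the actual Elasticsearch implementation.

```java
final class GeoHexGridOptionsSketch {
    static final int DEFAULT_PRECISION = 6;
    static final int DEFAULT_SIZE = 10_000;

    // Returns the precision unchanged if it is within the documented range, otherwise rejects it.
    static int checkPrecision(int precision) {
        if (precision < 0 || precision > 15) {
            throw new IllegalArgumentException("precision must be in [0, 15] but was " + precision);
        }
        return precision;
    }

    // Documented default: max(10, size x number-of-shards).
    static long defaultShardSize(int size, int numberOfShards) {
        return Math.max(10L, (long) size * numberOfShards);
    }

    public static void main(String[] args) {
        System.out.println(checkPrecision(DEFAULT_PRECISION));
        System.out.println(defaultShardSize(DEFAULT_SIZE, 5));
    }
}
```

With the default `size` and five shards, the formula yields 50,000 buckets requested per shard.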
@@ -51,7 +51,7 @@ public void writeTo(StreamOutput out) throws IOException {
         aggregations.writeTo(out);
     }

-    protected long hashAsLong() {
+    public long hashAsLong() {
         return hashAsLong;
     }

@@ -34,7 +34,7 @@ public static ObjectParser<ParsedGeoGrid, Void> createParser(
         return parser;
     }

-    protected void setName(String name) {
+    public void setName(String name) {
         super.setName(name);
     }
 }
@@ -55,16 +55,22 @@ protected int maxNumberOfBuckets() {
     @Override
     protected T createTestInstance(String name, Map<String, Object> metadata, InternalAggregations aggregations) {
         final int precision = randomPrecision();
-        int size = randomNumberOfBuckets();
-        List<InternalGeoGridBucket> buckets = new ArrayList<>(size);
+        final int size = randomNumberOfBuckets();
+        final List<InternalGeoGridBucket> buckets = new ArrayList<>(size);
+        final List<Long> seen = new ArrayList<>(size);
+        int finalSize = 0;
         for (int i = 0; i < size; i++) {
             double latitude = randomDoubleBetween(-90.0, 90.0, false);
             double longitude = randomDoubleBetween(-180.0, 180.0, false);

             long hashAsLong = longEncode(longitude, latitude, precision);
-            buckets.add(createInternalGeoGridBucket(hashAsLong, randomInt(IndexWriter.MAX_DOCS), aggregations));
+            if (seen.contains(hashAsLong) == false) { // make sure we don't add twice the same bucket
+                buckets.add(createInternalGeoGridBucket(hashAsLong, randomInt(IndexWriter.MAX_DOCS), aggregations));
+                seen.add(hashAsLong);
+                finalSize++;
+            }
         }
-        return createInternalGeoGrid(name, size, buckets, metadata);
+        return createInternalGeoGrid(name, finalSize, buckets, metadata);
     }

     @Override
@@ -39,7 +39,8 @@ private SpatialStatsAction() {
      * Items to track. Serialized by ordinals. Append only, don't remove or change order of items in this list.
      */
     public enum Item {
-        GEOLINE
+        GEOLINE,
+        GEOHEX
     }

     public static class Request extends BaseNodesRequest<Request> implements ToXContentObject {
1 change: 1 addition & 0 deletions x-pack/plugin/spatial/build.gradle
@@ -14,6 +14,7 @@ dependencies {
   compileOnly project(path: ':modules:legacy-geo')
   compileOnly project(':modules:lang-painless:spi')
   compileOnly project(path: xpackModule('core'))
+  api project(":libs:elasticsearch-h3")
   testImplementation(testArtifact(project(xpackModule('core'))))
   testImplementation project(path: xpackModule('vector-tile'))
 }