Geo_polygon queries sometimes matches points outside polygon in 2.4.x #22033

Closed
colings86 opened this issue Dec 7, 2016 · 9 comments
Labels
:Analytics/Geo Indexing, search aggregations of geo points and shapes >bug stalled

Comments

@colings86
Contributor

In ES 2.4.x the following script shows a bug where for a certain polygon the geo polygon query matches points outside the polygon.

Gist showing polygon in blue, bounding box in red outline and indexed point from script: https://gist.github.com/anonymous/c330a7bc64f8b52add1dc43698c38dbe

DELETE test

# Set up an index with a geo_point field
PUT test
{
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "doc": {
      "properties": {
        "location": {
          "type": "geo_point"
        }
      }
    }
  }
}

# Index a document containing a point outside the polygon we are going to use in our search
POST test/doc/1
{
  "location": {
    "lat": 52.08063,
    "lon": 0.02002
  }
}

# This geo_polygon query returns 1 hit even though the document is outside the polygon (gist showing polygon in blue: https://gist.github.com/anonymous/9e404e940e4b39060ec88c2fd9895914):

GET test/doc/_search
{
  "size": 1,
  "query": {
    "geo_polygon": {
      "location": {
        "points": [
          [
            -0.016550000000000002,
            51.60118000000001
          ],
          [
            -0.01157,
            51.60049000000001
          ],
          [
            -0.00151,
            51.60052
          ],
          [
            -0.0009500000000000001,
            51.60096000000001
          ],
          [
            0.00002,
            51.60137
          ],
          [
            0.00048000000000000007,
            51.601760000000006
          ],
          [
            0.00108,
            51.60217
          ],
          [
            0.0016500000000000002,
            51.60253
          ],
          [
            0.00198,
            51.60295000000001
          ],
          [
            0.00218,
            51.603350000000006
          ],
          [
            0.0027,
            51.60367
          ],
          [
            0.00265,
            51.604110000000006
          ],
          [
            0.00249,
            51.60474000000001
          ],
          [
            0.00205,
            51.605540000000005
          ],
          [
            0.0018900000000000002,
            51.605940000000004
          ],
          [
            0.0036100000000000004,
            51.60624000000001
          ],
          [
            0.004240000000000001,
            51.6064
          ],
          [
            0.004880000000000001,
            51.606640000000006
          ],
          [
            0.005600000000000001,
            51.606840000000005
          ],
          [
            0.0074600000000000005,
            51.60802
          ],
          [
            0.01003,
            51.608700000000006
          ],
          [
            0.01071,
            51.61717
          ],
          [
            0.009680000000000001,
            51.618080000000006
          ],
          [
            0.00864,
            51.61883
          ],
          [
            0.0032500000000000003,
            51.61547
          ],
          [
            0,
            51.61536
          ],
          [
            -0.0023000000000000004,
            51.61813
          ],
          [
            -0.017320000000000002,
            51.61686
          ],
          [
            -0.01786,
            51.60976
          ],
          [
            -0.016550000000000002,
            51.60118000000001
          ]
        ]
      }
    }
  }
} 

# geo_polygon query on the bounding box of the polygon correctly returns 0 hits:

GET test/doc/_search
{
  "size": 1,
  "query": {
    "geo_polygon": {
      "location": {
        "points": [
          [
            -0.01786116510629654,
            51.60048899240792
          ],
          [
            0.010710898786783218,
            51.60048899240792
          ],
          [
            0.010710898786783218,
            51.618830943480134
          ],
          [
            -0.01786116510629654,
            51.618830943480134
          ],
          [
            -0.01786116510629654,
            51.60048899240792
          ]
        ]
      }
    }
  }
}

# geo_polygon query on polygon with two points on north side removed correctly returns 0 hits (gist showing polygon in blue: https://gist.github.com/anonymous/9d91a776b998b085e83d4c81b4375831):
GET test/doc/_search
{
  "size": 1,
  "query": {
    "geo_polygon": {
      "location": {
        "points": [
          [
            -0.016550000000000002,
            51.60118000000001
          ],
          [
            -0.01157,
            51.60049000000001
          ],
          [
            -0.00151,
            51.60052
          ],
          [
            -0.0009500000000000001,
            51.60096000000001
          ],
          [
            0.00002,
            51.60137
          ],
          [
            0.00048000000000000007,
            51.601760000000006
          ],
          [
            0.00108,
            51.60217
          ],
          [
            0.0016500000000000002,
            51.60253
          ],
          [
            0.00198,
            51.60295000000001
          ],
          [
            0.00218,
            51.603350000000006
          ],
          [
            0.0027,
            51.60367
          ],
          [
            0.00265,
            51.604110000000006
          ],
          [
            0.00249,
            51.60474000000001
          ],
          [
            0.00205,
            51.605540000000005
          ],
          [
            0.0018900000000000002,
            51.605940000000004
          ],
          [
            0.0036100000000000004,
            51.60624000000001
          ],
          [
            0.004240000000000001,
            51.6064
          ],
          [
            0.004880000000000001,
            51.606640000000006
          ],
          [
            0.005600000000000001,
            51.606840000000005
          ],
          [
            0.0074600000000000005,
            51.60802
          ],
          [
            0.01003,
            51.608700000000006
          ],
          [
            0.01071,
            51.61717
          ],
          [
            0.009680000000000001,
            51.618080000000006
          ],
          [
            -0.0023000000000000004,
            51.61813
          ],
          [
            -0.017320000000000002,
            51.61686
          ],
          [
            -0.01786,
            51.60976
          ],
          [
            -0.016550000000000002,
            51.60118000000001
          ]
        ]
      }
    }
  }
}
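
The bounding-box result above can be cross-checked outside Elasticsearch. The following is a minimal sketch in plain Python, not the Elasticsearch implementation; the helper name `in_bbox` is hypothetical, and the rectangle is copied from the second query in the script. It only confirms that the indexed point lies outside the bounding box, so it cannot be inside any polygon contained in that box.

```python
# Bounding box copied from the second query in the script (lon, lat order).
MIN_LON, MAX_LON = -0.01786116510629654, 0.010710898786783218
MIN_LAT, MAX_LAT = 51.60048899240792, 51.618830943480134


def in_bbox(lon, lat):
    """Return True if (lon, lat) lies inside or on the bounding box."""
    return MIN_LON <= lon <= MAX_LON and MIN_LAT <= lat <= MAX_LAT


# The document indexed by the script.
doc_lon, doc_lat = 0.02002, 52.08063

# Both the longitude and the latitude exceed the box maxima.
print(in_bbox(doc_lon, doc_lat))  # prints False
```

A check like this is cheap enough to run before a full point-in-polygon test when validating query results on the client side.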
@colings86 colings86 added :Analytics/Geo Indexing, search aggregations of geo points and shapes >bug v2.4.0 v2.4.1 v2.4.2 v2.4.3 labels Dec 7, 2016
@clintongormley
Contributor

@nknize please could you take a look

@nknize
Contributor

nknize commented Dec 7, 2016

I'm on it...

@colings86
Contributor Author

It seems to be the point [0, 51.61536] that causes the issue: if it is changed to [0.001, 51.61536], the search correctly returns 0 hits. So the problem may be with polygons that have a vertex exactly on the Greenwich meridian (longitude 0).
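
To show that the geometry itself is unambiguous, here is a sketch of a standard even-odd ray-casting test in plain Python. It is not how Elasticsearch 2.4.x evaluates geo_polygon; the function names and the planar treatment of lon/lat are assumptions made for illustration. The ring is copied from the first query, and the second ring moves the vertex at longitude 0 to 0.001 as described above.

```python
def point_in_polygon(lon, lat, ring):
    """Even-odd ray casting; ring is a closed list of [lon, lat] vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Polygon from the first geo_polygon query (closed ring, lon/lat order).
RING = [
    [-0.016550000000000002, 51.60118000000001], [-0.01157, 51.60049000000001],
    [-0.00151, 51.60052], [-0.0009500000000000001, 51.60096000000001],
    [0.00002, 51.60137], [0.00048000000000000007, 51.601760000000006],
    [0.00108, 51.60217], [0.0016500000000000002, 51.60253],
    [0.00198, 51.60295000000001], [0.00218, 51.603350000000006],
    [0.0027, 51.60367], [0.00265, 51.604110000000006],
    [0.00249, 51.60474000000001], [0.00205, 51.605540000000005],
    [0.0018900000000000002, 51.605940000000004], [0.0036100000000000004, 51.60624000000001],
    [0.004240000000000001, 51.6064], [0.004880000000000001, 51.606640000000006],
    [0.005600000000000001, 51.606840000000005], [0.0074600000000000005, 51.60802],
    [0.01003, 51.608700000000006], [0.01071, 51.61717],
    [0.009680000000000001, 51.618080000000006], [0.00864, 51.61883],
    [0.0032500000000000003, 51.61547], [0, 51.61536],
    [-0.0023000000000000004, 51.61813], [-0.017320000000000002, 51.61686],
    [-0.01786, 51.60976], [-0.016550000000000002, 51.60118000000001],
]


def with_vertex_moved(ring, old, new):
    """Return a copy of ring with every vertex equal to old replaced by new."""
    return [new if vertex == old else vertex for vertex in ring]


# Same ring with the meridian vertex nudged east, as in the workaround.
shifted = with_vertex_moved(RING, [0, 51.61536], [0.001, 51.61536])

doc = (0.02002, 52.08063)  # the indexed document (lon, lat)
print(point_in_polygon(*doc, RING), point_in_polygon(*doc, shifted))  # prints False False
```

Both rings give the same answer here, which is consistent with the hit being caused by how the query handles the longitude 0 vertex and not by the shape of the polygon.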

@zhaozhijun1988

I have the same issue. How can it be solved?
http://stackoverflow.com/questions/41309694/elasticsearch-geo-polygon-query-work-not-correctly

@ashleydw

ashleydw commented Mar 1, 2017

Not wanting to jump on your thread, but I experience this in ES 5.2.0 too.

I have run two tests, one using the default precision (50 meters?) and one using a precision of 5m with a distance_error_pct of 0.01.

Relevant mapping:

"boundary": {
    "type": "geo_shape",
    "precision": "5.0m",
    "distance_error_pct": 0.01
},

Relevant part of query (location is a geo_shape):

"geo_shape": {
    "location": {
        "indexed_shape": {
            "index": "myindexes",
            "type": "myquerymodel",
            "id": 1,
            "path": "boundary"
        },
        "relation": "within"
    }
}

Example location:

{
    "crs": {
        "type": "name",
        "properties": {
            "name": "EPSG:4326"
        }
    },
    "coordinates": [
        6.01667,
        51.78333
    ],
    "type": "Point"
}

Results for default precision: https://gist.github.com/anonymous/954bc675a973abe0bc67c94c696bcf9d
Results for 5m precision: https://gist.github.com/anonymous/421b3dd1091c29e3387a7fdd025e2065

Note that for the default precision I only entered 1 location, but the location appears in both results.

It seems I'm using the precision incorrectly? When "increasing" precision (5m) my data usage jumps from 21GB to 44GB so it's clearly doing something.

So, what does this precision actually mean? I'm guessing the error percentage of 0.01 isn't actually 0.01 of 5 meters, as these locations are way outside that possible error.

@clintongormley
Contributor

@ashleydw This is a completely different issue: this one is about geo_point fields and you're using geo_shape. The place to ask what precision and distance_error_pct do is the forums, or read what the docs have to say: https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-shape.html

@ashleydw

ashleydw commented Mar 1, 2017

Thanks for pointing out that the original question was about geo_point; I hadn't realized. Still, my problem of false positives is related, no?

I've read the docs and understand what they say about an error of 0 meaning completely accurate, but they don't really describe what the error percentage is a percentage of (is it 0.01 % of {precision}?). This part of the question could well have been part of #23206.
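
One possible reading of the question above, which is an assumption here and is not confirmed anywhere in this thread, is that distance_error_pct is applied to the size of the shape itself (the distance from the center of its bounding box to a corner) and not to precision. The sketch below only illustrates the arithmetic of that reading in plain Python. The function names, the mean Earth radius, and the one-degree bounding box around the example location are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, a common approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def assumed_error_m(min_lon, min_lat, max_lon, max_lat, distance_error_pct):
    """Error budget under the ASSUMPTION that the percentage applies to the
    distance from the bounding-box center to a corner of the shape."""
    ctr_lon = (min_lon + max_lon) / 2
    ctr_lat = (min_lat + max_lat) / 2
    return distance_error_pct * haversine_m(ctr_lon, ctr_lat, max_lon, max_lat)


# Hypothetical boundary: a one-degree box around the example location.
error = assumed_error_m(5.5, 51.3, 6.5, 52.3, 0.01)
print(f"{error:.0f} m")  # several hundred meters, far more than 1% of 5 m
```

If this reading is right, a large boundary with distance_error_pct of 0.01 could tolerate errors of hundreds of meters even with a 5m precision, which would be one explanation for false positives well outside the boundary. The docs linked earlier in the thread are the place to confirm it.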

@imotov
Contributor

imotov commented Dec 11, 2018

We should revisit this after #32039

@imotov imotov added the stalled label Dec 11, 2018
@imotov
Contributor

imotov commented Dec 20, 2018

The original issue (with geo_points) has been fixed for quite a while. The second issue, with geo_shape, was fixed by #35320. Closing.


6 participants