Problem with Custom CSV data in Autocomplete and Reverse Geocoding API #1342

Closed
toton6868 opened this issue Aug 21, 2019 · 10 comments

@toton6868

toton6868 commented Aug 21, 2019

I have added a CSV file using csv-importer. The name field contains a few mobile numbers, which start with 01. I used csv as the source and custom as the layer. I have configured the API parameters of my Config.json as follows:

"api": {
    "accessLog": "common",
    "textAnalyzer": "libpostal",
    "host": "http://pelias.mapzen.com/",
    "indexName": "pelias",
    "version": "1.0",
    "targets": {
      "auto_discover": false,
      "canonical_sources": ["whosonfirst", "openstreetmap", "openaddresses", "geonames", "csv"],
      "layers_by_source": {
        "openstreetmap": [ "address", "venue", "street" ],
        "openaddresses": [ "address" ],
        "geonames": [
          "country", "macroregion", "region", "county", "localadmin", "locality", "borough",
          "neighbourhood", "venue"
        ],
        "whosonfirst": [
          "continent", "empire", "country", "dependency", "macroregion", "region", "locality",
          "localadmin", "macrocounty", "county", "macrohood", "borough", "neighbourhood",
          "microhood", "disputed", "venue", "postalcode", "continent", "ocean", "marinearea"
        ],
        "csv": ["custom"]
      },
      "source_aliases": {
        "osm": [ "openstreetmap" ],
        "oa":  [ "openaddresses" ],
        "gn":  [ "geonames" ],
        "wof": [ "whosonfirst" ],
        "csv": [ "csv" ]
      },
      "layer_aliases": {
        "coarse": [
          "continent", "empire", "country", "dependency", "macroregion", "region", "locality",
          "localadmin", "macrocounty", "county", "macrohood", "borough", "neighbourhood",
          "microhood", "disputed", "postalcode", "continent", "ocean", "marinearea", "custom"
        ]
      },
      "csv": ["custom"]
    },
    "services": {
      "placeholder": {
        "url": "http://localhost:3101"
      },
      "libpostal": {
        "url": "http://localhost:3103"
      },
      "pip": {
        "url": "http://localhost:3102"
      },
      "interpolation": {
        "url": "http://localhost:3104"
      }
    }
  },

When I work with the search API, it works great and returns the correct data. The problem is in the Autocomplete and Reverse APIs.

In the Reverse API, the custom layer does not return any data: if I use &layers=custom, it returns null data.

In the Autocomplete API, if I enter 0171 it does not return any data, but when I enter a full mobile number such as 01712345678 it returns only one result. So I am unable to use it for autocomplete.

@orangejulius
Member

Hi @toton6868,

As it stands right now, Pelias can't do autocomplete on phone numbers or other only-numeric inputs.

We intentionally prevent matching of partial number inputs and have done so for a very long time. It made a big difference when matching most addresses. We didn't consider phone numbers at the time.

However, since then we've added more logic that might allow us to relax this constraint.

@missinglink, considering that we now only match incomplete text on the ngrams index, do you think we could essentially revert pelias/schema#133? It would definitely reduce the complexity of our schema quite a bit.

@toton6868
Author

@orangejulius Thank you very much. It is quite understandable that phone numbers would increase the complexity of address searching. I can probably solve this in a different way. But in the case of reverse geocoding, the custom layer is not returning any data: if I use &layers=custom it returns null data, whereas &layers=address works smoothly.

@missinglink
Member

missinglink commented Aug 23, 2019

I would be hesitant to produce ngrams for numerals: the hits for an input of 1 main st would include 1, 10, 100, 101, 102, etc.

This would greatly reduce both performance and accuracy :(

[edit] a query like 1 m would be a cluster killer!
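
To make the concern concrete, here is a minimal sketch of prefix (edge n-gram) indexing applied to numeric tokens. It is a toy in-memory index, not Pelias' actual analyzer or schema; `edgeNgrams`, `housenumbers`, and `index` are names made up for the illustration.

```javascript
// Illustrative only: a toy prefix (edge n-gram) index, not Pelias' real analyzer.
function edgeNgrams(token) {
  const grams = [];
  for (let i = 1; i <= token.length; i++) {
    grams.push(token.slice(0, i)); // "100" -> "1", "10", "100"
  }
  return grams;
}

// Hypothetical housenumbers on a single street.
const housenumbers = ['1', '10', '100', '101', '102', '2', '21'];

// Index every prefix of every housenumber.
const index = new Map();
for (const hn of housenumbers) {
  for (const gram of edgeNgrams(hn)) {
    if (!index.has(gram)) index.set(gram, []);
    index.get(gram).push(hn);
  }
}

// Looking up the single digit "1" now hits every housenumber starting with 1.
const hits = index.get('1') || [];
console.log(hits); // [ '1', '10', '100', '101', '102' ]
```

With only seven documents the fan-out is harmless; the worry is how it scales when every numeric token in a large index is expanded this way.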

@missinglink
Member

missinglink commented Aug 23, 2019

@toton6868 I'd suggest changing your config: the auto_discover feature is now preferred over manually specifying sources/layers.

#1316
#1319

In order to use it you can set auto_discover to true and remove everything else inside the targets block.
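
As a sketch of what that suggestion would look like when applied to the config posted above (the other keys of the api block, such as services, are assumed to stay unchanged and are omitted here):

```json
"api": {
  "targets": {
    "auto_discover": true
  }
}
```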

@orangejulius
Member

@missinglink, now that we only search against the ngrams field for the last word in an autocomplete input, I don't think that the query /v1/autocomplete?text=1 main st would match any addresses with the housenumber 100.

Let me know if I'm right. The way I understand it generating ngrams for numeric tokens would only affect queries such as the following:

  • 1800123456 (a phone number, purely numeric)
  • 100 (an input that only has a single numeric token so far. it could for example be the start of typing 1000 main street)
  • gleimstrasse 50 (or other address queries where the housenumber is entered after the street name. this would now match gleimstrasse 501, gleimstrasse 5000, etc. but also gleimstrasse 5 would now match gleimstrasse 50).
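
The cases above can be sketched with a toy matcher in which only the last input token is treated as a prefix and all earlier tokens must match exactly. This models the understanding described in this comment, not Pelias' real query logic; `matchesAutocomplete` and the `allowNumericPrefix` switch are hypothetical.

```javascript
// Toy model of "only the last token is matched as a prefix".
// Not Pelias' query code; `allowNumericPrefix` is a hypothetical switch that
// stands in for "ngrams are generated for numeric tokens".
function matchesAutocomplete(input, candidate, allowNumericPrefix) {
  const queryTokens = input.toLowerCase().split(/\s+/).filter(Boolean);
  const docTokens = candidate.toLowerCase().split(/\s+/).filter(Boolean);
  if (queryTokens.length === 0) return false;

  const last = queryTokens[queryTokens.length - 1];
  const complete = queryTokens.slice(0, -1);

  // Every completed token must match a document token exactly.
  if (!complete.every((t) => docTokens.includes(t))) return false;

  // A purely numeric last token is matched exactly unless prefixes are allowed.
  const isNumeric = /^\d+$/.test(last);
  if (isNumeric && !allowNumericPrefix) {
    return docTokens.includes(last);
  }
  return docTokens.some((t) => t.startsWith(last));
}

console.log(matchesAutocomplete('0171', '01712345678', false)); // false
console.log(matchesAutocomplete('0171', '01712345678', true)); // true
console.log(matchesAutocomplete('gleimstrasse 50', 'gleimstrasse 501', true)); // true
console.log(matchesAutocomplete('1 main st', '100 main st', true)); // false
```

In this model the last call stays false even with numeric prefixes enabled, because 1 is no longer the final token once main st has been typed.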

It would be a bit of work to test, but it might allow matching some addresses with fewer keystrokes, especially if there's a focus point to narrow down the search space.

@toton6868
Author

Sorry for the late reply. After seeing your concern I changed my dataset: in the name field I put the owner name, and in the JSON field I put the phone number. Still, reverse geocoding and nearby are unable to show them in the returned results; even &layers=custompoi returns null. Do reverse geocoding and nearby work with custom data?

@orangejulius
Member

@toton6868 indeed you're right. looks like the reverse endpoint currently filters on a needlessly strict hardcoded list of layers:

api/query/reverse.js

Lines 44 to 45 in e04a6b2

// only include non-coarse layers
vs.var( 'layers', _.intersection(clean.layers, ['address', 'street', 'venue']));

Turns out we knew about this already from #1161, but hadn't fixed it.
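
A small sketch of what the quoted line computes for a custom layer. The `intersection` helper is a dependency-free stand-in for lodash's `_.intersection`, and `NON_COARSE` is a name chosen for the illustration. It shows only the filter itself, not how the rest of the reverse query treats an empty layer list.

```javascript
// Stand-in for lodash's _.intersection, to keep the sketch dependency-free.
function intersection(a, b) {
  return a.filter((x) => b.includes(x));
}

// The hardcoded list from the quoted line in api/query/reverse.js.
const NON_COARSE = ['address', 'street', 'venue'];

// A request with &layers=custom is filtered down to nothing...
const customOnly = intersection(['custom'], NON_COARSE);
// ...while &layers=address passes through unchanged.
const addressOnly = intersection(['address'], NON_COARSE);

console.log(customOnly); // []
console.log(addressOnly); // [ 'address' ]
```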

Looks like what we'll probably have to do is add something to our TypeMapping system (https://github.com/pelias/api/blob/master/helper/TypeMapping.js) that allows for configuring the list of administrative areas, so that we can then generate a list of all non-admin areas, for use by reverse geocoding and elsewhere.
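
One possible shape for that idea, as a rough sketch only: given a configurable list of administrative layers, the non-admin list is everything else. The function name, the sample layer lists, and the way they would be wired into TypeMapping are all assumptions, not existing Pelias code.

```javascript
// Hypothetical sketch; these names are not part of helper/TypeMapping.js.
function nonAdminLayers(allLayers, adminLayers) {
  const admin = new Set(adminLayers);
  return allLayers.filter((layer) => !admin.has(layer));
}

// Example inputs: all known layers, including a custom one from the CSV importer.
const allLayers = ['country', 'region', 'locality', 'address', 'street', 'venue', 'custom'];
// Assumed to come from configuration rather than being hardcoded.
const adminLayers = ['country', 'region', 'locality'];

const reverseLayers = nonAdminLayers(allLayers, adminLayers);
console.log(reverseLayers); // [ 'address', 'street', 'venue', 'custom' ]
```

Deriving the list by exclusion means a newly imported custom layer is picked up without touching the reverse query code.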

@toton6868
Author

Thanks for the quick reply. It would be a great help and a great feature if you implement it in the near future. For now, is there any quick and dirty change I can make in the code so that it returns custom layers? That would solve my problem for the time being. I have tried the following hardcoded line with no luck:

vs.var( 'layers', _.intersection(clean.layers, ['address', 'street', 'venue', 'custompoi']));

@toton6868
Author

toton6868 commented Aug 30, 2019

Solved it by hardcoding the custompoi layer in the api/query/reverse_defaults.js file.

I changed the following line
'layers': ['venue', 'address', 'street'],

to
'layers': ['venue', 'address', 'street', 'custompoi'],

Thanks

@orangejulius
Member

Since it looks like we've solved all the issues here or associated them with known issues tracked elsewhere, I think we can close this issue.

If there's anything else, don't hesitate to let us know.

By the way, we are considering making changes to Pelias so that phone numbers would be compatible with autocomplete: see pelias/schema#379


3 participants