valhalla_export_edges v2 #442

Open
2 of 9 tasks
kevinkreiser opened this issue Feb 2, 2017 · 4 comments

@kevinkreiser
Member

(2016-10-17T15:12:25Z) @missinglink : originally created valhalla/tools#96

this ticket tracks changes that would improve how valhalla_export_edges.cc works for the https://github.com/pelias/interpolation project.

  • break lines when access changes

as per the case below, we would like to avoid using the section of the road reserved for pedestrians for address interpolation. 'cutting' the road into two will result in more accurate interpolation for vehicles.

example: pelias/interpolation#11

  • remove ferry routes and other non-road ways

we're only concerned with address data, so anything which doesn't have house numbers associated to it can be removed in order to keep the export file size down.

example: http://www.openstreetmap.org/way/6644126
example: http://www.openstreetmap.org/way/430610059#map=18/40.73282/-73.98087 - there are a bunch of 'other' paths in this area including highway:service and highway:footway and also some cycle tracks just to be super confusing!
example: http://www.openstreetmap.org/way/384288970 - this one appears to be a bus stop.

  • increase the 'greediness'

there are some linestrings which look like they could be longer, such as:

yt|o|Artde`FyLhCs`@vM??n^lqBlJroAsArd@eJjV_c@`r@qWzJuJhM_Ejj@rA~]fi@vm@tNhWhRfD~Np\p\dy@fDl_@pQb[zPtc@bL~|@^n]cMp{@c[fc@_]|i@qRvv@eErp@?`fArA|qAzP|rAj`@`{@nHp]gCp[yM~]yVj`@c[~]m_@jk@kp@xiAiRfOgh@vMs_@yAiw@cGcKaHc[sP}d@oTsPmIgDpQxG~I|TbPb[~]x\z`@ze@n]fg@`S`l@?~gAcGnqAaSre@?bUzAzk@b[tc@yAlJ`G\xMgD`G{UbG_NxBsUl@ucArQiM|Sm@lJxGjLfYl@jp@wDfg@dZ~b@b[xLl_@ra@`g@zUrQ|c@m@xk@}Jre@gDtJuE??l_@{U~b@eOtOrExGnTjBl^hLpRnXrQxHb[wDpRwMtYiCdYrPvXnIb\~N~]fg@dn@zGlU?dc@lIxMlZrP`MtYjQd[|TlTvNtn@tIvb@bLvNrPrPzF`SlJtYlJdZ|U|S??`MtP|DbQt@vv@|D~{AvDvMdOnTNL|JbGxk@hC??fNz@nDrGxAhVaBfc@gDhWyH`SqLpG??}DxBox@vc@sAdE?\GtEjBtDvb@gCbVhB??~XxBpRrF~RrFbk@|t@nHzUfh@bo@hXlt@v]ls@lYz~@zK~^rFl^t@lT|DhXn^n|@jKrP|ZdnATl@jBtD`BvD`M`G`RlUlJ~]bGzt@FjAlElU~Sbo@hLvc@zArFbBpGpLtYfDvDrKvM|EtFvl@jt@pzAtcA|O~RbLvXzKju@rKju@pLtn@rBhWGlTsApRyHli@yFrPqHnI}lB~fAmDxB{GvDuIjKkFdOu@hMd@bQ??zAjBxe@hk@|@jAd@l@`w@z~@h~AnrA|ZgEjy@|Jxa@sF~SxBlEpGvM~r@`IlTrF~IzJ`HtZ`HxLrFfNzKj[l_@~IbQnHz_@jLxa@hWvm@t|A~nChXdx@??vR|i@`]h`AhBjV{@jUsFlUe_@dd@aMlUoNdd@{Ar[?\dJxa@hWzj@dKvMxVrPfOjWf|@viClK|}@jAfb@eAre@gDh`B_Cha@kLz`@aMvb@wh@j_AuEfXsEhMgIbGiXnH}IdFkGzKuE~RcA~^aB`RuFbQ{PvXeJzKaRjKm_@rFkQm@}m@_TyM|@wRrFmd@pR}c@xa@eUbPck@~SwIxM_CzUWbFU~I`CpR|IdYbGr[hBjiAhHhMdO~IbGtDxLbQfDlUsAdYiWvjBcAzAcG|J_TlU}SfMiq@z`@ONi\jUqMfOoIdOk@xBwDxMqBnSiCfw@oCxLaHlJe_Afn@cUOsKeEcWo]uDyBeE_@{FxBeKdP_ItE{`@eE{e@xAeJtEcj@li@y\zK_J~HkKvYmOhW{VbQshAjj@yeAhb@ah@xKyu@|J}n@eEgI~Hyf@fx@al@rZgb@fOO?aRfN}OpQcQp\oSjVw|@xl@wl@lh@{nAfwAaq@bQaN|J}NtYsG|Is_@xWcWrF{JNs`@sFqHaHuOwX}OcRgHyB??aHyBiWk_@mE}J}@sQwCiL_ImJeFwNmEia@mOgYgNiCcB|@{FhBuI~IyMjUiCnI{@hCkA`RGpQnDd[?dc@cBzVcLfYs@dOlEvb@GrPmEfO_IdEi~BppAqg@|^{e@|Sc[`Hg]^{Q}@uJ]}DhBu@rFbBdF|Dz@fOwC~\hBbGtEvXjBtc@nIra@fCzJdFtn@tDxWeEpWkAzKzAbKnHnvA|JpHbF`HhN`CnR??u@dPGzAyBnH??{@xByHdFwH?{ViM??kPoIkG{@eTgDw]}@yiBb\md@rE{cAnTq_Bdx@iHfDgIpGav@xl@eZ~]u~@n{A??m~@byA??}N`\uEnToCvW??GnT
  • export all the names (including alt names)

this is not for interpolation, it's for importing into Pelias. since we do not link the exported data to OSM, we lose the alternate names for roads, and possibly translations. It would be ideal to export all versions of the names we found for each segment.

the current format allows for a \0 separated list of names so it should 'just work' out of the box. how common are alternate names for streets, or names in different languages? If possible it would be nice to have the OSM tag name too, where the tag name has semantic value (such as language codes).

  • deduplicate coordinates in linestring

there appear to be duplicate adjacent lat/lon pairs when decoding the polyline format.
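A minimal consumer-side sketch of the deduplication, assuming the export uses the Google-style encoded polyline algorithm with 6 decimal digits of precision (the precision is an assumption, not something stated in this ticket). In that encoding `??` is a zero delta for both latitude and longitude, which is one way duplicate adjacent points show up. `decode_polyline` and `dedupe_adjacent` are hypothetical helper names.

```python
def decode_polyline(encoded, precision=6):
    """Decode a Google-style encoded polyline into (lat, lon) tuples."""
    factor = 10 ** precision
    coords = []
    lat = lon = 0
    i = 0
    while i < len(encoded):
        deltas = []
        for _ in range(2):  # latitude delta first, then longitude delta
            shift = result = 0
            while True:
                chunk = ord(encoded[i]) - 63
                i += 1
                result |= (chunk & 0x1F) << shift
                shift += 5
                if chunk < 0x20:  # continuation bit not set: last chunk
                    break
            # Undo the zigzag encoding of the sign.
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lon += deltas[1]
        coords.append((lat / factor, lon / factor))
    return coords


def dedupe_adjacent(coords):
    """Drop any point identical to the one immediately before it."""
    out = []
    for point in coords:
        if not out or out[-1] != point:
            out.append(point)
    return out


# Short polyline quoted elsewhere in this ticket; it ends in a "??" zero delta.
points = decode_polyline("oihr`BgaC??")
unique = dedupe_adjacent(points)
print(len(points), len(unique))
```

Only exact adjacent repeats are removed, so a line that legitimately returns to an earlier point later on is left alone.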

  • configurable minimum line length

[edit] marking this as completed, this can easily be done by the consumer, generally more data is better -missinglink

some linestrings are very short, it would be nice to be able to skip exporting them if they were under a certain threshold.

example:

3640618|mxhdaBkFV~E
3965397|oihr`BgaC??
10213015|w}owpAe_E]H
10213037|ucpwpAu|DTM
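Since the edit note says the consumer can apply the threshold itself, here is a rough sketch of such a filter. It assumes precision-6 encoded polylines, that everything after the first `|` in the rows above is the polyline (the meaning of the numeric prefix is not stated in this ticket), and a hypothetical threshold `MIN_LENGTH_M`.

```python
import math


def decode_polyline(encoded, precision=6):
    """Decode a Google-style encoded polyline into (lat, lon) tuples."""
    factor = 10 ** precision
    coords, lat, lon, i = [], 0, 0, 0
    while i < len(encoded):
        deltas = []
        for _ in range(2):  # latitude delta first, then longitude delta
            shift = result = 0
            while True:
                chunk = ord(encoded[i]) - 63
                i += 1
                result |= (chunk & 0x1F) << shift
                shift += 5
                if chunk < 0x20:  # last chunk of this number
                    break
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lon += deltas[1]
        coords.append((lat / factor, lon / factor))
    return coords


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371008.8 * math.asin(math.sqrt(h))


def line_length_m(encoded):
    """Sum of segment lengths along the decoded polyline."""
    pts = decode_polyline(encoded)
    return sum(haversine_m(p, q) for p, q in zip(pts, pts[1:]))


MIN_LENGTH_M = 5.0  # hypothetical threshold, pick whatever suits the consumer

rows = [
    "3640618|mxhdaBkFV~E",
    "3965397|oihr`BgaC??",
    "10213015|w}owpAe_E]H",
    "10213037|ucpwpAu|DTM",
]

# Split on the first "|" only, because "|" is also a valid polyline character.
kept = [r for r in rows if line_length_m(r.partition("|")[2]) >= MIN_LENGTH_M]
print(kept)
```

Keeping the filter on the consumer side means the export stays complete and each consumer can choose its own cutoff.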
  • include private roads

the road segment linked below is not exported, possibly due to it being labelled as pedestrian but also vehicle:private
example: http://www.openstreetmap.org/way/4618808#map=18/52.49653/13.36355

  • rue de rivoli

this Parisian road is exported as several line strings. on investigation it seems like the geometry is correctly attributed in openstreetmap (names are correct).

It would be ideal if this road could be exported as a single line.

as an example, this intersection appears to be a point where the geometries are split:

[edit] after a second look, I think the graph traversal is correct, I think the duplicates are due to some pedestrian/cycle ways in the area with the same name.

uv_e|AmvonCe@nDe@zCe@lD}EnTyLtj@mFhT]nAsAxGmJxd@iLfh@_@nAyAtHcB`HkG~XgCnLgDbOuJrb@W|@iBrI{Kff@]tAm@vC_I~\qCjMUzAeAvE{Kpf@]pA{AhHiBhIoDbOM|@eAtEcAlE_Opp@UfAsAtGcFrT{L`k@U|@e@bCcB~F]fAgD|JeExOqM|m@uJrb@kAnFsj@jdC{FpWsFlVaMvi@]rAWnAsA`G]dByBpJcAbE}UbeA{@bEk[lwA}@nDk@dCeA`F{@bEcBhHUpAsBrIs@bDaCpKcA`FaSl}@sAlFmO`r@}c@drBaCdKgDlOaS`}@aQjw@u@vCsP~u@aCpTOnC?hI?vD?r@?s@?wD?iINoC`CqTrP_v@t@wC`Qkw@`Sa}@fDmO`CeK|c@erBlOar@rAmF`Sm}@bAaF`CqKr@cDrBsITqAbBiHz@cEdAaFj@eC|@oDj[mwAz@cE|UceAbAcExBqJ\eBrAaGVoA\sA`Mwi@rFmVzFqWrj@kdCjAoFtJsb@pM}m@dEyOfD}J\gAbB_Gd@cCT}@zLak@bFsTrAuGTgA~Nqp@bAmEdAuEL}@nDcOhBiIzAiH\qAzKqf@dAwET{ApCkM~H_]l@wC\uAzKgf@hBsIV}@tJsb@fDcOfCoLjG_YbBaHxAuH^oAhLgh@lJyd@rAyG\oAlFiTxLuj@|EoTd@mDd@{Cd@oDd@oD|Igo@l@uDnOodA~BmPdPafAvD{Vd@yClDuV`Iqi@r@_Fd@mDlEcZfJ}m@l@cEj@yDfJon@bAcH|@uFpG}c@tDyWvNqaAhCkQhBuMiBtMiCjQwNpaAuDxWqG|c@}@tFcAbHgJnn@k@xDm@bEgJ|m@mEbZe@lDs@~EaIpi@mDtVe@xCwDzVeP`fA_ClPoOndAm@tD}Ifo@e@nD
  • do not export main road refs

I noticed that in London (possibly common throughout the UK) there are alphanumeric identifiers used to denote major road ways, eg: 'A1202' or 'B501'.

The names for these road ways are exported and when used to group ways by name, they group together roads which actually have different names.

This makes it difficult to perform grouping and results in problems with duplication and incorrect association of points for address interpolation.

For example, at location 18/51.51575/-0.15543 'Portman Square' and 'Gloucester Place' are grouped together because they share the ref A41.

Same issue in Germany, eg: 18/52.51617/13.38611

A41

It may be as simple as excluding the ref tag from export. A more general-use strategy would be to give exported names a key->value syntax so the consumer could identify and discard names it felt were not applicable to its domain.

If the ref values are also used to walk the graph it might be a little trickier?

eg: http://www.openstreetmap.org/way/4253840
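Until the export changes, a consumer could try to discard ref-like names before grouping. The sketch below is only a heuristic under stated assumptions: names arrive as a \0 separated list (as described earlier in this ticket), and a ref looks like one to three capital letters followed by up to four digits. `REF_PATTERN`, `split_names` and `drop_refs` are hypothetical names, and the pattern will misclassify streets whose real name has that shape.

```python
import re

# Heuristic only: capital letters, an optional space or hyphen, then digits,
# e.g. "A41", "A1202", "B501" or "B 2".
REF_PATTERN = re.compile(r"^[A-Z]{1,3}[ -]?\d{1,4}$")


def split_names(field):
    """Split a NUL-separated name field into a list of non-empty names."""
    return [name for name in field.split("\0") if name]


def drop_refs(names):
    """Keep only the names that do not look like a road ref."""
    return [name for name in names if not REF_PATTERN.match(name)]


names = split_names("Portman Square\0A41")
street_names = drop_refs(names)
print(street_names)
```

A key->value syntax in the export would make this guesswork unnecessary, since the consumer could filter on the tag key instead of the shape of the value.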

@missinglink
Contributor

another example for the 'increase the greediness' item: this road is currently being exported as several segments with the same name:

Lancaster Drive Northeast
[ -122.983699, 44.953368 ]

@kevinkreiser
Member Author

as a first pass at fixing some of this we are going to add the way ids to the output of the export. this will need to be another variable-width column, so we'll need to update the spec to include the number of entries per column. we'll use 1 byte for the count, so a given column (names, way ids) can't have more than 255 entries. this should be all of 5 minutes of work 😄
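To make the idea concrete, here is one possible byte layout for a counted variable-width column. This is purely illustrative and not the actual spec: it assumes each column is a 1-byte count followed by that many NUL-terminated UTF-8 entries, and `pack_column` / `unpack_column` are hypothetical helpers. The names and ids are placeholder values.

```python
def pack_column(values):
    """Encode a column as a 1-byte count plus NUL-terminated UTF-8 entries."""
    if len(values) > 255:
        raise ValueError("a 1-byte count can describe at most 255 entries")
    body = b"".join(v.encode("utf-8") + b"\0" for v in values)
    return bytes([len(values)]) + body


def unpack_column(buf, offset=0):
    """Decode one column starting at offset; return (values, next_offset)."""
    count = buf[offset]
    offset += 1
    values = []
    for _ in range(count):
        end = buf.index(b"\0", offset)
        values.append(buf[offset:end].decode("utf-8"))
        offset = end + 1
    return values, offset


# Placeholder values: two names followed by two way ids.
names = ["Portman Square", "Gloucester Place"]
way_ids = ["4253840", "6644126"]

record = pack_column(names) + pack_column(way_ids)
decoded_names, pos = unpack_column(record)
decoded_ids, pos = unpack_column(record, pos)
print(decoded_names, decoded_ids)
```

With an explicit count per column, a reader knows where the names end and the way ids begin without needing a second separator character.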

@missinglink
Contributor

missinglink commented Apr 19, 2017

let's hold off until we've had a quick call to discuss, I don't want to waste your time.

the ideal would be to get all the names from you, but it sounds like that's not possible without changing your whole data model.

the idea of exporting the ids is a good alternative option but would require some code be written to grab the names from OSM and associate them to the polylines.

right now it's too far out-of-scope with our quarterly goals for me to build that, so making a spec change now would probably cause breaking changes to the existing code with no real short-term benefit.

@kevinkreiser
Member Author

kevinkreiser commented Apr 19, 2017

yep, i wasn't going to do anything until i got the word that you actually want it. i should take more care when wording these things to make sure it's clear, sorry about that. i was just in haste to get it written down (memory like a sieve).

we do also want to have these extra names but we won't be adding those to our data in the near term.

IMHO using the ids muddies the waters a bit because we are still merging using the names. it's completely possible that we would give ids for ways that don't have the same full list of names but do share the default names. anyway yeah... the real solution is probably to get it into our data so we can both benefit from it and you don't have to go to the trouble of keeping data in a look-aside table.
