Unclear instructions how to build address.db from openaddress data #207

Open
vrozental opened this issue Jul 21, 2019 · 2 comments

@vrozental

The instructions say:

./interpolate oa address.db street.db < /data/oa/nz/countrywide.csv

How do I build a worldwide address.db?
Should I iterate over all CSV files in /data/oa?
There are 57 country directories, but only 28 countrywide.csv files.

Please add detailed instructions for building a worldwide address.db.
Thank you!

@arne-cl
Member

arne-cl commented Sep 24, 2019

Hello @vrozental,

here's an example of how to build address.db and street.db for a single country (here: Luxembourg) without installing anything except Docker. Please let me know if this works for you. If it does, I can also provide an example for worldwide databases.

If you have a Dockerfile with this content:

FROM golang:1.13-alpine AS builder

# install pbf converters
WORKDIR /pelias
RUN apk add git wget gcc musl-dev && \
    go get github.com/missinglink/pbf && \
    git clone https://github.com/pelias/pbf2json.git

# download datasets for Luxembourg
WORKDIR /pelias/osm
# Source: https://download.geofabrik.de/europe.html
RUN wget https://download.geofabrik.de/europe/luxembourg-latest.osm.pbf

WORKDIR /pelias/openaddresses
# Source: http://results.openaddresses.io/?runs=all#runs
RUN wget https://data.openaddresses.io/runs/677497/lu/countrywide.zip && \
    unzip countrywide.zip && rm countrywide.zip

# extract polylines from openstreetmap data
WORKDIR /pelias/polylines
RUN pbf streets /pelias/osm/luxembourg-latest.osm.pbf > /pelias/polylines/luxembourg-latest.osm.0sv

FROM pelias/interpolation

COPY --from=builder /pelias /pelias
WORKDIR /code/pelias/interpolation

ENV BUILDDIR=/data/builddir
ENV WORKINGDIR=/data/workingdir
ENV POLYLINE_FILE=/pelias/polylines/luxembourg-latest.osm.0sv
ENV OAPATH=/pelias/openaddresses
ENV PBF2JSON_FILE=/pelias/osm/luxembourg-latest.osm.pbf
ENV PBF2JSON_BIN=/pelias/pbf2json/build/pbf2json.linux-x64

# run the script that converts the input data into address.db and street.db
CMD ["./interpolate", "build"]

You can build the image with docker build -t interpolation-data-generation .
Afterwards, create the output directory for the generated files and run the
./interpolate build script like this:

mkdir -p /tmp/interpolation/builddir
docker run -v /tmp/interpolation:/data -ti interpolation-data-generation

On my machine, the run takes about five minutes and the result looks like this:

tree /tmp/interpolation/builddir
/tmp/interpolation/builddir
├── address.db
├── conflate_oa.err
├── conflate_oa.out
├── conflate_oa.skip
├── conflate_osm.err
├── conflate_osm.out
├── polyline.err
├── polyline.out
├── street.db
├── tmp
│   └── leveldb
│       ├── 000002.log
│       ├── 000003.ldb
│       ├── CURRENT
│       ├── LOCK
│       ├── LOG
│       └── MANIFEST-000000
├── vertices.err
├── vertices.out
└── vertices.skip
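
A quick sanity check on the output directory can catch a failed run early. The following is a minimal sketch, not part of the pelias/interpolation tooling: check_build is a hypothetical helper name, and it only assumes the file layout shown in the listing above (address.db, street.db and the *.err logs).

```shell
#!/bin/sh
# Hypothetical helper (not shipped with pelias/interpolation):
# verify that both databases exist and are non-empty, and point out
# any error log that contains output.
check_build() {
  builddir="$1"
  status=0

  # Both databases must exist and have a size greater than zero.
  for db in address.db street.db; do
    if [ ! -s "$builddir/$db" ]; then
      echo "missing or empty: $builddir/$db" >&2
      status=1
    fi
  done

  # A non-empty *.err file is only reported, not treated as fatal,
  # because this sketch cannot tell warnings from real failures.
  for log in "$builddir"/*.err; do
    if [ -s "$log" ]; then
      echo "non-empty error log: $log" >&2
    fi
  done

  return "$status"
}
```

With the directory from the example above, it would be called as check_build /tmp/interpolation/builddir; a non-zero exit status means that at least one of the two databases is missing or empty.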

@missinglink
Member

There is a script, https://github.com/pelias/interpolation/blob/master/script/concat_oa.sh, which can combine multiple OA files into a single CSV stream for you.
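
For illustration, the general idea of merging several CSV files into one stream can be sketched as below. This is not the contents of concat_oa.sh; concat_csv is a hypothetical helper, and it assumes that every CSV file under the given directory starts with the same header row.

```shell
#!/bin/sh
# Hypothetical helper, not the project's concat_oa.sh:
# write every *.csv file found under a directory to stdout as one stream,
# keeping the header row of the first file only.
# Assumption: all files share the same header row.
concat_csv() {
  dir="$1"
  find "$dir" -type f -name '*.csv' | sort | {
    first=1
    while IFS= read -r file; do
      if [ "$first" -eq 1 ]; then
        # First file: print everything, including the header.
        awk '{ print }' "$file"
        first=0
      else
        # Later files: skip the header row.
        awk 'NR > 1 { print }' "$file"
      fi
    done
  }
}
```

Assuming the data lives in /data/oa as in the original question, the stream could then be piped into the command quoted there: concat_csv /data/oa | ./interpolate oa address.db street.db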
