Data
Each OpenBlock deployment requires data sets that are largely unique depending on the geographic region being served.
To geocode addresses and let users navigate at the block level, OpenBlock requires a database of the streets in your area, including the address range of each individual street segment. If you are in the U.S. and your city hasn't had much new development since the year 2000, the U.S. Census Bureau's TIGER/Line files (http://www.census.gov/geo/www/tiger/) may be sufficient. Importers also exist for ESRI Shapefiles with appropriate attributes (TeleAtlas street data?).
Some cities, such as Portland and Washington, DC, have opened up their address data on http://openaddresses.org.
TODO: document how to load them
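To illustrate why per-segment address ranges matter for geocoding, here is a minimal sketch of matching a house number to a street segment. The `Segment` structure, its field names, and `find_segment` are hypothetical and are not OpenBlock's actual models or API; the sketch also assumes the common convention that each side of a street carries either odd or even numbers.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One block-length piece of a street (hypothetical structure)."""
    street: str
    left_from: int
    left_to: int
    right_from: int
    right_to: int


def _in_range(number, start, end):
    """True if number lies in the range and is on the same odd/even side."""
    low, high = min(start, end), max(start, end)
    return low <= number <= high and number % 2 == start % 2


def find_segment(segments, number, street):
    """Return the first segment whose address range covers the address."""
    for seg in segments:
        if seg.street.lower() != street.lower():
            continue
        if (_in_range(number, seg.left_from, seg.left_to)
                or _in_range(number, seg.right_from, seg.right_to)):
            return seg
    return None


# Made-up sample data: two blocks of one street.
segments = [
    Segment("Main St", 101, 199, 100, 198),
    Segment("Main St", 201, 299, 200, 298),
]
match = find_segment(segments, 250, "main st")
```

A real geocoder would also normalize street names (suffixes, directionals) and interpolate a point along the matched segment; this sketch only shows the range lookup.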
You need to add some named regions of interest to your database and provide their geographic boundaries. Usually, these regions are defined by providing a shapefile.
Typical examples:
- neighborhoods in your area
- local zip codes
Some places to find boundaries:
- http://koordinates.com/#/layers/category/boundaries/
- http://libremap.org/data/boundary/
- http://www.gisinventory.net/
- http://www.data.gov/catalog/geodata & http://geodata.gov
- Localized data catalogs: http://wiki.civiccommons.com/
TODO: document how to load them, where shapes might be found
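As a rough illustration of what a region boundary is used for, the sketch below tests whether a point falls inside a polygon using the standard ray-casting method. It is only an assumption-laden example: the function name and the plain list-of-tuples polygon are hypothetical, and a real deployment would more likely rely on a spatial database or GIS library than on hand-written geometry.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a ray going right crosses.

    polygon is a list of (x, y) vertex tuples; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# A made-up square "neighborhood" in arbitrary coordinates.
neighborhood = [(0, 0), (4, 0), (4, 4), (0, 4)]
in_region = point_in_polygon(2, 2, neighborhood)
out_of_region = point_in_polygon(5, 2, neighborhood)
```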
You'll need to find some online data to feed to your system, and write (or modify) some "scraper" scripts to fetch that data and load it. What sort of data? Anything is potentially usable as long as each item has, at minimum:
- a date
- a location (e.g. a geographic point or street address)
- a description and/or title
If it doesn't have both a date and a location, it won't be very useful in OpenBlock, since that would violate one of the two key buzzwords ("hyperlocal" and "news").
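The minimum requirements above can be sketched as a simple check. The dictionary keys used here (`date`, `point`, `address`, `title`, `description`) are hypothetical names chosen for illustration, not OpenBlock's actual schema.

```python
from datetime import date


def is_usable(item):
    """True if the item has a date, a location, and a title or description."""
    has_date = isinstance(item.get("date"), date)
    has_location = bool(item.get("point") or item.get("address"))
    has_text = bool(item.get("title") or item.get("description"))
    return has_date and has_location and has_text


# Made-up sample items.
good = {"date": date(2011, 3, 1), "address": "100 Main St", "title": "Permit issued"}
no_location = {"date": date(2011, 3, 1), "title": "Permit issued"}
```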
- local news / blog feeds
- public meeting minutes
- public events calendar
- crime reports
- building permits
- health inspections
- Find out if there are license restrictions on using the data
- Find out if there are already feeds (RSS, Atom) you can use. Feeds are easy to parse and there is existing infrastructure to handle them.
- Find out if there are dates in the data.
- Find out if the data has already been geotagged with a latitude and longitude. If so, ideally it would be in the feed in a standard format like GeoRSS.
- If there are no existing feeds, and it's government data, it's time to start making phone calls and persuading your local officials to release their publicly-owned information. http://resource.org/8_principles.html has some good recommendations - emphasize the "machine readable" part.
- If you can't get a feed, but the data's on the web in some form, there is scraper infrastructure to handle other formats (HTML, CSV, PDF, ...) but you'll have to do some more scripting. Scraping HTML or PDF should be seen as a last resort, since any future visual redesign is likely to break your scraper, but sometimes that's all you can get.
For more information, see Ideal Feed Formats.
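To show why a geotagged feed is the easy case, here is a minimal sketch that reads an Atom feed carrying GeoRSS-Simple `georss:point` elements using only the Python standard library. The sample feed content and the `parse_entries` helper are made up for illustration; this is not OpenBlock's own scraper infrastructure.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "georss": "http://www.georss.org/georss",
}

# Made-up sample feed; a real scraper would fetch this over HTTP.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <title>Example permits feed</title>
  <entry>
    <title>Building permit issued</title>
    <updated>2011-03-01T12:00:00Z</updated>
    <georss:point>45.5231 -122.6765</georss:point>
  </entry>
  <entry>
    <title>Entry without a location</title>
    <updated>2011-03-02T12:00:00Z</updated>
  </entry>
</feed>
"""


def parse_entries(feed_xml):
    """Return a list of dicts with title, updated date, and (lat, lon)."""
    root = ET.fromstring(feed_xml)
    items = []
    for entry in root.findall("atom:entry", NS):
        # GeoRSS-Simple points are "latitude longitude", space separated.
        point = entry.findtext("georss:point", default="", namespaces=NS)
        lat_lon = None
        if point.strip():
            lat, lon = (float(v) for v in point.split())
            lat_lon = (lat, lon)
        items.append({
            "title": entry.findtext("atom:title", default="", namespaces=NS),
            "updated": entry.findtext("atom:updated", default="", namespaces=NS),
            "lat_lon": lat_lon,
        })
    return items


items = parse_entries(SAMPLE_FEED)
```

Entries without a `georss:point` come back with `lat_lon` set to `None`; those would need geocoding from a street address before they are useful.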
You'll need some scraper scripts.
Many communities already have local initiatives cataloging freely available data:
- List of local data initiatives on the Civic Commons wiki
- http://www.data.gov/catalog/geodata & http://geodata.gov