The Sustainable Green Infrastructure Monitoring (SGIM) project uses sensors, provided by Opti, that measure water diverted into the ground. This repo contains the ETL code that uploads data from Opti's sensors to Socrata open data portals. It is a generalized version of the code that uploads SGIM data to Chicago's open data portal.
This ETL depends on the open source "Open Data ETL Utility Kit" framework. The Utility Kit uses the open source Kettle data integration program to help automate uploads to a Socrata open data portal. This repository uses that Utility Kit and is further preconfigured to work with Opti's APIs.
This ETL requires that you first install and configure the "Open Data ETL Utility Kit". Download that repository and follow its installation instructions.
Place this repository in the ./open-data-etl-utility-kit/ETL/ directory. You may name the folder whatever you prefer, though we recommend following our naming conventions.
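The placement step can be sketched as a small POSIX shell helper. This is only an illustration: `install_etl` is a hypothetical function that is not part of this repository, and its three arguments (the path to your checkout, the path to the Utility Kit, and your preferred folder name) are placeholders you supply.

```shell
# Hypothetical helper, not part of this repository.
# Copies a checkout of this repository into the Utility Kit's ETL directory
# under a folder name of your choosing.
install_etl() {
  src="$1"        # path to your checkout of this repository
  kit_root="$2"   # path to open-data-etl-utility-kit
  name="$3"       # folder name you prefer
  if [ ! -d "$kit_root/ETL" ]; then
    echo "ETL directory not found under $kit_root" >&2
    return 1
  fi
  if [ -e "$kit_root/ETL/$name" ]; then
    echo "$kit_root/ETL/$name already exists; not overwriting" >&2
    return 1
  fi
  cp -R "$src" "$kit_root/ETL/$name"
}
```

For example, `install_etl ./this-repo ./open-data-etl-utility-kit sgim` would place the files in ./open-data-etl-utility-kit/ETL/sgim, where both paths and the folder name are stand-ins for your own.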
Opti uses a token to authenticate users of their API. Obtain your token, place it in the credentials_sample.csv document, and save the file as credentials.csv. The ETL is pre-configured to read from credentials.csv for authentication.
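A minimal sketch of the credentials step, assuming a POSIX shell: the hypothetical helper `setup_credentials` (not part of this repository) copies the sample file to the name the ETL reads and refuses to overwrite an existing file. It does not insert the token; you still need to edit credentials.csv by hand, following the layout of the sample file. The `chmod` call is an optional precaution added here, not something the repository requires.

```shell
# Hypothetical helper, not part of this repository.
# Copies credentials_sample.csv to credentials.csv in the given directory
# (default: current directory) without overwriting an existing file.
setup_credentials() {
  dir="${1:-.}"
  if [ -e "$dir/credentials.csv" ]; then
    echo "credentials.csv already exists; leaving it untouched" >&2
    return 0
  fi
  cp "$dir/credentials_sample.csv" "$dir/credentials.csv" || return 1
  # Optional precaution: make the token file readable by its owner only.
  chmod 600 "$dir/credentials.csv"
}
```

Run `setup_credentials` from this repository's folder, then open credentials.csv and replace the sample value with your Opti token.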
The bash script, ggws-77ih.sh, runs the ETL process. Open the script and follow the instructions in it to configure the file names to match your directory.
Use a task scheduler, such as crontab, to run the ETL at a regular time. See the recommended configuration to set up a regularly updating process that also logs the outcome of each update.
Automation is currently initiated by a Bash script, so a Bash environment (such as Cygwin on Windows) is needed to run it.
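One way to combine scheduling and logging is a small wrapper around ggws-77ih.sh. The sketch below is an assumption, not the recommended configuration referred to above: `run_etl` is a hypothetical function, and the directory and log file are placeholder arguments. It runs the ETL script from its own directory, appends the script's output to a log, and records the exit status.

```shell
# Hypothetical wrapper, not part of this repository.
# Runs ggws-77ih.sh from the given directory and appends its output and
# exit status to the given log file.
run_etl() {
  etl_dir="$1"    # folder containing ggws-77ih.sh
  log_file="$2"   # log file to append to (use an absolute path)
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] starting ETL" >> "$log_file"
  (cd "$etl_dir" && bash ggws-77ih.sh) >> "$log_file" 2>&1
  status=$?
  echo "[$(date '+%Y-%m-%d %H:%M:%S')] finished with exit status $status" >> "$log_file"
  return "$status"
}
```

If this function is saved in a script that ends by calling it with your own paths, a crontab entry such as `0 6 * * * /bin/sh /path/to/run_sgim_etl.sh` would run the ETL daily at 06:00. The script path and the schedule are illustrative only.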
Below are the principal components of the repository and a brief description of their role.
SGIM_Results_ggws-77ih.ktr - Main file which contains the ETL. This can be opened in Kettle (Spoon) and viewed.
SGIM-Admin_DataStreams_Original.txt - Shows all of the streams available through Opti's ETL.
SGIM-Admin_DataStreams_Edited.csv - The actual streams that will be used in the ETL. The list in the repository contains the sensors currently displayed on Chicago's portal.
ggws-77ih.sh - A bash script which runs the ETL.
credentials_sample.csv - A sample of how credentials should be specified; it is not itself used in the ETL. Please see the instructions to configure credentials.
Other files in the repository that are not listed here either indirectly support the workflow or are administrative in nature. Note that .gitignore includes credentials.csv, which excludes it from commits. You may wish to change this setting for internal-only forks of this repository within your organization, but be cautious if you later want to contribute back to this public repository.
See the license file.
Contributions, such as pull requests and issues, are welcome. If you plan on submitting a pull request, please complete a Contributor License Agreement.