New York, like many other cities across the United States, publishes open data sets that are easily accessible via a web portal as well as an API. Open Data Week is an event held in early March to promote data literacy as well as the rich data sets generously made available by the city.
Our project introduces practical machine learning via Python with the scikit-learn library. Python is a minimalist, interpreted programming language that is well suited for data analysis. Scikit-learn implements many statistical learning algorithms with a consistent API. Our goal is to demystify ML via a simple introduction that won't get into the math but still discusses the algorithms beyond just fit() and predict().
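To give a feel for what a consistent API means in practice, here is a minimal sketch. It is not taken from the project itself: the data is synthetic (generated with `make_classification`), and the two models and their parameters are arbitrary choices made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for a real data set (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two very different algorithms, driven by exactly the same calls.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

predictions = {}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                  # learn from the training rows
    predictions[name] = model.predict(X_test)    # predict labels for unseen rows
    scores[name] = model.score(X_test, y_test)   # mean accuracy on the test rows
    print(f"{name}: accuracy = {scores[name]:.2f}")

# Looking past fit() and predict(): fitted models expose what they learned.
coef_shape = models["logistic_regression"].coef_.shape
importances = models["decision_tree"].feature_importances_
print("logistic regression coefficient shape:", coef_shape)
print("decision tree feature importances:", importances)
```

Swapping one algorithm for another only changes the line that constructs the model. The attributes with a trailing underscore, such as `coef_` and `feature_importances_`, are where each algorithm stores what it learned, which is a natural starting point for discussing how it works.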
We are using the Clean Heat data set from New York's Open Data portal.