Air Quality Forecast in the Metropolitan Area of Mexico City

This repository contains a set of machine learning models to forecast pollutant levels in the Metropolitan Area of Mexico City. The models are optimized for a low false positive rate with respect to the activation levels of the environmental contingency program.
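As an illustration of the optimization target mentioned above, the sketch below computes a false positive rate against a contingency level. The threshold value, the function name, and the sample numbers are hypothetical; the README does not state the real activation levels or how the models are tuned.

```python
from typing import Sequence

# Hypothetical activation threshold; the real value comes from the
# environmental contingency program and is not given in this README.
CONTINGENCY_THRESHOLD = 150.0


def false_positive_rate(
    observed: Sequence[float],
    predicted: Sequence[float],
    threshold: float = CONTINGENCY_THRESHOLD,
) -> float:
    """Share of hours below the threshold that were forecast at or above it."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    false_pos = 0
    negatives = 0
    for obs, pred in zip(observed, predicted):
        if obs < threshold:  # no contingency level was actually reached
            negatives += 1
            if pred >= threshold:  # but the forecast raised an alarm
                false_pos += 1
    return false_pos / negatives if negatives else 0.0


# Made-up values, for illustration only.
observed = [90.0, 120.0, 160.0, 140.0, 155.0]
predicted = [95.0, 152.0, 158.0, 130.0, 149.0]
print(false_positive_rate(observed, predicted))
```

In this toy series three hours stay below the threshold and one of them is forecast above it, so one third of the calm hours would trigger a false alarm.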

For each pollutant, models were developed to forecast its levels up to 24 hours in advance, with an error comparable to that reported in the literature.

The pollutants predicted are the following:

  • PM10
  • PM2.5 (in development)
  • Ozone

The project has a dashboard, developed at the Repositories, Research and Prospective Coordination (CRIP) of the National Council of Science and Technology (CONACyT).

The dashboard aims to inform the population of the Valley of Mexico about the state of air quality in a friendly and direct way. It shows the current air quality index and is updated hourly. The index is obtained from the data shared by the Ministry of Environment (SEDEMA) of the Government of Mexico City and can be found here. In addition, a machine learning model estimates the air quality index 24 hours ahead. The table shows this estimate, together with a line graph of the hour-by-hour estimate of the index for suspended particles smaller than 10 micrometers (PM10) and for ozone (O3).
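The PM10 results in this README are reported as a 24-hour moving average. A minimal sketch of such a smoothing step over an hourly series is shown below. The function name and the handling of the first hours (averaging over whatever is available so far) are assumptions, not necessarily what the project's pipeline does.

```python
from collections import deque
from typing import Iterable, List


def moving_average(values: Iterable[float], window: int = 24) -> List[float]:
    """Trailing moving average; early outputs use the hours available so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running_sum = 0.0
    averages = []
    for value in values:
        if len(recent) == window:
            running_sum -= recent[0]  # oldest value is about to be evicted
        recent.append(value)
        running_sum += value
        averages.append(running_sum / len(recent))
    return averages


# Made-up hourly readings and a short window, for illustration only.
hourly_pm10 = [40.0, 50.0, 60.0, 70.0]
print(moving_average(hourly_pm10, window=2))  # [40.0, 45.0, 55.0, 65.0]
```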

Pollution and meteorological data are obtained from the CDMX air quality portal.

-- Project Status: [On-Hold]

Summary

For each pollutant, models were developed to forecast its levels up to 24 hours in advance, with an error comparable to that reported in the literature.
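The forecast error is reported in this README as an RMSE, including a breakdown by hour. A minimal sketch of how such a per-horizon RMSE could be computed follows. The record layout and function name are hypothetical, and the result is in the units of the input, since the README does not say how its percentage figure is normalized.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def rmse_by_horizon(
    records: Iterable[Tuple[int, float, float]],
) -> Dict[int, float]:
    """Return the RMSE for each forecast horizon.

    Each record is (horizon_hours, observed, predicted).
    """
    squared_errors = defaultdict(list)
    for horizon, observed, predicted in records:
        squared_errors[horizon].append((observed - predicted) ** 2)
    return {
        horizon: math.sqrt(sum(errs) / len(errs))
        for horizon, errs in sorted(squared_errors.items())
    }


# Made-up records, for illustration only.
records = [
    (1, 50.0, 53.0),
    (1, 60.0, 56.0),
    (12, 50.0, 44.0),
    (12, 60.0, 68.0),
]
print(rmse_by_horizon(records))
```

Grouping by horizon makes it easy to see how the error grows as the forecast looks further ahead.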

The following graphs show the actual and predicted values 12 hours in advance:

  • Ozone:

alt text

  • PM10 (24-hour moving average):

alt text

The mean RMSE is about 11.59%; the next graph shows the RMSE by hour:

alt text

For more information about the performance of the models, don't hesitate to contact me.

Contributors

  • Paulina Pradel: visualization and web dashboard.

Methods Used

  • Inferential Statistics
  • Machine Learning
  • Data Visualization
  • Predictive Modeling

Technologies

  • Python
  • scikit-learn
  • Pandas
  • Plotly
  • PostgreSQL
  • Jupyter
  • HTML

Getting Started

To access the forecast, the suggested route is to visit the dashboard directly (coming soon). To compute the forecast yourself, follow these steps:

  1. Clone this repo (for help see this tutorial).

  2. Raw Data is being kept here within this repo.

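  3. The forecast and the data processing/transformation scripts are implemented as a data pipeline. The pipeline file is a Jupyter notebook, which the python interpreter cannot execute directly; one way to run it from a terminal is:

jupyter nbconvert --to notebook --execute pipeline_general/pipeline/4_predicción.ipynb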
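For unattended runs (for example from a scheduler), the notebook can also be launched from Python. The sketch below assumes Jupyter and nbconvert are installed and on the PATH; the helper names are hypothetical and the notebook path is the one given in the step above.

```python
import subprocess
from typing import List

# Path copied from the step above.
NOTEBOOK = "pipeline_general/pipeline/4_predicción.ipynb"


def build_command(notebook_path: str) -> List[str]:
    """Build the nbconvert command that executes a notebook top to bottom."""
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", notebook_path]


def run_notebook(notebook_path: str) -> None:
    """Execute the notebook; raises CalledProcessError if the command fails."""
    subprocess.run(build_command(notebook_path), check=True)


# Only print the command here; call run_notebook(NOTEBOOK) to execute it.
print(" ".join(build_command(NOTEBOOK)))
```

Passing the command as a list avoids shell quoting problems with the accented character in the file name.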

Featured Notebooks/Analysis/Deliverables

![air quality dashboard](assets/tablero_scr.png)

Contributing Members

Other Members:

Name              Role
Norberto Morales  Data Engineer

Contact

  • Feel free to contact team leads with any questions or if you are interested in contributing!