Proj3

Air Quality Forecasting

Nairobi PM2.5 Time Series Analysis

Overview

Proj3 analyzes air quality sensor data from Nairobi using time series techniques. The project progresses from data wrangling with MongoDB-style queries to autoregressive modeling.

Data Source

JSON sensor data with timestamps and PM2.5 readings.
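
As a rough illustration, records of this kind can be grouped into an hourly series with the Python standard library alone. The field names `timestamp` and `P2`, the sample values, and the helper `hourly_mean` are assumptions made for this sketch (the name `P2` is only suggested by the cache file names); the actual schema of nairobi.json may differ.

```python
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical sample records; field names and values are assumptions.
RAW = """[
  {"timestamp": "2018-09-01T00:05:00+00:00", "P2": 10.0},
  {"timestamp": "2018-09-01T00:35:00+00:00", "P2": 14.0},
  {"timestamp": "2018-09-01T01:10:00+00:00", "P2": 8.0},
  {"timestamp": "2018-09-01T01:40:00+00:00", "P2": null}
]"""


def hourly_mean(records):
    """Average PM2.5 readings per hour, skipping missing values."""
    buckets = defaultdict(list)
    for rec in records:
        value = rec.get("P2")
        if value is None:
            continue  # drop readings with no PM2.5 value
        ts = datetime.fromisoformat(rec["timestamp"])
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(float(value))
    # Sort by hour so the result is in time order.
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


hourly = hourly_mean(json.loads(RAW))
for hour, mean in hourly.items():
    print(hour.isoformat(), mean)
```

Averaging within fixed hourly buckets gives the regularly spaced series that lag-based models expect; how the project itself resamples the data is not stated here.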

Techniques

Autoregressive (AR) models, autoregressive moving-average (ARMA) models, and linear regression on lagged features
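
Regressing each reading on the one before it is the simplest case of linear regression on lagged features; with a single lag it amounts to fitting an AR(1) model by least squares. The sketch below is dependency-free and is not the project's implementation: `make_lagged` and `fit_line` are hypothetical helpers, and the series is synthetic.

```python
def make_lagged(series, lag=1):
    """Return (x, y) where x[i] is the value `lag` steps before y[i]."""
    return series[:-lag], series[lag:]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic series that follows y_t = 2 + 0.5 * y_{t-1} exactly.
series = [10.0]
for _ in range(9):
    series.append(2.0 + 0.5 * series[-1])

x, y = make_lagged(series, lag=1)
intercept, slope = fit_line(x, y)

# One-step-ahead forecast from the last observed value.
next_value = intercept + slope * series[-1]
print(round(intercept, 6), round(slope, 6))
```

Because the synthetic series has no noise, the fit recovers the generating coefficients; on real sensor data the estimates would only approximate the underlying dynamics.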

Evaluation

Mean absolute error (MAE), walk-forward validation, and residual analysis
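
Walk-forward validation makes one forecast per test point: fit on everything seen so far, predict the next value, then add the true value to the history before moving on. The sketch below pairs it with MAE using two stand-in forecasters (persistence and a running mean). All function names and the sample values are assumptions for illustration, not the project's code.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def walk_forward(train, test, forecast):
    """One-step-ahead forecasts, growing the history after each step."""
    history = list(train)
    preds = []
    for actual in test:
        preds.append(forecast(history))  # uses only past values
        history.append(actual)           # reveal the true value afterwards
    return preds


def persistence(history):
    """Forecast the next value as the most recent observation."""
    return history[-1]


def running_mean(history):
    """Forecast the next value as the mean of everything seen so far."""
    return sum(history) / len(history)


# Hypothetical PM2.5 readings, split in time order (never shuffled).
series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
train, test = series[:4], series[4:]

persistence_mae = mae(test, walk_forward(train, test, persistence))
mean_mae = mae(test, walk_forward(train, test, running_mean))
print(persistence_mae, round(mean_mae, 2))
```

Any fitted model can replace the stand-in forecasters as long as it sees only `history` when it predicts, which is what keeps the evaluation free of look-ahead leakage.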

Lesson Workflows

| Lesson | Topic | Technique | Query |
|--------|-------|-----------|-------|
| 3.1 | Data Wrangling with MongoDB | Query, aggregate, reshape | "run lesson 3.1" |
| 3.2 | Linear Regression on Time Series | Lagged features + LinearRegression | "run lesson 3.2" |
| 3.3 | Autoregressive Models | AR(p) model fitting | "run lesson 3.3" |
| 3.4 | ARMA and Hyperparameter Tuning | ARMA(p,q) with grid search | "run lesson 3.4" |

Data Structure

    Proj3/data/
    ├── nairobi.json                # Raw sensor data
    └── cache/
        ├── site_29_P2_wrangled.parquet
        ├── site_29_P2_wrangled_lagged.parquet
        ├── linear_regression_model_predictions.parquet
        └── autoreg_p1_model_predictions.parquet

Available Tools

| Tool | Description |
|------|-------------|
| load_proj3_data | Load Nairobi air quality dataset |
| analyze_air_quality | Time series exploratory analysis |
| forecast_pm25 | Generate PM2.5 forecasts |
| fit_ar_model | Fit autoregressive model |
| fit_arma_model | Fit ARMA model with tuning |
| create_acf_plot | Autocorrelation visualization |
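
The internals of these tools are not shown here. As a rough idea of what an autocorrelation plot displays, the sketch below computes sample autocorrelations in plain Python. The function `sample_acf` and the toy series are assumptions for illustration and are not the implementation behind `create_acf_plot`.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    values = []
    for k in range(max_lag + 1):
        num = sum(
            (series[t] - mean) * (series[t - k] - mean) for t in range(k, n)
        )
        values.append(num / denom)
    return values


# Toy series that flips sign every step, so lag 1 is strongly negative.
series = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
acf_values = sample_acf(series, max_lag=2)

# Approximate 95% band for a white-noise series of this length.
band = 1.96 / math.sqrt(len(series))

for lag, value in enumerate(acf_values):
    flag = "outside band" if abs(value) > band else "inside band"
    print(lag, round(value, 3), flag)
```

Lags whose autocorrelation falls well outside the band are the usual candidates when choosing how many past values a model should use.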

Quick Start

    # Load and analyze
    "load nairobi air quality data"
    "analyze air quality time series"

    # Run specific lesson
    "run lesson 3.3 on AR models"

    # Forecast
    "forecast PM2.5 for next 24 hours"

    # Cross-project
    "apply GARCH from project 8 to air quality data"