Will Air Quality Have an Exceedance Today? Machine Learning for 24-Hour Ozone Forecasting
Kamogelo Moukangwe, Rivoningo Mageza
Partner: ggm
Year: 2026
Abstract:
High concentrations of ground-level ozone pose a significant public health risk in industrialised regions of South Africa such as the Highveld. This project developed a machine learning system that predicts hourly ozone concentrations up to 24 hours ahead from real-time precursor readings and meteorological data. Multiple models were trained on two years of historical pollutant readings from 36 SAAQIS stations, combined with Open-Meteo weather data. Random Forest achieved the highest accuracy (R² = 0.781, RMSE = 5.94 ppb); XGBoost performed nearly as well (R² = 0.776) but trained 20 times faster. An interactive Streamlit app was deployed that delivers 24-hour forecasts, shows visual alerts when forecast ozone exceeds 61 ppb, and offers optional daily subscriptions. Feature importance analysis confirmed that lagged ozone and temperature dominate the predictions. This proof of concept demonstrates that accessible, real-time air quality forecasting is feasible using open APIs and machine learning.
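The abstract mentions lagged ozone features, R² and RMSE scores, and a 61 ppb alert threshold. The sketch below shows one way these pieces could be expressed in plain Python. It is illustrative only and is not the project's code: the function names, the choice of lags, and the use of temperature as the only weather feature are assumptions made for the example.

```python
import math

# Alert threshold quoted in the abstract (ppb).
EXCEEDANCE_PPB = 61.0


def make_lagged_rows(ozone, temperature, lags=(0, 1, 2), horizon=1):
    """Build (features, target) pairs from aligned hourly series.

    Features (hypothetical set): ozone at each lag relative to hour t,
    plus the temperature at hour t. Target: ozone `horizon` hours after t.
    """
    rows = []
    max_lag = max(lags)
    # Start where every lag is available; stop where the target still exists.
    for t in range(max_lag, len(ozone) - horizon):
        features = [ozone[t - lag] for lag in lags] + [temperature[t]]
        rows.append((features, ozone[t + horizon]))
    return rows


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the inputs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def exceedance_hours(forecast, threshold=EXCEEDANCE_PPB):
    """Return the forecast hour indices whose value is above the threshold."""
    return [hour for hour, value in enumerate(forecast) if value > threshold]
```

The rows from `make_lagged_rows` could be passed to any regressor, and `exceedance_hours` applied to a 24-value forecast gives the hours that would trigger an alert. The comparison is strict, so a forecast of exactly 61 ppb does not count as an exceedance in this sketch.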