Deployment of a Three-Hour Lead Time for Ozone Levels Prediction in Secunda

Mpho Muloiwa, Chale Justice Moferefere

Partner: ggm

Year: 2026

Abstract: Ozone (O3) is one of the gases that poses a health risk. It is also a greenhouse gas that contributes to climate change, and it forms when sunlight reacts with pollutants such as NOx, thereby degrading urban air quality. Because Secunda, a town in Mpumalanga, South Africa, is known for producing oil from coal in coal-to-oil plants, high levels of emissions, and hence of O3, are anticipated there. Five classification algorithms were applied to model future ozone levels: Random Forest, XGBoost, Logistic Regression, Support Vector Machine, and Decision Tree. Logistic Regression achieved the highest ROC-AUC (0.9280), followed closely by XGBoost (0.9053). XGBoost achieved the highest precision (0.1961) and F1-score (0.2395), meaning that when it predicts an exceedance it is more likely to be correct than the other four models. The XGBoost algorithm can therefore be applied to the air quality dataset to forecast O3 with a three-hour lead time. The web application is user-friendly and requires as inputs the current 3-hour O3, NO2, NO, and NOx, together with the hour of the day and the day of the year.
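
The sketch below illustrates, under stated assumptions, how a three-hour-lead exceedance label can be built from an hourly O3 series and how precision, recall, and F1 are computed from a model's predictions. It is not the authors' pipeline: the threshold value, the sample readings, and the function names `lead_labels` and `precision_recall_f1` are hypothetical and chosen only for illustration. It may also help explain why a model can have a high ROC-AUC but low precision: when exceedances are rare, even a few false alarms weigh heavily on precision.

```python
from typing import List, Tuple

LEAD_HOURS = 3     # lead time stated in the abstract
THRESHOLD = 60.0   # hypothetical exceedance threshold, NOT taken from the source


def lead_labels(o3: List[float], lead: int = LEAD_HOURS,
                threshold: float = THRESHOLD) -> List[int]:
    """Label hour t with 1 if O3 at hour t + lead exceeds the threshold.

    The last `lead` hours have no future value, so they get no label.
    """
    return [int(o3[t + lead] > threshold) for t in range(len(o3) - lead)]


def precision_recall_f1(y_true: List[int],
                        y_pred: List[int]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for the positive (exceedance) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up hourly O3 readings, for illustration only.
o3_series = [40.0, 45.0, 50.0, 65.0, 70.0, 55.0, 30.0, 62.0]
y_true = lead_labels(o3_series)

# Made-up predictions from some classifier for the same five hours.
y_pred = [1, 0, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(y_true, precision, recall, f1)
```

In a deployed setting, the label for hour t would be paired with the features listed in the abstract (O3, NO2, NO, NOx, hour of the day, day of the year) observed at hour t.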
