Application of Deep Learning Techniques in Medium-Range Rainfall Forecasting
Thokozile Khosa, Caleb Siyasiya
Partner: ggm
Year: 2025
Abstract:
Medium-range forecasting (3 to 10 days ahead) is a critical technique in meteorology because it supports early warnings for severe weather, which can help save lives and protect infrastructure. In South Africa, early warnings would be particularly useful during heavy rainfall events. Traditional forecasting techniques such as Numerical Weather Prediction (NWP) models perform well over short time scales but decline in accuracy at medium-range lead times. The aim of this project was to explore machine learning techniques for predicting rainfall and then to develop a machine learning-based model that predicts rainfall 3 to 10 days in advance across Southern Africa. This entailed working with ERA5 reanalysis data, which contains climate variables such as geopotential, specific humidity, temperature and more. Exploratory data analysis was conducted to investigate the relationship between these variables and total precipitation. Thereafter, the dataset was cleaned and processed. In the modelling phase, various deep learning architectures were explored for their applicability to weather forecasting. The project focused on implementing a custom Convolutional Neural Network with Long Short-Term Memory (CNN-LSTM) model and on fine-tuning an existing graph neural network weather prediction model (FT-GraphCast) for application in Southern Africa. The implemented models were assessed on the accuracy of their medium-range forecasts and on their alignment with the Predictability, Computability, Stability (PCS) requirements. The predictions and findings were then deployed to a web application. The findings show promise in exploring deep learning models for medium-range rainfall forecasts, as both models improved with training. However, both models still struggle with issues such as inaccurate predictions beyond 3 days and a failure to capture dynamic changes at specific locations.
To address these issues, future work requires refinements such as including more meteorological inputs, improving the model architectures and obtaining more computational resources, amongst other factors.
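As an illustration of the 3 to 10 day forecasting setup described above, the following is a minimal sketch of how a daily series could be split into input windows and medium-range targets. It is not the project's actual pipeline: the function name `make_lead_time_samples`, the `history_days` window length and the assumption of one value per day are hypothetical choices made for this example.

```python
import numpy as np

def make_lead_time_samples(series, history_days=5, min_lead=3, max_lead=10):
    """Pair each window of past days with the values 3 to 10 days ahead.

    series: array of shape (time, ...) with one entry per day (assumed).
    Returns (inputs, targets) where
      inputs  has shape (n_samples, history_days, ...)
      targets has shape (n_samples, max_lead - min_lead + 1, ...)
    """
    series = np.asarray(series)
    n_days = series.shape[0]
    inputs, targets = [], []
    # t is the index of the last observed day in each input window
    for t in range(history_days - 1, n_days - max_lead):
        inputs.append(series[t - history_days + 1 : t + 1])
        targets.append(series[t + min_lead : t + max_lead + 1])
    return np.stack(inputs), np.stack(targets)

# Toy example: 20 days of a single scalar variable
x, y = make_lead_time_samples(np.arange(20))
print(x.shape, y.shape)
```

The same slicing applies to gridded fields of shape (time, lat, lon), since only the leading time axis is indexed. The series must cover at least `history_days + max_lead` days for any sample to be produced.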