Masters in IT - Data Science Capstone Exhibition, University of Pretoria

May 2026

MIT 808: Big Data Science Capstone Project

Welcome to the University of Pretoria MIT 808: Big Data Science Capstone Project online exhibition.



2026 Exhibition

The 2026 Capstone Exhibition showcases 21 impactful student projects that tackle urgent and diverse challenges using cutting-edge Data Science techniques. This year, 42 students worked in interdisciplinary teams with real-world partners to create solutions spanning climate policy, air quality forecasting, population estimation, parliamentary intelligence, cancer genomics, African NLP governance, and African language AI.

You can also explore the organisations and researchers we collaborated with via the Partners tab.


🚀 2026 Project Overviews

🌍 African Climate NLP

Five teams applied NLP and LLMs to African climate governance β€” analysing policy documents across UNFCCC submissions, South African parliamentary proceedings, and SADC-region policy frameworks. Topics include corpus-based thematic classification, contextual bias in LLM-generated recommendations, multilingual discourse asymmetries, and identification of colonial framing.

🛩️ UAV & Population Estimation

Three teams used drone and aerial imagery combined with deep learning (U-Net, DeepLabV3) and Bayesian/Gaussian Process regression to estimate population in the Melusi informal settlement in Atteridgeville. The approach offers a scalable, cost-effective alternative to traditional census methods.

💨 Air Quality Forecasting

Three teams developed machine learning pipelines to predict ground-level ozone exceedances on the South African Highveld, ranging from same-day alerts (Random Forest, XGBoost) to 3-hour lead-time forecasting in Secunda and 24-hour Highveld-wide predictions with Streamlit-based alert systems.

🏛️ Parliamentary Intelligence

Three teams built AI-powered tools for monitoring South African parliamentary activity at scale, supporting investigative journalists with topic clustering, intent classification, abstractive summarisation, and RAG-assisted search across PMG data.

⚖️ African NLP Governance

Three teams audited copyright compliance and licensing risk across 249 African NLP datasets using rule-based scoring, machine learning classifiers, and unsupervised clustering to reveal systemic governance gaps. Interactive Streamlit dashboards provide real-time risk auditing.

🔬 Prostate Cancer Decision Support

Three teams built clinical dashboards for the South African Prostate Cancer Study (SAPCS), covering risk stratification, molecular driver visualisation, and genomic integrity profiling using whole-genome sequencing data.

🗣️ African Language AI

One team evaluated open-access LLMs for Sepedi translation to power a rabies awareness chatbot, addressing late-stage reporting in communities with limited access to health education in their home language.


Archive: Previous Cohorts

Year | Projects | Link
2025 | 9 projects, 18 students | View 2025
2024 | - | View 2024
2023 | - | View 2023
2022 | - | View 2022
2021 | - | View 2021
2020 | - | View 2020

MIT 808 Information

MIT 808 is taught at the University of Pretoria as part of the Masters in IT (Big Data Science) programme. Students carry out a Data Science Capstone project that integrates the theoretical work from their first year of study. The module is taught by Dr Olaperi Okuboyejo, Dr Abiodun Modupe, and Prof Vukosi Marivate.

More information:
- 📘 MIT808 Public Website
- 🧑‍🎓 MIT Big Data Science Programme
- 📬 Mailing List for Updates


✉️ Feedback and Contact

We'd love to hear from you!
📩 dsfsi.info@up.ac.za

Organisers

Thapelo Sindane
Web Development Assistant
University of Pretoria

Neo Mokono
Web Development Assistant
University of Pretoria

Fiskani Banda
Web Development Assistant
University of Pretoria

Richard Lastrucci
Web Development Assistant
University of Pretoria

Keabetswe Madumo
Web Development Assistant
University of Pretoria

Thank you to our sponsors for supporting the physical exhibition for students!