Zondo Commission on State Capture: What We Missed

Erika Scholtz, Matimba Shingange

Partner: Mail & Guardian

Year: 2022

Abstract: The Zondo Commission on State Capture court transcripts contain a wealth of information that has yet to be tapped. This project focuses on extracting new insights from those transcripts. To explore the raw transcripts, word frequency analysis, bi-gram and tri-gram analysis, Principal Components Analysis (PCA) and t-distributed Stochastic Neighbor Embedding (t-SNE) were performed. The main analyses were Latent Dirichlet Allocation (LDA) topic modelling, Named Entity Recognition (NER) and network analysis. A focus is placed on veridical data science, and elements of the Predictability, Computability and Stability (PCS) framework are discussed throughout. Streamlit and Sigma.js were used as visualisation tools to build interactive web applications. The project provided journalists from Mail & Guardian with insights that they can use to generate new leads.
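
As a rough illustration of the exploratory step mentioned in the abstract (word frequency, bi-gram and tri-gram analysis), the following is a minimal sketch using only the Python standard library. It is not the project's actual code: the sample text, the `tokenize` and `ngram_counts` helpers, and the simple regex tokeniser are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (hypothetical helper)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n):
    """Count contiguous sequences of n tokens (hypothetical helper)."""
    # zip over n shifted copies of the token list to form the n-grams
    return Counter(zip(*(tokens[i:] for i in range(n))))


# Invented sample text standing in for a transcript excerpt (an assumption)
sample = (
    "The state capture inquiry heard evidence on state capture. "
    "The inquiry heard more evidence."
)

tokens = tokenize(sample)
word_freq = Counter(tokens)          # single-word frequencies
bigrams = ngram_counts(tokens, 2)    # two-word sequences
trigrams = ngram_counts(tokens, 3)   # three-word sequences

print(word_freq.most_common(3))
print(bigrams.most_common(3))
```

On real transcripts, one would likely also remove stop words and speaker labels before counting, since otherwise the most frequent n-grams tend to be procedural phrases rather than substantive ones.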

Presentation Video