An Exploratory Data Analysis and Data Mining Approach to Dataset and Dataset Relationship Discovery in Production Systems

Penelope Matloga, Nandi Mnguni, Phindile Binda

Partner: innovation-africa

Year: 2023

Abstract: The study presents an exploratory data analysis and data mining approach to discovering relationships between datasets in a production system. The selected model is a pretrained sentence-transformers model, multi-qa-MiniLM-L6-dot-v1, which relates datasets to one another by similarity score. The outcome of the study is a single search widget that lets users search for projects in the Information Hub and recommends similar datasets based on the text the end-user enters.
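
The similarity-based recommendation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the function names (`toy_encode`, `recommend`), the dataset names, and the descriptions are all hypothetical, and a bag-of-words encoder stands in for the pretrained model so that the example runs without downloads. The ranking uses a dot product, which is the scoring function the multi-qa-MiniLM-L6-dot-v1 model is tuned for.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# A sparse vector: token -> weight. Assumption: the real system would use
# dense embeddings produced by the pretrained sentence-transformers model.
Vector = Dict[str, float]


def toy_encode(text: str) -> Vector:
    """Stand-in encoder (bag-of-words counts). NOT the study's model."""
    return {tok: float(n) for tok, n in Counter(text.lower().split()).items()}


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two sparse vectors."""
    return sum(weight * b.get(tok, 0.0) for tok, weight in a.items())


def recommend(
    query: str,
    descriptions: Dict[str, str],
    encode: Callable[[str], Vector] = toy_encode,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank datasets by dot-product similarity between query and description."""
    query_vec = encode(query)
    scored = [(name, dot(query_vec, encode(text))) for name, text in descriptions.items()]
    # Highest score first; sorted() is stable, so ties keep insertion order.
    scored = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical dataset descriptions, invented purely for illustration.
datasets = {
    "water_pumps": "solar water pump installations per village",
    "school_power": "solar power systems installed at schools",
    "staff_leave": "employee leave records",
}

results = recommend("solar water projects", datasets, top_k=2)
for name, score in results:
    print(name, score)
```

To move from this sketch toward the approach in the abstract, `encode` would be replaced by a function that returns embeddings from the pretrained model, with `dot` adapted to dense vectors; the ranking step stays the same.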
