An Automated Metadata Mining and Dataset Recommendation System

Fiskani Banda, Kris Hamersma

Partner: innovation-africa

Year: 2023

Abstract: The purpose of this project is to help users find datasets available on Innovation Africa@UP's Information Hub. An automated metadata extraction model and a database were designed to collect and store metadata from the multi-disciplinary projects hosted on the Information Hub. Text processing was applied to the metadata to produce a collection of keywords that could be fed into the recommendation model. The recommendation model used these keywords to determine the similarity between a user's search results and the available projects by embedding the text as word vectors and calculating cosine similarity scores. The projects were then ranked by score and presented to the end user. The app was built with the Streamlit framework.
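
The ranking step described in the abstract can be illustrated with a minimal sketch. This is not the project's implementation: the function names (`embed`, `cosine_similarity`, `rank_projects`), the project names, and the keyword strings are all hypothetical, and a simple bag-of-words count vector stands in for the learned word embeddings the project used. The sketch also assumes the user's input is a free-text query compared against each project's keywords.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding (assumption): a bag-of-words count vector.

    The project used word embeddings; counts are used here only to keep
    the sketch dependency-free.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_projects(query, projects):
    """Score every project against the query and sort best-first."""
    query_vec = embed(query)
    scored = [
        (name, cosine_similarity(query_vec, embed(keywords)))
        for name, keywords in projects.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical projects and keywords, invented purely for illustration.
projects = {
    "Project A": "maize yield soil moisture sensors",
    "Project B": "solar water pumping rural schools",
    "Project C": "soil nutrients maize",
}

ranking = rank_projects("maize soil", projects)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these made-up inputs, the project whose keywords overlap most tightly with the query ranks first, and a project sharing no terms scores zero. Swapping `embed` for a real embedding model would leave the ranking logic unchanged, since cosine similarity only needs two vectors of the same kind.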
