This repository contains the code for a degree project conducted at Linnaeus University in collaboration with Tietoevry and the Friends Foundation. The project applies supervised machine learning to detect bullying in free-text responses from school surveys provided by the Friends Foundation. The study evaluates Logistic Regression, Naive Bayes, Support Vector Machine (SVM), and Convolutional Neural Network (CNN) models, alongside a Retrieval-Augmented Generation (RAG) model using Llama 3. The goal is to identify the most effective model for detecting bullying, with an emphasis on high recall so that as few instances as possible are missed.
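To make the recall emphasis concrete, here is a minimal sketch of how recall is computed for a binary bullying/not-bullying classifier. It is illustrative only and is not taken from the project code: the function name `recall`, the label encoding (1 = bullying), and the sample labels are all assumptions.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive cases that the model flagged as positive."""
    # True positives: bullying responses the model caught.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    # False negatives: bullying responses the model missed.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        return 0.0  # no positive cases in the data; recall is undefined, report 0.0
    return tp / (tp + fn)


# Hypothetical labels (1 = bullying, 0 = not bullying), made up for illustration.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

print(recall(y_true, y_pred))  # 3 of the 4 bullying cases are caught -> 0.75
```

A model tuned for high recall accepts more false alarms (such as the fifth response above) in exchange for missing fewer genuine cases.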
For a detailed discussion of the project, including the methodology, results, and challenges encountered during the research, see the full study [link will be available later].
Contributors: Seif-Alamir Yousef & Ludvig Svensson.