Development of the prediction model that the Pundits Review website (https://www.punditsreview.com/) uses to predict sentiment in football news articles.
Pundits Review scrapes and processes news articles about the Premier League in order to give players and teams a review score each week. Each Monday, the project collects articles, divides them into phrases, identifies the player or club being referred to and then predicts the sentiment of the phrase. See more on how it works here!
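The weekly steps described above (split an article into phrases, identify the player or club, predict the sentiment) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the `TARGETS` roster, the keyword lists and every function name are hypothetical, and `predict_sentiment` is a keyword placeholder standing in for the trained model.

```python
import re

# Hypothetical roster mapping a lowercase alias to a canonical name (assumption).
TARGETS = {
    "salah": "Mohamed Salah",
    "liverpool": "Liverpool",
    "arsenal": "Arsenal",
}

# Hypothetical keyword lists used only by the placeholder classifier below.
POSITIVE_WORDS = {"brilliant", "superb", "excellent"}
NEGATIVE_WORDS = {"poor", "sloppy", "dreadful"}


def split_into_phrases(article):
    """Split text after sentence-ending punctuation (a simple stand-in splitter)."""
    parts = re.split(r"(?<=[.!?])\s+", article.strip())
    return [part for part in parts if part]


def find_targets(phrase):
    """Return the canonical names of known players or clubs mentioned in the phrase."""
    words = set(re.findall(r"[a-z']+", phrase.lower()))
    return sorted(name for alias, name in TARGETS.items() if alias in words)


def predict_sentiment(phrase):
    """Placeholder for the trained model: a plain keyword lookup."""
    words = set(re.findall(r"[a-z']+", phrase.lower()))
    if words & POSITIVE_WORDS:
        return "positive"
    if words & NEGATIVE_WORDS:
        return "negative"
    return "neutral"


def process_article(article):
    """Return one (target, sentiment) pair per target mentioned in each phrase."""
    results = []
    for phrase in split_into_phrases(article):
        sentiment = predict_sentiment(phrase)
        for target in find_targets(phrase):
            results.append((target, sentiment))
    return results
```

In the real pipeline, the placeholder classifier would be replaced by the model trained in this repository, and the roster would come from the current Premier League squads.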
This repository shows the progression of the methods used to train a model to predict the sentiment of phrases within a football news article. Attempts 1 and 2 use general sentiment datasets, whereas 'Building the model' uses football-specific sentiment data from BetSentiment.com. My final training data is provided alongside the methods used to preprocess it.
Attempt 1 trains a Linear Regression sentiment model on a Twitter sentiment dataset provided by Stanford University
Attempt 2 trains a Linear Regression sentiment model on an airline sentiment dataset from Kaggle
Notebook used to build different models using BetSentiment data
Notebook used to compare accuracy scores and confusion matrices across 48 approaches to predicting sentiment. Visualisations included.
Notebook used to clean and preprocess the training data
Set of 500 rows of sample data taken from The Mirror - Match Reports, manually annotated with sentiment and player target
Preprocessed training data used to train the different model approaches in 'Building_the_model'
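The comparison notebook listed above compares accuracy scores and confusion matrices between approaches. A minimal, dependency-free sketch of that kind of comparison is shown below. It is an assumption about the general technique rather than the notebook's code: the three-class label set and all function names are hypothetical, and a library such as scikit-learn provides equivalent metrics in `sklearn.metrics`.

```python
from collections import Counter

# Assumed label set; the project's actual classes may differ.
LABELS = ["negative", "neutral", "positive"]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def compare(y_true, predictions_by_model):
    """Return (model name, accuracy) pairs sorted from best to worst accuracy."""
    rows = [(name, accuracy(y_true, preds))
            for name, preds in predictions_by_model.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Looking at the confusion matrix next to the accuracy is useful here because two approaches with similar accuracy can fail in different ways, for example one confusing neutral phrases with positive ones while another misses negatives.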