Prajwal Rao | prajwal.rao@outlook.com.au | Project 1 (Tech Stream) | MSA 2020
In these fast-changing and unprecedented times, something new happens almost every day, and it can be difficult to keep track of the news. To address this, this module recommends news articles based on a user's reading history and also factors in preferences such as topic and source, for a more personalised experience.
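As a rough illustration of recommending from reading history, here is a minimal content-based sketch using TF-IDF weights and cosine similarity. This is an assumption about one common approach, not necessarily the method the notebook uses, and it ignores the topic and source preferences. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `recommend`) and the sample headlines are made up for the example and are not taken from the dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a headline and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many headlines each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0
           for term, count in df.items()}
    return [{term: tf * idf[term] for term, tf in Counter(tokens).items()}
            for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(headlines, read_indices, top_k=3):
    """Rank unread headlines by similarity to the user's reading history."""
    vectors = tfidf_vectors(headlines)
    # The user profile is the sum of the vectors of everything already read.
    profile = {}
    for i in read_indices:
        for term, weight in vectors[i].items():
            profile[term] = profile.get(term, 0.0) + weight
    already_read = set(read_indices)
    scored = [(cosine(profile, vectors[i]), i)
              for i in range(len(headlines)) if i not in already_read]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: -pair[0])
    return [i for _, i in scored[:top_k]]


# Invented sample headlines, for illustration only.
sample = [
    "Senate passes new budget bill",
    "House debates budget bill amendments",
    "Local team wins championship game",
    "New recipes for summer grilling",
]
print(recommend(sample, read_indices=[0], top_k=2))
```

Here the user has read only the first headline, so the second headline (which shares "budget" and "bill") ranks above the fourth (which shares only "new"). Preferences such as topic or source could be layered on top, for example by filtering or re-weighting the scored candidates.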
This concept has been implemented on a widely available news dataset published on Kaggle by its author. The dataset contains around 200k news headlines from 2012 to 2018, obtained from HuffPost.
The entire project is written in Python as a Jupyter Notebook in the Azure ML Studio environment. All required packages and dependencies are provided by that environment by default. Any additional downloads or installations are performed in the notebook.
Alternatively, a requirements.txt file has been provided, which can be used to set up a new environment.
Compute and Storage requirements:
- RAM: 4 GB
- Storage: 200 MB
- It is highly recommended to download this notebook and run it in an ML workspace such as Azure ML Studio, or an equivalent workspace meeting the given requirements.
- Ensure the dataset is either cloned from this repository or downloaded from Kaggle. Place it in the Dataset folder, following the same directory structure.
- Run the notebook cells sequentially in the environment set up above.
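
Before running the notebook, it can help to confirm that the dataset actually landed in the Dataset folder. The check below is a small sketch under stated assumptions: the helper name `list_dataset_files` is hypothetical, and it assumes the Dataset folder sits directly under the repository root. It does not assume any particular dataset file name or format.

```python
from pathlib import Path


def list_dataset_files(repo_root="."):
    """Return the files found under <repo_root>/Dataset, or raise if none exist."""
    dataset_dir = Path(repo_root) / "Dataset"
    if not dataset_dir.is_dir():
        raise FileNotFoundError(f"Expected a folder at {dataset_dir}")
    # Search subfolders too, since the directory structure should be preserved.
    files = sorted(p for p in dataset_dir.rglob("*") if p.is_file())
    if not files:
        raise FileNotFoundError(f"No dataset files found in {dataset_dir}")
    return files


# Example usage from the repository root:
# for path in list_dataset_files("."):
#     print(path)
```

If the call raises `FileNotFoundError`, the dataset is missing or was placed somewhere other than the Dataset folder.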