A big data group project covering data manipulation, tuning of prediction parameters with ALS (alternating least squares) and, lastly, community detection with the Girvan-Newman algorithm. The data used is the MovieLens 20M Dataset.
- Alexandros Rantos
- Duaa Alqattan
- Alexander Merschel
This project was developed on Databricks Runtime 6.3 with Apache Spark 2.4.4.
DataFrames and the pandas library are also used.
pip install pyspark
pip install graphframes
pip install dataframe
pip install pandas
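
To give a feel for the community detection step, here is a minimal pure-Python sketch of the Girvan-Newman idea: compute edge betweenness, remove the edge(s) with the highest score, and repeat until the graph splits into more components. This is not the project's Spark code; the function names (`edge_betweenness`, `connected_components`, `girvan_newman_split`) and the toy edge list are hypothetical and chosen only for illustration, and the graph is assumed to be unweighted and undirected.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Edge betweenness (Brandes-style) for an unweighted, undirected graph."""
    score = defaultdict(float)
    for source in adj:
        # BFS from the source, counting shortest paths and predecessors.
        dist = {source: 0}
        sigma = defaultdict(float)
        sigma[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Push path counts back from the farthest nodes towards the source.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score  # every node pair is visited from both ends, so scores are doubled


def connected_components(adj):
    """Return the connected components of the graph as a list of node sets."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in component:
                component.add(v)
                stack.extend(adj[v] - component)
        seen |= component
        components.append(component)
    return components


def girvan_newman_split(edges):
    """Remove highest-betweenness edges until the component count increases."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # assumption: self-loops are ignored
            adj[u].add(v)
            adj[v].add(u)
    initial = len(connected_components(adj))
    while any(adj.values()):
        scores = edge_betweenness(adj)
        top = max(scores.values())
        for edge, value in scores.items():
            if value >= top - 1e-9:  # remove all edges tied for the top score
                u, v = tuple(edge)
                adj[u].discard(v)
                adj[v].discard(u)
        components = connected_components(adj)
        if len(components) > initial:
            return components
    return connected_components(adj)


# Toy graph (not MovieLens data): two triangles joined by the bridge 3-4.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
communities = girvan_newman_split(edges)
print(sorted(sorted(c) for c in communities))
```

Each call performs one split; calling it again on the edges inside each returned community would build the full hierarchy. Betweenness is recomputed after every removal, which is what makes the algorithm expensive on large graphs.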