Movie Recommendations

Recommender System based on Movie Similarities with Spark Scala.

Recommender System: Movie Similarities

The goal of this project is to develop a recommendation system for the MovieLens dataset (accesible from https://grouplens.org/datasets/movielens/). The input is the target movie ID and the output is a list of movies ranked by similarity score and strength of recommendation.

The model uses Spark to perform distributed item-based collaborative filtering using the Cosine Similarity metric to recommend a maximum of 10 movies similar to a specified movie. The strength of the prediction (proxy for confidence) is based on how many movies were avaiable to compute the similarity score. The higher the number of movies, the stronger our prediction is going to be.

The Spark project is developed to be able to process big datasets in a parallel fashion. For testing purposes, I have developed the script to run on my local machine using all cores. To run it on a cluster, just change how the SparkContext is initialized to clarify your cluster and specify the path to the distribuetd file system when loading the data (hdfs, s3, ...).

To run the script in the command line, create JAR file form the package and use spark-submit specifying the target movieID:

$ spark-submit --class com.martataulet.spark.MovieSimilarities MovieSim.jar <movieID>

For example for movieID = 56 the script returns:

Top 10 similar movies for Pulp Fiction (1994)
Smoke (1995)	score: 0.9743848338030823	strength: 68
Reservoir Dogs (1992)	score: 0.9740674165782123	strength: 134
Donnie Brasco (1997)	score: 0.9738247291149608	strength: 75
Sling Blade (1996)	score: 0.9713796344244161	strength: 111
True Romance (1993)	score: 0.9707295689679896	strength: 99
Jackie Brown (1997)	score: 0.9706179145690377	strength: 55
Carlito's Way (1993)	score: 0.9706021261759088	strength: 52

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
MovieSimilarities.scala		MovieSimilarities.scala
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Movie Recommendations

Recommender System: Movie Similarities

About

Releases

Packages

Languages

mtaulet/MovieRecommendations

Folders and files

Latest commit

History

Repository files navigation

Movie Recommendations

Recommender System: Movie Similarities

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages