EECS731Project3

Data interpretation

We use the MovieLens 20M Dataset.

It contains 20 million ratings and 465,000 tag applications applied to 27,000 movies by 138,000 users. It also includes tag genome data with 12 million relevance scores across 1,100 tags. The dataset was released in 4/2015 and updated in 10/2016 to refresh links.csv and add the tag genome data.

Data sets included

- Ratings Data File Structure (ratings.csv)
- Tags Data File Structure (tags.csv)
- Movies Data File Structure (movies.csv)

Part 1: Data Exploration

First, we clean and analyze the three data sets. We characterize the data statistics and generate several visualizations to demonstrate the value of these data sets.

We analyze the tags and genres of each movie. We visualize how many movies each genre contains and characterize the distribution of each genre.
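
As a rough illustration of the per-genre count described above, the sketch below tallies how many movies carry each genre. It assumes the movies.csv layout of movieId, title, and a pipe-separated genres string. The sample rows and the helper name `count_movies_per_genre` are hypothetical, and the notebook may compute this differently (for example with a dataframe).

```python
from collections import Counter

# Hypothetical sample rows in the assumed movies.csv layout:
# (movieId, title, genres), with genres stored as a pipe-separated string.
sample_movies = [
    (1, "Toy Story (1995)", "Adventure|Animation|Children|Comedy|Fantasy"),
    (2, "Jumanji (1995)", "Adventure|Children|Fantasy"),
    (3, "Grumpier Old Men (1995)", "Comedy|Romance"),
]


def count_movies_per_genre(movies):
    """Return a Counter mapping each genre to the number of movies that carry it."""
    counts = Counter()
    for _movie_id, _title, genres in movies:
        # set() guards against a genre being listed twice for one movie.
        counts.update(set(genres.split("|")))
    return counts


genre_counts = count_movies_per_genre(sample_movies)
for genre, n in genre_counts.most_common():
    print(genre, n)
```

The resulting counts can be passed to any plotting library to draw the genre distribution as a bar chart.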

Part 2: Clustering for Movie Recommendation

In this part, we adopt the K-nearest neighbors model for the clustering task. We first transform the genre and tag data into one-hot vectors and add them to the dataframe. We also leverage the statistical information of the ratings. As the clustering results show, the recommendation system tends to recommend movies from similar years. This might happen because movies released in similar years tend to be of similar types.
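
The feature construction and neighbor lookup described above can be sketched as follows. This is a minimal standard-library illustration, not the notebook's implementation: the toy catalogue, the rating figures, the scaling choices, and the helper names `build_features` and `nearest_neighbors` are all assumptions made for the example, and Euclidean distance is only one possible choice of metric.

```python
import math

# Hypothetical toy catalogue: (title, genres, mean rating, rating count).
# The rating figures are made up for illustration, not taken from ratings.csv.
movies = [
    ("Movie A", "Adventure|Children|Fantasy", 3.9, 120),
    ("Movie B", "Adventure|Children", 3.7, 100),
    ("Movie C", "Comedy|Romance", 3.1, 40),
    ("Movie D", "Comedy|Romance|Drama", 3.0, 35),
]


def build_features(rows):
    """One-hot encode genres and append scaled rating statistics."""
    vocab = sorted({g for _, genres, _, _ in rows for g in genres.split("|")})
    max_count = max(count for _, _, _, count in rows)
    features = []
    for _, genres, mean_rating, count in rows:
        present = set(genres.split("|"))
        one_hot = [1.0 if g in present else 0.0 for g in vocab]
        # Scale both statistics to at most 1 so they do not swamp the one-hot part
        # (assumes ratings on a scale with a maximum of 5).
        features.append(one_hot + [mean_rating / 5.0, count / max_count])
    return features


def nearest_neighbors(features, query_index, k):
    """Return indices of the k rows closest to query_index (Euclidean distance)."""
    query = features[query_index]
    distances = [
        (math.dist(query, vec), i)
        for i, vec in enumerate(features)
        if i != query_index
    ]
    return [i for _, i in sorted(distances)[:k]]


features = build_features(movies)
recommended = [movies[i][0] for i in nearest_neighbors(features, 0, k=2)]
print(recommended)
```

Adding a release-year feature to the vector would be one way to test the observation that recommendations cluster by year, since the year would then contribute to the distance explicitly rather than only through correlated genres and tags.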

