This repository is part of our solution for the course Computational Intelligence Lab at ETH Zurich, Spring Semester 2021. Our solution achieved first place on the Collaborative Filtering leaderboard.

Abstract: Collaborative filtering methods for recommender systems find their application in a wide variety of areas. In this work we consider several neural-based and standard matrix-factorization-based models, placing our focus on Bayesian Factorization Machines. We extend these with additional features such as implicit user/item information, multiple user similarity measures, item distance metrics, and unsupervised-learned clusters, and report results in an extensive benchmark.
The repository contains implementations of the following algorithms:
- Singular Value Decomposition
- Non-Negative Matrix Factorization
- Autoencoder
- AutoRec
- Neural Collaborative Filtering
- KernelNet
- Bayesian Factorization Machine
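As a primer on the matrix-factorization family above, the following is a minimal numpy sketch (toy data, not the repository's implementation) of learning user and item latent factors by SGD on the observed ratings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy observed ratings as (user, item, rating) triples; purely illustrative.
triples = [(0, 0, 5.0), (0, 3, 1.0), (1, 0, 4.0), (1, 3, 1.0),
           (2, 1, 1.0), (2, 3, 5.0), (3, 2, 5.0), (3, 3, 4.0)]
n_users, n_items, rank = 4, 4, 2

# Latent factor matrices, trained by SGD on squared error with L2 regularization.
P = 0.1 * rng.normal(size=(n_users, rank))
Q = 0.1 * rng.normal(size=(n_items, rank))
lr, reg = 0.05, 0.01

for _ in range(500):
    for u, i, r in triples:
        err = r - P[u] @ Q[i]          # prediction error on this rating
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

# Training RMSE over the observed entries.
rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in triples]))
print(round(rmse, 3))
```

The models listed above refine this basic idea in different ways, e.g. by constraining the factors to be non-negative (NMF) or replacing the inner product with a learned neural scoring function (NCF).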
Validation RMSE for different Matrix-Factorization-based Approaches:
Validation RMSE for different rank values. NMF performs best with a rank of 24, while the other methods peak at ranks of around 8 to 12.
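A rank sweep of this kind can be set up as follows (a minimal numpy sketch on synthetic low-rank data using truncated SVD; the repository's actual experiments live in cv_experiments.py):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic low-rank ratings: 50 users x 40 items, true rank 10, plus noise.
true_rank = 10
ratings = rng.normal(size=(50, true_rank)) @ rng.normal(size=(true_rank, 40))
ratings += 0.1 * rng.normal(size=ratings.shape)

# Hold out roughly 20% of the entries as a validation set.
val_mask = rng.random(ratings.shape) < 0.2
train = np.where(val_mask, 0.0, ratings)

def validation_rmse(rank):
    """RMSE of a rank-k truncated SVD of the training matrix on held-out entries."""
    U, s, Vt = np.linalg.svd(train, full_matrices=False)
    approx = U[:, :rank] * s[:rank] @ Vt[:rank, :]
    return np.sqrt(((approx - ratings)[val_mask] ** 2).mean())

errors = {rank: validation_rmse(rank) for rank in (2, 4, 8, 12, 16, 24)}
best_rank = min(errors, key=errors.get)
print(best_rank, round(errors[best_rank], 3))
```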
Validation RMSE for different Neural-based Approaches:
Validation RMSE for different epochs. AutoRec and NCF diverge after a few epochs while KernelNet and AE show the opposite behaviour.
Heatmap of Bayesian Factorization Machine:
Validation RMSE for different values of sample size and rank. We conduct the hyperparameter search on our second-best model, BFM SVD++ flipped, due to the high computational cost of our best one. Low-rank approximations with ranks of 8 to 12 combined with large sample sizes perform best.
Use Python version 3.7.4:

```
conda create --name collaborative-filtering python=3.7.4
conda activate collaborative-filtering
pip install --user -r requirements.txt
python main.py
```
This will reproduce the best submitted Kaggle result. However, note that this may take up to 2 days to finish.
```
python cv_experiments.py
```
This will reproduce the results used for generating the plots of the report.
```
python movie_clustering.py
python movie_distance.py
```
This will create the matrices used for movie features in the Bayesian Factorization Machine model. Note that the Jaccard indices are only calculated on the fly if they have not been downloaded beforehand; they can be downloaded from Polybox and placed into data/features.
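For reference, the Jaccard index between two movies can be computed from the sets of users that rated each of them. A minimal numpy sketch on toy data (the files on Polybox are precomputed over the full dataset):

```python
import numpy as np

# Binary interaction matrix: entry (u, m) = 1 if user u rated movie m (toy data).
interactions = np.array([
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
], dtype=bool)

def jaccard_matrix(X):
    """Pairwise Jaccard indices between the columns of a boolean matrix."""
    X = X.astype(np.int64)
    intersection = X.T @ X                       # co-rating counts per movie pair
    counts = X.sum(axis=0)                       # ratings per movie
    union = counts[:, None] + counts[None, :] - intersection
    return intersection / np.maximum(union, 1)   # avoid division by zero

J = jaccard_matrix(interactions)
print(J.shape, round(float(J[0, 1]), 3))
```

Here movies 0 and 1 share two of their four distinct raters, so their Jaccard index is 0.5; the diagonal is 1 by construction.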
To reproduce the plots, either run the cross-validation experiments described above or download the cross-validation results from Polybox. The files that need to be downloaded are bfm.hkl, bfm_svdpp.hkl, bfm_svdpp_flipped.hkl, and svd.hkl; place them in the root directory.
To reproduce the data for the validation plot, either run the experiments from scratch or download the following files from Polybox and place them in the root directory: autoencoder.npy, autorec.npy, kernel_net.npy, ncf.npy.
To generate the plots, run:

```
python plot_generator.py
```
To build the report, simply call:
```
cd report
bash build-paper.sh
```