Beyond being a Japanese citrus hybrid, Hassaku is a(nother) research-oriented Recommender System framework, written mostly in Python and leveraging PyTorch. Hassaku hosts the main procedures for data processing, training, testing, hyperparameter optimization, metrics computation, and logging for several Collaborative Filtering-based Recommender Systems. The project started mostly as a way to support my own research; however, it is open to everyone who wants to experiment with Recommender Systems algorithms and run exciting experiments.
- Basic train/val/test strategies for Recommender Systems based on implicit feedback
- Negative Sampling strategies (a minimal sketch follows this list)
- Several Collaborative Filtering algorithms
- Full evaluation on all items during validation and test
- Automatic sync to Wandb
- Hyperparameter Optimization offered by Ray Tune
- A narrower selection of algorithms and options compared to RecBole and Elliot (I am one person after all 🧑)
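As a taste of the negative sampling point above, here is a minimal sketch of what a uniform negative sampling strategy looks like. This is a generic illustration, not Hassaku's actual implementation; the function name and signature are hypothetical.

```python
import torch

def sample_uniform_negatives(n_items: int, batch_size: int, n_neg: int) -> torch.Tensor:
    """Draw n_neg item indices per training example, uniformly at random.

    A real sampler would typically re-draw (or mask) items the user has
    already interacted with; this sketch omits that check for brevity.
    """
    return torch.randint(0, n_items, (batch_size, n_neg))
```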
- Install the environment with `conda env create -f hassaku.yml`
- Activate the environment with `conda activate hassaku`
Generally, Hassaku works with user-based, ratio-based train/val/test splits of the users' interactions (this paper offers a nice, brief description of the approach), sometimes named weak generalization.
The folder `./data/` hosts the code for downloading, filtering, and generally processing some RecSys datasets. There are default `*_processor.py` scripts in each dataset folder that perform temporal-based splits of the user-item interactions.
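For intuition, a per-user temporal ratio split can be sketched as follows. This is illustrative only; the actual logic lives in the dataset processors, and the column names (`user`, `timestamp`) are assumptions.

```python
import pandas as pd

def temporal_user_split(df: pd.DataFrame, ratios=(0.8, 0.1, 0.1)):
    """Per-user temporal ratio split: for each user, the earliest interactions
    go to train, the next chunk to val, and the most recent ones to test."""
    train, val, test = [], [], []
    # Sorting first keeps each user's interactions in temporal order.
    for _, user_df in df.sort_values('timestamp').groupby('user'):
        n_train = int(len(user_df) * ratios[0])
        n_val = int(len(user_df) * ratios[1])
        train.append(user_df.iloc[:n_train])
        val.append(user_df.iloc[n_train:n_train + n_val])
        test.append(user_df.iloc[n_train + n_val:])
    return pd.concat(train), pd.concat(val), pd.concat(test)
```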
E.g., to download and process the MovieLens 1M dataset:
- move into the folder with `cd data/ml1m`
- run `python movielens1m_processor.py` (if the dataset is not detected, it will be downloaded automatically)
After the script ends, you will have 5 files in the newly created `processed_dataset` folder:
- 3 files with the `listening_history` of the users for train, val, and test
- 2 files containing the user and item ids (as indexes into the rating matrix)
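Conceptually, the two id files encode the mapping from raw dataset ids to contiguous indexes of the rating matrix. A hypothetical illustration (the actual file names and formats are defined by the processors):

```python
import pandas as pd

# Raw user ids as they appear in the dataset, mapped to contiguous row indexes.
raw_user_ids = [12, 37, 501]
user_idxs = pd.DataFrame({
    'user_id': raw_user_ids,
    'user_idx': range(len(raw_user_ids)),  # row index in the rating matrix
})
```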
Place your wandb API key in a file, e.g., `./wandb_api_key`. In `wandb_conf.py` set:

```python
PROJECT_NAME = 'hassaku'  # or something else if you want
ENTITY_NAME = '<your_name_on_wandb>'
WANDB_API_KEY_PATH = './wandb_api_key'
```
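For reference, this is roughly how the wandb client consumes these three values. Hassaku handles the logging internally; the snippet only shows what the settings are used for.

```python
import wandb

from wandb_conf import ENTITY_NAME, PROJECT_NAME, WANDB_API_KEY_PATH

# Read the API key from the file, authenticate, and open a run.
with open(WANDB_API_KEY_PATH) as f:
    wandb.login(key=f.read().strip())
run = wandb.init(project=PROJECT_NAME, entity=ENTITY_NAME)
```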
The following showcases how to run one or multiple experiments with Hassaku, using BPRMF applied to MovieLens 1M; a minimal sketch of the underlying model and loss follows the steps.
- Run `movielens1m_processor.py` in `./data/ml1m/` (as in "Installation and Configuration/Data")
- Set all your experiment configurations in a `.yml` file, e.g., in the `./conf` folder. For example, `bprmf_conf.yml`:
```yaml
data_path: ./data  # or absolute path
# Model Parameters
embedding_dim: 402
lr: 0.0003
wd: 0.00004
use_user_bias: False
use_item_bias: True
use_global_bias: False
# Training Parameters
optimizer: adamw
n_epochs: 50
max_patience: 5
train_batch_size: 128
neg_train: 50
rec_loss: bpr
# Experiment Parameters
eval_batch_size: 256
device: cuda
running_settings:
  train_n_workers: 4
  batch_verbose: True
```
- Run `python run_experiment -a mf -d ml1m -c ./conf/bprmf_conf.yml` (this will perform both train/val and test)
- Track your run on wandb.
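To connect the configuration to the model: with `rec_loss: bpr`, matrix factorization is trained with the pairwise BPR loss over the `neg_train` sampled negatives. Below is a minimal PyTorch sketch of the scoring and the loss; it is illustrative, not Hassaku's actual module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBPRMF(nn.Module):
    """Minimal matrix factorization scorer with an item bias, mirroring
    embedding_dim / use_item_bias from the configuration above."""

    def __init__(self, n_users: int, n_items: int, embedding_dim: int = 402):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, embedding_dim)
        self.item_emb = nn.Embedding(n_items, embedding_dim)
        self.item_bias = nn.Embedding(n_items, 1)  # use_item_bias: True

    def score(self, users: torch.Tensor, items: torch.Tensor) -> torch.Tensor:
        # users: (batch,), items: (batch, k) -> scores: (batch, k)
        u = self.user_emb(users).unsqueeze(1)                   # (batch, 1, d)
        i = self.item_emb(items)                                # (batch, k, d)
        return (u * i).sum(-1) + self.item_bias(items).squeeze(-1)

def bpr_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    # Each observed item should outscore each of the neg_train sampled negatives.
    return -F.logsigmoid(pos_scores.unsqueeze(1) - neg_scores).mean()
```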
- Run `movielens1m_processor.py` in `./data/ml1m/` (as in "Installation and Configuration/Data")
- Write your hyperparameter ranges and set parameters in `./hyper_search/hyper_params.py` (see the Ray Tune docs about this):
```python
bprmf_ml1m_param = {
    # Hyper Parameters
    'lr': tune.loguniform(5e-5, 1e-3),
    'wd': tune.loguniform(1e-6, 1e-2),
    'embedding_dim': tune.lograndint(126, 512, base=2),
    'train_batch_size': tune.lograndint(32, 256, base=2),
    'neg_train': tune.randint(1, 100),
    # Set Parameters
    'n_epochs': 50,
    'max_patience': 5,
    'running_settings': {
        'train_n_workers': 4
    },
    'train_neg_strategy': 'uniform',
    'rec_loss': 'bpr',
    'use_user_bias': False,
    'use_item_bias': True,
    'use_global_bias': False,
    'optimizer': 'adamw'
}
```
and make sure that it will be selected (at the bottom of `./hyper_search/hyper_params.py`):

```python
alg_data_param = {
    ...
    (AlgorithmsEnum.mf, DatasetsEnum.ml1m): bprmf_ml1m_param,
    ...
}
```
- Run `python run_hyper_experiment -a mf -d ml1m -dp <path_to_the_./data_directory>`.
  Optionally, you can set the number of sampled configurations with `-ns` (e.g., `-ns 50`), the CPUs and GPUs for each trial using `-ncpu` and `-ngpu` (values <= 1 are allowed, e.g., `-ncpu 1 -ngpu 0.2`), and the number of concurrent trials with `-nc` (e.g., `-nc 3`). A rough sketch of how these flags map onto Ray Tune follows these steps.
- Track all of your trials on wandb.
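For orientation, here is a rough sketch of how these flags could map onto Ray Tune concepts. The wiring is hypothetical (the real driver lives in `run_hyper_experiment`), and `train_fn` is a placeholder for Hassaku's train/val routine.

```python
from ray import tune

from hyper_search.hyper_params import bprmf_ml1m_param  # the dict defined above

def train_fn(config):
    # Train and evaluate the model with this configuration; returning a dict
    # reports it as the trial's final result.
    return {'val_metric': 0.0}  # placeholder

tuner = tune.Tuner(
    tune.with_resources(train_fn, {'cpu': 1, 'gpu': 0.2}),  # -ncpu 1 -ngpu 0.2
    param_space=bprmf_ml1m_param,
    tune_config=tune.TuneConfig(
        num_samples=50,            # -ns 50
        max_concurrent_trials=3,   # -nc 3
    ),
)
results = tuner.fit()
```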