A Python package for collaborative filtering on social datasets
- Pip (official releases):
pip install neighbors
- GitHub (bleeding edge):
pip install git+https://github.com/cosanlab/neighbors.git
The best way to learn the package is the documentation site, which contains usage tutorials and API documentation for all package functionality.
from neighbors.models import NNMF_sgd
from neighbors.utils import create_user_item_matrix, estimate_performance
# Assuming data is 3 column pandas df with 'User', 'Item', 'Rating'
# convert it to a (possibly sparse) user x item matrix
mat = create_user_item_matrix(df)
# Initialize a model
model = NNMF_sgd(mat)
# Fit
model.fit()
# If the data are time series, optionally fit the model using dilation
# to leverage auto-correlation and improve performance
model.fit(dilate_by_nsamples=60)
# Visualize results
model.plot_predictions()
# Estimate algorithm performance using
# Repeated refitting with random masking (dense data)
# Or cross-validation (sparse data)
group_results, user_results = estimate_performance(NNMF_sgd, mat)
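To make the data layout and the random-masking idea concrete, here is a minimal sketch that uses only pandas and NumPy. It does not call the package and is not its implementation: the toy ratings, the `mask_random` helper, and the user-mean baseline are all assumptions made for illustration. It pivots a long-format 'User', 'Item', 'Rating' table into a user x item matrix, hides a random fraction of the observed ratings, and scores a simple baseline on the hidden entries.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format ratings with the three columns named above
df = pd.DataFrame({
    "User": ["u1", "u1", "u2", "u2", "u3", "u3"],
    "Item": ["a", "b", "a", "c", "b", "c"],
    "Rating": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})

# Pivot to a user x item matrix; unobserved user/item pairs become NaN
mat = df.pivot(index="User", columns="Item", values="Rating")


def mask_random(mat, frac, seed=0):
    """Hide a random fraction of the observed entries (illustrative helper)."""
    rng = np.random.default_rng(seed)
    values = mat.to_numpy(dtype=float)
    observed = np.argwhere(~np.isnan(values))  # (row, col) of each observed rating
    n_hide = max(1, int(round(frac * len(observed))))
    chosen = observed[rng.choice(len(observed), size=n_hide, replace=False)]
    hidden = np.zeros(values.shape, dtype=bool)
    hidden[chosen[:, 0], chosen[:, 1]] = True
    return mat.mask(hidden), hidden  # hidden entries are set to NaN in the copy


train, hidden = mask_random(mat, frac=0.5)

# Baseline: predict each user's mean remaining rating, else the global mean
global_mean = np.nanmean(train.to_numpy())
user_means = train.mean(axis=1).fillna(global_mean)
pred = np.repeat(user_means.to_numpy()[:, None], mat.shape[1], axis=1)

# Score only the entries that were hidden from the baseline
rmse = np.sqrt(np.mean((mat.to_numpy()[hidden] - pred[hidden]) ** 2))
print(f"RMSE on hidden ratings: {rmse:.2f}")
```

Repeating the mask-fit-score loop with different seeds gives a distribution of error estimates rather than a single number, which is the general idea behind repeated refitting.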
Currently supported algorithms include:
- Mean - a baseline model
- KNN - k-nearest neighbors
- NNMF_mult - non-negative matrix factorization trained via multiplicative updating
- NNMF_sgd - non-negative matrix factorization trained via stochastic gradient descent
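For readers unfamiliar with multiplicative updating, the sketch below shows the general technique in plain NumPy: the classic multiplicative update rules for non-negative matrix factorization, restricted to observed entries with a mask. It is not the package's implementation. The function names, default values, and toy matrix are assumptions chosen for illustration.

```python
import numpy as np


def nnmf_multiplicative(X, n_factors=2, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix X (NaN = missing) as W @ H (illustrative)."""
    rng = np.random.default_rng(seed)
    M = ~np.isnan(X)              # True where a rating was observed
    Xf = np.where(M, X, 0.0)      # zero-fill missing entries; the mask ignores them
    n_users, n_items = X.shape
    W = rng.random((n_users, n_factors)) + 0.1
    H = rng.random((n_factors, n_items)) + 0.1
    for _ in range(n_iter):
        # Each update rescales the factor by a non-negative ratio,
        # so W and H stay non-negative throughout
        WH = (W @ H) * M
        H *= (W.T @ Xf) / (W.T @ WH + eps)
        WH = (W @ H) * M
        W *= (Xf @ H.T) / (WH @ H.T + eps)
    return W, H


def masked_rmse(X, W, H):
    """Root-mean-square error over the observed entries only."""
    M = ~np.isnan(X)
    return np.sqrt(np.mean((X[M] - (W @ H)[M]) ** 2))


# Toy user x item matrix with missing ratings
X = np.array([
    [5.0, 3.0, np.nan, 1.0],
    [4.0, np.nan, np.nan, 1.0],
    [1.0, 1.0, np.nan, 5.0],
    [1.0, np.nan, 4.0, 4.0],
    [np.nan, 1.0, 5.0, 4.0],
])

W, H = nnmf_multiplicative(X)
filled = W @ H  # dense reconstruction, including predictions for the NaN cells
print(f"RMSE on observed ratings: {masked_rmse(X, W, H):.3f}")
```

A stochastic gradient descent variant optimizes the same reconstruction error but updates the factors from individual observed ratings, which trades the automatic non-negativity of the multiplicative rule for a tunable learning rate.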