SASRec in Tensorflow 2.x #1530
Conversation
Check out this pull request on ReviewNB: see visual diffs & provide feedback on Jupyter Notebooks.
Hey Abir, we use staging as our development branch. So we always merge PRs into staging and never to main (except staging -> main).
Since this PR will depend on upgrading TF, let's keep reviewing but leave it open until we can put everything in the upgrade together.
@@ -0,0 +1,59 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
I don't think we need a separate test pipeline for TF2. Once we upgrade the TF version in setup.py, the current pipeline will switch to version 2, so you don't need this file.
+1
Removed this file.
Could you please add docstrings to those functions that should appear in the documentation page on readthedocs? See https://github.com/Microsoft/Recommenders/wiki/Coding-Guidelines#python-and-docstrings-style
Each new notebook needs to have some description in the beginning. See other notebooks in the examples. The description is a summary of the method and what scenario it addresses. There may be text in other places in the notebook too, if needed for clarification.
Moreover, the method needs to be included in the table of methods we have in the README.
recommenders/models/sasrec/model.py
Outdated
- each tuple (q, k, v) are fed to scaled_dot_product_attention
- all attention outputs are concatenated
"""
class MultiHeadAttention(tf.keras.layers.Layer):
Could you please add docstrings?
Added docstrings to all the methods.
recommenders/models/sasrec/util.py
Outdated
def evaluate(model, dataset, maxlen, num_neg_test):
    [train, valid, test, usernum, itemnum] = copy.deepcopy(dataset)

    NDCG = 0.0
    HT = 0.0
    valid_user = 0.0

    if usernum > 10000:
        users = random.sample(range(1, usernum + 1), 10000)
    else:
        users = range(1, usernum + 1)

    for u in tqdm(users, ncols=70, leave=False, unit='b'):

        if len(train[u]) < 1 or len(test[u]) < 1: continue

        # print(train[u])
In other notebooks we are standardizing the way we evaluate, so that we compare apples to apples. Would it be possible to use the evaluation functions that we have in the repo? https://github.com/microsoft/recommenders/blob/main/recommenders/evaluation/python_evaluation.py
The metrics used here, NDCG@10 and Hit@10, are the same as those available in deeprec_utils.py (https://github.com/microsoft/recommenders/blob/main/recommenders/models/deeprec/deeprec_utils.py). Currently I cannot import them since TF 2.x is not supported (from deeprec_utils import ndcg_score, hit_score does not work). Once we migrate to TF 2.x it will be easy to invoke those functions.
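For reference, this is roughly what the two metrics reduce to for a single ranked list with one positive item (just a sketch for discussion, not the deeprec_utils implementation):

import numpy as np

def hit_at_k(rank, k=10):
    # 1 if the positive item is ranked within the top k candidates, else 0
    return 1.0 if rank < k else 0.0

def ndcg_at_k(rank, k=10):
    # with a single relevant item, DCG reduces to 1 / log2(rank + 2)
    return 1.0 / np.log2(rank + 2) if rank < k else 0.0

# example: the positive item is ranked 3rd (0-based rank 2) among the candidates
rank = 2
print(hit_at_k(rank), ndcg_at_k(rank))  # 1.0, 0.5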
@@ -0,0 +1,59 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
+1
@@ -0,0 +1,321 @@
{
Not sure I understand the dataset creation here. In other notebooks we give a short explanation of the dataset; please see: https://github.com/microsoft/recommenders/blob/main/examples/00_quick_start/lstur_MIND.ipynb
Also, there is a step to download the dataset. We have some light wrappers for that, like https://github.com/microsoft/recommenders/blob/main/recommenders/datasets/amazon_reviews.py
I have elaborated on the data format and dataset creation. Let me know if I need to add more.
@@ -0,0 +1,321 @@
{
Line #8. # conv_dims = kwargs.get("conv_dims", [100, 100])
I think this guy should be:
conv_dims=[100,100]
because kwargs is not defined here?
Actually, you may want to add a parameter here at the beginning, like the other parameters.
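Something like this would work (a hypothetical notebook cell; the surrounding values are just illustrative):

# hyperparameters defined at the top of the notebook, alongside the others
maxlen = 50
hidden_units = 100
dropout_rate = 0.1
conv_dims = [100, 100]  # feed-forward layer dimensions, instead of kwargs.get("conv_dims", [100, 100])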
@@ -0,0 +1,321 @@
{
Would it be possible to run the notebook and show the logs? In the rest of the notebooks we follow that pattern.
Added the logs.
This is really good Abir, I added some comments, please take a look
class MultiHeadAttention(tf.keras.layers.Layer):
    """
    - Q (query), K (key) and V (value) are split into multiple heads (num_heads)
    - each tuple (q, k, v) are fed to scaled_dot_product_attention
    - all attention outputs are concatenated
    """

    def __init__(self, attention_dim, num_heads, dropout_rate):
        super(MultiHeadAttention, self).__init__()
        self.num_heads = num_heads
        self.attention_dim = attention_dim
        assert attention_dim % self.num_heads == 0
        self.dropout_rate = dropout_rate

        self.depth = attention_dim // self.num_heads
Is this code for multi-head attention the same as in the other file? If so, it would be good to refactor so we don't repeat code.
Yes, the code is refactored to use SASRec as the base class.
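A rough sketch of that refactoring pattern (class and argument names here are simplified assumptions, not the exact code in this PR):

import tensorflow as tf

class SASREC(tf.keras.Model):
    # base model: item embedding + self-attention blocks (details omitted)
    def __init__(self, item_num, embedding_dim, **kwargs):
        super().__init__(**kwargs)
        self.item_embedding = tf.keras.layers.Embedding(item_num + 1, embedding_dim)

class SSEPT(SASREC):
    # SSE-PT reuses the SASREC layers and only adds what is specific to it
    def __init__(self, user_num, user_embedding_dim, **kwargs):
        super().__init__(**kwargs)
        self.user_embedding = tf.keras.layers.Embedding(user_num + 1, user_embedding_dim)

model = SSEPT(user_num=100, user_embedding_dim=10, item_num=500, embedding_dim=100)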
class SSEPT(tf.keras.Model):
    """
    SSE-PT Model

    :Citation:

        Wu L., Li S., Hsieh C-J., Sharpnack J., SSE-PT: Sequential Recommendation
        Via Personalized Transformer, RecSys, 2020.
        TF 1.x codebase: https://github.com/SSE-PT/SSE-PT
        TF 2.x codebase (SASRec): https://github.com/nnkkmto/SASRec-tf2

    Args:
        item_num: number of items in the dataset
        seq_max_len: maximum number of items in user history
        num_blocks: number of Transformer blocks to be used
        embedding_dim: item embedding dimension
        attention_dim: Transformer attention dimension
        conv_dims: list of the dimensions of the Feedforward layer
        dropout_rate: dropout rate
        l2_reg: coefficient of the L2 regularization
        num_neg_test: number of negative examples used in testing
    """
Question: how different is SASRec from SSE-PT?
The difference between SASRec and SSE-PT is that SSE-PT also creates a user embedding, whereas SASRec has only item embeddings. The authors have shown that SSE-PT performance is better than that of SASRec on 5 datasets (by 5%).
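Conceptually, the SSE-PT sequence representation concatenates a user embedding to every item embedding, while SASRec uses the item embeddings alone. A toy sketch with made-up dimensions (not the PR's code):

import tensorflow as tf

user_num, item_num = 1000, 500
user_dim, item_dim, seq_max_len = 50, 100, 10

item_emb = tf.keras.layers.Embedding(item_num + 1, item_dim)  # used by both models
user_emb = tf.keras.layers.Embedding(user_num + 1, user_dim)  # SSE-PT only

item_seq = tf.ones((2, seq_max_len), dtype=tf.int32)  # dummy batch of item id sequences
user_ids = tf.constant([1, 2])                        # dummy batch of user ids

# SASRec: the sequence representation is just the item embeddings
sasrec_repr = item_emb(item_seq)                      # (2, seq_max_len, item_dim)

# SSE-PT: the user embedding is tiled along the sequence and concatenated
u = tf.tile(user_emb(user_ids)[:, tf.newaxis, :], [1, seq_max_len, 1])
ssept_repr = tf.concat([item_emb(item_seq), u], axis=-1)  # (2, seq_max_len, item_dim + user_dim)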
def sample_function(
    user_train, usernum, itemnum, batch_size, maxlen, result_queue, SEED
Minor detail: typically we have all the inputs in lowercase, so we would write SEED as seed.
Changed to seed.
setup.py
Outdated
"tqdm>=4.31.1,<5", | ||
"matplotlib>=2.2.2,<4", | ||
"scikit-learn>=0.22.1,<1", | ||
"numba>=0.38.1,<1", | ||
# "numba>=0.38.1,<1", |
Mmm, I think this would break the repo. Abir, does numba conflict with this code?
FYI @anargyri
numba was causing problems due to the LLVM module. I have installed it separately, so there is no more issue with numba.
Yes, numba is needed for the repo.
setup.py
Outdated
"pandas>1.0.3,<2", | ||
"scipy>=1.0.0,<2", | ||
"scipy==1.4.1", |
Also, not sure if pinning scipy can generate problems. Abir, does the code break with scipy>=1.0.0,<2?
Yes, the problem is with the particular tensorflow-gpu (2.3.0) version, which has a very specific requirement.
Yes, I think we need to be careful when modifying setup.py; it can be a breaking change and it affects any PyPI release that will happen following the change.
Moreover, we are going to TF 2.6 (2.3 still suffers from some vulnerabilities). I think the best way is to ensure your code conforms with 2.6 from this PR already. Could you go through the migration process described here https://www.tensorflow.org/guide/migrate (i.e. run the script they provide and make the suggested changes to the syntax)?
You would need to use this setup.py (note the CUDA version too).
setup.py
Outdated
"tensorflow-gpu==2.3.0", # compiled with CUDA 10.0 | ||
"torch==1.9.1", # last os-common version with CUDA 10.0 support | ||
"six~=1.15.0", | ||
"typing-extensions~=3.7.4", |
Same thing here; at this point, if we add TF 2 we will break the repo.
@anargyri is working on the transition, but so far we can't add this change.
@pytest.mark.notebooks
def test_sasrec_single_node_runs(notebooks, output_notebook, kernel_name):
    notebook_path = notebooks["sasrec_quickstart"]
    pm.execute_notebook(notebook_path, output_notebook, kernel_name=kernel_name)
How long does this test take? If it takes too long, we can add it to smoke or integration.
@pytest.mark.notebooks
def test_template_runs(notebooks, output_notebook, kernel_name):
    notebook_path = notebooks["template"]
    pm.execute_notebook(
        notebook_path,
        output_notebook,
        parameters=dict(PM_VERSION=pm.__version__),
        kernel_name=kernel_name,
    )
    nb = sb.read_notebook(output_notebook)
    df = nb.papermill_dataframe
    assert df.shape[0] == 2
    check_version = df.loc[df["name"] == "checked_version", "value"].values[0]
    assert check_version is True
Is this needed here? My understanding was that there is already one example like this in the same folder.
@@ -0,0 +1,139 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
One thing that is missing is the unit tests of the classes themselves. See for example: https://github.com/microsoft/recommenders/blob/main/tests/unit/recommenders/models/test_deeprec_model.py
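As an illustration, such a unit test could look roughly like this (a sketch using the SASREC constructor arguments shown elsewhere in this PR; the gpu marker and the final assertion are assumptions to adapt):

import pytest

@pytest.mark.gpu
def test_sasrec_component_definition():
    from recommenders.models.sasrec.model import SASREC

    # small, fast-to-build configuration just to exercise the constructor
    model = SASREC(
        item_num=10,
        seq_max_len=5,
        num_blocks=1,
        embedding_dim=4,
        attention_dim=4,
        attention_num_heads=1,
        dropout_rate=0.1,
        conv_dims=[4, 4],
        l2_reg=0.0,
        num_neg_test=10,
    )
    # a real test would assert on specific layers/attributes of the model
    assert model is not None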
@aeroabir not sure if you saw this comment
@aeroabir something important: when you are doing the commits, GitHub is not linking your username to the commits. I think it is important for everyone to see on GitHub that the code you developed is attributed to you. It took you a long time to do this and we should really make sure your work is showcased. Once this is set up, you will see that your name automatically appears in the contributor list https://github.com/microsoft/recommenders/graphs/contributors
setup.py
Outdated
"pandas>1.0.3,<2", | ||
"scipy>=1.0.0,<2", | ||
"scipy==1.4.1", |
import os
import pytest
import tensorflow as tf
I think that we can have the TF2 tests in the same file as the TF tests, because we are going to upgrade soon. FYI @anargyri
@@ -0,0 +1,139 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
@aeroabir not sure if you saw this comment
@aeroabir This PR should be ready to revisit now that the TensorFlow upgrade has been checked into staging.
Also, could you add an entry in the table that lists the algos in the front page README?
Could you also put SASRec alphabetically in the table (after SAR)?
Another place to add SASRec is here.
Codecov Report
@@ Coverage Diff @@
## staging #1530 +/- ##
============================================
+ Coverage 0.00% 59.05% +59.05%
============================================
Files 84 88 +4
Lines 8462 8997 +535
============================================
+ Hits 0 5313 +5313
- Misses 0 3684 +3684
Flags with carried forward coverage won't be shown.
…enders into staging_abir_tf2
Hey Abir, I think the code is in good shape. There is one issue to discuss about the splitter.
README.md
Outdated
@@ -118,8 +118,10 @@ The table below lists the recommender algorithms currently available in the repo
| Restricted Boltzmann Machines (RBM) | Collaborative Filtering | Neural network based algorithm for learning the underlying probability distribution for explicit or implicit user/item feedback. It works in the CPU/GPU enviroment. | [Quick start](examples/00_quick_start/rbm_movielens.ipynb) / [Deep dive](examples/02_model_collaborative_filtering/rbm_deep_dive.ipynb) |
| Riemannian Low-rank Matrix Completion (RLRMC)<sup>*</sup> | Collaborative Filtering | Matrix factorization algorithm using Riemannian conjugate gradients optimization with small memory consumption to predice user/item interactions. It works in the CPU enviroment. | [Quick start](examples/00_quick_start/rlrmc_movielens.ipynb) |
| Simple Algorithm for Recommendation (SAR)<sup>*</sup> | Collaborative Filtering | Similarity-based algorithm for implicit user/item feedback. It works in the CPU environment. | [Quick start](examples/00_quick_start/sar_movielens.ipynb) / [Deep dive](examples/02_model_collaborative_filtering/sar_deep_dive.ipynb) |
| Self-Attentive Sequential Recommendation (SASRec) | Sequential | Transformer based algorithm for sequential recommendation. It works in the CPU/GPU environment. | [Deep dive](examples/00_quick_start/sasrec_amazon.ipynb) |
I think here it would be [Quick start](examples/00_quick_start/sasrec_amazon.ipynb)
In the type, right now we only have as options Collaborative Filtering, Content based and Hybrid. We don't have sequential. However, we actually discussed to change the types in the past, but we didn't take any decision.
If you think it would be a good idea to change the types, would it be ok if we leave this as one of the old types, just for consistency, and then start a discussion about other types?
README.md
Outdated
| Short-term and Long-term Preference Integrated Recommender (SLi-Rec)<sup>*</sup> | Collaborative Filtering | Sequential-based algorithm that aims to capture both long and short-term user preferences using attention mechanism, a time-aware controller and a content-aware controller. It works in the CPU/GPU environment. | [Quick start](examples/00_quick_start/sequential_recsys_amazondataset.ipynb) |
| Multi-Interest-Aware Sequential User Modeling (SUM)<sup>*</sup> | Collaborative Filtering | An enhanced memory network-based sequential user model which aims to capture users' multiple interests. It works in the CPU/GPU environment. | [Quick start](examples/00_quick_start/sequential_recsys_amazondataset.ipynb) |
| Sequential Recommendation Via Personalized Transformer (SSEPT) | Sequential | Transformer based algorithm for sequential recommendation with User embedding. It works in the CPU/GPU environment. | [Deep dive](examples/00_quick_start/sasrec_amazon.ipynb) |
Same note here as before with the type and quick start
tests/conftest.py
Outdated
    # TF2.x
    "sasrec_quickstart": os.path.join(
        folder_notebooks, "00_quick_start", "sasrec_amazon.ipynb"
    ),
}
Minor detail: the other notebooks are ordered based on the folder (00_quick_start, 02_model_content_based_filtering, ...). Would you mind following the same structure?
)
],
)
# @pytest.mark.skipif(tf.__versoin__ > "2.0", reason="We are currently on TF 1.5")
# @pytest.mark.skipif(tf.__versoin__ > "2.0", reason="We are currently on TF 1.5")
This can be updated now.
def split(self, **kwargs):
    self.filename = kwargs.get("filename", self.filename)
    if not self.filename:
        raise ValueError("Filename is required")

    if self.with_time:
        self.data_partition_with_time()
    else:
        self.data_partition()

def data_partition(self):
    # assume user/item index starting from 1
    f = open(self.filename, "r")
    for line in f:
        u, i = line.rstrip().split(self.col_sep)
        u = int(u)
        i = int(i)
        self.usernum = max(u, self.usernum)
        self.itemnum = max(i, self.itemnum)
        self.User[u].append(i)
One thing for you to consider, Abir. In the other algos, we always split the data using the common splitters: https://github.com/microsoft/recommenders/blob/main/recommenders/datasets/python_splitters.py, which helps us compare algorithms in an apples-to-apples fashion. If we use the original splitters, we won't be able to compare these algos with the other ones in the repo.
What are your thoughts on this?
# Amazon Electronics Data
itemnum = 85930
maxlen = 50
num_blocks = 2
hidden_units = 100
num_heads = 1
dropout_rate = 0.1
l2_emb = 0.0
num_neg_test = 100

model = SASREC(
    item_num=itemnum,
    seq_max_len=maxlen,
    num_blocks=num_blocks,
    embedding_dim=hidden_units,
    attention_dim=hidden_units,
    attention_num_heads=num_heads,
    dropout_rate=dropout_rate,
    conv_dims=[100, 100],
    l2_reg=l2_emb,
    num_neg_test=num_neg_test,
)
An elegant way of adding input data to tests is fixtures; see how they work here: https://github.com/miguelgfierro/pybase/blob/master/test/pytest_fixtures.py
You could create a dictionary fixture of the Amazon Electronics data and input it in the test, as in the sketch below.
def random_neq(left, right, s):
    t = np.random.randint(left, right)
    while t in s:
        t = np.random.randint(left, right)
    return t


def sample_function(
    user_train, usernum, itemnum, batch_size, maxlen, result_queue, seed
):
    """
    Batch sampler that creates a sequence of negative items based on the
    original sequence of items (positive) that the user has interacted with.
    """
If you have time, it would be great to add the missing docstrings. This is an example of the format we follow:
def check_column_dtypes_wrapper(
    rating_true,
    rating_pred,
    col_user=DEFAULT_USER_COL,
    col_item=DEFAULT_ITEM_COL,
    col_rating=DEFAULT_RATING_COL,
    col_prediction=DEFAULT_PREDICTION_COL,
    *args,
    **kwargs
):
    """Check columns of DataFrame inputs

    Args:
        rating_true (pandas.DataFrame): True data
        rating_pred (pandas.DataFrame): Predicted data
        col_user (str): column name for user
        col_item (str): column name for item
        col_rating (str): column name for rating
        col_prediction (str): column name for prediction
    """
See here for more info: https://github.com/microsoft/recommenders/blob/main/recommenders/evaluation/python_evaluation.py#L51
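Applied to sample_function above, a docstring in that format could look like this (the argument descriptions are my assumptions based on the names, so please adjust):

def sample_function(
    user_train, usernum, itemnum, batch_size, maxlen, result_queue, seed
):
    """Batch sampler that creates a sequence of negative items based on the
    original sequence of items (positive) that the user has interacted with.

    Args:
        user_train (dict): user-to-item-sequence mapping of the training split
        usernum (int): number of users
        itemnum (int): number of items
        batch_size (int): batch size
        maxlen (int): maximum length of a user sequence
        result_queue (multiprocessing.Queue): queue the sampled batches are pushed to
        seed (int): random seed for the sampler
    """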
@@ -0,0 +1,912 @@
{
This is a very nice function; it removes users and items with fewer than k interactions. @anargyri @simonzhaoms @angusrtaylor @aeroabir do you see us using this function in other parts of the repo?
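For reference, a generic pandas version of that kind of filter (a sketch, not the exact function in the notebook or the repo) is short enough to share here:

import pandas as pd

def filter_k_core(df, k, col_user="userID", col_item="itemID"):
    # iteratively drop users and items with fewer than k interactions until stable
    while True:
        user_counts = df[col_user].value_counts()
        item_counts = df[col_item].value_counts()
        df_filtered = df[
            df[col_user].isin(user_counts[user_counts >= k].index)
            & df[col_item].isin(item_counts[item_counts >= k].index)
        ]
        if len(df_filtered) == len(df):
            return df_filtered
        df = df_filtered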
Actually we have a function already here. You may use it instead.
@aeroabir if you want to do this change now, please let me know, otherwise we can create an issue and leave it for another PR
Looks good to me! Really great job Abir!
Description
Added SASRec (Wang-Cheng Kang, Julian McAuley (2018). Self-Attentive Sequential Recommendation. In Proceedings of the IEEE International Conference on Data Mining (ICDM'18)), coded in TF 2.x.
This is to add newer algorithms, especially the ones based on Transformers.
Related Issues
None
Checklist:
Merging into the staging branch and not to the main branch.