Add save/load & input/output schema methods to T4Rec Model class #507
Documentation preview: https://nvidia-merlin.github.io/Transformers4Rec/review/pr-507
transformers4rec/torch/model/base.py
Outdated
# At inference, we just need the predictions tensors.
# TODO: We are simplifying the logic around `hf_format` in the multi-gpu
# support work.
if not training and not self.hf_format:
Can we remove `not training` from here so that `hf_format` alone controls the output? That would make it work in systems without a model adapter wrapper class.
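The suggestion can be illustrated with a small sketch. The function names, the output dictionary, and the behaviour of the else-branch are hypothetical stand-ins rather than the actual code in `base.py`; the sketch only contrasts the condition shown in the diff with the proposed one.

```python
from typing import Any, Dict

# Hypothetical stand-in for the full output a model might assemble internally.
FULL_OUTPUT: Dict[str, Any] = {
    "loss": 0.25,
    "labels": [1, 0],
    "predictions": [0.9, 0.1],
}


def select_output_current(outputs: Dict[str, Any], training: bool, hf_format: bool):
    """Condition from the diff: bare predictions only at inference with hf_format off."""
    if not training and not hf_format:
        return outputs["predictions"]
    return outputs  # assumption: everything else returns the full dictionary


def select_output_proposed(outputs: Dict[str, Any], hf_format: bool):
    """Reviewer's proposal: hf_format alone decides what is returned."""
    if not hf_format:
        return outputs["predictions"]
    return outputs


# Compare what each variant returns in training and inference mode.
for training in (True, False):
    current = select_output_current(FULL_OUTPUT, training, hf_format=False)
    proposed = select_output_proposed(FULL_OUTPUT, hf_format=False)
    print(training, type(current).__name__, type(proposed).__name__)
```

With the proposed condition, a caller that never sets a training flag gets the same output format either way, which is the property the comment asks for.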
Fixes #499
Goals ⚽
- Add save/load methods for T4Rec models using cloudpickle, following the protocol defined in Merlin Models.
- Add an `input_schema` property to the T4Rec base class `Model` that builds the model schema from the input modules of the heads and returns a Merlin schema object.
- Add an `output_schema` property to the T4Rec base class `Model` that builds the model schema from the prediction tasks specified in the heads and returns a Merlin schema object with as many ColumnSchemas as there are prediction tasks.

Implementation Details 🚧
- Add save/load methods to the T4Rec base class `Model` using `cloudpickle`, following the same protocol proposed in Merlin Models (here).
- Add a `shape` property to the input/output schema to provide the length/shape information of list features.
- I used the code of this unit test in Merlin Systems as a starting point to convert the input T4Rec schema to a Merlin schema object.
- The output schema is built based on the prediction tasks provided to the model. The stored information is: `name`, `int_domain`, `value_counts`, `is_list`, `shape`, and `is_ragged`.

Constraints
- The format of the T4Rec model outputs is not standardized and varies a lot depending on the PredictionTask and on boolean flags such as `hf_format`. Work is in progress to simplify the output API format ([Task] Standardize the model output format. #505), which will simplify the model output at inference: a single prediction tensor is returned for single-task learning, or a dictionary of tensors where keys are the task names and values are the prediction tensors.
- The output dictionary needs to be converted to a `NamedTuple` for PyTorch serving.

Testing Details 🔍
- `test_save_next_item_prediction_model`: saves and loads the model trained with the next-item prediction task (the model used in the inference example).
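
The save/load round trip that this test exercises can be sketched as follows. This is a minimal illustration and not the code added in this PR: `ToyModel`, its `weights` attribute, and the `model.pkl` file name are hypothetical, and the fallback to the standard `pickle` module is only there so the sketch runs where `cloudpickle` is not installed.

```python
import os
import pickle
import tempfile

try:
    import cloudpickle  # the library named in the PR description
except ImportError:
    cloudpickle = pickle  # fallback so the sketch still runs without cloudpickle


class ToyModel:
    """Hypothetical stand-in for the T4Rec `Model` class."""

    def __init__(self, weights):
        self.weights = weights

    def save(self, path, file_name="model.pkl"):
        # Serialize the whole object into path/file_name (file name is an assumption).
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, file_name), "wb") as out:
            cloudpickle.dump(self, out)

    @classmethod
    def load(cls, path, file_name="model.pkl"):
        # Files written by cloudpickle are read back with the standard pickle loader.
        with open(os.path.join(path, file_name), "rb") as src:
            return pickle.load(src)


# Round trip: save into a temporary directory, then load a fresh copy.
with tempfile.TemporaryDirectory() as tmp_dir:
    model = ToyModel(weights=[0.1, 0.2, 0.3])
    model.save(tmp_dir)
    restored = ToyModel.load(tmp_dir)

print(restored.weights == model.weights)
```

Pickling the whole object keeps the example short; a real test would typically also compare the predictions of the original and the restored model on the same batch.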