-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix OOM issue in accelerate-training-of-large-embedding-tables-by-LazyAdam nb unit test #816
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Click to view CI ResultsGitHub pull request #816 of commit cbd2396177974d096eb97241165856d355b87998, no merge conflicts. Running as SYSTEM Setting status of cbd2396177974d096eb97241165856d355b87998 to PENDING with url https://10.20.13.93:8080/job/merlin_models/1556/console and message: 'Pending' Using context: Jenkins Building on master in workspace /var/jenkins_home/workspace/merlin_models using credential nvidia-merlin-bot > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url https://github.com/NVIDIA-Merlin/models/ # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/models/ > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/models/ +refs/pull/816/*:refs/remotes/origin/pr/816/* # timeout=10 > git rev-parse cbd2396177974d096eb97241165856d355b87998^{commit} # timeout=10 Checking out Revision cbd2396177974d096eb97241165856d355b87998 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f cbd2396177974d096eb97241165856d355b87998 # timeout=10 Commit message: "fix OOM issue in unit test" > git rev-list --no-walk 6ebbb78330f12a3e5bc47283faccee5d6fda506f # timeout=10 [merlin_models] $ /bin/bash /tmp/jenkins14986520480103475489.sh Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com Requirement already satisfied: testbook in /usr/local/lib/python3.8/dist-packages (0.4.2) Requirement already satisfied: nbformat>=5.0.4 in /usr/local/lib/python3.8/dist-packages (from testbook) (5.5.0) Requirement already satisfied: nbclient>=0.4.0 in /usr/local/lib/python3.8/dist-packages (from testbook) (0.6.8) Requirement already satisfied: fastjsonschema in /usr/local/lib/python3.8/dist-packages (from nbformat>=5.0.4->testbook) (2.16.1) Requirement already satisfied: jsonschema>=2.6 in /usr/local/lib/python3.8/dist-packages (from nbformat>=5.0.4->testbook) (4.16.0) Requirement already satisfied: jupyter_core in /usr/local/lib/python3.8/dist-packages (from nbformat>=5.0.4->testbook) (4.11.1) Requirement already satisfied: traitlets>=5.1 in /usr/local/lib/python3.8/dist-packages (from nbformat>=5.0.4->testbook) (5.4.0) Requirement already satisfied: jupyter-client>=6.1.5 in /usr/local/lib/python3.8/dist-packages (from nbclient>=0.4.0->testbook) (7.3.5) Requirement already satisfied: nest-asyncio in /usr/local/lib/python3.8/dist-packages (from nbclient>=0.4.0->testbook) (1.5.5) Requirement already satisfied: attrs>=17.4.0 in /usr/local/lib/python3.8/dist-packages (from jsonschema>=2.6->nbformat>=5.0.4->testbook) (22.1.0) Requirement already satisfied: importlib-resources>=1.4.0; python_version < "3.9" in /usr/local/lib/python3.8/dist-packages (from jsonschema>=2.6->nbformat>=5.0.4->testbook) (5.9.0) Requirement already satisfied: pkgutil-resolve-name>=1.3.10; python_version < "3.9" in /usr/local/lib/python3.8/dist-packages (from jsonschema>=2.6->nbformat>=5.0.4->testbook) (1.3.10) Requirement already satisfied: pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0 in /usr/local/lib/python3.8/dist-packages (from jsonschema>=2.6->nbformat>=5.0.4->testbook) (0.18.1) Requirement already satisfied: entrypoints in /usr/local/lib/python3.8/dist-packages (from jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (0.4) Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.8/dist-packages (from jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (2.8.2) Requirement already satisfied: pyzmq>=23.0 in /usr/local/lib/python3.8/dist-packages (from jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (24.0.0) Requirement already satisfied: tornado>=6.2 in /usr/local/lib/python3.8/dist-packages (from jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (6.2) Requirement already satisfied: zipp>=3.1.0; python_version < "3.10" in /usr/local/lib/python3.8/dist-packages (from importlib-resources>=1.4.0; python_version < "3.9"->jsonschema>=2.6->nbformat>=5.0.4->testbook) (3.8.1) Requirement already satisfied: six>=1.5 in /var/jenkins_home/.local/lib/python3.8/site-packages (from python-dateutil>=2.8.2->jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (1.15.0) ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/merlin_models/models, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-4.0.0 collected 774 items |
Click to view CI ResultsGitHub pull request #816 of commit 240dfe5134b84c36cb37feb0fe12e3f5eea0f7c0, no merge conflicts. Running as SYSTEM Setting status of 240dfe5134b84c36cb37feb0fe12e3f5eea0f7c0 to PENDING with url https://10.20.13.93:8080/job/merlin_models/1559/console and message: 'Pending' Using context: Jenkins Building on master in workspace /var/jenkins_home/workspace/merlin_models using credential nvidia-merlin-bot > git rev-parse --is-inside-work-tree # timeout=10 Fetching changes from the remote Git repository > git config remote.origin.url https://github.com/NVIDIA-Merlin/models/ # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/models/ > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/models/ +refs/pull/816/*:refs/remotes/origin/pr/816/* # timeout=10 > git rev-parse 240dfe5134b84c36cb37feb0fe12e3f5eea0f7c0^{commit} # timeout=10 Checking out Revision 240dfe5134b84c36cb37feb0fe12e3f5eea0f7c0 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 240dfe5134b84c36cb37feb0fe12e3f5eea0f7c0 # timeout=10 Commit message: "fix test_soft_embedding" > git rev-list --no-walk 35a1cdae38d87d9d52f70f0a0e7da003defe4d17 # timeout=10 [merlin_models] $ /bin/bash /tmp/jenkins471903919269911390.sh Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com Requirement already satisfied: testbook in /usr/local/lib/python3.8/dist-packages (0.4.2) Requirement already satisfied: nbformat>=5.0.4 in /usr/local/lib/python3.8/dist-packages (from testbook) (5.5.0) Requirement already satisfied: nbclient>=0.4.0 in /usr/local/lib/python3.8/dist-packages (from testbook) (0.6.8) Requirement already satisfied: fastjsonschema in /usr/local/lib/python3.8/dist-packages (from nbformat>=5.0.4->testbook) (2.16.1) Requirement already satisfied: jsonschema>=2.6 in /usr/local/lib/python3.8/dist-packages (from nbformat>=5.0.4->testbook) (4.16.0) Requirement already satisfied: jupyter_core in /usr/local/lib/python3.8/dist-packages (from nbformat>=5.0.4->testbook) (4.11.1) Requirement already satisfied: traitlets>=5.1 in /usr/local/lib/python3.8/dist-packages (from nbformat>=5.0.4->testbook) (5.4.0) Requirement already satisfied: jupyter-client>=6.1.5 in /usr/local/lib/python3.8/dist-packages (from nbclient>=0.4.0->testbook) (7.3.5) Requirement already satisfied: nest-asyncio in /usr/local/lib/python3.8/dist-packages (from nbclient>=0.4.0->testbook) (1.5.5) Requirement already satisfied: attrs>=17.4.0 in /usr/local/lib/python3.8/dist-packages (from jsonschema>=2.6->nbformat>=5.0.4->testbook) (22.1.0) Requirement already satisfied: importlib-resources>=1.4.0; python_version < "3.9" in /usr/local/lib/python3.8/dist-packages (from jsonschema>=2.6->nbformat>=5.0.4->testbook) (5.9.0) Requirement already satisfied: pkgutil-resolve-name>=1.3.10; python_version < "3.9" in /usr/local/lib/python3.8/dist-packages (from jsonschema>=2.6->nbformat>=5.0.4->testbook) (1.3.10) Requirement already satisfied: pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0 in /usr/local/lib/python3.8/dist-packages (from jsonschema>=2.6->nbformat>=5.0.4->testbook) (0.18.1) Requirement already satisfied: entrypoints in /usr/local/lib/python3.8/dist-packages (from jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (0.4) Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.8/dist-packages (from jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (2.8.2) Requirement already satisfied: pyzmq>=23.0 in /usr/local/lib/python3.8/dist-packages (from jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (24.0.0) Requirement already satisfied: tornado>=6.2 in /usr/local/lib/python3.8/dist-packages (from jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (6.2) Requirement already satisfied: zipp>=3.1.0; python_version < "3.10" in /usr/local/lib/python3.8/dist-packages (from importlib-resources>=1.4.0; python_version < "3.9"->jsonschema>=2.6->nbformat>=5.0.4->testbook) (3.8.1) Requirement already satisfied: six>=1.5 in /var/jenkins_home/.local/lib/python3.8/site-packages (from python-dateutil>=2.8.2->jupyter-client>=6.1.5->nbclient>=0.4.0->testbook) (1.15.0) ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.3, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/merlin_models/models, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-4.0.0 collected 774 items |
fixes OOM issue in unit test