Windows GPU test is skipping #1843

Closed
ydcjeff opened this issue Mar 22, 2021 · 4 comments · Fixed by #1985
Labels
ci CI

Comments

@ydcjeff
Contributor

ydcjeff commented Mar 22, 2021

🐛 Bug description

Windows GPU tests are currently not running on master: they are skipped.

@vfdev-5
Collaborator

vfdev-5 commented Mar 22, 2021

Thanks for reporting @ydcjeff !

@ydcjeff
Contributor Author

ydcjeff commented Mar 22, 2021

@vfdev-5 It seems that torch fails to find a CUDA GPU, or is something missing in the CircleCI config?

..\..\tools\miniconda3\lib\site-packages\torch\cuda\__init__.py:52
  c:\tools\miniconda3\lib\site-packages\torch\cuda\__init__.py:52: UserWarning: CUDA initialization: CUDA driver initialization failed, you might not have a CUDA gpu. (Triggered internally at  ..\c10\cuda\CUDAFunctions.cpp:109.)
    return torch._C._cuda_getDeviceCount() > 0

@vfdev-5
Collaborator

vfdev-5 commented Mar 22, 2021

I think this is related to the machine we are using on CircleCI: https://circleci.com/docs/2.0/configuration-reference/#available-windows-gpu-image. It has CUDA 10.1, but we install PyTorch with CUDA 11.1 support.
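
The mismatch described above can be sketched as a simple version comparison. This is a simplified illustration, not PyTorch's own check: the helper names `parse_version` and `driver_supports` are hypothetical, and the rule "the build's CUDA version must not exceed the driver's" is an assumption that ignores CUDA's minor-version compatibility and forward-compatibility packages.

```python
def parse_version(text):
    """Turn a version string such as "11.1" into a comparable tuple (11, 1)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports(build_cuda, driver_cuda):
    """Return True if the driver's CUDA version is at least the build's.

    Simplifying assumption: a plain "build <= driver" comparison, with no
    allowance for CUDA compatibility packages.
    """
    return parse_version(build_cuda) <= parse_version(driver_cuda)


# Values from this thread: PyTorch built for CUDA 11.1, CI image with CUDA 10.1.
print(driver_supports("11.1", "10.1"))  # False
print(driver_supports("10.1", "10.1"))  # True
```

Under this assumption, either installing a newer CUDA on the image or installing a PyTorch build for CUDA 10.1 would make the check pass.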

@vfdev-5
Collaborator

vfdev-5 commented Mar 22, 2021

I think we should do something like what vision does here: https://github.com/seemethere/vision/blob/master/packaging/windows/internal/cuda_install.bat

  • Install CUDA 11.1:
curl -k -L https://raw.githubusercontent.com/pytorch/vision/master/packaging/windows/internal/cuda_install.bat --output "cuda_install.bat"

set CU_VERSION="cu111"
.\cuda_install.bat

To check the driver:

"/c/Program Files/NVIDIA Corporation/NVSMI/nvidia-smi.exe"

which prints:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 426.00       Driver Version: 426.00       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+

To see the current processes (if connected with bash):

ps 
      PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
      177       1     177       5796  ?         197609 08:07:35 /tmp/6090fffd1ef947041b2e31cd-0-build/circleci-agent
      185     183     183       8924  ?         197609 08:08:02 /c/tools/miniconda3/Scripts/conda
      197     190     190       5896  ?         197609 08:10:44 /c/Program Files/Git/usr/bin/ps
      190       1     190       7064  ?         197609 08:09:28 /c/Program Files/Git/usr/bin/bash
      183       1     183       8808  ?         197609 08:08:01 /c/Program Files/Git/usr/bin/bash
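
For scripting the driver check in CI, the `nvidia-smi` header shown above could be parsed along these lines. This is a minimal sketch: the function name `parse_smi_header` and the regular expression are illustrative assumptions based only on the header format quoted in this comment, which may differ between driver releases.

```python
import re

# Matches the "Driver Version: ...  CUDA Version: ..." part of the header line.
HEADER_PATTERN = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_smi_header(output):
    """Return (driver_version, cuda_version) as strings, or None if not found."""
    match = HEADER_PATTERN.search(output)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


# Header line copied from the nvidia-smi output quoted in this thread.
sample = "| NVIDIA-SMI 426.00       Driver Version: 426.00       CUDA Version: 10.1     |"
print(parse_smi_header(sample))  # ('426.00', '10.1')
```

Returning `None` when the header is missing lets a CI step fail with a clear message instead of silently skipping the GPU tests.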

@vfdev-5 vfdev-5 added the ci CI label May 4, 2021
vfdev-5 added a commit that referenced this issue May 4, 2021
@vfdev-5 vfdev-5 mentioned this issue May 4, 2021
3 tasks
vfdev-5 added a commit that referenced this issue May 4, 2021
* Use cuda 10.1 on Windows

Fixes #1843

* Update config.yml