cudaPackages: bump default cudaPackages_11_7 -> cudaPackages_11_8 #224927

Closed
wants to merge 10,000 commits into from

nviets
Contributor

@nviets nviets commented Apr 6, 2023

Description of changes

Updated the default cudaPackages from 11.7 to 11.8 to address #222778. PyTorch doesn't support CUDA 12 yet, so this is an interim change. A nixpkgs-review with cudaSupport enabled is forthcoming.

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin

@samuela
Member

samuela commented Apr 6, 2023

Looks good... as long as nixpkgs-review is happy, we're good to go!

@nviets
Contributor Author

nviets commented Apr 6, 2023

I didn't think through the scope of nixpkgs-review for bumping CUDA. 11.8 built fine, but now it's downloading 80G worth of stuff to my machine. This may not happen tonight, but I guess we'll see.

@samuela
Member

samuela commented Apr 6, 2023

Yeah, running nixpkgs-review is the real hurdle with these version bumps

@nviets
Contributor Author

nviets commented Apr 6, 2023

Does it have a check-point mechanism? If I stop and restart tomorrow, will it pick up where it left off?

@samuela
Member

samuela commented Apr 6, 2023

Not really, but builds and downloads that finish will stay cached in your nix store (as long as you don't garbage collect them!).

@nviets
Contributor Author

nviets commented Apr 6, 2023

Oh good, thanks for confirming. I'll have to wrap it up tomorrow probably.

@SomeoneSerge
Contributor

Result of nixpkgs-review pr 224927 --extra-nixpkgs-config '{ cudaCapabilities = [ "8.6" ]; }' run on x86_64-linux

55 packages failed to build:
  • colmapWithCuda
  • cudaPackages.cuda-samples
  • cudaPackages.cudatoolkit
  • cudaPackages.cudatoolkit.doc
  • cudaPackages.cudatoolkit.lib
  • cudaPackages.cutensor
  • cudaPackages.cutensor.dev
  • cudaPackages.nvidia_driver
  • forge
  • gpu-burn
  • gromacsCudaMpi
  • gwe
  • hip-nvidia
  • hip-nvidia.doc
  • katagoWithCuda
  • librealsenseWithCuda
  • librealsenseWithCuda.dev
  • mathematica-cuda
  • nvtop
  • nvtop-nvidia
  • python310Packages.cupy
  • python310Packages.cupy.dist
  • python310Packages.jaxlibWithCuda
  • python310Packages.jaxlibWithCuda.dist
  • python310Packages.numbaWithCuda
  • python310Packages.numbaWithCuda.dist
  • python310Packages.pycuda
  • python310Packages.pycuda.dist
  • python310Packages.pynvml
  • python310Packages.pynvml.dist
  • python310Packages.pyrealsense2WithCuda
  • python310Packages.pyrealsense2WithCuda.dev
  • python310Packages.theanoWithCuda
  • python310Packages.theanoWithCuda.dist
  • python310Packages.tiny-cuda-nn
  • python310Packages.torchWithCuda
  • python310Packages.torchWithCuda.dev
  • python310Packages.torchWithCuda.dist
  • python310Packages.torchWithCuda.lib
  • python311Packages.cupy
  • python311Packages.cupy.dist
  • python311Packages.jaxlibWithCuda
  • python311Packages.jaxlibWithCuda.dist
  • python311Packages.pycuda
  • python311Packages.pycuda.dist
  • python311Packages.pynvml
  • python311Packages.pynvml.dist
  • python311Packages.pyrealsense2WithCuda
  • python311Packages.pyrealsense2WithCuda.dev
  • python311Packages.theanoWithCuda
  • python311Packages.theanoWithCuda.dist
  • truecrack-cuda
  • xgboostWithCuda
  • xpraWithNvenc
  • xpraWithNvenc.dist
44 packages built:
  • cudaPackages.cuda_cccl
  • cudaPackages.cuda_cudart
  • cudaPackages.cuda_cuobjdump
  • cudaPackages.cuda_cupti
  • cudaPackages.cuda_cuxxfilt
  • cudaPackages.cuda_demo_suite
  • cudaPackages.cuda_documentation
  • cudaPackages.cuda_gdb
  • cudaPackages.cuda_memcheck
  • cudaPackages.cuda_nsight
  • cudaPackages.cuda_nvcc
  • cudaPackages.cuda_nvdisasm
  • cudaPackages.cuda_nvml_dev
  • cudaPackages.cuda_nvprof
  • cudaPackages.cuda_nvprune
  • cudaPackages.cuda_nvrtc
  • cudaPackages.cuda_nvtx
  • cudaPackages.cuda_nvvp
  • cudaPackages.cuda_profiler_api
  • cudaPackages.cuda_sanitizer_api
  • cudaPackages.cudnn
  • cudaPackages.cudnn_8_6_0
  • cudaPackages.cudnn_8_7_0
  • cudaPackages.fabricmanager
  • cudaPackages.libcublas
  • cudaPackages.libcufft
  • cudaPackages.libcufile
  • cudaPackages.libcurand
  • cudaPackages.libcusolver
  • cudaPackages.libcusparse
  • cudaPackages.libnpp
  • cudaPackages.libnvidia_nscq
  • cudaPackages.libnvjpeg
  • cudaPackages.nccl
  • cudaPackages.nccl.dev
  • cudaPackages.nsight_compute
  • cudaPackages.nsight_systems
  • cudaPackages.nvidia_fs
  • faissWithCuda
  • faissWithCuda.demos
  • magma (magma-cuda ,magma_2_7_1)
  • magma_2_6_2
  • nvidia-thrust-cuda
  • tiny-cuda-nn
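
For reference, the value passed to --extra-nixpkgs-config in the command above is an ordinary Nix attribute set of nixpkgs config options. A rough persistent equivalent, assuming a user-level ~/.config/nixpkgs/config.nix, might look like the sketch below. Only cudaCapabilities appears in the command; allowUnfree and cudaSupport are assumptions added here because CUDA packages are unfree and the PR description mentions reviewing with cudaSupport enabled.

```nix
# ~/.config/nixpkgs/config.nix
# Sketch only: mirrors the review command's config; not taken from the PR itself.
{
  # Assumption: needed to evaluate unfree CUDA packages outside nixpkgs-review.
  allowUnfree = true;

  # Assumption: turn on CUDA variants of packages globally.
  cudaSupport = true;

  # From the review command: build device code for compute capability 8.6 only
  # (for example, GeForce RTX 30-series cards), which narrows what gets compiled.
  cudaCapabilities = [ "8.6" ];
}
```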

@SomeoneSerge
Contributor

Failed derivations

@nviets
Contributor Author

nviets commented Apr 6, 2023

Thanks @SomeoneSerge. The failure list looks pretty bad. In the case of xgboost (which I see in the fail list), I had intended to use the cudaPackages argument to fix the package to something that worked, and I already know the next version mostly works with 11.8.

What do you suggest we do now?
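
The per-package pin described above could be sketched as an overlay. This is an illustration under assumptions, not the change proposed in this PR: it assumes xgboost accepts a cudaPackages argument (as the comment implies) and that cudaPackages_11_7 is still available in the package set.

```nix
# overlay.nix
# Hypothetical overlay: keeps xgboostWithCuda on CUDA 11.7 while the
# default cudaPackages moves to 11.8.
final: prev: {
  xgboostWithCuda = prev.xgboostWithCuda.override {
    # Assumption: the xgboost expression takes `cudaPackages` as an argument.
    cudaPackages = final.cudaPackages_11_7;
  };
}
```

Inside nixpkgs itself, the same effect would more likely come from passing cudaPackages where the package is instantiated, which is a maintainer decision rather than something this sketch settles.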

@SomeoneSerge
Contributor

@nviets hmmm, I don't see most of the build logs, got to retrieve them somehow now xD

@SomeoneSerge
Contributor

Oh!

cudatoolkit-11.8.0.drv (derivation hash: w6qvxzm9x48c1mnliivvrd59w6390p5f)

It's #224986 as well 🤦🏻

@nviets
Contributor Author

nviets commented Apr 6, 2023

Will #224986 directly fix cudatoolkit? If it works, do you want to bump to 11_8 or go straight to 12?

@SomeoneSerge
Contributor

@nviets that should fix the patchelf issues in cudaPackages_11_8 and cudaPackages_12, so you should be able to finish this PR: you re-base on master, and we re-run nixpkgs-review

I'd say we keep waiting for pytorch and such before we do cudaPackages_12

@nviets
Contributor Author

nviets commented Apr 6, 2023

Given PyTorch's sensitivity to the CUDA version, would it make sense to make it configurable there?

@SomeoneSerge
Contributor

> Given PyTorch's sensitivity to the CUDA version, would it make sense to make it configurable there?

It is, one can always torch.override { cudaPackages = cudaPackages_11_7; }
We just want to have a working and up-to-date default, which is also what we build binary cache for
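
A slightly fuller version of that override, written as a standalone shell.nix, might look like the following. It is a sketch under assumptions: the file name, the use of <nixpkgs>, and the mkShell wrapper are illustrative choices, and a custom pin like this is unlikely to be covered by the binary cache, so expect a local build. Dependencies of torch that are themselves built against CUDA (magma, for instance) may need the same pin; that is not handled here.

```nix
# shell.nix
# Sketch only: pins torch to CUDA 11.7 regardless of the default cudaPackages.
let
  pkgs = import <nixpkgs> {
    config = {
      allowUnfree = true;  # CUDA packages are unfree
      cudaSupport = true;  # build CUDA-enabled variants
    };
  };

  # Override torch inside the Python package set so that anything
  # depending on torch picks up the pinned build as well.
  python = pkgs.python3.override {
    packageOverrides = pyFinal: pyPrev: {
      torch = pyPrev.torch.override {
        cudaPackages = pkgs.cudaPackages_11_7;
      };
    };
  };
in
pkgs.mkShell {
  packages = [ (python.withPackages (ps: [ ps.torch ])) ];
}
```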

@SomeoneSerge
Contributor

Result of nixpkgs-review pr 224927 --extra-nixpkgs-config '{ cudaCapabilities = [ "8.6" ]; }' run on x86_64-linux

13 packages failed to build:
  • cudaPackages.cuda-samples
  • cudaPackages.nvidia_driver
  • mathematica-cuda
  • python310Packages.jaxlibWithCuda
  • python310Packages.jaxlibWithCuda.dist
  • python310Packages.theanoWithCuda
  • python310Packages.theanoWithCuda.dist
  • python310Packages.tiny-cuda-nn
  • python311Packages.jaxlibWithCuda
  • python311Packages.jaxlibWithCuda.dist
  • python311Packages.theanoWithCuda
  • python311Packages.theanoWithCuda.dist
  • truecrack-cuda
86 packages built:
  • colmapWithCuda
  • cudaPackages.cuda_cccl
  • cudaPackages.cuda_cudart
  • cudaPackages.cuda_cuobjdump
  • cudaPackages.cuda_cupti
  • cudaPackages.cuda_cuxxfilt
  • cudaPackages.cuda_demo_suite
  • cudaPackages.cuda_documentation
  • cudaPackages.cuda_gdb
  • cudaPackages.cuda_memcheck
  • cudaPackages.cuda_nsight
  • cudaPackages.cuda_nvcc
  • cudaPackages.cuda_nvdisasm
  • cudaPackages.cuda_nvml_dev
  • cudaPackages.cuda_nvprof
  • cudaPackages.cuda_nvprune
  • cudaPackages.cuda_nvrtc
  • cudaPackages.cuda_nvtx
  • cudaPackages.cuda_nvvp
  • cudaPackages.cuda_profiler_api
  • cudaPackages.cuda_sanitizer_api
  • cudaPackages.cudatoolkit
  • cudaPackages.cudatoolkit.doc
  • cudaPackages.cudatoolkit.lib
  • cudaPackages.cudnn
  • cudaPackages.cudnn_8_6_0
  • cudaPackages.cudnn_8_7_0
  • cudaPackages.cutensor
  • cudaPackages.cutensor.dev
  • cudaPackages.fabricmanager
  • cudaPackages.libcublas
  • cudaPackages.libcufft
  • cudaPackages.libcufile
  • cudaPackages.libcurand
  • cudaPackages.libcusolver
  • cudaPackages.libcusparse
  • cudaPackages.libnpp
  • cudaPackages.libnvidia_nscq
  • cudaPackages.libnvjpeg
  • cudaPackages.nccl
  • cudaPackages.nccl.dev
  • cudaPackages.nsight_compute
  • cudaPackages.nsight_systems
  • cudaPackages.nvidia_fs
  • faissWithCuda
  • faissWithCuda.demos
  • forge
  • gpu-burn
  • gromacsCudaMpi
  • gwe
  • hip-nvidia
  • hip-nvidia.doc
  • katagoWithCuda
  • librealsenseWithCuda
  • librealsenseWithCuda.dev
  • magma (magma-cuda ,magma_2_7_1)
  • magma_2_6_2
  • nvidia-thrust-cuda
  • nvtop
  • nvtop-nvidia
  • python310Packages.cupy
  • python310Packages.cupy.dist
  • python310Packages.numbaWithCuda
  • python310Packages.numbaWithCuda.dist
  • python310Packages.pycuda
  • python310Packages.pycuda.dist
  • python310Packages.pynvml
  • python310Packages.pynvml.dist
  • python310Packages.pyrealsense2WithCuda
  • python310Packages.pyrealsense2WithCuda.dev
  • python310Packages.torchWithCuda
  • python310Packages.torchWithCuda.dev
  • python310Packages.torchWithCuda.dist
  • python310Packages.torchWithCuda.lib
  • python311Packages.cupy
  • python311Packages.cupy.dist
  • python311Packages.pycuda
  • python311Packages.pycuda.dist
  • python311Packages.pynvml
  • python311Packages.pynvml.dist
  • python311Packages.pyrealsense2WithCuda
  • python311Packages.pyrealsense2WithCuda.dev
  • tiny-cuda-nn
  • xgboostWithCuda
  • xpraWithNvenc
  • xpraWithNvenc.dist

@nviets
Contributor Author

nviets commented Apr 8, 2023

Looking much better! How can I see the logs from your build? Sorry I couldn't help more after the initial commit, but my computer doesn't have the horsepower to build all of the downstream stuff.

@SomeoneSerge
Contributor

They were supposed to be automatically published, but smth went wrong and I'm omw to get lost in a Finnish "forest"/national park so it's hard for me to check up on it xD

Maybe somebody could rebuild individual packages, cc @samuela @nixos/cuda-maintainers

@SomeoneSerge
Contributor

Fwiw jaxlib had been failing last time I checked, idk if anyone fixed it yet

@SomeoneSerge
Contributor

The other failures aren't new either

@nviets
Contributor Author

nviets commented Apr 9, 2023

Sounds like an awesome trip!

Thanks for running another nixpkgs-review. If this PR isn't breaking anything new, I think it's ready for review. Happy to receive more feedback.

@nviets nviets marked this pull request as ready for review April 9, 2023 01:04
@samuela
Member

samuela commented Apr 10, 2023

tried running nixpkgs-review, ran into Mic92/nixpkgs-review#328

@piegamesde
Member

It looks like you accidentally mass-pinged a bunch of people, who are now subscribed
and getting notifications for everything in this pull request. Unfortunately, they
cannot be automatically unsubscribed from the issue (removing review request does not
unsubscribe), therefore development cannot continue in this pull request anymore.

Please open a new pull request with your changes, link back to this one and ping the
people actually involved in here over there.

In order to avoid this in the future, there are instructions for how to properly
rebase between branches in our contribution guidelines.
Setting your pull request to draft prior to rebasing is strongly recommended.
In draft status, you can preview the list of people that are about to be requested
for review, which allows you to sidestep this issue.
This is not a bulletproof method though, as OfBorg still does review requests even on draft PRs.

@NixOS NixOS locked and limited conversation to collaborators Jun 19, 2023