Describe the bug
NVIDIA Docker (`virtualisation.docker.enableNvidia`) cannot be used with the default NixOS options because libnvidia-container does not support cgroup v2. The container refuses to start because of this runtime error:
$ nvidia-docker run -it -p 3000:3000 mycroft/mimic2:gpu
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #1:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: container error: cgroup subsystem devices not found: unknown.
ERRO[0003] error waiting for container: context canceled
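Before changing any configuration, it can help to confirm that the host really is on the unified (v2) hierarchy. The following is a minimal sketch, assuming GNU `stat` is available; the function name `cgroup_version` is made up for illustration and is not part of any tool mentioned here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map the filesystem type of the cgroup mount
# to a cgroup version. cgroup2fs means the unified (v2) hierarchy;
# tmpfs is what a v1 or hybrid layout normally shows.
# An explicit filesystem type can be passed as $1 instead of probing.
cgroup_version() {
  local fstype="${1:-$(stat -fc %T /sys/fs/cgroup 2>/dev/null)}"
  case "$fstype" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}
```

Calling `cgroup_version` with no argument inspects the running system. A result of `v2` matches the situation described in this issue.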
I encountered this issue exactly because I'm running rootless docker with the nvidia runtime using your usernetes. Everything works, except that I have to set `no-cgroups = true` in `/etc/nvidia-container-runtime/config.toml`.
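To see whether that setting is currently active, the config file can be searched for an uncommented `no-cgroups` line. This is only a sketch: the helper name `show_no_cgroups` is hypothetical, and the default path is the one quoted above, which may well differ on NixOS.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the active (uncommented) no-cgroups line
# from an nvidia-container-runtime config file, if there is one.
# The default path is the one quoted above; pass another path as $1.
show_no_cgroups() {
  local config="${1:-/etc/nvidia-container-runtime/config.toml}"
  grep -E '^[[:space:]]*no-cgroups[[:space:]]*=' "$config" \
    || echo "no-cgroups is not set in $config"
}
```

A commented-out line such as `#no-cgroups = false` is deliberately not matched, so the helper reports the option as unset in that case.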
To Reproduce
Steps to reproduce the behavior:
Clone the `mycroft/mimic2` repository and enter the directory. Any repository or Docker image with GPU requirements would do.
# a list of nixpkgs attributes affected by the problem
attribute:
- systemd.enableUnifiedCgroupHierarchy
- virtualisation.docker.enableNvidia
# a list of nixos modules affected by the problem
module:
- systemd
- nvidia-docker
Took an entire day to find the linked comment [1] by @biggs, which says:
> Fix on NixOS (where cgroup v2 is also now default): add
> `systemd.enableUnifiedCgroupHierarchy = false;`
> and restart.
Indeed, after applying this commit and then running
`sudo systemctl restart docker`, any of the following commands works:
```bash
sudo docker run --gpus=all nvidia/cuda:10.0-runtime nvidia-smi
sudo docker run --runtime=nvidia nvidia/cuda:10.0-runtime nvidia-smi
sudo nvidia-docker run nvidia/cuda:10.0-runtime nvidia-smi
```
ARGH!!!1
Links:
[1] NVIDIA/nvidia-docker#1447 (comment)
[2] NixOS/nixpkgs#127146
[3] NixOS/nixpkgs#73800
[4] https://blog.zentria.company/posts/nixos-cgroupsv2/
P.S.
I use Colemak, but typing arstarstarst doesn't have the same ring to it.
Describe the bug
NVIDIA Docker (`virtualisation.docker.enableNvidia`) cannot be used with the default NixOS options because libnvidia-container does not support cgroup v2. The container refuses to start because of this runtime error.
There are two potential solutions, as described in NVIDIA/libnvidia-container#111 (comment):
- use cgroup v1 (`systemd.enableUnifiedCgroupHierarchy = false;`)
- change the `nvidia-container-runtime` configuration, per "Non-default nvidia-container-runtime-hook config file", NVIDIA/nvidia-container-runtime#47 (comment).
To Reproduce
Steps to reproduce the behavior:
1. Clone the `mycroft/mimic2` repository and enter the directory. Any repository or Docker image with GPU requirements would do.
2. Run `docker build -t mycroft/mimic2:gpu -f gpu.Dockerfile .`
3. Run `nvidia-docker run -it -p 3000:3000 mycroft/mimic2:gpu`.
Expected behavior
Run happily ever after.
Metadata
Please run `nix-shell -p nix-info --run "nix-info -m"` and paste the result.
Maintainer information: