| 22 Apr 2025 |
luke-skywalker | ereslibre: where exactly should I open an issue? On the nixpkgs GitHub? If so, how do I indicate that it is about the nvidia-container-toolkit?
So far, the attached config got me to:
1. docker (+compose) runtime working with CUDA workloads:
```console
❯ docker run --rm --device=nvidia.com/gpu=all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
Tue Apr 22 11:08:39 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07 Driver Version: 570.133.07 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3080 Ti On | 00000000:01:00.0 On | N/A |
| 0% 50C P8 42W / 350W | 773MiB / 12288MiB | 20% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
```
2. containerd runtime directly running CUDA containers
3. rke2 running with a `config.toml` that points to all needed runtimes in the Nix store:
```nix
hardware.nvidia-container-toolkit = {
  enable = true;
  # package = pkgs.nvidia-container-toolkit;
  # Use UUID for device naming - better for multi-GPU setups
  device-name-strategy = "uuid"; # one of "index", "uuid", "type-index"
  # Mount additional directories for compatibility
  mount-nvidia-docker-1-directories = true;
  # Mount NVIDIA executables into container
  mount-nvidia-executables = true;
};

hardware.nvidia = {
  modesetting.enable = true;
  nvidiaPersistenced = true;
};

services.rke2 = {
  enable = true;
  role = "server";
  nodeName = "workstation-0";
  cni = "canal"; # | canal
  # Set the node IP directly
  nodeIP = "${systemProfile.network.staticIP}";
  debug = true;
  # Set cluster CIDR ranges properly
  extraFlags = [
    "--kubelet-arg=cgroup-driver=systemd"
    "--cluster-cidr=10.42.0.0/16"
    "--service-cidr=10.43.0.0/16"
    "--disable-cloud-controller" # Disable cloud controller for bare metal
    # "--kubelet-arg=feature-gates=DevicePlugins=true" # Add this for device plugins
  ];
  disable = [ "traefik" ]; # "servicelb"
  # environmentVars = {
  #   NVIDIA_VISIBLE_DEVICES = "all";
  #   NVIDIA_DRIVER_CAPABILITIES = "all";
  #   # Set NVIDIA driver root to the standard location
  #   # NVIDIA_DRIVER_ROOT = "/usr/lib/nvidia";
  #   # Home directory for RKE2
  #   HOME = "/root";
  # };
};
```
`/var/lib/rancher/rke2/agent/etc/containerd/config.toml`:
```toml
# File generated by rke2. DO NOT EDIT. Use config.toml.tmpl instead.
version = 3
root = "/var/lib/rancher/rke2/agent/containerd"
state = "/run/k3s/containerd"
[grpc]
address = "/run/k3s/containerd/containerd.sock"
[plugins.'io.containerd.internal.v1.opt']
path = "/var/lib/rancher/rke2/agent/containerd"
[plugins.'io.containerd.grpc.v1.cri']
stream_server_address = "127.0.0.1"
stream_server_port = "10010"
[plugins.'io.containerd.cri.v1.runtime']
enable_selinux = false
enable_unprivileged_ports = true
enable_unprivileged_icmp = true
device_ownership_from_security_context = false
[plugins.'io.containerd.cri.v1.images']
snapshotter = "overlayfs"
disable_snapshot_annotations = true
[plugins.'io.containerd.cri.v1.images'.pinned_images]
sandbox = "index.docker.io/rancher/mirrored-pause:3.6"
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runhcs-wcow-process]
runtime_type = "io.containerd.runhcs.v1"
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.'nvidia']
runtime_type = "io.containerd.runc.v2"
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.'nvidia'.options]
BinaryName = "/var/lib/rancher/rke2/data/v1.31.7-rke2r1-7f85e977b85d/bin/nvidia-container-runtime"
SystemdCgroup = true
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.'nvidia-cdi']
runtime_type = "io.containerd.runc.v2"
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.'nvidia-cdi'.options]
BinaryName = "/var/lib/rancher/rke2/data/v1.31.7-rke2r1-7f85e977b85d/bin/nvidia-container-runtime.cdi"
SystemdCgroup = true
[plugins.'io.containerd.cri.v1.images'.registry]
config_path = "/var/lib/rancher/rke2/agent/etc/containerd/certs.d"
```

```console
❯ lsmod | grep nvidia
nvidia_drm 139264 81
nvidia_modeset 1830912 26 nvidia_drm
nvidia_uvm 3817472 2
nvidia 97120256 533 nvidia_uvm,nvidia_modeset
video 81920 2 asus_wmi,nvidia_modeset
drm_ttm_helper 20480 2 nvidia_drm
```
However, when trying to deploy the NVIDIA device plugin (via the rke2 operator, as a plain DaemonSet, or as the Helm chart from the nvidia-device-plugin repo), it fails to detect the CUDA environment, for example by complaining about the "auto" strategy.
| 11:08:46 |
luke-skywalker | The device plugin pod kube-system/nvidia-device-plugin-daemonset-j8rmc logs:
```
I0422 11:10:39.192906 1 main.go:235] "Starting NVIDIA Device Plugin" version=<
3c378193
commit: 3c378193fcebf6e955f0d65bd6f2aeed099ad8ea
>
I0422 11:10:39.193038 1 main.go:238] Starting FS watcher for /var/lib/kubelet/device-plugins
I0422 11:10:39.193372 1 main.go:245] Starting OS watcher.
I0422 11:10:39.193730 1 main.go:260] Starting Plugins.
I0422 11:10:39.193772 1 main.go:317] Loading configuration.
I0422 11:10:39.194842 1 main.go:342] Updating config with default resource matching patterns.
I0422 11:10:39.195036 1 main.go:353]
Running with config:
{
"version": "v1",
"flags": {
"migStrategy": "none",
"failOnInitError": false,
"mpsRoot": "",
"nvidiaDriverRoot": "/",
"nvidiaDevRoot": "/",
"gdsEnabled": false,
"mofedEnabled": false,
"useNodeFeatureAPI": null,
"deviceDiscoveryStrategy": "auto",
"plugin": {
"passDeviceSpecs": false,
"deviceListStrategy": [
"envvar"
],
"deviceIDStrategy": "uuid",
"cdiAnnotationPrefix": "cdi.k8s.io/",
"nvidiaCTKPath": "/usr/bin/nvidia-ctk",
"containerDriverRoot": "/driver-root"
}
},
"resources": {
"gpus": [
{
"pattern": "*",
"name": "nvidia.com/gpu"
}
]
},
"sharing": {
"timeSlicing": {}
},
"imex": {}
}
I0422 11:10:39.195045 1 main.go:356] Retrieving plugins.
E0422 11:10:39.195368 1 factory.go:112] Incompatible strategy detected auto
E0422 11:10:39.195374 1 factory.go:113] If this is a GPU node, did you configure the NVIDIA Container Toolkit?
E0422 11:10:39.195378 1 factory.go:114] You can check the prerequisites at: https://github.com/NVIDIA/k8s-device-plugin#prerequisites
E0422 11:10:39.195382 1 factory.go:115] You can learn how to set the runtime at: https://github.com/NVIDIA/k8s-device-plugin#quick-start
E0422 11:10:39.195385 1 factory.go:116] If this is not a GPU node, you should set up a toleration or nodeSelector to only deploy this plugin on GPU nodes
I0422 11:10:39.195390 1 main.go:381] No devices found. Waiting indefinitely.
```
| 11:14:24 |
luke-skywalker | OK, I think setting
"--default-runtime=nvidia"
"--node-label=nvidia.com/gpu.present=true"
let the rke2 server find two NVIDIA runtimes.
Now I get a completely different error, about an undefined symbol in the glibc being used.
Feels like getting closer though 🙂
| 13:39:48 |
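For reference, those two flags presumably just extend the `services.rke2.extraFlags` list from the config above; a minimal, untested sketch (the flag values are taken verbatim from the message):

```nix
services.rke2.extraFlags = [
  "--kubelet-arg=cgroup-driver=systemd"
  "--cluster-cidr=10.42.0.0/16"
  "--service-cidr=10.43.0.0/16"
  "--disable-cloud-controller"
  # Make containerd's 'nvidia' runtime the default and label the node as a GPU node,
  # so device-plugin deployments selecting on nvidia.com/gpu.present land here.
  "--default-runtime=nvidia"
  "--node-label=nvidia.com/gpu.present=true"
];
```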
ereslibre | luke-skywalker: you can open an issue in nixpkgs with the following template: https://github.com/NixOS/nixpkgs/issues/new?template=03_bug_report_nixos.yml
You can use something like “nixos/nvidia-container-toolkit: containerd does not honor CDI specs” instead of “nixos/MODULENAME: BUG TITLE”. | 19:22:57 |
ereslibre | Looks like you will need something along the lines of https://github.com/cncf-tags/container-device-interface?tab=readme-ov-file#containerd-configuration | 19:38:49 |
ereslibre | I have to reproduce the issue myself and verify the fix. When done, I am positive that we can also automate this on the NixOS module side. | 19:39:32 |
luke-skywalker | OK, finally, after 4 days of falling from one rabbit hole into the next deeper one, I was able to deploy the NVIDIA device plugin onto my initial cluster node and run CUDA workloads 🥳
I will now proceed to replicate the setup on a second machine with another NVIDIA GPU, join it to the cluster, and see if I can do pipeline-parallelised vLLM inference. | 20:44:24 |
luke-skywalker | yes that was indeed the last missing piece of the puzzle! | 20:45:13 |
luke-skywalker | ereslibre: interestingly enough, I was so far only able to successfully deploy the DaemonSet with v0.14 and v0.15. Using the latest v0.17 results in a glibc error. | 22:12:36 |
luke-skywalker | 🙏 Thx for pointing me to this. I have been scratching my head about what the right channel and format is to give feedback to the NixOS project. | 22:13:27 |
SomeoneSerge (back on matrix) | connor (he/him) (UTC-7): look familiar? https://mastodon.social/@effinbirds/114383881424822335 | 23:06:37 |
| 23 Apr 2025 |
ereslibre | Glad it worked! :) | 05:48:55 |
SomeoneSerge (back on matrix) | luke-skywalker: looking forward to reading the blog post xD | 12:03:57 |
luke-skywalker | Blog post?
Shouldn't everyone have the joy of fighting through those dungeons of rabbit holes and coming out the other end with some awesome loot?
Will do when I find the time to write it down as a guide / article or make a PR to either rke2 or nvidia-container-toolkit. Might even wrap it into its own system module. But the main constraint is available time, since this is just one of the stepping stones to a system to federate distributed "AI" capabilities.
Don't actually want to be too public before I have a working "kernel" of the envisioned system. | 15:22:21 |
ereslibre | I might be able to open a PR to enable CDI on containerd this weekend | 21:14:48 |
luke-skywalker | FYI: with the virtualisation.containerd module (not the one used by rke2), it already works out of the box. | 21:16:15 |
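A minimal sketch of that standalone setup as described, using the same toolkit option as the config above and the NixOS-managed containerd rather than the one bundled with rke2:

```nix
{
  # hardware.nvidia-container-toolkit generates CDI specs for the GPUs;
  # per the report above, the NixOS virtualisation.containerd instance
  # picks them up without further configuration.
  hardware.nvidia-container-toolkit.enable = true;
  virtualisation.containerd.enable = true;
}
```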
| 24 Apr 2025 |
ereslibre | luke-skywalker: unless I’m missing something, nothing is setting https://github.com/cncf-tags/container-device-interface?tab=readme-ov-file#containerd-configuration, right? You had to do this manually, right? | 06:06:26 |
luke-skywalker | Funnily, from all the detours I took to make it work, I thought rke2 was doing that, but I think I must have done it by hand and forgotten about it.
So yes, you need to provide a config.toml.tmpl with an nvidia-cdi runtime defined, pointing to the runtime binary, and set [plugins."io.containerd.grpc.v1.cri".cdi].
Could you give me the TL;DR on why using image: nvcr.io/nvidia/k8s-device-plugin:v0.17.x fails with a glibc issue?
My understanding is that it was built with a newer version of glibc than the one on my system (2.40)? Any way to solve this, or should I simply stick to v0.16.x until the glibc version on the NixOS unstable channel is compatible again?
| 12:20:36 |
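A rough sketch of the kind of config.toml.tmpl additions being described, combining the CDI switch from the container-device-interface README linked earlier with the nvidia-cdi runtime entry that rke2 already generates. Table names follow the README's layout and will differ for the version = 3 file shown above, so treat this purely as an illustration, not a drop-in file:

```toml
# Sketch only: enable CDI in the CRI plugin and define a CDI-aware NVIDIA runtime
# pointing at rke2's bundled binary (path copied from the generated config above).
[plugins."io.containerd.grpc.v1.cri"]
  enable_cdi = true
  cdi_spec_dirs = ["/etc/cdi", "/var/run/cdi"]

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia-cdi"]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."nvidia-cdi".options]
  BinaryName = "/var/lib/rancher/rke2/data/v1.31.7-rke2r1-7f85e977b85d/bin/nvidia-container-runtime.cdi"
  SystemdCgroup = true
```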
luke-skywalker | Ooh, interesting. How does it compare to vLLM?
I see it supports device maps. Is that for pipeline parallelism, so GPU devices on different nodes / machines as well?
Is it somehow affiliated with Mistral AI, or what's the reason for the name of the library? ;) | 14:12:59 |
luke-skywalker | Also, does that work on k8s clusters? 🤔 | 14:13:43 |
Gaétan Lepage | I haven't used it much myself, as I don't own a big enough GPU.
As far as I know, it is not affiliated with Mistral (the company). I guess it's the same as with "ollama" and Llama (Meta). | 15:43:10 |
luke-skywalker | It keeps getting better, though. I have now switched from the DaemonSet deployment of the device plugin to a Helm deployment with custom values. This made it possible to also enable time slicing of the available GPU 🥳 | 16:10:02 |
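For context, the time-slicing knob lives in the device plugin's own config file, the same structure that shows up as `sharing.timeSlicing` in the pod log further up. A hedged sketch of such a config, to be passed to the Helm chart as a custom config (field names are from memory of the upstream nvidia-device-plugin docs, so verify them there):

```yaml
# Sketch: advertise several schedulable nvidia.com/gpu slots per physical GPU.
version: v1
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        replicas: 4
```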
luke-skywalker | Thx for the info; yeah, the same as with ollama was my assumption.
Guess I'll stick to the vLLM deployment with Helm on k8s. | 16:37:54 |