| 21 Jul 2024 |
hexa | yes, why aren't we? 😄 | 23:57:25 |
hexa | you see that a lot of work goes into discovery of these | 23:57:33 |
@adam:robins.wtf | Jul 21 19:43:20 sink1 ollama[3567]: time=2024-07-21T19:43:20.566-04:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm]"
Jul 21 19:43:20 sink1 ollama[3567]: time=2024-07-21T19:43:20.566-04:00 level=INFO source=gpu.go:205 msg="looking for compatible GPUs"
Jul 21 19:43:20 sink1 ollama[3567]: time=2024-07-21T19:43:20.567-04:00 level=WARN source=amd_linux.go:58 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
Jul 21 19:43:20 sink1 ollama[3567]: time=2024-07-21T19:43:20.568-04:00 level=INFO source=amd_linux.go:333 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=10.3.0
Jul 21 19:43:20 sink1 ollama[3567]: time=2024-07-21T19:43:20.568-04:00 level=INFO source=types.go:103 msg="inference compute" id=0 library=rocm compute=gfx1031 driver=0.0 name=1002:73df total="12.0 GiB" available="9.8 GiB"
| 23:59:37 |
| 22 Jul 2024 |
hexa | sus | 00:01:56 |
hexa | can you post deviceallow/policy? | 00:02:04 |
@adam:robins.wtf | ❯ sudo systemctl cat ollama | rg Device
DeviceAllow=/dev/nvidia?
DeviceAllow=/dev/nvidia-caps/nvidia-cap?
DeviceAllow=/dev/nvidiactl
DeviceAllow=/dev/nvidia-modeset
DeviceAllow=/dev/nvidia-uvm
DeviceAllow=/dev/nvidia-uvm-tools
DeviceAllow=/dev/dri/card*
DeviceAllow=/dev/dri/renderD*
DeviceAllow=/dev/kfd
DevicePolicy=closed
| 00:02:54 |
hexa | ok, that is wild | 00:03:56 |
hexa | did you set up anything specific for rocm? | 00:05:03 |
hexa | level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm]"
level=WARN source=amd_linux.go:58 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
level=INFO source=amd_linux.go:333 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=10.3.0
level=INFO source=types.go:98 msg="inference compute" id=0 library=rocm compute=gfx1010 driver=0.0 name=1002:731f total="8.0 GiB" available="6.5 GiB"
| 00:07:46 |
hexa | hah! | 00:07:47 |
hexa | diff --git a/nixos/modules/services/misc/ollama.nix b/nixos/modules/services/misc/ollama.nix
index 63ee6798a6dd..d7cabb9af497 100644
--- a/nixos/modules/services/misc/ollama.nix
+++ b/nixos/modules/services/misc/ollama.nix
@@ -183,16 +183,12 @@ in
         DeviceAllow = [
           # CUDA
           # https://docs.nvidia.com/dgx/pdf/dgx-os-5-user-guide.pdf
-          "/dev/nvidia?"
-          "/dev/nvidia-caps/nvidia-cap?"
-          "/dev/nvidiactl"
-          "/dev/nvidia-modeset"
-          "/dev/nvidia-uvm"
-          "/dev/nvidia-uvm-tools"
+          "char-nvidiactl"
+          "char-nvidia-caps"
+          "char-nvidia-uvm"
           # ROCm
-          "/dev/dri/card*"
-          "/dev/dri/renderD*"
-          "/dev/kfd"
+          "char-drm"
+          "char-kfd"
         ];
         DevicePolicy = "closed";
         LockPersonality = true;
| 00:08:01 |
hexa | device node type matching works better for me | 00:08:27 |
hexa | also more concise, less pattern matching | 00:08:34 |
hexa | updated the PR, please retest so we can see if it works for both of us now | 00:09:33 |
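
When retesting, it may help to check which `char-` group names actually exist on the host, since systemd resolves `DeviceAllow=char-<name>` against the character device names listed in `/proc/devices`. The following is a minimal sketch, not part of the PR; `list_char_groups` is a hypothetical helper name, and which names show up depends on the drivers loaded on the machine.

```shell
# Print the "char-<name>" specifiers usable in DeviceAllow=, one per line.
# Reads /proc/devices by default; pass another file to inspect a saved copy.
list_char_groups() {
  awk '
    /^Character devices:/ { in_char = 1; next }   # start of the char section
    /^Block devices:/     { in_char = 0 }         # block section: stop printing
    in_char && NF == 2    { print "char-" $2 }    # lines look like "226 drm"
  ' "${1:-/proc/devices}"
}

# Show only the groups relevant to the ollama unit discussed above.
if [ -r /proc/devices ]; then
  list_char_groups | grep -E '^char-(drm|kfd|nvidia)' || true
fi
```

If `char-drm` and `char-kfd` are printed on the ROCm box, the new `DeviceAllow` entries from the diff have something to match; an empty result would suggest the corresponding kernel module is not loaded.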