
NixOS CUDA

288 Members · 57 Servers

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda



Sender · Message · Time
9 Jun 2024
@glepage:matrix.orgGaétan LepageOh, I've just seen your message20:53:58
10 Jun 2024
@mjolnir:nixos.orgNixOS Moderation Bot unbanned @jonringer:matrix.org@jonringer:matrix.org.00:17:14
@glepage:matrix.orgGaétan Lepage [image: clipboard.png]06:44:40
@glepage:matrix.orgGaétan LepageHaha botorch has probably taken ~11h but it succeeded X)06:44:56
@shekhinah:she.khinah.xyzshekhinah set their display name to yaldebaoth.11:02:59
@shekhinah:she.khinah.xyzshekhinah changed their display name from yaldebaoth to yaldabaoth.11:03:43
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) Gaétan Lepage: did you mention there was a PR or something merged to disable the checkPhase or test suite for botorch, or did I misunderstand? 14:01:56
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) On another note, has anyone built elpa (https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/libraries/elpa/default.nix) successfully with CUDA support? I let it run for like 20h and it was still building. Seems to compile four object files at a time? 14:04:03
@glepage:matrix.orgGaétan Lepage
In reply to @connorbaker:matrix.org
Gaétan Lepage: did you mention there was a PR or something merged to disable the checkPhase or test suite for botorch, or did I misunderstand?
No, I have not done anything. I was actually able to build it just fine from master earlier today.
14:29:01
@hexa:lossy.networkhexa Gaétan Lepage: have you considered pulling this patch for tensorflow-bin? https://github.com/tensorflow/tensorflow/issues/58073#issuecomment-2097055553 20:58:34
11 Jun 2024
@keiichi:matrix.orgteto when using localai 2.15 from unstable and even after a reboot I get ggml_cuda_init: failed to initialize CUDA: CUDA driver is a stub library. It's a bit random but if anyone has a tip, I take it. nvidia-smi output looks fine 00:25:38
@glepage:matrix.orgGaétan Lepage
In reply to @hexa:lossy.network
Gaétan Lepage: have you considered pulling this patch for tensorflow-bin? https://github.com/tensorflow/tensorflow/issues/58073#issuecomment-2097055553
This looks like it could work!
However, how do you apply a patch to a wheel-type python derivation?
06:38:47
@glepage:matrix.orgGaétan Lepage What phase of the buildPythonPackage script should I hook it into? 06:39:02
@glepage:matrix.orgGaétan Lepage I tried patches = [ but it does not work 06:39:15
@glepage:matrix.orgGaétan Lepage

I am packaging this: https://github.com/EricLBuehler/mistral.rs?tab=readme-ov-file#installation-and-build
You can see that it supports several variations for building (CUDA, metal, mkl...)

-> What should be the approach? Adding cudaSupport? metalSupport? mklSupport?

07:01:41
@kaya:catnip.eekaya 𖤐 changed their profile picture.08:03:48
@hexa:lossy.networkhexa
In reply to @glepage:matrix.org
This looks like it could work!
However, how do you apply a patch to a wheel-type python derivation?
likely in postInstall 😕
11:58:18
@hexa:lossy.networkhexacurses11:59:30
@glepage:matrix.orgGaétan Lepage Ok, but can I use fetchpatch though? 12:02:43
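For reference, a minimal sketch of the postInstall approach suggested above, assuming a wheel-based buildPythonPackage; the package name, patch URL, hashes, and strip level are placeholders, not the actual tensorflow-bin fix:

  { lib, fetchpatch, python3Packages }:

  let
    # Placeholder patch; the real fix would be the one linked from the
    # tensorflow issue above.
    cudaFixPatch = fetchpatch {
      url = "https://example.org/fix.patch"; # placeholder URL
      hash = lib.fakeHash;                   # placeholder hash
    };
  in
  python3Packages.buildPythonPackage rec {
    pname = "example-bin"; # placeholder name
    version = "1.0.0";
    format = "wheel";

    src = python3Packages.fetchPypi {
      inherit pname version format;
      dist = "py3";
      python = "py3";
      hash = lib.fakeHash;
    };

    # patches = [ ... ] is applied during patchPhase against an unpacked
    # source tree, which a wheel build does not have, so the fetched patch
    # is applied to the installed files instead.
    postInstall = ''
      patch -p1 -d "$out/${python3Packages.python.sitePackages}" < ${cudaFixPatch}
    '';
  }

fetchpatch itself works fine here; the only difference from a source build is where the patch gets applied.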
@ss:someonex.netSomeoneSerge (back on matrix) connor (he/him) (UTC-5) IIRC you brought up setting legacy (FindCUDA&c) variables from the setup hooks. I think we should set them, and we should put that logic behind a guard (e.g. findCudaCmakeSupport=true), just as we should guard the current logic (e.g. findCudatoolkitCmakeSupport=true). We should disable the legacy by default. We should only set cmake flags when the cmake hook is actually used or when cmake flags are explicitly requested. 13:19:22
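Roughly what that guard could look like expressed in Nix, as a sketch only; the flag names findCudaCmakeSupport and findCudatoolkitCmakeSupport come from the message above and are not existing nixpkgs options:

  { lib
  , cudaPackages
    # both guards default to off; the legacy FindCUDA variables in particular
    # should be opt-in
  , findCudaCmakeSupport ? false
  , findCudatoolkitCmakeSupport ? false
  }:

  {
    # Only emit CMake cache entries when explicitly requested (or, in a real
    # setup hook, when the cmake hook is actually detected in the build).
    cmakeFlags =
      lib.optionals findCudaCmakeSupport [
        # legacy FindCUDA.cmake variable
        (lib.cmakeFeature "CUDA_TOOLKIT_ROOT_DIR" "${cudaPackages.cudatoolkit}")
      ]
      ++ lib.optionals findCudatoolkitCmakeSupport [
        # modern FindCUDAToolkit variable
        (lib.cmakeFeature "CUDAToolkit_ROOT" "${cudaPackages.cudatoolkit}")
      ];
  }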
@ss:someonex.netSomeoneSerge (back on matrix)
In reply to @keiichi:matrix.org
when using localai 2.15 from unstable and even after a reboot I get ggml_cuda_init: failed to initialize CUDA: CUDA driver is a stub library. It's a bit random but if anyone has a tip, I take it. nvidia-smi output looks fine
LD_DEBUG=libs
13:19:44
@ss:someonex.netSomeoneSerge (back on matrix)
In reply to @glepage:matrix.org

I am packaging this: https://github.com/EricLBuehler/mistral.rs?tab=readme-ov-file#installation-and-build
You can see that it supports several variations for building (CUDA, metal, mkl...)

-> What should be the approach? Adding cudaSupport? metalSupport? mklSupport?

Does it allow enabling multiple features at once?
13:20:13
@glepage:matrix.orgGaétan Lepage
In reply to @ss:someonex.net
Does it allow enabling multiple features at once?
No, but I think that I will copy the implementation from ollama
13:20:38
@glepage:matrix.orgGaétan LepageIt looks very clean to me13:20:44
@glepage:matrix.orgGaétan Lepage https://github.com/NixOS/nixpkgs/blob/master/pkgs/by-name/ol/ollama/package.nix#L65-L82 13:21:07
@ss:someonex.netSomeoneSerge (back on matrix) The shouldEnable logic looks maybe a bit complex but the arguments seem good? 13:23:21
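A simplified sketch of that pattern applied to mistral.rs (not the actual ollama expression; the cargo feature name, version, and hashes are assumptions/placeholders):

  { lib
  , config
  , rustPlatform
  , fetchFromGitHub
    # one of null, "cuda", "metal", "mkl"; null means "follow the global flags"
  , acceleration ? null
  , cudaSupport ? config.cudaSupport
  }:

  let
    # Enable a backend when it is requested explicitly, or when nothing was
    # requested and the corresponding global nixpkgs flag is set.
    shouldEnable = mode: fallback:
      acceleration == mode || (acceleration == null && fallback);

    enableCuda = shouldEnable "cuda" cudaSupport;
  in
  rustPlatform.buildRustPackage {
    pname = "mistral-rs";
    version = "0.1.0"; # placeholder version

    src = fetchFromGitHub {
      owner = "EricLBuehler";
      repo = "mistral.rs";
      rev = "v0.1.0";      # placeholder tag
      hash = lib.fakeHash; # placeholder hash
    };
    cargoHash = lib.fakeHash;

    # mistral.rs selects its backend through cargo features
    buildFeatures = lib.optionals enableCuda [ "cuda" ];
  }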
@ss:someonex.netSomeoneSerge (back on matrix)
In reply to @gjvnq:matrix.org
Hey, can I ask for help with compiling alice-vision on NixOS?
Looking more closely, I'd guess the issue is somewhere around __has_include(<Imath/half.h>) in ${openimageio.dev}/include/OpenImageIO/half.h
14:49:05
@keiichi:matrix.orgteto SomeoneSerge (UTC+3): TIL, LD_DEBUG looks quite useful. I suppose the "stub" referred to in the message concerns /nix/store/q3m473lh6gcg4xbhbknrhmcj7w7njjs6-cuda_cudart-12.2.140-lib/lib/stubs/glibc-hwcaps/x86-64-v3. Do you know what a "stub" is and why that would be a problem? I understand "stub" as a "generic" library? (I have an RTX 3060) 16:34:54
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) teto: as I understand it, we use stub libraries when the libraries we would link against aren't available -- for example, because they exist outside the sandbox (like libcuda.so does, as part of the NVIDIA driver, in /run/opengl-driver/lib/). They allow the build to succeed where it would otherwise fail due to missing symbols.
They shouldn't cause issues at runtime, because the executable should find and load the proper library from wherever it comes from (in this case, /run/opengl-driver/lib/).
18:05:24
@keiichi:matrix.orgtetoI don't seem to have any CUDA library in /run/opengl-driver/. Should I add anything into hardware.opengl.extraPackages?18:17:25
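For context, libcuda.so normally lands in /run/opengl-driver/lib by loading the NVIDIA driver module rather than through extraPackages; a minimal NixOS sketch with 24.05-era option names (hardware.opengl was later renamed to hardware.graphics):

  { ... }:
  {
    # Makes /run/opengl-driver/lib exist and be used by the graphics stack.
    hardware.opengl.enable = true;

    # Loading the proprietary NVIDIA driver is what puts libcuda.so (and the
    # rest of the userspace driver) into /run/opengl-driver/lib.
    services.xserver.videoDrivers = [ "nvidia" ];
    hardware.nvidia.open = false; # proprietary kernel module
  }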


