!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

290 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
57 Servers



Sender Message Time
7 Feb 2025
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) Ugh FINALLY have a test to catch different versions of the package set leaking into each other: https://github.com/ConnorBaker/cuda-packages/commit/6c9cb3a17962427e9772849a3b7ca08899897aae
Got tired of seeing multiple versions of CUDA dependencies in the closure of members of the package set
02:04:37
@ss:someonex.netSomeoneSerge (back on matrix) Let's do Thursday February 13th 2-3PM UTC? 14:41:38
@stick:matrix.orgstick no idea - seems like an intermittent issue? 15:07:19
@stick:matrix.orgstick other than that, are you ok with merging the PR? I would love vllm to appear in the cache 15:07:44
@stick:matrix.orgstick * other than that, are you ok with merging the PR? I would love vllm to appear in the nix-community cache 15:07:50
@stick:matrix.orgstick and i just merged an update from 0.7.1 -> 0.7.2 to master 15:08:02
@stick:matrix.orgstick i rebased the PR to check whether the CI fails again on the same test 15:09:55
@stick:matrix.orgstick * i rebased the PR https://github.com/NixOS/nixpkgs/pull/379575 to check whether the CI fails again on the same test 15:10:13
@stick:matrix.orgstick update: no it did not - i guess there was an error in master, not in my branch 15:11:22
@ss:someonex.netSomeoneSerge (back on matrix) Yes ofc. I was about to press the button but then this weird action failed even after I restarted it manually 15:23:31
@stick:matrix.orgstick is that the only thing needed to get vllm into nix-community cache? 15:23:58
@ss:someonex.netSomeoneSerge (back on matrix) Looks like it's happy after the rebase? 15:24:01
@stick:matrix.orgstick yes, it is 15:24:09
@stick:matrix.orgstick thanks for the merge! 15:33:11
8 Feb 2025
@terrorjack:matrix.orgterrorjack joined the room. 01:25:30
@terrorjack:matrix.orgterrorjack set a profile picture. 02:24:20
@terrorjack:matrix.orgterrorjack removed their profile picture. 02:24:57
@vanishingideal:matrix.orgvanishingideal joined the room. 04:06:53
@mabau:matrix.org@mabau:matrix.org left the room. 07:11:39
@zopieux:matrix.zopi.euzopieux

alright, we got a recent build, so I tried again. Updated to 550e11f and ran:

$ colmena build -v --show-trace --nix-option builders "" --nix-option cores 0
x | these 11 derivations will be built:
x |   /nix/store/f1s6y83hb8gdl0s49vmj0w54i5a75gd7-ollama-0.5.7.drv
x |   /nix/store/fpfv7cn50ns667qrkwx2frn26di1hnc7-ollama.service.drv
[snip]
x | building '/nix/store/f1s6y83hb8gdl0s49vmj0w54i5a75gd7-ollama-0.5.7.drv'...

even though f1s6… is right there. Is there a way to debug that nix-community is even being requested at all, perhaps?

15:36:58
@zopieux:matrix.zopi.euzopieux ok that was dumb, I should have checked nix flake show first. Something was overriding the substituters this whole time, despite my flake setting them. Sorry for the noise, it's all fine now! 16:02:13
@zopieux:matrix.zopi.euzopieux * ok that was dumb, I should have checked nix config show first. Something was overriding the substituters this whole time, despite my flake setting them. Sorry for the noise, it's all fine now! 16:02:25
@stick:matrix.orgstick

Do we want to enable aarch64-linux for cuda package set on hydra?

I think it makes sense, with GH200 already available and Digits available in May

20:25:34
@stick:matrix.orgstick in the meantime I am compiling the aarch64-linux package set on my macbook, trying to remove potential build failures 21:21:27
@adam:robins.wtf@adam:robins.wtf left the room. 21:22:28
9 Feb 2025
@ss:someonex.netSomeoneSerge (back on matrix) Yes. The only "blocker" is refactoring release-cuda.nix to support specifying jetson platforms instead of sbsa 04:21:13
@ss:someonex.netSomeoneSerge (back on matrix) * Yes. The only "blocker" is refactoring release-cuda.nix to support specifying jetson platforms instead of (or in addition to? does anyone use) sbsa 04:21:29
@stick:matrix.orgstick Yes, GH200 is SBSA iirc 19:44:53
@ruroruro:matrix.orgruro

I am currently working on fixing cuda-samples. I wanted to bikeshed a bit more about its dependency on the insecure freeimage package. I still think that in general having eval errors for insecure package dependencies is useful. Afaik, the official hydra builders don't build insecure packages, so I am kind of against setting allowInsecure/allowInsecurePredicate/permittedInsecurePackages in release-cuda.nix.

On the other hand, it's kind of silly to treat cuda-samples as insecure just because it depends on freeimage. Most of the code in cuda-samples is for demonstration/validation purposes only, a lot of it doesn't even have proper error checking and shouldn't be used in production anyway. So it is extremely weird to refuse building cuda-samples just because there are some potential buffer overflow exploits or whatever in the library it uses for image reading.

I still think that we should just change freeimage to

freeimage.overrideAttrs (prev: {
  meta = prev.meta // {
    knownVulnerabilities = [ ];
  };
})

specifically only in the cuda-samples derivation.

However, if not shipping "vulnerable" binaries is so important, then we could keep cuda-samples broken, but introduce a separate cuda-samples.passthru.buildCheckOnly derivation that ignores the freeimage "vulnerabilities", but then replaces its installPhase with something along the lines of touch $out. That way, we keep cuda-samples.buildCheckOnly as a useful smoke test, but don't distribute any "potentially vulnerable" binaries. Though, TBH, this sounds like over-engineering and we should just override freeimage.knownVulnerabilities in cuda-samples specifically.

Any thoughts?

20:24:30
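
[Editor's sketch, not part of the conversation] The two options in the message above could look roughly like the following. This is a minimal sketch under stated assumptions: it assumes cuda-samples takes `freeimage` as an overridable argument, and `buildCheckOnly` is only the hypothetical name used in the proposal, not an existing nixpkgs attribute.

```nix
# Sketch only. Assumes cuda-samples accepts `freeimage` as an overridable
# argument; `buildCheckOnly` is the hypothetical name from the proposal.
{ cudaPackages, freeimage }:

let
  # freeimage with the vulnerability list cleared, as in the snippet above.
  freeimageForSamples = freeimage.overrideAttrs (prev: {
    meta = prev.meta // {
      knownVulnerabilities = [ ];
    };
  });

  # Option 1: cuda-samples built against the locally overridden freeimage.
  samples = cudaPackages.cuda-samples.override {
    freeimage = freeimageForSamples;
  };
in
{
  inherit samples;

  # Option 2: same build, but the install phase produces an empty output,
  # so no binaries linked against freeimage are shipped.
  buildCheckOnly = samples.overrideAttrs (_: {
    installPhase = ''
      runHook preInstall
      touch $out
      runHook postInstall
    '';
  });
}
```

Because the override is local to this expression, the top-level freeimage keeps its knownVulnerabilities entry and still triggers the usual eval error for every other consumer.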
@stick:matrix.orgstick what exactly is the purpose of having samples packaged in nixpkgs? my first intuition tells me this does not belong in nixpkgs at all 22:13:09
