
NixOS CUDA

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

13 Feb 2025
@srhb:matrix.org srhb: Yup :D 14:02:10
@connorbaker:matrix.org connor (burnt/out) (UTC-8)

Gaétan Lepage: I've got the manifest for cusparseLT here: https://github.com/ConnorBaker/cuda-packages/blob/main/modules/redists/cusparselt/manifests/0.6.3.json

I think with that you should be able to construct a Nix expression which manually calls redist-builder (or whatever I called it upstream) with the proper arguments

15:24:04
@ss:someonex.net SomeoneSerge (back on matrix): Meeting notes: https://pad.lassul.us/YGyymxE9Qqy9iFVt7A2VnA?both#Conclusion. Some intermediate conversations are missing right now, but Connor recorded them; hopefully he can fill in the blanks when he's free. 15:30:22
@connorbaker:matrix.org connor (burnt/out) (UTC-8): Just pasting the last of them now 15:30:46
@ss:someonex.net SomeoneSerge (back on matrix)

Regarding scheduling the future meetings,

  • we should probably aim to meet in 2-4 weeks to follow up on the patchelf exception and to get a report on the ephemeral builders situation;
  • we can probably first bring up the alignment questions with nix-community just in their chat, without video, because async is faster;
  • additionally, I think I should have some hours this week and next to sort the backlog as mentioned in the notes; I think it'd still be useful, for onboarding new people, to do that with audio and screenshare, but it's not worth synchronizing people's schedules for this; maybe it'll just be a pop-in format?
15:38:04
@ss:someonex.net SomeoneSerge (back on matrix): (haha, maybe we do this on Gaétan's Twitch?) 15:38:26
@glepage:matrix.org Gaétan Lepage: Sure haha 15:46:34
14 Feb 2025
@connorbaker:matrix.org connor (burnt/out) (UTC-8): As of a few days ago Onnxruntime requires CUDA separable compilation… so I guess I gotta fix that now 🙃 01:50:24
@ss:someonex.net SomeoneSerge (back on matrix)

RE: CI infra/yesterday's meeting
CC connor (he/him) (UTC-8):

By the way, while on my side I'm advertising both options for provisioning hardware (spot instances and owned hardware), I think we might want to incentivize companies to commit to supporting the latter path. While it's obviously more work, both organisational and engineering, it is a much better long-term promise for the community. With rented hardware, if two or three companies simultaneously decide to withdraw, we basically have to scale down the CI immediately. If we buy hardware for a non-profit and a few years later some companies decide they're not interested anymore, we maybe lose a retainer covering the maintenance work. With our own hardware we can also be more flexible and maybe dedicate some machines to be used as community builders/devboxes for ad hoc experimentation.

11:15:16
@zopieux:matrix.zopi.eu zopieux

It's me again :-) This time I'm seeing genuinely surprising behavior from the community cache (the substituters are correctly configured): nccl was successfully built (derivation mv02…), the narinfo is available, but upon running nix-shell -p cudaPackages_12.nccl I get

this derivation will be built:
  /nix/store/mv02rgvrhw9n1682dw7vs8w3pssc24lr-nccl-2.21.5-1.drv
(lots of compiling)

Others, like cudaPackages.cudnn, are successfully retrieved from the cache.

17:58:45
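
A minimal sketch of how one might check by hand whether a cache really holds a given path, assuming the standard binary cache layout where the metadata lives at <cache>/<hash>.narinfo. The function names and the cache URL below are hypothetical placeholders, not anything from this room. Note that the path quoted above is a .drv path; substituters are queried by the hash of the output path, which differs from the derivation's hash and can be obtained with nix-store --query --outputs on the .drv.

```python
import re

# Store paths have the shape /nix/store/<32-char hash>-<name>.
# The character class is a loose check, not the exact Nix base32 alphabet.
_STORE_PATH = re.compile(r"^/nix/store/([0-9a-z]{32})-(.+)$")


def store_hash(store_path: str) -> str:
    """Return the 32-character hash component of a Nix store path."""
    match = _STORE_PATH.match(store_path)
    if match is None:
        raise ValueError(f"not a Nix store path: {store_path!r}")
    return match.group(1)


def narinfo_url(cache_url: str, output_path: str) -> str:
    """Build the narinfo URL a substituter would fetch for an output path.

    Assumption: output_path is a build output, not a .drv file.
    """
    return f"{cache_url.rstrip('/')}/{store_hash(output_path)}.narinfo"


if __name__ == "__main__":
    # The .drv path from the message, used here only to show hash extraction.
    drv = "/nix/store/mv02rgvrhw9n1682dw7vs8w3pssc24lr-nccl-2.21.5-1.drv"
    print(store_hash(drv))
    # Hypothetical cache URL; substitute the real substituter and output path.
    # print(narinfo_url("https://cache.example.org", "<output path here>"))
```

If the URL built from the output path returns a narinfo but Nix still wants to build, comparing that output path against the one the local evaluation expects is a reasonable next step, since a different nixpkgs revision or config yields a different hash.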
