!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

290 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
58 Servers

Sender Message Time
28 Dec 2024
@matthewcroughan:defenestrate.itmatthewcroughan changed their display name from matthewcroughan to matthewcroughan (DECT: 56490).18:41:55
29 Dec 2024
@lromor:matrix.orglromor set a profile picture.16:13:20
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)Just tried to build PyTorch and I completely forgot it vendors its dependencies, was stunned to see it building ONNX21:49:20
@ss:someonex.netSomeoneSerge (back on matrix) I wish... matthewcroughan (DECT: 56490) maybe? 21:50:20
@ss:someonex.netSomeoneSerge (back on matrix) Yeah 21:50:35
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) I remember I had tried to work on using system-provided dependencies (I guess more than a year ago now) and gave up because it would have required a bunch of CMake rewriting.
And every time upstream changed something, BOOM! Another merge conflict or more rewriting required.
But I suppose it’s that way with lots of projects.
21:55:11
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)Serge, how do you stay upbeat about packaging stuff?21:55:58
@ss:someonex.netSomeoneSerge (back on matrix) Yes, which is why this really is about working with upstream and getting the changes through on their side, not on the nixpkgs side 21:56:38
@ss:someonex.netSomeoneSerge (back on matrix) I clearly don't... 21:57:31
@glepage:matrix.orgGaétan Lepage
In reply to @connorbaker:matrix.org
I remember I had tried to work on using system-provided dependencies (I guess more than a year ago now) and gave up because it would have required a bunch of CMake rewriting.
And every time upstream changed something, BOOM! Another merge conflict or more rewriting required.
But I suppose it’s that way with lots of projects.
At least we still build pytorch from source... looking at you protobuf-python, tensorflow and, since today, jax
21:57:59
@glepage:matrix.orgGaétan Lepage🥲21:58:05
@glepage:matrix.orgGaétan LepageAt least they take less resources to build 🤡21:58:59
@glepage:matrix.orgGaétan Lepage* At least they take less resources to "build" 🤡21:59:04
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)
In reply to @ss:someonex.net
Yes, which is why this really is about working with upstream and getting the changes through on their side, not on the nixpkgs side
Thoughts on what to do when upstream makes it clear they don’t care or don’t want to implement changes that make it easier (or feasible) to build with Nix?
21:59:14
@ss:someonex.netSomeoneSerge (back on matrix)In case of pytorch, I think they are willing to accept stuff21:59:41
@ss:someonex.netSomeoneSerge (back on matrix)They just won't do it themselves21:59:53
@glepage:matrix.orgGaétan LepageAlso, I'm afraid we are severely under-staffed :/22:00:12
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)I’ve had good experiences with them too; I meant more like NVIDIA and the ONNX ecosystem22:00:14
@glepage:matrix.orgGaétan LepageSince the latest staging-next merge, everything is kind of broken...22:00:38
@glepage:matrix.orgGaétan LepageFortunately, we merged the triton-llvm fix and the jax/jaxlib switch to bin.22:00:54
@ss:someonex.netSomeoneSerge (back on matrix) Yeah right... like, pray that they lose the market? 22:00:55
@glepage:matrix.orgGaétan LepageBut still22:00:57
@glepage:matrix.orgGaétan LepageAlso, we have python3.13 now which is very brittle22:01:12
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)At least the gods gave Sisyphus the rock he has to push; I had to buy mine from ZOTAC22:03:40
30 Dec 2024
@matthewcroughan:defenestrate.itmatthewcroughan changed their display name from matthewcroughan (DECT: 56490) to matthewcroughan.17:27:46
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)

Well, I'm messing around with a Triton compiler failure on PyTorch; you know, the good ol' error:

torch._inductor.exc.InductorError: FileNotFoundError: [Errno 2] No such file or directory: '/sbin/ldconfig'

seems similar to what Serge pointed out here https://github.com/NixOS/nixpkgs/pull/278969/files#diff-289748b7fbff3ff07ecd17030035a7e7aa78b21e882a549900885e6bc5030973

18:41:06
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) Oh I got torch.compile working! 21:02:45
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8)I'll submit a PR for Nixpkgs21:12:21
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) Oh wait, what's the proper way to expose a runtime dependency on libcuda.so? Is it enough to point it to the stub? Because (as I understand) that's only for linking, not for runtime use (because it's a stub).
Since libcuda.so is provided by the driver, and library location depends on the host OS...
21:15:03
@connorbaker:matrix.orgconnor (burnt/out) (UTC-8) I guess if we knew ahead of time where libcuda.so and the like were, we wouldn't need nixGL or nix-gl-host because we could package everything in a platform-agnostic way, huh... 21:23:41
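
The link-time versus runtime distinction in the last two messages can be sketched in code: the stub libcuda.so only satisfies the linker, so at runtime the real driver library has to be found on the host. This is a minimal sketch, not how PyTorch, Triton, nixGL, or nix-gl-host do it. The helper name `find_driver_lib` and the directory list are hypothetical illustrations; `/run/opengl-driver/lib` is the usual NixOS location, and the other entries are assumed common locations on other distributions that may not match a given host.

```python
from pathlib import Path
from typing import Iterable, Optional

# Assumption: illustrative candidate directories only, not an exhaustive list.
CANDIDATE_DIRS = (
    "/run/opengl-driver/lib",     # NixOS driver library directory
    "/usr/lib/x86_64-linux-gnu",  # assumed Debian/Ubuntu location
    "/usr/lib64",                 # assumed Fedora/RHEL location
    "/usr/lib/wsl/lib",           # assumed WSL2 location
)


def find_driver_lib(
    name: str = "libcuda.so.1",
    dirs: Iterable[str] = CANDIDATE_DIRS,
) -> Optional[Path]:
    """Return the first existing path to `name` among `dirs`, or None.

    Hypothetical helper: it only checks that the file exists and does not
    verify that it is a real driver library rather than a link-time stub.
    """
    for directory in dirs:
        candidate = Path(directory) / name
        if candidate.exists():
            return candidate
    return None


if __name__ == "__main__":
    found = find_driver_lib()
    print(found if found is not None else "libcuda.so.1 not found in candidate directories")
```

Because the answer depends on the host, a search like this has to run at runtime (or be supplied by a wrapper) rather than be baked into the package at build time, which is the gap the message above attributes to nixGL and nix-gl-host.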
