!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

288 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
56 Servers

Sender Message Time
14 Oct 2024
@ss:someonex.net SomeoneSerge (back on matrix) Yes but also the hydra history is all green 🤷 19:08:54
@glepage:matrix.org Gaétan Lepage Yes, weird... 19:13:19
@ss:someonex.net SomeoneSerge (back on matrix) Noticed https://github.com/SomeoneSerge/nixpkgs-cuda-ci/issues/31#issuecomment-2412043822 only now, published a response 19:22:08
@glepage:matrix.org Gaétan Lepage I can't get onnx to build...
Here are the logs in case someone knows what is happening: https://paste.glepage.com/upload/eel-falcon-sloth
20:08:13
@ss:someonex.net SomeoneSerge (back on matrix)
      error: downloading 'https://github.com/abseil/abseil-cpp/archive/refs/tags/20230125.3.tar.gz' failed

lol

20:19:08
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @ss:someonex.net
Yes but also the hydra history is all green 🤷
Maybe that just came in from staging
20:19:30
15 Oct 2024
@connorbaker:matrix.org connor (burnt/out) (UTC-8)
In reply to @glepage:matrix.org
I can't get onnx to build...
Here are the logs in case someone knows what is happening: https://paste.glepage.com/upload/eel-falcon-sloth
Onnx's CMake isn't detecting at least one dependency, so it tries to download them all in order, starting with abseil. Since there's no networking in the sandbox, it fails.
00:06:48
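A rough sketch of the usual workaround for the failure described above: hand the missing dependency to the build so CMake can find it locally instead of trying to fetch it. This is an illustration under assumptions, not the fix that was eventually merged. That abseil-cpp is the undetected dependency, and that adding it to buildInputs is enough for Onnx's CMake to pick it up, are both guesses.

```nix
# Overlay sketch (hypothetical; not taken from nixpkgs or from the linked repo).
final: prev: {
  pythonPackagesExtensions = prev.pythonPackagesExtensions ++ [
    (pyFinal: pyPrev: {
      onnx = pyPrev.onnx.overrideAttrs (old: {
        # Assumption: abseil is the dependency that CMake fails to detect,
        # so it is provided from nixpkgs rather than downloaded in the sandbox.
        buildInputs = (old.buildInputs or [ ]) ++ [ final.abseil-cpp ];
      });
    })
  ];
}
```

If the build log still shows a download attempt after this, the next dependency in Onnx's CMake list is probably the one that is not being detected.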
@connorbaker:matrix.org connor (burnt/out) (UTC-8) I'm currently working on Onnx packaging for a thing, and you can see what I've got going on here: https://github.com/ConnorBaker/cuda-packages/blob/main/cudaPackages-common/onnx.nix (It's a combination C++/Python install so it's gnarly. But better than having two separate derivations with libraries built with different flags, I guess.) 00:09:04
@glepage:matrix.org Gaétan Lepage Ok interesting, thanks for sharing 05:46:57
@glepage:matrix.org Gaétan Lepage Is your plan to upstream this to nixpkgs? 05:47:13
@glepage:matrix.org Gaétan Lepage [triton update]
triton-llvm fails during the test phase.
Logs: https://paste.glepage.com/upload/fish-jaguar-pig
08:48:05
@atagen:imagisphe.re atagen joined the room. 11:38:21
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @glepage:matrix.org
[triton update]
triton-llvm fails during the test phase.
Logs: https://paste.glepage.com/upload/fish-jaguar-pig
Can't reproduce, builds for me
12:35:31
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @glepage:matrix.org
[triton update]
triton-llvm fails during the test phase.
Logs: https://paste.glepage.com/upload/fish-jaguar-pig
* Can't reproduce, builds for me. Maybe we tried different HEADs?
12:36:26
@atagen:imagisphe.re atagen hi, what am I missing to get a cache hit? going by this hydra output torch should be in the cache (for nixpkgs 5633bcf). I have nix-community cachix set up, allowUnfree, cudaSupport, and the package in question is providing its overlay properly with final.callPackage so it ought to be using my system packages 12:46:24
@atagen:imagisphe.re atagen https://gist.github.com/atagen/615e187e323f3ca3f5f9d40e55ce2b7c 12:55:50
@atagen:imagisphe.re atagen oof, could it be because I'm specifying python311Packages instead of python3Packages? 12:57:30
@atagen:imagisphe.re atagen ... yup, that was it 12:58:23
@ss:someonex.net SomeoneSerge (back on matrix) https://github.com/NixOS/nixpkgs/blob/70f9c111b27db0d459a227e477acce62016cbf10/pkgs/top-level/release-cuda.nix#L118 13:04:59
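A minimal sketch of the difference that caused the cache miss, assuming python3Packages is the set the CUDA Hydra jobs build and python311Packages is not. The thread suggests this, but check release-cuda.nix for your nixpkgs revision. The file name and the attribute names `cached` and `likelyUncached` are made up for illustration.

```nix
# torch-cache.nix (hypothetical file name)
let
  pkgs = import <nixpkgs> {
    config = {
      allowUnfree = true;
      cudaSupport = true;
    };
  };
in
{
  # Default Python package set: assumed to match what the CUDA jobs build.
  cached = pkgs.python3Packages.torch;

  # Explicitly pinned interpreter: a different derivation, so likely a local rebuild.
  likelyUncached = pkgs.python311Packages.torch;
}
```

Running `nix-build torch-cache.nix -A cached --dry-run` and then the same with `-A likelyUncached` should show which store paths would be fetched and which would be built.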
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @glepage:matrix.org
[triton update]
triton-llvm fails during the test phase.
Logs: https://paste.glepage.com/upload/fish-jaguar-pig
With the current HEAD and ccache off I just reached the pytest branch
14:17:17
@glepage:matrix.org Gaétan Lepage
In reply to @ss:someonex.net
With the current HEAD and ccache off I just reached the pytest branch
You mean that you were able to build it fine?
14:47:18
@ss:someonex.net SomeoneSerge (back on matrix) Yes 14:47:27
@ss:someonex.net SomeoneSerge (back on matrix) Well the pytest bit fails with these 20 tests ofc but that'll come later 14:47:41
@glepage:matrix.org Gaétan Lepage Ok, weird then... 14:49:24
@glepage:matrix.org Gaétan Lepage Btw, I'm running a cross-system review for this triton PR. 14:49:35
@glepage:matrix.org Gaétan Lepage quite a few rebuilds 14:49:40
@connorbaker:matrix.org connor (burnt/out) (UTC-8)
In reply to @glepage:matrix.org
Ok interesting, thanks for sharing
Yep, that's the goal. My hope is to replace the current CUDA packaging stuff with what I've got there.
I personally will be maintaining CUDA 11.8 for a while but mark it as end of life. Since it requires toolchains which will be removed upstream, I'll keep it out of tree.
My plan is to only maintain the latest version of CUDA, but block upgrades to newer versions if some prominent packages don't build, even on master.
I plan to ship the same version of most libraries that NVIDIA does with its ML containers, which means roughly a monthly release cadence.
16:19:57
@connorbaker:matrix.org connor (burnt/out) (UTC-8) Of course, all this is pending agreement with the other maintainers, but it would certainly help cut down the scope of CUDA packages and allow us to better populate the cache since there'd be really just one version supported upstream 16:20:36
@glepage:matrix.org Gaétan Lepage This looks smart indeed! 16:55:13
16 Oct 2024
@glepage:matrix.org Gaétan Lepage As the onnx failure was blocking me elsewhere, I went and fixed it myself.
Any review is welcome :)
https://github.com/NixOS/nixpkgs/pull/348985
09:07:07

Room Version: 9