!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

290 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
57 Servers

17 Oct 2024
@ss:someonex.net SomeoneSerge (back on matrix)
In reply to @connorbaker:matrix.org

Pretty proud of how performant this is, after much stats scrutinizing and reading primop implementations: https://github.com/ConnorBaker/cuda-packages/blob/23f199365343e3355469332acb5cf501c8c5fc68/upstreamable-lib/attrsets.nix#L38

Credit for the flattenDrvTree function goes to Adam Joseph though, basically took that stuff from https://github.com/NixOS/nixpkgs/blob/3a5940b539fdd56ace90d5e79a926e5e2694ba45/pkgs/top-level/release-attrpaths-superset.nix#L38

That for a release file or?
08:56:25
@connorbaker:matrix.org connor (he/him)
Yeah, the idea being that if you’re using CI that isn’t Hydra, or want to build a whole package set, you could use that function.
14:22:16
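For readers unfamiliar with the idea, here is a minimal sketch of flattening a nested package set into a list of derivations that a non-Hydra CI could build. It is not the linked implementation: `collectDrvs` is a hypothetical name, it only descends into sets marked with `recurseForDerivations`, and it does not guard against attributes that throw during evaluation.

```nix
{ lib }:
let
  # Hypothetical helper: walk an attribute set and return a list of
  # { path, drv } entries, one per derivation found.
  collectDrvs =
    prefix: attrs:
    lib.concatLists (
      lib.mapAttrsToList (
        name: value:
        let
          path = prefix ++ [ name ];
        in
        if lib.isDerivation value then
          [ { inherit path; drv = value; } ]
        # Only recurse into sets that opt in, following the nixpkgs convention.
        else if lib.isAttrs value && (value.recurseForDerivations or false) then
          collectDrvs path value
        else
          [ ]
      ) attrs
    );
in
collectDrvs [ ]
```

Applied to a package set, the result can be mapped to attribute paths or store paths and handed to whatever CI driver is in use.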
18 Oct 2024
@mcwitt:matrix.org mcwitt *

Hi guys, I'm running into an error with jaxlibWithCuda in master: LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.shfl.sync.down.i32

I'm not certain whether the issue is with the nixpkgs infrastructure or upstream, but thought I'd raise it here first in case anyone has seen something similar.

Full reproduction with flake.lock here: https://gist.github.com/mcwitt/4cf6c5cae44152dab1df1ef96d49d22e

EDIT: one reason I suspect it might be a nixpkgs-specific issue is I'm also seeing many instances of the warning '+ptx84' is not a recognized feature for this target (ignoring feature), which was reported to be fixed in 0.4.28: https://github.com/jax-ml/jax/issues/21121#issuecomment-2103606397

18:47:10
@glepage:matrix.org Gaétan Lepage
Idk if an update would help. Unfortunately, they have heavily changed their packaging internals and we are quite stalling on this project...
20:53:57
@mcwitt:matrix.org mcwitt

Unfortunately, they have heavily changed their packaging internals and we are quite stalling on this project...

Ah man, sorry to hear. I've been occasionally checking in on the progress of JAX updates (most recently https://github.com/NixOS/nixpkgs/pull/318995), and have to say that it seems like a herculean effort to debug all the build issues that come up with each update, not to mention the frequent breakage of downstream dependencies since things are moving so quickly with JAX. Thanks so much for your work on this! I've benefited a ton from having JAX in nixpkgs.

22:21:30
@glepage:matrix.org Gaétan Lepage
Glad that it's useful for you! Sometimes I wonder if there are people using the nix python set ^^.
Indeed, each update involves some work. While most of the time it is pretty straightforward, the recent changes (related to bazel) are much more annoying.
22:31:53
@glepage:matrix.org Gaétan Lepage
hexa (UTC+1) I am indeed unable to build tensordict from my torch 2.5.0 update PR.
It systematically gets stuck at 22%, no matter the core count apparently.
I will try to investigate this tomorrow.
22:41:37
@hexa:lossy.network hexa
Thank you.
22:42:09
19 Oct 2024
@glepage:matrix.org Gaétan Lepage
Very weird, I was now able to build it.
It does this in the middle, but the tests are apparently all successful.
11:35:50
@glepage:matrix.org Gaétan Lepage
[image: clipboard.png]
11:35:54
@glepage:matrix.org Gaétan Lepage
Sometimes this message doesn't show up and the package builds just fine...
11:47:44
@hexa:lossy.network hexa
as I said, 6th gen intel breaks, 8th gen intel works
12:53:32
@hexa:lossy.network hexa
both are essentially skylake
12:53:40
@hexa:lossy.network hexa
my build farm is 6th gen fwiw
12:53:45
@hacker1024:matrix.org hacker1024
In reply to @hexa:lossy.network
reliably crashes and gets stuck on 6th gen intel
Could it be out of date microcode?
12:55:56
@hexa:lossy.network hexa
🤔
12:56:10
@hexa:lossy.network hexa
hardware.cpu.intel.updateMicrocode = lib.mkDefault config.hardware.enableRedistributableFirmware;
12:56:44
@hexa:lossy.network hexa
which defaults to config.hardware.enableAllFirmware
12:57:09
@hexa:lossy.network hexa
which defaults to false
12:57:19
@hexa:lossy.network hexa
🤡
12:57:20
@hacker1024:matrix.org hacker1024
Rip
12:57:24
@hexa:lossy.network hexa
thank you nixos-generate-config
12:57:34
@hacker1024:matrix.org hacker1024
No guarantee that it'll fix it but it can't hurt
12:57:42
@hacker1024:matrix.org hacker1024
They have long lists of errata
12:57:52
@hexa:lossy.network hexa
yeah, worth a try
12:58:00
@hexa:lossy.network hexa
can always roll back
12:58:03
@hexa:lossy.network hexa
thx
12:58:58
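The default chain described above means microcode updates stay off unless something opts in. A minimal sketch of opting in from `configuration.nix`, assuming an Intel host whose generated `hardware-configuration.nix` contains the `lib.mkDefault` line quoted earlier. Both options are existing NixOS options; the choice of which one to set is only an illustration.

```nix
{ config, lib, ... }:
{
  # Option A: enable redistributable firmware. The generated
  # hardware-configuration.nix uses this value as the mkDefault for
  # hardware.cpu.intel.updateMicrocode, so microcode follows along.
  hardware.enableRedistributableFirmware = true;

  # Option B: set the microcode option directly. A plain definition
  # takes priority over the mkDefault in hardware-configuration.nix.
  hardware.cpu.intel.updateMicrocode = true;
}
```

Setting both is harmless. As noted in the conversation, the change can be rolled back to a previous generation if it causes problems.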
@connorbaker:matrix.org connor (he/him)
I keep forgetting I added myself as a maintainer to glibc until I get emails for reviews lmao
19:09:59
@connorbaker:matrix.org connor (he/him)
SomeoneSerge (utc+3) Gaétan Lepage thoughts on having backendStdenv automatically propagate autoAddDriverRunpath and autoPatchelfHook? I feel like forgetting to add the former is a footgun people keep firing, and the latter is a great check to make sure all your dependencies are either present or explicitly ignored.
19:22:47
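For context, a sketch of what a CUDA package looks like today, with both hooks listed by hand. The package name, version, and `src` are hypothetical; `backendStdenv`, `cuda_nvcc`, `cuda_cudart`, `autoAddDriverRunpath`, and `autoPatchelfHook` are existing nixpkgs attributes. This only illustrates the status quo the proposal would make implicit, not how the propagation itself would be implemented.

```nix
{
  cudaPackages,
  autoAddDriverRunpath,
  autoPatchelfHook,
}:
cudaPackages.backendStdenv.mkDerivation {
  pname = "example-cuda-app"; # hypothetical package
  version = "0.1.0"; # hypothetical version
  src = ./.; # hypothetical source tree

  nativeBuildInputs = [
    cudaPackages.cuda_nvcc
    # Easy to forget: adds the driver library path to the RUNPATH of
    # the produced binaries so libcuda.so is found at run time.
    autoAddDriverRunpath
    # Fails the build if a shared-library dependency is neither
    # present nor explicitly ignored.
    autoPatchelfHook
  ];

  buildInputs = [ cudaPackages.cuda_cudart ];
}
```

If `backendStdenv` propagated the two hooks, the last two entries in `nativeBuildInputs` could be dropped from every such package.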
