!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

290 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
57 Servers


Sender Message Time
3 Feb 2025
@matthewcroughan:defenestrate.itmatthewcroughan changed their display name from matthewcroughan (FOSDEM) to matthewcroughan. 09:11:41
@ss:someonex.netSomeoneSerge (back on matrix) changed their display name from SomeoneSerge (Bruxelles) to SomeoneSerge (Gand St. Pieters). 13:40:41
@ruroruro:matrix.orgruro connor (he/him) (UTC-7): SomeoneSerge (Gand St. Pieters) sorry to keep annoying you guys, but could you respond to the above question? Alternatively, "we are too busy right now, you'll have to figure it out on your own" is also an acceptable answer))) 14:37:45
@ss:someonex.netSomeoneSerge (back on matrix) Sorry, I forgot to reply. I'll write before tomorrow 14:41:33
@ruroruro:matrix.orgruro ❤️ 14:42:10
@pederbs:pvv.ntnu.nopbsds changed their display name from pbsds (FOSDEM) to pbsds. 16:25:49
@ss:someonex.netSomeoneSerge (back on matrix)

Starting with the last question: great to hear! As one tool to help with discovery, we have a task board at https://github.com/orgs/NixOS/projects/27/views/1. We haven't been properly maintaining it for the last year, I see many invalidated/outdated items there, but some of the roadmap is still relevant, and the "New" column is automatically populated with all issues and PRs tagged "cuda".

If you're willing to do chores, fixing issues like "nvidia's bash wrapper for nsys-ui assumes things are installed into weird locations and is completely broken" and "a package has changed the way they hardcode /usr/lib or dlopen stuff and now fails to find libcuda.so again" would be very useful. These are relatively straightforward, but they involve a fair amount of debugging and suffering, and they usually get ignored for a long time because it's just demotivating.

If you're interested in architectural issues, then note the message about the upcoming meeting and the proposed subjects, check out the "Roadmap" column, and Connor's out-of-tree cuda-packages

22:27:33
@ss:someonex.netSomeoneSerge (back on matrix) * OK one more item for the agenda: I think it would be good for us together to walk through the backlog, discuss issues' contexts, statuses, and present relevance, and sort/close outdated issues, maybe merge well-reviewed but forgotten PRs. I'd guess this is easily half an hour or more, should we schedule this separately? 22:30:50
@ss:someonex.netSomeoneSerge (back on matrix)

I was thinking that we might be able to improve the situation by making general nixpkgs contributors more aware of this situation. For example, it would be pretty cool if we could track the nix-community hydra builds on status.nixos.org and on zh.fail (and try to include CUDA packages in future ZHF events).

You're certainly right, and the idea of promoting cuda fixes during ZHF has in fact been around. By the same token, an ofborg-like integration, an external service that would test a PR on-push and post a report on failures on non-default instantiations or involving out-of-tree tests is maybe even necessary to ensure stability of hw-accelerated packages. Even when a contributor doesn't care about cuda, it's important they are informed about unintended consequences of their changes, and maybe can ping the interested parties as needed

22:41:27
@ss:someonex.netSomeoneSerge (back on matrix)

For example, https://hydra.nix-community.org/jobset/nixpkgs/cuda/evals has a bunch of Eval Errors and build errors and I don't remember the last time that it was green (although some of those eval errors might not be indicative of actually broken packages).

My javascript might be broken, but I only see build failures. Some errors under cudaPackages. seem actually familiar, e.g. the cutensor error was fixed at least once already and is recurring... that's to be fixed somewhere around manifest.nix in the current implementation

22:44:52
@ss:someonex.netSomeoneSerge (back on matrix) Ah I see, thanks for the link. I guess "this is unfree" errors are kind of expected, you'll see them in the official hydra too? This does sound ridiculous though, I agree 22:49:09
@ss:someonex.netSomeoneSerge (back on matrix)

Also, I understand why hydra.nixos.org doesn't build CUDA packages, but do you think that we could enable evaluation-only checks for CUDA packages on nixpkgs github PRs and then build those PRs using the nix-community builders and report the results on the PR?

Ah great, you already said as much. Yes, we definitely can. You may have seen issues about unfree stuff opened and closed in the Ofborg repo, so the notion isn't entirely new. I know for sure there are several interested parties, and this would be incredibly useful; maybe we can discuss it in more detail on the call. This issue needs to be approached with some care from the community perspective though, because it's desirable for nixpkgs and nix-community to stay independent/disentangled: legally, socially, architecturally...

22:54:24
@ss:someonex.netSomeoneSerge (back on matrix) * Is it still broken? The attribute page shows latest eval grey. I might have interest in fixing it, I'll check tmr 22:57:50
@hexa:lossy.networkhexa
       > ERROR: noBrokenSymlinks: the symlink /nix/store/fqx2dv9vp1k0f00imgqshy6d92ykcw5d-python3.12-kaleido-0.2.1/lib/python3.12/site-packages/kaleido/executable/etc/fonts/fonts.conf points to a missing target /nix/store/2ynwbywyaxk4wgl8d3xrb9dzkdzv241x-fontconfig-2.15.0-bin/etc/fonts/fonts.conf
       > ERROR: noBrokenSymlinks: found 1 dangling symlinks and 0 reflexive symlinks
       For full logs, run 'nix log /nix/store/f7whd4p85k8b7bd8sx2bnp5jpmzycbkx-python3.12-kaleido-0.2.1.drv'.
error: 1 dependencies of derivation '/nix/store/gdg8kgy8ry0gjhpv2dws072wajkjk69l-python3.12-plotly-5.24.1.drv' failed to build
error: 1 dependencies of derivation '/nix/store/pfxw0g0npwr091cr7ks7012jl8qsg
23:00:36
@hexa:lossy.networkhexa * SomeoneSerge (Gand St. Pieters): yes, but now also due to that new hook connor (he/him) (UTC-7) introduced 23:00:50
@ss:someonex.netSomeoneSerge (back on matrix) * Huh? I thought it was still out of tree 23:11:21
@hexa:lossy.networkhexa it is in the current staging cycle 23:58:18
@hexa:lossy.networkhexa and it is causing all sorts of havoc 23:58:27
@hexa:lossy.networkhexa because no fixes were prepared in advance 23:58:38
4 Feb 2025
@connorbaker:matrix.orgconnor (he/him) I apologize, I should have made sure at least stdenvs for {x86_64,aarch64}-{darwin,linux} were working
In the case of rebuild-the-world PRs, I'm not sure what is considered sufficient in terms of testing -- is there particular language (or Nix expressions) you'd want to see in the contributing guidelines?
05:07:55
@ss:someonex.netSomeoneSerge (back on matrix) Oh nice! 09:46:24
@ss:someonex.netSomeoneSerge (back on matrix) Found it 09:46:27
@afdee1c:matrix.orgafdee1c joined the room. 20:04:25
@zopieux:matrix.zopi.euzopieux I'm a bit confused about the nix-community cache and I wonder if my system/config is to blame, or the cache.
This cuda build succeeded and depends on nixpkgs d0bb46, which I pinned in my flake. Upon building though, nix decides to build fmpz8s6hy3yr8z6kb84h6498437d0xj1-ollama-0.5.7.drv even though per the above and per https://nix-community.cachix.org/8njyvpf8sxh8k61zvnv13cymn7szv63c.narinfo, the output should be available in the cache. nix.conf confirms the substituter/pubkey is present. Am I missing something?
21:03:07
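
One way to sanity-check a cache miss like the one described above is to confirm which narinfo URL a given output path maps to. Binary caches are keyed by the 32-character hash of the output store path, not by the hash of the .drv file, so the hash in the .drv name is not expected to match the hash in the narinfo URL. Below is a minimal Python sketch, assuming the usual `<cache>/<hash>.narinfo` layout. The function name `narinfo_url` and the `-ollama-0.5.7` output name in the example are illustrative assumptions and are not taken from the chat.

```python
import re

# A store path basename starts with a 32-character hash, followed by "-name".
# The character class is deliberately loose; Nix uses a restricted base-32 alphabet.
_STORE_PATH = re.compile(r"^(?:/nix/store/)?([0-9a-z]{32})-.+$")


def narinfo_url(store_path: str, cache: str = "https://nix-community.cachix.org") -> str:
    """Return the narinfo URL a binary cache would be asked for, given an output path."""
    match = _STORE_PATH.match(store_path)
    if match is None:
        raise ValueError(f"not a store path: {store_path!r}")
    return f"{cache.rstrip('/')}/{match.group(1)}.narinfo"


if __name__ == "__main__":
    # Hypothetical output path: the hash comes from the message above, the name is assumed.
    print(narinfo_url("/nix/store/8njyvpf8sxh8k61zvnv13cymn7szv63c-ollama-0.5.7"))
```

If the output path that the local evaluation produces (for instance as printed by `nix-store --query --outputs` on the .drv) maps to a different URL than the one known to be in the cache, then the local derivation differs from the one that was built and cached, and a rebuild is the expected behaviour rather than a substituter misconfiguration.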
