| 8 Mar 2023 |
SomeoneSerge (matrix works sometimes) | We do have the automation we don't need xD https://github.com/NixOS/nixpkgs/issues/217913#event-fa1e4cc9-b97c-53b4-bf98-595b6708f0fa | 01:18:10 |
connor (he/him) | (image) Screenshot 2023-03-07 at 8.20.12 PM.png | 01:20:20 |
connor (he/him) | We only have a few of them; the auto-add to project one would be nice to make sure everything which is tagged with CUDA makes its way into the project. These are the ones I see / have enabled so far: | 01:20:22 |
connor (he/him) | Okay I turned on a few more (reopened items are sent to the backlog to be reprioritized; code changes do the same; code review approvals move it to Ready so it can be merged) | 01:23:00 |
SomeoneSerge (matrix works sometimes) | I think we should prevent issues being closed when their tickets are just moved around on the board | 01:30:27 |
SomeoneSerge (matrix works sometimes) | In reply to "We do have the automation we don't need xD": We haven't merged https://github.com/NixOS/nixpkgs/pull/218035 yet | 01:31:01 |
connor (he/him) | Ah, would you rather I have deleted that issue from the project? It was closed as a duplicate, so to me it's "done" in the sense that we no longer need to worry about it | 01:34:39 |
SomeoneSerge (matrix works sometimes) | was closed as a duplicate, oh, ok | 01:35:33 |
connor (he/him) | Also just learned that if you accidentally reset your commit and force-push, turning your PR into an empty diff, it's closed automatically | 01:35:55 |
| 9 Mar 2023 |
hexa | https://opensource.googleblog.com/2023/03/openxla-is-ready-to-accelerate-and-simplify-ml-development.html?m=1 | 08:36:41 |
SomeoneSerge (matrix works sometimes) | https://github.com/openxla/xla/issues/1 | 13:21:07 |
connor (he/him) | Is there a centralized location for docs for CUDA-maintainer related stuff? Containing answers to (for example):
- Do we have any infrastructure (like CI) besides cachix?
- What populates our cachix?
- What's the storage limit for our cachix (meaning, is the number of derivations we host a result of limited compute, storage, or both)?
- If it's not CI populating the cache, what's the process for getting permissions to push to it?
| 15:47:43 |
SomeoneSerge (matrix works sometimes) | I'd say the CUDA page on the NixOS Wiki should be one | 15:59:23 |
SomeoneSerge (matrix works sometimes) | https://nixos.wiki/wiki/CUDA | 15:59:31 |
SomeoneSerge (matrix works sometimes) | Hasn't been maintained for a while | 15:59:44 |
connor (he/him) | Is there any sort of VCS or approval/review process for that? | 19:50:45 |
| 10 Mar 2023 |
connor (he/him) | These results are preliminary but it looks like using -Xfatbin=-compress-always nearly cut the size of the magma NAR in half (from 429.4M to 233.6M) when building for just 8.6: https://github.com/NixOS/nixpkgs/pull/220402
That's super impressive, so I'm excited to see what it looks like when targeting multiple capabilities.
| 00:47:45 |
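[Editor's note: a sketch of how the flag above might be wired in, not the actual diff in PR #220402 — the attribute path and mechanism here are assumptions. `NVCC_APPEND_FLAGS` is a documented nvcc environment variable; whether magma's CMake build forwards it to nvcc is assumed.]

```nix
# Hypothetical overlay sketch: ask nvcc to compress the device code it
# embeds in fatbinaries. NVCC_APPEND_FLAGS is a documented nvcc environment
# variable; magma's build forwarding it is an assumption here.
final: prev: {
  magma = prev.magma.overrideAttrs (old: {
    NVCC_APPEND_FLAGS = "-Xfatbin=-compress-always";
  });
}
```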
SomeoneSerge (matrix works sometimes) | Just a wiki | 02:39:29 |
connor (he/him) | Looks like PyTorch nightlies are using a much newer version of Triton relative to what they were a month or two ago. They’ve got their own branch which closely tracks master: https://github.com/openai/triton/tree/torch-inductor-stable.
Packaging that could be difficult given they pull in their own build of MLIR (based on LLVM 17: https://github.com/openai/triton/blob/2c32f4399986045ff25cae201ed3b16d922a9d3b/python/setup.py#L72) and unconditionally grab NVCC from conda (https://github.com/openai/triton/blob/2c32f4399986045ff25cae201ed3b16d922a9d3b/python/setup.py#L107).
We don’t have MLIR packaged yet: https://github.com/NixOS/nixpkgs/pull/163878 (although I think we do build it for ROCm?)
Thoughts? | 14:29:53 |
| 11 Mar 2023 |
connor (he/him) | I'm getting put on a client at work and moved off of the bench (which allowed me to do all the stuff I've worked on so far), so unfortunately my time might be more limited the next 2-4 weeks :( | 06:43:43 |
| 13 Mar 2023 |
SomeoneSerge (matrix works sometimes) | Roger! That reminds me that I was actually kind of wondering how much interest Tweag has in nixpkgs CUDA in general? And whether the amount of that interest might depend on there being a clear set of goals and deliverables :) | 05:33:04 |
connor (he/him) | I've been pushing for it internally; a fair amount of the work people are doing already surrounds Nix the language more than Nixpkgs.
I personally think it'd be hugely valuable to point to Nixpkgs and some pre-made machine learning environments and say "look, we don't need docker, we can cache binaries, and it's tiny!"
Of course that requires the builds to work, the binaries to be cached, and the closure to, in fact, be tiny.
I'm just so tired of trying to reproduce stuff and having authors tell me they just used whatever was on the campus cluster or a patched version of an ancient package a professor had worked on lmao | 20:46:02 |
connor (he/him) | I think there's a nice synergy with jupyenv (https://github.com/tweag/jupyenv) and want to use that for a demo / proof of concept. | 20:47:05 |
| 14 Mar 2023 |
SomeoneSerge (matrix works sometimes) | https://github.com/NVIDIA/build-system-archive-import-examples/issues/3 | 08:33:25 |
SomeoneSerge (matrix works sometimes) | In reply to the NVIDIA issue above: Of course it only occurred to me after opening the issue that I should also check the redist packages' subdirectories, and sure thing: https://developer.download.nvidia.com/compute/cuda/redist/nvidia_driver/LICENSE.txt | 08:53:38 |
Kevin Mittman (UTC-7) | Redacted or Malformed Event | 17:41:48 |
| 15 Mar 2023 |
connor (he/him) | Aw damn. Okay, I tried to rebase my OpenCV PR on staging and have it target staging instead of master, but when I pushed the rebase it automatically closed it: https://github.com/NixOS/nixpkgs/pull/218044.
What's the right way to keep the PR open and change the target branch as well as rebase? | 19:52:05 |
connor (he/him) | Best I could figure was to just make a new PR against staging: https://github.com/NixOS/nixpkgs/pull/221370 | 20:01:56 |
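[Editor's note: one sequence that avoids the empty-diff auto-close, sketched with made-up branch names in a throwaway repo. The GitHub-side step is to change the PR's base branch in the web UI first; then replay only the PR's own commits onto the new base with `git rebase --onto` and push with `--force-with-lease`, so the branch never momentarily matches its base.]

```shell
set -eu
tmp=$(mktemp -d); cd "$tmp"
git init -q
git checkout -qb master
git config user.email demo@example.com
git config user.name demo
git config commit.gpgsign false

echo a > a.txt; git add a.txt; git commit -qm "base commit"

# staging has diverged with its own commits, as in nixpkgs
git checkout -qb staging
echo s > s.txt; git add s.txt; git commit -qm "staging-only commit"

# the PR branch was originally cut from master
git checkout -q master
git checkout -qb my-pr
echo f > f.txt; git add f.txt; git commit -qm "PR commit"

# replay only the PR's own commits (master..my-pr) onto staging;
# in the real workflow you would now `git push --force-with-lease`
# to the PR branch, whose base was already retargeted in the UI
git rebase -q --onto staging master my-pr
git log --format=%s   # PR commit / staging-only commit / base commit
```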