NixOS CUDA - Public Room Timeline

	NixOS CUDA	340 Members
	CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda	64 Servers


Sender	Message	Time
21 Mar 2023
mjlbach	Also Tyler (tbenst) doesn't really use nix anymore, I saw some people pinging him for approval on magma and such. I just texted him to confirm (we are friends IRL)	02:30:40
connor (he/him)	That's correct, it's only forward not backward compatible. Come to think of it, should we even have a default set of cuda architectures we build for? If we're looking at making the default just the single test architecture, why bother having a default? Why not require the user to specify what they want?	03:21:08
connor (he/him)	In reply to @atrius:matrix.org: "Also Tyler (tbenst) doesn't really use nix anymore, I saw some people pinging him for approval on magma and such. I just texted him to confirm (we are friends IRL)" Would he be okay if I removed him from stuff like Magma? What's the recommended way to handle that?	03:24:57
mjlbach	I just texted him, will let you know :)	03:25:33
mjlbach	In reply to @connorbaker:matrix.org: "That's correct, it's only forward not backward compatible. Come to think of it, should we even have a default set of cuda architectures we build for? If we're looking at making the default just the single test architecture, why bother having a default? Why not require the user to specify what they want?" Is there a reason for targeting 90 instead of 86 + PTX	03:26:27
mjlbach	The only issue with having users specify the default is it's not exactly clear	03:27:10
mjlbach	I guess you can just put a link to https://developer.nvidia.com/cuda-gpus for people to figure out the needed versions	03:28:20
connor (he/him)	In reply to @atrius:matrix.org: "Is there a reason for targeting 90 instead of 86 + PTX" Not really; if this is in reference to https://github.com/NixOS/nixpkgs/issues/221564#issuecomment-1477191551 that was just because it's the latest. 86+PTX would give the best performance with the largest compatibility with Nixpkgs as it is currently, I think (latest version Torch 1.13 supports for example). 50+PTX would give the broadest HW support with (possibly) the worst performance.	03:30:54
mjlbach	Ah, sorry for the phrasing, was suggesting 86 :)	03:31:37
mjlbach	* Ah, sorry for the phrasing, was suggesting 86, 89, 90 + PTX :)	03:31:56
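A rough sketch of what a capability list such as "86, 89, 90 + PTX" means at the compiler level, assuming the usual nvcc `-gencode` convention. The helper name `gencode_flags` is hypothetical and is not part of nixpkgs; nixpkgs derives its flags in Nix, so this Python version only illustrates the mapping. It assumes plain "major.minor" capability strings.

```python
def gencode_flags(capabilities, ptx=True):
    """Build nvcc -gencode flags for a list of compute capabilities.

    capabilities: strings such as "8.6" (plain major.minor only).
    ptx: also embed PTX for the newest listed capability, so that
         GPUs newer than any listed target can JIT-compile it.
    """
    if not capabilities:
        raise ValueError("at least one compute capability is required")

    # Sort numerically so that e.g. "10.0" would sort after "9.0".
    ordered = sorted(
        set(capabilities),
        key=lambda cap: tuple(int(part) for part in cap.split(".")),
    )

    flags = []
    for cap in ordered:
        arch = cap.replace(".", "")
        # Native machine code for exactly this architecture.
        flags.append(f"-gencode=arch=compute_{arch},code=sm_{arch}")

    if ptx:
        newest = ordered[-1].replace(".", "")
        # PTX for the newest target: forward compatible, not backward.
        flags.append(f"-gencode=arch=compute_{newest},code=compute_{newest}")

    return flags


if __name__ == "__main__":
    for flag in gencode_flags(["8.6", "8.9", "9.0"]):
        print(flag)
```

Embedding PTX only for the newest capability mirrors the "forward, not backward compatible" point made earlier: older GPUs cannot use it, so they need their own native entry in the list.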
mjlbach	I also finally got pytorch building with poetry2nix...	03:32:25
connor (he/him)	In reply to @atrius:matrix.org: "The only issue with having users specify the default is its not exactly clear" that's definitely an issue! Personally I'd be a fan of an assert like `cudaSupport -> cudaCapabilities != [ ]` in `cudaPackages` which points the user to docs about how to build for a specific target/how to find out the cuda compute capability of their device	03:33:32
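The proposed check is a logical implication: if CUDA support is on, the capability list must be non-empty. A minimal sketch of that behaviour in Python, assuming the error message should link to the NVIDIA page mentioned above. The function name `check_cuda_config` is hypothetical; the real check would live in Nix inside `cudaPackages`.

```python
CUDA_GPUS_URL = "https://developer.nvidia.com/cuda-gpus"


def check_cuda_config(cuda_support, cuda_capabilities):
    """Enforce `cuda_support -> cuda_capabilities != []`.

    An implication `a -> b` only fails when `a` is true and `b` is false,
    so a build without CUDA support passes regardless of the list.
    """
    if cuda_support and not cuda_capabilities:
        raise ValueError(
            "cudaSupport is enabled but cudaCapabilities is empty; "
            "look up your device's compute capability at "
            f"{CUDA_GPUS_URL} and list it explicitly"
        )


if __name__ == "__main__":
    check_cuda_config(False, [])       # passes: CUDA is off
    check_cuda_config(True, ["8.6"])   # passes: a target is given
    try:
        check_cuda_config(True, [])    # fails: CUDA on, no target
    except ValueError as err:
        print(err)
```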
mjlbach	Maybe there should be a default cuda flake with instructions that shows how to override it	03:34:31
mjlbach	* Maybe there should be a default cuda flake with instructions that shows how to override it/ with the doc link embedded	03:34:41
mjlbach	https://github.com/cpcloud/torch-p2nix/blob/81f318026f19e1a2a41cf6126f8e1cd5a7fab8be/flake.nix#L24-L61	03:35:03
mjlbach	The issue with torch is that I can't see how one derivation is going to provide support for building all versions of torch that are currently in circulation	03:35:41
SomeoneSerge (matrix works sometimes)	In reply to @atrius:matrix.org: "The issue with torch is that I can't see how one derivation is going to provide support for building all versions of torch that are currently in circulation" > "all versions of torch that are currently in circulation" You mean semvers?	11:10:56
SomeoneSerge (matrix works sometimes)	LoL, I think supplying a global `cudaSupport` overrides `USE_CUDA` in the ROCM version of torch and spoils nixpkgs-review 🙃	11:26:53
SomeoneSerge (matrix works sometimes)	Also, we ought to do something about these `pytestCheckPhase`s, they really don't go well in parallel	11:27:55
connor (he/him)	Does anyone else set `max-jobs = auto` in their nix configuration? I've found that I need to limit it to 2 or similar to prevent `nixpkgs-review` from building several copies of torch and Jax in parallel lol	14:27:15
connor (he/him)	In reply to @ss:someonex.net: "`crt/host_defines.h` and such are shipped in `cuda_nvcc` -> currently we can't simply drop `cuda_nvcc` in `nativeBuildInputs`, but have to sometimes add it to `buildInputs` as well. We should split the outputs" in your experience with the redist packages, is splitting outputs going to be simple or require doing so on a case-by-case basis? I'm going to make some more issues tonight and wanted to know if I should make a single issue for splitting the outputs or if I should find the worst offenders by closure size and make tickets for them specifically.	14:30:09
SomeoneSerge (matrix works sometimes)	In reply to @connorbaker:matrix.org: "in your experience with the redist packages, is splitting outputs going to be simple or require doing so on a case-by-case basis? I'm going to make some more issues tonight and wanted to know if I should make a single issue for splitting the outputs or if I should find the worst offenders by closure size and make tickets for them specifically." I have no idea. We just need to try editing `build-cuda-redist-package.nix` and see if that breaks any cmake/pkg-config discovery downstream	14:32:20
SomeoneSerge (matrix works sometimes)	Meanwhile, I just noticed we don't actually patch cuda's `.pc` files. File `cudaPackages/pkg-config/nvrtc-11.7.pc`: `cudaroot=/usr/local/cuda-11.7`; `libdir=${cudaroot}/targets/x86_64-linux/lib`; `includedir=${cudaroot}/targets/x86_64-linux/include`; `Name: nvrtc`; `Description: A runtime compilation library for CUDA C++`; `Version: 11.7`; `Libs: -L${libdir} -lnvrtc`; `Cflags: -I${includedir}`. I guess all of the automatic discovery we had worked through `FindCUDAToolkit.cmake` and not pkg-config	14:34:03
SomeoneSerge (matrix works sometimes)	Either way, we definitely should replace this `/usr/local` stuff	14:34:22
SomeoneSerge (matrix works sometimes)	Which, CC Kevin Mittman 😆, isn't technically permitted by the license	14:35:20
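For the `.pc` problem above, a minimal sketch of the kind of rewrite that would be needed, assuming it is enough to repoint the `cudaroot` variable. The function name `patch_pc_prefix` and the `/opt/cuda` target are hypothetical. The `targets/x86_64-linux/lib` suffix in `libdir` is left untouched and may not match the layout of a given package output, so treat this as an illustration rather than a working fix for nixpkgs.

```python
import re


def patch_pc_prefix(pc_text, new_root):
    """Return pc_text with its `cudaroot=` line pointing at new_root.

    Only the variable definition is rewritten; lines that refer to it
    through ${cudaroot} pick up the new value when pkg-config expands them.
    """
    patched, count = re.subn(
        r"^cudaroot=.*$",
        # A callable replacement avoids backslash-escape handling in new_root.
        lambda _match: f"cudaroot={new_root}",
        pc_text,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no cudaroot= line found in the .pc file")
    return patched


if __name__ == "__main__":
    original = (
        "cudaroot=/usr/local/cuda-11.7\n"
        "libdir=${cudaroot}/targets/x86_64-linux/lib\n"
        "includedir=${cudaroot}/targets/x86_64-linux/include\n"
    )
    # "/opt/cuda" stands in for whatever prefix the package really uses.
    print(patch_pc_prefix(original, "/opt/cuda"))
```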
Kevin Mittman (UTC-7)	New release of nvJPEG2000 https://developer.download.nvidia.com/compute/nvjpeg2k/redist/	14:36:37
SomeoneSerge (matrix works sometimes)	!!!!	14:36:57
Kevin Mittman (UTC-7)	Formulating a response to inquiry but wording is hard	14:49:39
SomeoneSerge (matrix works sometimes)	Tfw upstream has `- name: Patch setup.py` in their github workflows (looking at openai/triton which we now need for pytorchWithRocm...)	14:57:50
SomeoneSerge (matrix works sometimes)	* Tfw upstream has `- name: Patch setup.py` in their github workflows (looking at openai/triton which we now need for pytorchWithRocm...): "why would you hide this from me?.."	15:17:03
