NixOS CUDA - Public Room Timeline

	NixOS CUDA	283 Members
	CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda	58 Servers


Sender	Message	Time
10 Sep 2025
	SomeoneSerge (back on matrix) changed their display name from SomeoneSerge (@nixcon & back on matrix) to SomeoneSerge (back on matrix).	00:35:23
SomeoneSerge (back on matrix)	softdep issues again?	00:43:26
zowoq	It only ever takes a couple of hours for our hydra to catch up. After the latest staging-next merge and subsequent nixos-unstable-small channel bump it took less than five hours for the full rebuild of the cuda jobset.	04:06:52
Albert Larsan	Here is my first version, from before I created a PR to add it to upstream nixpkgs: https://git.sr.ht/~albertlarsan68/nur/tree/8fcbc4612bcd097065c5691ca18cbc8f0e0825a0/item/pkgs/xmrig-cuda-mo/default.nix And here is the PR: https://github.com/NixOS/nixpkgs/pull/441494	05:59:05
Hugo	Thanks connor (he/him) (UTC+2). I can now launch the triton test. However, when I try to launch tests on the `unsloth` library, Nix builds Torch for Python 3.13 instead of Python 3.12. Torch is not supported on Python 3.13 yet, but it still attempts to build, which confuses me.
diff --git a/pkgs/development/python-modules/unsloth/default.nix b/pkgs/development/python-modules/unsloth/default.nix
index 73f94721b5e0..e6473c3bfa1d 100644
--- a/pkgs/development/python-modules/unsloth/default.nix
+++ b/pkgs/development/python-modules/unsloth/default.nix
@@ -27,6 +27,9 @@
   hf-transfer,
   diffusers,
   torchvision,
+
+  # tests
+  cudaPackages,
 }:
 
 buildPythonPackage rec {
@@ -85,6 +88,19 @@ buildPythonPackage rec {
   # NotImplementedError: Unsloth: No NVIDIA GPU found? Unsloth currently only supports GPUs!
   dontUsePythonImportsCheck = true;
 
+  passthru.tests = {
+    import-cuda = cudaPackages.writeGpuTestPython
+      {
+        libraries = ps: [
+          ps.torch
+        ];
+      }
+      ''
+        import unsloth
+        unsloth.test()
+      '';
+  };
+
   meta = {
     description = "Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory";
     homepage = "https://github.com/unslothai/unsloth";
`nix-build -I nixpkgs=. --arg config '{ allowUnfree = true; cudaSupport = true;}' -A python312Packages.unsloth.tests.import-cuda`	07:12:50
Gaétan Lepage	Why shouldn't torch support 3.13?	07:15:31
Hugo	Sorry, tensorflow does not support Python 3.13	07:16:41
Hugo	How can I get the passthru tests to use the specified Python interpreter?	07:20:59
SomeoneSerge (back on matrix)	connor (he/him) (UTC+2): are 7AM (2PM UTC) meetings still OK for you? Would you like to reschedule to a different time? We could have Kevin Mittman join us next time or some other time	07:48:33
SomeoneSerge (back on matrix)	Oh cool, I wonder if this is the 1st project consuming cuda or torch via meson in nixpkgs	07:52:33
SomeoneSerge (back on matrix)	AFAIK NVIDIA has never had any objections to ZLUDA, only AMD did	07:55:35
Hugo	I managed to launch a test on my package with CUDA enabled, but `triton` fails because it cannot find a C compiler. Does that ring a bell for anyone? `RuntimeError: Failed to find C compiler. Please specify via CC environment variable or set triton.knobs.build.impl.` My work in progress is in a draft PR here: https://github.com/NixOS/nixpkgs/pull/441728	09:59:29
SomeoneSerge (back on matrix)	In reply to @hugo:okeso.eu: At the very least we recently stopped early-binding ROCm libraries, maybe hard-coded compiler paths went with them. Try giving it a compiler at test time as suggested in the error?	10:05:31
Hugo	I (vibe) tried to give it a compiler here in this commit https://github.com/NixOS/nixpkgs/pull/441728/commits/81f7997ca1ca37193f2f26fdbc85c586a92ba6dd but was unsuccessful. Any suggestion how to do that?	10:06:58
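[Editor's sketch] The error quoted above asks for a compiler "via CC environment variable". One way to act on that from inside a Python test script is sketched below. It is only an illustration under stated assumptions: `ensure_cc` is a hypothetical helper (not part of triton or nixpkgs), the candidate names `cc`, `gcc`, and `clang` are guesses, and a compiler must already be on the test's `PATH` (in a Nix test that would mean adding one to the test's inputs). The helper would need to run before `triton` is imported or used.

```python
import os
import shutil


def ensure_cc(env=None, candidates=("cc", "gcc", "clang")):
    """Return a C compiler path, setting CC in `env` if it was not already set.

    Hypothetical helper: `env` defaults to the process environment, and
    `candidates` is an assumed list of compiler names to look up on PATH.
    """
    env = os.environ if env is None else env

    # Respect an explicit CC if the caller already provided one.
    if env.get("CC"):
        return env["CC"]

    # Otherwise take the first candidate that resolves on PATH.
    for name in candidates:
        path = shutil.which(name)
        if path:
            env["CC"] = path
            return path

    # Nothing found: leave the environment untouched and report it.
    return None


if __name__ == "__main__":
    print("CC =", ensure_cc())
```

If `ensure_cc()` returns `None`, no compiler is visible to the test at all, which would point at the test environment rather than at triton's lookup.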
	matthewcroughan changed their display name from matthewcroughan @ nixcon to matthewcroughan.	15:02:50
Lun	That should only impact ROCm unless I messed it up! diff was https://github.com/NixOS/nixpkgs/commit/c74e5ffb6526ac1b4870504921b9ba9362189a17	15:52:00
	layus joined the room.	18:26:24
layus	Is this team involved in the Flox/NVIDIA partnership? (See https://flox.dev/cuda/) I guess so, since the NixOS Foundation also is, but there is no mention of this team or its amazing work.	18:30:16
matthewcroughan	adrian-gierakowski:
!!! Exception during processing !!! HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
Traceback (most recent call last):
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/execution.py", line 496, in execute
    output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
                                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/execution.py", line 315, in get_output_data
    return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/execution.py", line 289, in _async_map_node_over_list
    await process_inputs(input_dict, i)
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/execution.py", line 277, in process_inputs
    result = f(*inputs)
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/nodes.py", line 74, in encode
    return (clip.encode_from_tokens_scheduled(tokens), )
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/comfy/sd.py", line 170, in encode_from_tokens_scheduled
    pooled_dict = self.encode_from_tokens(tokens, return_pooled=return_pooled, return_dict=True)
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/comfy/sd.py", line 232, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/comfy/sd1_clip.py", line 689, in encode_token_weights
    out = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/comfy/sd1_clip.py", line 45, in encode_token_weights
    o = self.encode(to_encode)
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/comfy/sd1_clip.py", line 291, in encode
    return self(tokens)
  File "/nix/store/jzm64j9dp50xs770h3w7n8h9pj6mpkjp-python3.13-torch-2.8.0/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/nix/store/jzm64j9dp50xs770h3w7n8h9pj6mpkjp-python3.13-torch-2.8.0/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/comfy/sd1_clip.py", line 253, in forward
    embeds, attention_mask, num_tokens, embeds_info = self.process_tokens(tokens, device)
                                                      ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/comfy/sd1_clip.py", line 204, in process_tokens
    tokens_embed = self.transformer.get_input_embeddings()(tokens_embed, out_dtype=torch.float32)
  File "/nix/store/jzm64j9dp50xs770h3w7n8h9pj6mpkjp-python3.13-torch-2.8.0/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/nix/store/jzm64j9dp50xs770h3w7n8h9pj6mpkjp-python3.13-torch-2.8.0/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/comfy/ops.py", line 270, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/nix/store/dg5g3ypdsjvy0274156l74klx4wr0nbx-comfyui-unstable-2025-09-06/lib/python3.13/site-packages/comfy/ops.py", line 266, in forward_comfy_cast_weights
    return torch.nn.functional.embedding(input, weight, self.padding_idx, self.max_norm, self.norm_type, self.scale_grad_by_freq, self.sparse).to(dtype=output_dtype)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/jzm64j9dp50xs770h3w7n8h9pj6mpkjp-python3.13-torch-2.8.0/lib/python3.13/site-packages/torch/nn/functional.py", line 2546, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.	19:53:26
matthewcroughan	After a whole day recompiling torch :)	19:53:32
matthewcroughan	Actually, with the HSA override to 11.0.0 it worked, but I get a different kind of error `loaded completely 30779.053608703613 1639.406135559082 True 0%| | 0/1 [00:00<?, ?it/s:0:rocdevice.cpp :3020: 78074348282d us: Callback: Queue 0x7f831c600000 aborting with error : HSA_STATUS_ERROR_INVALID_ISA: The instruction set architecture is invalid. code: 0x100f`	19:58:53
matthewcroughan	* Actually, with the HSA override to 11.0.0 it worked, but I get a different kind of error `0%| | 0/1 [00:00<?, ?it/s:0:rocdevice.cpp :3020: 78074348282d us: Callback: Queue 0x7f831c600000 aborting with error : HSA_STATUS_ERROR_INVALID_ISA: The instruction set architecture is invalid. code: 0x100f Aborted (core dumped) command nix "$@"`	19:59:14
Robbie Buxton	In reply to @layus:matrix.org: Ron mentioned them here https://discourse.nixos.org/t/nix-flox-nvidia-opening-up-cuda-redistribution-on-nix/69189/7	20:01:06
matthewcroughan	Is there a rocm room?	20:54:56
Lun	https://matrix.to/#/#ROCm:nixos.org	21:40:38
	connor (he/him) changed their display name from connor (he/him) (UTC+2) to connor (he/him) (UTC-7).	22:20:37
Gaétan Lepage	Well done guys for allowing this to happen (connor (he/him) (UTC-7) SomeoneSerge (back on matrix) stick...) 👏	23:07:17
Gaétan Lepage	* Well done guys for allowing this to happen (connor (he/him) (UTC-7) SomeoneSerge (back on matrix) stick Samuel Ainsworth...) 👏	23:22:06
SomeoneSerge (back on matrix)	The negotiations with NVIDIA have been run by Flox (although in parallel with many other companies' simultaneous inquiries). Ron kept us, the Foundation, and the SC in the loop, and offered both legal help and workforce. The current idea, roughly, is that the CUDA team gets access to the relevant repo and infra, and works closely with Flox to secure the position and a comms channel to NVIDIA.	23:26:05
hexa	What were the blockers for setting this up within the NixOS Foundation?	23:54:32


Room Version: 9