NixOS CUDA - Public Room Timeline

	NixOS CUDA	290 Members
	CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda	57 Servers


Sender	Message	Time
13 Dec 2024
Moritz Sanft	I just noticed another weird thing while trying to hunt down that Perl dependency. As I'm building the driver for a server scenario, I removed the graphics and X11 stuff from the `libPath`, but I still had the Perl dependency in my image. When analyzing its chain, I saw the following:
/nix/store/1gx9dgmj33jd1753fww5cmq0q087q48n-nixos-system-nixos-25.05pre-git
└───/nix/store/czlpjck5z3vsgw1w9szinwnv15l4a2n3-system-path
    └───/nix/store/7m1var7g0swf2ikn3d3swsxk5w6lbcpv-nvidia-persistenced-550.90.07
        └───/nix/store/wh45iphj9kr43mxq0wks9qam2swabf6f-nvidia-x11-550.90.07-6.11
            └───/nix/store/lcq3ibmsb6c2jgqp3yfi1yp773x5wz19-mesa-24.2.6
                └───/nix/store/0i5icd6l3pkjckipa5f94jv7dsj5md70-lm-sensors-3.6.0
                    └───/nix/store/3vq9qasxlqpyq1k95nq3s13g2m6w59ay-perl-5.40.0
Now, when I remove the persistenced, the dependency is gone. This means that the persistenced somehow depends on a different NVIDIA driver than the one the system actually uses. The driver used in the system is at `/nix/store/zsdr4vrybbik9hb8nss6fbmi71wsqhv3-nvidia-x11-550.90.07-6.11`. When I run `nix derivation show /path/to/persistenced-package`, I see the following:
`"postFixup": "# Save a copy of persistenced for mounting in containers\nmkdir $out/origBin\ncp $out/{bin,origBin}/nvidia-persistenced\npatchelf --set-interpreter /lib64/ld-linux-x86-64.so.2 $out/origBin/nvidia-persistenced\n\npatchelf --set-rpath \"$(patchelf --print-rpath $out/bin/nvidia-persistenced):/nix/store/wh45iphj9kr43mxq0wks9qam2swabf6f-nvidia-x11-550.90.07-6.11/lib\" \\\n $out/bin/nvidia-persistenced\n",`
So a different driver is somehow used for building the persistenced? Looking at the packaging infrastructure, `nvidia_x11` is passed as an argument, which would mean that it should use the same one. However, I fear that there's some kind of evaluation-order difference here, as the persistenced package might be built before `hardware.nvidia.package` is even evaluated.
Has any of you ever run into something similar before?	08:41:49
Moritz Sanft	FWIW, I solved it with a very dirty hack that explicitly overrides the `nvidia_x11` used in `nvidia-persistenced`: https://github.com/edgelesssys/contrast/commit/5bf5cb81ce05f6f25b2cdf960ca3ab57a7f3459f	15:05:40
14 Dec 2024
matthewcroughan	Is there a way to wrap programs in Nix so that they believe they have a specific directory structure, like an FHS env, whilst not screwing around too much with things that impact CUDA/GPU access?	16:47:31
matthewcroughan	I'm trying to package an application that wants access to source code dir paths, and I think this would be a good use of a layer/wrapper that performs symlinking at runtime to change the view of the world from the perspective of the application: https://github.com/BatteredBunny/nix-ai-stuff/blob/main/pkgs/comfyui/default.nix#L54-L70 https://github.com/lboklin/nixified-ai/blob/master/projects/comfyui/package.nix#L116-L147	16:52:25
matthewcroughan	instead of doing it in the installPhase for example	16:52:32
matthewcroughan	if you enable cudaSupport and rocmSupport, what happens? Do you actually get an output that is usable for both?	20:24:53
sielicki	matthewcroughan: IMO it might be faster and better for you to write the missing pyproject.toml it needs	20:34:25
matthewcroughan	Would I not have to rewrite that each and every single time the owner updates the package?	20:34:49
matthewcroughan	I would rather write down the missing source/destination, instead of filling in for what the developer isn't doing	20:34:59
matthewcroughan	Does the pyproject.toml actually account for mutable vs immutable paths?	20:35:44
matthewcroughan	If it's a minimal enough difference then maybe I can submit it upstream and convince them to maintain it	20:36:25
sielicki	although python is interpreted, you still end up with an automatic installation phase that generates bytecode -- it really shouldn't be the case that it needs any access to .py files at runtime	20:36:59
matthewcroughan	But it does, and this is a pattern present in plenty of projects, and this also happens in PHP all the time	20:37:26
matthewcroughan	So we have to get over it somehow	20:37:37
matthewcroughan	I think the least evil method is wrappers that fool the application into seeing the FS you want it to see	20:38:18
matthewcroughan	Though maintaining that with a series of cp/ln in the installPhase of something is pretty annoying	20:38:38
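A minimal sketch of the "write down the source/destination" idea from the messages above, assuming a launcher that builds a symlinked view at runtime instead of a series of cp/ln calls in the installPhase. The function name `build_view`, the directory names, and the mapping are hypothetical and do not come from either linked repository.

```python
import tempfile
from pathlib import Path


def build_view(root: Path, mapping: dict[str, Path]) -> Path:
    """Create symlinks under `root` so that `root/<dest>` points at each source."""
    for dest, source in mapping.items():
        link = root / dest
        # Create intermediate directories of the view (and the root itself).
        link.parent.mkdir(parents=True, exist_ok=True)
        link.symlink_to(source, target_is_directory=source.is_dir())
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        # Stand-ins for a read-only store path and a mutable state directory.
        app_src = base / "store" / "myapp-src"
        models = base / "state" / "models"
        app_src.mkdir(parents=True)
        models.mkdir(parents=True)
        (app_src / "main.py").write_text("print('hello')\n")

        # The whole layout is declared as data: destination -> source.
        view = build_view(base / "view", {"app": app_src, "data/models": models})
        for entry in sorted(view.rglob("*")):
            print(entry.relative_to(view))
```

The application would then be started with the view directory as its working directory, so it sees one tree while the immutable and mutable parts stay in separate locations. Nothing here touches library paths or driver discovery, so GPU access should be unaffected, though that is an expectation rather than something this sketch checks.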
sielicki	I guess what I'm saying is that I think just copying the source into $out is insufficient and abnormal -- each of these should be a separate python module and the buildPhase should take care of both copying it into expected python path dir structures and byte-compiling it. Yes, it works to just have a py file instead of a pyc file and vice-versa, but you really do want the pyc files	20:44:58
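The bytecode point can be illustrated with a small sketch, assuming a standalone module compiled with the standard library's `py_compile` and then loaded after its source file has been deleted. The helper name `compile_and_load` and the module `plugin` are hypothetical. A real Python build places the `.pyc` files for you, and this only shows that the interpreter does not need the `.py` file to import a module.

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path


def compile_and_load(source: Path, module_name: str):
    """Byte-compile `source`, delete it, and import the module from the .pyc alone."""
    pyc = source.with_suffix(".pyc")
    py_compile.compile(str(source), cfile=str(pyc), doraise=True)
    source.unlink()  # from here on only the bytecode exists
    # A path ending in .pyc is handled by the sourceless bytecode loader.
    spec = importlib.util.spec_from_file_location(module_name, pyc)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "plugin.py"
        src.write_text("def answer():\n    return 42\n")
        plugin = compile_and_load(src, "plugin")
        print(plugin.answer())  # prints 42
```

This does not help an application that opens its own `.py` files as data or scans a source directory for plugins by filename, which is the situation described in the replies.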
matthewcroughan	> I guess what I'm saying is that I think just copying the source into $out is insufficient and abnormal
Yes, but abnormal applications exist, and make up the majority of nixpkgs	20:45:26
sielicki	as far as it relates to cuda -- as long as you have cudaified torch in the python environment and the shebangs are modified to refer to that python environment, I don't think there's any real concern around what you're doing in the installPhase	20:47:26
sielicki	just keep in mind the general restrictions around running cuda apps on non-nixos	20:47:42
matthewcroughan	yeah because the complexities of advertising more than nvidia/cuda outside of nixpkgs are high	20:50:21
matthewcroughan	even if it reduces closure size, the overhead of writing nix code that is able to provide nvidia/rocm separately is really a lot	20:50:49
matthewcroughan	I don't want to have attributes like `nvidia-myapp` `rocm-myapp` and `myapp`	20:51:36
matthewcroughan	would be much better to have a single derivation that is capable of all at runtime	20:52:00
sielicki	you shouldn't have to -- the torch input to your derivation will check whether nixpkgs was imported with config.cudaSupport and transparently do the right thing	20:52:11
matthewcroughan	yes, which means in my flake that isn't nixpkgs, I will have to import nixpkgs 3 times to handle those 3 cases	20:52:30
matthewcroughan	and also write nix code three times to handle the three cases	20:52:51
matthewcroughan	I could delay it by using an overlay and just not create 3 attributes upfront in the `packages` attribute of a flake	20:53:22
matthewcroughan	then it will depend on the importer, but then I can't just have people `nix run` my attributes; they will have to write their own flake to consume it and specify the support in their own flake (which IMO is not ideal)	20:53:50
