NixOS CUDA | 310 Members | 61 Servers
| CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda |
| Sender | Message | Time |
|---|---|---|
| 11 Jun 2024 | | |
| https://github.com/NixOS/nixpkgs/blob/master/pkgs/by-name/ol/ollama/package.nix#L65-L82 | 13:21:07 | |
The shouldEnable logic looks maybe a bit complex but the arguments seem good? | 13:23:21 | |
In reply to @gjvnq:matrix.org: Looking more closely, I'd guess the issue is somewhere around __has_include(<Imath/half.h>) in ${openimageio.dev}/include/OpenImageIO/half.h | 14:49:05 |
| SomeoneSerge (UTC+3): TIL, LD_DEBUG looks quite useful. I suppose the "stub" referred to in the message concerns /nix/store/q3m473lh6gcg4xbhbknrhmcj7w7njjs6-cuda_cudart-12.2.140-lib/lib/stubs/glibc-hwcaps/x86-64-v3. Do you know what a "stub" is and why it would be a problem? I understand a "stub" as a "generic" library? (I have an RTX 3060) | 16:34:54 |
teto: as I understand it, we use stub libraries when the libraries we would link against aren't available -- for example, because they exist outside the sandbox (like libcuda.so does, as part of the NVIDIA driver, in /run/opengl-driver/lib/). They allow the build to succeed where it would otherwise fail due to missing symbols. They shouldn't cause issues at runtime, because the executable should find and load the proper library from wherever it comes from (in this case, /run/opengl-driver/lib/). | 18:05:24 |
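| The stub arrangement described above can be sketched as a hypothetical derivation (not an actual nixpkgs package): the linker resolves driver-API symbols against the stubs shipped with cuda_cudart inside the sandbox, while the real libcuda.so is found at runtime under /run/opengl-driver/lib.

```nix
# Hypothetical sketch of a CUDA-using package; names other than
# cudaPackages.cuda_cudart are placeholders for illustration.
{ stdenv, cudaPackages }:

stdenv.mkDerivation {
  pname = "cuda-hello";
  version = "0.1";
  src = ./.;

  # cuda_cudart ships lib/stubs/libcuda.so, which satisfies the linker
  # even though the real libcuda.so (part of the NVIDIA driver) is not
  # available inside the build sandbox.
  buildInputs = [ cudaPackages.cuda_cudart ];

  # At runtime the dynamic loader is expected to pick up the real
  # libcuda.so from /run/opengl-driver/lib, not the stub.
}
```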
| I don't seem to have any CUDA library in /run/opengl-driver/. Should I add anything to hardware.opengl.extraPackages? | 18:17:25 |
You mean /run/opengl-driver/lib/ and not /run/opengl-driver/ right? | 18:41:09 | |
| I've searched both in depth so yes | 19:53:53 | |
| What's the command you're using to try to run this piece of software? If it's a flake I can try to reproduce it on my machine | 20:37:10 | |
In reply to @ss:someonex.net: Yeah, I had already figured that out, but the issue is that I don't know the "right" way to include the definition of the half type. To make matters worse, I've tried to compile AliceVision in a Docker container using the official compilation scripts, and yet it keeps failing. This means I can't even see how the thing is supposed to compile. I'm at a bit of a loss as to how to proceed, but I suspect I'll have to either ask the original authors for help or carefully read the CMake scripts to look for potential sources of the error. Theoretically AliceVision has a nice CI pipeline, but I can't see their build history, so I don't even know how useful their CI scripts are. | 20:44:35 |
connor (he/him) (UTC-5): it's packaged in nixpkgs, nix run nixpkgs#local-ai (you need the override with config.cudaSupport true, though). At one point I had the GPU working, but I use it on and off, and something has probably changed in nixpkgs since. | 21:11:33 |
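| A sketch of the override mentioned above, assuming the usual nixpkgs conventions: instantiate nixpkgs with config.cudaSupport = true (CUDA packages also require allowUnfree) and build local-ai from that.

```nix
# Hedged sketch: local-ai with CUDA support via nixpkgs config,
# not a definitive recipe.
let
  pkgs = import <nixpkgs> {
    config = {
      allowUnfree = true;   # the CUDA toolkit is unfree
      cudaSupport = true;   # flips CUDA on for packages that honor it
    };
  };
in
pkgs.local-ai
```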
In reply to @keiichi:matrix.org: the state of hardware.opengl.enable and the nvidia driver?.. | 21:29:30 |
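| For reference, a minimal NixOS configuration along these lines (option names as of mid-2024; hardware.opengl has since been renamed in newer releases) that should make the driver's userspace libraries, libcuda.so included, appear under /run/opengl-driver/lib:

```nix
# Minimal sketch, assuming an X11/Wayland-agnostic NVIDIA setup;
# adjust to your hardware before relying on it.
{ config, ... }:
{
  hardware.opengl.enable = true;
  services.xserver.videoDrivers = [ "nvidia" ];
  hardware.nvidia.open = false;  # proprietary kernel module
}
```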
What revision of nixpkgs are you on? master fails to build (go-stable-diffusion errors during CMake configure) | 21:50:56 | |
| Ha, never mind, I do have libcuda.so and so on in /run/opengl-driver/lib (my first search must have ignored symlinks). I use local-ai from nixos-unstable, so that would be c7b821ba2e1e635ba5a76d299af62821cbcb09f3 | 21:58:35 |
| Huh, not sure how that's working for you, since I can't get it to build: | 22:08:21 | |
Oh! SomeoneSerge (UTC+3) I was rebuilding OpenCV4 with the changes to the setup hooks you mentioned earlier about the CMake flags being opt-in, and I noticed that switching --compiler-bindir to -ccbin was enough to get rid of the "incompatible redefinition" warnings we've been seeing with CMake: https://github.com/NixOS/nixpkgs/pull/306172/commits/7dc8d6d83a853f98a695e2b23aa8d33a50aff6df#diff-3692a7105fd90d95727cd2f794cdb4af2656be94af52d97485c9d4ded9107883L93-R72 | 22:23:06 | |
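| For context on the flag swap above: -ccbin is nvcc's short form of --compiler-bindir (both select the host compiler). A hedged sketch of how one might pass it through CMake; CMAKE_CUDA_FLAGS is standard CMake, but the surrounding attribute names are assumptions, not the actual PR:

```nix
# Fragment from a hypothetical mkDerivation call; cudaPackages is
# assumed to be in scope. backendStdenv is nixpkgs' CUDA-compatible
# compiler environment.
{
  cmakeFlags = [
    "-DCMAKE_CUDA_FLAGS=-ccbin ${cudaPackages.backendStdenv.cc}/bin/cc"
  ];
}
```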
| Do you guys have any recommendations on setting up a cache for https://github.com/hasktorch/hasktorch-skeleton/pull/9 ? The Haskell part of the build already takes quite a long time; throw CUDA into the mix and I think build times would quickly get out of hand. Either way, I've never set up a cache via Cachix or Hercules CI before, so I don't know what the limits on build times are either. | 23:31:32 |
| 12 Jun 2024 | | |
| SomeoneSerge (UTC+3): Any thoughts? https://github.com/NixOS/nixpkgs/issues/319167 | 00:36:45 | |
| This strikes me as a rather odd thing to even be checking for as a user. | 00:39:26 | |
In reply to @trexd:matrix.org: Depends on how much space you need. Cachix gives 5 GB for free, from what I remember. For a single project, that may be enough. For anything larger than that, you're in for a world of unpleasantness depending on storage requirements. Even just hosting a binary cache via S3, you're going to be paying for each API call (of which Nix will generate many, like an HTTP HEAD request for each narinfo file, not to mention the actual retrieval of the data) as well as egress, which adds up very quickly. | 03:37:41 |
| The cheapest I've managed so far is a Hetzner instance with a 7950X3D and a 10 GbE NIC for maybe $150 a month. Download and upload speeds certainly aren't saturating the NIC, but until I rewrite enough of Attic to run it fully serverless via Cloudflare Workers/R2/D1/KV, that's the best I'm going to get, I think. | 03:42:09 |
In reply to @connorbaker:matrix.org: I figured time would be the more important factor rather than storage? 🤔 I don't imagine the package taking more than 5 GB. Is there a Nix command I can use to get the total size of a build? | 12:34:38 |
`nix path-info -rsSh <flake attribute or store path>` IIRC | 12:36:40 |
| 7.9G RIP 🫠 | 12:44:12 | |
| To be fair, IIRC that's the uncompressed size, and if some paths are already cached in the main NixOS cache, they wouldn't count against you. Not sure Domen allows specifying other upstream caches, though (avoiding caching some of the CUDA dependencies would probably be best). | 13:41:29 |
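| On the client side, layering an extra cache over cache.nixos.org is just a matter of substituters in the Nix configuration. A sketch assuming a Cachix cache named "mycache" (the name and its public key are placeholders):

```nix
# NixOS module fragment; "mycache" and <public-key> are placeholders.
# The cache.nixos.org key is the well-known upstream one.
{
  nix.settings = {
    substituters = [
      "https://cache.nixos.org"
      "https://mycache.cachix.org"
    ];
    trusted-public-keys = [
      "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
      "mycache.cachix.org-1:<public-key>"
    ];
  };
}
```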
| I guess I'll play around with it and see what's up. Is pinning to a successful build of nixpkgs-cuda-ci still the best way to get CUDA hits? | 13:43:59 | |
| I believe so, yes | 15:00:37 | |
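| A sketch of that pinning approach, assuming a flake-based project: point the nixpkgs input at the exact revision a nixpkgs-cuda-ci build succeeded with (revision left as a placeholder).

```nix
# flake.nix fragment; substitute a revision known to have cache hits.
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/<rev-with-cache-hits>";
}
```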
In reply to @aidalgol:matrix.org: I suppose we need some kind of fixpoint 🤷 Honestly, not sure if this is worth the effort 🙃 | 16:24:04 |