!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

274 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
55 Servers



Sender Message Time
4 Nov 2025
@arilotter:matrix.org Ari Lotter: let me compare my derivation with that one 22:31:40
@arilotter:matrix.org Ari Lotter: ok yeah, decently similar. difference is i'm building against cutlass 4.0 instead of 4.1, and.. somehow my deps list is wayy simpler, yet the build works (on previous versions of my derivation, pre updating CUDA)? very strange.. 22:35:13
@arilotter:matrix.org Ari Lotter:

but yeah i just smash into

> build/lib.linux-x86_64-cpython-312/flash_attn_2_cuda.cpython-312-x86_64-linux-gnu.so: PC-relative offset overflow in PLT entry for `_ZNK3c1010TensorImpl4sizeEl'
🤷
22:35:28
@arilotter:matrix.org Ari Lotter: i'm so tired of CUDA nightmares 😭 im so close to giving up and building dockerized devenvs, i just really don't want to give in..... :( 22:37:57
@glepage:matrix.org Gaétan Lepage: (It's a secret, but you might want to add https://cache.nixos-cuda.org as a substituter, it is slowly getting more and more artifacts)
Public key: cache.nixos-cuda.org:74DUi4Ye579gUqzH4ziL9IyiJBlDpMRn9MBN8oNan9M=
22:44:02
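One way to wire up the cache mentioned above, as a minimal sketch assuming a NixOS system configured through `nix.settings`. The URL and public key are copied from the message; everything else is illustrative.

```nix
# NixOS module fragment, e.g. part of configuration.nix.
{
  nix.settings = {
    # Adds the CUDA cache as a substituter. On NixOS the default
    # cache.nixos.org entry is merged in alongside this list.
    substituters = [ "https://cache.nixos-cuda.org" ];

    # Key quoted in the chat message above.
    trusted-public-keys = [
      "cache.nixos-cuda.org:74DUi4Ye579gUqzH4ziL9IyiJBlDpMRn9MBN8oNan9M="
    ];
  };
}
```

On non-NixOS installs the equivalent `nix.conf` settings would be `extra-substituters` and `extra-trusted-public-keys`. A cache hit still requires that the local nixpkgs revision and CUDA config match what the cache built.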
@glepage:matrix.org Gaétan Lepage: connor (burnt/out) (UTC-8), Serge and I got #457803 ready.
We are waiting for nixpkgs's CI to get fixed (https://github.com/NixOS/nixpkgs/pull/458647).
Let's merge ASAP
23:38:07
@sporeray:matrix.org Robbie Buxton: For flash attention you should use the version of cutlass in the repo 23:54:57
@sporeray:matrix.org Robbie Buxton: They have a hash 23:55:06
@sporeray:matrix.org Robbie Buxton: In csrc/cutlass 23:56:01
@sporeray:matrix.org Robbie Buxton: * They have a rev 23:56:25
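A hedged sketch of what "use the cutlass in the repo" could look like in a derivation's source: fetching flash-attention with its submodules, so that `csrc/cutlass` arrives at the rev upstream pinned. The `rev` value below is a hypothetical placeholder and the hash is deliberately fake; neither comes from the chat.

```nix
# Intended for pkgs.callPackage; returns the source tree only.
{ fetchFromGitHub, lib }:

fetchFromGitHub {
  owner = "Dao-AILab";
  repo = "flash-attention";
  # Hypothetical placeholder: substitute the tag or commit you actually build.
  rev = "v0.0.0";
  # Pulls csrc/cutlass at the rev recorded in the repo's submodule pointer.
  fetchSubmodules = true;
  # Fake hash: Nix reports the real one on the first build attempt.
  hash = lib.fakeHash;
}
```

With this source, the build can be pointed at `"${src}/csrc/cutlass"` instead of a separately packaged cutlass 4.0 or 4.1.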
5 Nov 2025
@apyh:matrix.org apyh: ah fair enough 00:10:30
@ss:someonex.net SomeoneSerge (back on matrix): step 1: torchWithCuda = pkgsCuda.....torch (we were supposed to be here now, but it got out of hand)
step 2: torchWithCuda = warn "..." pkgsCuda...
step 3: torchWithCuda = throw
00:12:18
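The three steps above could be sketched roughly as follows, using `lib.warn` and the builtin `throw`. `cudaTorch` is a hypothetical stand-in for the elided `pkgsCuda.....torch` attribute, and the messages are invented for illustration.

```nix
let
  pkgs = import <nixpkgs> { };
  inherit (pkgs) lib;

  # Hypothetical stand-in for the real pkgsCuda torch attribute, which the
  # chat message elides. pkgs.hello is used only so the sketch evaluates.
  cudaTorch = pkgs.hello;
in
{
  # step 1: plain alias to the CUDA-enabled package
  step1 = cudaTorch;

  # step 2: same value, but evaluation prints a deprecation warning
  step2 = lib.warn "torchWithCuda is deprecated; use torch from pkgsCuda" cudaTorch;

  # step 3: evaluation fails outright (attribute sets are lazy, so this only
  # triggers when step3 itself is evaluated)
  step3 = throw "torchWithCuda has been removed; use torch from pkgsCuda";
}
```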
@ss:someonex.net SomeoneSerge (back on matrix): and what we really want is late binding and incremental builds 00:13:41
@connorbaker:matrix.org connor (burnt/out) (UTC-8): Why are you building for so many CUDA capabilities? I can’t really think of a reason you’d need that range in particular. 01:59:14
@connorbaker:matrix.org connor (burnt/out) (UTC-8): Added to merge queue 02:07:23
@apyh:matrix.org apyh:
In reply to @connorbaker:matrix.org
Why are you building for so many CUDA capabilities? I can’t really think of a reason you’d need that range in particular.
's a distributed ml training application that needs to run on everything from gtx 10xx gpus to modern data center GH/GB200s :/
03:27:37
@apyh:matrix.org apyh: most common hardware is gonna be 30xx 40xx 50xx, h100, a100, b200 03:27:56
@apyh:matrix.org apyh: though.. i could just see what pytorch precompiled wheels run on and limit to that 03:28:54
@apyh:matrix.org apyh: should be fine 03:28:56
@connorbaker:matrix.org connor (burnt/out) (UTC-8): Flash attention doesn’t support anything older than Ampere I thought 03:29:07
@sporeray:matrix.org Robbie Buxton: V2 does 03:29:19
@sporeray:matrix.org Robbie Buxton: V3 is hopper only 03:29:24
@apyh:matrix.org apyh: ya its only v3 iirc 03:29:26
@sporeray:matrix.org Robbie Buxton: V4 (cute) is Blackwell 03:29:33
@sporeray:matrix.org Robbie Buxton: But that’s a wip 03:29:38
@apyh:matrix.org apyh: and yeah fair enough I could drop fa for older gpus, ig i can provide cuda capabilities different per package 03:30:06
@sporeray:matrix.org Robbie Buxton: Requires cutlass-dsl 03:30:07
@connorbaker:matrix.org connor (burnt/out) (UTC-8): V2 doesn’t support older than Ampere per their readme, unless they forgot to update it 03:30:25
@apyh:matrix.org apyh:
In reply to @connorbaker:matrix.org
V2 doesn’t support older than Ampere per their readme, unless they forgot to update it
yeah makes sense then maybe I'll just drop and see if anyone complains lol
03:30:43
@sporeray:matrix.org Robbie Buxton:
In reply to @connorbaker:matrix.org
V2 doesn’t support older than Ampere per their readme, unless they forgot to update it
It’s not optimized but it runs
03:30:46
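The "different capabilities per package" idea from this exchange could be sketched by instantiating nixpkgs twice with different `cudaCapabilities`. The capability lists are an assumption mapped from the hardware named above (6.1 for GTX 10xx, 8.0 for A100, 8.6 for 30xx, 8.9 for 40xx, 9.0 for H100/GH200, 10.0 for B200/GB200, 12.0 for 50xx). Which of these a given CUDA release in nixpkgs accepts may vary, and `./flash-attn.nix` is a hypothetical local derivation.

```nix
let
  # Assumption: <nixpkgs> resolves to a recent nixpkgs checkout.
  mkPkgs = cudaCapabilities: import <nixpkgs> {
    config = {
      allowUnfree = true;   # CUDA packages are unfree
      cudaSupport = true;
      inherit cudaCapabilities;
    };
  };

  # Pascal through Blackwell, for packages that must run everywhere.
  pkgsWide = mkPkgs [ "6.1" "8.0" "8.6" "8.9" "9.0" "10.0" "12.0" ];

  # Ampere and newer only, for flash-attention.
  pkgsAmpereUp = mkPkgs [ "8.0" "8.6" "8.9" "9.0" "10.0" "12.0" ];
in
{
  torch = pkgsWide.python3Packages.torch;

  # Hypothetical: a local flash-attention derivation built with the
  # narrower capability list.
  # flashAttn = pkgsAmpereUp.python3Packages.callPackage ./flash-attn.nix { };
}
```

One caveat with this design: each package set carries its own torch, so a flash-attention built from `pkgsAmpereUp` links against that set's torch rather than the one from `pkgsWide`. Mixing the two in one environment needs care.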



Room Version: 9