!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

2 Dec 2024
@glepage:matrix.orgGaétan Lepage
Well, it's still not perfect, hexa (UTC+1).
On the flaky Threadripper 3990X system, torchrl segfaults after the test suite has passed. It only occurs for python312Packages.torchrl but not for python311Packages.torchrl...
14:40:15
@glepage:matrix.orgGaétan Lepage
clipboard.png
14:40:51
@glepage:matrix.orgGaétan Lepage
Of course, it builds perfectly fine on two other AMD systems (Ryzen 5 5600X and Ryzen 9 3900)
14:41:38
@hexa:lossy.networkhexa
microcode updates applied?
14:42:05
@glepage:matrix.orgGaétan Lepage
I forgot how to check that
14:42:35
@glepage:matrix.orgGaétan Lepage
I think you told me a while ago
14:42:47
@hexa:lossy.networkhexa
intel be like
14:43:46
@hexa:lossy.networkhexa

[ 0.000000] microcode: updated early: 0xc6 -> 0xf8, date = 2024-02-01

14:43:51
@hexa:lossy.networkhexa
dmesg | grep microcode
14:44:36
@glepage:matrix.orgGaétan Lepage
[    2.153548] microcode: microcode updated early to new patch_level=0x0830107b
14:45:07
@glepage:matrix.orgGaétan Lepage
Nothing matches on my Ryzen system.
14:46:03
@glepage:matrix.orgGaétan Lepage *

TR system:

[    2.153548] microcode: microcode updated early to new patch_level=0x0830107b
...
[    2.154392] microcode: Microcode Update Driver: v2.2.
14:46:30
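
The check discussed above (grep the kernel log for `microcode`, then read off the revision) can be sketched in Python. This is only an illustration: it assumes the log lines look like the two formats quoted in this conversation, and the function name `loaded_microcode_revision` and the sample strings are made up for the example.

```python
import re

# Revision patterns for the two log formats quoted in the chat.
# AMD style:   "microcode: microcode updated early to new patch_level=0x0830107b"
# Intel style: "microcode: updated early: 0xc6 -> 0xf8, date = 2024-02-01"
AMD_RE = re.compile(r"patch_level=(0x[0-9a-fA-F]+)")
INTEL_RE = re.compile(r"updated early: 0x[0-9a-fA-F]+ -> (0x[0-9a-fA-F]+)")


def loaded_microcode_revision(dmesg_text):
    """Return the last microcode revision mentioned in the log as an int.

    Returns None when no line matches, which is what "nothing matches"
    looks like on a system that logged no early update.
    """
    revision = None
    for line in dmesg_text.splitlines():
        if "microcode:" not in line:
            continue
        match = AMD_RE.search(line) or INTEL_RE.search(line)
        if match:
            revision = int(match.group(1), 16)
    return revision


# Sample inputs copied from the messages above.
TR_LOG = (
    "[    2.153548] microcode: microcode updated early to new patch_level=0x0830107b\n"
    "[    2.154392] microcode: Microcode Update Driver: v2.2.\n"
)
INTEL_LOG = "[ 0.000000] microcode: updated early: 0xc6 -> 0xf8, date = 2024-02-01\n"

if __name__ == "__main__":
    print(hex(loaded_microcode_revision(TR_LOG)))
    print(hex(loaded_microcode_revision(INTEL_LOG)))
```

In practice the input would come from `dmesg` output rather than a literal string. The returned integer can then be compared against the `Patch=` value listed for the matching family and model in a firmware listing.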
@hexa:lossy.networkhexa
❯ ./amd_ucode_info.py kernel/x86/microcode/AuthenticAMD.bin 
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin:
  Family=0x10 Model=0x02 Stepping=0x03: Patch=0x01000083 Length=960 bytes
  Family=0x10 Model=0x02 Stepping=0x02: Patch=0x01000083 Length=960 bytes
  Family=0x10 Model=0x02 Stepping=0x0a: Patch=0x01000084 Length=960 bytes
  Family=0x10 Model=0x06 Stepping=0x02: Patch=0x010000c7 Length=960 bytes
  Family=0x10 Model=0x04 Stepping=0x03: Patch=0x010000c8 Length=960 bytes
  Family=0x10 Model=0x06 Stepping=0x03: Patch=0x010000c8 Length=960 bytes
  Family=0x10 Model=0x05 Stepping=0x03: Patch=0x010000c8 Length=960 bytes
  Family=0x10 Model=0x08 Stepping=0x01: Patch=0x010000d9 Length=960 bytes
  Family=0x10 Model=0x09 Stepping=0x01: Patch=0x010000d9 Length=960 bytes
  Family=0x10 Model=0x08 Stepping=0x00: Patch=0x010000da Length=960 bytes
  Family=0x10 Model=0x04 Stepping=0x02: Patch=0x010000db Length=960 bytes
  Family=0x10 Model=0x05 Stepping=0x02: Patch=0x010000db Length=960 bytes
  Family=0x10 Model=0x0a Stepping=0x00: Patch=0x010000dc Length=960 bytes
  Family=0x11 Model=0x03 Stepping=0x01: Patch=0x02000032 Length=512 bytes
  Family=0x12 Model=0x01 Stepping=0x00: Patch=0x03000027 Length=960 bytes
  Family=0x14 Model=0x01 Stepping=0x00: Patch=0x05000029 Length=1568 bytes
  Family=0x14 Model=0x02 Stepping=0x00: Patch=0x05000119 Length=1568 bytes
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin+0x318c:
  Family=0x15 Model=0x01 Stepping=0x02: Patch=0x0600063e Length=2592 bytes
  Family=0x15 Model=0x02 Stepping=0x00: Patch=0x06000852 Length=2592 bytes
  Family=0x15 Model=0x10 Stepping=0x01: Patch=0x06001119 Length=2592 bytes
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin+0x5050:
  Family=0x16 Model=0x00 Stepping=0x01: Patch=0x0700010f Length=3458 bytes
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin+0x5e06:
  Family=0x17 Model=0x01 Stepping=0x02: Patch=0x0800126f Length=3200 bytes
  Family=0x17 Model=0x31 Stepping=0x00: Patch=0x0830107c Length=3200 bytes
  Family=0x17 Model=0x08 Stepping=0x02: Patch=0x0800820d Length=3200 bytes
  Family=0x17 Model=0xa0 Stepping=0x00: Patch=0x08a00008 Length=3200 bytes
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin+0x9082:
  Family=0x19 Model=0x01 Stepping=0x00: Patch=0x0a00107a Length=5568 bytes
  Family=0x19 Model=0x11 Stepping=0x02: Patch=0x0a101248 Length=5568 bytes
  Family=0x19 Model=0xa0 Stepping=0x02: Patch=0x0aa00215 Length=5568 bytes
  Family=0x19 Model=0x01 Stepping=0x02: Patch=0x0a001238 Length=5568 bytes
  Family=0x19 Model=0x11 Stepping=0x01: Patch=0x0a101148 Length=5568 bytes
  Family=0x19 Model=0x01 Stepping=0x01: Patch=0x0a0011d5 Length=5568 bytes
  Family=0x19 Model=0xa0 Stepping=0x01: Patch=0x0aa00116 Length=5568 bytes
14:52:33
@hexa:lossy.networkhexa
this is the latest ucode for amd on nixpkgs master
14:52:39
@hexa:lossy.networkhexa
so 0x19 is family 25
14:53:59
@hexa:lossy.networkhexa
and for the model you probably have to binary-OR your model with 0xa0 if it is > 17
14:54:27
@glepage:matrix.orgGaétan Lepage
Ok, so anyway the issues are not a problem of the python package then
15:11:51
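
To look a CPU up in a listing like the one above, the decimal `cpu family`, `model` and `stepping` fields from `/proc/cpuinfo` need to be rendered as hex (family 25 is 0x19, as noted). A minimal sketch follows; the function name `cpu_signature` and the sample excerpt are hypothetical, and it assumes the `model` field can be compared directly with the `Model=` column without further bit manipulation, which may not hold for every CPU.

```python
def cpu_signature(cpuinfo_text):
    """Format the first CPU's family/model/stepping like the listing columns.

    Expects text in the "key : value" layout used by /proc/cpuinfo.
    Raises ValueError if any of the three fields is missing.
    """
    fields = {"cpu family": None, "model": None, "stepping": None}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Exact key match, so "model name" is not mistaken for "model".
        if sep and key in fields and fields[key] is None:
            fields[key] = int(value.strip())
    missing = [name for name, value in fields.items() if value is None]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return "Family=0x%02x Model=0x%02x Stepping=0x%02x" % (
        fields["cpu family"],
        fields["model"],
        fields["stepping"],
    )


# Hypothetical excerpt, not captured from any machine in this chat.
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "vendor_id\t: AuthenticAMD\n"
    "cpu family\t: 23\n"
    "model\t\t: 49\n"
    "model name\t: example CPU\n"
    "stepping\t: 0\n"
)

if __name__ == "__main__":
    print(cpu_signature(SAMPLE_CPUINFO))
```

The resulting string can be searched for in the `amd_ucode_info.py` output to find the row, and therefore the `Patch=` value, that applies to the machine.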
4 Dec 2024
@justbrowsing:matrix.orgKevin Mittman (UTC-8)
Anyone planning to attend PlanetNix (https://www.socallinuxexpo.org/scale/22x/events/planet-nix)? Looks like the CFP is still open
02:21:01
@connorbaker:matrix.orgconnor (he/him)
I probably will; I'm also planning to submit two talks
04:38:33
6 Dec 2024
@connorbaker:matrix.orgconnor (he/him)
Does anyone have a NixOS system they recommend using to test eval performance?
05:49:07
@connorbaker:matrix.orgconnor (he/him)
Ideally something which takes on the order of 30s or so to eval
05:51:25
7 Dec 2024
@ss:someonex.netSomeoneSerge (back on matrix)
In reply to @connorbaker:matrix.org
Does anyone have a NixOS system they recommend using to test eval performance?
Working on cppnix?
01:40:03
@connorbaker:matrix.orgconnor (he/him)
Kind of? More like preliminary work for a talk I plan to give at PlanetNix
06:52:28
@ss:someonex.netSomeoneSerge (back on matrix)
Looking forward to watching the recording :)
12:28:11
8 Dec 2024
@kaya:catnip.eekaya 𖤐
Hi, I'm attempting to upstream my Nix derivation for exllamav2 from https://github.com/BatteredBunny/nix-ai-stuff/blob/main/pkgs/exllamav2.nix
For some reason it's complaining about CUDA_HOME being missing even though I'm specifying it, which I'm kind of confused about. I thought maybe I would replace torch with torchWithCuda, but then I get some mysterious error which I don't get in the flake.
Has anyone had issues with anything similar? Current attempt, for anyone curious: https://gist.github.com/BatteredBunny/2212ac469f07244d954bf556f128cb07
16:42:14
@kaya:catnip.eekaya 𖤐 *
Pretty sure the biggest difference between the flake and my upstreaming attempt is that in the flake I have allowUnfree and cudaSupport set to true, but those options should carry over (I assume), as I also have them enabled on my NixOS and I'm building with --impure
16:44:11
