!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

290 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
57 Servers

30 Nov 2024
@ss:someonex.netSomeoneSerge (back on matrix) Ah, I was just commenting on that on github. I think open being a passthru of nvidia_x11 is a constant source of confusion and we should just move it out to nvidiaPackages. AFAIU the only way open depends on nvidia_x11 is for version numbers... 13:00:29
2 Dec 2024
@hexa:lossy.networkhexa Gaétan Lepage: so when are we killing the tests on tensordict? 00:53:22
@glepage:matrix.orgGaétan Lepage:')07:24:29
@glepage:matrix.orgGaétan LepageAt least the blocking one I guess07:24:35
@glepage:matrix.orgGaétan Lepage Should be fixed by https://github.com/NixOS/nixpkgs/pull/361008. 07:57:50
@glepage:matrix.orgGaétan Lepage
I went through both tensordict and torchrl. They should be more robust now.
14:06:35
@glepage:matrix.orgGaétan Lepage Well, it's still not perfect hexa (UTC+1).
On the flaky Threadripper 3990X system, torchrl segfaults after the test suite has passed. It only occurs for python312Packages.torchrl but not for python311Packages.torchrl...
14:40:15
@glepage:matrix.orgGaétan Lepageclipboard.png
14:40:51
@glepage:matrix.orgGaétan LepageOf course, it builds perfectly fine on two other AMD systems (Ryzen 5 5600X and Ryzen 9 3900)14:41:38
@hexa:lossy.networkhexamicrocode updates applied?14:42:05
@glepage:matrix.orgGaétan LepageI forgot how to check that14:42:35
@glepage:matrix.orgGaétan LepageI think you told me a while ago14:42:47
@hexa:lossy.networkhexaintel be like14:43:46
@hexa:lossy.networkhexa

[ 0.000000] microcode: updated early: 0xc6 -> 0xf8, date = 2024-02-01

14:43:51
@hexa:lossy.networkhexadmesg | grep microcode14:44:36
@glepage:matrix.orgGaétan Lepage
[    2.153548] microcode: microcode updated early to new patch_level=0x0830107b
14:45:07
@glepage:matrix.orgGaétan LepageNothing matches on my Ryzen system.14:46:03
@glepage:matrix.orgGaétan Lepage *

TR system:

[    2.153548] microcode: microcode updated early to new patch_level=0x0830107b
...
[    2.154392] microcode: Microcode Update Driver: v2.2.
14:46:30
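The two `dmesg | grep microcode` outputs pasted above use different line formats. As a rough sketch, they could be parsed like this. The regular expressions are written only against the two lines quoted in this log, and the `vendor` labels are an assumption based on hexa's "intel be like" remark and the Threadripper output; other kernel versions may print something different.

```python
import re

# Patterns derived only from the two dmesg lines quoted in this chat.
# Assumption: the "old -> new, date" form is the Intel one and the
# "patch_level=" form is the AMD one.
INTEL_RE = re.compile(
    r"microcode: updated early: (0x[0-9a-fA-F]+) -> (0x[0-9a-fA-F]+), "
    r"date = (\d{4}-\d{2}-\d{2})"
)
AMD_RE = re.compile(
    r"microcode: microcode updated early to new patch_level=(0x[0-9a-fA-F]+)"
)


def parse_microcode_line(line):
    """Return a dict describing an early microcode update, or None."""
    m = INTEL_RE.search(line)
    if m:
        return {
            "vendor": "intel",
            "old": int(m.group(1), 16),
            "new": int(m.group(2), 16),
            "date": m.group(3),
        }
    m = AMD_RE.search(line)
    if m:
        return {"vendor": "amd", "new": int(m.group(1), 16)}
    # Lines such as "Microcode Update Driver: v2.2." carry no revision.
    return None


if __name__ == "__main__":
    lines = [
        "[ 0.000000] microcode: updated early: 0xc6 -> 0xf8, date = 2024-02-01",
        "[    2.153548] microcode: microcode updated early to new patch_level=0x0830107b",
        "[    2.154392] microcode: Microcode Update Driver: v2.2.",
    ]
    for line in lines:
        print(parse_microcode_line(line))
```

A `None` result only means the line did not match either pattern; it does not by itself show that no microcode update was applied.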
@hexa:lossy.networkhexa
❯ ./amd_ucode_info.py kernel/x86/microcode/AuthenticAMD.bin 
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin:
  Family=0x10 Model=0x02 Stepping=0x03: Patch=0x01000083 Length=960 bytes
  Family=0x10 Model=0x02 Stepping=0x02: Patch=0x01000083 Length=960 bytes
  Family=0x10 Model=0x02 Stepping=0x0a: Patch=0x01000084 Length=960 bytes
  Family=0x10 Model=0x06 Stepping=0x02: Patch=0x010000c7 Length=960 bytes
  Family=0x10 Model=0x04 Stepping=0x03: Patch=0x010000c8 Length=960 bytes
  Family=0x10 Model=0x06 Stepping=0x03: Patch=0x010000c8 Length=960 bytes
  Family=0x10 Model=0x05 Stepping=0x03: Patch=0x010000c8 Length=960 bytes
  Family=0x10 Model=0x08 Stepping=0x01: Patch=0x010000d9 Length=960 bytes
  Family=0x10 Model=0x09 Stepping=0x01: Patch=0x010000d9 Length=960 bytes
  Family=0x10 Model=0x08 Stepping=0x00: Patch=0x010000da Length=960 bytes
  Family=0x10 Model=0x04 Stepping=0x02: Patch=0x010000db Length=960 bytes
  Family=0x10 Model=0x05 Stepping=0x02: Patch=0x010000db Length=960 bytes
  Family=0x10 Model=0x0a Stepping=0x00: Patch=0x010000dc Length=960 bytes
  Family=0x11 Model=0x03 Stepping=0x01: Patch=0x02000032 Length=512 bytes
  Family=0x12 Model=0x01 Stepping=0x00: Patch=0x03000027 Length=960 bytes
  Family=0x14 Model=0x01 Stepping=0x00: Patch=0x05000029 Length=1568 bytes
  Family=0x14 Model=0x02 Stepping=0x00: Patch=0x05000119 Length=1568 bytes
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin+0x318c:
  Family=0x15 Model=0x01 Stepping=0x02: Patch=0x0600063e Length=2592 bytes
  Family=0x15 Model=0x02 Stepping=0x00: Patch=0x06000852 Length=2592 bytes
  Family=0x15 Model=0x10 Stepping=0x01: Patch=0x06001119 Length=2592 bytes
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin+0x5050:
  Family=0x16 Model=0x00 Stepping=0x01: Patch=0x0700010f Length=3458 bytes
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin+0x5e06:
  Family=0x17 Model=0x01 Stepping=0x02: Patch=0x0800126f Length=3200 bytes
  Family=0x17 Model=0x31 Stepping=0x00: Patch=0x0830107c Length=3200 bytes
  Family=0x17 Model=0x08 Stepping=0x02: Patch=0x0800820d Length=3200 bytes
  Family=0x17 Model=0xa0 Stepping=0x00: Patch=0x08a00008 Length=3200 bytes
Microcode patches in kernel/x86/microcode/AuthenticAMD.bin+0x9082:
  Family=0x19 Model=0x01 Stepping=0x00: Patch=0x0a00107a Length=5568 bytes
  Family=0x19 Model=0x11 Stepping=0x02: Patch=0x0a101248 Length=5568 bytes
  Family=0x19 Model=0xa0 Stepping=0x02: Patch=0x0aa00215 Length=5568 bytes
  Family=0x19 Model=0x01 Stepping=0x02: Patch=0x0a001238 Length=5568 bytes
  Family=0x19 Model=0x11 Stepping=0x01: Patch=0x0a101148 Length=5568 bytes
  Family=0x19 Model=0x01 Stepping=0x01: Patch=0x0a0011d5 Length=5568 bytes
  Family=0x19 Model=0xa0 Stepping=0x01: Patch=0x0aa00116 Length=5568 bytes
14:52:33
@hexa:lossy.networkhexathis is the latest ucode for amd on nixpkgs master14:52:39
@hexa:lossy.networkhexaso 0x19 is family 2514:53:59
@hexa:lossy.networkhexaand for the model you probably have to bitwise OR your model with 0xa0 if it is > 1714:54:27
@glepage:matrix.orgGaétan LepageOk, so anyway the issues are not a problem of the python package then15:11:51
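The lookup discussed above (find the family/model row in the `amd_ucode_info.py` listing, then compare its patch with the running `patch_level`) can be sketched as follows. The function names are hypothetical, the sample text is copied from the listing in this chat, and two things are assumptions rather than facts from the log: that the Threadripper in question corresponds to the `Family=0x17 Model=0x31 Stepping=0x00` row, and that patch levels can be compared numerically.

```python
import re

# Matches one row of the amd_ucode_info.py listing pasted in this chat.
PATCH_RE = re.compile(
    r"Family=(0x[0-9a-f]+) Model=(0x[0-9a-f]+) Stepping=(0x[0-9a-f]+): "
    r"Patch=(0x[0-9a-f]+) Length=(\d+) bytes"
)


def parse_patches(text):
    """Map (family, model, stepping) to the patch level found in the listing."""
    table = {}
    for m in PATCH_RE.finditer(text):
        family, model, stepping, patch = (int(m.group(i), 16) for i in range(1, 5))
        table[(family, model, stepping)] = patch
    return table


def is_up_to_date(table, family, model, stepping, running):
    """Return None if the CPU is not listed, else whether running >= listed.

    Assumption: a numerically larger patch level is a newer one.
    """
    available = table.get((family, model, stepping))
    if available is None:
        return None
    return running >= available


# Two rows copied from the listing above.
SAMPLE = """\
  Family=0x17 Model=0x31 Stepping=0x00: Patch=0x0830107c Length=3200 bytes
  Family=0x19 Model=0x01 Stepping=0x00: Patch=0x0a00107a Length=5568 bytes
"""

if __name__ == "__main__":
    table = parse_patches(SAMPLE)
    # Family values are hexadecimal in the listing: 0x19 is family 25.
    print(int("0x19", 16))
    # Assumption: the TR system is Family 0x17, Model 0x31, Stepping 0x00.
    print(is_up_to_date(table, 0x17, 0x31, 0x00, running=0x0830107B))
```

If the family/model assumption holds, this would flag the running `0x0830107b` as one revision behind the `0x0830107c` shipped in the listing; the real family, model and stepping should be read from `/proc/cpuinfo` before drawing that conclusion.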
4 Dec 2024
@justbrowsing:matrix.orgKevin Mittman (UTC-8)Anyone planning to attend PlanetNix https://www.socallinuxexpo.org/scale/22x/events/planet-nix ? Looks like CFP is still open02:21:01
@connorbaker:matrix.orgconnor (he/him)I probably will; I'm also planning to submit two talks04:38:33
6 Dec 2024
@vannagamma:matrix.orgvannagamma joined the room.00:01:17
@connorbaker:matrix.orgconnor (he/him)Does anyone have a NixOS system they recommend using to test eval performance?05:49:07
@connorbaker:matrix.orgconnor (he/him)Ideally something which takes on the order of 30s or so to eval05:51:25
@kaya:catnip.eekaya 𖤐 changed their profile picture.21:17:32
