!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

288 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
56 Servers


Sender  Message  Time
20 Dec 2024
@matthewcroughan:defenestrate.itmatthewcroughan ultimately it complains about this 13:15:07
@ss:someonex.netSomeoneSerge (back on matrix) No VM tests, no 13:19:16
@matthewcroughan:defenestrate.itmatthewcroughan This is the first one I'm trying to execute entirely on the CPU 13:20:32
@matthewcroughan:defenestrate.itmatthewcroughan for comfyui in particular 13:20:39
@matthewcroughan:defenestrate.itmatthewcroughan image.png
13:21:14
@matthewcroughan:defenestrate.itmatthewcroughan I have this cat that I can reproduce on the host cpu in 13 seconds only 13:21:17
@matthewcroughan:defenestrate.itmatthewcroughan comfyui is launched with --cpu but maybe that is incomplete 13:21:31
@matthewcroughan:defenestrate.itmatthewcroughan Maybe it secretly still accesses the GPU and this vm test proves it 13:21:41
@ss:someonex.netSomeoneSerge (back on matrix)

Plausible, I suppose pytorch could ignore our flags and build something with vector extensions on (unless cc-wrapper filters those, I'm not sure), but what part of the logs suggested this conclusion?

Searching for "qemu avx" I see https://superuser.com/a/454814 suggesting -cpu sandyBridge,+avx,enforce

13:23:54
@ss:someonex.netSomeoneSerge (back on matrix) oh i'm acting like an llm 13:24:16
@matthewcroughan:defenestrate.itmatthewcroughan

Yeah I've done all of that, and lscpu inside the vm shows

Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          48 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   8
  On-line CPU(s) list:    0-7
Vendor ID:                AuthenticAMD
  BIOS Vendor ID:         QEMU
  Model name:             AMD Ryzen 9 3900X 12-Core Processor
    BIOS Model name:      pc-i440fx-9.1  CPU @ 2.0GHz
    BIOS CPU family:      1
    CPU family:           23
    Model:                113
    Thread(s) per core:   1
    Core(s) per socket:   8
    Socket(s):            1
    Stepping:             0
    BogoMIPS:             7599.99
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge m
                          ca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall
                           nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cp
                          uid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma
                           cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_t
                          imer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_
                          legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefe
                          tch osvw perfctr_core ssbd ibpb stibp vmmcall fsgsbase
                           tsc_adjust bmi1 avx2 smep bmi2 rdseed adx smap clflus
                          hopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsavee
                          rptr wbnoinvd arat npt lbrv nrip_save tsc_scale vmcb_c
                          lean flushbyasid pausefilter pfthreshold v_vmsave_vmlo
                          ad vgif umip rdpid overflow_recov succor arch_capabili
                          ties
Virtualization features:
13:24:36
@matthewcroughan:defenestrate.itmatthewcroughan so I supposedly have it 13:24:44
@matthewcroughan:defenestrate.itmatthewcroughan I've tried a lot of -cpu options too 13:26:40
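For reference, the kind of `-cpu` experiment discussed here could be written as a NixOS VM test roughly like the sketch below. It assumes `pkgs.testers.runNixOSTest` and the `virtualisation.qemu.options` option; the test name and node name are hypothetical, and `-cpu host` is just one possible value (it only works with KVM), not necessarily what was tried in this conversation.

```nix
# A minimal sketch, not the test used in this chat.
# "cpu-flags-sketch" and "machine" are hypothetical names.
{ pkgs, ... }:
pkgs.testers.runNixOSTest {
  name = "cpu-flags-sketch";

  nodes.machine = { ... }: {
    # Extra arguments handed to QEMU. "-cpu host" asks QEMU to expose the
    # host CPU model to the guest; this requires KVM acceleration.
    virtualisation.qemu.options = [ "-cpu host" ];
  };

  testScript = ''
    machine.wait_for_unit("multi-user.target")
    # Print the guest's CPU flags so they can be compared with the host's.
    print(machine.succeed("grep -m1 '^flags' /proc/cpuinfo"))
  '';
}
```

Printing the flags from inside the test makes it possible to confirm that a given `-cpu` value actually reached the guest before blaming a missing instruction set.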
@matthewcroughan:defenestrate.itmatthewcroughan maybe there's a PYTORCH_VAR I can set? 13:26:59
@ss:someonex.netSomeoneSerge (back on matrix)

avx ... but what part of the logs suggested this conclusion?

13:38:07
@matthewcroughan:defenestrate.itmatthewcroughan Nothing, just other people's reports online 13:38:17
@matthewcroughan:defenestrate.itmatthewcroughan That are now lost to my browser history 13:40:38
@matthewcroughan:defenestrate.itmatthewcroughan This same stuff works in the host though 13:44:01
@matthewcroughan:defenestrate.itmatthewcroughan Host has
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sev sev_es
13:51:08
@matthewcroughan:defenestrate.itmatthewcroughan guest has:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw perfctr_core ssbd ibpb stibp vmmcall fsgsbase tsc_adjust bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr wbnoinvd arat npt lbrv nrip_save tsc_scale vmcb_clean flushbyasid pausefilter pfthreshold v_vmsave_vmload vgif umip rdpid overflow_recov succor arch_capabilities
13:51:14
@matthewcroughan:defenestrate.itmatthewcroughan cqm doesn't seem to exist in qemu, what is it? 13:55:19
@matthewcroughan:defenestrate.itmatthewcroughan SomeoneSerge (utc+3): Ah here it was https://discuss.pytorch.org/t/runtimeerror-could-not-create-a-primitive/117519 14:03:07
@matthewcroughan:defenestrate.itmatthewcroughan Oh wow.. I made it work SomeoneSerge (utc+3) 14:22:02
@matthewcroughan:defenestrate.itmatthewcroughan it was some systemd hardening feature causing it 14:22:10
@matthewcroughan:defenestrate.itmatthewcroughan

It is one of these, but we do not know which one it is

          DevicePolicy = "closed";
          LockPersonality = true;
          MemoryDenyWriteExecute = true;
          NoNewPrivileges = true;
          PrivateDevices = false; # true would hide acceleration devices
          PrivateTmp = true;
          PrivateUsers = true;
          ProcSubset = "all"; # /proc/meminfo
          ProtectClock = true;
          ProtectControlGroups = true;
          ProtectHome = true;
          ProtectHostname = true;
          ProtectKernelLogs = true;
          ProtectKernelModules = true;
          ProtectKernelTunables = true;
          ProtectProc = "invisible";
          ProtectSystem = "strict";
          RemoveIPC = true;
          RestrictNamespaces = true;
          RestrictRealtime = true;
          RestrictSUIDSGID = true;
          RestrictAddressFamilies = [
            "AF_INET"
            "AF_INET6"
            "AF_UNIX"
          ];
          SupplementaryGroups = [ "render" ]; # for rocm to access /dev/dri/renderD* devices
          SystemCallArchitectures = "native";
          SystemCallFilter = [
            "@system-service @resources"
            "~@privileged"
          ];
          UMask = "0077";

14:24:04
@matthewcroughan:defenestrate.itmatthewcroughan now comes the bisecting 14:28:29
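One way to narrow down such a list is sketched below. It is a leave-one-out sweep rather than a true bisection, and not necessarily how the search was done here: it produces one variant of the hardening set per option, each with exactly that option dropped. The `hardening` set is a shortened, illustrative subset of the options above, and dropping an option is assumed to be equivalent to turning it off, which holds for these booleans because systemd defaults them to off.

```nix
# A minimal sketch. Evaluates to an attribute set keyed by option name,
# where each value is the hardening set without that one option.
{ lib }:
let
  # Illustrative subset of the serviceConfig options listed above.
  hardening = {
    LockPersonality = true;
    MemoryDenyWriteExecute = true;
    NoNewPrivileges = true;
    PrivateTmp = true;
    RestrictRealtime = true;
  };

  # For every option name, build a copy of the set with that option removed.
  variants = lib.mapAttrs (name: _value: removeAttrs hardening [ name ]) hardening;
in
variants
```

Each variant, for example `variants.MemoryDenyWriteExecute`, can then be used as the `serviceConfig` of one test run; the run that stops failing points at the option that was left out.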
@matthewcroughan:defenestrate.itmatthewcroughan It appears it was MemoryDenyWriteExecute causing it 14:35:34
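A workaround along these lines would be to switch off only that one option and keep the rest of the hardening. The sketch below assumes the unit is called `comfyui`, which is a hypothetical name; the chat does not establish which component needs writable and executable memory, so the comment reflects a guess rather than a confirmed cause.

```nix
# A minimal sketch of a NixOS module overriding a single hardening option.
# "comfyui" is a hypothetical unit name; substitute the real service name.
{ lib, ... }:
{
  systemd.services.comfyui.serviceConfig = {
    # Permit memory mappings that are both writable and executable.
    # Possibly needed by code that generates machine code at runtime;
    # that cause is an assumption, not something shown in this log.
    MemoryDenyWriteExecute = lib.mkForce false;
  };
}
```

`lib.mkForce` is used so the override wins even if the original module sets the option to `true` at normal priority.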
@matthewcroughan:defenestrate.itmatthewcroughan https://github.com/pytorch/pytorch/issues/143651 14:54:06
@matthewcroughan:defenestrate.itmatthewcroughan Made an issue in pytorch anyway 14:54:10
@ss:someonex.netSomeoneSerge (back on matrix) Nice. Maybe it's some kind of JIT stuff? 14:54:41


Room Version: 9