
NixOS CUDA

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda

1 Aug 2024
@ss:someonex.net SomeoneSerge (utc+3)
gy skimage.transform skimage.util skimage.segmentation
python3-3.11.9-env> building '/nix/store/4rqjcjk4h2mnfwsbvcgf3igjnmpxhxwf-python3-3.11.9-env.drv'
python3-3.11.9-env> created 521 symlinks in user environment
opencv-4.9.0-libstdcxx-test> building '/nix/store/2gh11xabzlxbfgvydhcln0qbfiharw32-opencv-4.9.0-libstdcxx-test.drv'
┏━ Dependency Graph:
┃             ┌─ ✔ opencv-4.9.0 ⏱ 17m40s
┃          ┌─ ✔ python3.11-pillow-heif-0.16.0 ⏱ 2m0s
┃       ┌─ ✔ python3.11-imageio-2.34.2 ⏱ 11s
┃    ┌─ ✔ python3.11-scikit-image-0.22.0 ⏱ 1m37s
┃ ┌─ ✔ python3-3.11.9-env ⏱ 1s
┃ ✔ opencv-4.9.0-libstdcxx-test 
┣━━━ Builds         
┗━ ∑ ⏵ 0 │ ✔ 6 │ ⏸ 0 │ Finished at 17:11:37 after 21m35s
17:12:13
@ss:someonex.net SomeoneSerge (utc+3) So ugh at least opencv4's python extension must be linking the right libstdc++ 17:13:11
@ss:someonex.net SomeoneSerge (utc+3) Hmm the last merged torch update was almost two months ago https://github.com/NixOS/nixpkgs/pull/317576 17:14:45
@ss:someonex.net SomeoneSerge (utc+3) yorickvp would you volunteer to run the bisection? 🫠 17:15:40
@yorickvp:matrix.org yorickvp sure, do you have a known working commit? 17:15:47
@ss:someonex.net SomeoneSerge (utc+3)

Well, I've got a workstation sitting at

Revision:      b2852eb9365c6de48ffb0dc2c9562591f652242a
Last modified: 2024-06-27 16:44:53

Let me check if torch actually works there

17:16:31
@ss:someonex.net SomeoneSerge (utc+3)
❯ nix-shell -p 'python3.withPackages (ps: [ ps.torch ])'
trace: warning: cudaPackages.autoAddDriverRunpath is deprecated, use pkgs.autoAddDriverRunpath instead
trace: warning: cudaPackages.autoFixElfFiles is deprecated, use pkgs.autoFixElfFiles instead
trace: warning: cudaPackages.autoAddOpenGLRunpathHook is deprecated, use pkgs.autoAddDriverRunpathHook instead
this derivation will be built:
  /nix/store/qmmz2hxinp65zsprb3g92my7wqvbwncm-python3-3.11.9-env.drv
building '/nix/store/qmmz2hxinp65zsprb3g92my7wqvbwncm-python3-3.11.9-env.drv'...
created 516 symlinks in user environment
[WARN] - (starship::utils): Executing command "/home/ss/.nix-profile/bin/git" timed out.
[WARN] - (starship::utils): You can set command_timeout in your config to a higher value to allow longer-running commands to keep executing.
ss in 🌐 cs-338 in triton on  openai-triton [$] via ❄️  impure (shell) 
❯ python
Python 3.11.9 (main, Apr  2 2024, 08:25:04) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>>
17:17:20
@yorickvp:matrix.org yorickvp it probably works but still secretly links gcc-12.4.0, which isn't always fatal 17:17:19
@ss:someonex.net SomeoneSerge (utc+3) No, it shouldn't 17:17:33
@ss:someonex.net SomeoneSerge (utc+3) It definitely was the case that gcc-12 reference was not retained in the output path 17:17:56
@yorickvp:matrix.org yorickvp patchelf --print-rpath $(nix-build -A python3.pkgs.torch.lib)/lib/libtorch_cuda.so ? 17:18:22
@ss:someonex.net SomeoneSerge (utc+3) Yup, it's 12.3 in this commit and it happens to work, but yet again, this is a regression 17:19:46
@ss:someonex.net SomeoneSerge (utc+3) The reference used not to be retained in the outputs 17:20:02
@ss:someonex.net SomeoneSerge (utc+3) Trying 24.05 17:21:01
@ss:someonex.net SomeoneSerge (utc+3) I think it's time to add an exportReferencesGraph test to e.g. torch, or better yet to a few core packages 17:22:47
@ss:someonex.net SomeoneSerge (utc+3) As a very unambiguous way to ensure that this stuff isn't referenced 17:23:10
@ss:someonex.net SomeoneSerge (utc+3) Oh wait. Actually, now it is going to be in the closure if we include triton 17:23:31
@ss:someonex.net SomeoneSerge (utc+3) I think we keep a reference to the toolchain in triton 17:24:03
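
[Editor's sketch] A rough illustration of the exportReferencesGraph check suggested above, not a tested nixpkgs test. It assumes evaluation from the root of a nixpkgs checkout, and it assumes the forbidden path is the `cuda_nvcc.stdenv.cc.cc.lib` attribute used in the bisection expression in this log, which may not be the right one. The name `torch-closure-check` is hypothetical. As noted in the discussion, triton may keep the toolchain in the closure, so the check would probably only pass for a torch built without it.

```nix
# Sketch only: fail the build if the NVCC host compiler's runtime
# libraries appear anywhere in the runtime closure of torch.
let
  pkgs = import ./. {
    config = {
      allowUnfree = true;
      cudaSupport = true;
    };
  };
  torch = pkgs.python3.pkgs.torch;
  # Assumption: same attribute path as in the bisection expression.
  forbidden = torch.cudaPackages.cuda_nvcc.stdenv.cc.cc.lib;
in
pkgs.runCommand "torch-closure-check"
  {
    # Writes the closure of `torch` to a file named `closure`
    # in the build directory (legacy, non-structured-attrs form).
    exportReferencesGraph = [ "closure" torch ];
    # Drop the string context so the forbidden path is only a string
    # to search for, not an input of this check.
    forbiddenPath = builtins.unsafeDiscardStringContext "${forbidden}";
  }
  ''
    # The file lists store paths, derivers and reference counts, one
    # per line, so a whole-line fixed-string match is enough here.
    if grep -qxF "$forbiddenPath" closure; then
      echo "torch closure references $forbiddenPath" >&2
      exit 1
    fi
    touch $out
  ''
```

Unlike reading the RPATH of a single library, this looks at the whole closure, so it would also catch a reference that arrives through a dependency.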
@ss:someonex.net SomeoneSerge (utc+3) As a rough estimate, I think 23.11 is a good commit xD 17:29:22
@ss:someonex.net SomeoneSerge (utc+3) Sorry got to leave now for a while 17:29:35
@yorickvp:matrix.org yorickvp

bisecting the following:

let
  pkgs = import ./. {
    config = {
      allowUnfree = true;
      cudaCapabilities = [ "8.6" ];
      cudaSupport = true;
    };
  };
in
{
  torchtest = (pkgs.python3.pkgs.torch.override { openai-triton = null; }).overridePythonAttrs (o: {
    disallowedReferences = [ pkgs.python3.pkgs.torch.cudaPackages.cuda_nvcc.stdenv.cc.cc.lib ];
    USE_CUDNN = 0;
    USE_KINETO = 0;
    USE_QNNPACK = 0;
    USE_PYTORCH_QNNPACK = 0;
    USE_XNNPACK = 0;
    INTERN_DISABLE_ONNX = 1;
    ONNX_ML = 0;
    USE_ITT = 0;
    USE_FLASH_ATTENTION = 0;
    USE_MEM_EFF_ATTENTION = 0;
    USE_FBGEMM = 0;
    USE_MKLDNN = 0;
  });
}
17:37:11
@yorickvp:matrix.org yorickvp
disallowedReferences seems not to work, though
17:52:16
@yorickvp:matrix.org yorickvp it was already broken on dc7b3febf8d862328d8704de5c8437d2df442c76 18:02:02
@yorickvp:matrix.org yorickvp (23.11 branchoff) 18:02:06
@ss:someonex.net SomeoneSerge (utc+3) Ehh cuda_nvcc.stdenv.cc looks wrong 18:35:31
@ss:someonex.net SomeoneSerge (utc+3) Or maybe you're right it's the unwrapped one 18:36:19
@ss:someonex.net SomeoneSerge (utc+3) But the reference could be through triton 18:36:48
@ss:someonex.net SomeoneSerge (utc+3) I think you have to print-rpath 18:37:09
