1 Aug 2024 |
SomeoneSerge (utc+3) | saxpy and opencv are built using cmake too | 17:08:34 |
SomeoneSerge (utc+3) | At least one of them has been shown to still work (whatever the cost) | 17:08:56 |
SomeoneSerge (utc+3) | gy skimage.transform skimage.util skimage.segmentation
python3-3.11.9-env> building '/nix/store/4rqjcjk4h2mnfwsbvcgf3igjnmpxhxwf-python3-3.11.9-env.drv'
python3-3.11.9-env> created 521 symlinks in user environment
opencv-4.9.0-libstdcxx-test> building '/nix/store/2gh11xabzlxbfgvydhcln0qbfiharw32-opencv-4.9.0-libstdcxx-test.drv'
┏━ Dependency Graph:
┃ ┌─ ✔ opencv-4.9.0 ⏱ 17m40s
┃ ┌─ ✔ python3.11-pillow-heif-0.16.0 ⏱ 2m0s
┃ ┌─ ✔ python3.11-imageio-2.34.2 ⏱ 11s
┃ ┌─ ✔ python3.11-scikit-image-0.22.0 ⏱ 1m37s
┃ ┌─ ✔ python3-3.11.9-env ⏱ 1s
┃ ✔ opencv-4.9.0-libstdcxx-test
┣━━━ Builds
┗━ ∑ ⏵ 0 │ ✔ 6 │ ⏸ 0 │ Finished at 17:11:37 after 21m35s
| 17:12:13 |
SomeoneSerge (utc+3) | So ugh at least opencv4's python extension must be linking the right libstdc++ | 17:13:11 |
SomeoneSerge (utc+3) | Hmm the last merged torch update was almost two months ago https://github.com/NixOS/nixpkgs/pull/317576 | 17:14:45 |
SomeoneSerge (utc+3) | yorickvp would you volunteer to run the bisection? 🫠 | 17:15:40 |
yorickvp | sure, do you have a known working commit? | 17:15:47 |
SomeoneSerge (utc+3) | Well, I got a workstation sat
Revision: b2852eb9365c6de48ffb0dc2c9562591f652242a
Last modified: 2024-06-27 16:44:53
Let me check if torch actually works there | 17:16:31 |
SomeoneSerge (utc+3) | ❯ nix-shell -p 'python3.withPackages (ps: [ ps.torch ])'
trace: warning: cudaPackages.autoAddDriverRunpath is deprecated, use pkgs.autoAddDriverRunpath instead
trace: warning: cudaPackages.autoFixElfFiles is deprecated, use pkgs.autoFixElfFiles instead
trace: warning: cudaPackages.autoAddOpenGLRunpathHook is deprecated, use pkgs.autoAddDriverRunpathHook instead
this derivation will be built:
/nix/store/qmmz2hxinp65zsprb3g92my7wqvbwncm-python3-3.11.9-env.drv
building '/nix/store/qmmz2hxinp65zsprb3g92my7wqvbwncm-python3-3.11.9-env.drv'...
created 516 symlinks in user environment
[WARN] - (starship::utils): Executing command "/home/ss/.nix-profile/bin/git" timed out.
[WARN] - (starship::utils): You can set command_timeout in your config to a higher value to allow longer-running commands to keep executing.
ss in 🌐 cs-338 in triton on openai-triton [$] via ❄️ impure (shell)
❯ python
Python 3.11.9 (main, Apr 2 2024, 08:25:04) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
>>>
| 17:17:20 |
yorickvp | it probably works but still secretly links gcc-12.4.0, which isn't always fatal | 17:17:19 |
SomeoneSerge (utc+3) | No, it shouldn't | 17:17:33 |
SomeoneSerge (utc+3) | It definitely was the case that gcc-12 reference was not retained in the output path | 17:17:56 |
yorickvp | patchelf --print-rpath $(nix-build python3.pkgs.torch.lib)/lib/libtorch_cuda.so ? | 17:18:22 |
SomeoneSerge (utc+3) | Yup, it's 12.3 in this commit and it happens to work, but yet again, this is a regression | 17:19:46 |
SomeoneSerge (utc+3) | The reference used not to be retained in the outputs | 17:20:02 |
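The RPATH check suggested above can be automated. The following is a minimal sketch, assuming the output of `patchelf --print-rpath` is a colon-separated list of store paths and that the gcc runtime libraries live in paths named like `<hash>-gcc-<version>-lib`. The regular expression, the function name `gcc_versions_in_rpath`, and the example paths are illustrative assumptions, not anything taken from nixpkgs.

```python
import re

# Matches store paths such as /nix/store/<hash>-gcc-12.3.0-lib/lib.
# The naming scheme is an assumption made for this sketch.
GCC_LIB_RE = re.compile(r"/nix/store/[a-z0-9]+-gcc-(\d+(?:\.\d+)*)-lib(?:/|$)")


def gcc_versions_in_rpath(rpath: str) -> list[str]:
    """Return the distinct gcc versions referenced by an RPATH, in order."""
    versions = []
    for entry in rpath.strip().split(":"):
        match = GCC_LIB_RE.match(entry)
        if match and match.group(1) not in versions:
            versions.append(match.group(1))
    return versions


# Hypothetical RPATH; the hashes are placeholders, not real store paths.
example = (
    "/nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-gcc-12.3.0-lib/lib:"
    "/nix/store/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb-gcc-13.3.0-lib/lib:"
    "/nix/store/cccccccccccccccccccccccccccccccc-glibc-2.39-52/lib"
)
print(gcc_versions_in_rpath(example))  # prints ['12.3.0', '13.3.0']
```

More than one version in the result would point at the kind of mixed libstdc++ linkage being discussed here.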
SomeoneSerge (utc+3) | Trying 24.05 | 17:21:01 |
SomeoneSerge (utc+3) | I think it's time to add an exportReferencesGraph test to e.g. torch, or better yet to a few core packages | 17:22:47 |
SomeoneSerge (utc+3) | As a very unambiguous way to ensure that this stuff isn't referenced | 17:23:10 |
SomeoneSerge (utc+3) | Oh wait. Actually, now it is going to be in the closure if we include triton | 17:23:31 |
SomeoneSerge (utc+3) | I think we keep a reference to the toolchain in triton | 17:24:03 |
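A rough sketch of what such a closure check could look like when run from outside the build. This is not the `exportReferencesGraph` mechanism itself, only a stand-in that lists the runtime closure with `nix-store --query --requisites` and searches it for an unwanted path. The function names and the example pattern are assumptions for illustration. As noted above, a torch built with triton may legitimately keep the toolchain in its closure, so a check like this would only make sense for a build without it.

```python
import re
import subprocess


def forbidden_paths(closure: list[str], pattern: str) -> list[str]:
    """Return every store path in the closure that matches the pattern."""
    regex = re.compile(pattern)
    return [path for path in closure if regex.search(path)]


def closure_of(store_path: str) -> list[str]:
    """List the runtime closure of a store path (requires Nix to be installed)."""
    result = subprocess.run(
        ["nix-store", "--query", "--requisites", store_path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.split()


# Hypothetical closure; the hashes are placeholders, not real store paths.
sample_closure = [
    "/nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-gcc-12.3.0-lib",
    "/nix/store/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb-gcc-13.3.0-lib",
    "/nix/store/cccccccccccccccccccccccccccccccc-glibc-2.39-52",
]
print(forbidden_paths(sample_closure, r"-gcc-12\.[0-9.]+-lib$"))
```

In real use, `closure_of` would be called on the built torch output and the result passed to `forbidden_paths`; a non-empty list means the unwanted reference was retained.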
SomeoneSerge (utc+3) | As a rough estimate, I think 23.11 is a good commit xD | 17:29:22 |
SomeoneSerge (utc+3) | Sorry got to leave now for a while | 17:29:35 |
yorickvp | bisecting the following:
let
  pkgs = import ./. {
    config = {
      allowUnfree = true;
      cudaCapabilities = [ "8.6" ];
      cudaSupport = true;
    };
  };
in
{
  torchtest = (pkgs.python3.pkgs.torch.override { openai-triton = null; }).overridePythonAttrs (o: {
    disallowedReferences = [ pkgs.python3.pkgs.torch.cudaPackages.cuda_nvcc.stdenv.cc.cc.lib ];
    USE_CUDNN = 0;
    USE_KINETO = 0;
    USE_QNNPACK = 0;
    USE_PYTORCH_QNNPACK = 0;
    USE_XNNPACK = 0;
    INTERN_DISABLE_ONNX = 1;
    ONNX_ML = 0;
    USE_ITT = 0;
    USE_FLASH_ATTENTION = 0;
    USE_MEM_EFF_ATTENTION = 0;
    USE_FBGEMM = 0;
    USE_MKLDNN = 0;
  });
}
| 17:37:11 |
yorickvp | In reply to @yorickvp:matrix.org
disallowedReferences seems not to work, though | 17:52:16 |
yorickvp | it was already broken on dc7b3febf8d862328d8704de5c8437d2df442c76 | 18:02:02 |
yorickvp | (23.11 branchoff) | 18:02:06 |
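The result above shows why a bisection needs a verified good starting point: if the oldest commit is already bad, there is nothing to search between. A minimal sketch of that logic, assuming a non-empty list of commits ordered oldest to newest and a hypothetical `is_bad` predicate (for example, "the build retains the gcc-12 reference") that stays true once it becomes true:

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit, checking both endpoints before searching."""
    if not is_bad(commits[-1]):
        raise ValueError("newest commit is not bad; nothing to bisect")
    if is_bad(commits[0]):
        raise ValueError("oldest commit is already bad; pick an older starting point")
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history in which the regression lands at c6.
history = [f"c{i}" for i in range(10)]
print(first_bad(history, lambda c: int(c[1:]) >= 6))  # prints c6
```

`git bisect` performs the same search over real history; the endpoint checks here only make explicit the assumption that failed for the 23.11 branch-off.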
SomeoneSerge (utc+3) | Ehh cuda_nvcc.stdenv.cc looks wrong | 18:35:31 |
SomeoneSerge (utc+3) | Or maybe you're right it's the unwrapped one | 18:36:19 |