
NixOS CUDA

288 Members | 58 Servers

CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda



20 Feb 2023
FRidh (@FRidh:matrix.org) [18:47:27]:
In reply to @connorbaker:matrix.org:

    How do you all handle PRs which have dependencies on other PRs?

    For context, this is related to a PR I made here (https://github.com/NixOS/nixpkgs/pull/215229) which has spiraled into a grab-bag of changes.

    As an example, if you have a PR A and a PR B which relies on the changes A has, how do you make that clear?

      • Should B use the branch A creates instead of Nixpkgs' master?
      • Should there be notes or warnings in the descriptions of the PRs about the merge order? If so, is there a particular style you'd recommend?
      • How do you test multiple PRs which depend on each other with nixpkgs-review? I had tried specifying multiple PRs as arguments, but it didn't seem like it stacked them.

    The closest thing I think I've seen before is PRs against torch (example: https://github.com/pytorch/pytorch/pull/94243). They use this tool: https://github.com/ezyang/ghstack.

That's always tough. I think your approach here, all in one PR but with clear commits, is fine. The thing is, there just aren't that many CUDA contributors in Nixpkgs, and it's quite a big PR, so it takes some effort to review.
connor (he/him) (@connorbaker:matrix.org): https://github.com/NixOS/nixpkgs/pull/217322 is closer to what I envision doing once I've further split apart that large PR. Does that seem okay? Is there an automated tool I should be using to do something similar to this? [19:03:27]
connor (he/him) (@connorbaker:matrix.org): Thank you all for the work you do maintaining the CUDA-accelerated packages. Building jaxlib and tensorflowWithCuda repeatedly is awful. [21:45:11]
21 Feb 2023
hexa (@hexa:lossy.network): https://github.com/NixOS/nixpkgs/pull/217497 [16:55:31]
connor (he/him) (@connorbaker:matrix.org): Has anyone used or set up ccache for any of the CUDA derivations? I know they take a while to build, and I'm curious what's been done to try to reduce build times. [21:07:03]
SomeoneSerge (back on matrix) (@ss:someonex.net): Hi! Thank you for investing your time and work in this right now! [21:20:45]
SomeoneSerge (back on matrix) (@ss:someonex.net): Interesting. I just looked up the ccache NixOS wiki page; it suggests one can just drop in something called ccacheStdenv. [21:21:46]
SomeoneSerge (back on matrix) (@ss:someonex.net): Do you know if packages built that way would work as substitutes for normal stdenv ones? [21:22:59]
connor (he/him) (@connorbaker:matrix.org): Unfortunately I think it would be a different derivation :l [21:35:02]
SomeoneSerge (back on matrix) (@ss:someonex.net): so, no magic( [21:57:53]
connor (he/him) (@connorbaker:matrix.org): From a conversation I had, it seems like it's intended more for use as a dev shell than for the end derivation. [23:00:04]
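
(A minimal sketch of the wiki's ccacheStdenv drop-in, for reference. "opencv" is just an example of a callPackage'd package that accepts stdenv; the cache-directory and sandbox setup the wiki describes is omitted. As noted above, swapping stdenv changes the input hash, so the result is a different derivation and won't substitute for the normal-stdenv build.)

# Sketch only: rebuild one package with ccache wrapped around its compiler.
let pkgs = import <nixpkgs> { };
in pkgs.opencv.override { stdenv = pkgs.ccacheStdenv; }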
22 Feb 2023
hexa (@hexa:lossy.network): so, I have an application that wants tflite_runtime [00:17:18]
hexa (@hexa:lossy.network): can I substitute that with calls to actual tensorflow? [00:17:31]
hexa (@hexa:lossy.network): not keen on packaging tflite [00:18:48]
SomeoneSerge (back on matrix) (@ss:someonex.net): I'm not familiar with tflite, but the good news is it seems that they build it with cmake, not bazel. [08:56:05]
SomeoneSerge (back on matrix) (@ss:someonex.net): https://www.tensorflow.org/lite/guide/build_cmake [08:57:43]
hexa (@hexa:lossy.network): tensorflow.life is a thing [09:38:47]
hexa (@hexa:lossy.network): So no building required, I think. [09:39:09]
hexa (@hexa:lossy.network): * tensorflow.lite is a thing [13:24:36]
hexa (@hexa:lossy.network) [13:24:53]:

-import tflite_runtime.interpreter as tflite
+try:
+    from tflite_runtime.interpreter import Interpreter
+except ModuleNotFoundError:
+    from tensorflow.lite.python.interpreter import Interpreter

hexa (@hexa:lossy.network): this is mostly what I did [13:24:57]
connor (he/him) (@connorbaker:matrix.org) [15:05:30]:
Is there a recommended way to get in touch with NVIDIA about their docs?
For example, https://docs.nvidia.com/cuda/archive/11.0.3/ gives me an access denied, and some tables in their older docs are missing supported compute capabilities (https://docs.nvidia.com/cuda/archive/11.2.1/cuda-compiler-driver-nvcc/index.html#gpu-feature-list vs https://docs.nvidia.com/cuda/archive/11.3.1/cuda-compiler-driver-nvcc/index.html#gpu-feature-list: sm_37 reappears, but sm_52 is missing in both).
connor (he/him) (@connorbaker:matrix.org): Ah, the link for their 11.0.x docs on https://developer.nvidia.com/cuda-toolkit-archive is wrong -- it follows the 10.2 format, so it should be something like https://docs.nvidia.com/cuda/archive/11.0/cuda-compiler-driver-nvcc/index.html#gpu-feature-list [15:09:01]
23 Feb 2023
connor (he/him) (@connorbaker:matrix.org): If anyone has any knowledge to contribute, I'd appreciate it: https://github.com/NixOS/nixpkgs/issues/217780 [01:14:30]
Kevin Mittman (EOY sleep) (@justbrowsing:matrix.org): RE: getting in touch, I'd recommend starting a new thread in https://forums.developer.nvidia.com/c/8 [03:09:29]
connor (he/him) (@connorbaker:matrix.org): NVCC supports only a certain range of host compilers. I know that we currently export CC/CXX/CUDAHOSTCXX as appropriate to handle that... but that only changes things in the current derivation. Since the default language standard (like c++11 -> c++14) can change between compiler releases, it's possible that we build a derivation with an NVCC-supported version of GCC or Clang, but the libraries that derivation links against were built with a different compiler version and a different language standard. That can manifest as missing or broken symbols during linking, right? [21:51:31]
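
(To make that failure mode concrete, here is a hypothetical overrideAttrs fragment in the style described above; somePkg is a stand-in name, and gcc 11 is only an example of an NVCC-supported host compiler.)

# Hypothetical sketch: pin the compilers nvcc and the build use. The hazard
# described above: this derivation now compiles its C++ with gcc 11, while
# its buildInputs may have been built with a newer default gcc, so libstdc++
# symbol versions (GLIBCXX_*) can disagree at link or load time.
somePkg.overrideAttrs (old: {
  preConfigure = (old.preConfigure or "") + ''
    export CC=${pkgs.gcc11}/bin/gcc
    export CXX=${pkgs.gcc11}/bin/g++
    export CUDAHOSTCXX=${pkgs.gcc11}/bin/g++
  '';
})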
24 Feb 2023
connor (he/him) (@connorbaker:matrix.org) [01:02:04]:

Example of me trying to run something I just packaged (https://github.com/connorbaker/bsrt) and maybe getting bitten by (what I think is) exactly this:

[connorbaker@fedora bsrt_temp]$ python3 -m bsrt
Traceback (most recent call last):
  File "/nix/store/0pyymzxf7n0fzpaqnvwv92ab72v3jq8d-python3-3.10.9/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/nix/store/0pyymzxf7n0fzpaqnvwv92ab72v3jq8d-python3-3.10.9/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/connorbaker/Documents/bsrt_temp/bsrt/__main__.py", line 6, in <module>
    from mfsr_utils.pipelines.synthetic_burst_generator import (
  File "/nix/store/ay2msah0yd16xjwyldqd0n6incf9gd7l-python3.10-mfsr_utils-1.7/lib/python3.10/site-packages/mfsr_utils/pipelines/synthetic_burst_generator.py", line 6, in <module>
    import cv2  # type: ignore[import]
ImportError: /nix/store/ps7an26cirhh0xy1wrlc2icvfhrd39cj-gcc-11.3.0-lib/lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /nix/store/s9jsa3p9csvnpvfhix19b3rfyg08m275-opencv-4.7.0/lib/libopencv_gapi.so.407)

OpenCV specifies the CUDA host compiler, but does not set the C or C++ compilers. I'm trying a build with a patched derivation for opencv and hoping that resolves the problem. (Also, OpenCV apparently doesn't build for specific GPU architectures or take advantage of cuDNN!)
connor (he/him) (@connorbaker:matrix.org): It did! Now I'm seeing a different error of RuntimeError: CUDA driver error: PTX JIT compiler library not found, but that's progress :) [01:32:45]
connor (he/him) (@connorbaker:matrix.org): * It did! Now I'm seeing a different error of RuntimeError: CUDA driver error: PTX JIT compiler library not found, but that's because I'm not using nixGL yet on a non-NixOS machine [01:41:42]
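
(For reference, the kind of opencv patch being described might look roughly like the sketch below. CUDA_ARCH_BIN, WITH_CUDNN, and OPENCV_DNN_CUDA are real OpenCV CMake options, but the values and the shape of the override are illustrative, not the actual change that was tested.)

# Illustrative only: build OpenCV's device code for a concrete GPU
# architecture and enable the cuDNN-backed DNN module.
pkgs.opencv.overrideAttrs (old: {
  cmakeFlags = (old.cmakeFlags or [ ]) ++ [
    "-DCUDA_ARCH_BIN=8.6"   # example compute capability; pick the target GPU's
    "-DWITH_CUDNN=ON"
    "-DOPENCV_DNN_CUDA=ON"
  ];
})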
mcwitt (@mcwitt:matrix.org) [19:06:53]:

Is there an issue with cudaPackages since the gcc version bump to 12? I'd expect the following to work:

let pkgs = import ./. { };
in pkgs.runCommandCC "test" {
  buildInputs = with pkgs.cudaPackages; [ cuda_nvcc cuda_cudart ];
} ''
  nvcc ${pkgs.writeText "test.cu" "int main() { return 0; }"} -o $out
''

but on master I see:

error -- unsupported GNU version! gcc versions later than 11 are not supported!

(I might be missing something, because I'm not immediately finding the workarounds that were necessary for other CUDA packages since gcc was bumped.)
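
(One possible workaround sketch, assuming the failure is only nvcc's host-compiler version check: point nvcc at a gcc it still supports via its -ccbin flag. gcc 11 here is an example, and whether this is the right nixpkgs-side fix is a separate question.)

let pkgs = import ./. { };
in pkgs.runCommandCC "test" {
  buildInputs = with pkgs.cudaPackages; [ cuda_nvcc cuda_cudart ];
} ''
  # -ccbin selects the host compiler nvcc drives; gcc 11 is still within
  # the range this nvcc release accepts.
  nvcc -ccbin ${pkgs.gcc11}/bin/g++ \
    ${pkgs.writeText "test.cu" "int main() { return 0; }"} -o $out
''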
