
NixOS CUDA

290 Members | 57 Servers
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda


Sender Message Time
15 Dec 2024
@matthewcroughan:defenestrate.it matthewcroughan Is it fair to say that ROCm is not supported very well right now in nixpkgs? 21:01:43
@ss:someonex.net SomeoneSerge (back on matrix) yes, rather 21:02:38
@matthewcroughan:defenestrate.it matthewcroughan We seem unable to compile torch, so okay, I override python to use torch-bin, but then I still have to set allowBroken 21:02:48
@matthewcroughan:defenestrate.it matthewcroughan
       error: 'miopengemm' has been deprecated.
       It is still available for some time as part of rocmPackages_5.

21:02:52
@matthewcroughan:defenestrate.it matthewcroughan and then if I do, this happens 21:02:56
@sielicki:matrix.org sielicki

SomeoneSerge (utc+3): connor (he/him) (UTC-7): I spent some time revamping my personal flake today/yesterday and now have a better understanding of the new hostPlatform/buildPlatform stuff, alongside lib.platforms

I do think it's the right long-term interface for cudaSupport, gencodes, and configuring fabrics.

The entire extent of my contributions to nixpkgs has been small contributions to specific packages. I want to write up a small doc proposing this and discussing the migration path, but I also wanted to collaborate with you guys. I don't even know where to post this: do I just open an issue on GitHub, or does it need to go on Discourse?

21:18:06
16 Dec 2024
@connorbaker:matrix.org connor (he/him) I'd recommend testing the waters and getting a sense of the prior art in terms of extending those; perhaps the folks in the Nixpkgs Stdenv room would be good to reach out to? https://matrix.to/#/#stdenv:nixos.org After that (once you've gotten a list of people interested in or knowledgeable about such work), I think drafting an RFC would be the next step, to fully lay out the plan for design and implementation. If it's a relatively small change, maybe an RFC is unnecessary and a PR would be fine! 06:55:30
@ss:someonex.net SomeoneSerge (back on matrix)

That's the reason I've lately been ignoring flakes in personal projects: just stick to impure eval and autocalling. For me it's usually going in the direction of

{ nixpkgs ? <nixpkgs>
, pkgs ? import nixpkgs { inherit config; }
, lib ? pkgs.lib
, cudaSupport ? false
, config ? { inherit cudaSupport; }
}:

lib.makeScope pkgs.newScope (self: {
 # ...
})
14:45:04
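A minimal sketch of how an auto-callable file like the one above might be consumed from another Nix expression. It assumes the expression is saved as ./default.nix (a hypothetical path) and that the scope pulls in unfree CUDA packages, which is why allowUnfree is set. The attribute names cpu and cuda are illustrative only.

```nix
# Hypothetical consumer of the expression above, assumed to live at ./default.nix.
# This relies on impure evaluation: the default nixpkgs is looked up via <nixpkgs>.
let
  # All defaults: cudaSupport = false, hence config = { cudaSupport = false; }.
  cpu = import ./default.nix { };

  # An explicit config replaces the default `{ inherit cudaSupport; }`,
  # so cudaSupport is repeated inside config to keep the two in agreement.
  cuda = import ./default.nix {
    cudaSupport = true;
    config = {
      cudaSupport = true;
      allowUnfree = true; # assumption: the scope uses unfree CUDA packages
    };
  };
in
{ inherit cpu cuda; }
```

Because every argument has a default, the same file can also be built directly from the command line without any wrapper expression.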
@matthewcroughan:defenestrate.it matthewcroughan I actually ended up with a good pattern using flake.parts 14:45:28
@matthewcroughan:defenestrate.it matthewcroughan
      perSystem = { system, ... }: {
        _module.args.rocmPkgs = import inputs.nixpkgs {
          overlays = [
            inputs.self.overlays.default
            (self: super: {
              python3 = super.python3.override {
                packageOverrides = self: super: { torch = super.torch-bin; };
              };
            })
          ];
          config.allowUnfree = true;
          config.rocmSupport = true;
          inherit system;
        };
        _module.args.nvidiaPkgs = import inputs.nixpkgs {
          overlays = [
            inputs.self.overlays.default
          ];
          config.allowUnfree = true;
          config.cudaSupport = true;
          inherit system;
        };
        _module.args.pkgs = import inputs.nixpkgs {
          overlays = [
            inputs.self.overlays.default
          ];
          config.allowUnfree = true;
          inherit system;
        };
      };

14:46:22
@matthewcroughan:defenestrate.it matthewcroughan Could be deduplicated using a function, but this is what it looks like unfolded 14:46:38
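One way the deduplication mentioned here might look, as a sketch rather than the code actually used in the flake. The helper name mkPkgs and its two parameters are hypothetical; the fragment assumes it sits in the same flake-parts module as the unfolded version, with inputs in scope.

```nix
perSystem = { system, ... }:
  let
    # Hypothetical helper: import nixpkgs with the shared settings
    # (system, default overlay, allowUnfree) plus per-variant extras.
    mkPkgs = { config ? { }, overlays ? [ ] }:
      import inputs.nixpkgs {
        inherit system;
        overlays = [ inputs.self.overlays.default ] ++ overlays;
        config = { allowUnfree = true; } // config;
      };
  in
  {
    _module.args.pkgs = mkPkgs { };
    _module.args.nvidiaPkgs = mkPkgs { config.cudaSupport = true; };
    _module.args.rocmPkgs = mkPkgs {
      config.rocmSupport = true;
      overlays = [
        # Same workaround as in the unfolded version: use the prebuilt torch.
        (self: super: {
          python3 = super.python3.override {
            packageOverrides = pySelf: pySuper: { torch = pySuper.torch-bin; };
          };
        })
      ];
    };
  };
```

The extra overlays are appended after the flake's default overlay, which preserves the ordering of the unfolded version.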
@matthewcroughan:defenestrate.it matthewcroughan Then other flake-modules have these rocmPkgs and nvidiaPkgs arguments passed to them 14:47:01
@matthewcroughan:defenestrate.it matthewcroughan
  perSystem = { config, pkgs, nvidiaPkgs, rocmPkgs, system, ... }: {
    packages = {
      comfyui-nvidia = nvidiaPkgs.comfyuiPackages.comfyui;
      comfyui-amd = rocmPkgs.comfyuiPackages.comfyui;
    };
  };

14:47:04
@ss:someonex.net SomeoneSerge (back on matrix) I think this should be doable without major backwards-incompatible changes? 14:47:07
@matthewcroughan:defenestrate.it matthewcroughan the issue is that rocmSupport is fully broken in Nixpkgs, so this doesn't work, but it should in future 14:47:29
@ss:someonex.net SomeoneSerge (back on matrix)

That's more or less what the llama-cpp flake did, but didn't you say

"I don't want to have attributes like nvidia-myapp rocm-myapp and myapp"

14:48:09
@matthewcroughan:defenestrate.it matthewcroughan I'm happy as long as I don't have to do weird things to achieve it 14:48:58
@matthewcroughan:defenestrate.it matthewcroughan and for me, this is not weird 14:49:02
@matthewcroughan:defenestrate.it matthewcroughan previously what my flake was doing was far weirder 14:49:07
@matthewcroughan:defenestrate.it matthewcroughan https://github.com/nixified-ai/flake/blob/master/projects/invokeai/default.nix#L66-L96 14:49:32
@matthewcroughan:defenestrate.it matthewcroughan previously it was defining functions that were able to create variants of packages without setting rocmSupport or cudaSupport 14:49:51
@matthewcroughan:defenestrate.it matthewcroughan Just terrible 14:50:00
@matthewcroughan:defenestrate.it matthewcroughan Besides, the modules the flake will export won't interact with the comfyui-nvidia or comfyui-amd attrs; this is just for people who want to try it with nix run 14:50:42
@matthewcroughan:defenestrate.it matthewcroughan In a system using the nixosModules, the overlay will be applied, which strictly ignores the packages attr of the flake 14:51:05
@matthewcroughan:defenestrate.it matthewcroughan the packages attr of the flake is really just there for people wanting to use things in a non-NixOS context 14:53:53
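A rough sketch of the split described in these messages, under stated assumptions: the module name comfyui is hypothetical, the fragment is assumed to live in a flake-parts module with inputs in scope, and comfyuiPackages is assumed to be provided by the flake's default overlay, as in the earlier snippets.

```nix
# Hypothetical NixOS module exported by the flake.
flake.nixosModules.comfyui = { pkgs, ... }: {
  # Apply the flake's overlay to the host's own nixpkgs instance. Packages
  # then come from the host's pkgs, not from the flake's `packages` output.
  nixpkgs.overlays = [ inputs.self.overlays.default ];

  # The accelerator choice is left to the host configuration, e.g.
  #   nixpkgs.config.cudaSupport = true;
  # so no comfyui-nvidia / comfyui-amd attribute is needed here.
  environment.systemPackages = [ pkgs.comfyuiPackages.comfyui ];
};
```

With this arrangement the comfyui-nvidia and comfyui-amd attributes only serve nix run users, while NixOS hosts get a single package whose variant follows their own nixpkgs config.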
@ss:someonex.net SomeoneSerge (back on matrix) They're just completely separate.
I guess there are mappings between subsets of the frameworks, as evidenced by ZLUDA, hipify, and https://docs.scale-lang.com.
I suppose one could say that ZLUDA is a sort of runtime proxy, although the multi-versioning bit is still missing.
14:55:59
