| 25 Mar 2024 |
SomeoneSerge (matrix works sometimes) | I don't think there's a ticket actually, Samuel Ainsworth | 20:41:56 |
SomeoneSerge (matrix works sometimes) | I opened one https://github.com/NixOS/ofborg/issues/678 | 20:54:45 |
| 26 Mar 2024 |
jayrahdevore | Hello! I'm new here, so I'd like to start by thanking you all for the excellent work that is the NixOS CUDA project. I wanted to reach out since I'm a bit stuck on tensorrt_8_6 in particular--it keeps coming back as marked as broken. If I understand correctly, I need the correct package set and options for it to evaluate (I'm just not sure what these are).
{
  inputs = {
    nixpkgs = {
      url = "github:nixos/nixpkgs/nixos-unstable";
    };
  };
  outputs = { nixpkgs, ... }: let
    system = "x86_64-linux";
    pkgs = import nixpkgs {
      config = {
        allowUnfree = true;
        cudaSupport = true;
        cudnnSupport = true;
        cudaEnableForwardCompat = true;
        cudaCapabilities = [ "8.6" ];
      };
      inherit system;
      overlays = [
        (self: super: {
          cudaPackages = super.cudaPackages.overrideScope (final: prev: {
            cudnn = prev.cudnn_8_9;
            tensorrt = prev.tensorrt_8_6;
            cudatoolkit = prev.cudatoolkit.override {cudaVersion = "12.1";};
          });
        })
      ];
    };
  in {
    devShells."${system}".default = pkgs.mkShell {
      packages = with pkgs; [
        cudaPackages.tensorrt
      ];
    };
  };
}
| 01:44:51 |
aidalgol | In reply to @jayrahdevore:matrix.org: Uh oh... I put TensorRT in nixpkgs initially, but there's been some refactoring around the CUDA support in nixpkgs since then, so SomeoneSerge (migrating synapse), we may need your assistance here as well. But first, can you please paste the error you get when you try to enter the devshell in your flake? | 02:04:54 |
jayrahdevore | In reply to @aidalgol:matrix.org: Sure thing! Thanks. The exact error I get is
error:
  … while calling the 'derivationStrict' builtin
    at /builtin/derivation.nix:9:12: (source not available)
  … while evaluating derivation 'nix-shell'
    whose name attribute is located at /nix/store/amxd2p02wx78nyaa4bkb0hjvgwhz1dq7-source/pkgs/stdenv/generic/make-derivation.nix:331:7
  … while evaluating attribute 'nativeBuildInputs' of derivation 'nix-shell'
    at /nix/store/amxd2p02wx78nyaa4bkb0hjvgwhz1dq7-source/pkgs/stdenv/generic/make-derivation.nix:375:7:
      374|         depsBuildBuild = elemAt (elemAt dependencies 0) 0;
      375|         nativeBuildInputs = elemAt (elemAt dependencies 0) 1;
         |         ^
      376|         depsBuildTarget = elemAt (elemAt dependencies 0) 2;
  error: Package ‘tensorrt-8.6.1.6’ in /nix/store/amxd2p02wx78nyaa4bkb0hjvgwhz1dq7-source/pkgs/development/cuda-modules/generic-builders/manifest.nix:325 is marked as broken, refusing to evaluate.
  a) To temporarily allow broken packages, you can use an environment variable
     for a single invocation of the nix tools.
       $ export NIXPKGS_ALLOW_BROKEN=1
     Note: When using `nix shell`, `nix build`, `nix develop`, etc with a flake,
     then pass `--impure` in order to allow use of environment variables.
  b) For `nixos-rebuild` you can set
       { nixpkgs.config.allowBroken = true; }
     in configuration.nix to override this.
  c) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
       { allowBroken = true; }
     to ~/.config/nixpkgs/config.nix.
| 02:13:44 |
aidalgol | Looking at pkgs/development/cuda-modules/tensorrt/releases.nix, it does look like your versions should be compatible, so I'm not sure what's marking it as broken. I also tried it locally without the overlay, and get the same error. Hopefully SS can help me understand the logic around marking CUDA packages as broken. | 02:27:59 |
connor (he/him) | I'll take a look at it | 12:37:48 |
connor (he/him) | Two things about the flake which may be surprising:
- cudnnSupport = true; shouldn't do anything. IIRC, in the only places it occurs in Nixpkgs, it defaults to the value of config.cudaSupport and isn't read from config; it is expected to be passed as an argument via an override on the result of callPackage (for example, if you wanted to turn cuDNN support off explicitly). Although, if this is for something out of tree, I guess keep it? config.cudaSupport has an actual option in Nixpkgs, so its presence (and a valid boolean value) is guaranteed, but config.cudnnSupport does not.
- cudatoolkit = prev.cudatoolkit.override {cudaVersion = "12.1";}; shouldn't be used; if you want to use a specific version of CUDA, use the explicitly versioned package sets (like cudaPackages_12_1).
Additionally, since system is a variable referring to a string, you don't need to quote it when you interpolate it into an attribute path -- you can write devShells.${system}.default instead.
| 12:46:39 |
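A minimal sketch of the override pattern described above, assuming a hypothetical out-of-tree package (the name cudnn-example and its trivial build are made up for illustration; only the argument defaults mirror what the message describes):

```nix
# Hypothetical sketch: `cudnn-example` is not a real Nixpkgs package.
# cudnnSupport is a function argument that defaults to cudaSupport,
# which in turn defaults to config.cudaSupport.
{ pkgs }:
let
  example = pkgs.callPackage (
    { lib, stdenv, config, cudaPackages
    , cudaSupport ? config.cudaSupport
    , cudnnSupport ? cudaSupport
    }:
    stdenv.mkDerivation {
      pname = "cudnn-example";
      version = "0.1";
      dontUnpack = true;
      # Only pull in cuDNN when the flag is on.
      buildInputs = lib.optionals cudnnSupport [ cudaPackages.cudnn ];
      installPhase = "touch $out";
    }
  ) { };
in
{
  # Follows config.cudaSupport through the defaults.
  withDefaults = example;
  # Turns cuDNN off explicitly via an override on the callPackage result.
  withoutCudnn = example.override { cudnnSupport = false; };
}
```

Setting cudnnSupport in the Nixpkgs config would not reach this package, since nothing in it reads config.cudnnSupport.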
connor (he/him) | With those nits resolved, here's the flake I'm looking at -- I added legacyPackages.${system} = pkgs; because it's helpful to look at the copy of Nixpkgs we've instantiated and we'll use it for troubleshooting | 12:47:12 |
connor (he/him) | {
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  outputs =
    { nixpkgs, ... }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        config = {
          allowUnfree = true;
          cudaSupport = true;
          cudaEnableForwardCompat = true;
          cudaCapabilities = [ "8.6" ];
        };
        inherit system;
        overlays = [
          (self: super: {
            cudaPackages = super.cudaPackages.overrideScope (
              final: prev: {
                cudnn = prev.cudnn_8_9;
                tensorrt = prev.tensorrt_8_6;
              }
            );
          })
        ];
      };
    in
    {
      legacyPackages.${system} = pkgs;
      devShells.${system}.default = pkgs.mkShell { packages = with pkgs; [ cudaPackages.tensorrt ]; };
    };
}
| 12:47:38 |
connor (he/him) | One of the things we've been working hard on is adding a way to see why something is marked as broken or a bad platform. To do that, we've added two attributes to most of the CUDA packages: brokenConditions and badPlatformsConditions (https://github.com/NixOS/nixpkgs/blob/84b4b872f06c0c5a9e1bc82ff747267b45925df6/pkgs/development/cuda-modules/generic-builders/manifest.nix#L130-L143). Ideally, we'd use these attribute sets to decide whether to mark something as broken or a bad platform. Since different packages have different conditions, they can extend these. Here's the one for TensorRT: https://github.com/NixOS/nixpkgs/blob/84b4b872f06c0c5a9e1bc82ff747267b45925df6/pkgs/development/cuda-modules/tensorrt/fixup.nix#L27-L47 | 12:49:57 |
connor (he/him) | We can use that to see what's going wrong:
$ nix eval .#cudaPackages.tensorrt.brokenConditions
warning: Git tree '/home/connorbaker/Packages/temp_cuda_help' is dirty
{ "CUDA version is too new" = true; "CUDA version is too old" = false; "CUDNN version is too new" = false; "CUDNN version is too old" = false; }
| 12:51:04 |
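When a package has many conditions, it can help to list only the ones that fired. A small sketch in plain Nix, using the attribute set from the output above as literal input (the binding name brokenConditions here is just a local variable, not a lookup into Nixpkgs):

```nix
let
  # Copied from the `nix eval` output above.
  brokenConditions = {
    "CUDA version is too new" = true;
    "CUDA version is too old" = false;
    "CUDNN version is too new" = false;
    "CUDNN version is too old" = false;
  };
in
# Keep only the names of the conditions whose value is true.
builtins.filter (name: brokenConditions.${name}) (builtins.attrNames brokenConditions)
# evaluates to [ "CUDA version is too new" ]
```

The same filter could presumably be passed to nix eval through its --apply flag instead of copying the attribute set by hand.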
connor (he/him) | Looking at the source (gross, yes, and perhaps these values should be in passthru), we can see that the version of TensorRT we currently have packaged can't be used with CUDA versions past 12.1: https://github.com/NixOS/nixpkgs/blob/84b4b872f06c0c5a9e1bc82ff747267b45925df6/pkgs/development/cuda-modules/tensorrt/releases.nix#L120-L127 | 12:51:54 |
connor (he/him) | Since cudaPackages currently defaults to CUDA 12.2, we can change the default version Nixpkgs uses by changing
  cudaPackages = super.cudaPackages.overrideScope (
to
  cudaPackages = super.cudaPackages_12_1.overrideScope (
so every package will be built against 12.1 by default (unless the package specifically requests a different version -- you'd need to find and override those manually)
| 12:54:56 |
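Applied to the cleaned-up flake from earlier in the thread, the one-line change would look roughly like this. It is a sketch that assumes the nixos-unstable of March 2024, where cudaPackages_12_1, cudnn_8_9 and tensorrt_8_6 all exist; later Nixpkgs revisions may have renamed or dropped them:

```nix
{
  inputs.nixpkgs.url = "github:nixos/nixpkgs/nixos-unstable";
  outputs =
    { nixpkgs, ... }:
    let
      system = "x86_64-linux";
      pkgs = import nixpkgs {
        inherit system;
        config = {
          allowUnfree = true;
          cudaSupport = true;
          cudaEnableForwardCompat = true;
          cudaCapabilities = [ "8.6" ];
        };
        overlays = [
          (self: super: {
            # Base the default package set on CUDA 12.1 instead of the
            # then-default 12.2, which TensorRT 8.6 rejects as too new.
            cudaPackages = super.cudaPackages_12_1.overrideScope (
              final: prev: {
                cudnn = prev.cudnn_8_9;
                tensorrt = prev.tensorrt_8_6;
              }
            );
          })
        ];
      };
    in
    {
      # Exposed so the instantiated Nixpkgs can be inspected with `nix eval`.
      legacyPackages.${system} = pkgs;
      devShells.${system}.default = pkgs.mkShell { packages = with pkgs; [ cudaPackages.tensorrt ]; };
    };
}
```

Re-running nix eval .#cudaPackages.tensorrt.brokenConditions against this flake is a quick way to check whether every condition now reports false.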
jayrahdevore | connor (he/him) (UTC-5): This is fantastic. Thank you for taking some time to not just provide a solution but also give a few helpful pointers and a tip for debugging in the future. I appreciate it! | 15:19:58 |
aidalgol | connor (he/him) (UTC-5): Awesome breakdown, thank you! | 19:28:33 |
| 27 Mar 2024 |
connor (he/him) | jfc why don't getDev getLib and friends actually get the output | 02:54:32 |
connor (he/him) | pain https://github.com/NixOS/nixpkgs/pull/279952/commits/3fc9e9baf4786acc679ff11e4cd0f2377b6c2e96 | 04:08:03 |
SomeoneSerge (matrix works sometimes) | if ! pkg ? outputSpecified || ! pkg.outputSpecified
suspicious | 14:29:47 |
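To see why that condition matters, here is a toy reproduction in plain Nix. The getOutput below is a local approximation of the helper behind getLib and getDev, written from memory around the quoted condition (check lib/attrsets.nix for the real definition), and mock is a hypothetical stand-in for a multi-output package. It assumes that an explicitly selected output carries outputSpecified = true:

```nix
let
  # Local approximation of the nixpkgs helper; not imported from lib.
  getOutput = output: pkg:
    if !(pkg ? outputSpecified) || !pkg.outputSpecified
    then pkg.${output} or pkg.out or pkg
    else pkg;

  # Hypothetical multi-output package: every selected output still
  # exposes its siblings, but is flagged with outputSpecified = true.
  mock = {
    name = "mock";
    lib = mock // { name = "mock-lib"; outputSpecified = true; };
    dev = mock // { name = "mock-dev"; outputSpecified = true; };
  };
in
{
  # No output selected yet, so the requested one is returned.
  fromDefault = (getOutput "lib" mock).name;
  # An output was already selected, so it is passed through untouched.
  fromDev = (getOutput "lib" mock.dev).name;
}
# evaluates to { fromDefault = "mock-lib"; fromDev = "mock-dev"; }
```

Under these assumptions, asking for the lib output of something that already points at dev silently returns dev.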
connor (he/him) | When looking at the build inputs of saxpy I saw three copies of the main libcublas derivation and decided to just access the components directly | 14:43:42 |
connor (he/him) | I mean, those utility functions also aren’t aware of splicing and so can’t really be effectively used inside nativeBuildInputs, for example, because it’ll get the output of the package from pkgs instead of buildPackages. | 14:45:26 |
SomeoneSerge (matrix works sometimes) | In reply to @connorbaker:matrix.org: Hm. You're saying that, in nativeBuildInputs = [ (getLib libcublas) ], getLib forcefully extracts the HostTarget slice and returns its output? | 14:50:16 |
SomeoneSerge (matrix works sometimes) | If that was correct I'd expect pretty much all of nixpkgs to be broken cross-compilation wise | 14:50:56 |
SomeoneSerge (matrix works sometimes) | 🤔 this isn't all that far from observations though | 14:51:40 |