!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

311 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
62 Servers

Sender | Message | Time
19 May 2024
@ss:someonex.net SomeoneSerge (matrix works sometimes)
(Did I just edit a message instead of sending a new one again?)
23:51:45
@aidalgol:matrix.org aidalgol
I just cloned https://github.com/ConnorBaker/nix-cuda-test.git, applied the fix described earlier, then ran nix run .#nix-cuda-test
23:52:36
@ss:someonex.net SomeoneSerge (matrix works sometimes)
In reply to @aidalgol:matrix.org
An RTX 3080 is too old?!
So yeah there was a question about cudaCapabilities = [ ... "8.6" ]
23:52:41
@aidalgol:matrix.org aidalgol
I saw it. 👍️
23:52:53
@ss:someonex.net SomeoneSerge (matrix works sometimes)
In reply to @aidalgol:matrix.org
I just cloned https://github.com/ConnorBaker/nix-cuda-test.git, applied the fix described earlier, then ran nix run .#nix-cuda-test
Our culprit: https://github.com/ConnorBaker/nix-cuda-test/blob/182c2148e6df0932fe19f9cb7180173ee2f9cb2d/flake.nix#L66
23:53:08
20 May 2024
@connorbaker:matrix.org connor (burnt/out) (UTC-8)
Ah yeah sorry I usually just have it set to my GPU to speed up compile since I use it to test PRs
00:22:23
@aidalgol:matrix.org aidalgol
Sooo... while that was building, KDE crashed to the display manager, and now the GPU usage is showing a non-zero value as I expected earlier. I have no idea what could have changed.
01:12:24
@connorbaker:matrix.org connor (burnt/out) (UTC-8)
Maybe it forced the driver to reload lol
03:46:33
@ss:someonex.net SomeoneSerge (matrix works sometimes)

connor (he/him) (UTC-5): Samuel Ainsworth: Madoura:

1. I'd like us to add a more generic alias to this room that would encompass a wider range of topics, rocm at the least. The reasoning is the same: I don't think we need a special room for cuda... Madoura: what do rocm-maintainers think about their presence on matrix?
2. I notice we never brought up the subject of onboarding new people anywhere in public, not even discourse. Wdyt should be done about that?

14:20:05
@trexd:matrix.org trexd
Could call the room nixos-gpu since that covers rocm too.
14:21:23
@ss:someonex.net SomeoneSerge (matrix works sometimes)
In reply to @trexd:matrix.org
Could call the room nixos-gpu since that covers rocm too.
Yes, gpu/coprocessors/accelerators/scicomp/ai even/anything in that direction. Well there already is nixos hpc and nixos data science but I don't see much conversation there, what has to be changed to spark conversations? There's activity in matthewcroughan's flake room, and sometimes here, but not in nixos ds.
14:34:08
@connorbaker:matrix.org connor (burnt/out) (UTC-8)

To tackle diamond dependencies (among other things), I started making https://github.com/ConnorBaker/dep-tracker

Specify a flake attribute for a package and it’ll grab a copy of all the attributes on the package containing a list of dependencies (the attributes it looks for are here https://github.com/ConnorBaker/dep-tracker/blob/cd8e927c561f3f1ed5c904609654c946d85cf954/packages/dep-tracker/dep_tracker/types.py#L15). It’ll look through those arrays and populate a SQLite database with libraries it finds in those dependencies.
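
A minimal sketch of the database step described above, not dep-tracker's actual code. The table layout, the function names, and the attribute names used in the usage note (`buildInputs`, `propagatedBuildInputs`) are assumptions for illustration; the real list of inspected attributes lives in the linked types.py.

```python
import sqlite3

# Hypothetical schema: one row per (package, attribute, dependency, library).
SCHEMA = """
CREATE TABLE IF NOT EXISTS provides (
    package    TEXT NOT NULL,
    attribute  TEXT NOT NULL,
    dependency TEXT NOT NULL,
    library    TEXT NOT NULL,
    UNIQUE (package, attribute, dependency, library)
)
"""


def record_libraries(conn, package, deps_by_attr, libs_by_dep):
    """Store which libraries each dependency of `package` provides.

    deps_by_attr maps an attribute name to the dependencies listed under it;
    libs_by_dep maps a dependency to the library names found in it.
    Returns the number of candidate rows (duplicates are ignored on insert).
    """
    conn.execute(SCHEMA)
    rows = [
        (package, attr, dep, lib)
        for attr, deps in deps_by_attr.items()
        for dep in deps
        for lib in libs_by_dep.get(dep, [])
    ]
    conn.executemany("INSERT OR IGNORE INTO provides VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def providers_of(conn, library):
    """List the dependencies that ship `library`; more than one hints at a diamond."""
    cur = conn.execute(
        "SELECT DISTINCT dependency FROM provides WHERE library = ? ORDER BY dependency",
        (library,),
    )
    return [row[0] for row in cur]
```

With an in-memory database (`sqlite3.connect(":memory:")`), calling `providers_of(conn, "libx.so")` after recording two dependencies that both ship `libx.so` returns both names, which is the kind of overlap a diamond-dependency check would look for.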

Now, a question: besides recursing and doing the same for every dependency I find (that is, harvesting attributes and updating the database), is there an easier way to get the closure of dependencies without building the package? IIRC nix path-info requires the package be built.
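
One possible approach to the closure question, sketched under assumptions: `nix derivation show --recursive` (an experimental nix-command subcommand) evaluates and instantiates a package without building it and prints JSON keyed by derivation path. The sketch assumes each entry has an `inputDrvs` mapping whose keys are derivation paths; the exact JSON layout differs between Nix versions, so treat the field names as unverified. This yields the build-time closure, which is a superset of the runtime closure that `nix path-info` reports for built outputs. The function names are hypothetical.

```python
import json
import subprocess


def derivation_closure(drvs, root):
    """Return every derivation path reachable from `root` via inputDrvs.

    `drvs` is the parsed JSON object: derivation path -> derivation attributes.
    """
    seen = set()
    stack = [root]
    while stack:
        drv = stack.pop()
        if drv in seen:
            continue
        seen.add(drv)
        # Iterating the inputDrvs mapping yields its keys (the input .drv paths),
        # whether the values are output lists or nested objects.
        stack.extend(drvs.get(drv, {}).get("inputDrvs", {}))
    return seen


def show_derivations(installable):
    """Run `nix derivation show --recursive` and parse its JSON (needs nix on PATH)."""
    out = subprocess.run(
        ["nix", "derivation", "show", "--recursive", installable],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)
```

Walking `inputDrvs` explicitly is only needed when the JSON covers several packages and the closure of one root is wanted; for a single installable the recursive output is already that closure.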

A different question: with that hardcoded list of attributes I inspect, is it possible I’d miss dependencies (and therefore libraries) which are present in the closure?

@someoneserge you have good ideas about finding dependencies — any suggestions? Currently finding what a dependency provides is limited to listing the names of libraries present under lib (https://github.com/ConnorBaker/dep-tracker/blob/cd8e927c561f3f1ed5c904609654c946d85cf954/packages/dep-tracker/dep_tracker/deps.py#L42) and finding out what a library needs is accomplished through patchelf (https://github.com/ConnorBaker/dep-tracker/blob/cd8e927c561f3f1ed5c904609654c946d85cf954/packages/dep-tracker/dep_tracker/deps.py#L56)
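
The two lookups mentioned above can be sketched roughly as follows. This is an illustration, not the code at the linked deps.py lines: the function names and the `".so" in name` filter are assumptions, while `patchelf --print-needed` is the real patchelf option that prints one DT_NEEDED entry per line.

```python
import subprocess
from pathlib import Path


def provided_libraries(output_path):
    """Names of shared libraries directly under <output_path>/lib (empty if no lib dir)."""
    lib_dir = Path(output_path) / "lib"
    if not lib_dir.is_dir():
        return []
    # Assumption: anything with ".so" in its name counts as a shared library.
    return sorted(p.name for p in lib_dir.iterdir() if ".so" in p.name)


def parse_needed(patchelf_output):
    """Turn `patchelf --print-needed` output into a list of library names."""
    return [line.strip() for line in patchelf_output.splitlines() if line.strip()]


def needed_libraries(library_path):
    """Ask patchelf which libraries an ELF file needs (requires patchelf on PATH)."""
    result = subprocess.run(
        ["patchelf", "--print-needed", str(library_path)],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_needed(result.stdout)
```

Note that listing only the top level of `lib` would miss libraries in subdirectories or in other outputs, which may matter for the question about missed dependencies.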

17:57:11
@tpw_rules:matrix.org tpw_rules
is there some reason we've got an out of date tensorflow build from source?
18:07:41
@tpw_rules:matrix.org tpw_rules
oh maybe the bin has cuda support now?
18:08:17
@ss:someonex.net SomeoneSerge (matrix works sometimes)

IIRC nix path-info requires the package be built.

For deciding which dependencies to retain a runtime reference to

18:11:59
@ss:someonex.net SomeoneSerge (matrix works sometimes)
Have you seen https://fzakaria.com/2023/09/11/quick-insights-using-sqlelf.html?
18:17:01
@glepage:matrix.org Gaétan Lepage
In reply to @tpw_rules:matrix.org
is there some reason we've got an out of date tensorflow build from source?
Because a >200 IQ is necessary to grasp this derivation 😅
18:19:34
@trexd:matrix.org trexd
In reply to @glepage:matrix.org
Because a >200 IQ is necessary to grasp this derivation 😅
What's the TLDR on why tensorflow is so difficult to package if you don't mind me asking? Maybe this is another example of "packaging is a hard problem" that I can add to my Nix pitch slides.
18:27:20
@glepage:matrix.org Gaétan Lepage
Well, the tensorflow derivation (https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/python-modules/tensorflow/default.nix) is ~600 lines of hacking around the bazel build system, + doing a bunch of hacks to inject our own dependencies + CUDA stuff...
All of this requires a lot of expertise (that I personally lack).
It is surely one of the hardest packages that I am aware of in the python package set.
18:36:24
@glepage:matrix.org Gaétan Lepage
Now, if you want to take this challenge and update tensorflow, please go ahead.
For context, there is a stale PR for updating tensorflow to 2.14: https://github.com/NixOS/nixpkgs/pull/272838
18:38:02
@trexd:matrix.org trexd
In reply to @glepage:matrix.org
Now, if you want to take this challenge and update tensorflow, please go ahead.
For context, there is a stale PR for updating tensorflow to 2.14: https://github.com/NixOS/nixpkgs/pull/272838
I'm already full up with packaging hasktorch haha! Yeah I've had a look at the derivation before and it seems nuts.
18:49:58
@aidalgol:matrix.org aidalgol
In reply to @connorbaker:matrix.org
Maybe it forced the driver to reload lol
I think that's what it is, yeah. This seems to break upon resuming the machine from suspend.
20:06:08
@connorbaker:matrix.org connor (burnt/out) (UTC-8)
In reply to @ss:someonex.net
Have you seen https://fzakaria.com/2023/09/11/quick-insights-using-sqlelf.html?
I have! I saw Farid give a presentation on it at NixCon NA and that was neat; but it’s not packaged in Nixpkgs and I don’t want to do it :/
20:58:45
@tpw_rules:matrix.org tpw_rules
does the tensorflow-bin have cuda support?
21:41:29
@glepage:matrix.org Gaétan Lepage
Hmm I guess so
22:01:54
@glepage:matrix.org Gaétan Lepage
It comes with its prebuilt CUDA binary I think
22:02:11
@kamillaova:matrix.org Kamilla 'ova joined the room.
23:23:10
21 May 2024
@glepage:matrix.org Gaétan Lepage
connor (he/him) (UTC-5) where do you "develop" from on nixpkgs?
Do you use the same machine for writing code and building things (i.e. doing eval)?
08:04:04

Room Version: 9