| 9 Nov 2022 |
SomeoneSerge (back on matrix) | I rather meant you can start with a personal repo. Maybe just something regularly running tests of your choice against current master and reporting the results in some form.
Later there'll be a question of where this testsuite could be integrated (e.g. ideally we'd make a separate "stable cuda channel", which commits would only reach after passing cuda-related tests, but that too is work to be done).
I think samuela's nixpkgs-upkeep is worth a look: it automatically opens issues about newly broken packages. | 21:14:29 |
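A minimal sketch of the kind of personal-repo job being described here: build a hand-picked set of CUDA-related packages against current nixpkgs master and file an issue when one breaks. The package names and the target repo below are placeholders, and nixpkgs-upkeep itself is more sophisticated; this is only an illustration of the shape of such a job.

```sh
#!/usr/bin/env bash
# Sketch: build a few CUDA-related packages from nixpkgs master and report
# failures. Package names and the issue repo are placeholders, not the real
# nixpkgs-upkeep setup.
set -euo pipefail
export NIXPKGS_ALLOW_UNFREE=1   # CUDA packages are unfree

packages=(
  python3Packages.torch
  python3Packages.jaxlib
)

for p in "${packages[@]}"; do
  if ! nix build --impure --no-link "github:NixOS/nixpkgs#${p}"; then
    # Requires the GitHub CLI; the repo here is hypothetical.
    gh issue create --repo example/cuda-testsuite \
      --title "${p} fails to build on nixpkgs master" \
      --body "nix build of ${p} failed on $(date -u +%F)."
  fi
done
```

Run on a schedule (cron, or a scheduled GitHub Actions workflow), this already gives the "regularly running tests and reporting results" loop mentioned above.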
SomeoneSerge (back on matrix) | Now thinking of it though, I'm not quite sure if gpu-enabled tests are the most needed investment (compute-wise, or work-wise) | 21:16:25 |
breakds | Sounds good. I am currently maintaining a separate repo of a few machine learning packages: https://github.com/nixvital/ml-pkgs Will take a look at nixpkgs-upkeep and see how to add a CI similarly | 21:16:30 |
SomeoneSerge (back on matrix) | GPU checks are something we don't have, yes. But even the checkPhases we have now are sufficient most of the time to tell when a change breaks stuff. The issue is that we can't prevent that change from reaching the channels we consume as users. That could be solved by a separate branch (even if outside nixpkgs), which master would only be merged into automatically once our checks pass | 21:24:27 |
breakds | I see. But isn't it the CI's job to prevent such offending commits from being merged, by running the tests? Is it because running all the tests takes a lot of time due to the size of nixpkgs? | 21:34:30 |
SomeoneSerge (back on matrix) | hexa: you seem to be online 🙃 could you merge this? | 21:35:12 |
SomeoneSerge (back on matrix) | It's because nixpkgs' CI doesn't take CUDA packages into account (they're unfree and the trust model for nixpkgs' CI workers and public cache is such that they're... not expected to be running potentially malicious blackbox binaries). So we run a parallel CI. This ensures our packages are available prebuilt in cachix, and we can spot failures in the dashboard and handle them after the fact | 21:38:23 |
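The parallel CI described here boils down to building the unfree packages on trusted hardware and pushing the results to cachix so users get prebuilt binaries. A minimal sketch, assuming cachix is installed and CACHIX_AUTH_TOKEN grants write access to the cuda-maintainers cache; the package is just an example:

```sh
# Build an unfree CUDA package and push its output paths to the team cache.
export NIXPKGS_ALLOW_UNFREE=1
nix build --impure --print-out-paths "github:NixOS/nixpkgs#cudaPackages.cudnn" \
  | cachix push cuda-maintainers
```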
SomeoneSerge (back on matrix) | Again, I think a separate branch with an auto-merge would be entirely reasonable | 21:39:26 |
SomeoneSerge (back on matrix) | as a compromise | 21:39:31 |
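A rough sketch of that auto-merged branch compromise. The branch name, the "origin" remote, and the single build standing in for the real check set are all assumptions, not an existing setup:

```sh
#!/usr/bin/env bash
# Advance a cuda-stable branch to current nixpkgs master only if checks pass.
set -euo pipefail
export NIXPKGS_ALLOW_UNFREE=1

git fetch https://github.com/NixOS/nixpkgs master
candidate=$(git rev-parse FETCH_HEAD)

# Stand-in for the real CUDA test suite.
if nix build --impure --no-link "github:NixOS/nixpkgs/${candidate}#cudaPackages.cudatoolkit"; then
  git push origin "${candidate}:refs/heads/cuda-stable"
fi
```

Users would then follow cuda-stable instead of the regular channel and only ever see commits that passed the CUDA checks.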
breakds | I understand now. Thanks for taking time to explain the situation Someone S ! | 21:40:07 |
SomeoneSerge (back on matrix) | No problem, welcome to the club :) | 21:40:59 |
hexa | done | 21:42:54 |
| breakds set their display name to breakds. | 21:51:01 |
| 10 Nov 2022 |
Domen Kožar | In reply to @ss:someonex.net It's because nixpkgs' CI doesn't take CUDA packages into account (they're unfree and the trust model for nixpkgs' CI workers and public cache is such that they're... not expected to be running potentially malicious blackbox binaries). So we run a parallel CI. This ensures our packages are available prebuilt in cachix, and we can spot failures in the dashboard and handle them after the fact ah, that reminds me, I need to finish https://github.com/cachix/nixpkgs-unfree-redistributable | 04:50:55 |
eahlberg | Is it possible to see what versions are in the cuda-maintainers cachix cache? I'm trying to get CUDA up and running on an AWS EC2 instance with a Tesla K80, but some things are compiling, which as far as I understand will take a really long time | 09:27:54 |
SomeoneSerge (back on matrix) | The only interface I know for checking the actual contents is nix path-info -r /nix/store/... --store ... But a heuristic to avoid rebuilds is to pick a recently finished build that doesn't have too many failures: it will have been cached, and likely not yet garbage-collected by cachix | 09:31:12 |
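A concrete version of that check, as a sketch: evaluate the output path without building anything, then ask the binary cache whether it and its closure are present. The package and the cache URL (the standard cachix scheme applied to the cuda-maintainers cache) are example values:

```sh
# Evaluate the store path, then query the cache; path-info errors if the path
# is not in the cache, and prints the closure if it is.
export NIXPKGS_ALLOW_UNFREE=1
p=$(nix eval --impure --raw "github:NixOS/nixpkgs#python3Packages.torch.outPath")
nix path-info -r --store https://cuda-maintainers.cachix.org "$p"
```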
SomeoneSerge (back on matrix) | What's it going to be? I guess I should look up cachix-deploy-lib | 09:38:30 |
Domen Kožar | It will build all unfree packages for macos/linux | 10:40:35 |
SomeoneSerge (back on matrix) | ...dedicating hardware for that long-term?
A cache, separate from cuda? | 10:43:56 |
eahlberg | In reply to @ss:someonex.net The only interface I know for checking the actual contents is nix path-info -r /nix/store/... --store ... But a heuristic to avoid rebuilds is to pick a recently finished build that doesn't have too many failures: it will have been cached, and likely not yet garbage-collected by cachix Cool, thanks! Managed to get it up and running using the cache | 14:47:22 |
Domen Kožar | In reply to @ss:someonex.net ...dedicating hardware for that long-term? A cache, separate from cuda? I think so. Any concerns? | 14:51:40 |
SomeoneSerge (back on matrix) | Nooo no no, this sounds positively great! I'm only trying to understand what effective total resources the community is going to have access to after you get this running, and how that could be used to change the nixpkgs ML/scicomp user experience 😆 | 15:09:13 |
| 11 Nov 2022 |
tpw_rules | Someone S: can you explain a little more why you want to add a platform attribute to numba? It seems most other python packages don't have it, and there's nothing in there that really limits it to a particular platform that isn't already limited by some dependency | 01:20:58 |
tpw_rules | Just wondering if this is some new standard I'm not aware of, really | 01:23:54 |
| 16 Nov 2022 |
| omlet joined the room. | 20:34:15 |
| 23 Nov 2022 |
Samuel Ainsworth | In reply to @breakds:matrix.org A separate question, as I read from https://discourse.nixos.org/t/announcing-the-nixos-cuda-maintainers-team-and-a-call-for-maintainers/18074 , is x86_64-linux computing cycle still needed for github actions? I have a spare RTX 3080 not attached to any machine at this moment, not sure what is the best way to make it useful to the project. Shall I build a machine to run github actions? Hi breakds! Thanks so much for your generosity! Yes, we'd love to find a way to make it useful somehow. Oddly enough our primary bottleneck atm is CPU cycles. It turns out that running CIs (nixpkgs-upkeep, Someone S's build CI) takes a bit of compute. I'm finding that build jobs frequently hit the free-tier 6-hour limit on GH Actions, so finding an x86_64-linux machine that could be home to a GH Actions runner would be great! | 09:11:05 |
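For anyone offering hardware: registering an x86_64-linux machine as a self-hosted GitHub Actions runner is roughly the following. The repository URL is only an example, and the token is issued per repository under Settings > Actions > Runners:

```sh
# First download and unpack the actions-runner-linux-x64 release tarball from
# https://github.com/actions/runner/releases into ./actions-runner, then:
cd actions-runner
./config.sh --url https://github.com/samuela/nixpkgs-upkeep --token <RUNNER_TOKEN>
./run.sh
```

Workflows then opt in to the machine with runs-on: self-hosted, which sidesteps the 6-hour limit on hosted jobs.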
Samuel Ainsworth | I created https://github.com/samuela/nixpkgs-upkeep with the goal of ultimately running GPU-enabled tests in CI, but honestly we've had so many issues just keeping the packages building at all (only CPU required to build), that GPU-enabled tests haven't been the issue so far | 09:12:34 |