| 9 Nov 2022 |
SomeoneSerge (back on matrix) | I rather meant you can start with a personal repo. Maybe just something regularly running tests of your choice against current master and reporting the results in some form.
Later there'll be a question of where this testsuite could be integrated (e.g. ideally we'd make a separate "stable cuda channel", which commits would only reach after passing cuda-related tests, but that too is work to be done).
I think samuela's nixpkgs-upkeep is worth a look: it automatically opens issues about newly broken packages. | 21:14:29 |
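A minimal sketch of the kind of personal-repo job being described here: build a hand-picked set of CUDA-related packages against current nixpkgs master and file an issue when one breaks. The package names and the target repo below are placeholders, and nixpkgs-upkeep itself is more sophisticated; this is only an illustration of the shape of such a job.

```sh
#!/usr/bin/env bash
# Sketch: build a few CUDA-related packages from nixpkgs master and report
# failures. Package names and the issue repo are placeholders, not the real
# nixpkgs-upkeep setup.
set -euo pipefail
export NIXPKGS_ALLOW_UNFREE=1   # CUDA packages are unfree

packages=(
  python3Packages.torch
  python3Packages.jaxlib
)

for p in "${packages[@]}"; do
  if ! nix build --impure --no-link "github:NixOS/nixpkgs#${p}"; then
    # Requires the GitHub CLI; the repo here is hypothetical.
    gh issue create --repo example/cuda-testsuite \
      --title "${p} fails to build on nixpkgs master" \
      --body "nix build of ${p} failed on $(date -u +%F)."
  fi
done
```

Run on a schedule (cron, or a scheduled GitHub Actions workflow), this already gives the "regularly running tests and reporting results" loop mentioned above.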
SomeoneSerge (back on matrix) | Now thinking of it though, I'm not quite sure if gpu-enabled tests are the most needed investment (compute-wise, or work-wise) | 21:16:25 |
breakds | Sounds good. I am currently maintaining a separate repo of a few machine learning packages: https://github.com/nixvital/ml-pkgs Will take a look at nixpkgs-upkeep and see how to add a CI similarly | 21:16:30 |
SomeoneSerge (back on matrix) | GPU checks are something we don't have, yes. But even the checkPhases we have now are sufficient most of the time to tell when a change breaks stuff. The issue is that we can't prevent that change from reaching the channels we consume as users. That could be solved by a separate branch (even if outside nixpkgs), which master would only be merged into automatically once our checks pass | 21:24:27 |
breakds | I see. But isn't it the CI's job to prevent such offending commits from being merged, by running the tests? Is it because running all the tests takes a lot of time due to the size of nixpkgs? | 21:34:30 |
SomeoneSerge (back on matrix) | hexa: you seem to be online 🙃 could you merge this? | 21:35:12 |
SomeoneSerge (back on matrix) | It's because nixpkgs' CI doesn't take CUDA packages into account (they're unfree and the trust model for nixpkgs' CI workers and public cache is such that they're... not expected to be running potentially malicious blackbox binaries). So we run a parallel CI. This ensures our packages are available prebuilt in cachix, and we can spot failures in the dashboard and handle them after the fact | 21:38:23 |
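The parallel CI described here boils down to building the unfree packages on trusted hardware and pushing the results to cachix so users get prebuilt binaries. A minimal sketch, assuming cachix is installed and CACHIX_AUTH_TOKEN grants write access to the cuda-maintainers cache; the package is just an example:

```sh
# Build an unfree CUDA package and push its output paths to the team cache.
export NIXPKGS_ALLOW_UNFREE=1
nix build --impure --print-out-paths "github:NixOS/nixpkgs#cudaPackages.cudnn" \
  | cachix push cuda-maintainers
```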
SomeoneSerge (back on matrix) | Again, I think a separate branch with an auto-merge would be entirely reasonable | 21:39:26 |
SomeoneSerge (back on matrix) | as a compromise | 21:39:31 |
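A rough sketch of that auto-merged branch compromise. The branch name, the "origin" remote, and the single build standing in for the real check set are all assumptions, not an existing setup:

```sh
#!/usr/bin/env bash
# Advance a cuda-stable branch to current nixpkgs master only if checks pass.
set -euo pipefail
export NIXPKGS_ALLOW_UNFREE=1

git fetch https://github.com/NixOS/nixpkgs master
candidate=$(git rev-parse FETCH_HEAD)

# Stand-in for the real CUDA test suite.
if nix build --impure --no-link "github:NixOS/nixpkgs/${candidate}#cudaPackages.cudatoolkit"; then
  git push origin "${candidate}:refs/heads/cuda-stable"
fi
```

Users would then follow cuda-stable instead of the regular channel and only ever see commits that passed the CUDA checks.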
breakds | I understand now. Thanks for taking time to explain the situation Someone S ! | 21:40:07 |
SomeoneSerge (back on matrix) | No problem, welcome to the club :) | 21:40:59 |
hexa | done | 21:42:54 |
| breakds set their display name to breakds. | 21:51:01 |
| 10 Nov 2022 |
Domen Kožar | In reply to @ss:someonex.net It's because nixpkgs' CI doesn't take CUDA packages into account (they're unfree and the trust model for nixpkgs' CI workers and public cache is such that they're... not expected to be running potentially malicious blackbox binaries). So we run a parallel CI. This ensures our packages are available prebuilt in cachix, and we can spot failures in the dashboard and handle them after the fact ah, that reminds me, I need to finish https://github.com/cachix/nixpkgs-unfree-redistributable | 04:50:55 |
eahlberg | Is it possible to see what versions are in the cuda-maintainers cachix cache? I'm trying to get CUDA up and running on an AWS EC2 instance with a Tesla K80, but some things are compiling, which as far as I understand will take a really long time | 09:27:54 |
SomeoneSerge (back on matrix) | The only interface I know for checking the actual contents is nix path-info -r /nix/store/... --store ... But a heuristic to avoid rebuilds is to pick a recently finished build that doesn't have too many failures: it will have been cached, and likely not yet garbage-collected by cachix | 09:31:12 |
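A concrete version of that check, as a sketch: evaluate the output path without building anything, then ask the binary cache whether it and its closure are present. The package and the cache URL (the standard cachix scheme applied to the cuda-maintainers cache) are example values:

```sh
# Evaluate the store path, then query the cache; path-info errors if the path
# is not in the cache, and prints the closure if it is.
export NIXPKGS_ALLOW_UNFREE=1
p=$(nix eval --impure --raw "github:NixOS/nixpkgs#python3Packages.torch.outPath")
nix path-info -r --store https://cuda-maintainers.cachix.org "$p"
```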
SomeoneSerge (back on matrix) | What's it going to be? I guess I should look up cachix-deploy-lib | 09:38:30 |
Domen Kožar | It will build all unfree packages for macos/linux | 10:40:35 |
SomeoneSerge (back on matrix) | ...dedicating hardware for that long-term?
A cache, separate from cuda? | 10:43:56 |
eahlberg | In reply to @ss:someonex.net The only interface I know for checking the actual contents is nix path-info -r /nix/store/... --store ... But a heuristic to avoid rebuilds is to pick a recently finished build that doesn't have too many failures: it will have been cached, and likely not yet garbage-collected by cachix Cool, thanks! Managed to get it up and running using the cache | 14:47:22 |
Domen Kožar | In reply to @ss:someonex.net ...dedicating hardware for that long-term? A cache, separate from cuda? I think so. Any concerns? | 14:51:40 |
SomeoneSerge (back on matrix) | Nooo no no, this sounds positively great! I'm only trying to understand what effective total resources the community is going to have access to after you get this running, and how that could be used to change the nixpkgs ML/scicomp user experience 😆 | 15:09:13 |
| 11 Nov 2022 |
tpw_rules | Someone S: can you explain a little more why you want to add a platform attribute to numba? It seems most other python packages don't have it, and there's nothing in there that really limits it to a particular platform that isn't already limited by some dependency | 01:20:58 |
tpw_rules | Just wondering if this is some new standard I'm not aware of, really | 01:23:54 |
| 16 Nov 2022 |
| omlet joined the room. | 20:34:15 |
| 23 Nov 2022 |
Samuel Ainsworth | In reply to @breakds:matrix.org A separate question, as I read from https://discourse.nixos.org/t/announcing-the-nixos-cuda-maintainers-team-and-a-call-for-maintainers/18074 , is x86_64-linux computing cycle still needed for github actions? I have a spare RTX 3080 not attached to any machine at this moment, not sure what is the best way to make it useful to the project. Shall I build a machine to run github actions? Hi breakds! Thanks so much for your generosity! Yes, we'd love to find a way to make it useful somehow. Oddly enough our primary bottleneck atm is CPU cycles. It turns out that running CIs (nixpkgs-upkeep, Someone S's build CI) takes a bit of compute. I'm finding that build jobs frequently hit the free-tier 6-hour limit on GH Actions, so finding an x86_64-linux machine that could be home to a GH Actions runner would be great! | 09:11:05 |
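For anyone offering hardware: registering an x86_64-linux machine as a self-hosted GitHub Actions runner is roughly the following. The repository URL is only an example, and the token is issued per repository under Settings > Actions > Runners:

```sh
# First download and unpack the actions-runner-linux-x64 release tarball from
# https://github.com/actions/runner/releases into ./actions-runner, then:
cd actions-runner
./config.sh --url https://github.com/samuela/nixpkgs-upkeep --token <RUNNER_TOKEN>
./run.sh
```

Workflows then opt in to the machine with runs-on: self-hosted, which sidesteps the 6-hour limit on hosted jobs.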
Samuel Ainsworth | I created https://github.com/samuela/nixpkgs-upkeep with the goal of ultimately running GPU-enabled tests in CI, but honestly we've had so many issues just keeping the packages building at all (only CPU required to build), that GPU-enabled tests haven't been the issue so far | 09:12:34 |