!eWOErHSaiddIbsUNsJ:nixos.org

NixOS CUDA

317 Members
CUDA packages maintenance and support in nixpkgs | https://github.com/orgs/NixOS/projects/27/ | https://nixos.org/manual/nixpkgs/unstable/#cuda
63 Servers


Sender Message Time
19 May 2024
@aidalgol:matrix.orgaidalgolIt sounds exactly like this: https://forums.developer.nvidia.com/t/nvidia-smi-reporting-0-gpu-utilization/261878 09:48:51
@connorbaker:matrix.orgconnor (he/him)I can try to do a thing on my GPU in a bit and see what happens13:33:30
@connorbaker:matrix.orgconnor (he/him)

ahahaha okay well...

python3.11-nix-cuda-test> Running phase: pythonRuntimeDepsCheckHook
python3.11-nix-cuda-test> Executing pythonRuntimeDepsCheck
python3.11-nix-cuda-test> Checking runtime dependencies for nix_cuda_test-0.1.0-py3-none-any.whl
python3.11-nix-cuda-test>   - torchvision>=0.15.0 not satisfied by version 0.18.0a0
13:46:37
@connorbaker:matrix.orgconnor (he/him)so now I guess that needs to be fixed13:46:46
@connorbaker:matrix.orgconnor (he/him) I don't have experience with Python's packaging so I'm not sure how this is implemented: https://github.com/NixOS/nixpkgs/blob/4e6ae832dcc55a3d8c0b05504548524f297f7ed5/pkgs/development/interpreters/python/hooks/python-runtime-deps-check-hook.py#L81-L85 13:51:44
@glepage:matrix.orgGaétan LepageOk ! Thanks for the details !13:54:29
@glepage:matrix.orgGaétan LepageYou have other usage for storage than nix builds right ?13:55:08
@connorbaker:matrix.orgconnor (he/him)Ah yeah definitely! I'm really into multi-frame super resolution so I've been trying to start aggregating photography I've done to turn it into a dataset13:56:32
@connorbaker:matrix.orgconnor (he/him)I've also got a Light L16 I want to use to create a dataset, and a Lytro Illum because I thought it could be neat to see what I can do with a plenoptic camera13:57:07
@connorbaker:matrix.orgconnor (he/him)UGH https://github.com/pytorch/vision/blob/v0.18.0/version.txt 13:58:36
@glepage:matrix.orgGaétan Lepage

Oh I see !
At first, I looked at old MB/CPU combos on ebay (Epyc) but they are

  1. DDR4
  2. not "that" cheap
  3. slower than more modern chips

Lately I was more looking at the Threadripper 7960x

13:58:38
@glepage:matrix.orgGaétan LepageBut it's quite expensive, and the MB too13:58:56
@connorbaker:matrix.orgconnor (he/him) They left it as 0.18.0a0 in version.txt 13:58:57
@glepage:matrix.orgGaétan Lepage* But it's quite expensive, and the MBs too13:59:01
@connorbaker:matrix.orgconnor (he/him)Oof yeah any of the workstation-grade chips are very expensive13:59:25
@connorbaker:matrix.orgconnor (he/him)I didn't realize how dumb Nix's remote build protocol is in terms of scheduling (doesn't take advantage of data locality, keep records of build times of previous versions of packages with that name to decide how to allocate, etc.) so I thought scaling out would be better than scaling up14:00:14
@connorbaker:matrix.orgconnor (he/him) nixbuild.net is doing amazing stuff with respect to scaling out though -- they've re-implemented the nix remote build protocol and so while their endpoint presents itself as a single monolithic machine, on the backend they're able to scale up and down instances as needed 14:01:50
@connorbaker:matrix.orgconnor (he/him) hexa (UTC+1): sorry for the @ -- any ideas if the above failure (last four messages) is by design? I'm not familiar with packaging but I saw you contributed the hook doing the version check. I'd just like to know whether I should tell upstream or patch in-tree. 14:04:53
@hexa:lossy.networkhexathe upstream package pins that version14:05:44
@hexa:lossy.networkhexaand we provide something that doesn't match that constraint14:06:00
@glepage:matrix.orgGaétan LepageOk ! So what tier do you think is the most interesting for a builder: consumer, HEDT or pro ?14:17:57
@connorbaker:matrix.orgconnor (he/him)Ah it's because pre-releases aren't allowed by default right14:18:05
@glepage:matrix.orgGaétan Lepage7960x would be HEDT I guess14:18:08
@connorbaker:matrix.orgconnor (he/him) Changing "torchvision>=0.15.0", to "torchvision>=0.15.0a0", in nix-cuda-test's pyproject.toml enables pre-releases for that requirement (https://github.com/pypa/packaging/blob/32deafe8668a2130a3366b98154914d188f3718e/src/packaging/specifiers.py#L249-L270).
So I guess I should submit a PR to torchvision to fix their version (it doesn't match their tag either).
14:23:35
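
[Editor's sketch] The pre-release rule discussed above can be modelled in a few lines of standard-library Python. This is a simplified illustration, not the code of the `packaging` library or of the nixpkgs hook: the names `parse` and `satisfies_minimum` are hypothetical, only `>=` bounds are handled, and only versions of the form `N.N.N` with an optional `aN`/`bN`/`rcN` suffix are accepted. It assumes the default behavior described in the chat, where a pre-release candidate is rejected unless the bound itself names a pre-release.

```python
import re

# Simplified model of a PEP 440 version: release numbers plus an optional
# alpha/beta/rc tag. Epochs, post/dev releases and local versions are
# deliberately out of scope for this sketch.
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")
_PRE_ORDER = {"a": 0, "b": 1, "rc": 2}


def parse(version):
    """Return (release_tuple, pre_key, is_prerelease) for a version string."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version: {version!r}")
    release = [int(part) for part in match.group(1).split(".")]
    # Drop trailing zeros so that "0.15" and "0.15.0" compare equal.
    while len(release) > 1 and release[-1] == 0:
        release.pop()
    if match.group(2) is None:
        # A final release sorts after every pre-release of the same release.
        return tuple(release), (3, 0), False
    pre_key = (_PRE_ORDER[match.group(2)], int(match.group(3)))
    return tuple(release), pre_key, True


def satisfies_minimum(candidate, minimum, allow_prereleases=None):
    """Check `candidate >= minimum`, rejecting pre-releases unless allowed."""
    cand_release, cand_pre, cand_is_pre = parse(candidate)
    min_release, min_pre, min_is_pre = parse(minimum)
    if allow_prereleases is None:
        # Naming a pre-release in the bound opts this requirement in.
        allow_prereleases = min_is_pre
    if cand_is_pre and not allow_prereleases:
        return False
    return (cand_release, cand_pre) >= (min_release, min_pre)


# The situation from the build log: torchvision reports 0.18.0a0.
print(satisfies_minimum("0.18.0a0", "0.15.0"))    # False
print(satisfies_minimum("0.18.0a0", "0.15.0a0"))  # True
print(satisfies_minimum("0.18.0", "0.15.0"))      # True
```

The first call reproduces the "not satisfied by version 0.18.0a0" failure, and the second shows why writing the bound as `0.15.0a0` makes the same candidate acceptable.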
@hexa:lossy.networkhexaoohhh, pre-releases14:24:13
@hexa:lossy.networkhexamy bad14:24:14
@hexa:lossy.networkhexanot sure if we should allow pre-releases14:25:06
@hexa:lossy.networkhexait would probably remove confusion about the error14:25:14
@connorbaker:matrix.orgconnor (he/him) I think it's safe to say it's upstream's fault -- their previous releases didn't have mismatched version.txt files. I made a PR: https://github.com/pytorch/vision/pull/8431 14:36:28
@connorbaker:matrix.orgconnor (he/him)So really, your hook helped me catch something upstream did <3 14:36:51
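
[Editor's sketch] The mismatch reported upstream (tag `v0.18.0`, but `version.txt` containing `0.18.0a0`) is the kind of thing a small release check could catch. The helper below is a hypothetical illustration, not part of torchvision or nixpkgs, and it assumes tags are the version string with an optional leading `v`.

```python
def tag_matches_version_file(tag, version_txt):
    """Return True when a git tag names the same version as version.txt.

    Assumes the tag is the version with an optional leading "v",
    for example "v0.18.0".
    """
    tag = tag.strip()
    if tag.startswith("v"):
        tag = tag[1:]
    return tag == version_txt.strip()


# Values taken from the conversation above.
print(tag_matches_version_file("v0.18.0", "0.18.0a0\n"))  # False
print(tag_matches_version_file("v0.18.0", "0.18.0\n"))    # True
```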
