NixOS Infrastructure - Public Room Timeline

	NixOS Infrastructure	422 Members
	Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.	132 Servers


Sender	Message	Time
9 May 2026
hexa	but I also don't want an extended backlog of uploads ideally	20:19:55
emily	right, but they'll happen anyway right?	20:20:11
hexa	we can increase the retry amounts	20:20:12
emily	they're ultimately part of the jobset	20:20:20
hexa	except when they don't	20:20:22
emily	I guess the difference is it can give up on leafs	20:20:26
hexa	huh	20:20:28
hexa	they?	20:20:36
emily	the things being uploaded	20:20:48
hexa	right	20:20:52
emily	I think a nicer solution is ^ where you just never push out a `.narinfo` for any output until all the outputs are up	20:21:13
emily	but looking at the C++ code it doesn't seem like that would be trivial to arrange if S3 can even do it	20:21:29
emily	and obviously I don't know how the new queue runner will handle uploads (maybe John Ericson does)	20:21:44
hexa	Simon Hauser would know	20:22:08
emily	it seems like just increasing the number of retries in the Nix config specifically used by the queue runner would likely mitigate this problem in practice for now	20:22:13
hexa	fair enough	20:22:29
hexa	let's say 32 instead of 1024 though	20:22:37
John Ericson	I do agree with that sort of thing	20:23:49
John Ericson	(well in the CA case, it would be the build trace entry, but I digress :))	20:24:04
emily	in the CA case the Darwin builds will be broken 100% of the time so there'll be much less debugging required to find the root cause :P	20:24:37
John Ericson	no more automatic rewriting in CA soon! :)	20:26:12
emily	how is that going to work?	20:26:42
John Ericson	https://github.com/NixOS/nix/pull/15793 see what Artemis has been working on	20:27:51
John Ericson	(note that the use of the existing protocol vs a new simpler protocol is provisional)	20:28:07
hexa	how annoying would it be to move the S3 bucket to Europe btw 😆	20:28:54
emily	I don't see how this solves every dylib on Darwin encoding its own path?	20:29:51
emily	"The builder can learn the calculated content-addressed path of one of its outputs before creating others" sounds like it can only prevent breaking cross-output references (and since you usually need to know all paths upfront at `./configure` time probably not even then)	20:30:12
emily	but offtopic, anyway :)	20:30:14
	@squalus0:matrix.org left the room.	21:21:39
10 May 2026
Vladimír Čunát	(In reply to @emilazy:matrix.org: "right, okay. I don't know exactly why it happens but I know vcunat has mentioned it happening multiple times before.") To me the behavior looked like insufficient deduplication of the situations where different Hydra jobs built the same derivation, e.g. in different jobsets. It seemed easy to trigger. We're usually helped by the ratio of job count to build times, as the duplication only happens concurrently. And on Linux we're helped by the builder pool being small and split along the big-parallel axis, so the duplicate often got scheduled to the same machine and was deduplicated locally by Nix there.	06:27:16
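
Editorial sketch: the two mitigations discussed on 9 May (holding back every `.narinfo` until all outputs are uploaded, and a bounded retry count such as 32) could look roughly like the Python below. This is not how Hydra, the queue runner, or Nix implement uploads. `publish_outputs`, `with_retries`, the `put` callback, the placeholder narinfo body, and the key layout are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

MAX_RETRIES = 32  # the value floated in the chat; not a verified Nix default


def with_retries(action: Callable[[], None], max_retries: int = MAX_RETRIES) -> None:
    """Run `action`, retrying on OSError up to `max_retries` additional times."""
    for attempt in range(max_retries + 1):
        try:
            action()
            return
        except OSError:
            if attempt == max_retries:
                raise  # give up: the caller sees the last error


def publish_outputs(outputs: Dict[str, bytes],
                    put: Callable[[str, bytes], None]) -> List[str]:
    """Upload every NAR first; write .narinfo keys only once all NARs are up."""
    # Phase 1: payloads. A permanent failure raises before any .narinfo exists,
    # so no output of this derivation becomes visible on its own.
    for name, nar in outputs.items():
        with_retries(lambda n=name, d=nar: put(f"nar/{n}.nar", d))

    # Phase 2: metadata that makes the outputs discoverable.
    published = []
    for name in outputs:
        key = f"{name}.narinfo"
        with_retries(lambda k=key: put(k, b"placeholder narinfo"))
        published.append(key)
    return published
```

Phase 2 is still not atomic across outputs: if a `.narinfo` write fails permanently partway through, the earlier ones are already visible. That is the part emily suspected would be hard to arrange on S3, and the sketch does not solve it.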
