NixOS Infrastructure - Public Room Timeline

	NixOS Infrastructure	427 Members
	Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.	131 Servers

Sender	Message	Time
20 May 2026
Cobalt	I'm aware of the issues with just tagging it on the existing daemon; I plan to do some research on this for my bachelor's thesis. iirc you or hexa (signing key rotation when) brought it up when I asked for relevant issues before in the offtopic channel. The main thing for me here was to not just count store paths but instead look at the total size of the store paths (as, e.g., firefox-bin is heavier than harfbuzz). My main plan was to make a prototype with an extra agent/nix-scheduler-hook and use the results from testing there to propose changes to the hydra queue runner later. The number/total size of paths was meant to give some estimates of the actual computational cost of, e.g., asking the daemon for the subset of paths in the store.	11:41:37
hexa	[root@elated-minsky:~]# nix shell nixpkgs#sqlite -c sqlite3 /nix/var/nix/db/db.sqlite 'select count(), sum(narSize) from ValidPaths'
391244|5632213988352
[root@sleepy-brown:~]# nix shell nixpkgs#sqlite -c sqlite3 /nix/var/nix/db/db.sqlite 'select count(), sum(narSize) from ValidPaths'
335188|5232269242216
[root@goofy-hopcroft:~]# nix shell nixpkgs#sqlite -c sqlite3 /nix/var/nix/db/db.sqlite 'select count(), sum(narSize) from ValidPaths'
828294|12762811007880
[root@hopeful-rivest:~]# nix shell nixpkgs#sqlite -c sqlite3 /nix/var/nix/db/db.sqlite 'select count(), sum(narSize) from ValidPaths'
157509|2004835377096	11:56:04
hexa	those are our linux builders, the first two x86_64-linux, the last two aarch64-linux	11:57:39
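To put the four query results in perspective, here is a small sketch that converts each builder's count and sum(narSize) into TiB and a mean size per path. The host names, systems and numbers are taken from the output above; the helper name `summarize` is made up for illustration. Note that narSize is the size of a path's NAR serialisation, so the totals are an approximation of store contents rather than exact disk usage.

```python
# (system, count(), sum(narSize) in bytes) as reported by the queries above.
builders = {
    "elated-minsky": ("x86_64-linux", 391244, 5632213988352),
    "sleepy-brown": ("x86_64-linux", 335188, 5232269242216),
    "goofy-hopcroft": ("aarch64-linux", 828294, 12762811007880),
    "hopeful-rivest": ("aarch64-linux", 157509, 2004835377096),
}


def summarize(count: int, total_bytes: int) -> dict:
    """Return total size in TiB and mean size per store path in MiB."""
    return {
        "total_tib": total_bytes / 2**40,
        "mean_mib": total_bytes / count / 2**20,
    }


for host, (system, count, total) in builders.items():
    s = summarize(count, total)
    print(
        f"{host} ({system}): {count} paths, "
        f"{s['total_tib']:.2f} TiB, mean {s['mean_mib']:.1f} MiB/path"
    )
```

The mean hides a very skewed distribution (the firefox-bin vs. harfbuzz point above), so a per-path histogram would likely be more useful for scheduling estimates than the average alone.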
Cobalt	Thank you a lot!	11:58:08
hexa	and our gc logic	11:58:42
Cobalt	Thank you for pointing me to the configurations, I was aware of that to some degree (searching for the metrics initially led me to the system configurations). The 500GB limit for a GC run is presumably to ensure that the GC is not locking the store for too long?	12:01:36
hexa	yeah, gc is not free and if we keep some outputs around for reuse that's good too	12:03:27
hexa	we might eventually migrate to fast-nix-gc in a bit	12:03:59
hexa	~~that'll fix us~~	12:04:12
hexa	https://github.com/Mic92/fast-nix-gc	12:04:35
Cobalt	That also looks very interesting, especially the notes on SQLite queries in fast-nix-gc.	12:06:42
K900	I'm not sure what kind of optimization you're looking at here, but generally GC or any kind of store queries aren't the bottleneck	12:07:33
Cobalt	The main optimizations were about how to reduce the number/cost of queries for evaluating the subset size required when looking at scheduling a set of builds. They might not be a bottleneck by themselves, but caching and/or the applicability of a stochastic data structure seems like an interesting extension. My supervisor was interested in this specific sub-problem as it relates a bit to his own research iirc.	12:13:23
Cobalt	fast-nix-gc does not really have anything related to this, it just mentions that they load the paths into a graph for the GC search first instead of querying the store for all lookups.	12:14:15
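The chat does not say which "stochastic data structure" is meant. One plausible candidate, assumed here purely for illustration, is a Bloom filter over a builder's valid store paths: the scheduler could then test "is this path probably present on that builder?" without a store query, at the cost of occasional false positives. The class below is a minimal sketch and not taken from Hydra, Nix or fast-nix-gc; the example path names are hypothetical.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        # Standard sizing: m = -n ln p / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(
            8,
            math.ceil(-expected_items * math.log(false_positive_rate) / math.log(2) ** 2),
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two 64-bit halves of SHA-256.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


# Hypothetical usage: one filter per builder, filled from its valid paths.
present = BloomFilter(expected_items=1000)
for name in ("firefox-bin", "harfbuzz"):
    present.add(name)
print("harfbuzz" in present)
```

A "probably present" answer would still need confirmation before a build is actually dispatched; the filter only helps to cheaply rank candidate builders.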
K900	Well right now the scheduling is very stupid	12:14:20
K900	There's no locality awareness	12:14:30
K900	Or hell job size awareness	12:14:32
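As a rough illustration of what locality and job size awareness could mean, the sketch below picks the builder that would have to receive the fewest bytes for a given build closure. It is not how the hydra queue runner works today; the function names, builder names and sizes are all hypothetical.

```python
from typing import Dict, Set


def missing_bytes(closure: Dict[str, int], present: Set[str]) -> int:
    """Sum the narSize of every closure path the builder does not have yet."""
    return sum(size for path, size in closure.items() if path not in present)


def pick_builder(closure: Dict[str, int], builders: Dict[str, Set[str]]) -> str:
    """Choose the builder with the least data to copy (ties broken by name)."""
    return min(sorted(builders), key=lambda name: missing_bytes(closure, builders[name]))


# Hypothetical closure: path name -> narSize in bytes.
closure = {"firefox-bin": 250_000_000, "harfbuzz": 3_000_000}
# Hypothetical view of which paths each builder already holds.
builders = {
    "builder-a": {"harfbuzz"},
    "builder-b": {"firefox-bin"},
}
print(pick_builder(closure, builders))  # builder-b: only harfbuzz must be copied
```

Counting paths alone would rate both builders equally here (one missing path each); weighting by size is what separates them.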
K900	Improving it will definitely help a little, but the big bottleneck is still the coordinator itself	12:15:32
hexa	the scheduler	12:15:50
K900	The coordinator as in the machine	12:16:01
K900	But yeah	12:16:02
hexa	the coordinator is the process that runs the remote build	12:16:14
Cobalt	So just to understand this a bit more, a significant problem is the performance of the software running the scheduler/coordinator (so the queue runner)?	12:17:07
K900	It's not even the software necessarily	12:18:34
K900	It's the design of the whole thing that requires a lot of copying data around	12:18:44
Cobalt	One motivation for the optimizations was to ensure that scheduling stays cheap-ish, so I would try not to compromise this too much.	12:18:53
