NixOS Infrastructure - Public Room Timeline

	NixOS Infrastructure	400 Members
	Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real-time alerts from Prometheus.	120 Servers

Sender	Message	Time
18 Mar 2026
hexa	`command="NIX_SSL_CERT_FILE=/nix/store/s6x1s57vgmiccc1grwwcmphzbffs76cd-nss-cacert-3.121/etc/ssl/certs/ca-bundle.crt /nix/store/92nq0147myzffvhq6jlq06ad68yh0ja3-nix-2.31.2+1/bin/nix-store --serve --store daemon --write"`	12:59:48
hexa	looks like it runs the wrong nix-store binary?	13:00:11
hexa	rebooting	13:05:14
hexa	no change	13:09:03
hexa	ok, what helped was deconfiguring the key and reconfiguring it fresh	13:25:32
Winter	could I get `restart-jobs` on hydra please? email is `[name]@[name].cafe`	14:04:03
hexa	[image: image.png]	16:05:47
hexa	applied	16:05:48
Winter	thank you 🫡	16:06:16
	debtquity joined the room.	21:56:55
19 Mar 2026
Vladimír Čunát	Hydra now shows lots of abortions from S3 uploads. Aborted: error: … while uploading to S3 binary cache at 's3://nix-cache' error: unable to upload 'https://nix-cache.s3.us-east-1.amazonaws.com/nar/0fi6dj94id8fla40n3h98b9spmsbszl7i7k2bkpnjbqk9fcgyad8.nar.xz': HTTP error 400 response body: <?xml version="1.0" encoding="UTF-8"?> <Error><Code>RequestTimeout</Code><Message>Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.</Message><RequestId>2SDA6N47EBP0Y9TY</RequestId><HostId>DzalNia2QC6pPMWG4tSso2PO0E+w+ZmqUrCtA+oI2GP1hMZV3IVW5WibfNELfak4rQVKuh6OrCw=</HostId></Error>	08:08:00
hexa	RequestTimeout	08:08:37
hexa	hydra surely retries uploads, right?	08:08:50
Vladimír Čunát	The Hydra jobs certainly remained failed/aborted until I restarted them manually.	08:11:34
Sergei Zimmerman (xokdvium)	The queue runner is at 2.34 now?	08:59:09
hexa	Hydra 0.1.20260316.a40d428 (using nix-2.34.1 and nix-eval-jobs-2.34.1)	09:03:48
hexa	`Mar 17 16:25:15 mimas hydra-queue-runner[4333]: warning: error: unable to upload 'https://s3.us-east-1.amazonaws.com/nix-cache/yak3aj118k761cgsnc79k0wpk4y5598c.ls': Timeout was reached (28) Operation timed out after 15942 milliseconds with 0 bytes received; retrying in 287 ms` they started here	09:04:57
hexa	and there have been ~17.8k of these since	09:05:14
hexa	that coincides with https://github.com/NixOS/infra/pull/978	09:06:09
Vladimír Čunát	I've seen these S3 abortions occasionally in the past few months.	09:08:09
Sergei Zimmerman (xokdvium)	So 2.33 wasn’t experiencing this? The only meaningful change since then I think was adding tcp keepalive	09:08:21
Sergei Zimmerman (xokdvium)	And also switching to virtual-hosted–style endpoints. Might have made a difference there	09:17:57
Sergei Zimmerman (xokdvium)	Connection is most likely the issue here. As a quick fix I think this should help: https://github.com/NixOS/infra/blob/22a38e7c5f61d7b04231c3dd84f5ca1a6fec52ef/build/hydra.nix#L88 Adding `addressing-style=path` should prevent connection reuse since the path-style endpoint is http1-only.	09:27:19
Sergei Zimmerman (xokdvium)	And s3 sends a connection close on those always. We'll see about making connection reuse much less aggressive and do a patch release of 2.34	09:28:06
hexa	applying	09:34:47
Arian	S3 doesn't support http2 at all AFAIK?	09:42:35
Arian	Fwiw S3 explicitly documents that the way to scale it is to open multiple TCP connections. The A record returns like 50 different IP addresses and clients should connect to all of them	09:43:05
Sergei Zimmerman (xokdvium)	It does on the virtual-hosted buckets apparently	09:43:13
Vladimír Čunát	Sounds like horrible design.	09:43:51
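
Note on the 09:27 suggestion: a minimal sketch, in Python, of what adding `addressing-style=path` to a store URI's query string amounts to. The helper name `with_addressing_style` and the example URI `s3://nix-cache?region=us-east-1` are assumptions for illustration (bucket and region are taken from the error output in the log; the actual store URI in build/hydra.nix is not quoted here). The code only rewrites a string and does not talk to S3 or Nix.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_addressing_style(store_uri: str, style: str = "path") -> str:
    """Return store_uri with the addressing-style query parameter set.

    Hypothetical helper: existing query parameters are preserved, and any
    previous addressing-style value is replaced rather than duplicated.
    """
    parts = urlsplit(store_uri)
    # Keep every existing parameter except an older addressing-style.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "addressing-style"
    ]
    params.append(("addressing-style", style))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Assumed example URI, not the real Hydra configuration.
    print(with_addressing_style("s3://nix-cache?region=us-east-1"))
```

Applying the helper twice gives the same result as applying it once, so it is safe to run against a URI that already carries the parameter.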

Room Version: 6