| 19 Mar 2026 |
Sergei Zimmerman (xokdvium) | Well, if that's the case, bernardo will get an earful from me, since I asked him if it does and he confirmed | 10:03:36 |
Arian | Nope, 100% HTTP/1.1
> Host: nix-cache.s3.us-east-1.amazonaws.com
> User-Agent: curl/8.17.0
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
| 10:04:16 |
Sergei Zimmerman (xokdvium) | womp womp | 10:05:00 |
Arian | https://gist.github.com/arianvp/cf5ce0cba528acc43904d7987ae90f98 | 10:06:10 |
Arian | You can also verify yourself with just openssl:
openssl s_client -alpn 'h2,http/1.1' -connect nix-cache.s3.amazonaws.com:443
openssl s_client -alpn 'h2' -connect nix-cache.s3.amazonaws.com:443
In the first one it negotiates HTTP/1.1, and in the second one it says “No ALPN Negotiated”
| 10:12:13 |
Sergei Zimmerman (xokdvium) | Hm, so the issue then would be keepalive making things worse | 10:31:54 |
emily | shhh, SQS will hear you | 11:28:58 |
Mic92 | I think I remember he talked about some proxies at some point. | 11:56:59 |
Mic92 | So it might not be the S3 itself | 11:57:19 |
Mic92 | Sergei Zimmerman (xokdvium): https://github.com/NixOS/nix/pull/15522 so this would be the most sensible fix for now? | 11:57:33 |
Arian | I feel like we have something misconfigured with curl's connection pooling | 11:58:12 |
Mic92 | * So it might not be the S3 itself that does HTTP 2.0 | 12:00:36 |
Mic92 | hexa (signing key rotation when): so my plan would be to apply the patch above to our Hydra, and if this fixes the issue, we could merge it and have a Nix patch release today | 12:07:53 |
hexa | Works for me | 12:08:15 |
Sergei Zimmerman (xokdvium) | Don't think so. With a dummy python server with h1.1 I do get reuse:
curl: Reusing existing http: connection with host localhost
downloading 'http://localhost:9000/9x7dq2sgrw63d93pa5lyk51hgwsmmn9k.narinfo'...
| 12:15:58 |
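[Editor's note: the dummy-server reuse check described above can be sketched with just the Python standard library. This is a minimal illustration, not the actual test setup; the path, port, and handler names are made up. The point is that an HTTP/1.1 server with `Content-Length` keeps the connection open, so a client can reuse the same socket, which is the reuse curl reports.]

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive, so clients can reuse the connection

    def do_GET(self):
        body = b"dummy narinfo"
        self.send_response(200)
        # Content-Length is required for keep-alive to work without chunking
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind to an ephemeral port and serve in the background
server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Two requests over the *same* HTTPConnection: with HTTP/1.1 keep-alive
# the underlying TCP socket stays open and is reused for the second GET.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/test.narinfo")
r1 = conn.getresponse(); r1.read()
conn.request("GET", "/test.narinfo")  # reuses the open socket
r2 = conn.getresponse(); r2.read()
print(r1.status, r2.status)  # → 200 200
server.shutdown()
```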
Mic92 | https://github.com/NixOS/infra/pull/982 | 12:17:11 |
Mic92 | I am checking also what the aws sdk is actually doing, since it's also using curl | 12:22:47 |
Mic92 | https://github.com/aws/aws-sdk-cpp/blob/9204e236faaa1ca6a0342dee7caf61c7cf5ad8bb/src/aws-cpp-sdk-core/source/http/curl/CurlHandleContainer.cpp#L172-L176 | 12:24:33 |
Mic92 | So looks like we always had keep-alive | 12:25:12 |
Mic92 | but aws does its own pooling | 12:26:48 |
Sergei Zimmerman (xokdvium) | It might have retries for the error. Also one thing to note is that old code didn't run concurrent s3 requests at all, since it was using the blocking API. Now we fire off a bunch of requests in parallel. | 12:26:53 |
Sergei Zimmerman (xokdvium) | We do curl_multi pooling too, that does reuse the handles | 12:27:25 |
Sergei Zimmerman (xokdvium) | Or rather the connections for the easy handles | 12:27:36 |
Mic92 | aws doesn't seem to use this interface | 12:28:30 |
Sergei Zimmerman (xokdvium) | Well it effectively does the same thing. What aws probably does is retry on that 400 error | 12:29:30 |
Arian | In reply to @joerg:thalheim.io:
> but aws does it's own pooling
Only if you use the transfer API iirc, which we didn't | 12:32:54 |
Arian | For S3 you also need to retry on 503 for uploads
https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance-design-patterns.html | 12:34:08 |
Arian | They use 503 for rate limit :') | 12:34:30 |
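[Editor's note: a retry-with-backoff loop of the kind the S3 performance guide recommends for 503 "Slow Down" responses could look roughly like this. It is a sketch, not Nix or AWS SDK code; `fetch` is a hypothetical stand-in for the actual request call.]

```python
import time

# Statuses worth retrying; S3 returns 503 "Slow Down" for rate limiting
RETRYABLE = {500, 502, 503, 504}

def with_retries(fetch, max_attempts=5, base_delay=0.01):
    """Call fetch() until it returns a non-retryable status,
    sleeping with exponential backoff between attempts."""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status not in RETRYABLE:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    return status, body  # give up, return the last response

# Hypothetical flaky endpoint: rate-limited twice, then success.
responses = iter([(503, b""), (503, b""), (200, b"ok")])
status, body = with_retries(lambda: next(responses))
print(status)  # → 200
```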
Mic92 | https://github.com/aws/aws-sdk-cpp/blob/9204e236faaa1ca6a0342dee7caf61c7cf5ad8bb/src/aws-cpp-sdk-core/source/client/CoreErrors.cpp#L90-L91 | 12:35:08 |