NixOS Infrastructure | 387 Members | |
| Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus. | 120 Servers |
| Sender | Message | Time |
|---|---|---|
| 7 Jun 2025 | ||
| yeah | 22:18:41 | |
| Alright I'll answer with that. Can also tell them to join this room if there's other problems | 22:19:01 | |
| (although if they're actually blocked in some way they might not be able to :P) | 22:19:17 | |
| a lot of networking equipment also doesn't answer pings or punts them to low priority | 22:19:23 | |
| so ICMP pings are not inherently reliable tests of connectivity or latency | 22:19:47 | |
| (in reply to @infinisil:matrix.org) Unlikely | 22:20:22 | |
| (and i can confirm that nixos.org indeed does not answer ICMP pings, but does answer HTTPS) | 22:20:44 | |
| Relevant would be a traceroute (UDP or TCP) and a curl log | 22:21:08 | |
| My provider's routers drop exactly 50% of a normal mtr ping at hop 4 or 5 | 22:43:00 | |
| not 49.5, not 51.3, exactly 50.0% | 22:43:13 | |
| it's because ICMP echo requests leave the fast path and go to the control plane | 22:43:24 | |
| the control plane is not equipped to handle a huge number of packets, so rate limiting kicks in | 22:43:47 | |
| "sEcUrItY" aka make debugging hard for no reason :P or maybe they run Windows Server where you can sometimes get RCE with IPv6 pings 😂 | 22:43:50 | |
| exactly | 22:43:55 | |
| luckily you cannot turn it off for IPv6 completely without breaking some things | 22:44:59 | |
| you absolutely can drop ICMPv6 echo requests | 22:45:17 | |
| but blocking neighbor discovery and path MTU discovery is where things break | 22:45:56 | |
| 8 Jun 2025 | ||
| * you absolutely can drop icmpv6 echo requests | 15:37:33 | |
| 10 Jun 2025 | ||
| I’m trying to understand our caching setup a bit better with Fastly<->S3, given that bandwidth between S3 and Fastly is our largest cost. I noticed we’re not serving any Do I understand correctly that https://github.com/NixOS/infra/blob/88f1c42e90ab88673ddde3bf973330fb2fcf23be/terraform/cache.tf#L138C17-L138C22 is the only thing configuring how long we hold things in cache? (It seems to be 24 hours.) Given we also cache 404s on narinfos, I guess that makes sense, as we want them to be fast (and in case the narinfo gets uploaded later, it invalidates it). But can’t we cache NARs far more aggressively than 24 hours? That would perhaps reduce the bandwidth on S3. | 11:05:09 | |
| I guess even for 200 OK narinfos we could set Cache-Control: immutable. Just not for 404s | 11:11:06 | |
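
The idea floated in the last two messages could be sketched roughly as a small policy function. This is only an illustration, not the actual Fastly or Terraform configuration: the function name `cache_control_for`, the one-year max-age, and the 24-hour TTL for negative responses are assumptions, and the `/nar/` prefix and `.narinfo` suffix are assumed to match the usual binary cache layout.

```python
# Hypothetical sketch: choose a Cache-Control header per object type and status.
# All TTL values are illustrative assumptions, not the deployed settings.

ONE_DAY = 24 * 60 * 60
ONE_YEAR = 365 * ONE_DAY


def cache_control_for(path: str, status: int) -> str:
    """Return a Cache-Control value for a binary cache response."""
    is_narinfo = path.endswith(".narinfo")
    is_nar = path.startswith("/nar/")

    if status == 200 and (is_nar or is_narinfo):
        # NAR files are named by hash, so a successful response for a given
        # path is not expected to change. Treating 200 narinfos the same way
        # is the assumption suggested in the chat, not a confirmed property.
        return f"public, max-age={ONE_YEAR}, immutable"

    if status == 404:
        # A missing narinfo may be uploaded later, so negative responses
        # get a bounded TTL and are never marked immutable.
        return f"public, max-age={ONE_DAY}"

    # Anything else (errors, unknown paths): do not cache.
    return "no-store"


if __name__ == "__main__":
    for path, status in [
        ("/nar/example.nar.xz", 200),
        ("/example.narinfo", 200),
        ("/example.narinfo", 404),
    ]:
        print(path, status, "->", cache_control_for(path, status))
```

The point of separating the 404 branch is that only positive responses are safe to pin for a long time; a cached 404 has to expire (or be purged) so that a later upload becomes visible.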