| 31 May 2025 |
emily | there's a reason we backport their updates to stable even when they're backwards-incompatible | 23:54:58 |
| 1 Jun 2025 |
Arian | I was wondering. Could we reduce our AWS egress bill by having Hydra directly warm the fastly cache? Then we don't need to pay money for fastly fetching from S3 for the first time. | 10:39:02 |
Arian | We can still keep S3 for archival. As ingress is free. But it would remove the egress fees AWS currently charges | 10:40:03 |
Arian | This could literally be a fastly script that first fetches from the store hosted by the Hydra coördinator and if the file is not on disk falls back to S3 | 10:41:35 |
Arian | It seems too simple to not have been tried before. Is there something I'm missing here? | 10:42:07 |
Vladimír Čunát | It surprises me that Fastly supports pushing into their cache. | 10:43:14 |
Arian | I'm not saying we should push. I'm saying fastly should fetch from hydra instead of s3. And only fall back to s3 when hydra returns a 404 | 10:43:44 |
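A minimal sketch of the fetch order Arian describes: ask the Hydra-hosted store first, and go to S3 only when Hydra answers 404. This is plain Python that models each origin as a function, not Fastly configuration. The names `fetch_with_fallback` and `dict_origin` and the sample paths are hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

# An origin is modelled as a function from a request path to (status, body).
Origin = Callable[[str], Tuple[int, Optional[bytes]]]


def fetch_with_fallback(
    path: str, primary: Origin, fallback: Origin
) -> Tuple[int, Optional[bytes], str]:
    """Try the primary origin; consult the fallback only on a 404.

    Any other status from the primary (including errors) is returned
    as-is, since the proposal only mentions falling back on 404.
    """
    status, body = primary(path)
    if status != 404:
        return status, body, "primary"
    status, body = fallback(path)
    return status, body, "fallback"


def dict_origin(store: Dict[str, bytes]) -> Origin:
    """Build a toy origin backed by an in-memory dict (assumption for the demo)."""
    def origin(path: str) -> Tuple[int, Optional[bytes]]:
        if path in store:
            return 200, store[path]
        return 404, None
    return origin


# Hypothetical data: the build host holds only recent paths, the archive holds everything.
hydra = dict_origin({"/recent.narinfo": b"recent"})
s3 = dict_origin({"/recent.narinfo": b"recent", "/old.narinfo": b"archived"})
```

With this ordering, S3 is only read for paths that are no longer on the build host's disk, which is the case where egress would still be billed.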
emily | I guess the question is just whether Hydra can handle that load. and also how big its pipe to Fastly is? | 11:34:16 |
edef | (in reply to Arian's message at 10:43:44) we could alternatively finish the Tigris story / move more towards s3 as pure archival storage | 11:35:31 |
edef | both tigris and r2 are free egress, the norm is changing gradually | 11:36:06 |
Arian | (in reply to emily's message at 11:34:16) It is already handling that load though? It's uploading to S3 after all. | 11:41:11 |
toonn | That load would be almost doubling, no? | 11:48:45 |
edef | no, we could run the pipeline through fastly, if we wanted to | 11:49:13 |
edef | we can fetch into s3 on the aws side, pulling from fastly | 11:50:12 |
edef | whether that is wise is another question, but we can definitely do it | 11:50:27 |
edef | mostly it is isomorphic to having hydra have a more-local / free-egress storage service | 11:51:49 |
edef | and using s3 mostly as archival storage | 11:52:05 |
toonn | That's certainly better than Hydra pushing to S3 and Fastly pulling from it preferentially but won't most of the Fastly cache still be clobbered by the S3 requests? So in the end Fastly would fetch many things from Hydra twice still? | 12:29:43 |
flokli | We can also do the push-through thing. So instead of hydra uploading to fastly or s3 directly, it'd send to an HTTP daemon that uploads to both places | 13:47:44 |
flokli | It gives nicer control over the upload path | 13:48:01 |
edef | yes, these are all essentially identical in terms of bulk data flow | 13:48:06 |
edef | or well, we would be spending twice the hetzner bandwidth if we go to both | 13:48:57 |
edef | mostly it depends on how much annoying orchestration and consistency trickery we want to have | 13:49:16 |
flokli | I assumed network bandwidth egressing from Hetzner is not the bottleneck | 13:51:01 |
hexa | it is not | 13:51:12 |
edef | it's not, no | 13:51:15 |
edef | so like, for the most part you can do anything to the NAR datapath | 13:52:06 |
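A minimal sketch of the push-through idea flokli raises above: one daemon receives each upload and forwards it to every destination. It also shows where the consistency question comes from, because one destination can accept an object while another fails. Everything here is an assumption for illustration: `push_through`, the sink names, and the in-memory dicts standing in for the cache and the archive are hypothetical, and no real upload API is used.

```python
from typing import Callable, Dict

# A sink is modelled as a function that stores `data` under `key`.
Sink = Callable[[str, bytes], None]


def push_through(key: str, data: bytes, sinks: Dict[str, Sink]) -> Dict[str, bool]:
    """Forward one upload to every sink and report which ones succeeded.

    A failure in one sink does not stop the others, so the caller can see
    (and later repair) destinations that have diverged.
    """
    results: Dict[str, bool] = {}
    for name, sink in sinks.items():
        try:
            sink(key, data)
            results[name] = True
        except Exception:
            results[name] = False
    return results


def failing_sink(key: str, data: bytes) -> None:
    """Stand-in for a destination that is temporarily unavailable."""
    raise IOError("simulated outage")


# In-memory stand-ins for the two destinations (assumption for the demo).
cache_store: Dict[str, bytes] = {}
archive_store: Dict[str, bytes] = {}

both_ok = push_through(
    "nar/example-1",
    b"payload",
    {"cache": cache_store.__setitem__, "archive": archive_store.__setitem__},
)
partial = push_through(
    "nar/example-2",
    b"payload",
    {"cache": cache_store.__setitem__, "archive": failing_sink},
)
```

After the second call the object exists in the cache stand-in but not in the archive one, so a real daemon would need some retry or reconciliation step for that case.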