| 4 Mar 2024 |
edef | the narinfo dataset is archived in several places now, that part we have covered | 22:17:52 |
raitobezarius | In reply to @edef1c:matrix.org that locks us in for 6 months, but if there are no other takers, i'll put down the $3k to buy myself 6 months of development time for an exit strategy from AWS for that data we collected enough money to put the 3K as part of the "binary cache niceties" budget | 22:18:04 |
raitobezarius | fwiw | 22:18:09 |
edef | sure, that works for me, $3k is def still a meaningful cost for me | 22:18:32 |
edef | i just know that history would not judge me kindly if i let this data go to /dev/null | 22:19:13 |
| 5 Mar 2024 |
nh2 | In reply to @zimbatm:numtide.com nh2: do you want to join the infra meeting on Thursday 18:00 GMT+1 and hash this out with us? Unfortunately I'll be on a train at that time, so my ability to join may be reduced | 02:30:40 |
nh2 | In reply to @edef1c:matrix.org so like, my biggest proposal wrt the cache GC is that we aggregate the "deleted" data into Glacier Deep Archive, as large objects edef: What will be the cost of getting them out again, just to be sure that it won't be forbiddingly large? | 02:31:47 |
edef | batch restores are free, they just have 12h latency | 03:32:17 |
edef | restores happen to S3 reduced redundancy but we'd only need to float a small fraction of the dataset at a time | 03:33:31 |
nh2 | I see, that makes sense | 03:34:59 |
edef | so we can tune that for however much compute we want to throw at it in parallel | 03:35:31 |
edef | i can run some numbers wrt the best bang-per-buck there but not right this second | 03:36:12 |
edef | basically depends on what the supply curve for EC2 spot compute looks like | 03:36:38 |
nh2 | For Ceph hosting, do we know what the IOPS of cache.nixos.org are, just to see if some basic small cluster on HDDs could handle it? | 03:39:18 |
edef | presumably you want backend I/O, ie to the S3 bucket? | 03:40:25 |
nh2 | yes, that would be the equivalent of what would hit the disks | 03:41:04 |
edef | easy stats: over the last 24h we've served 2.1TiB from all our S3 buckets, uploaded 491G, in ~30M requests | 03:47:32 |
edef | we're serving like 375Mbit/s to Fastly in the peak minute on a day chosen by Fair Dice Roll™ | 04:03:53 |
edef | not sure how to meaningfully turn these things into iops numbers just because that depends on various factors | 04:05:27 |
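(Editor's sketch, not part of the conversation.) The 24h totals quoted above can at least be turned into average rates, which gives a floor for what a replacement backend would need to sustain. This assumes "2.1TiB" means 2.1 × 2^40 bytes and that the ~30M requests are spread evenly over the day; the names below are illustrative only. Averages like these say nothing about peak load or about real disk IOPS, which depend on object sizes, caching, and how the storage backend splits reads.

```python
# Rough average rates from the quoted 24h totals (assumed units, see above).
SECONDS_PER_DAY = 24 * 60 * 60

served_bytes = 2.1 * 2**40      # "2.1TiB" served, taken as binary tebibytes
requests = 30_000_000           # "~30M requests"

# Average egress in megabits per second (decimal megabits, as in "375Mbit/s").
avg_mbit_per_s = served_bytes * 8 / SECONDS_PER_DAY / 1e6

# Average request rate across the whole day.
avg_requests_per_s = requests / SECONDS_PER_DAY

# Mean bytes served per request. This is only indicative: the request count
# may include uploads and metadata requests as well as downloads.
mean_bytes_per_request = served_bytes / requests

print(f"average egress:   {avg_mbit_per_s:.0f} Mbit/s")
print(f"average requests: {avg_requests_per_s:.0f} req/s")
print(f"mean size:        {mean_bytes_per_request / 1e3:.0f} kB/request")
```

Under these assumptions the daily average works out to roughly 214 Mbit/s and about 347 requests per second, so the peak-minute figure quoted above is well above the daily mean.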
edef | clickhouse is refusing to deal with S3 wildcards for some reason and i haven't quite chased down why yet | 04:06:41 |
edef | i'm just taking a request that completed in that minute to have fully executed in that minute but i think that shakes out to a slightly upwards biased estimator if anything | 04:08:29 |
edef | okay i just need to upgrade clickhouse on the EC2 data box i think | 04:09:57 |
edef | * we're serving like 375Mbit/s of compressed NARs to Fastly in the peak minute on a day chosen by Fair Dice Roll™ | 04:10:45 |
edef | i'm focusing on the NAR serving because that's the actual meat of it, the narinfos are only like 90G of stuff | 04:11:17 |
edef | we also have a few other file types but they're mostly pretty marginal | 04:11:52 |
edef | WHERE NOT key REGEXP '^[0123456789abcdfghijklmnpqrsvwxyz]{32}\.narinfo$'
AND NOT key REGEXP '^[0123456789abcdfghijklmnpqrsvwxyz]{32}\.ls(\.xz)?$'
AND NOT key REGEXP '^[0123456789abcdfghijklmnpqrsvwxyz]{32}-[a-zA-Z0-9+\-_?=][a-zA-Z0-9+\-_?=.]*\.ls$'
AND NOT key REGEXP '^nar/[0123456789abcdfghijklmnpqrsvwxyz]{52}\.nar\.(bz2|xz)$'
AND NOT key REGEXP '^log/[0123456789abcdfghijklmnpqrsvwxyz]{32}-[a-zA-Z0-9+\-_?=][a-zA-Z0-9+\-_?=.]*\.drv$'
AND NOT key REGEXP '^debuginfo/[0-9a-f]{40}$'
AND NOT key REGEXP '^debuginfo/[0-9a-f]{16}$'
AND NOT key IN ('.well-known/pki-validation/gsdv.txt', 'nix-cache-info', 'index.html', 'binary-cache/', 'error-pages/403', 'error-pages/404')
| 04:12:00 |
edef | ^ that yields an empty result set if applied over the S3 inventory | 04:12:33 |
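(Editor's sketch, not part of the conversation.) The same key patterns can be used outside SQL to label each key by type, for example when tallying an inventory listing per file type. The regexes are copied from the filter above; the category labels, the `classify` function, and the sample keys are hypothetical names chosen for illustration.

```python
import re

# Nix base32 alphabet as used in the filter above (no e, o, t, u).
NIX32 = "0123456789abcdfghijklmnpqrsvwxyz"
# Store path name part, copied from the filter above.
NAME = r"[a-zA-Z0-9+\-_?=][a-zA-Z0-9+\-_?=.]*"

# Label -> list of patterns. Labels are illustrative, not from the source.
PATTERNS = {
    "narinfo": [rf"[{NIX32}]{{32}}\.narinfo"],
    "ls": [rf"[{NIX32}]{{32}}\.ls(\.xz)?", rf"[{NIX32}]{{32}}-{NAME}\.ls"],
    "nar": [rf"nar/[{NIX32}]{{52}}\.nar\.(bz2|xz)"],
    "log": [rf"log/[{NIX32}]{{32}}-{NAME}\.drv"],
    "debuginfo": [r"debuginfo/[0-9a-f]{40}", r"debuginfo/[0-9a-f]{16}"],
}
COMPILED = {
    label: [re.compile(p) for p in pats] for label, pats in PATTERNS.items()
}

# Fixed keys listed in the final IN (...) clause of the filter.
STATIC_KEYS = {
    ".well-known/pki-validation/gsdv.txt", "nix-cache-info", "index.html",
    "binary-cache/", "error-pages/403", "error-pages/404",
}


def classify(key):
    """Return a label for a bucket key, or None if no pattern matches."""
    if key in STATIC_KEYS:
        return "static"
    for label, regexes in COMPILED.items():
        # fullmatch plays the role of the ^...$ anchors in the SQL version.
        if any(r.fullmatch(key) for r in regexes):
            return label
    return None


if __name__ == "__main__":
    # Made-up sample keys, only to exercise the patterns.
    for key in ["0" * 32 + ".narinfo", "nar/" + "1" * 52 + ".nar.xz", "x"]:
        print(key, "->", classify(key))
```

A key for which `classify` returns `None` corresponds to a row the SQL filter would have returned, so an empty result set there means every key in the inventory gets a label here.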
edef | debuginfo is for dwarffs which basically nobody uses, i think the 64-bit ones are even more dead, logfiles aren't a huge traffic driver either, .ls files are used by nar-index iirc but we don't have very much of those | 04:13:27 |
edef | * debuginfo is for dwarffs which basically nobody uses, i think the 64-bit ones are even more dead, logfiles aren't a huge traffic driver either, .ls files are used by nix-index iirc but we don't have very much of those | 04:13:45 |
edef | basically at peak we're serving like a gigabit of NARs | 04:15:22 |