!CcTBuBritXGywOEGWJ:matrix.org

NixOS Binary Cache Self-Hosting

170 Members
About how to host a very large-scale binary cache and more
58 Servers

Sender Message Time
3 Mar 2024
@delroth:delroth.net delroth
In reply to @zimbatm:numtide.com
we've been circling this particular issue for a while now. let's take the plunge. especially if nh2 is willing to help set things up and teach us about Ceph. we could order 3 servers and get a prototype up and running.
* by "let's take the plunge" and "teach us", who do you mean by "us"? Because even if we have a Ceph cluster running, without the required work to actually use it to displace AWS usage, it's just an extra liability. And imo that work is more involved than the work required to set up a Ceph cluster, if only because nobody has even charted in detail what needs to be done...
22:15:33
@delroth:delroth.net delroth tl;dr: yes, at some point we need a Ceph cluster, but before we do the fun expensive part of setting up new shiny infra, we need someone to figure out the boring work of enabling Hydra to dual-write, enabling Fastly to dual-read, figuring out how we GC stuff from our new "hot cache" (if we go that route), etc. - all that is free, can be done now, and nobody has been lining up to do that work :) 22:18:29
@raitobezarius:matrix.org raitobezarius Hydra dual-write is on my todo list 22:35:36
@raitobezarius:matrix.org raitobezarius And the tooling developed for the S3 GC might be reapplicable for the "hot cache" GC, even a simple LRU seems a good start (?), better policies can be deployed as things happen 22:36:52
@raitobezarius:matrix.org raitobezarius but I mostly agree with you delroth anyway 22:36:56
@raitobezarius:matrix.org raitobezarius Fastly dual-read seems like something that can only be done by people who have access, after reading all the docs and ensuring the answer is not already there, as far as I understand it 22:37:35
@raitobezarius:matrix.org raitobezarius (and I believe we concluded with 'yes' in the deduplication meetings, a fallback path can be implemented) 22:37:53
@raitobezarius:matrix.org raitobezarius (as with Hydra dual-sign ahem) 22:38:51
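The "simple LRU" idea for the hot cache GC can be sketched as follows. This is only an illustration in Python, not the S3 GC tooling mentioned above; the `HotCacheIndex` class, the byte budget, and the object keys are all hypothetical.

```python
from collections import OrderedDict


class HotCacheIndex:
    """Tracks cached objects in least-recently-used order (hypothetical helper)."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._objects = OrderedDict()  # key -> size in bytes, oldest first

    def touch(self, key, size):
        """Record a read or write of `key`; return the keys evicted to stay in budget."""
        if key in self._objects:
            self.used_bytes -= self._objects.pop(key)
        self._objects[key] = size  # most recently used goes last
        self.used_bytes += size
        evicted = []
        # Evict from the least recently used end, but never the object just touched.
        while self.used_bytes > self.max_bytes and len(self._objects) > 1:
            old_key, old_size = self._objects.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_key)
        return evicted
```

A real deployment would also have to delete the evicted objects from the storage backend and persist the access order somewhere; the sketch only shows the eviction decision.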
@delroth:delroth.net delroth
In reply to @raitobezarius:matrix.org
Fastly dual-read seems like something that can only be done by people who have access, after reading all the docs and ensuring the answer is not already there, as far as I understand it
it's not
22:43:01
@delroth:delroth.net delroth I'm the person who did the last major Fastly change for cache.nixos.org and I did 90% of the work with no access, fwiw 22:43:38
@raitobezarius:matrix.org raitobezarius well I imagine that you either read the documentation or did it on your own account 22:43:53
@raitobezarius:matrix.org raitobezarius which is what I meant by the end of my sentence 22:44:00
@raitobezarius:matrix.org raitobezarius or was there another trick I didn't think of? 22:44:13
@raitobezarius:matrix.org raitobezarius (but I think I agree with "this does not really really need Fastly access" from the get-go) 22:44:37
@raitobezarius:matrix.org raitobezarius https://github.com/NixOS/infra/issues/396 and https://github.com/NixOS/infra/issues/394 to keep track of dual write/read 23:13:20
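For readers following the dual-read idea, the fallback path amounts to trying one backend and falling through to the next on a miss. The sketch below shows only that logic, in Python, with plain callables standing in for the backends; in practice it would be expressed in Fastly configuration, and the function and backend names here are hypothetical.

```python
from typing import Callable, Optional, Sequence

# A backend is modelled as a function: key -> object bytes, or None for a 404.
Backend = Callable[[str], Optional[bytes]]


def dual_read(key: str, backends: Sequence[Backend]) -> Optional[bytes]:
    """Return the object from the first backend that has it, else None."""
    for fetch in backends:
        body = fetch(key)
        if body is not None:
            return body
    return None


# Hypothetical stand-ins for the new store and the existing S3 bucket.
new_store = {"abc.narinfo": b"from-new"}
s3_bucket = {"abc.narinfo": b"from-s3", "old.narinfo": b"from-s3-only"}
backends = [new_store.get, s3_bucket.get]
```

With this ordering, `dual_read("old.narinfo", backends)` falls through to the S3 stand-in, while a key present in both is served from the new store.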
4 Mar 2024
@ajs124:ajs124.de ajs124 the hydra part could probably be done with a runcommand. that way the rest of the configuration and how stuff gets copied to the main s3 wouldn't need to be modified. 00:37:51
@delroth:delroth.net delroth Is Hydra properly designed for having long running runcommands that can take minutes to run? I don't actually know how it's integrated with the queue runner, not even sure if it is 02:45:03
@delroth:delroth.net delroth Sorry, probably not the right place to have this discussion anyway 02:45:58
@ajs124:ajs124.de ajs124 IME longer running runcommands aren't an issue, but I can skim the code to check if this is an obviously bad idea.
I'll comment my suggestion and findings on the issue later.
09:31:53
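As a rough illustration of the runcommand approach: Hydra's RunCommand plugin runs a configured command after a build and, as far as its documentation describes, exposes build metadata as a JSON file whose path is given in the `HYDRA_JSON` environment variable. The sketch below reads the output paths from that file and builds a `nix copy` invocation. The JSON fields relied on (`outputs[].path`) and the destination URL are assumptions that should be checked against the Hydra version in use.

```python
import json
import os
import subprocess
import sys


def copy_command(build_info, dest):
    """Build a `nix copy` argv covering every output path of a finished build."""
    paths = [out["path"] for out in build_info.get("outputs", []) if out.get("path")]
    return ["nix", "copy", "--to", dest, *paths] if paths else []


def main():
    # Assumption: HYDRA_JSON points at the JSON file written by RunCommand.
    with open(os.environ["HYDRA_JSON"]) as f:
        build_info = json.load(f)
    # Hypothetical destination; replace with the secondary cache's store URL.
    cmd = copy_command(build_info, "s3://example-hot-cache?endpoint=ceph.example.org")
    return subprocess.run(cmd).returncode if cmd else 0


# Only run when invoked as a script by Hydra, i.e. when HYDRA_JSON is set.
if __name__ == "__main__" and "HYDRA_JSON" in os.environ:
    sys.exit(main())
```

Whether a copy that takes minutes is acceptable inside a runcommand is exactly the open question raised above; the sketch does nothing to address it.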
@delroth:delroth.net delroth Also we've completely disabled hydra-notify on h.n.o right now because each build completed notification fetches 250MB from the DB 11:04:40
@delroth:delroth.net delroth So that would have to be fixed first :) 11:04:59
@edef1c:matrix.org edef note that if we have dual reads, we can just have a separate service that does the copying to S3, if any 14:50:11
@edef1c:matrix.org edef the perf requirements on the storage backends are really loose, except for serving 404s 14:50:34
@edef1c:matrix.org edef * the perf requirements on the storage backends are really loose, except for serving narinfo 404s 14:50:55
@edef1c:matrix.org edef but narinfo keys are only 5G of stuff, we can serve 404s basically however we want 14:51:22
@edef1c:matrix.org edef any actual serving ends up being from fastly and too heavily cached for the backend's perf to matter 14:52:17
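Since the set of narinfo keys is small, one possible way to serve narinfo 404s cheaply is to keep the known hashes in memory and answer misses without touching the storage backend. A minimal sketch in Python, assuming request paths of the form `/<hash>.narinfo`; the class and method names are hypothetical, and nothing in the discussion says this is the approach that would be used.

```python
class NarinfoIndex:
    """In-memory set of known narinfo hashes (hypothetical helper)."""

    SUFFIX = ".narinfo"

    def __init__(self, known_hashes):
        self._known = frozenset(known_hashes)

    def is_definite_miss(self, path):
        """True if `path` is a narinfo request for a hash that is not in the index."""
        name = path.lstrip("/")
        if not name.endswith(self.SUFFIX):
            return False  # not a narinfo request; let the backend decide
        return name[: -len(self.SUFFIX)] not in self._known
```

A front end could return 404 immediately when `is_definite_miss` is true and forward everything else to the storage backend.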
@zimbatm:numtide.com Jonas Chevalier nh2: do you want to join the infra meeting on Thursday 18:00 GMT+1 and hash this out with us? 14:55:08
@raitobezarius:matrix.org raitobezarius Isn't delroth going to be off? 14:55:46
@raitobezarius:matrix.org raitobezarius I think it's good to have delroth on those discussions 14:55:57
@zimbatm:numtide.com Jonas Chevalier it's fine, we already discussed this 14:57:12
