!sBfrWMVsLoSyFTCkNv:nixos.org

OfBorg

170 Members
Number of builds and evals in queue: <TBD>
63 Servers


Sender | Message | Time
12 Oct 2023
@lily:lily.flowers (Lily Foster) | The small stuff you listed and ability for mortals to run pieces of or all of ofborg locally are definitely pain points i'm looking at helping short-term. I appreciate you making the list ❤️ | 15:42:31
@delroth:delroth.net | yeah I don't think anything here is groundbreaking :) | 15:43:49
@lily:lily.flowers (Lily Foster) | (also local testing will let even people with infra access not have to test changes in prod 😅) | 15:44:01
@adam:robins.wtf | "properly mark errors as errors" - yes, this times 100 | 15:49:24
@delroth:delroth.net | another "larger stuff" topic: I'm not sure if ofborg auto-scales based on queue length, but there's been a few times recently where it's 4-6h behind on processing PRs, and I wonder if we could just throw more compute at it | 15:58:44
@cole-h:matrix.org (cole-h) | I've already tried that (manually), unfortunately. A few years ago, 3-4 ofborg evaluators was enough to chew through the queue. Nowadays, even 9 is not enough, due to eval times blowing up. | 15:59:45
@cole-h:matrix.org (cole-h) | Also, I don't know how I feel about marking "errors as errors" (I assume this means "failed builds turn into failed checks"). There could be any number of reasons as why the build failed that may not have anything to do with the derivation itself. Maybe the machine OOM'd. Maybe networking died. Maybe the kernel panicked. Maybe there was a hardware failure. Maybe.... Something that was decided early on was that things with a red X should not be merged under any circumstance (as always, there are exceptions, but those should be very rare). If one of those transient (or not so transient) failures happens, but nobody can reproduce it and someone decides to merge it anyways, that cheapens the meaning of a failed CI check. At least with a "skipped" check, it's communicated that something may have gone wrong, but it may not be anyone in particular's fault. | 16:03:05 (edited)
@cole-h:matrix.org (cole-h) | (Not to say I'd block that change, per se, but it'd be nice to be convinced that it's the right thing to do.) | 16:03:49


Room Version: 6