| 12 Aug 2021 |
Vladimír Čunát | Still, wendy's age has many disadvantages. Poor performance (of each single thread) and power efficiency come to mind. | 11:40:52 |
hexa | surprising people with tensorflow/pytorch builds on stable upgrades sadly isn't a great experience | 11:44:32 |
hexa | I do acknowledge that we would lose a bunch of compute, that somewhat diversifies our builder situation (eqx metal + X) | 11:46:40 |
Vladimír Čunát | Diversity: I've been considering hosting a better physical machine than the current t4a, perhaps as a replacement for t4a. | 11:48:37 |
Vladimír Čunát | If there's interest, that is. I have no idea of any existing plans around this. | 11:50:08 |
Vladimír Čunát | * Diversity: I've been considering hosting a better physical machine than the current t4b, perhaps as a replacement for t4b. | 12:58:56 |
Domen Kožar | hexa: we wouldn't lose that much with wendy gone | 13:47:52 |
Domen Kožar | it probably just needs to be done | 13:48:06 |
andi- | It would still be nice to think about diversifying the builders. If packet pulls the plug (for whatever reason) we have a problem. | 13:49:25 |
sterni | indeed | 13:50:52 |
Vladimír Čunát | I think ike is also from the old set of machines. | 14:01:31 |
Vladimír Čunát | As for pulling the plug, I think the main problem is in the parts that are hard to decentralize (e.g. queue runner) | 14:03:14 |
lukegb (he/him) | I don't think it's a case of "we need always running decentralized infrastructure" | 15:15:12 |
lukegb (he/him) | It's more a case of "what is the plan if packet goes away" | 15:15:24 |
sterni | packet doesn't even have to walk away | 15:16:52 |
sterni | what if they have a serious network or power outage or even … uh a fire | 15:17:03 |
Vladimír Čunát | Network or power outage shouldn't last long enough to really hurt us. Cache etc. is separate. | 15:18:19 |
lukegb (he/him) | Hydra isn't really a service we need to run at 5 9s or anything | 15:18:45 |
Amine Chikhaoui | I think there is enough budget to cover unforeseen problems (https://opencollective.com/nixos#category-BUDGET) until a plan is figured out for when something unexpected happens? | 15:21:47 |
sterni | I don't think having the money is enough to be honest | 16:06:34 |
sterni | the freenode -> matrix switch has, if anything, shown that if you have to scramble to find a solution in the moment, the outcome will not be 100% satisfying and there is no room to get consensus on the drawbacks (which is probably the most problematic part of this) | 16:07:41 |
| 14 Aug 2021 |
lukegb (he/him) | bringing up https://github.com/NixOS/nixpkgs/issues/115425 again; can we disable wendy on hydra please? (set maxjobs back to 0) | 14:36:10 |
sterni | why don't we have a feature for sse 4.2 or such? isn't that precisely what that field is for? | 14:38:02 |
lukegb (he/him) | wendy just doesn't do very well on big-parallel jobs anyway (e.g. Chrome builds) | 14:39:20 |
lukegb (he/him) | if we want to eke all the life we can from wendy then we should remove the big-parallel required feature and add features for processor extensions to all the builders that do have them | 14:40:36 |
lukegb (he/him) | but at the moment: the work wendy is doing could be done faster and better by one of the other workers | 14:41:14 |
Vladimír Čunát | My understanding is that wendy got switched to big-parallel jobs exactly in order to avoid pytorch. | 15:35:53 |
lukegb (he/him) | Yeah, but then I made pytorch big-parallel because the build takes a long time and is parallelisable xD | 15:43:18 |
Vladimír Čunát | Oh, I didn't notice the reply and found out independently. | 15:51:25 |
Vladimír Čunát | From what I've seen, wendy really is more trouble than it's worth. | 15:53:03 |
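
Note on the 14 Aug thread: the exchange about `big-parallel` and per-extension features depends on how jobs get matched to builders. Below is a minimal Python sketch of that matching rule as I understand it from Nix remote builders: a builder takes a job only if it supports every feature the job requires, and every feature the builder marks as mandatory is among the job's required features. The feature sets given to `wendy`, the `newer` builder, and the `sse4_2` feature name are assumptions made for illustration, not the actual Hydra configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Builder:
    name: str
    supported: frozenset                # features the machine offers
    mandatory: frozenset = frozenset()  # features a job must request to land here


def can_build(builder: Builder, required) -> bool:
    """Return True if a job with these required features may run on the builder."""
    required = frozenset(required)
    return required <= builder.supported and builder.mandatory <= required


# Assumed setup: wendy only accepts big-parallel jobs and lacks the CPU-extension feature.
wendy = Builder("wendy", frozenset({"big-parallel"}), frozenset({"big-parallel"}))
# Hypothetical newer builder that advertises a hypothetical "sse4_2" feature.
newer = Builder("newer", frozenset({"big-parallel", "sse4_2"}))

pytorch_plain = set()                                 # before pytorch was marked big-parallel
pytorch_big_parallel = {"big-parallel"}               # after it was marked big-parallel
pytorch_with_ext = {"big-parallel", "sse4_2"}         # if it also required the extension

for label, job in [
    ("plain", pytorch_plain),
    ("big-parallel", pytorch_big_parallel),
    ("big-parallel + sse4_2", pytorch_with_ext),
]:
    eligible = [b.name for b in (wendy, newer) if can_build(b, job)]
    print(f"{label}: {eligible}")
```

Under these assumptions, marking pytorch as `big-parallel` is what makes it eligible for wendy again, while an extra required feature that wendy does not advertise would keep it off that machine without touching `maxjobs`.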