!RROtHmAaQIkiJzJZZE:nixos.org

NixOS Infrastructure

272 Members
Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.
86 Servers



Sender Message Time
11 Oct 2024
@joerg:thalheim.ioMic92 dgrig: what is blocking you specifically? 08:07:55
@dgrig:erethon.comdgrig
In reply to @joerg:thalheim.io
dgrig: what is blocking you specifically?
I don't have a "blocker" per se from the nixos infra team. I've been experimenting locally with the security tracker and some other software that fricklerhandwerk wants deployed in an official namespace and manner. On the security tracker front I still have some things to figure out, but for others (say Odoo, if it's ok for us in the end) I want to sync with someone at some point on how we best want it deployed (i.e. does it belong on the non-critical infra, how do we want to back up the database, etc).
08:21:11
@joerg:thalheim.ioMic92 Sure. Does Thursday, 18:00 CEST next week work for you? 08:24:09
@dgrig:erethon.comdgrig
In reply to @joerg:thalheim.io
Sure. Does Thursday, 18:00 CEST the next week work for you?
Yes, I've blocked all the nixos infra meetings in my calendar so I can attend them.
08:24:39
@fricklerhandwerk:matrix.orgfricklerhandwerk
In reply to @joerg:thalheim.io
What is your expected update cadence?
Hm, good question. That will depend on whether we get follow-up funding and how much, but say something between 1 week and 1 month
08:25:23
@fricklerhandwerk:matrix.orgfricklerhandwerk
In reply to @fricklerhandwerk:matrix.org
Hm, good question. That will depend on whether we get follow-up funding and how much, but say something between 1 week and 1 month
There is already automation in place to do continuous deployment to staging, and we'll re-use that for production.
08:26:14
@joerg:thalheim.ioMic92 What is S3 used for? 08:29:56
@joerg:thalheim.ioMic92 Just seems to be backup as far as I can see 08:30:55
@rosscomputerguy:matrix.orgTristan Ross
In reply to @emilazy:matrix.org
wouldn't that run into the atomics and platform purity problems wrt the evaluator?
@tomberek:matrix.org and I were discussing this a bit last night and we're not entirely sure atomics is an actual problem. How does it affect Hydra? Wouldn't this be an issue with the C++ compiler? Hydra appears to run fine on aarch64-linux, from what I've heard. As for purity, as long as the system is passed through and handled as expected, it shouldn't be a concern, right?
13:40:41
@vcunat:matrix.orgvcunat Tristan Ross: I believe the point was around x86 being strict around reordering of instructions while ARM is not. On language level you then need to be careful around https://en.cppreference.com/w/cpp/atomic/memory_order 14:02:11
@rosscomputerguy:matrix.orgTristan Ross
In reply to @vcunat:matrix.org
Tristan Ross: I believe the point was around x86 being strict around reordering of instructions while ARM is not. On language level you then need to be careful around https://en.cppreference.com/w/cpp/atomic/memory_order
Wouldn't this affect the nix cli itself and literally everything?
14:08:29
@k900:0upti.meK900 No 14:08:42
@k900:0upti.meK900 hydra-queue-runner is like 3k lines of C++ 14:09:00
@k900:0upti.meK900 On top of the normal Nix things 14:09:06
@k900:0upti.meK900 It's those bits I'm worried about, not Nix 14:09:15
@rosscomputerguy:matrix.orgTristan Ross Oh, is there a way to test the queue runner on aarch64 so that it triggers a breakage caused by this? 14:10:29
@k900:0upti.meK900 Not really without just running it 14:10:49
@k900:0upti.meK900 Realistically this is not a big problem 14:11:00
@k900:0upti.meK900 It can be tested and fixed 14:11:03
@k900:0upti.meK900 Probably in a reasonable amount of time 14:11:09
@k900:0upti.meK900 It's just another thing to be aware of if migrating to aarch64 14:11:29
@k900:0upti.meK900 And I genuinely don't see why we need to go aarch64 instead of just upgrading to a beefier and/or better cooled x86 14:12:13
@k900:0upti.meK900 Hydra needs throughput, not latency, so it won't really care if we have many small cores or few big cores 14:12:58
@rosscomputerguy:matrix.orgTristan Ross I'm just thinking in terms of cost versus performance: if we're able to get more performance at a lower cost, wouldn't that be better than spending on a beefier, more expensive system that performs about the same? 14:14:29
@k900:0upti.meK900 Depends on how much the cost difference is 14:14:55
@hexa:lossy.networkhexa (signing key rotation when)
  • current: AX101, 5950X (16C/32T @ 3.4 GHz Base Clock), 128 GB RAM (~106 EUR/mo)
  • alternatives:
    • AX162-R, Epyc 9454P (48C/96T @ 2.75 GHz Base Clock), 256 GB RAM (~241 EUR/mo)
    • RX220, Altra Q80-30 (80C @ 3.0 GHz Base Clock), 256 GB RAM (~260 EUR/mo)
14:15:50
@hexa:lossy.networkhexa (signing key rotation when) *

bottlenecks:

  • parallel compress slots (currently limited at 30, which seems reasonable in relation to the compute rhea has)
  • eval memory, which we compensate with zram at 150%
  • eval time, which is single-threaded and probably not fixable through hw upgrades
14:16:15
@k900:0upti.meK900 Yeah, I was going to say that Hetzner doesn't really have cheap ARM 14:16:25
