!RROtHmAaQIkiJzJZZE:nixos.org

NixOS Infrastructure

388 Members
Next Infra call: 2024-07-11, 18:00 CEST (UTC+2) | Infra operational issues backlog: https://github.com/orgs/NixOS/projects/52 | See #infra-alerts:nixos.org for real time alerts from Prometheus.
119 Servers

Sender Message Time
1 Mar 2025
@hexa:lossy.network hexa https://www.intel.com/content/www/us/en/products/sku/123550/intel-xeon-silver-4114-processor-13-75m-cache-2-20-ghz/specifications.html 05:28:41
@hexa:lossy.network hexa I sure hope that package won't surprise me in python-updates one day 05:29:14
@vcunat:matrix.org Vladimír Čunát If it's 4h with 128 threads, I don't think it will fit into the 10h default limit in our current infra config. (--cores 24 for big-parallel mentioned here; I didn't verify now) 06:57:47
@vcunat:matrix.org Vladimír Čunát But that's easy to override per job. 06:58:34
@hexa:lossy.network hexa where is the default limit set? 06:59:30
@hexa:lossy.network hexa a meta default? 06:59:49
@hexa:lossy.network hexa I think we only configure a max-silent-time on the builders 07:00:17
@hexa:lossy.network hexa and a max-unsupported-time on the queue-runner 07:00:23
@vcunat:matrix.org Vladimír Čunát .meta.timeout 07:01:05
@hexa:lossy.network hexa
pkgs/applications/networking/browsers/chromium/browser.nix
121:    timeout = 172800; # 48 hours (increased from the Hydra default of 10h)
07:01:34
@hexa:lossy.network hexa 🤡 07:01:36
@hexa:lossy.network hexa *
pkgs/applications/networking/browsers/chromium/browser.nix
121:    timeout = 172800; # 48 hours (increased from the Hydra default of 10h)
pkgs/development/tools/electron/common.nix
287:    timeout = 172800; # 48 hours (increased from the Hydra default of 10h)
07:01:49
@hexa:lossy.network hexa meta.timeout is not set by default, so … where is the default? 😄 07:02:35
@vcunat:matrix.org Vladimír Čunát I think it's in Hydra config. 07:02:50
@hexa:lossy.network hexa image.png
07:03:04
@hexa:lossy.network hexa
, timeout => getMeta($buildInfo->{meta}->{timeout}, 36000)
07:03:33
@hexa:lossy.network hexa indeed, here we are 07:03:35
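
The fallback in the Hydra line quoted above can be sketched roughly as follows. This is only an illustration in Python, assuming that getMeta returns the package's own value when one is defined and the second argument (36000) otherwise. The names `effective_timeout`, `hours` and `HYDRA_DEFAULT_TIMEOUT` are hypothetical and are not part of Hydra or nixpkgs.

```python
# Hypothetical sketch of the fallback discussed above; this is not Hydra's code.
# Assumption: a package-level meta.timeout wins, otherwise the default applies.
HYDRA_DEFAULT_TIMEOUT = 36000  # seconds, the literal from the quoted Hydra line


def effective_timeout(meta: dict) -> int:
    """Return meta["timeout"] if the package sets it, else the default."""
    value = meta.get("timeout")
    return HYDRA_DEFAULT_TIMEOUT if value is None else int(value)


def hours(seconds: int) -> float:
    """Convert a timeout in seconds to hours for easier reading."""
    return seconds / 3600


if __name__ == "__main__":
    # A package without meta.timeout falls back to the default.
    print(hours(effective_timeout({})))
    # The value quoted for chromium and electron in the messages above.
    print(hours(effective_timeout({"timeout": 172800})))
```

Under that assumption the two calls correspond to the 10h default and the 48h override mentioned in the quoted nixpkgs comments.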
@hexa:lossy.network hexa lun: I'm a bit afraid to ask, but there is supposed to be a migraphx python package, and one of the packages I maintain would want that to support rocm 🙈 07:09:15
@hexa:lossy.network hexa supposedly this https://github.com/ROCm/AMDMIGraphX/blob/develop/src/py/migraphx_py.cpp 07:13:00
@lt1379:matrix.org Lun Mention it on the big ROCm tracking issue 16:04:24
@lt1379:matrix.org Lun definitely worse: https://gist.githubusercontent.com/LunNova/b1cf007f1af52b4dc353fd9925857b97/raw/63ea9ec1a500d5ef6ad4f2f0eac7a59b6db6e310/huge%2520composable_kernel%2520template%2520instantiation.txt 16:07:29
@lt1379:matrix.org Lun The ~4h builds are nix cores config set to 128 on a 64c/128t epyc milan eng sample that's clocking down to <3GHz due to power limits, not sure what the relative speedup per core will be but probably not enough to overcome dropping to 24 build threads. Does bumping meta.timeout to 20h to start with sound reasonable? 16:13:51
@emilazy:matrix.org emily might make sense to do 48 and then scale down based on the actual time 16:19:08
@emilazy:matrix.org emily to avoid spending 20 hours on a build that times out and gets thrown away 16:19:16
@emilazy:matrix.org emily (I am not on the infra team, don't trust anything I say) 16:19:34
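
As a back-of-the-envelope check on the numbers in the messages above (about 4h at 128 threads, 24 build threads on the builders, and limits of 10h, 20h or 48h), here is a minimal sketch. It assumes build time scales inversely with thread count, which is a crude simplification: 128 SMT threads on 64 cores do not behave like 128 full cores, and the per-core speed of the builders is unknown. The function names and the `per_thread_speedup` parameter are hypothetical.

```python
# Rough estimate only. Assumption: build time scales inversely with the
# number of build threads, adjusted by a guessed per-thread speed factor.
def estimated_hours(measured_hours: float, measured_threads: int,
                    target_threads: int, per_thread_speedup: float = 1.0) -> float:
    """Scale a measured build time to a machine with fewer (or more) threads."""
    if target_threads <= 0 or per_thread_speedup <= 0:
        raise ValueError("target_threads and per_thread_speedup must be positive")
    return measured_hours * measured_threads / (target_threads * per_thread_speedup)


def fits(estimate_hours: float, timeout_hours: float) -> bool:
    """True if the estimated build time is within the given timeout."""
    return estimate_hours <= timeout_hours


if __name__ == "__main__":
    estimate = estimated_hours(4, 128, 24)  # figures taken from the chat
    for limit in (10, 20, 48):
        print(f"{limit} h limit: estimate {estimate:.1f} h, fits={fits(estimate, limit)}")
```

With a speed factor of 1.0 the estimate comes out slightly above 21 hours, so under these assumptions a 20h limit would be borderline while 48h leaves headroom. A real build on the actual builders is the only reliable measurement.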
@vcunat:matrix.org Vladimír Čunát I believe it's fine to put an unnecessarily big value in there. It's just a single package. I see the main purpose to limit some accidents where resources would be spent continually without bringing value. 16:23:17
@vcunat:matrix.org Vladimír Čunát But honestly, doing so much work in a single derivation is rather risky. Generally it's better to split it up. 16:24:19
@vcunat:matrix.org Vladimír Čunát For example, sometimes we need to restart the queue runner, i.e. all builds in progress get scrapped. 16:24:57
@vcunat:matrix.org Vladimír Čunát We even had periods where it crashed often (same effect). 16:25:16

Room Version: 6