!PbtOpdWBSRFbEZRLIf:numtide.com

Nix Community Projects

662 Members
Meta discussions related to https://nix-community.org. (For project-specific discussions, use GitHub issues or the project's own Matrix channel.) Need help from an admin? Open an issue on https://github.com/nix-community/infra/issues
166 Servers


Sender | Message | Time
22 Sep 2021
@mic92:nixos.dev Mic92 | I don't think disabling fsck.f2fs is an option | 06:21:47
@mic92:nixos.dev Mic92 | ext4 also has issues with inodes just like f2fs I think | 06:22:24
@nix-community-bot:nixos.dev nix-community-bot | [firing] systemd_service_failed: nix-community-build02 failed to (re)start service nixpkgs-update-github.service. [firing] systemd_service_failed: nix-community-build02 failed to (re)start service nixpkgs-update-pypi.service. [firing] systemd_service_failed: nix-community-build02 failed to (re)start service nixpkgs-update-updatescript.service. | 07:19:30
@nix-community-bot:nixos.dev nix-community-bot | [firing] systemd_service_failed: nix-community-build02 failed to (re)start service nixpkgs-update-github.service. [firing] systemd_service_failed: nix-community-build02 failed to (re)start service nixpkgs-update-pypi.service. [firing] systemd_service_failed: nix-community-build02 failed to (re)start service nixpkgs-update-updatescript.service. | 11:24:30
@mic92:nixos.dev Mic92 | @ryantm: why do all those services constantly fail lately? | 11:25:42
@ryantm:matrix.org ryantm | Mic92: By parallelizing the updates, I quadrupled the chances of something crashing. I also fixed some issues where it would sit for a long time without timing out, so it is probably getting to the crashes more quickly. I think it was crashing before just much more slowly. | 14:48:00
@mic92:nixos.dev Mic92 | @ryantm: is there no way of catching errors at a higher level in your application? | 14:48:44
@ryantm:matrix.org ryantm | Mic92: I'm sure there is! I've been slowly looking at the crashes and trying to fix them. For example: https://github.com/ryantm/nixpkgs-update/commit/279960c32fd2de5d86172251a041508f7945bb98 | 14:52:36
@ryantm:matrix.org ryantm | I'm not sure if that commit got deployed though, because of the issues with nixops. | 14:52:55
@mic92:nixos.dev Mic92 | I deployed yesterday | 14:53:45
@mic92:nixos.dev Mic92 | all sources here should be deployed: https://github.com/nix-community/infra/tree/master/nix | 14:54:12
@mic92:nixos.dev Mic92 | Maybe we should filter update errors, and if they hit some threshold, then we alert. | 14:56:10
@mic92:nixos.dev Mic92 | That could be done by parsing nixpkgs-update logs, or nixpkgs-update could expose some metrics that we can collect with telegraf. | 14:56:38
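
The log-parsing variant of Mic92's threshold idea might look roughly like the Python sketch below. The log format is an assumption: the conversation does not show what nixpkgs-update logs look like, so the `FAIL` marker, the sample lines, and the names `failure_rate` and `should_alert` are all hypothetical.

```python
from typing import Iterable

# Assumption: each non-empty log line describes one update attempt, and
# failed attempts contain the marker "FAIL". The real log format may differ.
FAILURE_MARKER = "FAIL"


def failure_rate(lines: Iterable[str]) -> float:
    """Return the fraction of non-empty log lines that contain the failure marker."""
    total = 0
    failed = 0
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        total += 1
        if FAILURE_MARKER in line:
            failed += 1
    return failed / total if total else 0.0


def should_alert(lines: Iterable[str], threshold: float = 0.5) -> bool:
    """Alert only when the failure rate reaches the (hypothetical) threshold."""
    return failure_rate(lines) >= threshold


# Made-up sample data for illustration only.
sample = [
    "hello: SUCCESS",
    "foo: FAIL build error",
    "bar: FAIL fetch error",
    "baz: SUCCESS",
]
print(failure_rate(sample))                  # 2 of 4 lines failed
print(should_alert(sample, threshold=0.75))  # rate is below this threshold
```

Alerting on a rate rather than on every individual failure matches the point made later in the conversation: some updates will always fail, so only a deviation from the usual level is interesting.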
@mic92:nixos.dev Mic92 | this would also be an option: https://hackage.haskell.org/package/ekg-statsd-0.2.5.0/docs/System-Remote-Monitoring-Statsd.html | 14:58:04
@mic92:nixos.dev Mic92 | telegraf has a statsd input | 14:58:19
@mic92:nixos.dev Mic92 | Your application would send stats to telegraf and then it is queryable with prometheus | 14:58:36
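
To make the StatsD route concrete, here is a small Python sketch of what "sending stats to telegraf" amounts to on the wire: a plain-text datagram of the form `name:value|c` for a counter, sent over UDP. nixpkgs-update itself is written in Haskell and would use something like the ekg-statsd package linked above. The metric name, the helper names, and the host are hypothetical, and port 8125 is assumed as the listener address of Telegraf's statsd input.

```python
import socket


def statsd_counter(name: str, value: int = 1) -> bytes:
    """Format a StatsD counter datagram: '<name>:<value>|c'."""
    return f"{name}:{value}|c".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one datagram over UDP (fire-and-forget, no reply is expected).

    Assumption: a StatsD-compatible listener (e.g. Telegraf's statsd input)
    is configured on host:port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# Hypothetical metric name; nixpkgs-update does not necessarily emit this.
payload = statsd_counter("nixpkgs_update.failures")
print(payload)  # → b'nixpkgs_update.failures:1|c'
```

Calling `send_metric(payload)` after each failed update would let the collector aggregate the counts, which can then be scraped and alerted on with a threshold.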
@ryantm:matrix.org ryantm | Yeah, looks like I still need to deploy that top-level IO error catch. | 15:02:47
@mic92:nixos.dev Mic92 | Right, but this would still give you some insights once the error handler is in place. | 15:03:53
@mic92:nixos.dev Mic92 | Because with updates something will always fail, but you still want to see if it deviates from what you had before. | 15:04:40
@nix-community-bot:nixos.dev nix-community-bot | [firing] systemd_service_failed: nix-community-build02 failed to (re)start service nixpkgs-update-github.service. [firing] systemd_service_failed: nix-community-build02 failed to (re)start service nixpkgs-update-pypi.service. [firing] systemd_service_failed: nix-community-build02 failed to (re)start service nixpkgs-update-updatescript.service. | 15:29:31
@mic92:nixos.dev Mic92 | I snoozed alerts again for 120h | 15:46:51
@nix-community-bot:nixos.dev nix-community-bot | [nix-community/infra] ryantm pushed to master: update nixpkgs-update * top-level IO exception catching - https://github.com/nix-community/infra/commit/86a999b233d078568fd3690da194cdfee8244d60 | 16:06:19
@ryantm:matrix.org ryantm | The top-level IO exception catch should be in place now. | 16:09:08
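
The general shape of a top-level catch, as discussed above, can be sketched in Python. This is not the code from the linked commits (nixpkgs-update is a Haskell program). `run_all` and `flaky_update` are hypothetical names, and `OSError` stands in here for an IO exception.

```python
import logging
from typing import Callable, Iterable, List

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("updater")


def run_all(packages: Iterable[str], update: Callable[[str], None]) -> List[str]:
    """Run update() for each package; log I/O errors instead of crashing.

    Returns the names of the packages whose update raised OSError.
    """
    failed = []
    for name in packages:
        try:
            update(name)
        except OSError as exc:  # stand-in for an IO exception
            log.error("update of %s failed: %s", name, exc)
            failed.append(name)
    return failed


def flaky_update(name: str) -> None:
    """Hypothetical updater that fails for one package."""
    if name == "broken":
        raise OSError("simulated I/O failure")


print(run_all(["hello", "broken", "world"], flaky_update))  # → ['broken']
```

With the catch at the top of the loop, one failing package no longer takes down the whole service, and the list of failures is available for the kind of metrics discussed earlier.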
23 Sep 2021
@nix-community-bot:nixos.dev nix-community-bot | [resolved] filesystem_inodes_full: build02.nix-community.org:9273 device disk/by-uuid/29a6b37b-fafb-46a1-b856-1e1c20dc053b on /nix/store got less than 10% inodes left on its filesystem. [resolved] filesystem_inodes_full: build02.nix-community.org:9273 device disk/by-uuid/29a6b37b-fafb-46a1-b856-1e1c20dc053b on / got less than 10% inodes left on its filesystem. | 03:18:30
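
For reference, the condition behind a `filesystem_inodes_full` alert ("less than 10% inodes left") can be checked by hand on a Unix machine. The Python sketch below is only an illustration, not the rule the monitoring setup actually evaluates. The function names are hypothetical, and the 10% default is taken from the alert text.

```python
import os


def inode_free_fraction(total: int, available: int) -> float:
    """Fraction of inodes still available; 1.0 if the filesystem reports no inode count."""
    if total == 0:
        return 1.0  # some filesystems report 0 total inodes (no fixed limit)
    return available / total


def inodes_nearly_full(path: str, min_free: float = 0.10) -> bool:
    """True if the filesystem containing `path` has fewer than min_free inodes left."""
    st = os.statvfs(path)  # Unix only
    # f_files: total inodes; f_favail: inodes available to unprivileged users
    return inode_free_fraction(st.f_files, st.f_favail) < min_free


print(inodes_nearly_full("/"))
```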
@nix-community-bot:nixos.dev nix-community-bot | [nix-community/infra] ryantm pushed to master: nixpkgs-update: reset after IO exception caught - https://github.com/nix-community/infra/commit/4ba238f2a6d427e5292cff1f97403d201f8d096e | 04:16:05
@nix-community-bot:nixos.dev nix-community-bot | [nix-community/infra] ryantm pushed to master: nixpkgs-update: cleanup nixpkgs-review files - https://github.com/nix-community/infra/commit/56764eff2351124eb17ddef4e3ddf07ff8d51117 | 04:23:31
@ryantm:matrix.org ryantm | That latest commit should fix the inode usage issues. | 04:24:11
24 Sep 2021
@ryantm:matrix.org ryantm | Mic92: I think you can un-snooze those alerts; it isn't crashing with the new exception catching. | 02:08:59
@nix-community-bot:nixos.dev nix-community-bot | [nix-community/infra] ryantm pushed 2 commits to master: https://github.com/nix-community/infra/commit/e15605269594f55d241b220c4067ce360844c97f ryantm: update nixpkgs-update: block gnome; ryantm: Merge branch 'master' of github.com:nix-community/infra | 03:04:38
@mic92:nixos.dev Mic92 | done | 06:57:31


Room Version: 6