!JQvnJacrwKgtkGHYHO:matrix.org

NixOS + Framework

135 Members
Discussing NixOS in the context of the Framework laptop

38 Servers

Sender, Message, Time
22 Mar 2025
@vaw:nlih.devaw joined the room.21:26:28
25 Mar 2025
@freedbeats:the-lamp.netfreedbeats joined the room.15:24:22
27 Mar 2025
@twitchy0:matrix.orgtwitchy0 joined the room.01:09:02
3 Apr 2025
@damccull:matrix.orgdamccull

My FW16 seems to be turning off the dGPU with runtime power management, but the amdgpu driver then seems to crash and the device is removed from the kernel: it no longer shows up in nvtop and isn't accessible to games, although it still shows up in lspci. If I use the following to reload the card, it comes back, and after a while it crashes again:

echo 1 | sudo tee /sys/bus/pci/devices/0000\:03\:00.0/remove
echo 1 | sudo tee /sys/bus/pci/devices/0000\:03\:00.1/remove
echo 1 | sudo tee /sys/bus/pci/rescan

Anyone else having this issue or know how to fix it?

22:15:50
@damccull:matrix.orgdamccullAre there any known issues with amdgpu in the kernel right now?22:42:04
4 Apr 2025
@ctsdownloads:fedora.imMatt HI'm keeping a spreadsheet for stuff like this, which kernel?09:45:55
@damccull:matrix.orgdamccull6.14.0. K900 on the https://matrix.to/#/#users:nixos.org channel found the solution, though, at https://gitlab.freedesktop.org/drm/amd/-/issues/4083. The patch on that page works, and I imagine the fix will land in a future kernel update. For now, instead of compiling the full kernel, I followed the NixOS wiki to package the amdgpu module and patched that directly. It's about a 5 minute compile on my Framework 16.18:36:16
@damccull:matrix.orgdamccullThe wiki's packaging example happens to be amdgpu, which was helpful: https://nixos.wiki/wiki/Linux_kernel#Patching_a_single_In-tree_kernel_module18:36:49
@damccull:matrix.orgdamccullPatch example here: https://wiki.nixos.org/wiki/AMD_GPU#System_Hang_with_Vega_Graphics_(and_select_GPUs)18:38:26
@damccull:matrix.orgdamccull

Relevant part of my configuration.nix:


{
  config,
  inputs,
  lib,
  pkgs,
  ...
}:
let
  amdgpu-kernel-module = pkgs.callPackage ../../nixos_modules/patches/amdgpu-kernel-module.nix {
    # Make sure the module targets the same kernel as your system is using.
    kernel = config.boot.kernelPackages.kernel;
  };
  # linuxPackages_latest 6.13 (or linuxPackages_zen 6.13)
  # amdgpu-stability-patch = pkgs.fetchpatch {
  #   name = "amdgpu-stability-patch";
  #   url = "https://github.com/torvalds/linux/compare/ffd294d346d185b70e28b1a28abe367bbfe53c04...SeryogaBrigada:linux:4c55a12d64d769f925ef049dd6a92166f7841453.diff";
  #   hash = "sha256-q/gWUPmKHFBHp7V15BW4ixfUn1kaeJhgDs0okeOGG9c=";
  # };
in
{
  boot = {
    extraModulePackages = [
      (amdgpu-kernel-module.overrideAttrs (_: {
        patches = [
          ./0001-drm-amdgpu-mes11-optimize-MES-pipe-FW-version-fetchi.patch
          # amdgpu-stability-patch
        ];
      }))
    ];
  };
}
18:41:05
@damccull:matrix.orgdamccull

The patch file itself, which can be found at the first AMD link (the GitLab issue):

From 2771f96336e3c469622f6e0e132f24ad1ba6a5c6 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 27 Mar 2025 17:33:49 -0400
Subject: [PATCH] drm/amdgpu/mes11: optimize MES pipe FW version fetching

Don't fetch it again if we already have it.  It seems the
registers don't reliably have the proper value at resume in
some cases.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4083
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
index 16f8bc36afa07..06b51867c9aac 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v11_0.c
@@ -895,6 +895,10 @@ static void mes_v11_0_get_fw_version(struct amdgpu_device *adev)
 {
 	int pipe;
 
+	/* return early if we have already fetched these */
+	if (adev->mes.sched_version && adev->mes.kiq_version)
+		return;
+
 	/* get MES scheduler/KIQ versions */
 	mutex_lock(&adev->srbm_mutex);
 
-- 
2.49.0
18:41:47
@damccull:matrix.orgdamccullThis solved my problem, for anyone else having it as well.18:42:04
@damccull:matrix.orgdamccullAnd it doesn't do a full kernel recompile, just the amdgpu module, which then takes precedence over the copy shipped with the kernel.18:42:23
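For comparison, the full-kernel route that this module-only approach avoids can be sketched with the standard NixOS option boot.kernelPatches. This is a minimal sketch and not something posted in the chat: it assumes the patch file sits next to the module that references it, and the name value is only an illustrative label. It rebuilds the whole kernel, so expect a much longer compile than the module-only build.

```nix
{ ... }:
{
  # Applies the patch to the kernel itself, which triggers a full kernel rebuild.
  boot.kernelPatches = [
    {
      # Illustrative label (an assumption); any descriptive string works here.
      name = "amdgpu-mes11-fw-version-fetch";
      # Assumed location: same directory as this file.
      patch = ./0001-drm-amdgpu-mes11-optimize-MES-pipe-FW-version-fetchi.patch;
    }
  ];
}
```

Use one approach or the other; patching the kernel this way and also overriding the amdgpu module with the same patch would be redundant.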
@damccull:matrix.orgdamccull Matt H: I think kernels 6.13.7->6.14.0 might be affected though. That's what the bug report on the first link says, and I didn't have any luck with downgrading to pkgs.linuxPackages_6_13 in my configuration.nix. 18:43:33
@damccull:matrix.orgdamccull

I forgot one other file that's needed. This one is amdgpu-kernel-module.nix, which the configuration.nix above references. This is the part that packages the amdgpu module separately from the kernel:

{
  pkgs,
  lib,
  kernel ? pkgs.linuxPackages_latest.kernel,
}:

pkgs.stdenv.mkDerivation {
  pname = "amdgpu-kernel-module";
  inherit (kernel)
    src
    version
    postPatch
    nativeBuildInputs
    ;

  kernel_dev = kernel.dev;
  kernelVersion = kernel.modDirVersion;

  modulePath = "drivers/gpu/drm/amd/amdgpu";

  buildPhase = ''
    BUILT_KERNEL=$kernel_dev/lib/modules/$kernelVersion/build

    cp $BUILT_KERNEL/Module.symvers .
    cp $BUILT_KERNEL/.config        .
    cp $kernel_dev/vmlinux          .

    make "-j$NIX_BUILD_CORES" modules_prepare
    make "-j$NIX_BUILD_CORES" M=$modulePath modules
  '';

  installPhase = ''
    make \
      INSTALL_MOD_PATH="$out" \
      XZ="xz -T$NIX_BUILD_CORES" \
      M="$modulePath" \
      modules_install
  '';

  meta = {
    description = "AMD GPU kernel module";
    license = lib.licenses.gpl2Only; # the Linux kernel source is GPL-2.0-only
  };
}
18:45:11
@damccull:matrix.orgdamccullAll this is taken from nixos.wiki, wiki.nixos.org, and the amd bug tracker. GL everyone.18:45:48
@damccull:matrix.orgdamccull Oh, one last thing, Matt H: this isn't just a NixOS issue. Apparently the OP of the bug is on Fedora, and at the bottom of that tracker there's a link to another, related issue where this bug prevents hibernation. My own experience included hibernation not working. Everything's fine now with the patch. 18:54:24
7 Apr 2025
@cbobrobison:matrix.orgcbobrobison joined the room.02:02:37
8 Apr 2025
@ctsdownloads:fedora.imMatt H Thanks for the clarity damccull 🙏😀 00:17:06
@damccull:matrix.orgdamccullYou're quite welcome. I'm happy to help because the Nix and Framework communities have both been so helpful to me.05:15:51
9 Apr 2025
@niko:nyanbinary.rsnyanbinary 🏳️‍⚧️Same here21:07:12
@niko:nyanbinary.rsnyanbinary 🏳️‍⚧️ Matt H: look at #gaming:nixos.org 21:07:21
@niko:nyanbinary.rsnyanbinary 🏳️‍⚧️I had this problem21:07:24
@niko:nyanbinary.rsnyanbinary 🏳️‍⚧️It's a regression in 6.13 21:07:31
@niko:nyanbinary.rsnyanbinary 🏳️‍⚧️6.14 still has the same problem21:07:37
@niko:nyanbinary.rsnyanbinary 🏳️‍⚧️6.12 works21:07:40
@damccull:matrix.orgdamccullYeah, I'm getting weird graphical glitches on 6.13 and 6.14 too. About to switch back to 6.12.21:08:14
@niko:nyanbinary.rsnyanbinary 🏳️‍⚧️Same here21:10:16
@niko:nyanbinary.rsnyanbinary 🏳️‍⚧️red band on the screen/screen flashing?21:10:21
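A minimal sketch of the 6.12 fallback reported above as working, assuming your nixpkgs revision still provides the linuxPackages_6_12 attribute. Whether 6.12 avoids the problem on a given machine rests only on the reports in this room.

```nix
{ pkgs, ... }:
{
  # Pin the system to the 6.12 kernel series instead of the default or latest kernel.
  boot.kernelPackages = pkgs.linuxPackages_6_12;
}
```

After a rebuild and a reboot, uname -r should report a 6.12.x kernel. If you also package the amdgpu module separately as shown earlier, it picks up the pinned kernel through config.boot.kernelPackages.kernel.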
