Fedora 40 kernel-6.13.n and split lock on Zen5

Hi,

I have been having hard PC lockups since kernel 6.13.x was pushed for Fedora 40. This only happens on my Zen5 9950X machine and not on my Zen3 machines.

My Zen5 9950X machine freezes/locks up with no messages in dmesg or journalctl to help me figure out what’s going on. I have to cut power until the LEDs on the motherboard go out. While it is locked up, all of the fans run at maximum RPM.

There’s been nothing happening that I can focus on as to why this has started happening in recent weeks/month. I ran memtest for a weekend.

It will do this while I am using the machine and while I’m not logged in (gdm is running).

It has nothing to do with NVIDIA or AMD graphics cards. I have tried both: RTX 4060 (nvidia) and RX 6650 XT (amdgpu).

After cutting power, the machine boots back up with no indication that it had crashed (e.g., no abrt report).

To debug this, I have:

  1. gone back to kernel 6.12.15-100: freezes/hard locks are gone
  2. disabled split lock detection on kernel 6.13.x with split_lock_detect=off: freezes/hard locks are gone

I am currently using option #2.
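To confirm that the workaround is really active on the running kernel, something like the following could be used. It is only a minimal sketch: the function name `split_lock_mode` is made up for illustration, and it simply parses a kernel command line string for the `split_lock_detect` parameter.

```python
from typing import Optional


def split_lock_mode(cmdline: str) -> Optional[str]:
    """Return the value of split_lock_detect= from a kernel command line.

    Returns None when the parameter is absent, in which case the kernel
    falls back to its built-in default. If the parameter appears more
    than once, the last occurrence is reported (an assumption about
    precedence, not something verified against the kernel source).
    """
    mode = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "split_lock_detect" and sep:
            mode = value
    return mode
```

On a live system the input would be the contents of `/proc/cmdline`, for example `split_lock_mode(open("/proc/cmdline").read())`, which should return `off` with the GRUB line shown below.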

Machine configuration:
ROG CROSSHAIR X670E HERO, BIOS 2904 03/04/2025 (latest BIOS)
AMD Ryzen 9 9950X 16-Core Processor (family: 0x1a, model: 0x44, stepping: 0x0)
microcode: Current revision: 0x0b404023
GRUB_CMDLINE_LINUX="rhgb quiet LANG=en_US.UTF-8 iommu=pt usbcore.autosuspend=-1 split_lock_detect=off rd.driver.blacklist=nouveau modprobe.blacklist=nouveau"

I only have this freeze/hard lock on my Zen5 machine (Zen3 boxes don’t do this). I don’t get anything from abrt or journalctl or dmesg that gives me any indication of what to look at.

Here’s my inxi output:

Memory:
  System RAM: total: 96 GiB available: 93.78 GiB used: 12.3 GiB (13.1%)
  Array-1: capacity: 192 GiB note: est. slots: 4 modules: 2 EC: None
  Device-1: Channel-A DIMM 0 type: no module installed
  Device-2: Channel-A DIMM 1 type: DDR5 size: 48 GiB speed: spec: 4800 MT/s
    actual: 5600 MT/s
  Device-3: Channel-B DIMM 0 type: no module installed
  Device-4: Channel-B DIMM 1 type: DDR5 size: 48 GiB speed: spec: 4800 MT/s
    actual: 5600 MT/s
CPU:
  Info: 16-core model: AMD Ryzen 9 9950X bits: 64 type: MT MCP cache:
    L2: 16 MiB
  Speed (MHz): avg: 2981 min/max: 600/5752 cores: 1: 2981 2: 2981 3: 2981
    4: 2981 5: 2981 6: 2981 7: 2981 8: 2981 9: 2981 10: 2981 11: 2981 12: 2981
    13: 2981 14: 2981 15: 2981 16: 2981 17: 2981 18: 2981 19: 2981 20: 2981
    21: 2981 22: 2981 23: 2981 24: 2981 25: 2981 26: 2981 27: 2981 28: 2981
    29: 2981 30: 2981 31: 2981 32: 2981
Graphics:
  Device-1: NVIDIA AD107 [GeForce RTX 4060] driver: nvidia v: 570.124.04
  Device-2: Logitech BRIO Ultra HD Webcam driver: snd-usb-audio,uvcvideo
    type: USB
  Display: server: X.Org v: 1.20.14 with: Xwayland v: 24.1.6 driver: X:
    loaded: nvidia gpu: nvidia,nvidia-nvswitch resolution: 5120x2160~120Hz
  API: EGL v: 1.5 drivers: nvidia,swrast
    platforms: gbm,x11,surfaceless,device
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 570.124.04
    renderer: NVIDIA GeForce RTX 4060/PCIe/SSE2
  API: Vulkan v: 1.3.296 drivers: N/A surfaces: xcb,xlib
  Info: Tools: api: eglinfo, glxinfo, vulkaninfo
    de: kscreen-doctor,xfce4-display-settings gpu: corectrl, nvidia-settings,
    nvidia-smi x11: xdriinfo, xdpyinfo, xprop, xrandr

It sounds like you know what is going on and have found a workaround. Maybe the next kernel will fix it.

Consider submitting a bug report.


I don’t know that I understand anything here beyond having a workaround.

I had never heard of split-lock or VFIO until this issue. I don’t game or use the VFIO kernel module with my Fedora boxen. I would guess that there’s some software I use that triggers a split-lock? I’m still murky on precisely what split-lock does. I have not checked the kernel 6.13 changelog for edits/changes to/with it.
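For what it is worth, a split lock occurs when an atomic (locked) instruction operates on data that straddles two cache lines, which forces the CPU to take a much more expensive system-wide lock. The boundary condition can be sketched roughly as below, assuming 64-byte cache lines; the function name is hypothetical and this only illustrates the address arithmetic, not how the kernel detects it.

```python
CACHE_LINE_BYTES = 64  # assumed cache line size; typical on current x86 CPUs


def straddles_cache_line(address: int, size: int, line: int = CACHE_LINE_BYTES) -> bool:
    """True if the bytes [address, address + size) touch more than one cache line."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_line = address // line
    last_line = (address + size - 1) // line
    return first_line != last_line
```

An 8-byte operand at offset 56 within a line still fits, while the same operand at offset 60 spills into the next line and would be a split-lock candidate if accessed with a locked instruction.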

I also don’t have any artifact evidence of an issue to report. No dmesgs, no abrts, no backtraces/call stacks, nothing but loud fans and a locked up machine at random times.
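With detection left at its default, the kernel may log a warning when a task triggers a split lock or bus lock, so it can be worth searching the logs from an earlier boot for such lines. The sketch below filters log text for them. The pattern is an assumption, since the exact message wording can differ between kernel versions, and the helper name is made up for illustration.

```python
import re
from typing import Iterable, List

# Assumed substrings; the exact kernel message wording may differ between versions.
LOCK_PATTERN = re.compile(r"(split|bus)[ _]lock", re.IGNORECASE)


def find_lock_messages(lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention a split lock or a bus lock."""
    return [line.rstrip("\n") for line in lines if LOCK_PATTERN.search(line)]
```

The input could be the saved output of `journalctl -k -b -1` (kernel messages from the previous boot). If the machine hard-locks before anything reaches disk, this will of course find nothing.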

I was hoping to reach smarter Zen5 Fedora users that would chime in and confirm or say I am on the wrong track. :slight_smile:

Where would you suggest I create a bug report?

Fedora uses the Red Hat Bugzilla tracker.

This looks interesting:

I don’t use virtualization regularly or often but I do have VirtualBox’s kmods installed and the kvm/kvm_amd kmods are installed/loaded. Oye!

I see there’s a 6.13.7 kernel in koji/updates-testing. I’ll take a look over the weekend.


Possibly related to this with a fix coming in upcoming 6.14 kernel:

Thanks for the reply.

I checked that out and I don’t think it’s applicable to me as I disable/remove/turn-off or otherwise prevent any power saving options or features.

Also, I haven’t gone back to my RX 6650 XT yet. I’m still on the RTX 4060 w/nvidia drivers.

Also, I don’t think the researcher is using a Zen5 CPU. This issue only affects Zen5, and only since bus lock detection support for AMD was added in the 6.13.x kernel. My other AM4/Zen3 machines are fine and don’t do this.
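One way to compare the Zen5 and Zen3 machines would be to check whether the kernel reports the bus lock detection CPU feature on each. This is a minimal sketch that parses `/proc/cpuinfo`-style text. The flag name `bus_lock_detect` is my assumption about how the feature is listed, and the function name is hypothetical.

```python
def has_cpu_flag(cpuinfo_text: str, flag: str = "bus_lock_detect") -> bool:
    """Check the first 'flags' line of /proc/cpuinfo-style text for a feature flag.

    The default flag name is an assumption about how the kernel labels
    bus lock detection; adjust it if the running kernel uses another name.
    """
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return flag in value.split()
    return False
```

Running `has_cpu_flag(open("/proc/cpuinfo").read())` on both kinds of machine under the same 6.13.x kernel would show whether the feature is only exposed on Zen5.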