System is randomly crashing while playing games on a fresh install

Hi, so I just built my first PC last week. Since then, I’ve experienced a crash nearly every day. Most happened while I was playing Dark Souls 3, but I also got one while browsing YouTube in Firefox.

I’ve looked through journalctl -r, but I haven’t noticed anything logged before the messages from the system booting again.

I’m at a loss as to how to even go about diagnosing this problem since I don’t really have much experience with Linux. Help would be much appreciated.

Output for inxi -b

System:
  Host: fedora Kernel: 6.8.5-301.fc40.x86_64 arch: x86_64 bits: 64
  Desktop: GNOME v: 46.5 Distro: Fedora Linux 40 (Workstation Edition)
Machine:
  Type: Desktop Mobo: Micro-Star model: B650M GAMING WIFI (MS-7E30) v: 1.0
    serial: <superuser required> UEFI: American Megatrends LLC. v: 1.60
    date: 06/12/2024
CPU:
  Info: 6-core AMD Ryzen 5 7600 [MT MCP] speed (MHz): avg: 545
    min/max: 545/5170
Graphics:
  Device-1: Advanced Micro Devices [AMD/ATI] Navi 32 [Radeon RX 7700 XT /
    7800 XT] driver: amdgpu v: kernel
  Device-2: Advanced Micro Devices [AMD/ATI] Raphael driver: amdgpu
    v: kernel
  Display: wayland server: X.Org v: 24.1.3 with: Xwayland v: 24.1.3
    compositor: gnome-shell driver: dri: radeonsi gpu: amdgpu
    resolution: 2560x1440~144Hz
  API: OpenGL v: 4.6 vendor: amd mesa v: 24.1.7 renderer: AMD Radeon RX
    7700 XT (radeonsi navi32 LLVM 18.1.6 DRM 3.57 6.8.5-301.fc40.x86_64)
Network:
  Device-1: Realtek RTL8125 2.5GbE driver: r8169
  Device-2: MEDIATEK MT7922 802.11ax PCI Express Wireless Network Adapter
    driver: mt7921e
Drives:
  Local Storage: total: 931.51 GiB used: 120.78 GiB (13.0%)
Info:
  Memory: total: 32 GiB note: est. available: 30.45 GiB used: 3.28 GiB (10.8%)
  Processes: 372 Uptime: 29m Shell: Zsh inxi: 3.3.36

Crashes/hangs that happen seemingly at random are difficult to diagnose. Sometimes connecting a secondary monitor and leaving a command like sudo journalctl -f or sudo dmesg -w running on the second monitor while you use the system normally on the primary monitor will work so that you can see the last error messages before the system hung.
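If no second monitor is at hand, the journal of the boot that ended in the crash can sometimes be read after the fact. Below is a minimal sketch, assuming the journal is stored persistently so that older boots are kept; prev_boot_errors is a made-up helper name, not an existing tool. After a hard hang the last messages may never have reached the disk, so an empty result does not prove that nothing was logged.

```shell
# Hypothetical helper: show the last N messages of priority "warning" or
# worse from the previous boot (-b -1), i.e. the boot that ended in the crash.
prev_boot_errors() {
    count="${1:-50}"
    journalctl -b -1 -p warning --no-pager -n "$count"
}

# Usage (may need sudo, depending on who is allowed to read the journal):
#   journalctl --list-boots     # confirm that older boots are recorded at all
#   prev_boot_errors 100
```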

Having a significant amount of RAM in your system might be what makes the hang seem to happen at random if the problem is some bad memory cells at a high address. Essentially, everything will work fine for a long while until you do something that requires enough memory that it will try to access one of those higher addresses. You might want to run MemTest86+ on your system overnight to verify that your memory chips are good.

As well as running the memtest, it is worth checking that all cables and modules are plugged in correctly. A poor connection can cause the issues you are seeing.

I unplug and replug each module and cable to make sure the connections are good when faced with this situation.

You seem to be running the software from the initial install without having updated it.

I would suggest that you update fully with sudo dnf upgrade, then try again and see if anything changes. With the kernel shown in the inxi output you posted, we cannot tell whether the problem has already been fixed by updates.

Ran it thrice for ~2 hours and the memory passed.

I did check thoroughly after building, but I’ll check again.

I did update but via gnome software. According to rpm -qa kernel, I have the 6.10.12 and 6.11.3 kernels but apparently it doesn’t switch the default one? I don’t even get the prompts at startup asking about the kernel to use that I used to get on my previous machine (barebones Fedora running swayfx), so I thought everything worked. I’ll check this once.
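One quick way to check this is to compare the running kernel with the newest installed kernel package. The sketch below is only an illustration: the helper names are made up, it relies on the -V (version sort) option of GNU sort, and it assumes that uname -r prints the same version-release.arch string that rpm reports, as in the inxi output above.

```shell
# Hypothetical helper: read kernel versions, one per line, on stdin and
# print the highest one. Relies on GNU sort's -V (version sort).
newest_version() {
    sort -V | tail -n 1
}

# Hypothetical helper: report whether the running kernel is the newest
# kernel package that rpm knows about.
check_running_kernel() {
    running="$(uname -r)"
    newest="$(rpm -q kernel --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' | newest_version)"
    if [ "$running" = "$newest" ]; then
        echo "running the newest installed kernel: $running"
    else
        echo "running $running, but $newest is installed"
    fi
}

# Usage:
#   check_running_kernel
```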

I’d recommend more testing, and with more than one memtest program.

I had an RX 580 and a 6600 XT show odd crashes and general instability with RAM settings that passed an overnight run of the open-source memtest but produced errors with HCI’s MemTest on Windows. I changed some settings, retested with HCI, and the instability was gone.

Specifically, I only saw that with AMD GPUs, and the crashes were frequent with Vulkan and DX12 titles on Windows (DX11 was less suspicious/more stable). I also had an RTX 3060 with the same unstable RAM settings, and no games crashed and the system was stable (it still failed the memtest). I suspect most things touching Vulkan on Linux would run into something similar with AMD GPUs, i.e. Vulkan exposes more of the instability caused by bad system memory or RAM settings.

In lieu of memory testing: regardless of what your RAM is rated for, if it’s DDR4, set it to 2133MHz at the max safe DDR4 voltage (1.3V?) with auto BIOS timings (which likely use a stable table from the RAM at 2133), and then use the system normally for a while. If there’s no crashing, it’s definitely RAM-related :stuck_out_tongue: My RAM was 3666, but my X470 board only liked two sticks at 3122 and four at 3000 with specific settings and higher-than-average voltage; everything was safe at 2133 and 1.3V.

You can usually open the GRUB menu at boot by holding the Shift key as soon as the BIOS splash screen disappears.

If that fails, you can set the menu to always be displayed with
sudo grub2-editenv - unset menu_auto_hide.

Since you said the system is already updated, I would guess the crashes come from running the 6.8.5 kernel while all of the other software is up to date.

What is the result of sudo grubby --default-kernel
and sudo grubby --default-index?

$ sudo grubby --default-kernel
/boot/vmlinuz-6.8.5-301.fc40.x86_64
$ sudo grubby --default-index
2

Used 6.11 and didn’t get a crash today. Let’s hope the trend continues.

TBH, I’m not really comfortable with messing around with RAM settings. For reference, I’ve got 2 sticks of DDR5 @ 6000MT/s (30 40 40) @ 1.35V according to the BIOS. Those speeds are supposed to be supported by the motherboard, and according to AMD, the 7000-series CPUs work best with 6000MT/s. So it shouldn’t be a problem on that front. But of course, if nothing else works, I’ll try this as well.

You appear to have set the default kernel to always be the 6.8.5 version (index 2), and that kernel will never be removed as long as it is the one booted when an upgrade is performed.

If you wish to switch to the latest kernel as the default, you could use grubby to reset the default kernel index to 0, which is the default with a new installation: sudo grubby --set-default-index=0
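To see at a glance whether the boot default differs from the kernel that is actually running, the grubby calls from this thread can be wrapped as in the sketch below. The function names are made up for illustration; the only assumption about grubby is the /boot/vmlinuz-<version> path format shown in the output posted above.

```shell
# Hypothetical helper: turn a path such as /boot/vmlinuz-<version> (the
# format grubby --default-kernel prints) into the bare <version> string.
kernel_path_to_version() {
    echo "${1#/boot/vmlinuz-}"
}

# Hypothetical helper: show which kernel is running and which one is the
# boot default, so a mismatch is easy to spot.
show_default_kernel() {
    default="$(kernel_path_to_version "$(grubby --default-kernel)")"
    echo "running: $(uname -r)"
    echo "default: $default (index $(grubby --default-index))"
}

# Usage (as root):
#   show_default_kernel
#   grubby --set-default-index=0    # entry 0 is normally the newest kernel
```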

Hi, and welcome to the forum.

Apart from all the software ideas people have written here, I would also have a look at the temperatures inside your PC. This includes the CPU, GPU and motherboard temperatures.

I mention this because you said the crashes mostly happen while playing games, which means high CPU and GPU usage and therefore higher temperatures. Are the fans spinning at high speed when you play a game?

Nope, used 6.11.3-200 kernel and pc crashed again :frowning_face:.

I did start monitoring it in game using the Vitals GNOME extension, but the temps never crossed 70C, even during parts where I could feel the heat, which is normal as far as I can tell from searching.
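An on-screen readout is lost when the machine freezes, so logging the temperatures to a file keeps the last values before a crash. The sketch below reads the kernel's hwmon files directly; read_temps is a made-up helper name, and the sensors that show up depend on the drivers loaded on the machine.

```shell
# Hypothetical helper: print every temperature sensor found under a hwmon
# tree (default /sys/class/hwmon) as "<chip> <sensor file> <degrees C>".
# The kernel exposes temp*_input values in millidegrees Celsius.
read_temps() {
    base="${1:-/sys/class/hwmon}"
    for f in "$base"/hwmon*/temp*_input; do
        [ -r "$f" ] || continue
        # Some sensors refuse to be read; skip those instead of failing.
        milli="$(cat "$f" 2>/dev/null)" || continue
        [ -n "$milli" ] || continue
        chip="$(cat "$(dirname "$f")/name" 2>/dev/null)"
        echo "${chip:-unknown} $(basename "$f") $((milli / 1000))"
    done
}

# Usage: append a timestamped reading every 5 seconds while gaming, then
# look at the end of the file after the next crash.
#   while true; do date; read_temps; sync; sleep 5; done >> ~/temps.log
```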

I’ll try some more memory testing overnight tonight.

Did it fix itself?

Yuuup. Fixed itself after my last reply. I no longer get the freezing/crashes.
