*Everything* crashing on new system using new hardware

I just upgraded my main PC, and nearly everything is crashing. Mainly Steam, Steam games, Firefox, Discord (Vesktop client), and Xwayland are not happy. Since all of these were working before the upgrade, I figure it’s something to do with the hardware. Does anyone know what the heck I should be doing?

There doesn’t seem to be any rhyme or reason to what causes a crash, and nothing is showing up in dmesg that would give me pause.

Specs:
CPU - Ryzen 7 9700x
GPU - Intel Arc A770 16GB
Mobo - Asrock B650 Pro RS
RAM - G.Skill Flare X5 64GB
Boot drive - WD Blue SATA M.2 1TB

Kernel - 6.12.11-200
Mesa - 24.3.4

Any help would be appreciated. I’m at a loss here :frowning:

Did you upgrade motherboard, CPU, GPU and RAM but keep the M.2 SSD with all your system files?

Hardware upgrades risk a) defective hardware, and b) incompatible BIOS and Linux configuration settings. Some vendors provide standalone bootable hardware test systems.
The Fedora Live USB installer provides a memory test.

Have you tried booting a recent (better chance that it supports new hardware) USB installer?

You can use the grub2 editor to add <space>3 to the end of the kernel command line to boot to a text console. Then use journalctl --no-hostname -b -1 -p <N> to look for error messages. I usually start with N=3. If I don’t find the error, I increase N to include lower-“priority” messages. There are many other journalctl options to filter the key records out of the mass of data it collects – see man journalctl. You can also run inxi -Fzxx and post the output as pre-formatted text in case someone recognizes a hardware-specific issue.
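
The "start at 3, then widen" approach could be wrapped in a small helper. This is only a sketch: the function name prev_boot_errors is made up for illustration, and it assumes the journal is persistent so that the previous boot (-b -1) is still available.

```shell
# Hypothetical helper (not part of systemd): show journal entries from the
# previous boot at priority N and anything more severe.
# Priorities run from 0 (emerg) through 3 (err) to 7 (debug).
prev_boot_errors() {
    local prio="${1:-3}"          # default to 3 (err), as suggested above
    case "$prio" in
        [0-7]) ;;                 # accept a single digit 0..7 only
        *) echo "usage: prev_boot_errors [0-7]" >&2; return 2 ;;
    esac
    journalctl --no-hostname -b -1 -p "$prio"
}
```

Running prev_boot_errors with no argument shows errors and worse; prev_boot_errors 4 adds warnings, and so on.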

That was originally what I did, yeah.

I ended up just reinstalling with a fresh F41 install. Same problems. The RAM test is a great idea, so I’ll do that.
I also had no idea that journalctl could filter error messages like that, neat!
I’ll try this when i get a chance.

edit:
There is some known-good RAM in my mother’s PC, so I will try it in mine when I get the chance. It will be faster than doing a memory scan, but I’ll probably still do one just to be thorough.

System:
  Kernel: 6.12.11-200.fc41.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 14.2.1
  Desktop: GNOME v: 47.3 tk: GTK v: 3.24.43 wm: gnome-shell dm: GDM
    Distro: Fedora Linux 41 (Workstation Edition)
Machine:
  Type: Desktop System: ASRock product: B650 Pro RS v: N/A
    serial: <superuser required>
  Mobo: ASRock model: B650 Pro RS serial: <superuser required> UEFI: American
    Megatrends LLC. v: 3.16 date: 12/18/2024
Battery:
  Device-1: hidpp_battery_0 model: Logitech G305 Lightspeed Wireless Gaming
    Mouse serial: <filter> charge: 100% (should be ignored)
    status: discharging
CPU:
  Info: 8-core model: AMD Ryzen 7 9700X bits: 64 type: MT MCP arch: N/A rev: 0
    cache: L1: 640 KiB L2: 8 MiB L3: 32 MiB
  Speed (MHz): avg: 4468 min/max: 600/5581 boost: enabled cores: 1: 4468
    2: 4468 3: 4468 4: 4468 5: 4468 6: 4468 7: 4468 8: 4468 9: 4468 10: 4468
    11: 4468 12: 4468 13: 4468 14: 4468 15: 4468 16: 4468 bogomips: 121364
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: Intel DG2 [Arc A770] vendor: Acer Incorporated ALI driver: i915
    v: kernel arch: Xe-HPG pcie: speed: 2.5 GT/s lanes: 1 ports:
    active: DP-1,DP-3 empty: DP-2, DP-4, HDMI-A-1, HDMI-A-2, HDMI-A-3
    bus-ID: 03:00.0 chip-ID: 8086:56a0
  Device-2: Advanced Micro Devices [AMD/ATI] Granite Ridge [Radeon Graphics]
    vendor: ASRock driver: amdgpu v: kernel pcie: speed: 16 GT/s lanes: 16
    ports: active: none empty: DP-5, DP-6, DP-7, HDMI-A-4, Writeback-1
    bus-ID: 11:00.0 chip-ID: 1002:13c0 temp: 34.0 C
  Device-3: Logitech HD Pro Webcam C920 driver: snd-usb-audio,uvcvideo
    type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 bus-ID: 1-8:4 chip-ID: 046d:082d
  Display: wayland server: Xwayland v: 24.1.4 compositor: gnome-shell
    driver: gpu: i915 display-ID: 0
  Monitor-1: DP-1 model: Dell P2412H res: 1920x1080 dpi: 92
    diag: 609mm (24")
  Monitor-2: DP-3 model: Acer XF243Y P res: 1920x1080 dpi: 93
    diag: 604mm (23.8")
  API: OpenGL v: 4.6 vendor: intel mesa v: 24.3.4 glx-v: 1.4 es-v: 3.2
    direct-render: yes renderer: Mesa Intel Arc A770 Graphics (DG2)
    device-ID: 8086:56a0 display-ID: :0.0
  API: EGL Message: EGL data requires eglinfo. Check --recommends.
  Info: Tools: api: glxinfo x11: xdriinfo, xdpyinfo, xprop, xrandr
Audio:
  Device-1: Intel DG2 Audio vendor: Acer Incorporated ALI
    driver: snd_hda_intel v: kernel pcie: speed: 2.5 GT/s lanes: 1
    bus-ID: 04:00.0 chip-ID: 8086:4f90
  Device-2: Advanced Micro Devices [AMD/ATI] Rembrandt Radeon High
    Definition Audio driver: snd_hda_intel v: kernel pcie: speed: 16 GT/s
    lanes: 16 bus-ID: 11:00.1 chip-ID: 1002:1640
  Device-3: Advanced Micro Devices [AMD] Family 17h/19h/1ah HD Audio
    vendor: ASRock driver: snd_hda_intel v: kernel pcie: speed: 16 GT/s
    lanes: 16 bus-ID: 11:00.6 chip-ID: 1022:15e3
  Device-4: C-Media Antlion USB adapter
    driver: hid-generic,snd-usb-audio,usbhid type: USB rev: 1.1 speed: 12 Mb/s
    lanes: 1 bus-ID: 1-6.3:5 chip-ID: 0d8c:002c
  Device-5: Logitech HD Pro Webcam C920 driver: snd-usb-audio,uvcvideo
    type: USB rev: 2.0 speed: 480 Mb/s lanes: 1 bus-ID: 1-8:4 chip-ID: 046d:082d
  API: ALSA v: k6.12.11-200.fc41.x86_64 status: kernel-api
  Server-1: JACK v: 1.9.22 status: off
  Server-2: PipeWire v: 1.2.7 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
Network:
  Device-1: Realtek RTL8125 2.5GbE vendor: ASRock driver: r8169 v: kernel
    pcie: speed: 5 GT/s lanes: 1 port: d000 bus-ID: 0e:00.0 chip-ID: 10ec:8125
  IF: enp14s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Drives:
  Local Storage: total: 3.64 TiB used: 1.12 TiB (30.8%)
  ID-1: /dev/sda vendor: Western Digital model: WD10EZEX-22MFCA0
    size: 931.51 GiB speed: 6.0 Gb/s serial: <filter>
  ID-2: /dev/sdb vendor: Western Digital model: WDS100T2B0B-00YS70
    size: 931.51 GiB speed: 6.0 Gb/s serial: <filter>
  ID-3: /dev/sdc vendor: Seagate model: ST2000LM009-1R9174 size: 1.82 TiB
    type: USB rev: 3.2 spd: 5 Gb/s lanes: 1 serial: <filter>
Partition:
  ID-1: / size: 929.93 GiB used: 272.91 GiB (29.3%) fs: btrfs dev: /dev/sdb3
  ID-2: /boot size: 973.4 MiB used: 354.5 MiB (36.4%) fs: ext4
    dev: /dev/sdb2
  ID-3: /boot/efi size: 598.8 MiB used: 19.3 MiB (3.2%) fs: vfat
    dev: /dev/sdb1
  ID-4: /home size: 929.93 GiB used: 272.91 GiB (29.3%) fs: btrfs
    dev: /dev/sdb3
Swap:
  ID-1: swap-1 type: zram size: 8 GiB used: 0 KiB (0.0%) priority: 100
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 44.6 C mobo: 21.0 C gpu: amdgpu temp: 33.0 C
  Fan Speeds (rpm): fan-1: 1208
Info:
  Memory: total: 60 GiB note: est. available: 60.4 GiB used: 4.91 GiB (8.1%)
  Processes: 583 Power: uptime: 15h 25m wakeups: 1 Init: systemd v: 256
    target: graphical (5) default: graphical
  Packages: pm: rpm pkgs: N/A note: see --rpm Compilers: N/A Shell: Bash
    v: 5.2.32 running-in: ptyxis-agent inxi: 3.3.37

Output of inxi -Fzxx

Feb 06 20:07:32 kernel: hub 8-0:1.0: config failed, hub doesn't have any ports! (err -19)
Feb 06 20:07:33 kernel: snd_hda_intel 0000:04:00.0: Unknown capability 0
Feb 06 20:07:45 gdm-password][2202]: gkr-pam: unable to locate daemon control file
Feb 06 20:07:45 systemd[2222]: Failed to start app-gnome-gnome\x2dkeyring\x2dpkcs11-2339.scope - Application launched by gnome-session-binary.
Feb 06 20:07:45 systemd[2222]: Failed to start app-gnome-liveinst\x2dsetup-2365.scope - Application launched by gnome-session-binary.
Feb 06 20:07:46 systemd[2222]: Failed to start app-gnome-user\x2ddirs\x2dupdate\x2dgtk-2630.scope - Application launched by gnome-session-binary.
Feb 06 20:07:53 kernel: BTRFS error (device sdb3): bdev /dev/sdb3 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
Feb 06 20:07:53 kernel: BTRFS error (device sdb3): bdev /dev/sdb3 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
Feb 06 20:07:53 kernel: BTRFS error (device sdb3): bdev /dev/sdb3 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
Feb 06 20:14:59 systemd-coredump[4713]: [🡕] Process 3326 (vesktop) of user 1000 dumped core.
                                        
                                        Module libwayland-server.so.0 from rpm wayland-1.23.0-2.fc41.x86_64
                                        Module dri_gbm.so from rpm mesa-24.3.4-3.fc41.x86_64
                                        Module libigdgmm.so.12 from rpm intel-gmmlib-22.5.5-1.fc41.x86_64
                                        Module iHD_drv_video.so from rpm intel-media-driver-free-24.4.4-1.fc41.x86_64
                                        Module libva-drm.so.2 from rpm libva-2.22.0-3.fc41.x86_64
                                        Module libva.so.2 from rpm libva-2.22.0-3.fc41.x86_64
                                        Module libpciaccess.so.0 from rpm libpciaccess-0.16-13.fc41.x86_64
                                        Module libdrm_intel.so.1 from rpm libdrm-2.4.124-1.fc41.x86_64
                                        Module libdrm_amdgpu.so.1 from rpm libdrm-2.4.124-1.fc41.x86_64
                                        Module libelf.so.1 from rpm elfutils-0.192-7.fc41.x86_64
                                        Module libdrm_radeon.so.1 from rpm libdrm-2.4.124-1.fc41.x86_64
                                        Module libxshmfence.so.1 from rpm libxshmfence-1.3.2-5.fc41.x86_64
                                        Module libxcb-sync.so.1 from rpm libxcb-1.17.0-3.fc41.x86_64
                                        Module libxcb-randr.so.0 from rpm libxcb-1.17.0-3.fc41.x86_64
                                        Module libsensors.so.4 from rpm lm_sensors-3.6.0-20.fc41.x86_64
                                        Module libSPIRV-Tools.so from rpm spirv-tools-2024.4-1.fc41.x86_64
                                        Module libxcb-xfixes.so.0 from rpm libxcb-1.17.0-3.fc41.x86_64
                                        Module libxcb-present.so.0 from rpm libxcb-1.17.0-3.fc41.x86_64
                                        Module libxcb-dri3.so.0 from rpm libxcb-1.17.0-3.fc41.x86_64
                                        Module libXxf86vm.so.1 from rpm libXxf86vm-1.1.6-1.fc41.x86_64
                                        Module libxcb-glx.so.0 from rpm libxcb-1.17.0-3.fc41.x86_64
                                        Module libgallium-24.3.4.so from rpm mesa-24.3.4-3.fc41.x86_64

output of journalctl --no-hostname -b -1 -p3

Those btrfs errors are a little concerning. My SSD has been written to a bunch…
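
To see how far the corruption counter climbed during a boot, the "errs:" lines can be reduced to the highest "corrupt" value per device. A minimal sketch, assuming the line layout shown in the journal excerpt above; the function name btrfs_max_corrupt is hypothetical.

```shell
# Hypothetical helper: read journal/dmesg text on stdin and print the highest
# "corrupt" counter seen for each device in BTRFS "errs:" lines.
btrfs_max_corrupt() {
    awk '
        /BTRFS error/ && /errs:/ {
            dev = ""; n = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "bdev")    dev = $(i + 1)   # e.g. /dev/sdb3
                if ($i == "corrupt") n = $(i + 1)     # e.g. "10,"
            }
            sub(/,$/, "", n)                          # drop the trailing comma
            if (dev != "" && n != "" && n + 0 >= max[dev] + 0)
                max[dev] = n + 0
        }
        END { for (d in max) print d, max[d] }
    '
}
```

For example, journalctl --no-hostname -b -1 -p 3 | btrfs_max_corrupt. The btrfs device stats command reports the same kind of per-device counters directly from the filesystem.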


gnome-shell crash in the middle of gaming

I’d recommend mem-testing from Windows (more tools available); HCI’s memtest showed errors overnight for me when memtest86 didn’t, and I had Vulkan crashes with AMD GPUs whereas DX11 stuff was fine with the bad memory config.

In my case I had to lower the memory speed (the 2700X didn’t like 4 sticks at their rated 3666, but it was fine at 3200 with the voltage upped to 1.5V).


At the very least, I installed F41 Workstation last night, and fully-updated I haven’t seen anything crash yet with Intel UHD 630.

I can try lowering the RAM speed. It’s at 5600 currently (the advertised speed), but I can try lowering it to 5000.

Edit: no change. Everything still crashes, even with the speed lowered all the way to 4800.

You might find this more useful (more details):

coredumpctl list

since it shows the SIGnal type (like “SIGABRT”) and you can then use the “info” subcommand to select any of the coredumps by using the PID as the arg:

coredumpctl list

Fri 2025-02-07 16:57:20 EST 9108 1000 1000 SIGABRT present  /usr/bin/kaccess                   -
Fri 2025-02-07 16:57:20 EST 9206 1000 1000 SIGABRT present  /usr/libexec/org_kde_powerdevil    -

coredumpctl info 9206

           PID: 9206 (org_kde_powerde)
           UID: 1000 (user)
           GID: 1000 (user)
        Signal: 6 (ABRT)
     Timestamp: Fri 2025-02-07 16:57:20 EST (1h 23min ago)
  Command Line: /usr/libexec/org_kde_powerdevil
    Executable: /usr/libexec/org_kde_powerdevil
 Control Group: /user.slice/user-1000.slice/user@1000.service/background.slice/plasma-powerdevil.service
          Unit: user@1000.service
     User Unit: plasma-powerdevil.service
         Slice: user-1000.slice
     Owner UID: 1000 (user)
       Boot ID: 3a2774b276844615a6f0b897260494a1
    Machine ID: 606b1acf646145ed8a19cacf0295d31e
      Hostname: host.fios-router.home
       Storage: /var/lib/systemd/coredump/core.org_kde_powerde.1000.3a2774b276844615a6f0b897260494a1.9206.1738965440000000.zst (present)
       Message: Process 9206 (org_kde_powerde) of user 1000 dumped core.
                
                Stack trace of thread 9206:
                #0  0x00007f6c1268027c n/a (n/a + 0x0)
                #1  0x00007f6c12626cbe n/a (n/a + 0x0)
                #2  0x00007f6c1260e6d6 n/a (n/a + 0x0)
                #3  0x00007f6c12c1ac32 n/a (n/a + 0x0)
                #4  0x00007f6c12c6ded8 n/a (n/a + 0x0)
                #5  0x00007f6c12c1c2dc n/a (n/a + 0x0)
                #6  0x00007f6c1342d327 n/a (n/a + 0x0)
                #7  0x00007f6c134e009e n/a (n/a + 0x0)
                #8  0x00007f6c134e0b58 n/a (n/a + 0x0)
                #9  0x00007f6c12cfd625 n/a (n/a + 0x0)
                #10 0x00007f6c134e58fd n/a (n/a + 0x0)
                #11 0x00007f6c134e7379 n/a (n/a + 0x0)
                #12 0x0000558b73aa1a55 n/a (n/a + 0x0)
                #13 0x00007f6c126105f5 n/a (n/a + 0x0)
                #14 0x00007f6c126106a8 n/a (n/a + 0x0)
                #15 0x0000558b73aa2375 n/a (n/a + 0x0)
                ELF object binary architecture: AMD x86-64
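
When the list is long, a per-program tally can make it easier to see whether the crashes cluster in one executable or are spread across everything. A rough sketch, assuming the column layout of the coredumpctl list output shown above (a four-field timestamp, so the signal is field 8 and the executable is field 10); tally_coredumps is a made-up name, and executables with spaces in their path would be miscounted.

```shell
# Hypothetical helper: count coredumps per signal and executable.
# Reads `coredumpctl list` output on stdin and prints "COUNT SIGNAL EXE",
# most frequent first. The header line (starting with TIME) is skipped.
tally_coredumps() {
    awk '$1 != "TIME" && NF >= 10 { n[$8 " " $10]++ }
         END { for (k in n) print n[k], k }' | sort -rn
}
```

Usage would be coredumpctl list | tally_coredumps.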

Does it work with 1 GPU?

Second GPU is part of my CPU and isn’t used.

Not saying it is your problem, but I would use the autoconfigured RAM speeds as stated in How to Configure and Overclock your RAM in the BIO... - AMD Community .
As I understand it there is a relation between RAM speed and CPU cycles.
That guide also recommends putting the RAM in the two furthest slots.

already doing both :frowning:


There are so many, I don’t know where to start :sob:

That looks like a hardware problem; too many unrelated programs are crashing for this to be likely a software issue.

You might want to hold back the core dumps a bit, since a lot of those core files are huge. I create a “drop-in” file at /etc/systemd/coredump.conf.d/override.conf

and put this in that override.conf file:

[Coredump]
#Storage=external
#Compress=yes
# On 32-bit, the default is 1G instead of 32G.
#ProcessSizeMax=32G
#ExternalSizeMax=32G
#JournalSizeMax=767M
MaxUse=100M
#KeepFree=
#EnterNamespace=no

That MaxUse=100M keeps the total disk usage of core files to 100M or less. See “man coredump.conf” and “man coredump.conf.d” for more information, such as what MaxUse and KeepFree do. Pick whatever “MaxUse” size you want; old core files are removed once “MaxUse” is exceeded.

Thanks, this helps a lot. I’m going to run a SMART self test on my boot SSD, since I have a sneaking suspicion that that might be part of it.

I appreciate all the help so far

edit: CPU appears to be fine, and not causing stability problems. Ran a prime number benchmark on all cores and it didn’t bat an eye. I’m still thinking storage, since I also ran a RAM test with the memtester package and it tested good.

edit2: SMART self test returned good.

[ 5415.269141] BTRFS info (device sdb3): scrub: started on devid 1
[ 5421.083726] BTRFS error (device sdb3): unable to fixup (regular) error at logical 1736048640 on dev /dev/sdb3 physical 2818179072
[ 5421.083726] BTRFS error (device sdb3): unable to fixup (regular) error at logical 1735852032 on dev /dev/sdb3 physical 2817982464
[ 5421.084133] BTRFS warning (device sdb3): checksum error at logical 1736048640 on dev /dev/sdb3, physical 2818179072, root 256, inode 56377, offset 770048, length 4096, links 1 (path: lettuce/.local/share/Steam/steamapps/common/Steamworks Shared/_CommonRedist/DirectX/Jun2010/Mar2009_d3dx9_41_x86.cab)
[ 5421.084133] BTRFS warning (device sdb3): checksum error at logical 1735852032 on dev /dev/sdb3, physical 2817982464, root 256, inode 56377, offset 573440, length 4096, links 1 (path: lettuce/.local/share/Steam/steamapps/common/Steamworks Shared/_CommonRedist/DirectX/Jun2010/Mar2009_d3dx9_41_x86.cab)
[ 5421.762571] BTRFS error (device sdb3): unable to fixup (regular) error at logical 2096431104 on dev /dev/sdb3 physical 3178561536
[ 5421.762911] BTRFS warning (device sdb3): checksum error at logical 2096431104 on dev /dev/sdb3, physical 3178561536, root 256, inode 56296, offset 1601536, length 4096, links 1 (path: lettuce/.local/share/Steam/steamapps/common/Steamworks Shared/_CommonRedist/DirectX/Jun2010/APR2007_d3dx9_33_x64.cab)
[ 5422.602081] BTRFS error (device sdb3): unable to fixup (regular) error at logical 2532179968 on dev /dev/sdb3 physical 3614310400
[ 5422.602429] BTRFS warning (device sdb3): checksum error at logical 2532179968 on dev /dev/sdb3, physical 3614310400, root 257, inode 21561, offset 4096, length 4096, links 1 (path: usr/lib/modules/6.11.4-301.fc41.x86_64/kernel/drivers/gpu/drm/gma500/gma500_gfx.ko.xz)
[ 5432.324246] BTRFS error (device sdb3): unable to fixup (regular) error at logical 8548974592 on dev /dev/sdb3 physical 9631105024
[ 5432.326703] BTRFS warning (device sdb3): checksum error at logical 8548974592 on dev /dev/sdb3, physical 9631105024, root 257, inode 305546, offset 8192, length 4096, links 1 (path: usr/lib/firmware/nvidia/gp10b/gr/fecs_inst.bin.xz)
[ 5432.765751] BTRFS error (device sdb3): unable to fixup (regular) error at logical 8774156288 on dev /dev/sdb3 physical 9856286720
[ 5432.765799] BTRFS warning (device sdb3): checksum error at logical 8774156288 on dev /dev/sdb3, physical 9856286720, root 256, inode 8531, offset 24547328, length 4096, links 1 (path: lettuce/.cache/vesktop-updater/pending/vesktop-1.5.5.x86_64.rpm)

Tried a btrfs scrub start on my SSD and got this in dmesg. Doesn’t look good to my untrained eye.
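
Each "checksum error" warning names the file that owns the damaged block, so the log can be boiled down to a list of affected files to delete or reinstall. A minimal sketch, assuming the "(path: ...)" format shown in the dmesg output above; scrub_damaged_paths is a hypothetical name, and a path containing a closing parenthesis would be cut short.

```shell
# Hypothetical helper: read dmesg text on stdin and print the unique file
# paths named in BTRFS scrub "checksum error" warnings.
# Paths are printed as logged, i.e. relative to their subvolume.
scrub_damaged_paths() {
    grep -o '(path: [^)]*)' | sed -e 's/^(path: //' -e 's/)$//' | sort -u
}
```

Usage would be something like sudo dmesg | scrub_damaged_paths after the scrub finishes.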

I got burned so many times by Btrfs in the past that I decided not to revisit it for the next 10 years or so, until I can trust my data to it.

Since moving on from Universal Blue to NixOS a year ago I opted to use the trusty ext4 filesystem and I don’t regret my decision a bit.

There’s a nasty bug with kernel 6.13 and the Intel drivers (a regression fixed in 6.14) that completely locked up my laptop a couple of times, and I had to force hard shutdowns. Upon booting back, ext4 was able to self-heal both times, using the journaling data.

I don’t doubt Btrfs is an advanced filesystem with tons of fancy features, but in my personal experience it has always been unreliable. I believe it makes sense to use it when implementing a snapshots/rollback system like openSUSE has, but otherwise I learned to stay as far away from it as possible.

DYOR, of course, but I’d recommend you consider using anything other than Btrfs next time you install your system if you will not use its advanced features!

HTH

Yeah, it’s looking like I will be installing Ubuntu and testing that to see if it will work for my use case here.

Sounds like it is worth trying. You have been tenacious with this issue, so I hope to see you back here one day.

You can just install Fedora with EXT4 partitions though.