Gnome-shell crashes in libdrm_nouveau.so.2

Ever since I switched to Wayland, I have experienced occasional gnome-shell crashes.
Sometimes none in a week, sometimes 5 or 6 or more. Today I had 2 such crashes.

Process 3208 (gnome-shell) of user 1000 dumped core.

0x00007fe9f80e6b98 pushbuf_kref (libdrm_nouveau.so.2 + 0x5b98)
0x00007fe9f80e72ec pushbuf_validate (libdrm_nouveau.so.2 + 0x62ec)
0x00007fe9ce48ac79 nvc0_flush (nouveau_dri.so + 0xa8ac79)

Does anyone know how I should proceed to have a chance of getting this fixed in the future?

What GPU model do you have?

You can file a bug against nouveau: MesaDrivers · freedesktop.org

Depending on your nvidia GPU model, you may want to consider using the proprietary nvidia driver.

Thank you for the response:

Device-1: Intel CoffeeLake-H GT2 [UHD Graphics 630] driver: i915 v: kernel
Device-2: NVIDIA TU117GLM [Quadro T1000 Mobile] driver: nouveau v: kernel
Device-3: Cheng Uei Precision Industry (Foxlink) HP Wide Vision HD

The proprietary driver is too problematic for me. I have Secure Boot and a few other things enabled, and using the Nvidia drivers was too much hassle…

You could maybe disable the nvidia GPU and just run on the Intel IGP.
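One common way to keep nouveau from loading on Fedora is to blacklist it on the kernel command line with grubby. The sketch below only prints the command so you can review it before running anything. The helper name `grubby_blacklist_cmd` is made up for illustration, and whether blacklisting is a good idea on this particular laptop is an assumption you would need to test.

```shell
# Hypothetical helper: print (do not run) a grubby command that adds or
# removes kernel arguments blacklisting a module for all installed kernels.
grubby_blacklist_cmd() {
    action=$1 mod=$2   # action: "add" or "remove"
    case $action in
        add)    flag=--args ;;
        remove) flag=--remove-args ;;
        *)      echo "usage: grubby_blacklist_cmd add|remove MODULE" >&2; return 2 ;;
    esac
    printf 'sudo grubby --update-kernel=ALL %s="rd.driver.blacklist=%s modprobe.blacklist=%s"\n' \
        "$flag" "$mod" "$mod"
}

# Show the command that would blacklist nouveau.
grubby_blacklist_cmd add nouveau
```

If you run the printed command, a reboot is needed for it to take effect, and the `remove` variant prints the command that undoes it.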

The good news is that your card is Turing-based, so it should be supported by nvidia’s new firmware-heavy open kernel module, and thus nouveau (with NVK) should be able to properly drive the hardware in the future using that same firmware.

I think I am running on Intel. I am not sure why nouveau gets loaded. How do you suggest I disable it?

Also, the crash is now worse: everything dies and the screen freezes completely:

BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 9 PID: 2615 Comm: gnome-shell Tainted: G O 6.7.9-200.fc39.x86_64 #1
Hardware name: HP HP ZBook 15 G6/860F, BIOS R92 Ver. 01.20.01 06/30/2022
RIP: 0010:gp100_vmm_pgt_mem+0xbb/0x170 [nouveau]
Code: 8b 46 58 48 01 c2 48 09 c3 49 89 56 58 45 01 e5 41 0f b7 47 12 49 8b 7f 08 89 da 42 8d 2c e0 48 8b 47 08 41 83 c4 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7f 08 48 89 d9 48 8d 75 04 48 c1
RSP: 0000:ffffa45c0305f850 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00000000000c8001 RCX: 0000000000000001
RDX: 00000000000c8001 RSI: 0000000000000030 RDI: ffff95a79757f280
RBP: 0000000000000030 R08: ffffa45c0305faa8 R09: 0000000000000004
R10: ffff95a7970c9c60 R11: ffff95a78d7d8c00 R12: 0000000000000007
R13: 000000000000000a R14: ffffa45c0305faa8 R15: ffff95a797580a80
FS: 00007f131c5aa640(0000) GS:ffff95b2cd640000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000010f370005 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:

? __die+0x23/0x70
? page_fault_oops+0x171/0x4e0
? exc_page_fault+0x7f/0x180
? asm_exc_page_fault+0x26/0x30
? gp100_vmm_pgt_mem+0xbb/0x170 [nouveau]
nvkm_vmm_iter.isra.0+0x2f7/0x890 [nouveau]
? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
nvkm_vmm_ptes_get_map+0xb1/0xf0 [nouveau]
? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
nvkm_vmm_map_locked+0x219/0x390 [nouveau]
nvkm_vmm_map+0x89/0xe0 [nouveau]
nvkm_vram_map+0x5a/0x80 [nouveau]
gf100_mem_map+0xc7/0x170 [nouveau]
nvkm_umem_map+0x69/0x100 [nouveau]
nvkm_ioctl_map+0x7e/0xf0 [nouveau]
nvkm_ioctl+0x10b/0x250 [nouveau]
nvif_object_map_handle+0xc8/0x180 [nouveau]
nouveau_ttm_io_mem_reserve+0x189/0x2e0 [nouveau]
ttm_bo_vm_fault_reserved+0xa7/0x3b0 [ttm]
? mmap_region+0x716/0x960
nouveau_ttm_fault+0x69/0xa0 [nouveau]
__do_fault+0x30/0x130
do_fault+0x7e/0x460
__handle_mm_fault+0x782/0xdb0
handle_mm_fault+0x17f/0x360
do_user_addr_fault+0x1e2/0x670
exc_page_fault+0x7f/0x180
asm_exc_page_fault+0x26/0x30
RIP: 0033:0x7f1321b7cc07

glxinfo | grep 'OpenGL renderer'
OpenGL renderer string: Mesa Intel(R) UHD Graphics 630 (CFL GT2)
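The glxinfo line shows which GPU is rendering, not which kernel driver is bound to each device. A minimal sketch for checking the latter from `lspci -k` output follows. The function name `gpu_drivers` is made up for illustration, and it assumes the usual `lspci -k` layout (one unindented line per device, then indented detail lines).

```shell
# Hypothetical helper: read `lspci -k` output on stdin and print
# "<device line> -> <driver>" for every GPU that has a driver bound.
gpu_drivers() {
    awk '
        # A non-indented line starts a new PCI device; remember it only if it is a GPU.
        /^[^ \t]/ {
            dev = ($0 ~ /VGA compatible controller|3D controller|Display controller/) ? $0 : ""
            next
        }
        # Indented detail line of the current GPU naming the bound driver.
        dev != "" && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[ \t]*/, "")
            print dev " -> " $0
        }
    '
}

# Usage (on the affected machine):
#   lspci -k | gpu_drivers
```

If the NVIDIA device shows up with nouveau bound, the driver is active even though rendering happens on the Intel GPU.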

The nouveau driver does not properly support the NVIDIA Quadro T1000 Mobile GPU, so it is liable to trigger crashes. The same would be true if no driver were installed or if you managed to disable the device.

The fix is relatively simple, but it does require a bit of work at the command line.

  1. First install akmods with sudo dnf install akmods

  2. Follow the steps shown in the file /usr/share/doc/akmods/README.secureboot so the system is prepared to sign the nvidia modules when they are installed. This will require using sudo with each command listed there.

  3. Ensure the rpmfusion-nonfree-nvidia-driver repo is enabled: run dnf repolist and verify that the repo appears in the list. If it does not, enable it in the GNOME Software app through the third-party repositories list (hamburger menu at the top right).

  4. Install the nvidia drivers with sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda.

  5. Wait about 5 minutes for the modules to be built, then reboot. The nvidia drivers should now load even with Secure Boot enabled.
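Instead of waiting a fixed five minutes in step 5, you can poll until the module has been built. The sketch below relies on `modinfo -F version nvidia`, which prints a version once the module is installed. The function name `wait_for_module` and its retry parameters are made up for illustration.

```shell
# Hypothetical helper: poll until modinfo can see the module, then print
# "<module> <version>". Gives up after TRIES attempts, INTERVAL seconds apart.
wait_for_module() {
    mod=$1
    tries=${2:-60}      # default: 60 attempts
    interval=${3:-10}   # default: 10 seconds between attempts
    while [ "$tries" -gt 0 ]; do
        if ver=$(modinfo -F version "$mod" 2>/dev/null); then
            echo "$mod $ver"
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    echo "$mod not built yet" >&2
    return 1
}

# Usage (after step 4):
#   wait_for_module nvidia && sudo reboot
```

With the defaults this waits up to about ten minutes, which leaves some margin over the five minutes suggested above.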
