Laptop fails to suspend when some flatpaks are open

I’m using Fedora 43 (KDE edition) on my ASUS ROG G752VT notebook with an NVIDIA GeForce GTX 970M. The akmod-nvidia drivers are installed in version “580.119.02”. I have an external monitor connected that is set as my primary display. My sleep mode is set to “deep”. The nvidia sleep/resume/hibernate services are enabled.

Recently, a wakeup issue I had fixed itself magically, and I can usually suspend fine. However, I noticed that the laptop sometimes fails to go to sleep. The screens go black and the fans keep running, but the laptop doesn’t respond to input or wake up, forcing me to hard-reset.

Through trial and error, I figured out that this occurs only when some flatpak apps are open (only one of them needs to be open). The issue occurs with:

  • Fastmail
  • Haruna Media Player
  • Elisa Audio Player
  • Ungoogled Chromium
  • VS Codium

Flatpaks that did not have the issue:

  • KolourPaint
  • Gimp
  • Discord

I haven’t found any non-Flatpak apps that cause the issue.

The journalctl output is here: Feb 02 20:21:40 rog-ultramarine systemd[1833]: Started dbus-:1.2-org.fcitx.Fcitx - Pastebin.com
The applications in question don’t seem to produce any prominent lines, but there are several instances of flatpak saying that it failed to contact KWallet, along with other flatpak errors.

Does anyone have an idea on what causes this or how to fix this? Any help is greatly appreciated.

Lines 750 onwards imply you hit a kernel bug.

Feb 02 20:26:51 rog-ultramarine kernel: list_add corruption. prev is NULL.
Feb 02 20:26:51 rog-ultramarine kernel: ------------[ cut here ]------------
Feb 02 20:26:51 rog-ultramarine kernel: kernel BUG at lib/list_debug.c:25!
Feb 02 20:26:51 rog-ultramarine kernel: fbcon: Taking over console
Feb 02 20:26:51 rog-ultramarine kernel: Oops: invalid opcode: 0000 [#1] SMP PTI
Feb 02 20:26:51 rog-ultramarine systemd[1]: nvidia-suspend.service: Main process exited, code=killed, status=11/SEGV
Feb 02 20:26:51 rog-ultramarine kernel: CPU: 4 UID: 0 PID: 4076 Comm: nvidia-sleep.sh Tainted: P           OE       6.18.7-200.fc43.x86_64 #1 PREEMPT(lazy) 
Feb 02 20:26:51 rog-ultramarine systemd[1]: nvidia-suspend.service: Failed with result 'signal'.
Feb 02 20:26:51 rog-ultramarine kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Feb 02 20:26:51 rog-ultramarine kernel: Hardware name: ASUSTeK COMPUTER INC. G752VT/G752VT, BIOS G752VT.307 04/26/2019

It never suspends because nvidia-suspend.service gets killed.
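For anyone who wants to find these lines in their own journal without scrolling to line 750, here is a minimal Python sketch. The patterns are copied from the excerpt above; the function name `find_suspend_failures` is made up for illustration, and the list of patterns is an assumption about what is worth flagging, not a complete list.

```python
import re

# Patterns taken from the log excerpt in this thread; extend as needed.
PATTERNS = [
    re.compile(r"kernel BUG at"),
    re.compile(r"list_add corruption"),
    re.compile(r"nvidia-suspend\.service: (Main process exited|Failed with result)"),
]


def find_suspend_failures(lines):
    """Return the journal lines that match any of the failure patterns."""
    return [line.rstrip("\n") for line in lines
            if any(p.search(line) for p in PATTERNS)]


# Example with three lines from the excerpt; only two of them are flagged.
sample = [
    "Feb 02 20:26:51 rog-ultramarine kernel: kernel BUG at lib/list_debug.c:25!",
    "Feb 02 20:26:51 rog-ultramarine kernel: fbcon: Taking over console",
    "Feb 02 20:26:51 rog-ultramarine systemd[1]: nvidia-suspend.service: Failed with result 'signal'.",
]
for hit in find_suspend_failures(sample):
    print(hit)
```

To run it on a real log, save the output of `journalctl -b -1` (the previous boot) to a text file and pass the opened file to the function instead of `sample`.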

A kernel bug sounds pretty serious. What can I do in this situation?
Should I report this to NVIDIA? But I know they don’t do bugfixes for my graphics card model anymore due to another recent incident.

For now I’ll try to boot the older kernel from grub and report back.

Reporting bugs — The Linux Kernel documentation for how to report kernel bugs.

You could try reporting it to Nvidia, but you might as well tell my mother all about it too, for the attention it’ll get and she’s been dead for 12 years.

You could also report it to the ultramarine developers… you may get more interest than my mother and nvidia will pay it! :wink:

Thank you!

For the record, I tried booting the previous kernel from grub (6.17.11-300) and it didn’t change the behavior.

Regarding the kernel bug report, it seems my new kernel (6.18.7-200) is not among the “supported kernels” on kernel.org, only 6.18.8 is. That means I can’t report the bug to the kernel directly, right? Unless I install one of the supported kernels.

Regardless of version, you have the Nvidia driver kernel modules installed, so the kernel is “tainted” and the kernel developers won’t accept reports against it.

That’s something you could try though - switch to the nouveau drivers and that same kernel and see if you get the same issue with those flatpaks.

If you do, then it’s a kernel bug and you can report it against an untainted kernel.
If you don’t, the nvidia driver is the root cause, and it may well get fixed in an nvidia driver update.

Ok, I’ll try that. I generally don’t like nouveau because it caused me severe issues before, but if I avoid those, I’m sure I can test it.

But do I also need to install a supported kernel, or is the 6.18.7 fine?

Aye - I feel your reluctance with the nouveau drivers - I’ve always had woefully poor performance from them, but it would just be for the duration of a test to prove whether the kernel or the Nvidia drivers are at fault. Switch back as soon as you like. Worst that happens is that your suspend process is a PITA depending on what you have running.

I suspect you’ll be fine with the 6.18.8 kernel, but if you want to play it entirely safe, boot 6.18.7 with a temporary blacklist on nvidia and un-blacklist nouveau for that boot only via the grub kernel params, i.e. select the kernel in grub, hit e, adjust the params, boot that one time only with the drivers swapped, do the test, reboot and you’re back to “normal”; that minimises the faffing about you have to do.
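To make the parameter swap concrete, here is a rough Python sketch that rewrites a kernel command line the way you would by hand in the grub editor. It assumes the current command line blacklists nouveau through `rd.driver.blacklist=` and/or `modprobe.blacklist=`, which is common with the packaged Nvidia driver but may not match your system, and the function name `swap_driver_params` is made up for illustration. It only covers the command line; config files that also blacklist nouveau are out of scope.

```python
# Kernel modules shipped by the proprietary driver (blacklisted for the test boot).
NVIDIA_MODULES = ["nvidia", "nvidia_drm", "nvidia_modeset", "nvidia_uvm"]
BLACKLIST_KEYS = ("rd.driver.blacklist", "modprobe.blacklist")


def swap_driver_params(cmdline):
    """Return cmdline with nouveau un-blacklisted and the nvidia modules blacklisted."""
    kept = []
    blacklists = {key: [] for key in BLACKLIST_KEYS}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in blacklists:
            # Keep any other blacklisted modules, drop only nouveau.
            blacklists[key] += [m for m in value.split(",") if m and m != "nouveau"]
        else:
            kept.append(token)
    for key in BLACKLIST_KEYS:
        extra = [m for m in NVIDIA_MODULES if m not in blacklists[key]]
        kept.append(key + "=" + ",".join(blacklists[key] + extra))
    return " ".join(kept)


# Hypothetical command line, for illustration only.
print(swap_driver_params(
    "ro rhgb quiet rd.driver.blacklist=nouveau modprobe.blacklist=nouveau"
))
```

The printed result is what the end of the `linux` line in the grub editor would look like for the one-off test boot. Nothing is written to disk, so a normal reboot restores the usual parameters.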

Oh, performance is the least of my worries with nouveau.

Nouveau:

  • kills my KWin constantly if my external monitor is connected
  • causes my screen to remain black indefinitely upon wake up
  • does not allow Electron apps (Fastmail, VS Codium) to properly render. They are just windows without content.

I have now tried with the nouveau drivers and can confirm that the specific “not sleeping properly” bug is a problem of the Nvidia drivers, as the laptop suspends properly every time with nouveau. (Well, it doesn’t wake up afterwards, but it does finish suspending :wink: )

Which means there’s basically no chance of it getting fixed unless it also occurs in later Nvidia generations, because I know for a fact that Nvidia doesn’t care about bugs in my generation of GPU. And I can’t report it to the kernel developers because the kernel is “tainted”. (That was the first I’ve heard of that, btw. It sounds pretty dramatic.)

It just means that the kernel devs can’t be sure that what you’re reporting is actually a problem with their code (as in this case, even though it triggers a kernel bug report), as they aren’t responsible for all the other code which might be running. It also means they can’t replicate the issue easily, as they have no idea what else you might have installed, and if they cannot replicate it, they cannot be sure they have corrected it.

If you look at the start of a journal log, you should see that the kernel is flagged as tainted and at that point all bets are off for reporting kernel bugs.
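The taint state can also be read as a number from `/proc/sys/kernel/tainted`, where each bit stands for one taint reason. Here is a small Python sketch that decodes just the three flags shown in the log above (P, O and E). The bit positions follow the kernel’s tainted-kernels documentation as far as I know it, the table is deliberately incomplete, and the helper names are made up for illustration.

```python
# Subset of the kernel taint bits: only the three flags seen in this thread.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    12: ("O", "externally-built (out-of-tree) module was loaded"),
    13: ("E", "unsigned module was loaded"),
}


def decode_taint(value):
    """Return (letter, reason) pairs for the known bits set in value."""
    return [TAINT_FLAGS[bit] for bit in sorted(TAINT_FLAGS) if value & (1 << bit)]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint bitmask from procfs (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


# 1 + 4096 + 8192: the P, O and E bits together.
for letter, reason in decode_taint(12289):
    print(letter, "-", reason)
```

On a running system, `decode_taint(read_taint())` gives the flags for the current boot; a value of 0 means the kernel is not tainted.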

Ah, you misunderstood me. I understood very quickly what “tainted” meant and why I can’t report it when I first read the reply from “P G”. My remark was mainly about how the kernel being “tainted” sounds very dramatic.

Anyway, I guess I’ll give up on this issue and just never suspend my laptop. Maybe I’ll get a new PC after all.

You’ve been very helpful, Steve. I appreciate you taking the time out of your day to help me with my problem.

“Refurbished” enterprise-grade PCs are currently in oversupply due to workforce reductions and large enterprises upgrading for AI workloads. My experience is that it takes some time for Linux support to kick in for new PCs.