Suspend failure and degraded performance after failed suspend on kernel 6.10.3

$ sudo lspci -nn -vv -s 00:1f.6
00:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:550a] (rev 20)
        DeviceName: Ethernet controller
        Subsystem: CLEVO/KAPOK Computer Device [1558:a743]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin D routed to IRQ 194
        IOMMU group: 15
        Region 0: Memory at b54a0000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee00c18  Data: 0000
        Kernel driver in use: e1000e
        Kernel modules: e1000e

I can reproduce the issue on Kernel 6.10.5, which just rolled out.

Could you also post the output of inxi -MCnzxx, please? This collects info about the computer model, the CPU, and the network devices again.

The last commit to the e1000e driver was on 2024-07-10, a few days before 6.10 was released.

I think you need to open a ticket at http://bugzilla.redhat.com
for Product: Fedora / Component: kernel, describe the issue there, and also refer to the ticket from the other post.

Please also post or attach the output of the inxi command from above.
I think it is important to mention that you see this regression on a Meteor Lake platform.

Please also provide the relevant journalctl lines for a failed suspend attempt, in text form.
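
A minimal sketch of how those lines could be collected, assuming the failed suspend happened during the previous boot (change -b -1 if not). The helper name filter_suspend_lines and the grep pattern are only illustrative assumptions, so widen the pattern if it misses something relevant.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only log lines that likely relate to suspend
# or to the e1000e driver. The pattern is an assumption, not a complete list.
filter_suspend_lines() {
    grep -iE 'e1000e|PM: |suspend'
}

# Usage (kernel messages of the previous boot, saved as plain text):
#   journalctl -k -b -1 --no-pager | filter_suspend_lines > suspend-failure.txt
```

The resulting text file can then be attached to the bug report instead of a screenshot.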

$ inxi -MCnzxx
Machine:
  Type: Laptop System: Notebook product: V54x_6x_TU v: V540TU
    serial: <superuser required> Chassis: type: 10 serial: <superuser required>
  Mobo: Notebook model: V54x_6x_TU v: V540TU serial: <superuser required>
    UEFI: 3mdeb v: Dasharo (coreboot+UEFI) v0.9.0 date: 07/17/2024
CPU:
  Info: 16-core (6-mt/10-st) model: Intel Core Ultra 7 155H bits: 64
    type: MST AMCP arch: Meteor Lake rev: 4 cache: 24 MiB note: check
  Speed (MHz): avg: 987 high: 2003 min/max: 400/4500:4800:3800:2500 cores:
    1: 2003 2: 1700 3: 400 4: 400 5: 1208 6: 1571 7: 1924 8: 400 9: 1770 10: 400
    11: 1952 12: 400 13: 999 14: 999 15: 1000 16: 1000 17: 1002 18: 400
    19: 1000 20: 400 21: 400 22: 400 bogomips: 131788
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Network:
  Device-1: Intel Meteor Lake PCH CNVi WiFi driver: iwlwifi v: kernel
    bus-ID: 00:14.3 chip-ID: 8086:7e40
  IF: wlp0s20f3 state: up mac: <filter>
  Device-2: Intel vendor: CLEVO/KAPOK driver: e1000e v: kernel port: N/A
    bus-ID: 00:1f.6 chip-ID: 8086:550a
  IF: eno0 state: down mac: <filter>

I have submitted the bug to Red Hat.
https://bugzilla.redhat.com/show_bug.cgi?id=2306163

I can confirm the issue is still present in kernel 6.10.6. There are no updates to the Red Hat bug report yet.

A possible fix needs to land in the development branch first (currently that is 6.11, in rawhide or fc41, since the branch was created a few days ago).

There are no updates to the red hat bug report yet.

It could help if you added the journalctl output of a failed suspend attempt and referenced the kernel.org ticket, as I asked.

These are very busy people and they won’t browse through a thread on askfedora.
Put all the relevant info in a compact form. It’s still possible that you won’t get any feedback, because it’s probably already a known issue with no easy fix available at the moment. We don’t know.

There are 6 LHDB probes for Notebook V54x_6x_TU. Most are using 6.6.21-yocto-standard kernels. The Yocto project targets IoT devices, so those kernels likely use different power management strategies than laptops from major vendors.

According to the LHDB, the pci:8086:550a:1558:a743 ethernet controller is unique to this model. If it isn’t soldered to the system board, a cheap workaround would be to replace the controller with one known to work with Fedora 40.

I guess the controller’s PCI ID needs to be excluded from some kind of attempted fix for Meteor Lake platforms that was merged in the 6.10 kernels.
The controller works flawlessly with 6.9 kernels.

modinfo e1000e has:

parm:           SmartPowerDownEnable:Enable PHY smart power down (array of int)

From https://github.com/Joursoir/e1000e:

SmartPowerDownEnable
--------------------
Valid Range: 0-1
Allows Phy to turn off in lower power states. The user can turn off this
parameter in supported chipsets.
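
If someone wants to test whether this parameter makes a difference, a rough sketch follows. It is untested against this bug, and nothing in this thread confirms that it affects the suspend failure. The helper name e1000e_spd_option and the file name /etc/modprobe.d/e1000e.conf are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a modprobe.d options line for e1000e.
# Only 0 and 1 are accepted, matching the documented valid range.
e1000e_spd_option() {
    case "$1" in
        0|1) printf 'options e1000e SmartPowerDownEnable=%s\n' "$1" ;;
        *)   echo "value must be 0 or 1" >&2; return 1 ;;
    esac
}

# Usage (file name is an assumption; a module reload or reboot is needed
# afterwards for the option to apply):
#   e1000e_spd_option 0 | sudo tee /etc/modprobe.d/e1000e.conf
```

Removing the file again restores the driver default.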

Thank you. I did provide the full journalctl output in a GitHub gist attached to my Red Hat bug report, because the character limit prevented me from pasting it into a comment. Is there anything else I can do to help?

I can reproduce the issue on Kernel 6.10.7.