Low Nvidia performance with newer kernels

You are running Wayland and I use Xorg.

ok, thanks :slight_smile:

The thing to note for me is ‘LnkCap2’ in lspci -vv

LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-

It shows the supported link speeds for the GPU. That line is entirely absent from the lspci output on kernel 6.13, and the bridge devices are likewise stuck at Gen 1.
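
For comparison, a hedged way to check both the GPU and the bridge above it (the bridge address 00:01.0 is an assumption; lspci -t shows the actual topology):

# Illustrative only: 01:00.0 is the GPU address seen later in the thread;
# 00:01.0 is a guessed root port (confirm the real address with lspci -t).
sudo lspci -vv -s 00:01.0 | grep -e LnkCap -e LnkSta
sudo lspci -vv -s 01:00.0 | grep -e LnkCap -e LnkSta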

Try using sudo

$ uname -r
6.14.0-63.fc42.x86_64
$ sudo lspci -vv -s $(lspci | grep -i VGA | awk '{print $1}') |grep -e LnkCap -e LnkCtl
		LnkCap:	Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
		LnkCtl:	ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk+
		LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
		LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
		LnkCtl3: LnkEquIntrruptEn- PerformEqu-

sudo was already used in both cases.

So I’ve installed the Rawhide kernel using the Rawhide-Nodebug repo. The Nvidia proprietary driver no longer seems to work, with lspci -vv now listing nouveau as the kernel module.

However, the Link speeds and Caps are now within expected values. Is the PCIe bandwidth issue related specifically to the proprietary driver?

01:00.0 VGA compatible controller: NVIDIA Corporation GA104M [GeForce RTX 3070 Mobile / Max-Q] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 3833
	Physical Slot: 1
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 0
	IOMMU group: 16
	Region 0: Memory at 60000000 (32-bit, non-prefetchable) [disabled] [size=16M]
	Region 1: Memory at 6000000000 (64-bit, prefetchable) [disabled] [size=256M]
	Region 3: Memory at 6010000000 (64-bit, prefetchable) [disabled] [size=32M]
	Region 5: I/O ports at 5000 [disabled] [size=128]
	Expansion ROM at 61000000 [virtual] [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [78] Express (v2) Legacy Endpoint, IntMsgNum 0
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ TEE-IO-
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s (downgraded), Width x8 (downgraded)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp+ 10BitTagReq+ OBFF Via message, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
			 AtomicOpsCtl: ReqEn-
			 IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
			 10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
		LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [b4] Vendor Specific Information: Len=14 <?>
	Capabilities: [100 v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [250 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [258 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us LTR1.2_Threshold=0ns
		L1SubCtl2: T_PwrOn=10us
	Capabilities: [128 v1] Power Budgeting <?>
	Capabilities: [420 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
			ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
			PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP-
			ECRC- UnsupReq- ACSViol- UncorrIntErr- BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
			PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+
			ECRC- UnsupReq- ACSViol- UncorrIntErr+ BlockedTLP- AtomicOpBlocked- TLPBlockedErr-
			PoisonTLPBlocked- DMWrReqBlocked- IDECheck- MisIDETLP- PCRC_CHECK- TLPXlatBlocked-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CorrIntErr- HeaderOF-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+ CorrIntErr- HeaderOF+
		AERCap:	First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900 v1] Secondary PCI Express
		LnkCtl3: LnkEquIntrruptEn- PerformEqu-
		LaneErrStat: 0
	Capabilities: [bb0 v1] Physical Resizable BAR
		BAR 0: current size: 16MB, supported: 16MB
		BAR 1: current size: 8GB, supported: 64MB 128MB 256MB 512MB 1GB 2GB 4GB 8GB
		BAR 3: current size: 32MB, supported: 32MB
	Capabilities: [c1c v1] Physical Layer 16.0 GT/s <?>
	Capabilities: [d00 v1] Lane Margining at the Receiver
		PortCap: Uses Driver+
		PortSta: MargReady- MargSoftReady-
	Capabilities: [e00 v1] Data Link Feature <?>
	Kernel modules: nouveau

The nvidia driver doesn’t support the 6.15rc kernel.

If you want to use nvidia with the 6.15rc kernel, you will need to use the beta driver from F43, as I have zero intention of backporting it to the stable releases.

At this point, I’m just tired and frustrated. I’ve also been posting on Nvidia forums, but so far no one there knows what’s going on either.

UPDATE: Nvidia driver 575 has fixed the PCIe bandwidth issue! I’m now getting the full 16GT/s with kernel 6.14.4.

UPDATE 2: Unfortunately, the issue has returned. I had full PCIe performance right up until I ran my usual optimisation script, which also runs autoaspm.
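
If autoaspm is the suspect, a couple of quick checks could help narrow it down (a hedged sketch; 01:00.0 is the GPU address from the lspci output above, and the sysfs path is the kernel's standard ASPM policy knob):

# Show the current kernel ASPM policy and the GPU's ASPM/link state
cat /sys/module/pcie_aspm/parameters/policy
sudo lspci -vv -s 01:00.0 | grep -e ASPM -e LnkSta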

UPDATE 3: I appear to have replicated the situation in which I get full PCIe performance.

  1. Disable the rpmfusion rawhide repo
  2. Downgrade to 570 and reboot to build kernel modules
  3. Reboot into affected kernel
  4. Reenable rpmfusion rawhide and update nvidia drivers
  5. Reboot again, and check with lspci -vv and vkcube (rough commands for these steps are sketched below)
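
A rough sketch of those steps as commands (the repo id, package globs, and dnf5 syntax are all assumptions; adjust to match your system):

sudo dnf config-manager setopt rpmfusion-nonfree-rawhide.enabled=0   # 1. disable the rawhide repo
sudo dnf downgrade akmod-nvidia\* xorg-x11-drv-nvidia\*              # 2. drop back to 570
sudo reboot                                                          #    let akmods rebuild the modules
# 3. pick the affected kernel from the boot menu
sudo dnf config-manager setopt rpmfusion-nonfree-rawhide.enabled=1   # 4. re-enable rawhide and update the driver
sudo dnf upgrade akmod-nvidia\* xorg-x11-drv-nvidia\*
sudo reboot                                                          # 5. then verify
sudo lspci -vv -s 01:00.0 | grep LnkSta
vkcube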

I also get this AFTER the upgrade, if it helps:

$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX Open Kernel Module for x86_64  570.144  Release Build  (dvs-builder@U22-I3-AF02-07-1)  Thu Apr 10 20:13:56 UTC 2025
GCC version:  gcc version 15.1.1 20250425 (Red Hat 15.1.1-1) (GCC) 

Also, the above output says it’s the open kernel module, even though I installed akmod-nvidia and not akmod-nvidia-open. How do I make it use the proprietary module?

Post the output of

rpm -qa akmod-nvidia\*
$ rpm -qa akmod-nvidia\*
akmod-nvidia-575.51.02-4.fc43.x86_64

In addition: Nvidia kmods are now rebuilt. No Kernel/Userspace driver mismatches at the moment.

akmod-nvidia-575.51.02-4 has detection for the open driver; if your card supports the open driver, it will default to it automatically.


Maybe try this to use the proprietary module

sudo sh -c 'echo "%_without_kmod_nvidia_detect  1" > /etc/rpm/macros.nvidia-kmod'
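
After changing that macro, the kmod presumably needs to be rebuilt before the change takes effect; a hedged sketch, not taken from the thread:

sudo akmods --force   # force a rebuild of the nvidia kmod for the installed kernels
sudo reboot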

The file already exists with this parameter in it.

Yes, it was indeed 0:

%_with_kmod_nvidia_open 0

%if 0%{?_with_kmod_nvidia_open:1} checks whether the variable is defined, not what its value is. If it is defined, the condition always evaluates to true, regardless of whether the value is 0 or 1.
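
A quick way to see that behaviour with rpm --eval (the output lines are what I would expect, not taken from the thread):

$ rpm --define '_with_kmod_nvidia_open 0' --eval '0%{?_with_kmod_nvidia_open:1}'
01
$ rpm --eval '0%{?_with_kmod_nvidia_open:1}'
0

The first expands to 01 (non-zero, so %if treats it as true even though the macro's value is 0); the second, with the macro undefined, expands to 0 (false).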


Thanks for correcting my error.

Post edited: Low Nvidia performance with newer kernels - #57 by leigh123linux

Yup, this one seems to have actually built and loaded the proprietary module:

cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  575.51.02  Thu Apr 10 16:08:06 UTC 2025
GCC version:  gcc version 15.1.1 20250425 (Red Hat 15.1.1-1) (GCC) 

Unfortunately, the proprietary module has not fixed the original PCIe bandwidth issue.

@leigh123linux I had edited this out earlier, but this might be important:

I had not modified the nvidia rpm macro in any way until you brought it up. It is possible that the macro in the 575 (or earlier) package is bugged.