Upgrade broke nvidia fusion drivers yet again

After an update a few days ago, booting to KDE failed after plymouth.

I’ve lost count of how many times this has happened - usually a kernel change that is incompatible with the nvidia fusion drivers (I wonder if any quality control is done on Fedora before pushing kernel updates - I mostly use FreeBSD and I’ve had almost no problems on that platform).

This is with Fedora 37 on a fairly old HP Intel Xeon workstation, GT 370 video card, 390 nvidia drivers.

If I switch to a terminal, I see that the nouveau kernel modules are loaded and there is no sign of the nvidia drivers (presumably because dkms failed).

Any suggestions?

That GPU and those drivers are old and not supported well.
The fact you are mentioning dkms tells me you did not install the drivers from rpmfusion. Had you installed from the rpmfusion repo your drivers would be automatically updated for a newer kernel with akmods and the user would not need to do anything manually.

Since the nvidia drivers are neither distributed nor directly supported by Fedora, this statement is way off base. Fedora cannot do quality control for software it does not distribute. When users choose to use packages and repos from outside the Fedora ecosystem, it is up to them to manage the software they added to the system. It cannot be claimed to be Fedora’s responsibility unless the user chooses to use ONLY software that Fedora distributes through its own repos.

Most members here will gladly support users with problems but rants and accusing statements about quality control like that are out of line and should not be seen.

If you are having problems with your system please post some details about the system, hardware, versions, and exactly what the problem is. Help is available for most situations.

The card is less than a year old.

Now you are making things up. If I said that I installed via rpmfusion it is because I installed via rpmfusion.

Well yes, it should work automatically. In reality it breaks frequently as kernels get pushed with incompatible interface changes. This is a recurring issue - there are many threads on this forum to start with.

In this case it seems that nv-gpu-numa.c no longer builds due to a function signature change to a member of nv_dir_context_t (the return type having changed from int to bool).

Are you saying that Fedora can’t test compatibility with the market-leading graphics card vendor? It takes two to tango, and coordination with nvidia could be much better.

We are doing beta testing now. Did you do your part?

I’m afraid this is a limitation of the current development model, when free and open-source code takes priority and performs the leading role, and non-free code delays the development and becomes a problem for those who want to use it.

I doubt the current practice can be changed, so you should probably use methods that minimize the negative effects, but this likely requires sacrificing some security for the sake of stability.

A simple workaround is to avoid booting kernels that have issues building the module and to stick with the latest working kernel until a new working kernel arrives.

Otherwise, try to avoid buying and using hardware that depends on non-free code.

There is no Nvidia GT 370. Whatever card it is, it can’t be “less than a year old” if you’re intentionally using older 390xx drivers.

Please check Nvidia’s list of supported GPUs for the correct driver version for your GPU.


If the 390xx driver from RPMFusion fails to build, please report to RPMFusion. You didn’t provide any actual logs or error messages but perhaps it is the same bug already reported here.

My mistake. I meant GT 730.

Yes, the compile error looks the same, thanks.

There are several GT 730s with significant differences. Only 1 specific model requires the 390xx drivers (I think the GF108); the rest are supported by 470xx drivers.

Check the PCI ID with lspci -nn and compare the second set of 4 hex digits with the previously linked Nvidia list.
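To make that comparison less error-prone, here is a minimal shell sketch that pulls the device ID out of `lspci -nn` output. The function name `nvidia_device_ids` is made up for illustration. It assumes the usual `[vendor:device]` bracket format and relies on NVIDIA's PCI vendor ID being `10de`.

```shell
# Hypothetical helper: reads `lspci -nn` output on stdin and prints the
# 4-hex-digit device ID of every NVIDIA entry (PCI vendor ID 10de).
nvidia_device_ids() {
    grep -oE '\[10de:[0-9a-f]{4}\]' | cut -d: -f2 | tr -d ']'
}
```

Run it as `lspci -nn | nvidia_device_ids` and look up the printed ID in the Nvidia list.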

When you update the kernel, DKMS will automatically recompile the driver for the new kernel. If you update and reboot right away, then the recompile gets cut off. Generally I recommend waiting about 5 minutes after updating before you reboot. I know this because I dealt with it a lot.

I learned this the hard way a couple of times :slight_smile: I wonder if it’s possible to make this visible, e.g., via a notification, and to prompt the user if they try to reboot or shut down before DKMS or akmods has finished its job.

You can run systemctl list-jobs and wait for the akmod (or DKMS) service to finish then reboot.

You can script delaying the reboot - that I leave as an exercise…
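For anyone who does not want the exercise, here is one rough way to do it. This is a sketch, not a tested recipe: the function names are invented, and it assumes the build shows up in `systemctl list-jobs` as a unit with `akmods` or `dkms` in its name.

```shell
# Hypothetical helper: reads `systemctl list-jobs` output on stdin and
# succeeds if a job for an akmods or dkms unit is listed.
kmod_build_pending() {
    grep -Eq 'akmods|dkms'
}

# Hypothetical helper: polls every 5 seconds until no such job remains.
wait_for_kmod_build() {
    while systemctl list-jobs --no-legend 2>/dev/null | kmod_build_pending; do
        sleep 5
    done
}
```

Something like `wait_for_kmod_build && systemctl reboot` would then hold the reboot until the job list is clear. Note that it only watches queued systemd jobs, so a build started some other way would not be noticed.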

The Quadro FX 370 was released in 2007.
If you instead meant the GeForce GT 730, it was released in 2014.

Nothing in your post says it was installed from rpmfusion, and you mentioned dkms implying that the modules came from another source. It has always been my experience that the drivers from rpmfusion are built by akmods and that has automatically worked as long as the user allows time for the drivers to be rebuilt before the reboot. Dkms is never even installed on recent versions of fedora unless one chooses to install a package from a 3rd party repo (other than rpmfusion) which requires dkms for the module build.

With that said, the nvidia.com drivers page says the GT 730 is properly supported by the 470 driver. I would suggest that you remove the 390 driver you claim to be using and instead install the 470xx driver from rpmfusion. That may give much better support for that GPU.

RPMFusion Nvidia drivers use akmods, not dkms. They serve a similar purpose but akmods builds kmods as rpm packages which integrates nicely with Fedora.

akmods should build required kmods on boot (see /usr/share/doc/akmods/README). So rebooting early should not cause any problems—but I don’t know since I do offline upgrades with dnf, which avoids this situation entirely.

If you or @arturasb are feeling adventurous, I would suggest testing this again :grin: . Next kernel update, just reboot immediately.

If you have a graphical boot screen (e.g. Fedora logo), you can press Esc to see the boot messages. You should see something about akmods.service indicating the kmod is being built. It may take a couple of minutes (pretty consistently ~1m25s on my 5-year-old i5-8400).
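If the desktop still fails to start after that, a quick check from a terminal can show whether a module was actually installed for a given kernel. This is only a sketch: `has_nvidia_kmod` is a hypothetical name, and it assumes `modinfo` can find a module called `nvidia` for that kernel version.

```shell
# Hypothetical helper: succeeds if an nvidia kernel module is installed
# for the given kernel version (defaults to the running kernel).
has_nvidia_kmod() {
    modinfo -k "${1:-$(uname -r)}" nvidia >/dev/null 2>&1
}
```

For example, `has_nvidia_kmod || journalctl -b -u akmods.service` prints the build service's log for the current boot when the module is missing.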

There are reports on the fedora users list that akmod sometimes fails to rebuild on first boot if you do not wait. Personally I have not seen this, but more than one person has reported the breakage.

I know that akmod builds on boot, but a couple of times something went wrong with the build and manual intervention was needed. I just think that visual cues about akmod in action would improve UX.

For people who have not installed any akmods, here is the link to RPM Fusion:
Packaging/KernelModules/Akmods - RPM Fusion

akmods.noarch > description:

Akmods startup script will rebuild akmod packages during system boot,
while its background daemon will build them for kernels right after they were installed.

Packages:

sudo dnf list 'akmo*'

 Available Packages
 akmod-VirtualBox.x86_64                               7.0.6-1.fc37                                   rpmfusion-free-updates   
 akmod-crystalhd.x86_64                                20220825-1.fc37                                rpmfusion-free           
 akmod-ndiswrapper.x86_64                              1.63-7.fc37                                    rpmfusion-free           
 akmod-nvidia.x86_64                                   3:530.41.03-1.fc37                             rpmfusion-nonfree-updates
 akmod-nvidia-340xx.x86_64                             1:340.108-24.fc37                              rpmfusion-nonfree-updates
 akmod-nvidia-390xx.x86_64                             3:390.157-1.fc37                               rpmfusion-nonfree-updates
 akmod-nvidia-470xx.x86_64                             3:470.161.03-2.fc37                            rpmfusion-nonfree-updates
 akmod-v4l2loopback.x86_64                             0.12.7-2.fc37                                  rpmfusion-free           
 akmod-wl.x86_64                                       6.30.223.271-46.fc37                           rpmfusion-nonfree-updates
 akmod-xtables-addons.x86_64                           3.23-1.fc37                                    rpmfusion-free-updates   
 akmods.noarch                                         0.5.7-9.fc37                                   fedora

OK, I got a :bulb: on.

kmods are the tools/programs needed for the “kernel management modules”,

while akmods are startup scripts that trigger these tools to update the packages listed above.

Could this mean that the system has to be restarted twice if akmod/kmod packages are updated at the same time as a kernel? Could this be causing the hiccups @arturasb mentioned?

Answer from @vgaetera links to the build guidelines and says:

# Ensure that any installed kmods are built for the currently-running
# kernel at boot
# https://bugzilla.redhat.com/show_bug.cgi?id=1518258
enable akmods.service

There is feature request already:

Bug 2118105 - [feature request] Better user feedback while running at boot time

Both akmods and dkms services are enabled by default on Fedora Workstation and should start at boot when the respective packages are installed.

kmod package contains programs related to kernel module management, e.g. lsmod and modprobe.

But in the context of “building a kmod” or “nvidia kmod”, it’s just short for “kernel module”.

akmods package contains the programs to automatically build kernel modules as rpm packages, and various systemd services to build on startup/shutdown.

No. With a live update (e.g. dnf upgrade), the new kernel, akmods, and any other updated packages are already in use after the reboot. The new kmod will be built, causing the boot to take longer, but no additional reboot is needed.

With offline update, on the first reboot, the new kernel/akmods/whatever package is installed. On the second reboot (normal for an offline update), it behaves like the live update’s first reboot.

Anyway, this is far off topic now.

“I’ve lost count of how many times this has happened - usually a kernel change that is incompatible with the nvidia fusion drivers”

What do you think that I meant when I said that ???

“My mistake. I meant GT 730.”

And that ???

And then you say “you claim to be using”. Well I guess you know better than I do what I’m using.

Is there an ignore feature in this forum?