Fedora 44: GPU reported as llvmpipe despite nvidia drivers being installed

Now that it is solved, here is some information on what happened:

  • Fedora 44 brought nvidia driver 595 with it, which is incompatible with my GeForce GTX 1060 card.
  • Installing the nvidia drivers via the .run file and trying other things left a mess on my system.
  • The correct solution would have been to remove the 595 packages mentioned further below with the package manager and then install 580. That would probably have solved it right away.
  • I then needed to do a lot of cleanup of the 595 driver files that had not been uninstalled properly.
  • So my conclusion: DO NOT USE the .run file. Use the package manager, be hesitant to use other wild methods (setting parameters you do not understand, adding or removing files based on internet recommendations, …), and keep track of what you did so you can undo it manually if needed.

I updated to Fedora 44 a few days ago, and since then I have had trouble getting the GPU to work properly. It seems the system is still not using the GPU.

System Information tells me this:

Operating System: Fedora Linux 44
KDE Plasma Version: 6.6.4
KDE Frameworks Version: 6.25.0
Qt Version: 6.10.3
Kernel Version: 6.19.14-300.fc44.x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 4 × Intel® Core™ i5-7500 CPU @ 3.40GHz
Memory: 16 GiB of RAM (15.6 GiB usable)
Graphics Processor: llvmpipe
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: B250M-DS3H

and inxi shows this:

Graphics:
  Device-1: NVIDIA GP106 [GeForce GTX 1060 6GB] driver: nvidia v: 580.159.03
    arch: Pascal pcie: speed: 2.5 GT/s lanes: 16 ports: active: HDMI-A-1
    empty: DP-1, DP-2, DP-3, DVI-D-1 bus-ID: 01:00.0 chip-ID: 10de:1c03
  Device-2: Logitech HD Webcam C510 driver: snd-usb-audio,uvcvideo type: USB
    rev: 2.0 speed: 480 Mb/s lanes: 1 bus-ID: 1-7.3:8 chip-ID: 046d:081d
  Display: wayland server: X.org v: 1.21.1.22 with: Xwayland v: 24.1.11
    compositor: kwin_wayland driver: gpu: nvidia,nvidia-nvswitch display-ID: 0
  Monitor-1: HDMI-A-1 model: Idek Iiyama PL3480WQ res: 3440x1440 hz: 100
    dpi: 110 diag: 864mm (34")
  API: EGL v: 1.5 platforms: device: 1 drv: swrast surfaceless: drv: swrast
    wayland: drv: swrast x11: drv: swrast inactive: gbm,device-0
  API: OpenGL v: 4.5 vendor: mesa v: 26.0.5 glx-v: 1.4 direct-render: yes
    renderer: llvmpipe (LLVM 22.1.1 256 bits) device-ID: ffffffff:ffffffff
    display-ID: :0.0
  API: Vulkan v: 1.4.341 surfaces: N/A device: 0 type: cpu
    driver: mesa llvmpipe device-ID: 10005:0000
  Info: Tools: api: clinfo, eglinfo, glxinfo, vulkaninfo
    de: kscreen-console,kscreen-doctor gpu: nvidia-settings,nvidia-smi
    wl: wayland-info x11: xdriinfo, xdpyinfo, xprop, xrandr

Any suggestions on how to move forward would be great, since any overlay makes my FPS drop dramatically, even the restart overlay.

I’m using an RTX 3070 on my system (so I don’t know if it will work for you), but the commands below worked for me:

sudo dnf remove "*nvidia*" -y
sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda nvidia-gpu-firmware
sudo akmods --force --rebuild
sudo reboot

Just edit the comment, instead of leaving badness around

Done, sorry for distracting.

Check the outputs of vulkaninfo --summary and eglinfo -B.
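When comparing such outputs, the key question is whether the reported renderer is a software rasterizer. The following is only a minimal sketch and not something posted in this thread: the function name classify_renderer is made up for illustration, and the list of software renderer names (llvmpipe, softpipe, swrast) is an assumption about what Mesa reports.

```shell
#!/usr/bin/env bash
# Classify a renderer string as software (CPU) or hardware rendering.
# Assumption: llvmpipe, softpipe and swrast are the names Mesa uses
# for software rendering. classify_renderer is a hypothetical helper.
classify_renderer() {
    case "$1" in
        *llvmpipe*|*softpipe*|*swrast*) echo "software" ;;
        "")                             echo "unknown" ;;
        *)                              echo "hardware" ;;
    esac
}

# Usage on a live system (needs glxinfo to be installed):
#   renderer=$(glxinfo -B | sed -n 's/^OpenGL renderer string: //p')
#   classify_renderer "$renderer"
```

With the renderer from the inxi output in the first post, llvmpipe (LLVM 22.1.1 256 bits), this prints software.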

That won’t work for a Pascal card on F44, because akmod-nvidia provides the 595 driver, which isn’t compatible with the card.

Pascal users on F44 need to be on akmod-nvidia-580xx (How to switch).

Thank you for pointing me there.
I had already figured out that I need the 580 drivers. But I think I did not exclude nvidia-firmware when removing the packages.

Now I tried reinstalling nvidia-gpu-firmware.noarch and then reinstalling the 580 drivers, but with no success.

It still looks like the GPU is ignored.

The OpenGL (EGL) page in System Information shows these warnings at the beginning:

libEGL warning: failed to get driver name for fd -1

libEGL warning: MESA-LOADER: failed to retrieve device information

libEGL warning: failed to get driver name for fd -1

I found this suspicious output when calling nvidia-smi:

nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
NVML library version: 595.58

Looks to me like the 595 driver is still hiding from me somewhere :frowning:
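The "Driver/library version mismatch" message means the user-space NVML library and the loaded kernel module come from different driver versions. A rough way to hunt for the stray copy is to compare the version suffix of every libnvidia-ml.so file with the kernel module version. This is a sketch under assumptions: lib_version and report_mismatches are hypothetical helper names, and the paths in the usage comment are only the two library directories mentioned in this thread.

```shell
#!/usr/bin/env bash
# Print the version suffix of a versioned library path, for example
# /usr/lib64/libnvidia-ml.so.580.159.03 -> 580.159.03
# Prints an empty line if the name has no ".so.<version>" suffix.
lib_version() {
    local name="${1##*/}"
    case "$name" in
        *.so.*) printf '%s\n' "${name#*.so.}" ;;
        *)      printf '\n' ;;
    esac
}

# Compare the kernel module version ($1) with each library path given
# as a further argument. Prints mismatches, returns 1 if any were found.
report_mismatches() {
    local kernel_version="$1" status=0 path version
    shift
    for path in "$@"; do
        version=$(lib_version "$path")
        # Skip unversioned names and soname links such as libnvidia-ml.so.1
        [[ "$version" == *.* ]] || continue
        if [[ "$version" != "$kernel_version" ]]; then
            echo "MISMATCH: $path (kernel module is $kernel_version)"
            status=1
        fi
    done
    return "$status"
}

# Usage (read-only; /sys/module/nvidia/version only exists while the
# nvidia module is loaded, otherwise try: modinfo -F version nvidia):
#   report_mismatches "$(cat /sys/module/nvidia/version)" \
#       /usr/lib64/libnvidia-ml.so* /usr/lib/libnvidia-ml.so*
```

Any path it prints can then be passed to rpm -qf to see whether a package owns it.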

If this returns nothing

rpm -qa '*nvidia*' | grep 595

Try running

sudo dracut -fv

I found the 595 version of libnvidia-ml.so in /usr/lib and /usr/lib64 and deleted those files. After that, nvidia-smi works as expected.

Unfortunately it is still using llvmpipe.

Why are you deleting random files? Package management handles that, not you.

Where did 595 come from? The nvidia .run file?

Yes, likely a leftover from the uninstaller of the .run file.
The package manager probably did not remove it because it came from the .run file installation.

Try fixing the damage the .run file causes

sudo dnf reinstall \*nvidia\* mesa\* egl\* xorg\*

It reinstalled seemingly without problems, but otherwise nothing changed.

Querying for 595 packages gives me these two:

nvidia-modprobe 595.71.05-1.fc44 x86_64
nvidia-persistenced 595.71.05-1.fc44 x86_64

Should those be removed? On the other hand, they should already have been removed when I removed all “nvidia” packages earlier.

No.
Those are the appropriate packages for use with the 580xx nvidia packages

I made some progress by identifying orphaned driver files with these commands:

# The actual OpenCL library
ls -la /usr/lib64/libOpenCL*
ls -la /usr/lib64/libnvidia-opencl*
# Check who owns them
rpm -qf /usr/lib64/libnvidia-opencl.so.1
rpm -qf /etc/OpenCL/vendors/nvidia.icd

The result was two driver files that belonged to the 595 version of the driver. One of them was the target of a symlink that belonged to the 580 version.

After removing those files and the symlink, reinstalling the 580 packages and rebuilding the akmod, there is some progress: System Information now recognizes the 1060 and no longer points to llvmpipe.

OpenGL is still using llvmpipe, though. I have to figure that out now.

Intermediate conclusion for me: DO NOT USE the .run file installation if possible. Relying on the package manager is much, much better. Still learning by making mistakes.

Maybe remove the nvidia drivers and then list all remaining files with
find /etc /usr/ -iname '*nvidia*' 2>/dev/null
or feed that output to rpm -qf to see which package, if any, owns each file.
Ignore files in /usr/lib/firmware.

Or start again and reinstall Fedora, and consider it a life lesson about using the nvidia .run file.
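The find and rpm -qf suggestion can be combined into one read-only pass that prints only the files no package owns. This is a sketch, not a tested cleanup tool: list_unowned is a hypothetical name, and it relies on rpm -qf exiting non-zero for files that no installed package owns.

```shell
#!/usr/bin/env bash
# Read file paths on stdin and print those that no RPM package owns.
# Paths under /usr/lib/firmware are skipped, as suggested in the post.
# list_unowned is a hypothetical helper name.
list_unowned() {
    local path
    while IFS= read -r path; do
        case "$path" in
            /usr/lib/firmware/*) continue ;;
        esac
        # rpm -qf exits non-zero when no installed package owns the file
        if ! rpm -qf "$path" >/dev/null 2>&1 </dev/null; then
            printf '%s\n' "$path"
        fi
    done
}

# Usage (deletes nothing, only lists candidates for manual review):
#   find /etc /usr -iname '*nvidia*' 2>/dev/null | list_unowned
```

An unowned file is not automatically safe to delete, so the list is only a starting point for checking each entry by hand.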

It is finally solved:

  • The DRM modeset and PreserveVideoMemoryAllocations kernel parameters needed to be set again (I had probably changed them while following some instructions to fix my problems):
sudo grubby --update-kernel=ALL --args="nvidia-drm.modeset=1 nvidia.NVreg_PreserveVideoMemoryAllocations=1"
  • Files from the .run file installation were still lying around in user space:
# Find any 595-versioned nvidia libraries on the filesystem
find /usr/lib64 /usr/lib /usr/local/lib64 /usr/local/lib -name "*nvidia*595*" 2>/dev/null | sort
# Broader check - all libnvidia .so files and their versions
find /usr/lib64 /usr/lib /usr/local -name "libnvidia-*.so*" 2>/dev/null | sort
  • After deleting those files, reinstalling the 580 packages once more and rebooting, the problem was finally solved:
!!! USE AT YOUR OWN RISK !!!
Step 1 — Remove all 595 files (requires sudo)
sudo find /usr/lib64 /usr/lib -name "*595.58.03*" -delete
Step 2 — Reinstall 580xx RPM packages to restore correct symlinks
sudo dnf reinstall \
  xorg-x11-drv-nvidia-580xx-libs \
  xorg-x11-drv-nvidia-580xx-cuda-libs \
  xorg-x11-drv-nvidia-580xx-cuda \
  xorg-x11-drv-nvidia-580xx \
  xorg-x11-drv-nvidia-580xx-power
Step 3 — Rebuild ldconfig cache
sudo ldconfig
Step 4 — Verify no 595 files remain and symlinks point to 580
# Must return nothing
find /usr/lib64 /usr/lib -name "*595*" 2>/dev/null
# Must show 580.159.03 paths
ldconfig -p | grep "libEGL_nvidia\|libGLX_nvidia" | sort
Step 5 — Reboot
sudo reboot
After reboot, run the full verification:
sudo dmesg | grep -i "api mismatch"
glxinfo | grep "OpenGL renderer"
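Whether the two kernel arguments set with grubby actually reached the running kernel can be checked against /proc/cmdline. The following is a minimal sketch; has_kernel_arg is a hypothetical helper name, and it only tests for an exact argument match.

```shell
#!/usr/bin/env bash
# Succeed if the kernel command line ($1) contains the exact argument ($2).
# has_kernel_arg is a hypothetical helper name.
has_kernel_arg() {
    local -a args
    local arg
    # Split the command line on whitespace without glob expansion
    read -ra args <<<"$1"
    for arg in "${args[@]}"; do
        [[ "$arg" == "$2" ]] && return 0
    done
    return 1
}

# Usage after the reboot:
#   cmdline=$(cat /proc/cmdline)
#   for want in nvidia-drm.modeset=1 nvidia.NVreg_PreserveVideoMemoryAllocations=1; do
#       if has_kernel_arg "$cmdline" "$want"; then
#           echo "ok: $want"
#       else
#           echo "missing: $want"
#       fi
#   done
```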

Thanks everyone!

In my past experience with the .run files, I was always able to do a clean removal by reversing the installation:
filename.run --uninstall, where the installation was merely filename.run