Hello! I’m having some trouble getting Fedora (or RHEL) working properly on my Threadripper Pro system. It’s a Threadripper 3975WX with an ASUS WRX80E II motherboard, currently using an Nvidia GPU.
After the system boots into the OS, the status code on my motherboard goes from AA (which ASUS says indicates it’s all good) to 00 (which they say indicates issues with the CPU and memory). The Fedora system also ends up freezing after some time, which I can only resolve by turning off the PC. Right before it boots into the OS I also get 2 warnings. One is from nouveau indicating an unknown chipset and one is from iwlwifi saying invalid buffer destination.
At first I thought it was a hardware issue, but I tried both Ubuntu and Windows and both posted fine with status code AA and I was able to start running tasks without issue. I’m not sure if this is the right forum to ask, but does anyone have any insight into how I might be able to debug this? I’d prefer to use Fedora/RHEL. Thanks for any help!
Things I’ve tried:
turned off the VGA controller
ran a memory test
tried both Fedora 39 and 40. The resolution on 39 was a lot worse out of the box and it froze quicker. RHEL 9.4 also has some resolution issues
We would need the relevant logs to be certain, but the nouveau driver might be the issue. If you are not already doing so, installing and using the nvidia drivers from rpmfusion may solve this. https://rpmfusion.org/Howto/NVIDIA
Please post the output of sudo dmesg | grep -iE "nvidia|nouveau|secure" as well as inxi -Fzxx so we can see details that may be a factor.
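If it helps, here is a small sketch for collecting that output into files you can attach. The helper name filter_gpu_lines and the output file names are just placeholders I made up; the grep pattern is the one quoted above.

```shell
# Sketch: keep only kernel log lines that mention the GPU drivers or Secure Boot.
# filter_gpu_lines is a hypothetical helper name, not part of any package.
filter_gpu_lines() {
    # -i: case-insensitive, -E: extended regex (same pattern as in the post above)
    grep -iE 'nvidia|nouveau|secure'
}

# Typical use on the affected machine (file names are arbitrary):
#   sudo dmesg | filter_gpu_lines > gpu-dmesg.txt
#   inxi -Fzxx > inxi.txt
```

Reading from stdin means the same filter can also be pointed at a saved log instead of the live dmesg buffer.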
Hey, so I reinstalled Fedora and reinstalled the Nvidia drivers following those docs. I also went through the secure boot steps. I attached all the logs I’m seeing after running those two commands. Unfortunately I still seem to be getting the wrong boot code, but the PC isn’t freezing now!
It appears you actually have the nvidia drivers loaded and active.
However, the driver version is not the latest. That would be 560.35.03 if installed from rpmfusion as indicated. I do not know for certain, but it is possible that the RTX 4070 Ti card may not be fully supported by that driver version.
Please show us the output of dnf list installed '*nvidia*'.
It would appear that somehow you have managed to update the kernel and the firmware packages, but that for some reason you do not have the latest nvidia driver version which is 560.35.03…
My suggestion would be to find out why the nvidia driver was not updated (unless that was deliberate) and update it to the latest version. Then find out if the update works any better.
The nvidia driver may be locked to prevent updating in the file /etc/dnf/dnf.conf or in the respective repo file in /etc/yum.repos.d/
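A rough way to look for such an entry, assuming the block is done with an exclude= or excludepkgs= line (as far as I know those are the dnf option names; a version lock plugin would need a separate check). The function name find_dnf_excludes is made up for this sketch.

```shell
# Sketch: print any exclude/excludepkgs lines in the given dnf config files.
# find_dnf_excludes is a hypothetical helper name.
find_dnf_excludes() {
    # -H: always print the file name, -n: print the line number
    grep -HnE '^[[:space:]]*(exclude|excludepkgs)[[:space:]]*=' "$@"
}

# Typical use:
#   find_dnf_excludes /etc/dnf/dnf.conf /etc/yum.repos.d/*.repo
# No output (and a non-zero exit status) means no such line was found.
```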
I’m sorry if this is maybe a silly question, but what is the process to update to a specific version? The RPMFusion site is kind of vague. Do I need to uninstall the driver first and then install a specific version?
No
The update should be managed with a simple sudo dnf upgrade '*nvidia*'. Verify that the version being installed is 560.35.03. If that is not the version to be installed then we need to figure out why the older version is not being replaced.
Which is why I mentioned the possibility that it may be prevented from updating in one of those 2 locations.
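To confirm the result after the upgrade, something like the following could be used. It is only a sketch: check_driver_version is a made-up helper, and it assumes the akmod has finished building the module so that modinfo reports the new version.

```shell
# Sketch: compare the expected driver version with the one actually found.
# check_driver_version EXPECTED ACTUAL   (hypothetical helper)
check_driver_version() {
    if [ "$1" = "$2" ]; then
        echo "ok: driver is $2"
    else
        echo "mismatch: expected $1, found ${2:-nothing}"
        return 1
    fi
}

# Typical use on the machine:
#   sudo dnf upgrade '*nvidia*'
#   check_driver_version 560.35.03 "$(modinfo -F version nvidia)"
```

A mismatch here points back at the two config locations mentioned above, or at a repo that does not carry the newer build.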
I have the rpmfusion-nonfree & rpmfusion-nonfree-updates repos enabled so did not realize the newer driver was not in the rpmfusion-nonfree-nvidia-driver repo.
Yes, to get a driver newer than the 555 version one must enable the other repos as shown in the rpmfusion config page; then an upgrade will pull in the 560 drivers.
I think the rpmfusion-nonfree-nvidia-driver repo would be an ideal place to distribute only the Nvidia recommended drivers, which today are version 550.107.02.
The 555.* and 560.* drivers are either BETA or NFB (the short-lived new feature branch).
Why the restriction?
Nvidia does not have the same widespread access to fedora users that is seen by the rpmfusion repos.
Even beta software needs testers to verify usability and having it available on fedora seems an excellent way to have many users and systems that run the driver and test it to prove stability and performance.
Fedora is a leading edge distro with frequent updates so it is an excellent test bed for even beta software. (On many different hardware platforms and configurations.)
The drivers do not get placed into the driver repo until they have been run for some time by users and stability has been proven. They appear in the updates-testing repo (and apparently the updates repo) earlier. It is up to the user to decide which version to install and use.
Personally I have 3 systems.
A laptop with a GTX 1650 card, a desktop used as a server with 2 GTX 1050 cards, and my daily driver with an RTX 3050 card. All are running the latest kernels and the latest 560 drivers from rpmfusion with no problems anywhere.
I am actually planning to upgrade my laptop to fedora 41 which is about to be released as Beta. My f41 VM has encountered no issues.
In point of fact, on my f41 VM I see this:
$ dnf list akmod-nvidia
Updating and loading repositories:
Repositories loaded.
Available packages
akmod-nvidia.x86_64 3:560.35.03-1.fc41 rpmfusion-nonfree
akmod-nvidia.x86_64 3:560.35.03-1.fc41 rpmfusion-nonfree-nvidia-driver
so f41 will only have the 560 drivers and newer for current versions of the nvidia cards…
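For anyone who wants to see at a glance which repo offers which build, the listing can be boiled down with a little awk. This is a sketch that assumes the three-column layout shown above (package, version, repo); akmod_versions is a name I made up.

```shell
# Sketch: reduce `dnf list` output for akmod-nvidia to "repo: version" lines.
# akmod_versions is a hypothetical helper; it assumes the three-column
# layout (package, version, repo) seen in the listing above.
akmod_versions() {
    awk '$1 ~ /^akmod-nvidia\./ { print $3 ": " $2 }'
}

# Typical use:
#   dnf list --showduplicates akmod-nvidia | akmod_versions
```

With --showduplicates dnf lists every available build rather than only the newest one, which makes it easier to spot a repo that still carries an older driver.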
Why force it on every user on a stable release (f39/f40)?
I have no objection to distributing beta and NFB drivers on beta (f41) or rawhide (f42), but NFB drivers should be opt-in (e.g. an extra-nfb repo) for stable releases like f39/f40.
An opt-in then also has an easy opt-out: disable the extra-nfb repo, remove the drivers, and reinstall. This can all be documented.
And yet there are many reports on nvidia forums to be found. See also the other topic with proton/wine failing on an Optimus system with a 2nd ext. monitor.
NFB drivers had major issues last(?) year when they would not work at all with desktops configured at refresh rates >90Hz (my desktop is @144Hz). That was the point when I started to build my own rpms for the stable recommended drivers when NFB drivers hit the rpmfusion repos.
It’s impossible to catch all issues with NFB before pushing to the repo. For the nvidia-driver repo, it’s about the initial user experience. I guess @ing123 would have gotten his system up and running with 550.107.02 in no time.
I also currently have 560* installed because I had time and was curious about the explicit-sync Wayland support. So I opted-in, I wouldn’t have if I needed a stable system because of work, RL etc.
No one is ‘forced’ to use the nvidia drivers (they can stay with nouveau) or to upgrade (they can choose to avoid upgrading the version in use). They can even choose to use hardware with an nvidia GPU or with a different GPU.
We see kernel upgrades frequently and are given the same choices. Upgrade or don’t upgrade. Actually that applies to every piece of software in fedora (and every other linux distro).
It is all up to the user to select what they choose to do.
If the option is not available we are taking the choice away from the user and making the decision for them.
Deciding what a user is allowed to do is more within the realm of Apple or Microsoft than the FOSS world. Both those sources design and release their software in such a way that the only choice a user has is to use the OS or not use that OS. A user cannot choose which pieces of the OS software work best for their needs.
Every user has the choice to do exactly what you state. Nothing is forced.
It is impossible to catch all issues ever.
The myriad of hardware and software configs means there are always some specific situations that are unanticipated and may present issues. Your point about the many reports is noted, but that sample is severely skewed toward the negative. To be fully cognizant of the situation you should realize that for every single problem reported there are many thousands of users who have no issues at all.
Many bugs can only be identified by using the software then identifying the conditions where a problem is seen. These are the ‘edge’ cases that may not ever otherwise be found.
Exactly, and that is why the drivers must be available as soon as reasonably possible, and why it is still 100% user choice to use, to upgrade, or not.
Your argument about being forced to upgrade is undermined by the discussion in this thread. Other versions of the drivers are available and as the OP found out, for their system the older driver was more stable.
Why would anyone bitch about rpmfusion providing the NFB (560xx) for an unstable distro like fedora?
If you want production quality, use RHEL or Debian.
‘Forced’ is of course an exaggeration! I could have also said ‘blessed’ with new drivers.
I feel like we have completely different ideas of what a “regular” or “average” user is.
A default fedora workstation installation has automatic updates enabled! Assume that most users will not change that. When are those users supposed to review updates!? They get notified when new updates were installed and will reboot!
The user was probably directed to the rpmfusion repositories to set up his nvidia GPU.
Assume rpmfusion-free and rpmfusion-nonfree repositories are enabled (and the appropriate *-updates repositories)
Those users are not interested and not aware of what kind of drivers they have installed. All they want is a stable system and are happy not to use Windows any more.
If more advanced users are wondering why they do not get the ‘shiny’ new drivers, then they can be directed to the optional NFB repository. Simple.
The rest will happily continue to work with the stable branch and migrate to the new drivers as soon as they are considered stable/recommended by upstream.
In the meantime, the regular rpmfusion repositories will continue to receive updates for the stable recommended nvidia drivers as they become available.
The current approach does not serve this type of user very well.
They are released, but only to the interested parties who want to try these kinds of drivers. That’s the whole point.
Very mature response. Thumbs up!
I checked https://fedoraproject.org/ twice and I don’t see any statement that this is an unstable distro. I should probably ask the admins to put a big warning on the front page.
NOT true!
They are either using the default (nouveau) driver (in which case they may not know) or they have knowingly installed the nvidia drivers (in which case they certainly know it is installed, since they had to install it manually).
Exactly, and when made available it is up to the user if they choose to use the newer version.
It is the user’s choice, and if the newer driver is not made available as soon as reasonably possible then there are (and have been) users who complain about the delay in having it available.
I have seen comments like “it was just released by nvidia, why is it not available for fedora”. You seem to want only the most stable but the majority seem to want the latest and greatest instantly.
@anotheruser
This has gone off topic for the thread and should stop here.
Discussion of the driver on the repo is not related to booting.
Please start your own thread if you wish to continue this discussion about the drivers and when they are placed into the repo.
Hi, just to follow up on this thread: the motherboard still posts with Q-code 00 after updating to the latest drivers (560), but I haven’t experienced any freezing and it seems like workloads are running fine.
I’m not sure what else to check, and I’m not sure how related this is to Fedora anymore. Thanks for all the help!