Nvidia and default drivers: stability? Security? ... and relation to market shares of hardware?

I have to disagree with the quote:

I indeed suggest to use the default instead of the proprietary nvidia driver → more stability/security given the holistic development and testing in and around the kernel.

I started on F36 with the default nouveau driver. Every so often I had random crashes, including hard crashes that needed a complete power cycle, with this ‘default’ driver.

Once I changed to the proprietary nvidia drivers from rpmfusion, those problems disappeared completely. Plus, with the proprietary drivers you get better control over the GPU options and health monitoring.

Since the F38 update I have been unable to log in via the Plasma X11 desktop setting. It just returns to the same login screen. Only the Wayland options work. I haven’t figured out or investigated the cause yet. But my ‘default’ driver instability has now returned, with complete screen freezes and all key combinations inoperable.

As a result I have to point out that, from my own experience, the quote is a bit of a stretch.

This is why I prefer ‘tainting’ my kernel with nvidia proprietary drivers and don’t care what the Linux graphics purists say. Nvidia products are the best. FULL STOP PERIOD. I have tried them all, AMD, INTEL etc.

Nvidia has 90% market monopoly whether you like it or not. For good reason. So constantly fighting that reality is what’s causing these issues. All the distributions need to wake up to this, get off the purity white horse and move on rather than fighting this issue for 30 years. There are more important things to do with Linux development resources. The battle has been lost. It’s time that’s admitted.

@rkoppelh I don’t know your very issue (and be aware that in some cases you sometimes have not found the bug but only its trigger), but issues with the default drivers are rare. At the same time, we have more or less always open issues with nvidia. These are issues by design since nvidia’s drivers are not included in the long development and testing chain up to the “running kernel/OS”. The stability issues are not restricted to Fedora.

Also, please do not take single parts of my arguments and interpret them on themselves:
If the default driver works for you and fulfills your needs, I indeed suggest to use the default instead of the proprietary nvidia driver → more stability/security given the holistic development and testing in and around the kernel.

With regards to the needs, if you want to exploit the card’s capabilities, the nvidia driver will more or less always outcompete the default driver given that the latter is much dependent on reverse engineering, which means that it is always at least one (if not more) steps behind while this type of development might also not result in the best efficiency in the product. I did not generally suggest to not use nvidia drivers.

Further, this also means that it takes time until the default driver supports a card and therefore, even if you do not need much performance, it can happen that your card is not yet supported by the default so that you need the proprietary driver.

However, that the default driver is superior in security and stability is a common phenomenon throughout the Linux community (and the inevitable result of the way it is integrated), which is at least one of the reasons why the kernel community regularly tries to impose means in the kernels that create incentives to open up and integrate with the community (which can also mean to make it harder to not open up/integrate).

That said, any code can have bugs. Maybe you ended up in one of the rare bugs of the default driver (and hopefully you did file a bug report), but this still remains seldom, especially compared to nvidia. Of course it has to be added that “bugs” in this respect do not imply a flaw in nvidia due to bad coding but can be the outcome if two completely separated entities develop code for the kernel level, while both don’t/cannot consider each other. So the bugs with nvidia lie often in the dynamics in between the codes+configurations of both sides.

Beyond, if your issue with the default driver remains open, feel free to open a topic and a bug report if not yet done.

If so, I guess this is less related to their drivers but to their hardware and its potential use cases :wink: I have not much knowledge about the graphics market, but I am not sure if that is a tautology: nvidia is more or less its own market segment as far as I got it - does it have “global competitors” for dedicated high-performance graphics since ATI has been integrated into AMD? (That’s really a question, I don’t really know since I am out of that topic for many years, but it would be indeed interesting to know :smiley: ).

My subjective perception is that ATI within AMD has developed to a separated market segment in between the low-performance but cheap (cheap in terms of financial but also power consumption) Intel graphics and the dedicated high-performance/-costs nvidia graphics. So I think the competition in between them is limited to the edge in between the segments, and often mostly indirect (in terms of integrating buyers from other segments, e.g., by making youtube watchers to power gamers :wink: ), ain’t it?


To avoid confusion, I have split this to avoid blurring the topics.

The nouveau driver is not stable in my experience for well documented reasons. No nvidia provided docs or firmware details etc.

There is a replacement being worked on by Collabora, NVK, that sounds very promising.

For a long time the drivers that rpmfusion provide have worked very well.

For reference, I used to have a GTX 1060 and am now using an RTX 3060.

Both kde plasma x11 and wayland work for me.
I have to use X11 if I want to use Steam to play games.

If only it were that simple. We’ve put in a lot of work to make the experience as smooth as possible, but it’s ultimately not up to us. What exactly more should we do? Hold kernel updates back until whenever Nvidia decides that they’re ready?

Take a look at Nvidia’s repo — they provide packages, but only up through F37. Should we be waiting on them still before we release Fedora Linux 38? How long should we wait?

I get the frustration, but this is Nvidia’s code, Nvidia’s hardware, and Nvidia’s problem to solve. Please take this passion and bring it to them, where maybe it will have some effect. Vendors do listen to customer requests.

I know that because it’s working — painfully slowly, but… it is working. We’re getting there. See
NVIDIA Transitioning To Official, Open-Source Linux GPU Kernel Driver - Phoronix. That’s not complete, but it’s a move in the right direction, and as long as people keep asking for it, those moves will continue.

Again, this isn’t really a “purity” thing. Like AMD and Intel, there are binary blobs that are full of mystery, and probably still proprietary userspace tools. But, it will solve the biggest practical problem.


These are indeed good points! It could be added that the massive evaluation and testing within the different horizontal and vertical communities before BLOBs end up on the “production” systems can balance many potential issues and limit the risks of these “mysteries”.

I have to admit, that makes sense, but somehow I have had in mind that the BLOBs are mostly network and related/comparable drivers (I assumed mostly WiFi). I remember to have read often that the latter are the big issue in Linux-libre and “de-BLOBed” projects, but I have not heard about issues with graphics drivers other than nvidia in this respect (but I have to admit I am not actively following what is going on in *libre and such areas). But now that you say it, I indeed found out that some of them are BLOBs :grin: Something learned today ^^

Most Nvidia cards are used either in systems purchased with Windows or macOS and never boot linux, or in linux servers that offload tasks (AI, crypto, …) to the GPU.

Many Linux systems running on older or low-end hardware are used for web apps (similar use case to ChromeOS Flex or fydeOS, but Fedora users, rightly or not, have fewer concerns over privacy and security). Here you have to consider longevity as well as market share. I suspect older Apple Intel systems are over-represented. Nvidia GPUs were used in some popular Apple Intel systems, but for laptops with dual embedded and GPU graphics, some fraction disable the GPU to reduce heat and power consumption, so the priority may be to simply disable the Nvidia GPU rather than install a buggy driver.

As a Linux gamer, the restricted driver is a must. The open source one won’t cut it by a mile. It’s one of the first things I install on a fresh install (after a complete update of the system first).

Well yeah!
AFAIK the nouveau driver does not support hardware acceleration so rendering gets offloaded to the CPU and things just crawl. The nvidia driver does support hardware acceleration.

I have seen nothing that would make one believe that.
While it is true that windows has the much larger market share, and thus ‘most’ would never boot linux, – There are, however, many who do use linux on hardware with the nvidia GPUs so most of those would wish to have the best graphics performance possible.

I seem to recall that many are unable to disable the dGPU in bios so this may be impossible or at best difficult for them. Even totally blocking the loading of the drivers so the dGPU cannot be used seems to cause problems for some.

Any hardware component with no driver loaded can become problematic at times.

Your argument does not explain why the rpmfusion drivers, which are built on nvidia’s releases, have worked reliably with the latest kernels in the past. I don’t know about now, due to the Plasma X11 desktop login issue.

I’ve never had issues with the rpmfusion drivers other than when they were a bit late after a kernel update, or once when they stuffed up the rpm packaging.

Now please don’t see me as some Nvidia zealot. My comments are based on many graphics cards since using linux from 2010 or so. I don’t care which graphics card I use as long as it does the job.

I’ve always had issues with nouveau, usually stability, on older nvidia cards and on my current Titan Xp. From 2013 I used an AMD W2000, but AMD is worse than nvidia for support. Again I used the proprietary driver, as the default version was not up to scratch. I have dumped AMD cards forever because of support. My son, who runs a visual effects studio for cinema projects, has had the same experience and refuses to deal with AMD cards professionally.

My use case is workstation, media editing, visual effects, rendering, scientific computational, cad all professional engineering type stuff. Stuff nvidia is made for. I have no interest in gaming.

I have not submitted a bug because it only came back with my F38 upgrade, since I can’t use the Plasma X11 desktop, and then there was a dead kernel 6.3.4, etc. Only so many hours in a day. The bug is a nuisance but does not stop me working at the moment. But it is there and I will investigate it when I get the next chance or when I need to do intense computational stuff again. For normal browser work the main issue is that I have to reboot the machine once every few days when everything freezes.

I think everyone would agree, but some people require a reliable and secure platform for work, access to health care and government services, travel arrangements, shopping, and support services (including RedHat bugzilla and fedoraproject.org). Linux users have not received that from Nvidia. When you install or upgrade Fedora, you should not be stuck without a GUI, which may mean using integrated graphics, nouveau, or a newer open source driver.

I agree with this comment.
However, having used Fedora from the beginning (Fedora Core 1) and nvidia for the same period, I do not recall ever having the type of errors you refer to.

I admit that I tend to stick with older stable hardware longer than most, having only upgraded my GPU in my desktop from a GTX 1050 to an RTX 3050 in the last month.

The errors reported mostly seem to show that users often are impatient when upgrades to software occur (kernel or drivers) and do not allow suitable time for the drivers to be compiled properly. The reported errors also mostly show that users may select to install from different sources and mixing repos often causes problems. This does not mean that nvidia GPUs are bad nor that the drivers are buggy.

I recently dealt with a case where different driver packages were installed from fedora-cuda, inttf, and rpmfusion all at the same time. That kind of melange is ripe for problems but does not necessarily indicate anything is buggy.

We can agree to disagree here, but I seem to have a much better experience with nvidia than many espouse and I feel nvidia is getting the negative vibes only because they have kept the drivers proprietary. They do support linux and have provided fully functional drivers for us for many years – just not as open source.

There have been times in the past where the rpmfusion folks have been faced with waiting for nvidia to fix the nvidia closed-source code for a new kernel.
It has not happened for a while, but this is not unheard of.
I do not know the details of why you are seeing issues.
Also it is not all nvidia hardware that is affected.

I agree that nvidia cards have worked well for years, and I have used many systems (Dell desktops, iMacs, and a MacBook Pro stuck on Nvidia due to a hardware fault) that only provided Nvidia graphics.

Now, however, non-free Nvidia drivers have not kept up with linux, so many users with otherwise functional systems are stuck with choosing between the limitations of Nvidia or different limitations of nouveau.

That is not my experience. Are you referring to old nvidia GPUs?
Please clarify.

Sorry couldn’t help but laugh. I am still using Titan Xp on my workstation, upgrading to it last year from an AMD W2000

I am considering upgrading to now-cheap used RTX 3080s, though my limitation is that I have a PCIe 3 bus. That said, tests out there show that running PCIe 4 GPU cards in a PCIe 3 slot has minimal impact on their performance, like a few percent degradation.
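For a rough sense of what the bus difference means on paper, here is a minimal sketch that computes theoretical one-direction link bandwidth. It assumes the published per-lane transfer rates (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0) and 128b/130b encoding; the function name is just for illustration, and real-world GPU throughput depends on far more than this figure.

```python
def pcie_bandwidth_gb_s(gt_per_s, lanes=16, encoding=128 / 130):
    """Theoretical one-direction bandwidth in GB/s for a PCIe link.

    gt_per_s: raw transfer rate per lane in GT/s
    lanes:    link width (x16 is typical for a GPU slot)
    encoding: fraction of transferred bits that carry payload (128b/130b)
    """
    usable_gbit_per_lane = gt_per_s * encoding
    return usable_gbit_per_lane * lanes / 8  # 8 bits per byte


gen3 = pcie_bandwidth_gb_s(8)   # PCIe 3.0 x16
gen4 = pcie_bandwidth_gb_s(16)  # PCIe 4.0 x16

print(f"PCIe 3.0 x16: {gen3:.2f} GB/s")
print(f"PCIe 4.0 x16: {gen4:.2f} GB/s")
```

The link bandwidth halves on PCIe 3, yet the reported performance loss is only a few percent, which suggests that most workloads do not keep the bus saturated.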

All GPUs will lose vendor support when the cost of making updates goes past some threshold.
Furthermore, vendors choose the timing when Linux innovations require significant changes to their binary blobs (e.g., Nvidia’s EGLStreams versus Wayland’s GBM).

Xwayland doesn’t count as it still relies on the Xorg libraries. Some groups want to get Xorg off workstations so they can focus on Wayland (and macOS).

From 2019 Using Linux with Wayland:

[…] if you’re using Nvidia’s proprietary graphics driver, Wayland probably won’t work for you. This is related to the compositing problem above. To make the process work, your graphics driver must talk to Wayland compositors in a certain way.

Intel and AMD graphics cards don’t have this problem, since they use the expected standard, called GBM (Generic Buffer Management). Nvidia believes that their way of speaking to Wayland, called EGL, is better, and as such sticks to that instead.

This problem can be solved in two ways: Nvidia drivers implement GBM, or Wayland compositors implement EGLStreams. Currently, Nvidia seems uninterested in pursuing the former solution.

I don’t recall when Wayland came to Fedora, but F36 was first to get Wayland with Nvidia. Nouveau has supported Wayland for nearly a decade.

The “groups” are basically the core people who were the main Xorg developers and are now the Wayland developers.

What the Linux community is forgetting is that Linux is not used only as a desktop for home computers; it is also used (in small numbers, like mine) for professional workstations, GPU high-performance workstations and GPU servers. But there’s also the biggest server use case: 4 Nvidia GPU cards per mobo in each rack of 19" cabinets (e.g. AI, anyone?), with 10s of such cabinets or more. This is the professional stuff. Linux still has to play nicely with the Nvidia stuff. You will see more and more of these Nvidia/mobo cabinets as AI progresses.

As an example, my son here in Australia runs a visual effects company. As a result of Fox (now Disney), Australian/USA film production is closely integrated into Hollywood/Sydney/Brisbane and Byron Bay. This enables 24/7 film production for Netflix, Disney and others.

He has cabinets of such Ryzen/Nvidia Linux servers for rendering, though for workstations he prefers AMD Ryzen Windows machines with Nvidia GPUs because of the professional software apps out there. But the main users are the visual performance arts community, the content creators, and they hate the Linux technical detail. Ease of use without issues for ‘right minded’ artists is what’s needed. So Apple and Windows have that captive user. The Linux community ignored them, and this X11/Wayland issue only compounds the problem. Yet they’re one of the biggest GPU users, treating the latest GPUs as disposable items as technology evolves. All of it is linked to the USA, with his business the biggest data customer in Australia, regularly transferring terabytes between Oz and the USA as each film progresses.

So the fact that Linux runs on only ~2% of PC desktops, mainly those of Linux enthusiasts, is in many ways self-inflicted, and it reminds me of the above quote:

When you install or upgrade Fedora, you should not be stuck without a GUI,

My point: if the Linux community wants wider adoption on desktops, which it should, at least as competition to Apple and Windows, then the community, while having done a good job on graphics, needs to do even better on ease of use. This Wayland/X11 divide, and the excuse that the GPU leader (Nvidia) uses different interface standards and the Linux community knows better, frankly sounds absurd. Nvidia’s market capitalisation makes that argument look absurd. The users have made their judgment and the investment community has followed.

As I said above, Nvidia has won. Like it or not. That is the fact!

Sadly, monopolies set the standard, and if you do not follow the standard you’ll be stuck without a GUI, e.g., Nvidia’s EGLStreams versus Wayland’s GBM. So, Linux community, work with Nvidia rather than against it and make their GPUs work to their full potential, whether by X11 or Wayland, to assure a desktop future. There’s no point dictating to Nvidia what YOU think is better. If you were right, then the stock price of IBM/RH, which funds Fedora, would be through the roof. I only wish; I’m an IBM investor and a believer in the Linux OS. So far a disappointing financial investment.

I’ve just had another KDE Plasma/Wayland crash. Although I have had X11 and the rpmfusion drivers since F36, I am forced to use Wayland since I can’t log in to Plasma X11.

Can’t hold back. I need to investigate now, but I am new to chasing graphics problems. So can you please point me to what information I need to collect to submit bug reports?

When you open a new Bugzilla report it should provide a worksheet to fill in with the details.
Your goal should be to provide enough information so others can easily reproduce the bug.
It is useful to check whether the bug started with a new kernel or is present in earlier kernels. It is also useful to mention the history of the system (e.g., fresh install versus upgraded since Fedora NN) and any uncommon peripherals (Braille keyboard, etc.). Rather than list a bunch of USB devices, I usually try to reproduce the problem with the minimal number of connected devices.
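As a starting point for the kind of details worth pasting into a report, here is a minimal sketch (not an official Fedora or Bugzilla tool) that gathers the running kernel, the session type, which GPU kernel modules are loaded, and a few kernel taint flags. The function names and the short module list are my own assumptions for illustration. It reads `/proc/modules` and `/proc/sys/kernel/tainted`, and relies on the `XDG_SESSION_TYPE` environment variable being set by the login session.

```python
import os
import platform
from pathlib import Path

# Hypothetical short list of GPU kernel modules to look for; extend as needed.
GPU_MODULES = ("nvidia", "nouveau", "amdgpu", "i915")

# Subset of kernel taint bits that matter for out-of-tree drivers.
TAINT_BITS = {
    0: "P (proprietary module loaded)",
    12: "O (out-of-tree module loaded)",
    13: "E (unsigned module loaded)",
}


def loaded_gpu_modules(proc_modules_text):
    """Return the GPU modules named in /proc/modules-style text."""
    names = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return sorted(m for m in GPU_MODULES if m in names)


def decode_taint(value):
    """Translate the numeric taint value into the flags listed above."""
    return [label for bit, label in sorted(TAINT_BITS.items())
            if value & (1 << bit)]


def read_text(path):
    """Read a file, returning an empty string if it is unavailable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


def collect_report():
    """Gather basic system facts useful in a graphics bug report."""
    taint_raw = read_text("/proc/sys/kernel/tainted").strip()
    return {
        "kernel": platform.release(),
        "session_type": os.environ.get("XDG_SESSION_TYPE", "unknown"),
        "gpu_modules": loaded_gpu_modules(read_text("/proc/modules")),
        "taint_flags": decode_taint(int(taint_raw)) if taint_raw.isdigit() else [],
    }


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

Running it once on the kernel that freezes and once on an older kernel that does not gives a quick side-by-side for the report. After a freeze and reboot, logs from the previous boot can be pulled with `journalctl -b -1`.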