Booting with several legacy nvidia devices

I have 2 GeForce GTX Titans GK110 and 1 Quadro 2000 GF106GL nvidia boards in a workstation
plus an ASPEED Asus-provided VGA controller (presumably on the motherboard).

My number one goal is to use CUDA on the Titans and to be able to run a monitor (no need for fast frame rates).

What are the practical limitations to what I can do in F38?

So far I have failed to boot with either the 390.xx or the 470.xx drivers.

General advice?

Please post the output of inxi -Gzxx or the relevant portions of lspci -nnk so we can see the full details of the GPUs.

A preliminary search at nvidia.com shows the titan may be limited to the 470xx drivers and the quadro appears to be supported by the 535 drivers. If there is that large a difference in the devices then one could probably not run both types at the same time.

We need the details to verify exactly which cards are being used and which drivers are required.

Several devices that were supported by the 470 driver were dropped when the next driver level (495 and up) was released. If yours are in that situation then the quadro is probably too new for the 470 driver while the titan is too old for the 535 driver. We need the details to verify that.
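
If the full lspci listing is too long to post, it can be trimmed to just the display devices and the driver bound to each. This is only a sketch: gpu_summary is a made-up helper name, and it assumes the usual lspci -nnk layout where each device starts at the left margin and its detail lines are indented.

```shell
# Keep only display-class devices and the kernel driver bound to each.
# Reads `lspci -nnk` style text on stdin, so it also works on saved output.
gpu_summary() {
    awk '
        # A line starting with a hex bus address opens a new device entry.
        /^[0-9a-f]/ { show = ($0 ~ /VGA compatible controller|3D controller/) }
        show && (/^[0-9a-f]/ || /Kernel driver in use/) { print }
    '
}

# Only call lspci when it is actually installed.
if command -v lspci >/dev/null 2>&1; then
    lspci -nnk | gpu_summary
fi
```

The chip IDs in square brackets (for example 10de:0dd8) are what nvidia's supported-products lists are keyed on.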

Graphics:
Device-1: NVIDIA GF106GL [Quadro 2000] driver: nouveau v: kernel arch: Fermi
pcie: speed: 2.5 GT/s lanes: 8 ports: active: DP-1,DVI-I-1 empty: DP-2
bus-ID: 04:00.0 chip-ID: 10de:0dd8 temp: 52.0 C
Device-2: ASPEED Graphics Family vendor: ASUSTeK driver: ast v: kernel
ports: active: none off: VGA-1 empty: none bus-ID: 0a:00.0
chip-ID: 1a03:2000
Device-3: NVIDIA GK110 [GeForce GTX TITAN] vendor: ASUSTeK GTXTITAN-6GD5
driver: nouveau v: kernel arch: Kepler pcie: speed: 2.5 GT/s lanes: 16
ports: active: none empty: DP-3, DVI-D-1, DVI-I-2, HDMI-A-1
bus-ID: 83:00.0 chip-ID: 10de:1005 temp: 49.0 C
Device-4: NVIDIA GK110 [GeForce GTX TITAN] vendor: eVga.com.
driver: nouveau v: kernel arch: Kepler pcie: speed: 2.5 GT/s lanes: 16
ports: active: none empty: DP-4, DVI-D-2, DVI-I-3, HDMI-A-2
bus-ID: 84:00.0 chip-ID: 10de:1005 temp: 43.0 C
Display: x11 server: X.Org v: 1.20.14 with: Xwayland v: 22.1.9
compositor: gnome-shell v: 44.3 driver: X: loaded: modesetting
unloaded: fbdev,vesa dri: nouveau gpu: ast,nouveau display-ID: :1
screens: 1
Screen-1: 0 s-res: 5504x1440 s-dpi: 96
Monitor-1: DP-1 pos: right model: 27inch DP res: 2560x1440 dpi: 109
diag: 685mm (27")
Monitor-2: DVI-I-1 pos: primary,left model: Asus VW246 res: 1920x1080
dpi: 92 diag: 609mm (24")
Monitor-3: VGA-1 mapped: VGA-1-1 note: disabled size-res: N/A
API: OpenGL v: 4.3 Mesa 23.1.4 renderer: NVC3 direct-render: Yes

Looking at those details I see from nvidia that

  1. the Quadro 2000 is supported by the 390xx drivers. (fermi architecture)
  2. the GeForce GTX Titan is supported by both the 390xx and the 470xx drivers. (kepler architecture)

It is clear that if all cards are to be active then nothing newer than the 390xx drivers should be installed.
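
The reasoning is just "take the oldest of the newest supported branches". A small sketch of that, using only the two architecture-to-branch pairs stated above; the function names are made up for illustration, and any other architecture would have to be looked up at nvidia.com.

```shell
# Newest legacy driver branch per architecture, as stated in this post.
max_branch_for_arch() {
    case "$1" in
        Fermi)  echo 390 ;;
        Kepler) echo 470 ;;
        *)      return 1 ;;   # not covered here; check nvidia.com
    esac
}

# The branch usable by every card is the smallest of the per-card maxima.
common_branch() {
    lowest=""
    for arch in "$@"; do
        b=$(max_branch_for_arch "$arch") || { echo "unknown arch: $arch" >&2; return 1; }
        if [ -z "$lowest" ] || [ "$b" -lt "$lowest" ]; then lowest=$b; fi
    done
    echo "$lowest"
}

common_branch Fermi Kepler Kepler   # architectures reported by inxi; prints 390
```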

I have no clue how the Asus ASPEED GPU would interact with the others, but since it uses a different driver (ast) it may or may not coexist with the nvidia cards without interference.

To get all the nvidia cards working and allow using cuda, I would suggest first removing everything nvidia related that you have previously installed. You did not state where you obtained any drivers you may have previously tried, but if they were installed from rpmfusion then removal is simple.

  1. sudo dnf remove '*nvidia*' --exclude nvidia-gpu-firmware
    This removes all nvidia related packages that were installed using dnf, except the required firmware for nvidia GPUs.
    To verify that nvidia-gpu-firmware remains and nothing else does, run dnf list installed '*nvidia*' after the removal. If the firmware package does not appear, reinstall it with sudo dnf install nvidia-gpu-firmware.
  2. Update the system fully
    sudo dnf upgrade --refresh
  3. Reboot before continuing so one is certain the latest kernel is active.
  4. If not already done enable the rpmfusion repos as noted in my post above.
  5. Install the 390xx nvidia drivers
    sudo dnf install akmod-nvidia-390xx
  6. wait at least 5 minutes for the modules to be compiled and installed
  7. verify the installation of the modules with
    dnf list installed kmod-nvidia\*
  8. Once step 7 returns a listing for that package showing the running kernel version then reboot again. The drivers should now load.
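
The check in steps 7 and 8 can be scripted roughly as follows. This is a sketch under the assumption that the package built by akmods carries the kernel release in its name, which is what step 8 relies on; kmod_matches_kernel is a made-up helper name.

```shell
# Succeeds when a kmod-nvidia package in the listing on stdin
# mentions the given kernel release.
kmod_matches_kernel() {
    awk -v k="$1" '
        index($0, "kmod-nvidia") && index($0, k) { found = 1 }
        END { exit !found }
    '
}

# Only query the package database when rpm is available.
if command -v rpm >/dev/null 2>&1; then
    if rpm -qa 'kmod-nvidia*' | kmod_matches_kernel "$(uname -r)"; then
        echo "module package found for $(uname -r); safe to reboot"
    else
        echo "no module package for $(uname -r) yet; wait and check again"
    fi
fi
```

If the second message keeps appearing after a reasonable wait, the build probably failed and rebooting will not load the nvidia driver.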

Note that secure boot must be disabled.
If one wishes to use secure boot the akmods package must be installed and the steps shown in /usr/share/doc/akmods/README.secureboot must be done before any of the above installation steps are performed so the modules are signed and can be loaded with secure boot enabled.
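
To find out which of the two cases applies before installing anything, the current state can be read with mokutil. A minimal sketch, assuming the mokutil package is installed; sb_state is a made-up helper name that only interprets the text mokutil prints.

```shell
# Interpret the text printed by `mokutil --sb-state`.
sb_state() {
    case "$1" in
        *"SecureBoot enabled"*)  echo enabled ;;
        *"SecureBoot disabled"*) echo disabled ;;
        *)                       echo unknown ;;   # e.g. non-UEFI boot
    esac
}

if command -v mokutil >/dev/null 2>&1; then
    state=$(sb_state "$(mokutil --sb-state 2>&1)")
else
    state=unknown
fi
echo "Secure Boot: $state"
```

If this reports "enabled", follow the README.secureboot steps first; otherwise unsigned modules will be built but refused at load time.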

It seems the cuda driver may not be available from rpmfusion for the 390xx nvidia driver so one may need to search for how to install a suitable cuda driver for those older GPUs.

For that old a gpu and using the nvidia 390xx driver one may be able to use the cuda toolkit 8.0 or 9.0 which I found reference to here.

I followed the recipe and the machine booted up with 3 nvidia devices:


Fri Aug  4 15:51:04 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.157                Driver Version: 390.157                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro 2000         Off  | 00000000:04:00.0  On |                  N/A |
| 30%   60C    P0    N/A /  N/A |    456MiB /   964MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TITAN   Off  | 00000000:83:00.0 Off |                  N/A |
| 31%   43C    P8    13W / 250W |      5MiB /  6083MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX TITAN   Off  | 00000000:84:00.0 Off |                  N/A |
| 30%   36C    P8    13W / 250W |      5MiB /  6083MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2552      G   /usr/libexec/Xorg                            244MiB |
|    0      4679      G   /usr/bin/gnome-control-center                 11MiB |
|    0      4794      G   /usr/bin/gnome-shell                         125MiB |
+-----------------------------------------------------------------------------+

I went ahead and installed xorg-x11-drv-nvidia-390xx-cuda

Thank you!

PS I can see that the toolkit will be a whole new challenge!

I apologize.
I failed to find and identify the xorg-x11-drv-nvidia-390xx-cuda package from rpmfusion so I suggested the toolkit directly from nvidia. When you noted the name I checked again and found it.

With that one from rpmfusion one should need nothing else. It enables cuda with nvidia and is tested and configured to work.

Please ignore my branch off into left field and see if everything will just work for you as is without adding anything extra.

Thanks. I am trying to understand if installing xorg-x11-drv-nvidia-390xx-cuda is sufficient.

Although nvidia-smi works, I don’t see nvcc, for example, and I haven’t managed to run any test codes yet. I am assuming that I still need to install CUDA but how that should be done is not clear to me.

Do I need to install CUDA as in the CUDA Howto from RPMFUSION, under "Legacy NVIDIA 340xx/CUDA 6.5 and Fedora 20 (and later)"? The commands are

sudo yum install http://developer.download.nvidia.com/compute/cuda/repos/fedora20/x86_64/cuda-repo-fedora20-6.5-14.x86_64.rpm
sudo yum install cuda

When I try these (with yum → dnf) it seems to trigger a variety of conflicts.

I wonder if I need to use the command

dnf module disable nvidia-driver

to freeze any changes related to nvidia-driver first?

Thanks for any suggestions!

First off, once one has installed the akmod-nvidia-390xx package the system is blocked from upgrading to a newer driver level. The assumption is that one installs that driver because it is needed for the hardware installed and newer drivers are not expected to work.

Those commands pull packages from a different repository, and it is a really bad idea to install similar packages from more than one source. As long as the drivers are loaded and working, please stop making changes, since that is certain to break things at some point.

You posted nvidia-smi output that shows the nvidia drivers loaded, and then added "I went ahead and installed xorg-x11-drv-nvidia-390xx-cuda", so you already seem to have the cuda drivers installed that match the installed nvidia drivers.

Note that the older drivers and supporting apps do not all have the same tools as the latest drivers.

One would only need to disable the nvidia-driver module if it were actually installed and active, and if so it usually prevents a successful install of the rpmfusion packages. One can see if it is enabled with dnf module list --enabled

(1) I have not installed anything beyond xorg-x11-drv-nvidia-390xx-cuda since dnf indicated problems beyond that point (I did not allow ‘dnf install cuda’ [from the fedora20 repos] to make any changes).

(2) At the moment
dnf module list --enabled
doesn’t return anything

(3) I notice that nvidia-smi shows processes that are “attached” or interested in the gpus – I assume that is a good thing; nvidia-settings sees all nvidia devices

(4) BUT I think I may have a basic misunderstanding: how do I compile c code to use the gpu if I don’t have nvcc? how do I write the hello world program in this environment?

I assume you are referring to this

and it would seem that is only available from nvidia.

A quick search for "what is nvcc compiler" on google gives a lot of responses, and by fine tuning the search and reading some of the links you may find what you need.

Note that since you are not using a modern GPU and have a very old version of cuda, the code may not do what you expect, and it is up to you to find the solution that works for your situation. I have never tried to develop or compile code for use on the gpu, so going further than suggesting searches is beyond my expertise.

I would expect that, as long as you do not replace any of the nvidia or cuda drivers (which do originate from nvidia), there is a solution that will work.

Thank you. Can you suggest a piece of code that I can use to test the gpu’s right now before attempting to solve the hello world problem?

I do not have any direct experience as stated above. One may try nvidia.com or one of the nvidia forums for better advice than I might be able to offer.
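
Short of a compiled test program, one thing that can be checked right away is whether the driver enumerates every board. A minimal sketch, assuming the installed nvidia-smi accepts the --query-gpu and --format=csv,noheader options; count_gpus is a made-up helper name, not part of any nvidia tool. It only confirms that the driver sees the cards and does not run any computation on them.

```shell
# Count CSV lines ("index, name, memory") whose name field contains a pattern.
count_gpus() {
    awk -F', ' -v pat="$1" 'index($2, pat) { n++ } END { print n + 0 }'
}

# Only query the driver when nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader | count_gpus TITAN
fi
```

On the machine in this thread the expected count would be 2, matching the two GeForce GTX TITAN rows in the nvidia-smi table posted earlier.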