Fedora 38 and Nvidia CUDA Toolkit

Hi,

I’d been running the latest Nvidia drivers on Fedora 38, along with basic CUDA support, via the RPM Fusion non-free repo, and it was running great. I decided to try some dev work, though, and needed to install the NVIDIA CUDA Toolkit, which I tried to do by following the RPM Fusion guide here. I’m not sure I did it correctly, though. These are the steps they outline (the website says f35, but I swapped in f37):

sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora37/x86_64/cuda-fedora37.repo
sudo dnf clean all
sudo dnf module disable nvidia-driver
sudo dnf -y install cuda
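
For reference, the repo URL in the first command follows a pattern keyed on the Fedora release number. Below is a minimal sketch of that pattern, assuming it holds for other releases; NVIDIA may not publish a repo for every release, so a generated URL is not guaranteed to exist. The helper name cuda_repo_url is hypothetical.

```shell
# Hypothetical helper: build the CUDA repo URL for a given Fedora release.
# The pattern is copied from the command above; it is an assumption that
# other release numbers follow it, and the resulting URL may not exist.
cuda_repo_url() {
    release=$1
    arch=${2:-x86_64}
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/fedora%s/%s/cuda-fedora%s.repo\n' \
        "$release" "$arch" "$release"
}

# Print the URL used in the steps above.
cuda_repo_url 37
```

Running rpm -E %fedora prints the release number of the system itself, which makes a mismatch such as an f37 repo on Fedora 38 easy to spot.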

The thing is that they don’t specify which packages you should have installed before you run sudo dnf -y install cuda. As I already had the RPM Fusion CUDA driver installed (without the toolkit), I was concerned about overwriting it and causing issues, so I removed it before running the commands above. This appears to have worked for the most part, although there are a number of errors when I run dnf update:

sunny@Fed sudo dnf update
Repository vivaldi is listed more than once in the configuration
Last metadata expiration check: 0:02:27 ago on Mon 13 Nov 2023 22:48:37 GMT.
Dependencies resolved.

 Problem: package cuda-12-3-12.3.0-1.x86_64 from cuda-fedora37-x86_64 requires cuda-runtime-12-3 >= 12.3.0, but none of the providers can be installed
  - package cuda-12.3.0-1.x86_64 from cuda-fedora37-x86_64 requires cuda-12-3 >= 12.3.0, but none of the providers can be installed
  - package cuda-runtime-12-3-12.3.0-1.x86_64 from cuda-fedora37-x86_64 requires cuda-drivers >= 545.23.06, but none of the providers can be installed
  - cannot install the best update candidate for package cuda-12.2.2-1.x86_64
  - package cuda-drivers-545.23.06-1.x86_64 from cuda-fedora37-x86_64 is filtered out by modular filtering
===========================================================================================================================================
 Package                              Architecture              Version                      Repository                               Size
===========================================================================================================================================
Skipping packages with broken dependencies:
 cuda-12-3                            x86_64                    12.3.0-1                     cuda-fedora37-x86_64                    7.4 k
 cuda                                 x86_64                    12.3.0-1                     cuda-fedora37-x86_64                    7.3 k
 cuda-runtime-12-3                    x86_64                    12.3.0-1                     cuda-fedora37-x86_64                    7.3 k

Transaction Summary
===========================================================================================================================================
Skip  3 Packages

Nothing to do.
Complete!
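
The line that matters in the output above is the one saying cuda-drivers "is filtered out by modular filtering": the nvidia-driver module was disabled in the earlier steps, so dnf hides the cuda-drivers package that cuda-runtime-12-3 requires. A small sketch for pulling such packages out of dnf output follows; modular_filtered is a hypothetical helper name, and it assumes the message wording shown above.

```shell
# Hypothetical helper: read dnf output on stdin and print every package
# that dnf reports as hidden by modular filtering.
modular_filtered() {
    awk '/is filtered out by modular filtering/ {
        for (i = 1; i < NF; i++)
            if ($i == "package") print $(i + 1)
    }'
}

# Possible usage (not run here):
#   sudo dnf update --assumeno 2>&1 | modular_filtered
#   dnf module list nvidia-driver    # shows whether the module is disabled
```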

The issue comes when I try to compile things: there are complaints about my gcc binary being incompatible with CUDA, so I tried to use a conda environment to install a compatible gcc and g++ (version 12.2). However, that has failed as well, with issues related to glibc incompatibilities. So I’m almost there.
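
A rough way to check for this kind of mismatch before compiling is sketched below. It assumes the installed CUDA release accepts GCC only up to major version 12; that limit is an assumption and should be checked against NVIDIA's documentation for the exact release. host_compiler_ok is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the given GCC version's major number
# is at or below the maximum the CUDA release is assumed to support.
host_compiler_ok() {
    version=$1      # e.g. the output of: gcc -dumpversion
    max_major=$2    # assumed limit; verify for your CUDA release
    major=${version%%.*}
    [ "$major" -le "$max_major" ]
}

# Only query gcc if it is actually installed.
if command -v gcc >/dev/null 2>&1; then
    if host_compiler_ok "$(gcc -dumpversion)" 12; then
        echo "host gcc is within the assumed limit"
    else
        echo "host gcc is newer than the assumed limit"
    fi
fi
```

If an older g++ is installed somewhere on the system, nvcc can be pointed at it with its -ccbin option instead of swapping the default compiler.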

My question is: do you know if I can install Nvidia’s official Fedora 37 version of the CUDA Toolkit alongside the RPM Fusion version in a way that is fully compatible?

I’m clutching at straws tbh though, I’m not sure why Nvidia is treating Fedora as a third-class citizen and not providing updates for the CUDA Toolkit!!

Any help appreciated :slight_smile:

Thanks
Sunny

Remove those 3 packages
cuda-12-3
cuda
cuda-runtime-12-3

They do not belong in any fedora repo and the cuda-fedora37-x86_64 repo should be disabled as well.
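
The two steps above could look roughly like the sketch below. The package names and the repo id cuda-fedora37-x86_64 are taken from the dnf output earlier in the thread. The run wrapper and the APPLY variable are hypothetical additions so that the script only prints the commands unless APPLY=1 is set.

```shell
# Package names and repo id as they appear in the dnf output above.
packages="cuda-12-3 cuda cuda-runtime-12-3"
repo_id="cuda-fedora37-x86_64"

# Hypothetical dry-run wrapper: print the command by default,
# execute it only when APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# $packages is left unquoted on purpose so it splits into separate names.
run sudo dnf remove $packages
run sudo dnf config-manager --set-disabled "$repo_id"
```

dnf reports any listed package that is not actually installed and carries on with the rest, so listing all three names is harmless.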

Is there a reason you are using fedora 38 and chose to use the cuda-fedora37 repo instead?

Note that cuda packages from rpmfusion are mostly not compatible with cuda packages installed by the method you are attempting. You will probably have to move completely to rpmfusion packages, or completely to the drivers directly from nvidia and the cuda-fedora* repos.

Thanks Jeff !

Sorry for the delayed reply, it’s been hectic. The reason I installed the Nvidia-recommended packages is that I need the additional tools included with the CUDA Toolkit, which is an additional 7GB install over the RPM Fusion ones, and a lot of AI projects require a binary called nvcc, which only comes with the CUDA Toolkit.

I mean, on paper the non-free repo provides a CUDA installation, and running nvidia-smi showed that I had CUDA 12.2 or 12.3 (I can’t remember now), so it looked like it should work. But when you actually try to do anything, like compile or run some CUDA-specific code, it falls flat.
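
One detail worth separating out here: the CUDA version printed by nvidia-smi is the highest version the installed driver supports, not proof that the toolkit is installed. Whether nvcc is present has to be checked on its own. A minimal sketch, with have_tool and report as hypothetical helper names:

```shell
# Hypothetical helper: succeed if the named command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: print where a command lives, or that it is missing.
report() {
    if have_tool "$1"; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not on PATH"
    fi
}

report nvidia-smi   # ships with the driver
report nvcc         # ships only with the CUDA Toolkit
```

NVIDIA's own packages install the toolkit under /usr/local/cuda-<version>, so nvcc can be installed and still not be on PATH until that directory's bin folder is added.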

Is there a reason you are using fedora 38 and chose to use the cuda-fedora37 repo instead?

  • Yes, Nvidia hasn’t rolled out a cuda-fedora38 repo yet, afaik, so a lot of other Fedora users and I are having to hack together our drivers in order to get the CUDA Toolkit installed. As such, there isn’t great documentation (or I’ve not found any yet) that explains how to set up Fedora with Nvidia for AI dev with CUDA.

You know what, I’m confused as hell about all the Nvidia drivers!! I would love a clear path on how to set this all up for what I want to do.

So my current understanding is that the options for the Nvidia official route are:

  • Nvidia display drivers in package form, no repo
  • Cuda drivers in repo form (great) but only for F37 (not so great!)

RPM Fusion:

  • Drivers in repo form
  • Cuda drivers in repo form
  • No Toolkit and no nvcc

Am I right about that? What do you recommend I do?

Btw, I found this very interesting-looking GitHub project for Nvidia driver setup on Fedora, not sure if you’ve come across it before? However, it’s not clear to me yet whether it has toolkit install options.

Thanks bud, appreciate your help! :slight_smile:
Sunny