Could not find cuda drivers in tensorflow installation

Hi.
I tried installing TensorFlow on my desktop through Pip. I installed the Nvidia drivers as explained in this YouTube video.

When I run my code, the following warnings are shown and TensorFlow cannot use my GPU:

2024-04-04 09:33:32.411418: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-04 09:33:32.412223: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-04 09:33:32.451307: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-04 09:33:32.644791: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-04 09:33:33.573553: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT

Does anyone know how to install cudart_stub to solve this problem?

I’m using Fedora 39 and this is my GPU details:

$ nvidia-smi 
Thu Apr  4 10:16:25 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   42C    P0            749W /   60W |       5MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      3320      G   /usr/bin/gnome-shell                            1MiB |
+-----------------------------------------------------------------------------------------+

thanks

Please post the output of dnf list installed \*cuda\* so we can see which cuda packages are actually installed.

I don’t use tensorflow but I do have cuda installed.
While the process shown in that YouTube video was prevalent and considered useful in the past, I have never found it useful myself.

I install the nvidia and cuda drivers manually with the command
sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda
and then reboot. Done.

Installing tensorflow does seem to require pip and the later steps in the video, but otherwise the first half of that video is handled by the one-line command above.

I also have one system where the GPU is old enough that a particular app using cuda does not function with cuda 12.3 and later, so I have locked that machine to an older version of the nvidia drivers with cuda 12.2.

You might see if that is the issue with the following

  1. sudo dnf remove \*nvidia\* --exclude nvidia-gpu-firmware
  2. sudo dnf install akmod-nvidia-535\* xorg-x11-drv-nvidia-cuda-535\*

Then reboot and test to see if tensorflow will work with the older version of cuda.
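
One way to run that test after the reboot is a short Python check. This is only a minimal sketch, assuming TensorFlow 2.x is installed in the active Python environment; the function name gpu_report is made up for illustration. It asks TensorFlow which CUDA version it was built against and which GPUs it can see, and returns a small dictionary instead of failing if TensorFlow cannot be imported.

```python
def gpu_report():
    """Return what TensorFlow itself reports about CUDA and visible GPUs."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return {"tensorflow": None, "gpus": []}

    # Build-time information; CPU-only builds may lack the CUDA keys.
    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "cuda_version": build.get("cuda_version"),
        "cudnn_version": build.get("cudnn_version"),
        # Empty list means TensorFlow could not find a usable GPU.
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

If gpus comes back empty while cuda_version is set, that would suggest TensorFlow was built for CUDA but could not load matching CUDA libraries on this machine, which is worth comparing against the installed cuda packages.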

If that works, it may be necessary to lock the nvidia packages to the 535 version so that dnf does not automatically replace them with a newer version.

Thanks for replying.

My installed CUDA-related packages are listed here:

$ dnf list installed \*cuda\* 
Installed Packages
cuda.x86_64                            11.8.0-1        @cuda-fedora35-x86_64    
cuda-11-8.x86_64                       11.8.0-1        @cuda-fedora35-x86_64    
cuda-cccl-11-7.x86_64                  11.7.91-1       @cuda-fedora35-x86_64    
cuda-cccl-11-8.x86_64                  11.8.89-1       @cuda-fedora35-x86_64    
cuda-command-line-tools-11-7.x86_64    11.7.1-1        @cuda-fedora35-x86_64    
cuda-command-line-tools-11-8.x86_64    11.8.0-1        @cuda-fedora35-x86_64    
cuda-compiler-11-7.x86_64              11.7.1-1        @cuda-fedora35-x86_64    
cuda-compiler-11-8.x86_64              11.8.0-1        @cuda-fedora35-x86_64    
cuda-cudart-11-7.x86_64                11.7.99-1       @cuda-fedora35-x86_64    
cuda-cudart-11-8.x86_64                11.8.89-1       @cuda-fedora35-x86_64    
cuda-cudart-devel-11-7.x86_64          11.7.99-1       @cuda-fedora35-x86_64    
cuda-cudart-devel-11-8.x86_64          11.8.89-1       @cuda-fedora35-x86_64    
cuda-cuobjdump-11-7.x86_64             11.7.91-1       @cuda-fedora35-x86_64    
cuda-cuobjdump-11-8.x86_64             11.8.86-1       @cuda-fedora35-x86_64    
cuda-cupti-11-7.x86_64                 11.7.101-1      @cuda-fedora35-x86_64    
cuda-cupti-11-8.x86_64                 11.8.87-1       @cuda-fedora35-x86_64    
cuda-cuxxfilt-11-7.x86_64              11.7.91-1       @cuda-fedora35-x86_64    
cuda-cuxxfilt-11-8.x86_64              11.8.86-1       @cuda-fedora35-x86_64    
cuda-demo-suite-11-8.x86_64            11.8.86-1       @cuda-fedora35-x86_64    
cuda-documentation-11-7.x86_64         11.7.91-1       @cuda-fedora35-x86_64    
cuda-documentation-11-8.x86_64         11.8.86-1       @cuda-fedora35-x86_64    
cuda-driver-devel-11-7.x86_64          11.7.99-1       @cuda-fedora35-x86_64    
cuda-driver-devel-11-8.x86_64          11.8.89-1       @cuda-fedora35-x86_64    
cuda-gdb-11-7.x86_64                   11.7.91-1       @cuda-fedora35-x86_64    
cuda-gdb-11-8.x86_64                   11.8.86-1       @cuda-fedora35-x86_64    
cuda-libraries-11-7.x86_64             11.7.1-1        @cuda-fedora35-x86_64    
cuda-libraries-11-8.x86_64             11.8.0-1        @cuda-fedora35-x86_64    
cuda-libraries-devel-11-7.x86_64       11.7.1-1        @cuda-fedora35-x86_64    
cuda-libraries-devel-11-8.x86_64       11.8.0-1        @cuda-fedora35-x86_64    
cuda-memcheck-11-7.x86_64              11.7.91-1       @cuda-fedora35-x86_64    
cuda-memcheck-11-8.x86_64              11.8.86-1       @cuda-fedora35-x86_64    
cuda-nsight-11-7.x86_64                11.7.91-1       @cuda-fedora35-x86_64    
cuda-nsight-11-8.x86_64                11.8.86-1       @cuda-fedora35-x86_64    
cuda-nsight-compute-11-7.x86_64        11.7.1-1        @cuda-fedora35-x86_64    
cuda-nsight-compute-11-8.x86_64        11.8.0-1        @cuda-fedora35-x86_64    
cuda-nsight-systems-11-7.x86_64        11.7.1-1        @cuda-fedora35-x86_64    
cuda-nsight-systems-11-8.x86_64        11.8.0-1        @cuda-fedora35-x86_64    
cuda-nvcc-11-7.x86_64                  11.7.99-1       @cuda-fedora35-x86_64    
cuda-nvcc-11-8.x86_64                  11.8.89-1       @cuda-fedora35-x86_64    
cuda-nvdisasm-11-7.x86_64              11.7.91-1       @cuda-fedora35-x86_64    
cuda-nvdisasm-11-8.x86_64              11.8.86-1       @cuda-fedora35-x86_64    
cuda-nvml-devel-11-7.x86_64            11.7.91-1       @cuda-fedora35-x86_64    
cuda-nvml-devel-11-8.x86_64            11.8.86-1       @cuda-fedora35-x86_64    
cuda-nvprof-11-7.x86_64                11.7.101-1      @cuda-fedora35-x86_64    
cuda-nvprof-11-8.x86_64                11.8.87-1       @cuda-fedora35-x86_64    
cuda-nvprune-11-7.x86_64               11.7.91-1       @cuda-fedora35-x86_64    
cuda-nvprune-11-8.x86_64               11.8.86-1       @cuda-fedora35-x86_64    
cuda-nvrtc-11-7.x86_64                 11.7.99-1       @cuda-fedora35-x86_64    
cuda-nvrtc-11-8.x86_64                 11.8.89-1       @cuda-fedora35-x86_64    
cuda-nvrtc-devel-11-7.x86_64           11.7.99-1       @cuda-fedora35-x86_64    
cuda-nvrtc-devel-11-8.x86_64           11.8.89-1       @cuda-fedora35-x86_64    
cuda-nvtx-11-7.x86_64                  11.7.91-1       @cuda-fedora35-x86_64    
cuda-nvtx-11-8.x86_64                  11.8.86-1       @cuda-fedora35-x86_64    
cuda-nvvp-11-7.x86_64                  11.7.101-1      @cuda-fedora35-x86_64    
cuda-nvvp-11-8.x86_64                  11.8.87-1       @cuda-fedora35-x86_64    
cuda-profiler-api-11-8.x86_64          11.8.86-1       @cuda-fedora35-x86_64    
cuda-runtime-11-8.x86_64               11.8.0-1        @cuda-fedora35-x86_64    
cuda-sanitizer-11-7.x86_64             11.7.91-1       @cuda-fedora35-x86_64    
cuda-sanitizer-11-8.x86_64             11.8.86-1       @cuda-fedora35-x86_64    
cuda-toolkit-11-7.x86_64               11.7.1-1        @cuda-fedora35-x86_64    
cuda-toolkit-11-7-config-common.noarch 11.7.99-1       @cuda-fedora35-x86_64    
cuda-toolkit-11-8.x86_64               11.8.0-1        @cuda-fedora35-x86_64    
cuda-toolkit-11-8-config-common.noarch 11.8.89-1       @cuda-fedora35-x86_64    
cuda-toolkit-11-config-common.noarch   11.8.89-1       @cuda-fedora35-x86_64    
cuda-toolkit-config-common.noarch      11.8.89-1       @cuda-fedora35-x86_64    
cuda-tools-11-7.x86_64                 11.7.1-1        @cuda-fedora35-x86_64    
cuda-tools-11-8.x86_64                 11.8.0-1        @cuda-fedora35-x86_64    
cuda-visual-tools-11-7.x86_64          11.7.1-1        @cuda-fedora35-x86_64    
cuda-visual-tools-11-8.x86_64          11.8.0-1        @cuda-fedora35-x86_64    
xorg-x11-drv-nvidia-cuda.x86_64        3:550.67-1.fc39 @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-cuda-libs.i686     3:550.67-1.fc39 @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-cuda-libs.x86_64   3:550.67-1.fc39 @rpmfusion-nonfree-nvidia-driver

I already have both akmod-nvidia and xorg-x11-drv-nvidia-cuda installed.

I installed the older version of the Nvidia drivers (CUDA 12.2) as you suggested, but TensorFlow still has the same problem.

PyTorch works properly with my GPU, and TensorFlow seems too hard to configure for it, so I will use PyTorch instead.

Thanks for your time

I see that you have those packages installed from the cuda-fedora35-x86_64 repo.

Many see problems with using cuda drivers and other packages from a cuda-fedoraXX repo when also installing nvidia drivers from rpmfusion.

To avoid conflicts between packages installed from differing repos, I suggest not mixing the repos.

I also see that you are probably using Fedora 39 with a much older repo (fedora35) for those cuda packages, which by itself may cause problems, along with the fact that you have different versions of the same packages installed. Examples include cuda-cccl, cuda-nvcc and cuda-toolkit, each of which appears in your list at both 11.7 and 11.8.
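
To spot such duplicates without reading the whole list by eye, something like the following could be used. It is a rough sketch that assumes the three-column layout shown in your dnf list installed output and the cuda-NAME-MAJOR-MINOR naming used by the nvidia repo; the function name duplicate_cuda_packages and the sample text are only for illustration.

```python
from collections import defaultdict
import re


def duplicate_cuda_packages(dnf_output):
    """Map base package name -> versions, for names installed more than once."""
    versions = defaultdict(set)
    for line in dnf_output.splitlines():
        parts = line.split()
        # Package rows have three columns: name.arch, version, @repo.
        if len(parts) != 3 or not parts[2].startswith("@"):
            continue
        name = parts[0].rsplit(".", 1)[0]  # drop the architecture suffix
        # Names such as cuda-cccl-11-7 end in MAJOR-MINOR.
        match = re.fullmatch(r"(.+)-(\d+-\d+)", name)
        if match:
            versions[match.group(1)].add(match.group(2).replace("-", "."))
    # Versions are sorted as plain strings, which is enough for a quick look.
    return {n: sorted(v) for n, v in versions.items() if len(v) > 1}


sample = """Installed Packages
cuda-cccl-11-7.x86_64                  11.7.91-1       @cuda-fedora35-x86_64
cuda-cccl-11-8.x86_64                  11.8.89-1       @cuda-fedora35-x86_64
cuda-demo-suite-11-8.x86_64            11.8.86-1       @cuda-fedora35-x86_64
xorg-x11-drv-nvidia-cuda.x86_64        3:550.67-1.fc39 @rpmfusion-nonfree-nvidia-driver
"""
print(duplicate_cuda_packages(sample))
# prints {'cuda-cccl': ['11.7', '11.8']}
```

To use it on a real system, the output of dnf list installed \*cuda\* could be saved to a file and passed in as a string.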

There is a newer repo, cuda-fedora39-x86_64, which I enabled according to the instructions at the nvidia site, but when I did a simple test to install cuda from there I got this message:

# dnf install cuda*
cuda-fedora39-x86_64                                                                                145 kB/s | 182 kB     00:01    
Error: 
 Problem 1: package cuda-drivers-fabricmanager-550-550.54.15-1.x86_64 from cuda-fedora39-x86_64 requires cuda-drivers-550 = 550.54.15, but none of the providers can be installed
  - installed package xorg-x11-drv-nvidia-cuda-3:550.67-1.fc39.x86_64 obsoletes cuda-drivers < 550.67.100 provided by cuda-drivers-550.54.15-1.x86_64 from cuda-fedora39-x86_64
  - cannot install the best candidate for the job
  - problem with installed package xorg-x11-drv-nvidia-cuda-3:550.67-1.fc39.x86_64
 Problem 2: package cuda-drivers-fabricmanager-550-550.54.15-1.x86_64 from cuda-fedora39-x86_64 requires cuda-drivers-550 = 550.54.15, but none of the providers can be installed
  - package cuda-drivers-fabricmanager-550.54.15-1.x86_64 from cuda-fedora39-x86_64 requires cuda-drivers-fabricmanager-550 = 550.54.15, but none of the providers can be installed
  - installed package xorg-x11-drv-nvidia-cuda-3:550.67-1.fc39.x86_64 obsoletes cuda-drivers < 550.67.100 provided by cuda-drivers-550.54.15-1.x86_64 from cuda-fedora39-x86_64
  - cannot install the best candidate for the job
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)

which shows some of the conflict between packages directly from nvidia and packages from rpmfusion.

The driver from rpmfusion (550.67) reports cuda version 12.4, while your cuda packages from nvidia are at both 11.7 and 11.8. So there is a version mismatch between the driver (rpmfusion) and the cuda packages (nvidia), as well as two different versions of the same packages installed side by side.

I would guess that if you either removed the cuda packages installed directly from nvidia, or instead updated the repo to the proper one for Fedora 39 and then cleaned up the cuda package version problems, it might solve your issues.
