Hi.
I tried installing TensorFlow on my desktop with pip. I installed the Nvidia drivers as explained in this YouTube video.
When I run any code, the following messages are shown and TensorFlow cannot use my GPU:
2024-04-04 09:33:32.411418: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-04-04 09:33:32.412223: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-04 09:33:32.451307: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-04-04 09:33:32.644791: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-04 09:33:33.573553: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Does anyone know how to install cudart_stub to solve this problem?
I’m using Fedora 39 and these are my GPU details:
$ nvidia-smi
Thu Apr 4 10:16:25 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67 Driver Version: 550.67 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3050 ... Off | 00000000:01:00.0 Off | N/A |
| N/A 42C P0 749W / 60W | 5MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 3320 G /usr/bin/gnome-shell 1MiB |
+-----------------------------------------------------------------------------------------+
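A note on the question above: cudart_stub.cc in the log is the name of the TensorFlow source file that printed the message, not a package that can be installed. A minimal Python sketch for checking what is actually visible follows, assuming Python 3 and only the standard library. The function names find_cuda_libraries and tensorflow_gpus are made up for this example. ctypes.util.find_library only searches the system loader path, so CUDA libraries that pip installed inside a virtual environment would not show up here.

```python
import ctypes.util
import importlib.util


def find_cuda_libraries(names=("cuda", "cudart", "cudnn")):
    """Map each library name to what the system loader finds, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


def tensorflow_gpus():
    """Return the GPU device names TensorFlow reports.

    Returns None when TensorFlow is not installed in this interpreter.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf  # imported lazily; this can take a few seconds

    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    for name, location in find_cuda_libraries().items():
        print(f"lib{name}: {location or 'not found on the loader path'}")
    print("TensorFlow GPUs:", tensorflow_gpus())
```

An empty list from tensorflow_gpus() would match the "GPU will not be used" message; a None for cuda would point at the driver packages rather than at TensorFlow itself.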
Please post the output of dnf list installed \*cuda\* so we can see which cuda packages are actually installed.
I don’t use tensorflow but I do have cuda installed.
While the process shown in that YouTube video was prevalent and useful in the past, I have never found it useful.
I install the nvidia and cuda drivers manually with the command sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda followed by a reboot. DONE.
Installing tensorflow does seem to require pip and the following steps, but otherwise the first half of that video is handled by the one-line command above.
I also have one system where the gpu is old enough that cuda 12.3 and later does not permit a particular app using cuda to function, so I have locked that machine to an older version of the nvidia drivers and cuda 12.2.
You might see if that is the issue with the following
I see that you have those packages installed from the cuda-fedora35-x86_64 repo.
Many people see problems when using cuda drivers and other packages from a cuda-fedoraXX repo while also installing nvidia drivers from rpmfusion.
To avoid conflicts between packages installed from different repos, I suggest not mixing the repos.
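One way to spot mixed repos is to group the installed packages by the repo they came from. The sketch below is a rough illustration in Python, assuming the usual three-column layout of dnf list installed (name.arch, version-release, @repo). The function name packages_by_repo and the sample text are made up for this example; the sample is not the poster's real package list, and the rpmfusion repo id in it is a placeholder.

```python
from collections import defaultdict


def packages_by_repo(dnf_output):
    """Group package names by the repo column of `dnf list installed` output."""
    repos = defaultdict(list)
    for line in dnf_output.splitlines():
        fields = line.split()
        # Expect name.arch, version-release, @repo; skip headers and wrapped lines.
        if len(fields) != 3 or not fields[2].startswith("@"):
            continue
        repos[fields[2].lstrip("@")].append(fields[0])
    return dict(repos)


# Hypothetical sample for illustration only.
SAMPLE = """\
Installed Packages
cuda-cudart-11-7.x86_64            11.7.99-1          @cuda-fedora35-x86_64
cuda-cudart-11-8.x86_64            11.8.89-1          @cuda-fedora35-x86_64
xorg-x11-drv-nvidia-cuda.x86_64    3:550.67-1.fc39    @rpmfusion-nonfree
"""

if __name__ == "__main__":
    for repo, names in sorted(packages_by_repo(SAMPLE).items()):
        print(f"{repo}: {', '.join(names)}")
```

If more than one repo shows up for the cuda and nvidia packages, that is the kind of mix described above.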
I also see that you are probably using Fedora 39 with a much older repo for those cuda packages, which by itself may cause problems, and that you have different versions of the same packages installed. Examples include
There is a newer repo, cuda-fedora39-x86_64, which I enabled according to the instructions at the nvidia site, but when I did a simple test to install cuda from there I got this message:
# dnf install cuda*
cuda-fedora39-x86_64 145 kB/s | 182 kB 00:01
Error:
Problem 1: package cuda-drivers-fabricmanager-550-550.54.15-1.x86_64 from cuda-fedora39-x86_64 requires cuda-drivers-550 = 550.54.15, but none of the providers can be installed
- installed package xorg-x11-drv-nvidia-cuda-3:550.67-1.fc39.x86_64 obsoletes cuda-drivers < 550.67.100 provided by cuda-drivers-550.54.15-1.x86_64 from cuda-fedora39-x86_64
- cannot install the best candidate for the job
- problem with installed package xorg-x11-drv-nvidia-cuda-3:550.67-1.fc39.x86_64
Problem 2: package cuda-drivers-fabricmanager-550-550.54.15-1.x86_64 from cuda-fedora39-x86_64 requires cuda-drivers-550 = 550.54.15, but none of the providers can be installed
- package cuda-drivers-fabricmanager-550.54.15-1.x86_64 from cuda-fedora39-x86_64 requires cuda-drivers-fabricmanager-550 = 550.54.15, but none of the providers can be installed
- installed package xorg-x11-drv-nvidia-cuda-3:550.67-1.fc39.x86_64 obsoletes cuda-drivers < 550.67.100 provided by cuda-drivers-550.54.15-1.x86_64 from cuda-fedora39-x86_64
- cannot install the best candidate for the job
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
This shows some of the conflict between packages directly from nvidia and packages from rpmfusion.
The driver from rpmfusion (version 550.67) supports cuda 12.4, and your cuda packages from nvidia are versions 11.7 and 11.8 (both installed), so there is a version mismatch between the driver (rpmfusion) and the cuda packages (nvidia), as well as a mismatch among the installed cuda packages themselves.
I would guess that if you either removed the cuda packages installed directly from nvidia, or updated the repo to the proper one for fedora 39 and then cleaned up the cuda package version problems, it may solve your issues.
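To find which packages are installed in more than one version before cleaning up, something like the following Python sketch may help. It assumes the packages carry the version in their name as name-MAJOR-MINOR.arch (for example cuda-cudart-11-8.x86_64); that naming pattern, the function name multi_version_packages, and the sample names are assumptions for illustration, not taken from the actual package list.

```python
import re
from collections import defaultdict

# Matches names such as "cuda-cudart-11-8.x86_64" (assumed naming pattern).
_VERSIONED = re.compile(r"^(?P<base>.+?)-(?P<major>\d+)-(?P<minor>\d+)\.\w+$")


def multi_version_packages(package_names):
    """Return {base name: [versions]} for packages present in several versions."""
    versions = defaultdict(set)
    for name in package_names:
        match = _VERSIONED.match(name)
        if match:
            key = (int(match["major"]), int(match["minor"]))
            versions[match["base"]].add(key)
    return {
        base: [f"{major}.{minor}" for major, minor in sorted(found)]
        for base, found in versions.items()
        if len(found) > 1
    }


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    installed = [
        "cuda-cudart-11-7.x86_64",
        "cuda-cudart-11-8.x86_64",
        "cuda-nvcc-11-8.x86_64",
        "xorg-x11-drv-nvidia-cuda.x86_64",
    ]
    print(multi_version_packages(installed))
```

Packages reported here are candidates for removal with dnf, keeping at most one version of each.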