I’ve finally found the solution (it has nothing to do with a missing tensorflow-gpu or keras-gpu package)!
You do indeed need to build tf yourself, and I did this by following the instructions at build from source for the docker method (but I used podman instead of docker). (This is exactly what @akza suggested.)
In this case, you use a docker image as the build environment:
mkdir docker-tensorflow
cd docker-tensorflow
podman run --gpus all -it -w /tensorflow_src -v $PWD:/mnt:z -e HOST_PERMS="$(id -u):$(id -g)" tensorflow/tensorflow:devel-gpu bash
and then follow the instructions given in the link to do a GPU build in docker. Building tf took several hours on my machine. When the build finishes, you can quit the docker container, and you’ll find the wheel in the mounted directory:
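For reference, the commands run inside the container roughly follow the official docker GPU build instructions. Treat this as a sketch: the exact bazel target and prompts may differ for your TensorFlow version.

```shell
# Inside the tensorflow/tensorflow:devel-gpu container, in /tensorflow_src:

git pull        # make sure the source tree is current
./configure     # answer the prompts; enable CUDA support

# Build the pip package with CUDA enabled (this is the multi-hour step)
bazel build --config=cuda //tensorflow/tools/pip_package:build_pip_package

# Write the wheel into the mounted host directory
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /mnt

# Make the wheel owned by your host user instead of root
chown $HOST_PERMS /mnt/tensorflow-*.whl
```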
$ ls
tensorflow-2.8.0-cp38-cp38-linux_x86_64.whl
You now have a TensorFlow wheel for Python 3.8 (because that is the Python version installed in the docker image). Hence:
conda create --name tf
conda activate tf
conda install python=3.8
pip install tensorflow-2.8.0-cp38-cp38-linux_x86_64.whl
will install tf in a newly created (ana)conda env ‘tf’.
Trying to use it, you will probably run into:
$ python
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
2021-11-07 16:22:21.912950: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2021-11-07 16:22:21.912969: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
The solution for this is to install the latest version of cuDNN. I unpacked the distribution under /opt and set:
export LD_LIBRARY_PATH=/opt/cuda/lib64:$LD_LIBRARY_PATH
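As a quick sanity check before launching Python with TensorFlow, you can ask the dynamic loader directly whether it can now find libcudnn. This is just a generic ctypes check on Linux, not part of TensorFlow:

```python
import ctypes

def can_load(libname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False

# This only succeeds if libcudnn's directory is on LD_LIBRARY_PATH
# (or in one of the standard loader paths).
print(can_load("libcudnn.so.8"))
```

If this prints `False` in the same shell where you exported `LD_LIBRARY_PATH`, the path is wrong and TensorFlow will hit the same `dlerror` as above.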
$ python
Python 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:57:06)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2021-11-07 17:40:55.694335: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-07 17:40:55.748089: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-11-07 17:40:56.212687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:0 with 3327 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13785827214071798721
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 3489464320
locality {
bus_id: 1
links {
}
}
incarnation: 11688225645642055078
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5"
xla_global_id: 416903419
]