I’ve finally found the solution (it has nothing to do with a missing tensorflow-gpu or keras-gpu package)!
You do indeed need to build tf yourself, and I did this by following the instructions at build from source for the docker method (but I used podman instead of docker). (This is exactly what @akza suggested.)
In this case, you use a docker image as the build environment:
mkdir docker-tensorflow
cd docker-tensorflow
podman run --gpus all -it -w /tensorflow_src -v $PWD:/mnt:z -e HOST_PERMS="$(id -u):$(id -g)" tensorflow/tensorflow:devel-gpu bash
and then follow the instructions given in the link to do a GPU build in docker. Building tf took several hours on my machine. When the build finishes, you can quit the docker container, and you’ll find the wheel in the mounted directory:
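For reference, the commands run inside the container roughly follow the official docker GPU build instructions. Treat this as a sketch: the exact bazel target and prompts may differ for your TensorFlow version.

```shell
# Inside the tensorflow/tensorflow:devel-gpu container, in /tensorflow_src:

git pull        # make sure the source tree is current
./configure     # answer the prompts; enable CUDA support

# Build the pip package with CUDA enabled (this is the multi-hour step)
bazel build --config=cuda //tensorflow/tools/pip_package:build_pip_package

# Write the wheel into the mounted host directory
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /mnt

# Make the wheel owned by your host user instead of root
chown $HOST_PERMS /mnt/tensorflow-*.whl
```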
$ ls
tensorflow-2.8.0-cp38-cp38-linux_x86_64.whl
You now have a TensorFlow wheel for Python 3.8 (because that is the Python version installed in the docker image). Hence:
conda create --name tf
conda activate tf
conda install python=3.8
pip install tensorflow-2.8.0-cp38-cp38-linux_x86_64.whl
will install tf in a newly created (ana)conda env ‘tf’.
Trying to use it, you will probably run into:
$ python
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
2021-11-07 16:22:21.912950: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2021-11-07 16:22:21.912969: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
The solution for this is to install the latest version of cuDNN. I unpacked the distribution under /opt and set:
export LD_LIBRARY_PATH=/opt/cuda/lib64:$LD_LIBRARY_PATH
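As a quick sanity check before launching Python with TensorFlow, you can ask the dynamic loader directly whether it can now find libcudnn. This is just a generic ctypes check on Linux, not part of TensorFlow:

```python
import ctypes

def can_load(libname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False

# This only succeeds if libcudnn's directory is on LD_LIBRARY_PATH
# (or in one of the standard loader paths).
print(can_load("libcudnn.so.8"))
```

If this prints `False` in the same shell where you exported `LD_LIBRARY_PATH`, the path is wrong and TensorFlow will hit the same `dlerror` as above.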
$ python
Python 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:57:06)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2021-11-07 17:40:55.694335: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-07 17:40:55.748089: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-11-07 17:40:56.212687: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:0 with 3327 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13785827214071798721
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 3489464320
locality {
bus_id: 1
links {
}
}
incarnation: 11688225645642055078
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5"
xla_global_id: 416903419
]