TensorFlow is a well-known open-source machine learning framework, used in both industry and academia, that supports CPU- and GPU-based computation.
With CPU-only support, everything is straightforward: you can easily install TensorFlow with pip, Anaconda, or Docker. GPU support, however, is a bit more complicated.
The official TensorFlow website suggests the Docker image as the best way to use TensorFlow with a GPU. However, I believe that building it from source is the best choice, since the Docker image can fail because of a minor version mismatch in any of the packages involved.
I struggled for a week to install TensorFlow with GPU support on Fedora 33; I tried several approaches, including various versions of the NVIDIA drivers, CUDA installers, Docker images, and even NVIDIA's TensorFlow images.
Finally, I successfully built and installed TensorFlow 2.5 (both the GPU and the CPU builds) on Fedora 33 with kernel 5.9.16 as follows. I hope this post is helpful for you as well.
Instructions for TensorFlow 2.9 (with TensorRT support) on Fedora 35 are explained in Release 1 of this post.
- The most important steps are 1, 2, and 3; after each step, follow the instructions to ensure that the packages are installed properly.
- The following steps also apply when the versions of your kernel, CUDA, and NVIDIA driver differ from the ones used here; however, make sure that both the NVIDIA driver and the cuDNN package support the actual version of your CUDA package.
You need root privilege only for steps 1 to 7.
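The compatibility note above can be checked mechanically once steps 2 and 3 below are done. The following is a minimal sketch, not part of the original procedure: it assumes that `nvcc --version` prints a line containing `release X.Y`, and that cuDNN 8 defines `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` in `cudnn_version.h`. The function names are hypothetical, and the sample inputs are written by hand for illustration.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return the CUDA release (e.g. "11.1") found in `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the text of cudnn_version.h, or None."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + macro + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None  # macro missing: not a cuDNN 8 style version header
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Hand-written sample inputs (assumptions); on a real system, pass in the
    # output of `nvcc --version` and the contents of cudnn_version.h instead.
    sample_nvcc = "Cuda compilation tools, release 11.1, V11.1.105"
    sample_header = (
        "#define CUDNN_MAJOR 8\n"
        "#define CUDNN_MINOR 0\n"
        "#define CUDNN_PATCHLEVEL 5\n"
    )
    print(parse_nvcc_release(sample_nvcc))
    print(parse_cudnn_version(sample_header))
```

With an RPM installation the header is typically found under /usr/include, but that location is an assumption worth verifying on your own system.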
-
Install an NVIDIA driver that supports CUDA 11.1 by following the instructions here
-
Install CUDA Toolkit version 11.1 by following the instructions here
-
Download cuDNN version 8.0.5 (libcudnn8, libcudnn8-devel, libcudnn8-samples), which is compatible with CUDA 11.1, from here (you need to create a free account) and install the packages with the
rpm -Uvh libcudnn8*.rpm
command. Then, verify the installation as follows (note that the "freeimage" and "freeimage-devel" packages must be installed to build the cuDNN samples):
cp -r /usr/src/cudnn_samples_v8/ $HOME
cd $HOME/cudnn_samples_v8/mnistCUDNN
make clean && make
./mnistCUDNN
If cuDNN is properly installed and running on your Linux system, you will see a long message with Test passed! at the end.
-
Install Python 3.8 and the corresponding python-devel package:
sudo dnf install python38
-
Create a virtual environment in which to build and install TensorFlow, and activate it (note that you can use python instead of python3 if your system recognizes python as the default Python 3.8 interpreter; you can also create the virtual environment as a regular user rather than root):
cd 'your_desire_dir'
python3 -m venv 'your_venv_name'
source 'your_venv_name'/bin/activate
Now, your bash prompt looks like this: (your_venv_name) [root@XX XX]$
-
Install git:
sudo dnf install git
-
Install the perl-core package:
sudo dnf install perl-core
-
Install the required prerequisite packages:
pip install -U pip numpy wheel
pip install -U keras_preprocessing --no-deps
-
Download the bazel 3.7.2 executable for Linux from here, rename it to bazel, make it executable (chmod +x bazel), and add the directory that contains it to PATH:
export PATH=$PATH:'your_dir'
-
Download the TensorFlow source code:
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
This downloads the most recent version; you can choose another version as follows:
git checkout version_name # r2.2, r2.3, etc.
-
Configure the build process by executing
python configure.py
and answering its questions as follows:
Note: do not change the default location of Python
Note: Do you wish to build TensorFlow with ROCm support? [y/N]: n
Note: Do you wish to build TensorFlow with CUDA support? [y/N]: y
Note: Do you want to use clang as CUDA compiler? [y/N]: n
Note: Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -Wno-sign-compare]: -march=native
-
Start the build process (it takes a few hours to complete; --config=v2 builds TensorFlow 2.x):
bazel build --config=cuda --config=v2 --config=nonccl --config=xla //tensorflow/tools/pip_package:build_pip_package --verbose_failures
-
Build the wheel (.whl) file, which will be used to install TensorFlow 2.5:
./bazel-bin/tensorflow/tools/pip_package/build_pip_package 'your_desire_dir'
-
Install TensorFlow 2.5
pip install 'your_desire_dir'/tensorflow-2.5.0-*.whl
-
Save the following Python code in test.py and execute it (
python test.py
):
import tensorflow as tf
from tensorflow.python.client import device_lib
print("TF Version: "+tf.__version__)
print()
print(device_lib.list_local_devices())
You should see a line starting with "TF Version: 2.5" and a long console output containing something like the following:
Created TensorFlow device (/device:GPU:0 with 22434 MB memory) -> physical GPU (device: 0, name: TITAN RTX, pci bus id: 0000:01:00.0, compute capability: 7.5)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 17126562804018110364
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 23523885056
locality {
  bus_id: 1
  links {
  }
}
incarnation: 17839688261725336163
physical_device_desc: "device: 0, name: TITAN RTX, pci bus id: 0000:01:00.0, compute capability: 7.5"
]
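The test above relies on `device_lib`, which lives in an internal TensorFlow module. As a hedged alternative sketch, the public `tf.config.list_physical_devices` and `tf.test.is_built_with_cuda` calls report similar information. The helper name `describe_tensorflow` is hypothetical, and the import is guarded so that the script also runs in an environment where TensorFlow is not installed.

```python
def describe_tensorflow():
    """Summarize the installed TensorFlow build, or return None if it is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "version": tf.__version__,
        # True only if the wheel was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Empty list if no GPU is visible to TensorFlow at runtime.
        "gpus": [device.name for device in gpus],
    }


if __name__ == "__main__":
    info = describe_tensorflow()
    if info is None:
        print("TensorFlow is not installed in this environment")
    else:
        for key, value in info.items():
            print(key, "=", value)
```

A build that reports `built_with_cuda` as true but an empty `gpus` list usually points at a driver or library problem rather than at the build itself.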
Building TensorFlow with CPU-only support is very similar to the GPU build described above, with a few differences:
-
Steps 1, 2, and 3 are unnecessary, of course.
-
Follow steps 4 to 10.
-
Configure the build process by executing
python configure.py
and answering its questions as follows:
Note: do not change the default location of Python
Note: Do you wish to build TensorFlow with CUDA support? [y/N]: n
Note: Do you want to use clang as CUDA compiler? [y/N]: n
Note: Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -Wno-sign-compare]: -march=native
-
Start the build process (it takes a few hours to complete; --config=v2 builds TensorFlow 2.x):
bazel build --config=v2 --config=nonccl //tensorflow/tools/pip_package:build_pip_package --verbose_failures
-
Build the wheel (.whl) file, which will be used to install TensorFlow 2.5:
./bazel-bin/tensorflow/tools/pip_package/build_pip_package 'your_desire_dir'
-
Install TensorFlow 2.5; do not install it in the same virtual environment as the GPU build.
pip install 'your_desire_dir'/tensorflow-2.5.0-*.whl
-
Save the following Python code in test.py and execute it (
python test.py
):
import tensorflow as tf
from tensorflow.python.client import device_lib
print("TF Version: "+tf.__version__)
print()
print(device_lib.list_local_devices())
You should see a line starting with "TF Version: 2.5" and a long console output containing something like the following:
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 17126562804018110364
]
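The GPU and CPU-only builds differ only in the `--config` flags passed to bazel. The sketch below assembles both command lines from the flags used in this post so the difference is easy to see; the function name `bazel_build_command` is hypothetical, and the script only prints the commands rather than running them.

```python
import shlex

PIP_PACKAGE_TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def bazel_build_command(with_cuda):
    """Assemble the bazel invocation used in this post for a GPU or CPU-only build."""
    if with_cuda:
        # The GPU build additionally enables the cuda and xla configs.
        configs = ["cuda", "v2", "nonccl", "xla"]
    else:
        configs = ["v2", "nonccl"]
    args = ["bazel", "build"]
    args += ["--config=" + name for name in configs]
    args += [PIP_PACKAGE_TARGET, "--verbose_failures"]
    return args


if __name__ == "__main__":
    for with_cuda in (True, False):
        label = "GPU build:" if with_cuda else "CPU build:"
        print(label, shlex.join(bazel_build_command(with_cuda)))
```

Returning a list of arguments rather than a single string means the result can be passed directly to `subprocess.run` without shell quoting issues, should you want to script the build.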
To compare CPU and GPU performance, save the following Python code in check.py and execute it (python check.py):
import tensorflow as tf
import time

# Time 999 x 999 reductions of random 1000x1000 tensors on the CPU
start = time.perf_counter()
with tf.device('/CPU:0'):
    for i in range(1, 1000):
        if i % 100 == 0:
            print(i)  # progress indicator
        for j in range(1, 1000):
            tf.reduce_sum(tf.random.normal([1000, 1000]))
stop = time.perf_counter()
time_passed = stop - start
print("time with CPU: " + str(round((time_passed / 60), 3)))  # in minutes
print()

# Repeat the same workload on the GPU
start = time.perf_counter()
with tf.device('/GPU:0'):
    for i in range(1, 1000):
        if i % 100 == 0:
            print(i)  # progress indicator
        for j in range(1, 1000):
            tf.reduce_sum(tf.random.normal([1000, 1000]))
stop = time.perf_counter()
time_passed = stop - start
print("time with GPU: " + str(round((time_passed / 60), 3)))  # in minutes
On my machine, with an Intel i9-9900K CPU (3.60 GHz), 128 GB of RAM, and a TITAN RTX GPU, the output is "time with CPU: 25.827" and "time with GPU: 2.054" (both in minutes).
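To put these two numbers in perspective, the speedup is simply the ratio of the two timings. The snippet below is a small sketch of that arithmetic, together with a generic timing helper; `timed_minutes` and `speedup` are hypothetical names and are not part of check.py.

```python
import time


def timed_minutes(fn, repeats=1):
    """Run fn() `repeats` times and return the elapsed wall-clock time in minutes."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / 60


def speedup(cpu_minutes, gpu_minutes):
    """Return how many times faster the GPU run was than the CPU run."""
    if gpu_minutes <= 0:
        raise ValueError("gpu_minutes must be positive")
    return cpu_minutes / gpu_minutes


if __name__ == "__main__":
    # Timings reported in this post for the i9-9900K and the TITAN RTX, in minutes.
    print(round(speedup(25.827, 2.054), 2))  # prints 12.57
```

For this particular workload, then, the GPU is roughly 12.6 times faster than the CPU; other workloads will give different ratios.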
During the installation of a new kernel, the NVIDIA DKMS module is uninstalled and then built and installed again. However, when the installed NVIDIA driver is not compatible with the new kernel, the module is uninstalled and is not installed again, so you cannot log in with the newly installed kernel. In this case:
- Log in with an older kernel that is already installed
- Download the NVIDIA driver that is compatible with both your existing CUDA version and the new kernel; make it executable
- Uninstall the current NVIDIA driver with "nvidia-installer --uninstall"
- Reboot and log in at runlevel 3 (text mode with networking, by adding 'init 3' to the end of the boot command in GRUB) with the new kernel
- Install the newly downloaded NVIDIA driver
Now, TensorFlow-gpu is usable and no extra setting is required.
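After a kernel update, it can help to check whether an NVIDIA kernel module exists for a given kernel before rebooting into it. The following is a rough sketch under stated assumptions: it assumes that the module is installed somewhere under /lib/modules/<kernel release>/ with a file name starting with `nvidia` and containing `.ko` (the exact subdirectory and compression suffix vary between systems). The function name `find_nvidia_modules` is hypothetical.

```python
import os
import platform


def find_nvidia_modules(kernel_release, modules_root="/lib/modules"):
    """Return sorted paths of nvidia*.ko* files under modules_root/kernel_release."""
    base = os.path.join(modules_root, kernel_release)
    found = []
    # os.walk yields nothing if the directory does not exist.
    for dirpath, _dirnames, filenames in os.walk(base):
        for name in filenames:
            if name.startswith("nvidia") and ".ko" in name:
                found.append(os.path.join(dirpath, name))
    return sorted(found)


if __name__ == "__main__":
    release = platform.release()  # kernel release of the running system
    modules = find_nvidia_modules(release)
    if modules:
        print("NVIDIA modules found for kernel", release)
        for path in modules:
            print("  " + path)
    else:
        print("No NVIDIA module found for kernel", release)
```

To check a kernel other than the running one, pass its release string (the name of its directory under /lib/modules) instead of `platform.release()`.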
ENJOY!