
TensorFlow

agthomas-ucsb edited this page Aug 17, 2021 · 8 revisions

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and lets developers easily build and deploy ML-powered applications. For our purposes, it is very useful for transient detection. Using TensorFlow effectively requires GPU support (otherwise we work the CPU far too hard and it will be far too slow). To ensure you have a basic understanding of TensorFlow, experiment with the following commands.

Docker

To run TensorFlow with GPU support we use Docker, which is the preferred (and essentially the only well-supported) way to do this on Linux. While the setup is fairly involved, you can verify that Docker is installed and working quite easily using its hello-world image:

$ sudo docker run hello-world

Note that Docker requires sudo permission by default, which we have not adjusted since the use case is currently quite limited. If you do not have permission to use sudo, you will not be able to use TensorFlow under the current configuration.
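If the sudo requirement ever becomes a bottleneck, Docker's standard workaround is to add your user to the docker group. This is a sketch of the usual procedure, not part of our current setup, and it changes the host configuration:

```shell
# Optional: allow running docker without sudo by joining the docker group.
# Requires one-time sudo access; the change takes effect after re-login.
sudo usermod -aG docker $USER
newgrp docker          # or log out and back in
docker run hello-world # should now work without sudo
```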

NVIDIA Docker Support

To utilize the GPU, we are spoiled by NVIDIA and only require one tool, which is documented here: https://github.com/NVIDIA/nvidia-docker. To verify that NVIDIA Docker support is working, run the following:

sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

This should return

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.06    Driver Version: 450.51.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

TensorFlow

Finally, if Docker and NVIDIA Docker Support are configured correctly, we are ready to run TensorFlow. Start by running:

sudo docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu \
  python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

which should print a tensor such as tf.Tensor(1399.2882, shape=(), dtype=float32) (the exact value varies, since the input is random). Now that you have verified TensorFlow works with GPU support, make sure you understand what the above command does, then try your own environment using the command below, which adds Jupyter support as well.
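Beyond the random-sum check above, it can help to confirm that TensorFlow actually sees the GPU. A minimal check using the same image (the exact device list depends on your hardware):

```shell
# Lists the physical GPUs visible to TensorFlow inside the container,
# e.g. [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
sudo docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu \
  python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

An empty list [] means the container runs but TensorFlow cannot reach the GPU, which usually points back to the NVIDIA Docker setup.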

Start a TensorFlow environment with:

sudo docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu-jupyter
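To actually reach the notebook server from the host, you will also need to publish Jupyter's port, and optionally bind-mount a working directory so notebooks survive the container. A sketch (the host path is an assumption; adjust to your setup):

```shell
# Publish Jupyter's default port and mount the current directory
# at /tf/notebooks, where the Jupyter image looks for notebooks.
sudo docker run --gpus all -it --rm -p 8888:8888 \
  -v "$PWD":/tf/notebooks tensorflow/tensorflow:latest-gpu-jupyter
```

Then open the tokenized http://127.0.0.1:8888/ URL that Jupyter prints on startup.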

For more options and ways to run TensorFlow via Docker see:

docker run --help
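For real work you will typically also want to mount your own code into the container. A common pattern, along the lines of the TensorFlow Docker documentation (the script name here is a placeholder):

```shell
# Run a local script inside the GPU container, mounting the current
# directory at /tmp and using it as the working directory.
sudo docker run --gpus all -it --rm -v "$PWD":/tmp -w /tmp \
  tensorflow/tensorflow:latest-gpu python ./my_script.py
```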

For more information, please see the documentation below.

https://www.tensorflow.org/install/gpu

https://www.tensorflow.org/install/docker

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

Update: TensorFlow is currently configured for Alex only, since he installed the necessary software. A globally installed and accessible version will come soon.