Skip to content

tensorflow/profiler

Repository files navigation

Tensorboard Profiler Plugin

The profiler includes a suite of tools for JAX, TensorFlow, and PyTorch/XLA. These tools help you understand, debug and optimize programs to run on CPUs, GPUs and TPUs.

The profiler plugin offers a number of tools to analyse and visualize the performance of your model across multiple devices. Some of the tools include:

  • Overview: A high-level overview of the performance of your model. This is an aggregated overview for your host and all devices. It includes:
    • Performance summary and breakdown of step times.
    • A graph of individual step times.
    • A table of the top 10 most expensive operations.
  • Trace Viewer: Displays a timeline of the execution of your model that shows:
    • The duration of each op.
    • Which part of the system (host or device) executed an op.
    • The communication between devices.
  • Memory Profile Viewer: Monitors the memory usage of your model.
  • Graph Viewer: A visualization of the graph structure of HLOs of your model.

Demo

First time user? Come and check out this Colab Demo.

Prerequisites

  • TensorFlow >= 2.18.0
  • TensorBoard >= 2.18.0
  • tensorboard-plugin-profile >= 2.18.0

Note: The Tensorboard Profiler Plugin requires access to the Internet to load the Google Chart library. Some charts and tables may be missing if you run TensorBoard entirely offline on your local machine, behind a corporate firewall, or in a datacenter.

To profile on a single GPU system, the following NVIDIA software must be installed on your system:

  1. NVIDIA GPU drivers and CUDA Toolkit:

    • CUDA 12.5 requires 525.60.13 and higher.
  2. Ensure that CUPTI 10.1 exists on the path.

    $ /sbin/ldconfig -N -v $(sed 's/:/ /g' <<< $LD_LIBRARY_PATH) | grep libcupti

    If you don't see libcupti.so.12.5 on the path, prepend its installation directory to the $LD_LIBRARY_PATH environmental variable:

    $ export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH

    Run the ldconfig command above again to verify that the CUPTI 12.5 library is found.

    If this doesn't work, try:

    $ sudo apt-get install libcupti-dev

To profile a system with multiple GPUs, see this guide for details.

To profile multi-worker GPU configurations, profile individual workers independently.

To profile cloud TPUs, you must have access to Google Cloud TPUs.

Quick Start

The profiler plugin follows the TensorFlow versioning scheme. As a result, the tensorboard-plugin-profile PyPI package can be behind the tbp-nightly PyPI package. In order to get the latest version of the profiler plugin, you can install the nightly package..

To install the nightly version of profiler:

$ pip uninstall tensorboard-plugin-profile
$ pip install tbp-nightly

Run TensorBoard:

$ tensorboard --logdir=profiler/demo

If you are behind a corporate firewall, you may need to include the --bind_all tensorboard flag.

Go to localhost:6006/#profile of your browser, you should now see the demo overview page show up. Overview Page Congratulations! You're now ready to capture a profile.

Next Steps