This sample demonstrates the performance of the BERT operator optimization. It downloads a fresh `bert-large-uncased` model from HuggingFace and converts it to SavedModel format.
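For reference, a minimal sketch of what the download-and-convert step could look like, assuming the `transformers` and `tensorflow` Python packages (the output directory name `bert_large_saved_model` is an assumption, not necessarily the sample's actual layout):

```python
# Minimal sketch: fetch bert-large-uncased and export it as a SavedModel.
# This is an illustration of the general approach; the sample's
# prepare_model.sh may do this differently.
import tensorflow as tf
from transformers import TFBertModel

model = TFBertModel.from_pretrained("bert-large-uncased")

# Export with an input signature matching the benchmark's fixed
# sequence length of 128 (batch dimension left dynamic).
@tf.function(input_signature=[
    tf.TensorSpec(shape=(None, 128), dtype=tf.int32, name="input_ids"),
    tf.TensorSpec(shape=(None, 128), dtype=tf.int32, name="attention_mask"),
])
def serve(input_ids, attention_mask):
    outputs = model(input_ids, attention_mask=attention_mask)
    return {"last_hidden_state": outputs.last_hidden_state}

# "bert_large_saved_model" is an assumed output path.
tf.saved_model.save(model, "bert_large_saved_model", signatures=serve)
```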
A short benchmark script is then run. First, it generates an optimized `saved_model.pb` that uses the monolithic BERT operator; this may take a few minutes. Next, the model is tested on dummy data, first using the original model graph and then using the optimized graph. The benchmark uses a fixed sequence length of 128, though the batch size and the number of warmup/benchmark iterations can be configured.
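The warmup/benchmark timing scheme typically looks like the sketch below. This is a hypothetical illustration, not the sample's actual script; the SavedModel path and variable names are assumptions:

```python
# Hypothetical sketch of the warmup/benchmark split; warmup iterations
# are excluded from the measurement.
import time
import numpy as np
import tensorflow as tf

BATCH_SIZE, SEQ_LEN = 16, 128  # sequence length is fixed at 128
WARMUP_ITERATIONS, BENCHMARK_ITERATIONS = 10, 50

loaded = tf.saved_model.load("bert_large_saved_model")  # assumed path
infer = loaded.signatures["serving_default"]

# Dummy inputs: random token ids (BERT vocab size 30522) and an
# all-ones attention mask.
input_ids = tf.constant(
    np.random.randint(0, 30522, size=(BATCH_SIZE, SEQ_LEN)), dtype=tf.int32)
attention_mask = tf.ones((BATCH_SIZE, SEQ_LEN), dtype=tf.int32)

for _ in range(WARMUP_ITERATIONS):
    infer(input_ids=input_ids, attention_mask=attention_mask)

start = time.perf_counter()
for _ in range(BENCHMARK_ITERATIONS):
    infer(input_ids=input_ids, attention_mask=attention_mask)
elapsed = time.perf_counter() - start
print(f"Average latency: {elapsed / BENCHMARK_ITERATIONS * 1000:.2f} ms")
```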
The next section provides instructions on how to run the sample in a Docker container. The third section provides instructions on how to run the sample on a clean Ubuntu 20.04 environment (e.g. an AWS instance), though this path has not been tested; the steps essentially mirror the way the scripts are executed in the Dockerfile.
- Build the Docker image by running the utility script in the `samples/tensorflow_performance` directory:

  ```
  ./build_image -t <image-tag> <any other docker build args here>
  ```

  NOTE: It may be necessary to pass proxy settings to `docker build`. The Dockerfile accepts build arguments for this purpose, for example:

  ```
  ./build_image -t <image-tag> --build-arg http_proxy=<http_proxy> --build-arg https_proxy=<https_proxy>
  ```
- Run the benchmark inside a container:

  ```
  docker run <image-tag> BATCH_SIZE WARMUP_ITERATIONS BENCHMARK_ITERATIONS
  ```

  For example:

  ```
  docker run <image-tag> 16 10 50
  ```
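Putting the two steps together, with a hypothetical image tag `bert-perf:latest` (any tag name works):

```
./build_image -t bert-perf:latest
docker run bert-perf:latest 16 10 50
```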
To run the sample without Docker, perform the following steps from the `tensorflow_performance` directory:
- Set up proxy variables if necessary, for example:

  ```
  export http_proxy=<http_proxy>
  export https_proxy=<https_proxy>
  export no_proxy=<no_proxy>
  echo "Acquire::http::proxy \"${http_proxy}\";" >> /etc/apt/apt.conf
  ```

- Install dependencies:

  ```
  ./install_dependencies.sh
  ```

- Compile the project:

  ```
  ./compile.sh
  ```

- Prepare the model:

  ```
  ./prepare_model.sh
  ```

- Run the benchmark:

  ```
  ./run_benchmark BATCH_SIZE WARMUP_ITERATIONS BENCHMARK_ITERATIONS
  ```

  For example:

  ```
  ./run_benchmark 16 10 50
  ```