BERT model optimization is an open-source optimization of the BERT language processing model.
The optimization is based on the approach described in Bfloat16 Optimization Boosts Alibaba Cloud BERT Model Performance.
Furthermore, it utilizes the Intel® oneAPI Deep Neural Network Library (oneDNN) to obtain additional performance gains.
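The bfloat16 format behind the optimization keeps float32's sign bit and 8-bit exponent but truncates the mantissa to 7 bits, trading precision for throughput. A minimal pure-Python illustration of that truncation (not part of this project; hardware typically rounds to nearest rather than truncating):

```python
import struct

def float_to_bfloat16_bits(x: float) -> int:
    # Reinterpret the float32 as a uint32, then keep the top 16 bits
    # (sign + 8-bit exponent + 7-bit mantissa) -- that is bfloat16.
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    return bits >> 16

def bfloat16_to_float(bits16: int) -> float:
    # Expand back to float32 by zero-filling the low 16 mantissa bits.
    return struct.unpack('<f', struct.pack('<I', bits16 << 16))[0]

x = 3.141592653589793
bf = bfloat16_to_float(float_to_bfloat16_bits(x))
# bf == 3.140625: same dynamic range as float32, ~3 decimal digits of precision
```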
BERT model optimization is split into two parts: a model modifier, which modifies the model to use a custom operator, and the custom operator itself, which utilizes oneDNN.
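Conceptually, the model modifier performs a graph rewrite: it locates the stock encoder subgraph and substitutes a single fused, oneDNN-backed operator. A toy sketch of that idea (pure Python; the function and op names are hypothetical, not the project's actual API):

```python
def replace_encoder_ops(graph_ops, encoder_pattern, fused_op="BertOp"):
    """Toy graph rewrite: collapse each run of encoder ops into one fused op."""
    out = []
    i = 0
    while i < len(graph_ops):
        if graph_ops[i:i + len(encoder_pattern)] == encoder_pattern:
            out.append(fused_op)              # one fused operator replaces the subgraph
            i += len(encoder_pattern)
        else:
            out.append(graph_ops[i])          # unrelated ops pass through unchanged
            i += 1
    return out

ops = ["Embed", "MatMul", "Softmax", "MatMul", "LayerNorm", "Pool"]
rewritten = replace_encoder_ops(ops, ["MatMul", "Softmax", "MatMul", "LayerNorm"])
# rewritten == ["Embed", "BertOp", "Pool"]
```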
Currently, HuggingFace BERT models (PyTorch and TensorFlow backends) and models built using TensorFlow 1.x and 2.x are supported.
We provide ways to modify several TensorFlow models from TF Hub, google-research/bert, and Hugging Face.
If you wish to modify your own custom TensorFlow model, we provide a step-by-step guide. Please check our README page.
- Requirements for building from source
- Building from source
- Getting started
- Samples
- License
- Features and known issues
- Support
BERT model optimization supports systems meeting the following requirements:
- oneDNN Library 3.1 or later
- C++ compiler
- CMake
- Linux based operating system
Clone and build:
```shell
git clone https://github.com/intel/light-model-transformer
cd light-model-transformer/BERT
mkdir build
cd build
source /opt/intel/oneapi/setvars.sh  # Make sure CMake can find oneDNN
cmake .. -DBACKENDS="TF;PT"          # Use TF (TensorFlow), PT (PyTorch) or both, based on which frameworks you wish to use.
cmake --build . -j 8
```
Run benchmark:
```shell
tests/benchmark/benchmark
```
For the currently supported use cases, short tutorials on usage are provided. All of them require the BERT Operator (BertOp) to be built from source; refer to Building from source.
- TensorFlow 1.x (tested on TF v1.15)
- TensorFlow 2.x (tested on TF v2.5, v2.9, v2.12)
- TensorFlow 2.x without using the model_modifier module (only HuggingFace models are currently supported)
- PyTorch (only HuggingFace models are currently supported)
- Model Zoo for Intel® Architecture
There are also sample scripts which demonstrate BertOp integration capabilities.
BERT model optimization is licensed under Apache License Version 2.0. Refer to the "LICENSE" file for the full license text and copyright notice.
This distribution includes third party software governed by separate license terms.
Apache License Version 2.0:
- Google AI BERT
- Tensorflow tutorials
- Intel® oneAPI Deep Neural Network Library (oneDNN)
- Model Zoo for Intel® Architecture
See the ChangeLog.
Please submit your questions, feature requests, and bug reports on the GitHub issues page.