Accelerate Inference of Sparse Transformer Models with OpenVINO™ and 4th Gen Intel® Xeon® Scalable Processors
This tutorial demonstrates how to improve the performance of sparse Transformer models with OpenVINO on 4th Gen Intel® Xeon® Scalable processors. It uses a pre-trained model from the Hugging Face Transformers library, shows how to convert it to the OpenVINO™ IR format, and runs inference on a CPU with a dedicated runtime option that enables sparsity optimizations. It also demonstrates how to get additional performance by stacking sparsity with 8-bit quantization. To simplify the user experience, the Hugging Face Optimum library is used to convert the model to the OpenVINO™ IR format and to quantize it with the Neural Network Compression Framework (NNCF).
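As a minimal sketch of this workflow (the model ID, sparsity ratio, and exact configuration key below are illustrative assumptions and may differ between OpenVINO and Optimum releases), the Optimum Intel API can export a Transformers model to OpenVINO™ IR on the fly and pass the CPU sparsity hint through `ov_config`:

```python
# A minimal sketch, assuming the optimum-intel and transformers packages are installed.
# The model ID and the sparsity threshold below are illustrative, not part of the tutorial.
from optimum.intel import OVModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "bert-base-uncased"  # hypothetical placeholder; use your sparse, quantized checkpoint

# export=True converts the Hugging Face model to the OpenVINO IR format at load time.
# CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE asks the CPU plugin to use sparse kernels for
# layers whose weight sparsity is at or above the given ratio (4th Gen Xeon processors).
ov_config = {"CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE": "0.8"}

model = OVModelForSequenceClassification.from_pretrained(
    model_id, export=True, ov_config=ov_config
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("OpenVINO makes sparse Transformer inference fast."))
```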
NOTE: This tutorial requires OpenVINO 2022.3 or newer and a 4th Gen Intel® Xeon® Scalable processor, which can be accessed on Amazon Web Services (AWS).
The tutorial consists of the following steps:
- Download and quantize a sparse public BERT model, using the OpenVINO integration with Hugging Face Optimum.
- Compare sparse 8-bit vs. dense 8-bit inference performance.
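A rough way to approximate the performance comparison in the last step is a simple timing loop over an already-exported 8-bit IR, run once with the default CPU configuration (dense 8-bit) and once with the sparsity option enabled (sparse 8-bit). The sketch below is illustrative only: the model path, input shapes, and iteration count are assumptions, and the tutorial itself may use different measurement tooling.

```python
# A rough latency comparison sketch, assuming an 8-bit quantized BERT IR exported by
# Optimum has been saved to "quantized_model/openvino_model.xml" (hypothetical path).
import time
import numpy as np
import openvino.runtime as ov

core = ov.Core()
model_path = "quantized_model/openvino_model.xml"  # hypothetical path

def measure_latency_ms(config, iterations=200):
    compiled = core.compile_model(model_path, "CPU", config)
    # BERT-like inputs (input_ids, attention_mask, token_type_ids) of shape [1, 128].
    inputs = {inp.get_any_name(): np.zeros((1, 128), dtype=np.int64)
              for inp in compiled.inputs}
    request = compiled.create_infer_request()
    request.infer(inputs)  # warm-up
    start = time.perf_counter()
    for _ in range(iterations):
        request.infer(inputs)
    return (time.perf_counter() - start) / iterations * 1000

dense_ms = measure_latency_ms({})  # dense 8-bit: default CPU configuration
sparse_ms = measure_latency_ms({"CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE": "0.8"})  # sparse 8-bit
print(f"dense 8-bit: {dense_ms:.2f} ms, sparse 8-bit: {sparse_ms:.2f} ms")
```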
If you have not installed all required dependencies, follow the Installation Guide.