- Docker is the recommended way to deploy OpenVINO Model Server. Pre-built container images are available on Docker Hub and Red Hat Ecosystem Catalog.
- Host Model Server on bare metal.
- Deploy OpenVINO Model Server in Kubernetes via helm chart, Kubernetes Operator or OpenShift Operator.
This is a step-by-step guide on how to deploy OpenVINO™ Model Server on Linux, using a pre-built Docker container.
Before you start, make sure you have:
- Docker Engine installed
- Intel® Core™ processor (6th to 13th gen.) or Intel® Xeon® processor (1st to 4th gen.)
- Linux, macOS or Windows via WSL
- (optional) AI accelerators supported by OpenVINO. Accelerators are tested only on bare-metal Linux hosts.
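As a quick sanity check before starting, the command-line tools used in this guide can be looked up on `PATH`. This is a minimal sketch: the helper name `missing_tools` is hypothetical, and finding the `docker` executable only shows that the CLI is installed, not that the Docker daemon is running.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # docker, wget and pip3 are the commands invoked in the steps of this guide.
    missing = missing_tools(["docker", "wget", "pip3"])
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```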
This example shows how to launch the model server with a ResNet50 image classification model from cloud storage:
Pull an image from Docker Hub or from the Red Hat Ecosystem Catalog:
docker pull openvino/model_server:latest
docker pull registry.connect.redhat.com/intel/openvino-model-server:latest
wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1
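The download above places the model files in `models/resnet50/1` because Model Server expects each model directory to contain numbered version subdirectories. The sketch below checks that layout before the server is started. The function name `servable_versions` is hypothetical, and the check only looks for an `.xml` and a `.bin` file (the OpenVINO IR pair used in this example); it does not validate the files themselves.

```python
from pathlib import Path


def servable_versions(model_dir):
    """Return sorted version numbers whose directory holds both an .xml and a .bin file."""
    versions = []
    for entry in Path(model_dir).iterdir():
        # Version directories are named with an integer, for example "1".
        if entry.is_dir() and entry.name.isdigit():
            has_xml = any(entry.glob("*.xml"))
            has_bin = any(entry.glob("*.bin"))
            if has_xml and has_bin:
                versions.append(int(entry.name))
    return sorted(versions)
```

After a successful download, `servable_versions("models/resnet50")` would be expected to list version 1. An empty list suggests the files landed in the wrong directory or the download was incomplete.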
docker run -u $(id -u) -v $(pwd)/models:/models -p 9000:9000 openvino/model_server:latest \
--model_name resnet --model_path /models/resnet50 \
--layout NHWC:NCHW --port 9000
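The container may need a moment before it accepts requests, so a client script can wait for the published port first. The following is a minimal sketch using only the Python standard library; `wait_for_port` is a hypothetical helper, and the default timeout values are arbitrary. An accepted TCP connection only shows that something is listening on the port, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt is bounded by `interval` seconds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the command above, `wait_for_port("localhost", 9000)` would be called before sending the first prediction request.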
wget https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/static/images/zebra.jpeg
wget https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/common/python/classes.py
pip3 install ovmsclient
echo 'import numpy as np
from classes import imagenet_classes
from ovmsclient import make_grpc_client

# Connect to the gRPC port published by the container
client = make_grpc_client("localhost:9000")

# Send the encoded JPEG bytes; the server decodes the image
with open("zebra.jpeg", "rb") as f:
    img = f.read()

output = client.predict({"0": img}, "resnet")
result_index = np.argmax(output[0])
print(imagenet_classes[result_index])' > predict.py
python3 predict.py
zebra
If everything is set up correctly, you will see the 'zebra' prediction in the output.
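The client script prints only the single best class. To inspect the runner-up classes as well, the scores can be ranked. This is a sketch under the assumption that `scores` is a flat sequence of per-class values such as `output[0]` from the script, and that `labels` can be indexed by class number such as `imagenet_classes`; the helper name `top_k` is hypothetical.

```python
def top_k(scores, labels, k=5):
    """Return the k highest-scoring (label, score) pairs, best first."""
    # Pair each score with its class index, then sort by score in descending order.
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return [(labels[index], float(score)) for index, score in ranked[:k]]
```

In `predict.py`, this could be called as `top_k(output[0], imagenet_classes)` in place of the `np.argmax` line.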
It is possible to deploy Model Server outside of a container. To deploy Model Server on bare metal, use pre-compiled binaries for Ubuntu 20.04, Ubuntu 22.04, or RHEL 8.
::::{tab-set}
:::{tab-item} Ubuntu 20.04
:sync: ubuntu-20-04
Download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2023.1/ovms_ubuntu20.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build
# Unpack the package
tar -xzvf dist/ubuntu/ovms.tar.gz
Install required libraries:
sudo apt update -y && sudo apt install -y libpugixml1v5 libtbb2
:::
:::{tab-item} Ubuntu 22.04
:sync: ubuntu-22-04
Download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2023.1/ovms_ubuntu22.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS_TAG_UBUNTU=22.04
# Unpack the package
tar -xzvf dist/ubuntu/ovms.tar.gz
Install required libraries:
sudo apt update -y && sudo apt install -y libpugixml1v5
:::
:::{tab-item} RHEL 8.7
:sync: rhel-8-7
Download the precompiled package:
wget https://github.com/openvinotoolkit/model_server/releases/download/v2023.1/ovms_redhat.tar.gz
or build it yourself:
# Clone the model server repository
git clone https://github.com/openvinotoolkit/model_server
cd model_server
# Build docker images (the binary is one of the artifacts)
make docker_build BASE_OS=redhat
# Unpack the package
tar -xzvf dist/redhat/ovms.tar.gz
Install required libraries:
sudo dnf install -y pkg-config && sudo rpm -ivh https://vault.centos.org/centos/8/AppStream/x86_64/os/Packages/tbb-2018.2-9.el8.x86_64.rpm
:::
::::
Start the server:
wget https://storage.openvinotoolkit.org/repositories/open_model_zoo/2022.1/models_bin/2/resnet50-binary-0001/FP32-INT1/resnet50-binary-0001.{xml,bin} -P models/resnet50/1
./ovms/bin/ovms --model_name resnet --model_path models/resnet50
Alternatively, start the server as a background process or as a daemon initiated by systemctl or init.d,
depending on the Linux distribution and specific hosting requirements.
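One way to run the binary as a background process is to launch it from a small Python script. The sketch below assembles the command from the flags shown in this guide and starts it detached from the terminal. Both helper names are hypothetical, the log file path is the caller's choice, and `start_new_session` applies to POSIX systems only. A production setup would more likely use a service manager.

```python
import subprocess


def build_ovms_command(binary, model_name, model_path, port=None):
    """Build the argument list for the ovms binary from the flags used in this guide."""
    cmd = [binary, "--model_name", model_name, "--model_path", model_path]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd


def start_in_background(cmd, log_path):
    """Start cmd in its own session and append its output to log_path."""
    with open(log_path, "ab") as log_file:
        # The child keeps its own copy of the log file descriptor,
        # so closing the file here after Popen returns is safe.
        return subprocess.Popen(
            cmd,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # not tied to the launching terminal's session
        )
```

For the example above, the command would be built with `build_ovms_command("./ovms/bin/ovms", "resnet", "models/resnet50", port=9000)` and passed to `start_in_background` together with a log file path.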
Most of the Model Server documentation demonstrates container usage, but the same can be achieved with just the binary package.
Learn more about model server starting parameters.
NOTE: When serving models on AI accelerators, some additional steps may be required to install device drivers and dependencies. Learn more in the Additional Configurations for Hardware documentation.
There are three recommended methods for deploying OpenVINO Model Server in Kubernetes:
- helm chart - deploys Model Server instances using the helm package manager for Kubernetes
- Kubernetes Operator - manages Model Server using a Kubernetes Operator
- OpenShift Operator - manages Model Server instances in Red Hat OpenShift
For the Kubernetes Operator and the OpenShift Operator, see the description of the deployment process.
- Start the server
- Try the model server features
- Explore the model server demos