ONNX Runtime Server provides an easy way to start an inferencing server that serves predictions over gRPC endpoints.
The CLI usage is shown below:
```shell
$ ./onnxruntime_server --helpfull
onnxruntime_server: ./onnxruntime_server --model_path trained.onnx

  Flags from onnxruntime_server.cpp:
    --address (The base server address); default: "0.0.0.0";
    --grpc_port (GRPC port to listen to requests); default: 50051;
    --log_level (Logging level. Allowed options (case sensitive): info,
      warning, error, fatal); default: INFO;
    --model_path (Path to ONNX model); default: ;
    --num_threads (Number of server threads); default: 0;
```
Note: the only mandatory argument is `--model_path`.
To host an ONNX model as an inferencing server, simply run:

```shell
./onnxruntime_server --model_path /<your>/<model>/<path>
```
The Abseil C++ library is cloned as a submodule. Run the following commands after cloning this repository:

```shell
git submodule init
git submodule update
```
Download the ONNX Runtime release for your architecture.
You also need to build and install gRPC; follow the gRPC Quick start guide.
View and inspect ONNX model properties using Netron, and note the model's input names and expected tensor sizes.
onnxruntime_server implements a gRPC service whose request input names and response output names map directly to those ONNX model properties.
Generate the Makefile:

```shell
% mkdir -p build && cd build
% cmake -DCMAKE_PREFIX_PATH=/<your>/<grpc>/<path> -DONNXRuntime_ROOT_DIR=/<your>/<onnxruntime>/<path>/onnxruntime-osx-x86_64-1.16.3 ..
```
Build the sources:

```shell
% make
...
[ 55%] Built target inference_grpc_proto
[ 66%] Building CXX object CMakeFiles/onnxruntime_serving.dir/serving.cc.o
[ 77%] Linking CXX static library libonnxruntime_serving.a
[ 77%] Built target onnxruntime_serving
[ 88%] Linking CXX executable onnxruntime_server
[100%] Built target onnxruntime_server
```
- Abseil - An open-source collection of C++ code (compliant to C++11) designed to augment the C++ standard library.
To use the gRPC endpoint, the protobuf definition can be found here. You can generate a client from it and make gRPC calls to the server. To learn more about generating client code and calling the server, refer to the gRPC tutorials.
You can change `--num_threads` to optimize server utilization; the default value of 0 resolves to the number of CPU cores on the host machine.
The following Visual Studio Code extensions are highly recommended for working with this project:
- C/C++ for Visual Studio Code - Provides rich C and C++ language support, including features such as IntelliSense, debugging, and code navigation.
- CMake for Visual Studio Code - Enables convenient configuration and building of CMake projects within VS Code.
- CMake Tools - Provides additional CMake support, including capabilities for configuring, building, and testing CMake projects.
This project is licensed under the MIT License - see the LICENSE.md file for details.