# Compute Embeddings from ML models

This repository sets up an NVIDIA Triton Inference Server, which simplifies the deployment of AI models at scale in production. Triton natively supports multiple framework backends, including TensorFlow, PyTorch, ONNX Runtime, Python, and custom backends.
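
As a quick illustration, the sketch below queries a model deployed on Triton over its HTTP API and retrieves the resulting embeddings. The server URL, model name (`embedding_model`), and tensor names (`INPUT_IDS`, `EMBEDDINGS`) are hypothetical placeholders; the actual values depend on the model's `config.pbtxt` in your model repository.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server exposed on the default HTTP port.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Example input: a batch of token IDs (placeholder values).
# Tensor name, shape, and dtype must match the model's config.pbtxt.
token_ids = np.array([[101, 2023, 2003, 1037, 3231, 102]], dtype=np.int64)
infer_input = httpclient.InferInput("INPUT_IDS", token_ids.shape, "INT64")
infer_input.set_data_from_numpy(token_ids)

# Ask for the embedding tensor by its (hypothetical) output name.
requested_output = httpclient.InferRequestedOutput("EMBEDDINGS")

response = client.infer(
    model_name="embedding_model",  # hypothetical model name
    inputs=[infer_input],
    outputs=[requested_output],
)

embeddings = response.as_numpy("EMBEDDINGS")
print(embeddings.shape)
```

Because Triton exposes every backend through the same HTTP/gRPC interface, the same client code works whether the embedding model runs on the TensorFlow, PyTorch, ONNX Runtime, or Python backend.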