LLM Operator builds a software stack that provides LLM as a service. More specifically, it provides an OpenAI-compatible API anywhere, covering the following functionality (a minimal client example follows the list):
- LLM fine-tuning job management
- LLM inference
- Fine-tuned model management
- Training/validation file management
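Because the API is OpenAI-compatible, existing OpenAI clients work against an LLM Operator endpoint without modification. The following is a minimal sketch using the official `openai` Python client; the base URL, API key, and model name are placeholders and depend on your deployment.

```python
from openai import OpenAI

# Point the standard OpenAI client at an LLM Operator endpoint.
# The base URL, API key, and model name are illustrative placeholders;
# use the values from your own deployment.
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="<your-api-key>",
)

completion = client.chat.completions.create(
    model="google-gemma-2b-it",  # any model deployed in your cluster
    messages=[{"role": "user", "content": "What is LoRA fine-tuning?"}],
)
print(completion.choices[0].message.content)
```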
A typical end-to-end workflow looks like this (see the API sketch below):
- A user uploads a dataset to File Manager.
- The user creates a fine-tuning job in Job Manager. Job Manager generates a LoRA adapter with the uploaded dataset and stores the LoRA adapter in Model Registry.
- Inference Manager is notified and imports the new model.
- The user runs a chatbot using the fine-tuned model.
Please see the demo video.
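The workflow above maps to the standard OpenAI file and fine-tuning APIs. Below is a hedged sketch of the same steps with the `openai` Python client; the endpoint, base model name, and dataset path are assumptions, and in practice you would poll the job until it succeeds before querying the fine-tuned model.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="<your-api-key>")

# 1. Upload a training dataset to File Manager.
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Create a fine-tuning job in Job Manager. It produces a LoRA adapter
#    and stores it in Model Registry.
job = client.fine_tuning.jobs.create(
    model="google-gemma-2b-it",  # example base model
    training_file=training_file.id,
)

# 3. After the job succeeds, Inference Manager imports the new model and it
#    can be queried like any other model (poll the job status in practice).
job = client.fine_tuning.jobs.retrieve(job.id)
response = client.chat.completions.create(
    model=job.fine_tuned_model,
    messages=[{"role": "user", "content": "Hello from the fine-tuned model!"}],
)
print(response.choices[0].message.content)
```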
LLM Operator targets the following deployment scenarios and goals:
- Run LLM in an on-prem datacenter
- Run LLM at the edge
- Run LLM across multiple cloud providers
- Satisfy the SLOs of both fine-tuning jobs and inference on a limited number of GPUs (e.g., run a large fine-tuning job at midnight when no one is using inference; see the scheduling sketch below)
- Support heterogeneous GPUs (from A100 to B100)
- Support heterogeneous models (from small models to large models)
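To make the GPU-sharing goal concrete, here is an illustrative sketch (not LLM Operator's actual scheduler) of the kind of policy this implies: inference keeps priority, and fine-tuning jobs only get the GPUs that inference is not currently using, so a large job can take most of the cluster at midnight and almost nothing during the day.

```python
from dataclasses import dataclass

@dataclass
class ClusterState:
    total_gpus: int
    gpus_used_by_inference: int  # e.g., derived from current inference load

def gpus_for_finetuning(state: ClusterState, inference_reserve: int = 1) -> int:
    """Toy policy: keep a small reserve for inference traffic and hand the
    remaining idle GPUs to fine-tuning jobs."""
    idle = state.total_gpus - state.gpus_used_by_inference
    return max(0, idle - inference_reserve)

# Midnight: inference is nearly idle, so a large fine-tuning job can run.
print(gpus_for_finetuning(ClusterState(total_gpus=8, gpus_used_by_inference=1)))  # 6

# Daytime: inference occupies most GPUs, so fine-tuning is throttled.
print(gpus_for_finetuning(ClusterState(total_gpus=8, gpus_used_by_inference=7)))  # 0
```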