Kubernetes LLM Instance Gateway

The LLM Instance Gateway came out of wg-serving and is sponsored by SIG Apps. This repo contains the load-balancing algorithm, ext-proc code, CRDs, and controllers that support the LLM Instance Gateway.

This Gateway is intended to provide value to multiplexed LLM services on a shared pool of compute. See the proposal for more info.

Status

This project is currently in development.

For quicker testing, see our PoC in the ./examples/ directory.

Getting Started

Install the CRDs into the cluster:

make install

Delete the APIs (CRDs) from the cluster:

make uninstall
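To check the result of either command, a small helper along these lines may be useful. It is a sketch, not part of this repo: `list_crds` is a hypothetical function name, and it assumes `kubectl` is installed and that its current context points at the same cluster the `make` targets act on.

```shell
# Sketch only: assumes kubectl is installed and its current context
# points at the cluster that the make targets act on.
list_crds() {
  # Print the active context so it is clear which cluster is being inspected.
  kubectl config current-context || return 1
  # List all CRDs, oldest first; newly installed ones appear at the bottom.
  kubectl get crds --sort-by=.metadata.creationTimestamp
}
```

For example, `make install && list_crds` should show the newly installed CRDs at the end of the list, and running `list_crds` after `make uninstall` should no longer show them.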

Deploying the ext-proc image

Refer to this README for how to deploy the ext-proc image used to support the Instance Gateway.

Contributing

Our community meeting is held weekly on Thursdays at 10 AM PDT; Zoom link here.

We currently use the #wg-serving Slack channel for communication.

Contributions are welcome. Thanks for joining us!

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.