Support to serve vLLM on Kubernetes with LWS (vllm-project#4829)

Signed-off-by: kerthcet <[email protected]>
jeejeelee · May 16, 2024 · 8e7fb5d
1 parent 9a31a81
commit 8e7fb5d

Showing 2 changed files with 13 additions and 0 deletions.
diff --git a/docs/source/serving/deploying_with_lws.rst b/docs/source/serving/deploying_with_lws.rst
@@ -0,0 +1,12 @@
+.. _deploying_with_lws:
+
+Deploying with LWS
+============================
+
+LeaderWorkerSet (LWS) is a Kubernetes API that aims to address common deployment patterns of AI/ML inference workloads.
+A major use case is multi-host/multi-node distributed inference.
+
+vLLM can be deployed with `LWS <https://github.com/kubernetes-sigs/lws>`_ on Kubernetes for distributed model serving.
+
+Please see `this guide <https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm>`_ for more details on
+deploying vLLM on Kubernetes using LWS.
diff --git a/docs/source/serving/integrations.rst b/docs/source/serving/integrations.rst
@@ -8,4 +8,5 @@ Integrations
    deploying_with_kserve
    deploying_with_triton
    deploying_with_bentoml
+   deploying_with_lws
    serving_with_langchain
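
Note on the new page: it describes LWS as a Kubernetes API for multi-host inference but shows no manifest. As a rough illustration only, the sketch below builds the skeleton of a LeaderWorkerSet object in Python and prints it as JSON (which kubectl also accepts). The `apiVersion`, `kind`, and the `replicas` / `leaderWorkerTemplate.size` / `leaderTemplate` / `workerTemplate` fields follow the LWS API as I understand it; the helper name `build_lws_manifest`, the resource name, the `role` label, and the image tag are assumptions made for illustration, and container commands, GPU resources, and ports are deliberately left out. The linked LWS guide remains the authoritative example.

```python
import json


def build_lws_manifest(name, image, replicas, group_size):
    """Return a minimal LeaderWorkerSet manifest as a plain dict.

    Hypothetical helper for illustration; not part of vLLM or LWS.
    """

    def pod_template(role):
        # Bare pod template: a real deployment would also set the
        # container command, GPU resource limits, and ports.
        return {
            "metadata": {"labels": {"role": role}},
            "spec": {
                "containers": [
                    {"name": f"vllm-{role}", "image": image}
                ]
            },
        }

    return {
        "apiVersion": "leaderworkerset.x-k8s.io/v1",
        "kind": "LeaderWorkerSet",
        "metadata": {"name": name},
        "spec": {
            # Number of leader/worker groups to run.
            "replicas": replicas,
            "leaderWorkerTemplate": {
                # Pods per group, counting the leader.
                "size": group_size,
                "leaderTemplate": pod_template("leader"),
                "workerTemplate": pod_template("worker"),
            },
        },
    }


if __name__ == "__main__":
    # Assumed values: two groups of two pods each.
    manifest = build_lws_manifest(
        "vllm", "vllm/vllm-openai:latest", replicas=2, group_size=2
    )
    print(json.dumps(manifest, indent=2))
```

Here `replicas` counts groups while `size` counts pods within one group, so the assumed values above would request four pods in total.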