Commit
Add runtime template of vLLM ROCM (#280)
* Add runtime template of vLLM ROCM
  Signed-off-by: Vaibhav Jain <[email protected]>
* Fix runtime image reference
  Signed-off-by: Vaibhav Jain <[email protected]>
* Update template description to highlight ROCm
  Signed-off-by: Vaibhav Jain <[email protected]>

---------

Signed-off-by: Vaibhav Jain <[email protected]>
1 parent fe9e14b, commit 18a5e80
Showing 4 changed files with 65 additions and 0 deletions.
@@ -0,0 +1 @@
+vllm-rocm-image=quay.io/opendatahub/vllm:fast-rocm
@@ -0,0 +1,51 @@
+apiVersion: template.openshift.io/v1
+kind: Template
+metadata:
+  labels:
+    opendatahub.io/dashboard: 'true'
+    opendatahub.io/ootb: 'true'
+  annotations:
+    description: vLLM ServingRuntime to support ROCm (for AMD GPUs)
+    openshift.io/display-name: vLLM ROCm ServingRuntime for KServe
+    openshift.io/provider-display-name: Red Hat, Inc.
+    tags: rhods,rhoai,kserve,servingruntime
+    template.openshift.io/documentation-url: https://github.com/opendatahub-io/vllm
+    template.openshift.io/long-description: This template defines resources needed to deploy vLLM ServingRuntime with KServe in Red Hat OpenShift AI
+    opendatahub.io/modelServingSupport: '["single"]'
+    opendatahub.io/apiProtocol: 'REST'
+  name: vllm-rocm-runtime-template
+objects:
+  - apiVersion: serving.kserve.io/v1alpha1
+    kind: ServingRuntime
+    metadata:
+      name: vllm-rocm-runtime
+      annotations:
+        openshift.io/display-name: vLLM ROCm ServingRuntime for KServe
+        opendatahub.io/recommended-accelerators: '["amd.com/gpu"]'
+      labels:
+        opendatahub.io/dashboard: 'true'
+    spec:
+      annotations:
+        prometheus.io/port: '8080'
+        prometheus.io/path: '/metrics'
+      multiModel: false
+      supportedModelFormats:
+        - autoSelect: true
+          name: vLLM
+      containers:
+        - name: kserve-container
+          image: $(vllm-rocm-image)
+          command:
+            - python
+            - -m
+            - vllm.entrypoints.openai.api_server
+          args:
+            - "--port=8080"
+            - "--model=/mnt/models"
+            - "--served-model-name={{.Name}}"
+          env:
+            - name: HF_HOME
+              value: /tmp/hf_home
+          ports:
+            - containerPort: 8080
+              protocol: TCP
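
For context, a minimal sketch of how a KServe InferenceService might reference the runtime defined above. It is not part of this commit. It assumes the `vllm-rocm-runtime` ServingRuntime has already been instantiated from the template in the target namespace; the service name `example-llm`, the PVC name `example-models-pvc`, and the single-GPU limit are hypothetical placeholders. The `vLLM` model format and the `amd.com/gpu` resource name come from the template itself.

```yaml
# Hypothetical InferenceService that selects the ROCm vLLM runtime.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-llm                  # hypothetical; expected to fill {{.Name}} in --served-model-name
spec:
  predictor:
    model:
      modelFormat:
        name: vLLM                   # matches supportedModelFormats in the ServingRuntime
      runtime: vllm-rocm-runtime     # ServingRuntime name from the template
      storageUri: pvc://example-models-pvc/example-llm   # hypothetical model location
      resources:
        limits:
          amd.com/gpu: "1"           # accelerator named in recommended-accelerators
```

With a setup along these lines, the model files would be expected under `/mnt/models`, the path the runtime passes to `--model`, and the OpenAI-compatible API would listen on port 8080.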