switch to using upstream vllm with new metric #54

Merged

k8s-ci-robot merged 2 commits into kubernetes-sigs:main from coolkp:main

Dec 12, 2024

Contributor

coolkp commented Nov 22, 2024

- Switch to using upstream vLLM with the new metric.
- Use the latest series in the gauge metric vllm:lora_requests_info, chosen by the time value of each series.
- Use the label pair running_lora_adapters.
- Update the vLLM example deployment to use latest.
- Fixes #22 (Switch to upstream vLLM).
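
The selection rule described above (pick the most recent series of the gauge, then read its adapter label) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `series` type is a hypothetical stand-in for a Prometheus sample, the adapter names are invented, and it assumes the gauge value is a timestamp and that the `running_lora_adapters` label holds a comma-separated list.

```go
package main

import (
	"fmt"
	"strings"
)

// series is a simplified, hypothetical stand-in for one Prometheus sample:
// a set of label pairs plus a gauge value.
type series struct {
	labels map[string]string
	value  float64 // assumed to be a timestamp for vllm:lora_requests_info
}

// Label name taken from the PR description.
const runningAdaptersLabel = "running_lora_adapters"

// latestSeries returns the series with the largest gauge value, which under
// the timestamp assumption is the most recently emitted one.
// ok is false when the input is empty.
func latestSeries(all []series) (latest series, ok bool) {
	if len(all) == 0 {
		return series{}, false
	}
	latest = all[0]
	for _, s := range all[1:] {
		if s.value > latest.value {
			latest = s
		}
	}
	return latest, true
}

// runningAdapters splits the label value into adapter names, skipping empty
// entries. The comma-separated format is an assumption of this sketch.
func runningAdapters(s series) []string {
	var names []string
	for _, n := range strings.Split(s.labels[runningAdaptersLabel], ",") {
		if n = strings.TrimSpace(n); n != "" {
			names = append(names, n)
		}
	}
	return names
}

func main() {
	// Adapter names and timestamps below are made up for illustration.
	all := []series{
		{labels: map[string]string{runningAdaptersLabel: "sql-lora"}, value: 1733943000},
		{labels: map[string]string{runningAdaptersLabel: "sql-lora,tweet-summary"}, value: 1733943060},
		{labels: map[string]string{runningAdaptersLabel: ""}, value: 1733942900},
	}
	if s, ok := latestSeries(all); ok {
		fmt.Println(runningAdapters(s))
	}
}
```

Comparing gauge values rather than relying on scrape order keeps the choice deterministic when several series for the same metric appear in one scrape.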

k8s-ci-robot added the do-not-merge/invalid-commit-message label

k8s-ci-robot requested review from kfswain and robscott

November 22, 2024 23:00

k8s-ci-robot added cncf-cla: yes size/L labels

coolkp changed the title ~~switch to using upstream vllm with new metric Fixes #22~~ switch to using upstream vllm with new metric

k8s-ci-robot removed the do-not-merge/invalid-commit-message label

Contributor Author

coolkp commented Nov 22, 2024

/assign conliu

Contributor

k8s-ci-robot commented Nov 22, 2024

@coolkp: GitHub didn't allow me to assign the following users: conliu.

Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign conliu

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

liu-cong reviewed

Contributor

liu-cong left a comment

Overall lgtm. Will lgtm once PR is updated with benchmark numbers.

examples/poc/manifests/vllm/vllm-lora-deployment.yaml (resolved)

pkg/ext-proc/backend/vllm/metrics.go Outdated

               	klog "k8s.io/klog/v2"
               )
               const (
-              	ActiveLoRAAdaptersMetricName = "vllm:info_active_adapters_info"
+              	LoraRequestInfoMetricName                = "vllm:lora_requests_info"
+              	LoraRequestInfoMetricNameRunningAdapters = "running_adapters"

Contributor

liu-cong Nov 25, 2024

LoraRequestInfoMetricNameRunningAdapters -> LoraRequestInfoRunningAdaptersMetricName

pkg/ext-proc/backend/vllm/metrics.go (outdated, resolved)

pkg/ext-proc/backend/vllm/metrics.go (outdated, resolved)

pkg/ext-proc/backend/vllm/metrics.go (resolved)

Contributor

liu-cong commented Dec 2, 2024

/assign

k8s-ci-robot assigned liu-cong

k8s-ci-robot added the needs-rebase label

liu-cong reviewed

pkg/ext-proc/backend/vllm/metrics.go (outdated, resolved)


          Get running adapters from latest series in new metric

07c9b9c

Signed-off-by: Kunjan Patel <[email protected]>

coolkp force-pushed the main branch from a0f7627 to dc662fa Compare

December 11, 2024 18:25

k8s-ci-robot removed the needs-rebase label

coolkp force-pushed the main branch from dc662fa to 266ac7e Compare

December 11, 2024 18:48

liu-cong reviewed

examples/poc/manifests/vllm/vllm-lora-deployment.yaml (resolved)

pkg/ext-proc/backend/vllm/metrics.go (outdated, resolved)

pkg/ext-proc/backend/vllm/metrics.go (outdated, resolved)

pkg/ext-proc/backend/vllm/metrics.go (outdated, resolved)

pkg/ext-proc/backend/vllm/metrics.go (resolved)


          Get running adapters from latest series in new metric, add table driven test function, delete old metrics

3f5c7bb

Signed-off-by: Kunjan Patel <[email protected]>
Signed-off-by: Kunjan <[email protected]>

coolkp force-pushed the main branch from 266ac7e to 3f5c7bb Compare

December 11, 2024 20:04

Contributor

liu-cong commented Dec 11, 2024

/lgtm

k8s-ci-robot added the lgtm label

ahg-g reviewed

pkg/ext-proc/backend/vllm/metrics.go

               	dto "github.com/prometheus/client_model/go"
               	"github.com/prometheus/common/expfmt"
               	"go.uber.org/multierr"
+              	"inference.networking.x-k8s.io/llm-instance-gateway/pkg/ext-proc/backend"

Contributor

ahg-g Dec 12, 2024

Let's keep our imports in a separate section.

pkg/ext-proc/backend/vllm/metrics.go

		@@ -90,35 +95,55 @@ func promToPodMetrics(metricFamilies map[string]dto.MetricFamily, existing bac
		}
		*/

Contributor

ahg-g Dec 12, 2024

@liu-cong Do we have an algorithm proposal that uses this metric? If not, can we clean up this code?

Contributor

ahg-g commented Dec 12, 2024

/approve

Contributor

k8s-ci-robot commented Dec 12, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, coolkp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [ahg-g]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the approved label

k8s-ci-robot merged commit 918960c into kubernetes-sigs:main

2 checks passed

danehans mentioned this pull request

Tagged Release for vLLM Image #119

Open

kfswain mentioned this pull request

No Active LoRA Adapters When Testing POC Example #109

Closed

Labels

approved cncf-cla: yes lgtm size/L