-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TGIS metrics #18
TGIS metrics #18
Conversation
Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
The metrics look pretty good, the only things that I can't trigger atm are:
Output from
|
Signed-off-by: Joe Runde <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Belated review, this looks great thanks @joerunde!
Later we can perhaps make the TGIS metrics configurable, i.e. to be able to toggle back to regular vLLM metrics.
from vllm.entrypoints.grpc.pb.generation_pb2 import (GenerationResponse, | ||
Parameters, StopReason) | ||
|
||
|
||
def log_response(inputs: List[str], params: Parameters, prefix_id: str, | ||
response: GenerationResponse, times, kind_log: str, | ||
method_str: str, logger: logging.Logger): | ||
response: GenerationResponse, engine_response: RequestOutput, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor query.. any reason for passing engine_response
rather than engine_response.metrics
here?
vllm_stat_logger=vllm_stat_logger, | ||
max_sequence_len=self.config.max_model_len) | ||
# 🌶️🌶️🌶️ sneaky sneak | ||
self.engine.engine.stat_logger = tgis_stats_logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🌶️
@njhill this PR shouldn't disable the regular vllm metrics, it'll output both! |
This PR implements a subset of the metrics from the TGIS image. I tried to make sure that everything from our current ops dashboard is supported. These are:
I colocated all the tgis metrics code in
tgis_utils/metrics.py
to make it easy to find and change. The metrics are reported eithergrpc_server.py
for data that's easily available at the grpc server level and not currently covered by the vllm StatLogger, orThe token length metrics depend on the open PR here: vllm-project/vllm#2764. (But the rest of the metrics will function without those changes)
They could have been implemented right in the grpc server, but I wanted to keep this metrics reporting aligned with those upcoming changes.