
Token Count Tracking in VLLM with lmformatenforcer #148

Open
wrench1997 opened this issue Nov 7, 2024 · 1 comment

wrench1997 commented Nov 7, 2024

I am using VLLM with lmformatenforcer and would like to request a feature to track token counts directly within the output of the generate function. This feature would allow for easier monitoring of token usage and would help in managing generation costs without needing to manually encode and decode text.

Proposed Solution

Could you add a token count field to the output of the generate function? For example, the output could include the number of tokens processed for both input and generated output text.

This addition would streamline token management and provide useful data for cost tracking.

Thank you for considering this request, and I look forward to any guidance or potential updates on this feature!

@wrench1997 wrench1997 changed the title Feature Request: Token Count Tracking in VLLM with lmformatenforcer Token Count Tracking in VLLM with lmformatenforcer Nov 7, 2024
noamgat (Owner) commented Nov 30, 2024

Hi! The inference engines that LM Format Enforcer plugs into (transformers, vLLM, etc.) already provide this functionality. Is there something specific to LMFE which you need reported for cost purposes?
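To illustrate the maintainer's point: in vLLM, each `RequestOutput` returned by `generate` already carries `prompt_token_ids` and, per completion, `token_ids`, so input/output counts can be derived without re-encoding text. A minimal sketch, assuming those vLLM attribute names; the `count_tokens` helper itself is hypothetical, not part of either library:

```python
def count_tokens(prompt_token_ids, completion_token_ids):
    """Derive token-usage stats from already-tokenized id lists."""
    counts = {
        "input_tokens": len(prompt_token_ids),
        "output_tokens": len(completion_token_ids),
    }
    counts["total_tokens"] = counts["input_tokens"] + counts["output_tokens"]
    return counts

# With vLLM this would be used roughly like (not run here; needs a model):
#   outputs = llm.generate(prompts, sampling_params)
#   for request_output in outputs:
#       stats = count_tokens(request_output.prompt_token_ids,
#                            request_output.outputs[0].token_ids)

print(count_tokens([1, 2, 3, 4], [5, 6]))
# → {'input_tokens': 4, 'output_tokens': 2, 'total_tokens': 6}
```

Since lmformatenforcer only supplies a logits processor, it never changes the shape of the engine's output, so the same counting applies with or without format enforcement.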
