I am using vLLM with lmformatenforcer and would like to request a feature: tracking token counts directly in the output of the generate function. This would make it easier to monitor token usage and manage generation costs without manually encoding and decoding text.
Proposed Solution
Could you add a token count field to the output of the generate function? For example, the output could report the number of tokens in both the input prompt and the generated text.
This addition would streamline token management and provide useful data for cost tracking.
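For context, here is the kind of manual workaround this would replace (a rough sketch, assuming a Hugging Face tokenizer; the model name is a placeholder):

```python
# Sketch of the manual workaround: re-encode text just to count tokens.
# The model name is a placeholder.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")

def count_tokens(text: str) -> int:
    # add_special_tokens=False so BOS/EOS markers don't inflate the count
    return len(tokenizer.encode(text, add_special_tokens=False))

prompt = "Hello, my name is"
generated = " Alice, and I build inference pipelines."
print(count_tokens(prompt), count_tokens(generated))
```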
Thank you for considering this request, and I look forward to any guidance or potential updates on this feature!
wrench1997 changed the title from "Feature Request: Token Count Tracking in VLLM with lmformatenforcer" to "Token Count Tracking in VLLM with lmformatenforcer" on Nov 7, 2024.
Hi! The inference engines that LM Format Enforcer plugs into (transformers, vLLM, etc.) already provide this functionality. Is there something specific to LMFE that you need reported for cost purposes?
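For reference, a rough sketch of reading those counts straight off vLLM's offline generate API (the model name is a placeholder; attribute names follow vLLM's RequestOutput as I understand it):

```python
# Sketch: token counts are already present in vLLM's generate() output.
# The model name is a placeholder; swap in whatever you are serving.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(max_tokens=64)

for request_output in llm.generate(["Hello, my name is"], params):
    # Each RequestOutput carries the prompt token IDs, and each of its
    # CompletionOutputs carries the generated token IDs, so counting
    # is just len() with no re-encoding needed.
    prompt_tokens = len(request_output.prompt_token_ids)
    completion_tokens = len(request_output.outputs[0].token_ids)
    print(f"prompt={prompt_tokens} completion={completion_tokens} "
          f"total={prompt_tokens + completion_tokens}")
```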