
Token Count Tracking in VLLM with lmformatenforcer #148

Open
wrench1997 opened this issue Nov 7, 2024 · 1 comment

wrench1997 commented Nov 7, 2024

I am using VLLM with lmformatenforcer and would like to request a feature to track token counts directly within the output of the generate function. This feature would allow for easier monitoring of token usage and would help in managing generation costs without needing to manually encode and decode text.

Proposed Solution

Could you add a token count field to the output of the generate function? For example, the output could include the number of tokens processed for both input and generated output text.

This addition would streamline token management and provide useful data for cost tracking.

Thank you for considering this request, and I look forward to any guidance or potential updates on this feature!

@wrench1997 wrench1997 changed the title Feature Request: Token Count Tracking in VLLM with lmformatenforcer Token Count Tracking in VLLM with lmformatenforcer Nov 7, 2024
noamgat (Owner) commented Nov 30, 2024

Hi! The inference engines that LM Format Enforcer plugs into (transformers, vLLM, etc.) already provide this functionality. Is there something specific to LMFE which you need reported for cost purposes?
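To illustrate the maintainer's point: in vLLM, each `RequestOutput` returned by `generate` already carries `prompt_token_ids` and, per completion, `token_ids`, so input/output counts can be derived without re-encoding text. A minimal sketch, assuming those vLLM attribute names; the `count_tokens` helper itself is hypothetical, not part of either library:

```python
def count_tokens(prompt_token_ids, completion_token_ids):
    """Derive token-usage stats from already-tokenized id lists."""
    counts = {
        "input_tokens": len(prompt_token_ids),
        "output_tokens": len(completion_token_ids),
    }
    counts["total_tokens"] = counts["input_tokens"] + counts["output_tokens"]
    return counts

# With vLLM this would be used roughly like (not run here; needs a model):
#   outputs = llm.generate(prompts, sampling_params)
#   for request_output in outputs:
#       stats = count_tokens(request_output.prompt_token_ids,
#                            request_output.outputs[0].token_ids)

print(count_tokens([1, 2, 3, 4], [5, 6]))
# → {'input_tokens': 4, 'output_tokens': 2, 'total_tokens': 6}
```

Since lmformatenforcer only supplies a logits processor, it never changes the shape of the engine's output, so the same counting applies with or without format enforcement.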
