Describe the bug

I encountered an issue when using the Hugging Face Inference API with the Qwen2-VL-7B-Instruct model. Despite providing valid input, the API returned an error indicating that the token count exceeded the limit.

Reproduction

Run the following curl command:
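The original command is not preserved in the thread, so the following is only a minimal sketch of a comparable request, assuming the OpenAI-compatible chat completions route of the serverless Inference API; the image URL and HF_TOKEN are placeholders, and max_tokens corresponds to the max_new_tokens value reported in the error:

curl https://api-inference.huggingface.co/models/Qwen/Qwen2-VL-7B-Instruct/v1/chat/completions \
  -X POST \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2-VL-7B-Instruct",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
          {"type": "text", "text": "Describe this image."}
        ]
      }
    ],
    "max_tokens": 500
  }'

A sufficiently large image is enough to push the server-side input token count into the thousands, which triggers the validation error captured in the logs below.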
Logs

{
  "error": {
    "message": "Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4096. Given: 8740 `inputs` tokens and 500 `max_new_tokens`",
    "http_status_code": 422
  }
}
{"error":"Model Qwen/Qwen2-VL-2B-Instruct is currently loading","estimated_time":176.71885681152344}{"error":"Input validation error: `inputs` tokens + `max_new_tokens` must be <= 4096. Given: 8740 `inputs` tokens and 500 `max_new_tokens`","error_type":"validation"}
System info

Hello @NEWbie0709,

This is probably the same issue as the one reported in #2760, and it is definitely an issue on the TGI side rather than in huggingface_hub. I suggest following the related TGI issue text-generation-inference#2923, since other users have experienced the same problem of images consuming more tokens than they should.