
[Bug] Max tokens parameter is being incorrectly set #4596

Closed
3 tasks done
yuzukumo opened this issue Apr 30, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@yuzukumo

Bug Description

Max tokens is capped at 512,000, while some models support more than this limit. For example, gemini-1.5-pro-latest supports a maximum of 1,048,576 tokens.

Steps to Reproduce

Adjust the max tokens parameter.

Expected Behavior

Max 512,000.

Screenshots

No response

Deployment Method

  • Docker
  • Vercel
  • Server

Desktop OS

Windows 11

Desktop Browser

Edge

Desktop Browser Version

124

Smartphone Device

No response

Smartphone OS

No response

Smartphone Browser

No response

Smartphone Browser Version

No response

Additional Logs

No response

yuzukumo added the bug (Something isn't working) label on Apr 30, 2024
@yuzukumo
Author

A PR has been submitted to fix it: #4597

@Algorithm5838
Contributor

The max_tokens setting is often misunderstood, and its description in this project's settings is incorrect. To clarify, max_tokens does not refer to the LLM's context, which is the combination of input and output tokens. Instead, max_tokens limits the number of output tokens generated by the LLM.

To illustrate this, consider the gpt-4-turbo model, which has a context size of 128,000 tokens. However, its output is capped at 4,096 tokens. To obtain the maximum output, you should set max_tokens to 4,096. If you set it to 128,000, you'll encounter an error, as you won't leave sufficient tokens for the input.

Note that max_tokens is disabled by default in this project, but enabled for vision models with a setting of 4,000. If you're using a custom fork, be sure to enable max_tokens accordingly.
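For illustration, here is a minimal sketch of the point above, assuming a direct call to the OpenAI chat completions endpoint from TypeScript (the API key variable and the helper function name are hypothetical, not part of this project's code):

```typescript
// Minimal sketch: max_tokens caps only the *output* tokens.
// The prompt plus the output must still fit inside the model's context
// window (128,000 tokens for gpt-4-turbo), so max_tokens should be set
// to the model's output limit, not its context size.
async function complete(prompt: string): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`, // assumed env var
    },
    body: JSON.stringify({
      model: "gpt-4-turbo",
      messages: [{ role: "user", content: prompt }],
      // 4,096 is gpt-4-turbo's output cap; requesting 128,000 here would
      // fail because no room would be left for the input tokens.
      max_tokens: 4096,
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content;
}
```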

@yuzukumo
Author

yuzukumo commented May 3, 2024

Thanks for the clarification!

@yuzukumo yuzukumo closed this as completed May 3, 2024