How to generate a fixed amount of tokens when calling the model #1334
Unanswered
JeremyGe07 asked this question in Q&A
I have read the doc but still cannot find how to generate a fixed amount of tokens in the call function. Let's say, e.g., how can I guarantee that exactly 50 tokens are generated in the following code?
In llama.cpp, we can use the -n (--n-predict) parameter to set the number of tokens to predict when generating text, and the --ignore-eos parameter to keep generating until that length is reached, as the doc says.
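For reference, this is roughly what I mean in llama.cpp terms (a minimal sketch; the model path and prompt are placeholders, and the binary may be `./main` in older builds):

```sh
# Ask for exactly 50 tokens: -n / --n-predict caps the output length,
# and --ignore-eos keeps sampling even if the model emits an end-of-sequence token early.
./llama-cli -m ./models/model.gguf -p "Once upon a time" -n 50 --ignore-eos
```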
Thanks in advance for your help.