Some tokens are meant to appear often in the text, and it may be desirable to avoid penalizing them because of their frequency. For the newline token there is an option, `penalize_nl` (or `--no-penalize-nl`). However, other tokens used in prompt formatting appear often as well, and there is no option to exclude them from penalties.
Models with added tokens may have some tokens appear both in formatting and in the model's output. In particular, some models use the `im_end` token as the stop token. This is already being discussed in #3538. The concern is that responses may get unnecessarily long as the stop token is penalized more and more because of its presence in every message.
To address this issue, I propose adding an option to exempt arbitrary tokens from penalties. This new option would work similarly to `penalize_nl` and would supersede it. It should be independent of `logit_bias`.
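The proposed behavior could look something like the following sketch. This is a hypothetical illustration, not llama.cpp's actual sampler API: the function name, parameter names, and the exact penalty formula (divide positive logits, multiply negative ones, as in the classic repetition penalty) are assumptions for demonstration.

```python
from collections import Counter

def apply_penalties(logits, prev_tokens, exempt_tokens,
                    repeat_penalty=1.1, freq_penalty=0.0, presence_penalty=0.0):
    """Apply repetition/frequency/presence penalties to logits, skipping
    any token listed in exempt_tokens (e.g. newline, im_end).

    Hypothetical sketch; not the real llama.cpp implementation.
    """
    counts = Counter(prev_tokens)
    for tok, count in counts.items():
        if tok in exempt_tokens:
            continue  # formatting/stop tokens stay unpenalized
        logit = logits[tok]
        # classic repetition penalty: shrink positive logits, push negatives down
        logits[tok] = logit / repeat_penalty if logit > 0 else logit * repeat_penalty
        # frequency/presence penalties as used by OpenAI-style samplers
        logits[tok] -= count * freq_penalty + presence_penalty
    return logits
```

With such an option, passing the id of `im_end` in `exempt_tokens` would keep the stop token's logit untouched regardless of how many messages precede it.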
Another possible solution is to automatically exclude all stop tokens from penalties. While this approach may be easier, it does not cover all cases, such as the semicolon. In fact, these approaches are not mutually exclusive and could both be implemented if desired.
Someone else proposed a more general solution: supplying a separate text for the penalty calculation, which could exclude formatting elements like "ASSISTANT:" and end tokens. But I can't find that proposal now.
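The "separate penalty text" idea amounts to counting repetitions over a filtered view of the history rather than the raw context. A minimal sketch, assuming a hypothetical helper name (not an existing llama.cpp function):

```python
def penalty_context(history_tokens, formatting_tokens):
    """Return the token sequence used for penalty counting, with
    formatting tokens (role headers, stop tokens, etc.) dropped.

    Hypothetical helper illustrating the 'separate penalty text' proposal.
    """
    return [t for t in history_tokens if t not in formatting_tokens]
```

The sampler would then compute its repeat counts over `penalty_context(...)` instead of the full context, so tokens like `im_end` never accumulate a penalty in the first place.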