MinP for Exllama 2
Min P sampling added. When Top P = 0.69, it will override Top P and scale based on 'Min P' instead. Replace the sampler.py in /text-generation-webui-main/installer_files/env/Lib/site-packages/exllamav2/generator and it should function.
The way that it works is:
- Every possible token has a probability percentage attached to it.
- The base min p value represents the starting required percentage. (For example, 0.05 = only include tokens that are at least 5% probable)
- This gets multiplied by the probability of the top token in the list. So if your top token is 90%, that 5% is multiplied by 0.9, giving 4.5%.
- So if the top token is 90% probable, and your base_min_p is set to 0.05, then only tokens that are at least 4.5% probable will be sampled from before temperature is applied.
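The steps above can be sketched in a few lines of plain Python. This is only an illustration and not the code in sampler.py: the function name `min_p_filter` is made up for this example, and renormalizing the surviving tokens so they sum to 1 is an assumption about what happens before sampling.

```python
def min_p_filter(probs, base_min_p=0.05):
    """Illustrative Min P filter (hypothetical helper, not the sampler.py code).

    probs: list of token probabilities that sum to 1.
    Returns the filtered, renormalized probabilities and the cutoff used.
    """
    # Scale the base value by the probability of the most likely token.
    threshold = base_min_p * max(probs)
    # Zero out every token that falls below the scaled cutoff.
    kept = [p if p >= threshold else 0.0 for p in probs]
    # Assumption: the remaining tokens are renormalized before sampling.
    total = sum(kept)
    return [p / total for p in kept], threshold


# Top token is 90% probable, so the cutoff is 0.05 * 0.9 = 4.5%.
filtered, threshold = min_p_filter([0.90, 0.05, 0.03, 0.02], base_min_p=0.05)
```

With these inputs, the 3% and 2% tokens fall under the 4.5% cutoff and are dropped, while the 90% and 5% tokens remain candidates.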
This method seems more effective at selecting the reasonable tokens compared to both Top P and Top K.
Edit the SamplerBaseMinP.txt file to change the base 'consideration' value. The default is 0.05 (5%), but lower values can work surprisingly well even with a high temperature.
Setting Top P to 0.69 is how you toggle Min P on.
Note: This is built off the ALT version of the Entropy sampling implementation, but the Dynamic Temp is still only applied if your temp is set to 1.84, so you are not forced to use it.
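The two "magic value" toggles mentioned in this README (Top P = 0.69 for Min P, temp = 1.84 for Dynamic Temp) can be pictured roughly as below. This is a sketch only: `active_modes` and the constant names are hypothetical, and comparing with a small tolerance instead of exact equality is an assumption, not necessarily what sampler.py does.

```python
import math

# Trigger values taken from this README; the names are hypothetical.
MIN_P_TRIGGER_TOP_P = 0.69    # Top P value that switches Min P on
DYNAMIC_TEMP_TRIGGER = 1.84   # temperature value that switches Dynamic Temp on


def active_modes(top_p, temperature):
    """Hypothetical helper: report which special modes the settings enable."""
    return {
        # Assumption: a tolerance guards against float rounding from the UI.
        "min_p": math.isclose(top_p, MIN_P_TRIGGER_TOP_P, abs_tol=1e-6),
        "dynamic_temp": math.isclose(temperature, DYNAMIC_TEMP_TRIGGER, abs_tol=1e-6),
    }


# Min P on, Dynamic Temp off: Top P is 0.69 but the temperature is not 1.84.
modes = active_modes(top_p=0.69, temperature=1.0)
```

Any other Top P or temperature value leaves the corresponding mode off, so the two features can be enabled independently.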
Graphic Explanation of Min P: