[Usage]: KV Cache Evict Method #6
Comments
Hello, there are 4 main parts of the eviction process:
Step 4 is handled by the Before determining which reordering to apply in
And you're correct that the purpose of
Thank you so much for the explanation!
Hi,
Thank you again for the great repo.
I would like to dig deeper into it. I am looking at the place where the actual eviction happens after sorting, and I would like to understand whether the code discards the same number of KVs per layer and head.
As far as I understand, sorting and eviction happen at L441. More specifically, the eviction happens in `count_block_evictions`, which uses C code. Am I following correctly here?
If so, I wonder whether this is a `uniform_evict`, since the number of KVs in each layer/head is the same. If not, could you guide me on how to do it? Is the `uniform_evict` part working?
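To make the question concrete, here is a minimal sketch of the two behaviours I am trying to tell apart. It is not taken from the repo: the function names `uniform_evict_counts` and `global_evict_counts`, the nested-list score layout, and the toy scores are all hypothetical, and I am assuming that a lower score means a KV is evicted first.

```python
from collections import Counter


def uniform_evict_counts(scores, per_head):
    """Hypothetical: evict the same number of lowest-scoring KVs from every (layer, head)."""
    return {
        (l, h): min(per_head, len(head_scores))
        for l, layer in enumerate(scores)
        for h, head_scores in enumerate(layer)
    }


def global_evict_counts(scores, total):
    """Hypothetical: evict the `total` lowest-scoring KVs overall, so counts may differ per head."""
    # Flatten to (score, layer, head) so all KVs compete in one global ranking.
    flat = [
        (s, l, h)
        for l, layer in enumerate(scores)
        for h, head_scores in enumerate(layer)
        for s in head_scores
    ]
    flat.sort(key=lambda t: t[0])
    counts = Counter((l, h) for _, l, h in flat[:total])
    # Report every (layer, head), including those that lose nothing.
    return {
        (l, h): counts.get((l, h), 0)
        for l, layer in enumerate(scores)
        for h, _ in enumerate(layer)
    }


# Toy input: scores[layer][head] is a list of per-KV scores (assumed: lower = evicted first).
scores = [
    [[0.9, 0.1, 0.2, 0.8], [0.7, 0.6, 0.5, 0.95]],
    [[0.05, 0.3, 0.4, 0.85], [0.99, 0.98, 0.97, 0.96]],
]

uniform = uniform_evict_counts(scores, per_head=1)
non_uniform = global_evict_counts(scores, total=4)
print(uniform)
print(non_uniform)
```

Both calls evict 4 KVs in total, but the first removes one from every head while the second takes them only from the heads holding the lowest scores. My question is which of these the sorting and eviction code corresponds to.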