A question about attention block sparsity #23

imh966 · 2024-03-22T03:27:59Z

Hi, thanks for your excellent work! I'm quite interested in how you handle the missing KV cache under attention block sparsity. However, when I read the code for the accuracy benchmark, I couldn't find anything related to the missing KV cache. The code appears to modify only the output of self-attention and leaves the KV cache untouched. Is there code that hasn't been released yet, or am I looking in the wrong place for the accuracy benchmark?

