b3620

github-actions released this 24 Aug 20:44
e11bd85
CPU/CUDA: Gemma 2 FlashAttention support (#8542)

* CPU/CUDA: Gemma 2 FlashAttention support

* apply logit_softcap to scale in kernel

* disable logit softcapping tests on Metal

* remove metal check
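
As a rough illustration of the "apply logit_softcap to scale in kernel" item, here is a minimal sketch, assuming the usual Gemma 2 formulation in which an attention logit `s` is capped as `logit_softcap * tanh(s / logit_softcap)`. The function names are hypothetical and this is not the llama.cpp kernel code; it only shows that dividing the scale by the softcap up front gives the same result as capping the scaled logit afterwards.

```python
import math


def reference_score(q, k, scale, logit_softcap):
    """Scale the dot product first, then soft-cap it (textbook form)."""
    s = scale * sum(qi * ki for qi, ki in zip(q, k))
    if logit_softcap == 0.0:
        # Assumption: a softcap of 0 means "softcapping disabled".
        return s
    return logit_softcap * math.tanh(s / logit_softcap)


def folded_score(q, k, scale, logit_softcap):
    """Fold the softcap into the scale, so only tanh and one multiply remain."""
    dot = sum(qi * ki for qi, ki in zip(q, k))
    if logit_softcap == 0.0:
        return scale * dot
    folded_scale = scale / logit_softcap  # done once, outside the inner loop
    return logit_softcap * math.tanh(folded_scale * dot)


if __name__ == "__main__":
    q, k = [1.0, 2.0], [3.0, 4.0]
    print(reference_score(q, k, 0.5, 50.0))
    print(folded_score(q, k, 0.5, 50.0))
```

Folding the division into the scale saves one operation per logit, and the capped value never exceeds `logit_softcap` in magnitude.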