ggml: Mamba Cuda kernel performance improve #9186
Closed
piDack wants to merge 20 commits into ggerganov:master from piDack:mfalcon_mamba_cuda
+264 -6
Commits
Commits on Aug 24, 2024
- Update CUDA ops ssm_conv and ssm_scan to match CPU implementation from PR ggerganov#7531 (as per eb589d5)
- Update CUDA ops and tests to match implementation from commit 8fb57ac (llama : use im2col and mul_mat to perform convolution for Mamba); GPU version breaks with assert because of unsupported MUL_MAT
Commits on Aug 25, 2024
Commits on Aug 26, 2024
Commits on Aug 27, 2024
Commits on Aug 28, 2024