Update on naive_attn module #21

seungrokj · 2024-05-28T07:03:52Z

Hi,

This PR fixes some syntax errors in the use_naive_attn module in rocm_flash_attn.py.

It also allows navi3x to use use_triton_flash_attn:
https://github.com/ROCm/vllm/blob/perf_benchmark_navi/vllm/attention/backends/rocm_flash_attn.py
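
For readers who have not looked at the backend, the effect of the navi3x change can be sketched roughly as below. This is not the code from rocm_flash_attn.py. The helper `select_attn_path`, the `AttnPath` enum, the `OTHER_FA` fallback, and the idea of keying on a major architecture number (assumed to be 11 for navi3x and 9 for MI300X) are all assumptions made for illustration.

```python
from enum import Enum


class AttnPath(Enum):
    """Hypothetical labels for the attention paths discussed in this PR."""
    TRITON_FA = "triton_flash_attn"
    NAIVE = "naive_attn"
    OTHER_FA = "other_flash_attn"  # assumption: some non-Triton flash attention


def select_attn_path(gfx_major: int, use_triton_flash_attn: bool) -> AttnPath:
    """Pick an attention path for a ROCm device (illustrative only).

    gfx_major: major GPU architecture number (assumed 11 for navi3x).
    use_triton_flash_attn: whether the Triton flash attention path is enabled.
    """
    if use_triton_flash_attn:
        # With this PR, navi3x is allowed to take the Triton path too.
        return AttnPath.TRITON_FA
    if gfx_major == 11:
        # Assumption: with Triton disabled, navi3x falls back to naive attention.
        return AttnPath.NAIVE
    return AttnPath.OTHER_FA


if __name__ == "__main__":
    print(select_attn_path(11, True).value)
    print(select_attn_path(11, False).value)
```

Keeping the decision in one small pure function makes it easy to check each device and flag combination without a GPU.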

Added one more triton.Config for better performance (3-10% gain for Llama 3 70B on MI300X):

https://github.com/ROCm/vllm/blob/perf_benchmark_navi/vllm/attention/ops/triton_flash_attention.py

Regards,
Seungrok

To fix https://ontrack-internal.amd.com/browse/SWDEV-469079, cherry-pick https://github.com/ROCm/vllm/pull/16/files to the perf_benchmark_navi branch.

github-actions · 2024-11-20T02:01:39Z

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

root and others added 18 commits May 13, 2024 06:04

gqa triton FA update

2a2d914

gqa triton FA update

a18dc26

gqa triton FA update

135ce8d

gqa triton FA update, dockerfile add

1cc689e

gqa triton FA update, dockerfile add

d328167

tunableops table path fix

aefd4d3

tp8

5b90c8f

batch1to32sweep

d0c9bc3

enable triton fa in navi

c8bfa8e

FA log debug to info

c8fb65e

naive mha syntax error fix

385f2cd

update missing config

a803ce9

triton fa enabled

b3349ac

triton fa enabled

af01cef

triton fa enabled

f18de52

Update attention_utils.cuh (#58)

3bd137e

To fix https://ontrack-internal.amd.com/browse/SWDEV-469079, cherry-pick https://github.com/ROCm/vllm/pull/16/files to the perf_benchmark_navi branch.

triton FA fix for non power-of-two head_size, phi-2 model, head_size=80

d9359c4

pinning down numpy version for AI models

807d654
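
A note on the commit "triton FA fix for non power-of-two head_size" listed above: Triton requires block ranges such as those built with `tl.arange` to have power-of-two sizes, so a head_size of 80 (as in phi-2) cannot be used directly as a block dimension. One common way to handle this, shown here as a plain-Python sketch, is to round the head dimension up to the next power of two and mask off the padded columns. Whether the commit does exactly this is an assumption, and the helper names below are hypothetical.

```python
def next_power_of_2(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be >= 1)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def head_dim_mask(head_size: int) -> list[bool]:
    """Mask over the padded head dimension.

    True marks a real column, False marks padding that a kernel would
    skip on load and store (hypothetical helper for illustration).
    """
    padded = next_power_of_2(head_size)
    return [i < head_size for i in range(padded)]


if __name__ == "__main__":
    head_size = 80  # phi-2, per the commit message
    mask = head_dim_mask(head_size)
    print(next_power_of_2(head_size), sum(mask), len(mask) - sum(mask))
```

For head_size=80 the padded dimension is 128, so 48 of the 128 columns are masked padding.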

github-actions bot added the stale label Nov 20, 2024
