Export recent MHA changes from fbcode
triton_splitk extensively expanded.
Includes some paged attention, merge_attentions

ghstack-source-id: 2e01b95df689395cb64844eb6db5e5638058e599
Pull Request resolved: fairinternal/xformers#1031

__original_commit__ = fairinternal/xformers@19b9c21
bottler authored and xFormers Bot committed Feb 19, 2024
1 parent de5e5b9 commit a565083
Showing 10 changed files with 1,552 additions and 503 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -6,8 +6,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [0.0.25] - TBD
### Added
- New merge_attentions function
### Improved
- fMHA: Updated Flash-Attention to v2.5.2: this has a performance improvement for multiquery.
- fMHA: triton_splitk changed and expanded. Now amalgamates using LSE. Can autotune, supports causal with a small number of queries - not just 1. Experimental support for paged attention.
### Removed

## [0.0.24] - 2024-01-31
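
The changelog entries above mention a new merge_attentions function and say triton_splitk "now amalgamates using LSE". As a rough illustration of that idea only, and not the xFormers implementation or the real signature of merge_attentions, the sketch below shows how attention computed separately over chunks of keys can be combined exactly from each chunk's output and log-sum-exp. The names `softmax_attention` and `merge_partial_attentions` are hypothetical, and the code handles a single query in plain Python for clarity.

```python
import math


def softmax_attention(scores, values):
    """Attention for one query over one chunk of keys.

    scores: list of query-key logits for the chunk.
    values: list of value vectors, one per key.
    Returns (output vector, log-sum-exp of the scores).
    """
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    denom = sum(weights)
    dim = len(values[0])
    out = [
        sum(w * v[d] for w, v in zip(weights, values)) / denom
        for d in range(dim)
    ]
    return out, m + math.log(denom)


def merge_partial_attentions(partials):
    """Combine per-chunk (output, lse) pairs into attention over all chunks.

    Hypothetical helper: each chunk's output is weighted by
    exp(lse_chunk - lse_total), which undoes the chunk-local normalisation.
    """
    lses = [lse for _, lse in partials]
    m = max(lses)
    coeffs = [math.exp(lse - m) for lse in lses]
    total = sum(coeffs)
    dim = len(partials[0][0])
    merged = [
        sum(c * out[d] for c, (out, _) in zip(coeffs, partials)) / total
        for d in range(dim)
    ]
    return merged, m + math.log(total)
```

Splitting the keys into chunks, calling `softmax_attention` on each, and passing the results to `merge_partial_attentions` should give the same output and LSE as a single call over all keys, up to floating-point rounding. That property is what allows a split-K style kernel to process key chunks independently and combine them afterwards.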