Commit
wvsplitk templatized and better tuned for MI300 (opendatahub-io#132)
* improvements to wvSpltK
* wvsplt gemm; better handle MI300 and large A[] sizes
* lint fix
* Adjustments to better handle small weights in TP8.
* early-out bug fix
* better wave load balancing in wvSplt
* add missing skip for wvsplt_big
* Bug fix for wvSplt_big in load balancing at M4, lint fix.