Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extended Gemm interface to support mixed precision operations #500

Merged
merged 56 commits into from
May 15, 2024

Conversation

OuadiElfarouki
Copy link
Collaborator

Extend Gemm operator interface to support mixed precision operations, namely by decoupling matrix A and B type element_in_t from output matrix C and scalars alpha and beta type element_out_t.

Following oneMKL's spec notation for Gemm API : https://spec.oneapi.io/versions/latest/elements/oneMKL/source/domains/blas/gemm.html#onemkl-blas-gemm, this PR enables (Ta==Tb) to be set independently from (Tc==Ts). This feature has been enabled at a first stage for Ta=Tb=sycl::half and Tc=Ts=float. Thus enabling half support also enables the mixed precision case of (half, float) for gemm.

Changes include:

  • Updating different Gemm kernel implementations to account for the decoupled types.
  • Necessary CMake and Kernel generation scripts updates to account for the couple of types instead of single type in gemm case.
  • Necessary changes to unit-tests to account for this feature in the Gemm case.

Note :
Following oneMKL expected Gemm API, Support of bfloat16-float and std::int8_t-float would be straightforward afterwards, but the additional cases of Ta==Tb==Tc while Ts (alpha & beta) is separate will require additional decoupling & re-design work..

OuadiElfarouki and others added 30 commits February 5, 2024 09:04
cmake/CmakeFunctionHelper.cmake Outdated Show resolved Hide resolved
cmake/CmakeFunctionHelper.cmake Show resolved Hide resolved
cmake/CmakeFunctionHelper.cmake Outdated Show resolved Hide resolved
cmake/CmakeFunctionHelper.cmake Outdated Show resolved Hide resolved
include/operations/blas3_trees.h Outdated Show resolved Hide resolved
src/operations/blas3/gemm_common.hpp Show resolved Hide resolved
src/operations/blas3/gemm_local_joint_matrix.hpp Outdated Show resolved Hide resolved
src/sb_handle/portblas_handle.hpp Outdated Show resolved Hide resolved
test/blas_test_macros.hpp Outdated Show resolved Hide resolved
test/unittest/blas3/blas3_gemm_common.hpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@s-Nick s-Nick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for you work @OuadiElfarouki.
There is some confusion with element_in_t and element_out_t because you forget to update element_t to one or the other. Can you please fix it? @hjabird highlighted many of them. In general, I agree that something like in_element_t and out_element_t would be more readable, please consider it.

src/operations/blas3/gemm_interleaved.hpp Outdated Show resolved Hide resolved
if constexpr (is_half<element_in_t>::value ||
!std::is_same_v<element_in_t, element_out_t>) {
#pragma unroll
for (int v = 0; v < VectorSize; ++v) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate why this loop is necessary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@s-Nick AdaptiveCpp doesn't support sycl::vecsycl::half for mad operation, so had to break it down to its elements through the loop

@OuadiElfarouki OuadiElfarouki requested a review from Rbiessy April 18, 2024 15:13
@OuadiElfarouki
Copy link
Collaborator Author

@s-Nick @Rbiessy @hjabird I've rebased and fixed some bugs in the current PR after recent commits. Feel free to check it ! Thanks.

Copy link
Collaborator

@Rbiessy Rbiessy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick look I don't have any major concern.

Copy link
Collaborator

@s-Nick s-Nick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@s-Nick s-Nick merged commit 3a3113a into codeplaysoftware:master May 15, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants