This repository was archived by the owner on Jan 13, 2025. It is now read-only.

Revisited & Fixed Half (fp16) data support #495

Merged


OuadiElfarouki
Collaborator

This PR updates and extends half data support in portBLAS and includes the following changes:

  • half support is enabled with the CMake option BLAS_ENABLE_HALF and is only applied to operators meant to support half according to the oneMKL spec (so far in this PR: axpy, scal and gemm).
  • Unit tests and benchmarks are extended to support mixed-precision comparison, since the reference BLAS libraries only support float/double.
  • Extended unit tests for axpy, scal and gemm (+gemm_batched) using half.
  • Extended portblas, cublas and rocblas benchmarks for gemm (+gemm_batched).
  • Separated the gemm configurations used with the half data type from the float/double configurations for each TUNING_TARGET.
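
To illustrate the mixed-precision comparison mentioned above, here is a minimal sketch and not the code from this PR. It runs axpy in a narrow type, runs the same operation in a wider reference type, then upcasts the narrow result and compares with a tolerance scaled by the narrow type's epsilon. All names (`tested_axpy`, `reference_axpy`, `mixed_precision_equal`, `axpy_matches_reference`) and the tolerance factor are hypothetical. To stay compilable without a SYCL toolchain, float stands in for the narrow type and double for the reference; with SYCL, `sycl::half` and float would presumably take those roles, assuming `std::numeric_limits` is specialized for `sycl::half`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Stand-in for a reference BLAS axpy, computed in the wide type ref_t.
template <typename ref_t>
void reference_axpy(ref_t alpha, const std::vector<ref_t>& x,
                    std::vector<ref_t>& y) {
  assert(x.size() == y.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = alpha * x[i] + y[i];
}

// Stand-in for the library axpy under test, computed in the narrow type.
template <typename scalar_t>
void tested_axpy(scalar_t alpha, const std::vector<scalar_t>& x,
                 std::vector<scalar_t>& y) {
  assert(x.size() == y.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    y[i] = static_cast<scalar_t>(alpha * x[i] + y[i]);
}

// Upcast the narrow result and compare against the wide reference.
// The tolerance is `factor` times the epsilon of the narrow type, scaled by
// the magnitude of the expected value (hypothetical choice for this sketch).
template <typename scalar_t, typename ref_t>
bool mixed_precision_equal(const std::vector<scalar_t>& got,
                           const std::vector<ref_t>& expected,
                           ref_t factor = 4) {
  if (got.size() != expected.size()) return false;
  const ref_t eps =
      static_cast<ref_t>(std::numeric_limits<scalar_t>::epsilon());
  for (std::size_t i = 0; i < got.size(); ++i) {
    const ref_t diff = std::abs(static_cast<ref_t>(got[i]) - expected[i]);
    const ref_t scale = std::max(ref_t(1), std::abs(expected[i]));
    if (diff > factor * eps * scale) return false;
  }
  return true;
}

// Runs both versions on the same inputs and compares the results.
bool axpy_matches_reference() {
  const float alpha = 0.1f;
  const std::vector<float> x{1.0f, 2.0f, 3.5f};
  std::vector<float> y{0.5f, -1.0f, 2.0f};
  // The reference inputs are upcast copies of the narrow inputs, so only
  // the arithmetic differs, not the input rounding.
  std::vector<double> x_ref(x.begin(), x.end());
  std::vector<double> y_ref(y.begin(), y.end());
  tested_axpy(alpha, x, y);
  reference_axpy(static_cast<double>(alpha), x_ref, y_ref);
  return mixed_precision_equal(y, y_ref);
}
```

Deriving the tolerance from the narrow type rather than the reference type is the point of the sketch: a half result cannot be expected to match a float reference to float precision.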

Other notes:

  • half precision support is disabled when targeting DEFAULT_CPU due to lack of fp16 support.
  • Some legacy gemm configurations for Intel GPU targets with sycl::half have been removed (they were not based on tuning, but were a temporary reduction of the generated kernels).

@muhammad-tanvir-1211 muhammad-tanvir-1211 merged commit 3adb52c into codeplaysoftware:master Feb 27, 2024
3 checks passed
4 participants