Half-Precision SGEMV() and SGEMM() do not support CblasColMajor #2399

Open
djeong20 opened this issue Jan 9, 2024 · 1 comment

Comments

@djeong20
Contributor

djeong20 commented Jan 9, 2024

Currently, the half-precision implementations of SGEMV() and SGEMM() ignore the order parameter (CBLAS_ORDER), so they do not support CblasColMajor.

/** CBLAS_ORDER order parameter is not used. */
static void sgemv_FP16(CBLAS_ORDER order, CBLAS_TRANSPOSE TransA,
                       const unsigned int M, const unsigned int N,
                       const float alpha, const _FP16 *A,
                       const unsigned int lda, const _FP16 *X, const int incX,
                       const float beta, _FP16 *Y, const int incY) {

  unsigned int incy = abs(incY);
  unsigned int incx = abs(incX);

  if (TransA == CblasTrans) {
#if (defined USE__FP16 && USE_NEON)
    nntrainer::neon::sgemv_transpose_neon_fp16(A, X, Y, M, N, alpha, beta);
#else
    sgemv_loop_fp16(i, j, N, M);
#endif
  } else {
#if (defined USE__FP16 && USE_NEON)
    nntrainer::neon::sgemv_neon_fp16(A, X, Y, M, N, alpha, beta);
#else
    sgemv_loop_fp16(j, i, M, N);
#endif
  }
}
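To make the expected semantics concrete, here is a minimal reference sketch of a layout-aware GEMV. It is not nntrainer code: `Layout`, `Op`, and `gemv_ref` are hypothetical names, `float` stands in for `_FP16` so the example compiles anywhere, and unit strides are assumed for `x` and `y`.

```cpp
#include <cstddef>

// Hypothetical stand-ins for CBLAS_ORDER / CBLAS_TRANSPOSE so the sketch
// compiles without cblas.h.
enum class Layout { RowMajor, ColMajor };
enum class Op { NoTrans, Trans };

// Reference y = alpha * op(A) * x + beta * y, where A is logically M x N.
// Assumptions: float instead of _FP16, unit stride for x and y.
void gemv_ref(Layout layout, Op op, std::size_t M, std::size_t N, float alpha,
              const float *A, std::size_t lda, const float *x, float beta,
              float *y) {
  // Element (i, j) of A lives at a layout-dependent offset.
  auto at = [&](std::size_t i, std::size_t j) {
    return layout == Layout::RowMajor ? A[i * lda + j] : A[j * lda + i];
  };
  const std::size_t rows = (op == Op::NoTrans) ? M : N; // length of y
  const std::size_t cols = (op == Op::NoTrans) ? N : M; // length of x
  for (std::size_t r = 0; r < rows; ++r) {
    float acc = 0.0f;
    for (std::size_t c = 0; c < cols; ++c)
      acc += (op == Op::NoTrans ? at(r, c) : at(c, r)) * x[c];
    y[r] = alpha * acc + beta * y[r];
  }
}
```

For A = [[1, 2, 3], [4, 5, 6]] and x = (1, 1, 1), both the row-major buffer {1, 2, 3, 4, 5, 6} (lda = 3) and the column-major buffer {1, 4, 2, 5, 3, 6} (lda = 2) give y = (6, 15) when the matching layout is passed. Reading the column-major buffer as if it were row-major gives (7, 14) instead, which is the kind of mismatch that appears when the layout argument is ignored.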

Because the order argument is ignored, a call made with CblasColMajor is handled exactly like a row-major call, which causes problems when these functions are used with column-major data layouts.
Support for CblasColMajor is needed in the half-precision implementations of SGEMV() and SGEMM().
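One possible way to add this support without new kernels is the standard BLAS identity that a column-major buffer, read as row-major, holds the transpose of the same matrix. A column-major GEMM then reduces to a row-major GEMM with the two operands and M/N swapped. The sketch below illustrates that idea only: `Layout`, `Op`, `gemm_row_major`, and `gemm` are hypothetical names, `float` stands in for `_FP16`, and whether the existing NEON kernels can be reused this way is not confirmed by the issue.

```cpp
#include <cstddef>

// Hypothetical stand-ins for CBLAS_ORDER / CBLAS_TRANSPOSE.
enum class Layout { RowMajor, ColMajor };
enum class Op { NoTrans, Trans };

// Row-major reference kernel: C (M x N) = alpha * op(A) * op(B) + beta * C.
// Assumption: float instead of _FP16.
void gemm_row_major(Op ta, Op tb, std::size_t M, std::size_t N, std::size_t K,
                    float alpha, const float *A, std::size_t lda,
                    const float *B, std::size_t ldb, float beta, float *C,
                    std::size_t ldc) {
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < K; ++k) {
        const float a = (ta == Op::NoTrans) ? A[i * lda + k] : A[k * lda + i];
        const float b = (tb == Op::NoTrans) ? B[k * ldb + j] : B[j * ldb + k];
        acc += a * b;
      }
      C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
    }
  }
}

// Layout-aware wrapper that honors the layout argument.
void gemm(Layout layout, Op ta, Op tb, std::size_t M, std::size_t N,
          std::size_t K, float alpha, const float *A, std::size_t lda,
          const float *B, std::size_t ldb, float beta, float *C,
          std::size_t ldc) {
  if (layout == Layout::RowMajor) {
    gemm_row_major(ta, tb, M, N, K, alpha, A, lda, B, ldb, beta, C, ldc);
  } else {
    // Column-major C equals row-major C^T, and C^T = op(B)^T * op(A)^T,
    // so swap the operands (with their flags and strides) and swap M with N.
    gemm_row_major(tb, ta, N, M, K, alpha, B, ldb, A, lda, beta, C, ldc);
  }
}
```

The same reasoning applies to GEMV: a column-major call with a given transpose flag is equivalent to a row-major call with the flag flipped and M and N swapped, so the existing transpose and non-transpose paths could in principle serve both layouts.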

@taos-ci

taos-ci commented Jan 9, 2024

:octocat: cibot: Thank you for posting issue #2399. The person in charge will reply soon.
