DPC++ compile bug #483
I also found another compile error when using DPC++ on an NVIDIA A100, and I can't fix it. I want to test the performance of portBLAS gemm on tensor cores, so I revised the following CMake block:
if (${start_idx} AND ${sm_val} GREATER_EQUAL "80")
  add_definitions(-DSB_ENABLE_JOINT_MATRIX=1)
  add_definitions(-DNVIDIA_GPU=1)
  list(APPEND DPCPP_FLAGS "-Xclang;-cl-mad-enable")
  list(APPEND DPCPP_FLAGS "-DSYCL_EXT_ONEAPI_MATRIX_VERSION=4")
  list(APPEND DPCPP_FLAGS "-DSB_ENABLE_JOINT_MATRIX=1")
  list(APPEND DPCPP_FLAGS "-DNVIDIA_GPU=1")
endif()
Meanwhile, I revised the API call:
// reg_res[frag] = joint_matrix_mad(sg, inA, inB, reg_res[frag]);
joint_matrix_mad(sg, reg_res[frag], inA, inB, reg_res[frag]);
After that, I compiled portBLAS and the samples with the commands
$ CC=clang CXX=clang++ cmake -GNinja ../ -DSYCL_COMPILER=dpcpp -DDPCPP_SYCL_TARGET="nvptx64-nvidia-cuda" -DDPCPP_SYCL_ARCH="sm_80" -DCMAKE_PREFIX_PATH=/opt/OpenBLAS -DCMAKE_THREAD_LIBS_INIT=-lpthread -DBLAS_ENABLE_TESTING=OFF -DBLAS_ENABLE_BENCHMARK=OFF -DCMAKE_BUILD_TYPE=Debug
$ ninja
and got the error
portBLAS/samples/../src/operations/blas3/gemm_local_joint_matrix.hpp:562:13: error: use of undeclared identifier 'get_wi_data'
562 | get_wi_data(sg, float_out)[i] = alpha_ * data_left;
| ^
portBLAS/samples/../src/operations/blas3/gemm_local_joint_matrix.hpp:607:9: error: use of undeclared identifier 'get_wi_data'
607 | get_wi_data(sg, float_out)[i] =
| ^
portBLAS/samples/../src/operations/blas3/gemm_local_joint_matrix.hpp:576:40: error: use of undeclared identifier 'get_wi_data'
576 | static_cast<element_t>(get_wi_data(sg, reg_res[frag])[i]);
| ^
In DPC++, if I add the definition
#define get_wi_data(sg, jm) jm.matrix_impl.wi_marray
and compile, I get another error:
portBLAS/samples/../src/operations/blas3/gemm_local_joint_matrix.hpp:562:13: error: no member named 'matrix_impl' in 'sycl::ext::oneapi::experimental::matrix::joint_matrix<sycl::sub_group, float, sycl::ext::oneapi::experimental::matrix::use::accumulator, 16, 16>'
562 | get_wi_data(sg, float_out)[i] = alpha_ * data_left;
| ^ ~~~~~~~~~
portBLAS/samples/../src/operations/blas3/gemm_local_joint_matrix.hpp:33:32: note: expanded from macro 'get_wi_data'
33 | #define get_wi_data(sg, jm) jm.matrix_impl.wi_marray
| ~~ ^
It seems like that since the code in
I want to learn how to fix it and test the performance of gemm on tensor cores.
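The errors above come from code that indexes a fragment's per-work-item storage directly (`get_wi_data(sg, jm)[i] = ...`), which depends on the internals of `joint_matrix`. One way around that dependency is to pass an element-wise callable instead of indexing. The sketch below shows the pattern in plain C++ only. `MockFragment`, `apply_elementwise` and `scale_fragment` are hypothetical stand-ins, not DPC++ or portBLAS names. The DPC++ matrix extension documents a `joint_matrix_apply` function with a similar shape, but whether it fits this portBLAS code path is an assumption to check against the installed compiler.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the slice of a joint_matrix owned by one work-item.
// This is NOT the DPC++ type; it only models "some elements I may update".
template <typename T, std::size_t N>
struct MockFragment {
  std::array<T, N> elems{};
};

// Apply a callable to every element without exposing the storage layout.
// Callers no longer depend on members such as matrix_impl or wi_marray.
template <typename T, std::size_t N, typename F>
void apply_elementwise(MockFragment<T, N>& frag, F&& f) {
  for (T& x : frag.elems) {
    f(x);
  }
}

// Scale every element by alpha, mirroring the shape of
// `get_wi_data(sg, float_out)[i] = alpha_ * data_left;` from the error log.
template <typename T, std::size_t N>
MockFragment<T, N> scale_fragment(MockFragment<T, N> frag, T alpha) {
  apply_elementwise(frag, [alpha](T& x) { x = alpha * x; });
  return frag;
}
```

With this shape, the update logic lives in the lambda, so a change in how the fragment stores its data only touches `apply_elementwise`.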
Is anyone working on this issue?🤔
Hi @horrorChen,
Thanks for your reply @muhammad-tanvir-1211. Actually, I find that the call of
I look forward to your update on this work.
Hi @horrorChen
Hi @muhammad-tanvir-1211, I used the commands below.
$ CC=clang CXX=clang++ cmake -GNinja ../ -DSYCL_COMPILER=dpcpp -DDPCPP_SYCL_TARGET="nvptx64-nvidia-cuda" -DDPCPP_SYCL_ARCH="sm_80" -DTUNING_TARGET=NVIDIA_GPU -DCMAKE_PREFIX_PATH=/opt/OpenBLAS -DCMAKE_THREAD_LIBS_INIT=-lpthread -DBLAS_ENABLE_TESTING=OFF -DBLAS_ENABLE_BENCHMARK=OFF
$ ninja
The error info is like
Did you ever encounter this problem?
Hi @horrorChen
Hi @muhammad-tanvir-1211
intel/llvm#12218 has just been merged.
In file portBLAS/include/blas_meta.h, the header file no longer exists in DPC++; instead you can use the one located in llvm-dpcpp/build/include/sycl/ext/oneapi/experimental/complex/complex.hpp.
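Since the header location differs between DPC++ versions, a build can probe for it instead of hard-coding one path. The sketch below uses the standard C++17 `__has_include` check. The include path is derived from the install location quoted above. The macro `HAS_DPCPP_COMPLEX_HPP`, the function `complex_header_choice` and the fallback to the standard `<complex>` header are hypothetical and for illustration only. The sketch only reports which header would be chosen and does not include the SYCL header itself.

```cpp
#include <cassert>
#include <cstring>

// Detect the relocated header at preprocessing time (C++17 __has_include).
#if defined(__has_include)
#  if __has_include(<sycl/ext/oneapi/experimental/complex/complex.hpp>)
#    define HAS_DPCPP_COMPLEX_HPP 1  // hypothetical macro name
#  endif
#endif
#ifndef HAS_DPCPP_COMPLEX_HPP
#  define HAS_DPCPP_COMPLEX_HPP 0
#endif

// Report which header a guarded include would pick on this toolchain.
// The "complex" fallback is an assumption, not what portBLAS does.
inline const char* complex_header_choice() {
  return HAS_DPCPP_COMPLEX_HPP
             ? "sycl/ext/oneapi/experimental/complex/complex.hpp"
             : "complex";
}
```

In a real header, the same `#if` would guard the `#include` lines directly, so the file compiles against both older and newer DPC++ installations.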