SYCL2020 Updates, main branch (2024.11.19.) #136

krasznaa · 2024-11-19T10:32:02Z

Connected to #128, these updates are needed to use oneAPI 2025.0.0 with this project, in the hope that the latest version of oneAPI works a little better. 🤔 (They also let us give our code to Intel developers, so that they can see the issues that we see with the latest compilers.)

It will clash with #128, but we can just push that one in first, and then I'll resolve the one conflict I anticipate.

krasznaa · 2024-11-19T11:31:24Z

oneAPI 2025.0.0 FTW!

[bash][Celeborn]:build-sycl > export SYCLFLAGS="-fsycl-targets=spir64,spir64_x86_64"
[bash][Celeborn]:build-sycl > cmake --preset sycl ../algebra-plugins/
Preset CMake variables:

  ALGEBRA_PLUGINS_BUILD_BENCHMARKS="TRUE"
  ALGEBRA_PLUGINS_BUILD_TESTING="TRUE"
  ALGEBRA_PLUGINS_FAIL_ON_WARNINGS="TRUE"
  ALGEBRA_PLUGINS_INCLUDE_EIGEN="TRUE"
  ALGEBRA_PLUGINS_INCLUDE_VECMEM="TRUE"
  ALGEBRA_PLUGINS_SETUP_EIGEN3="TRUE"
  ALGEBRA_PLUGINS_SETUP_VECMEM="TRUE"
  ALGEBRA_PLUGINS_TEST_SYCL="TRUE"
  ALGEBRA_PLUGINS_USE_SYSTEM_LIBS="FALSE"
  CMAKE_BUILD_TYPE="RelWithDebInfo"
  VECMEM_BUILD_SYCL_LIBRARY="TRUE"

-- The CXX compiler identification is IntelLLVM 2025.0.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/krasznaa/software/intel/oneapi-2025.0.0/compiler/2025.0/bin/compiler/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Building VecMem as part of the Algebra Plugins project
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - NOTFOUND
-- Looking for a HIP compiler
-- Looking for a HIP compiler - NOTFOUND
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
...
-- Configuring done (104.3s)
-- Generating done (0.7s)
-- Build files have been written to: /home/krasznaa/ATLAS/projects/algebra/build-sycl
[bash][Celeborn]:build-sycl > make -j8
[  0%] Building CXX object common/CMakeFiles/test_algebra_qualifiers_hpp.dir/CMakeFiles/test_algebra_qualifiers_hpp.cpp.o
[  0%] Building CXX object _deps/vecmem-build/core/CMakeFiles/vecmem_core.dir/src/memory/allocator.cpp.o
[  0%] Building CXX object math/cmath/CMakeFiles/test_algebra_math_cmath_hpp.dir/CMakeFiles/test_algebra_math_cmath_hpp.cpp.o
...
[ 90%] Built target vecmem_sycl
[ 90%] Building SYCL object tests/accelerator/sycl/CMakeFiles/algebra_test_vecmem_sycl.dir/vecmem_cmath.sycl.o
[ 90%] Building SYCL object tests/accelerator/sycl/CMakeFiles/algebra_test_eigen_sycl.dir/eigen_eigen.sycl.o
[ 90%] Building SYCL object tests/accelerator/sycl/CMakeFiles/algebra_test_array_sycl.dir/array_cmath.sycl.o
[ 90%] Building SYCL object tests/accelerator/sycl/CMakeFiles/algebra_test_eigen_sycl.dir/eigen_cmath.sycl.o
[100%] Linking CXX executable ../bin/algebra_test_eigen
[100%] Built target algebra_test_eigen
[100%] Linking SYCL executable ../../../bin/algebra_test_vecmem_sycl
[100%] Linking SYCL executable ../../../bin/algebra_test_array_sycl
[100%] Built target algebra_test_vecmem_sycl
[100%] Built target algebra_test_array_sycl
[100%] Linking SYCL executable ../../../bin/algebra_test_eigen_sycl
[100%] Built target algebra_test_eigen_sycl
[bash][Celeborn]:build-sycl >

Unfortunately in WSL the tests produce some numerical issues, but they do run!

[bash][Celeborn]:build-sycl > ./bin/algebra_test_array_sycl
...
[ RUN      ] algebra_plugins/test_sycl_basics/sycl_array_cmath<double>.transform3
[       OK ] algebra_plugins/test_sycl_basics/sycl_array_cmath<double>.transform3 (4 ms)
[----------] 5 tests from algebra_plugins/test_sycl_basics/sycl_array_cmath<double> (72 ms total)

[----------] Global test environment tear-down
[==========] 10 tests from 2 test suites ran. (3353 ms total)
[  PASSED  ] 9 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] algebra_plugins/test_sycl_basics/sycl_array_cmath<float>.vector_3d_ops, where TypeParam = test_types<float,std::array<float,2ul>,std::array<float,3ul>,std::array<float,2ul>,std::array<float,3ul>,algebra::cmath::transform3<algebra::cmath::matrix::actor<unsigned long,algebra::matrix::array_type,algebra::matrix::matrix_type,float,algebra::cmath::matrix::determinant::actor<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::matrix::determinant::partial_pivot_lud<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>>,algebra::cmath::matrix::determinant::hard_coded<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>,2ul,4ul> >,algebra::cmath::matrix::inverse::actor<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::matrix::inverse::partial_pivot_lud<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>>,algebra::cmath::matrix::inverse::hard_coded<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>,2ul,4ul> >,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>,algebra::cmath::block_getter<unsigned long,algebra::matrix::array_type,float> > >,unsigned long,algebra::array::matrix_type,algebra::cmath::matrix::actor<unsigned long,algebra::matrix::array_type,algebra::matrix::matrix_type,float,algebra::cmath::matrix::determinant::actor<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::matrix::determinant::partial_pivot_lud<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>>,algebra::cmath::matrix::determinant::hard_coded<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>,2ul,4ul> >,algebra::cmath::matrix::inverse::actor<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::matrix::inverse::partial_pivot_lud<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>>,algebra::cmath::matrix::inverse::hard_coded<unsigned long,algebra::matrix::matrix_type,float,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>,2ul,4ul> >,algebra::cmath::element_getter<unsigned long,algebra::matrix::array_type,float>,algebra::cmath::block_getter<unsigned long,algebra::matrix::array_type,float> > >

 1 FAILED TEST
[bash][Celeborn]:build-sycl >

So there is hope yet. 😉

krasznaa · 2024-11-19T11:32:38Z

Apart from the MSVC issue, I also want to improve the CI tests a little in this PR. 🤔 So let's make it into a draft for the moment.

Using <sycl/sycl.hpp> as the main SYCL include, and picking up all types from the ::sycl namespace.
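For readers unfamiliar with the convention, here is a minimal sketch of what SYCL 2020 style code looks like. It is not code from this PR: `scale_by_two` and `EXAMPLE_HAS_SYCL` are made-up names for illustration, and the `__has_include` guard is only an assumption added so that the snippet still compiles (falling back to a plain host loop) with a compiler that has no SYCL headers.

```cpp
// SYCL 2020 style: a single umbrella header, with all types taken from ::sycl.
// (SYCL 1.2.1 era code typically used <CL/sycl.hpp> and the cl::sycl namespace.)
#include <cstddef>
#include <vector>

#if __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp>
#define EXAMPLE_HAS_SYCL 1
#else
#define EXAMPLE_HAS_SYCL 0
#endif

// Hypothetical helper (not part of algebra-plugins): doubles every element.
inline void scale_by_two(std::vector<float>& data) {
    if (data.empty()) {
        return;  // nothing to do, and avoids creating a zero-sized buffer
    }
#if EXAMPLE_HAS_SYCL
    ::sycl::queue queue;
    {
        ::sycl::buffer<float, 1> buffer(data.data(),
                                        ::sycl::range<1>(data.size()));
        queue.submit([&](::sycl::handler& h) {
            ::sycl::accessor acc(buffer, h, ::sycl::read_write);
            h.parallel_for(::sycl::range<1>(data.size()),
                           [=](::sycl::id<1> i) { acc[i] *= 2.0f; });
        });
    }  // the buffer's destructor waits for the kernel and writes back to data
#else
    // Host fallback, used only when no SYCL headers are available.
    for (float& x : data) {
        x *= 2.0f;
    }
#endif
}
```

Calling `scale_by_two` on a vector holding `{1.0f, 2.5f}` should leave it holding `{2.0f, 5.0f}` on either code path.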

I didn't add a test with an NVIDIA backend, as the oneAPI + CUDA combination used by the CI crashes on the project's current code. (The latest versions of both do succeed, however, so there's no point in debugging this further.) Updated (almost) all CI tests to use v67 of the Docker images.

sonarqubecloud · 2024-11-22T14:27:19Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

krasznaa · 2024-11-22T14:54:47Z

The unit tests are interesting... 🤔 With oneAPI 2025.0.0, when building just for an x86 backend, I get this from the tests:

[bash][Legolas]:algebra-plugins > ./build-intel/bin/algebra_test_array_sycl 
Running main() from /data/ssd-1tb/projects/algebra-plugins/build-intel/_deps/googletest-src/googletest/src/gtest_main.cc
[==========] Running 10 tests from 2 test suites.
...
[----------] Global test environment tear-down
[==========] 10 tests from 2 test suites ran. (383 ms total)
[  PASSED  ] 10 tests.
[bash][Legolas]:algebra-plugins > ./build-intel/bin/algebra_test_eigen_sycl
...
/data/ssd-1tb/projects/algebra-plugins/algebra-plugins/tests/accelerator/common/test_basics_base.hpp:131: Failure
The difference between m_output_host->at(i) and m_output_device->at(i) is nan, which exceeds std::abs(0.001 * m_output_host->at(i)), where
m_output_host->at(i) evaluates to 45.70000000000001,
m_output_device->at(i) evaluates to -nan, and
std::abs(0.001 * m_output_host->at(i)) evaluates to 0.045700000000000011.

/data/ssd-1tb/projects/algebra-plugins/algebra-plugins/tests/accelerator/common/test_basics_base.hpp:131: Failure
The difference between m_output_host->at(i) and m_output_device->at(i) is nan, which exceeds std::abs(0.001 * m_output_host->at(i)), where
m_output_host->at(i) evaluates to 45.70000000000001,
m_output_device->at(i) evaluates to -nan, and
std::abs(0.001 * m_output_host->at(i)) evaluates to 0.045700000000000011.

[  FAILED  ] algebra_plugins/test_sycl_basics/sycl_eigen_eigen<double>.matrix22_ops, where TypeParam = test_types<double,algebra::eigen::array<double,2>,algebra::eigen::array<double,3>,algebra::eigen::array<double,2>,algebra::eigen::array<double,3>,algebra::eigen::math::transform3<double>,int,algebra::eigen::matrix_type> (105 ms)
[ RUN      ] algebra_plugins/test_sycl_basics/sycl_eigen_eigen<double>.transform3
[       OK ] algebra_plugins/test_sycl_basics/sycl_eigen_eigen<double>.transform3 (3 ms)
[----------] 5 tests from algebra_plugins/test_sycl_basics/sycl_eigen_eigen<double> (146 ms total)

[----------] Global test environment tear-down
[==========] 20 tests from 4 test suites ran. (735 ms total)
[  PASSED  ] 17 tests.
[  FAILED  ] 3 tests, listed below:
[  FAILED  ] algebra_plugins/test_sycl_basics/sycl_eigen_eigen<float>.vector_3d_ops, where TypeParam = test_types<float,algebra::eigen::array<float,2>,algebra::eigen::array<float,3>,algebra::eigen::array<float,2>,algebra::eigen::array<float,3>,algebra::eigen::math::transform3<float>,int,algebra::eigen::matrix_type>
[  FAILED  ] algebra_plugins/test_sycl_basics/sycl_eigen_eigen<float>.matrix22_ops, where TypeParam = test_types<float,algebra::eigen::array<float,2>,algebra::eigen::array<float,3>,algebra::eigen::array<float,2>,algebra::eigen::array<float,3>,algebra::eigen::math::transform3<float>,int,algebra::eigen::matrix_type>
[  FAILED  ] algebra_plugins/test_sycl_basics/sycl_eigen_eigen<double>.matrix22_ops, where TypeParam = test_types<double,algebra::eigen::array<double,2>,algebra::eigen::array<double,3>,algebra::eigen::array<double,2>,algebra::eigen::array<double,3>,algebra::eigen::math::transform3<double>,int,algebra::eigen::matrix_type>

 3 FAILED TESTS
[bash][Legolas]:algebra-plugins >

I.e. the Eigen tests really don't like something, and come up with a lot of NaNs. 😕
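As a side note on why a NaN can never pass this kind of check: the failure messages above look like a relative-tolerance comparison with a factor of 0.001, and any ordered comparison involving NaN evaluates to false. A minimal sketch, assuming that shape of comparison (`is_close` and `is_bad_device_value` are hypothetical names, not functions from the test suite):

```cpp
#include <cmath>

// Hypothetical stand-in for the host/device comparison seen in the log:
// passes when |host - device| <= |rel_tol * host|.
inline bool is_close(double host, double device, double rel_tol = 0.001) {
    // If device is NaN, the difference is NaN and "<=" is false.
    return std::abs(host - device) <= std::abs(rel_tol * host);
}

// Flags non-finite device results (NaN or +/-inf) so that they can be
// told apart from ordinary precision differences.
inline bool is_bad_device_value(double device) {
    return !std::isfinite(device);
}
```

With the host value 45.7 from the log, `is_close(45.7, 45.74)` holds, while a NaN device value fails the comparison regardless of the tolerance chosen.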

Then, if I build with oneAPI 2025.0.0 + CUDA 12.6.2 for the NVIDIA backend, I get:

[bash][Legolas]:algebra-plugins > ./build-nvidia/bin/algebra_test_array_sycl 
Running main() from /data/ssd-1tb/projects/algebra-plugins/build-nvidia/_deps/googletest-src/googletest/src/gtest_main.cc
[==========] Running 10 tests from 2 test suites.
...
[----------] Global test environment tear-down
[==========] 10 tests from 2 test suites ran. (413 ms total)
[  PASSED  ] 10 tests.
[bash][Legolas]:algebra-plugins > ./build-nvidia/bin/algebra_test_eigen_sycl
...
/data/ssd-1tb/projects/algebra-plugins/algebra-plugins/tests/accelerator/common/test_basics_base.hpp:131: Failure
The difference between m_output_host->at(i) and m_output_device->at(i) is 32768, which exceeds std::abs(0.001 * m_output_host->at(i)), where
m_output_host->at(i) evaluates to 130276.765625,
m_output_device->at(i) evaluates to 97508.765625, and
std::abs(0.001 * m_output_host->at(i)) evaluates to 130.276765625.

/data/ssd-1tb/projects/algebra-plugins/algebra-plugins/tests/accelerator/common/test_basics_base.hpp:131: Failure
The difference between m_output_host->at(i) and m_output_device->at(i) is 8192, which exceeds std::abs(0.001 * m_output_host->at(i)), where
m_output_host->at(i) evaluates to 113913.921875,
m_output_device->at(i) evaluates to 105721.921875, and
std::abs(0.001 * m_output_host->at(i)) evaluates to 113.913921875.

/data/ssd-1tb/projects/algebra-plugins/algebra-plugins/tests/accelerator/common/test_basics_base.hpp:131: Failure
The difference between m_output_host->at(i) and m_output_device->at(i) is 8192.015625, which exceeds std::abs(0.001 * m_output_host->at(i)), where
m_output_host->at(i) evaluates to 122127.078125,
m_output_device->at(i) evaluates to 113935.0625, and
std::abs(0.001 * m_output_host->at(i)) evaluates to 122.127078125.

[  FAILED  ] algebra_plugins/test_sycl_basics/sycl_eigen_eigen<float>.vector_3d_ops, where TypeParam = test_types<float,algebra::eigen::array<float,2>,algebra::eigen::array<float,3>,algebra::eigen::array<float,2>,algebra::eigen::array<float,3>,algebra::eigen::math::transform3<float>,int,algebra::eigen::matrix_type> (91 ms)
...
[----------] Global test environment tear-down
[==========] 20 tests from 4 test suites ran. (407 ms total)
[  PASSED  ] 19 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] algebra_plugins/test_sycl_basics/sycl_eigen_eigen<float>.vector_3d_ops, where TypeParam = test_types<float,algebra::eigen::array<float,2>,algebra::eigen::array<float,3>,algebra::eigen::array<float,2>,algebra::eigen::array<float,3>,algebra::eigen::math::transform3<float>,int,algebra::eigen::matrix_type>

 1 FAILED TEST
[bash][Legolas]:algebra-plugins >

Just one of the Eigen tests fails in this case, with a bunch of 2^X-sized differences between the host and GPU calculations. Almost as if random bits got flipped in the resulting floats. 😕
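One way to check the bit-level picture is to compare the IEEE-754 representations of the host and device values directly. The following is only a sketch with made-up helper names (`float_bits`, `differing_bits`), assuming 32-bit IEEE-754 floats:

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reinterpret a float's bit pattern as a 32-bit unsigned integer.
inline std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects 32-bit floats");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Number of bit positions in which the two representations differ.
inline std::size_t differing_bits(float host, float device) {
    return std::bitset<32>(float_bits(host) ^ float_bits(device)).count();
}
```

Applied to the three value pairs printed in the log above, this gives one differing mantissa bit for each of the first two pairs (set on the host, cleared on the device) and three differing bits for the third pair.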

The CUDA tests on the other hand all succeed.

[bash][Legolas]:algebra-plugins > ctest --test-dir build-cuda/
Internal ctest changing into directory: /data/ssd-1tb/projects/algebra-plugins/build-cuda
Test project /data/ssd-1tb/projects/algebra-plugins/build-cuda
    Start 1: algebra_test_array
1/6 Test #1: algebra_test_array ...............   Passed    0.00 sec
    Start 2: algebra_test_eigen
2/6 Test #2: algebra_test_eigen ...............   Passed    0.00 sec
    Start 3: algebra_test_vecmem
3/6 Test #3: algebra_test_vecmem ..............   Passed    0.00 sec
    Start 4: algebra_test_array_cuda
4/6 Test #4: algebra_test_array_cuda ..........   Passed    0.56 sec
    Start 5: algebra_test_eigen_cuda
5/6 Test #5: algebra_test_eigen_cuda ..........   Passed    0.54 sec
    Start 6: algebra_test_vecmem_cuda
6/6 Test #6: algebra_test_vecmem_cuda .........   Passed    0.41 sec

100% tests passed, 0 tests failed out of 6

Total Test time (real) =   1.54 sec
[bash][Legolas]:algebra-plugins >

So SYCL is not in much love with Eigen. (Or vice versa...)

* Made the code SYCL2020 compatible. Using <sycl/sycl.hpp> as the main SYCL include, and picking up all types from the ::sycl namespace.
* Update to vecmem 1.13.0.

krasznaa requested a review from niermann999 November 19, 2024 10:32

krasznaa force-pushed the SYCLUpdates-main-20241119 branch from 8281362 to 1a599e5 Compare November 19, 2024 11:22

niermann999 approved these changes Nov 19, 2024

krasznaa marked this pull request as draft November 19, 2024 11:31

This was referenced Nov 20, 2024

How to limit number of threads per group in algorithms? (2024.11.18.) uxlfoundation/oneDPL#1936

Open

MSVC C++20 Fix, main branch (2024.11.21.) acts-project/vecmem#305

Merged

krasznaa force-pushed the SYCLUpdates-main-20241119 branch from 1a599e5 to 61f036a Compare November 21, 2024 14:38

krasznaa added 3 commits November 22, 2024 14:34

Update to vecmem 1.12.0.

283e145

Made the code SYCL2020 compatible.

bd92987

Using <sycl/sycl.hpp> as the main SYCL include, and picking up all types from the ::sycl namespace.

Update to vecmem 1.13.0.

c21bb72

krasznaa force-pushed the SYCLUpdates-main-20241119 branch from b20d8a8 to b6edf0a Compare November 22, 2024 14:15

krasznaa force-pushed the SYCLUpdates-main-20241119 branch from b6edf0a to ba63e70 Compare November 22, 2024 14:26

krasznaa marked this pull request as ready for review November 22, 2024 14:43

niermann999 approved these changes Nov 25, 2024

krasznaa merged commit 254ea58 into acts-project:main Nov 25, 2024
27 checks passed

krasznaa deleted the SYCLUpdates-main-20241119 branch November 25, 2024 12:55

krasznaa mentioned this pull request Nov 28, 2024

CI Updates, main branch (2024.11.28.) #139

Merged
