Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread ID Cleanup, main branch (2025.01.09.) #810

Merged

Conversation

krasznaa
Copy link
Member

@krasznaa krasznaa commented Jan 9, 2025

Following #808, I thought it would be time to clean up the rest of the device functions as well with how they receive thread identifiers.

  • Made all functions that need a "global index" use traccc::device::global_index_t;
    • This allowed removing a number of static_cast-s from these functions;
  • All functions that receive a thread_id and/or barrier object, now do so using constant references;
    • All such types provide only const functions. So I didn't see much reason for using non-const references here. 🤔

After this, I went and harmonized the CUDA, SYCL and ALPAKA codes a little as well.

  • Moved all thread_id types to be private headers in their libraries;
  • Added automatic, compile-time checks that they would all fulfill the traccc::device::concepts::thread_id1 concept;
  • Introduced traccc::cuda::details::global_index1() and traccc::sycl::details::global_index(...) as helper functions for generating traccc::device::global_index_t values;

While doing all of this, I tried to fix up the includes in all the touched files a bit. Since many of them were doing very questionable things. (Including way more files than necessary, hiding missing includes in some of the common device headers.)

I'm pretty happy with these updates myself, but am interested in your opinions.

@krasznaa krasznaa added cleanup Makes the code all clean and tidy cuda Changes related to CUDA sycl Changes related to SYCL alpaka Changes related to Alpaka labels Jan 9, 2025
Made all of them into "private headers", and added automated tests that they
would fulfill the appropriate concept.
While also cleaning up the includes of the files a bit.
@krasznaa krasznaa force-pushed the ThreadIdCleanup-main-20250108 branch from ced855e to 74badb9 Compare January 9, 2025 07:14
Copy link
Member

@stephenswat stephenswat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally dislike this idea that because vecmem only supports 32-bit integers we should be downcasting all out accesses in a 64-bit memory space to 32-bit at the generation site of those indices, rather than generating them at the appropriate size and only downcasting them necessary.

device/cuda/src/sanity/contiguous_on.cuh Outdated Show resolved Hide resolved
device/cuda/src/utils/global_index.hpp Show resolved Hide resolved
/// Function creating a global index in a 1D CUDA kernel
__device__ inline device::global_index_t global_index1() {

return blockIdx.x * blockDim.x + threadIdx.x;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I don't think we need this; the thread identifier classes already serve this purpose. 😕

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it. But note that global_index_t is sort of "its own thing" in the code, it's not directly tied to the thread_id classes. Those are still allowed to return any integer if they really need to. (That's another story by itself.)

This just seemed like a nicely readable way of expressing what we want. 🤔 In both the CUDA and SYCL code.

device/cuda/src/utils/thread_id.hpp Show resolved Hide resolved
device/sycl/src/utils/global_index.hpp Show resolved Hide resolved
device/sycl/src/utils/thread_id.hpp Show resolved Hide resolved
Copy link

sonarqubecloud bot commented Jan 9, 2025

@stephenswat stephenswat enabled auto-merge (squash) January 9, 2025 10:11
@stephenswat stephenswat merged commit bdfa3f5 into acts-project:main Jan 9, 2025
27 of 29 checks passed
@krasznaa krasznaa deleted the ThreadIdCleanup-main-20250108 branch January 9, 2025 10:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
alpaka Changes related to Alpaka cleanup Makes the code all clean and tidy cuda Changes related to CUDA sycl Changes related to SYCL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants