-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FlatterScatter #271
FlatterScatter #271
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
offline discussed catch2 should be fetched on the need to not increase the repository size to much.
The rest looks good! ONly one small question for the heap init part is open.
mallocMC::OOMPolicies::ReturnNull, | ||
mallocMC::ReservePoolPolicies::AlpakaBuf<Acc>, | ||
mallocMC::AlignmentPolicies::Shrink<ShrinkConfig>>; | ||
|
||
ALPAKA_STATIC_ACC_MEM_GLOBAL int** arA; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this must be equal to https://github.com/alpaka-group/alpaka/blob/7e1b6ce24c5fadc9c58dd0d1dc59c022e96a087b/test/unit/mem/view/src/ViewStaticAccMem.cpp#L19
alpaka::DevGlobal
is missing and the usage requires a .get()
but it looks it runs and the DevGlobal
is only required for SYCL. SO I am fine to keep it as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened: #272 to track this.
using VecType = alpaka::Vec<Dim, Idx>; | ||
|
||
auto poolView = alpaka::createView(dev, reinterpret_cast<char*>(pool), alpaka::Vec<Dim, Idx>(memsize)); | ||
alpaka::memset(queue, poolView, 0U); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: You mentioned last time offline that we use a kernel to initialize the heap but here the memset is still present. It is fine for this PR I would like to avoid that we use memset and a on device initialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good catch! There is an init
method for the heap but apparently it is unused. It's on my TODO list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #273 to track this.
|
||
auto workDivSingleThread | ||
= alpaka::WorkDivMembers<Dim, Idx>{VecType::ones(), VecType::ones(), VecType::ones()}; | ||
alpaka::exec<TAcc>(queue, workDivSingleThread, FlatterScatterAlloc::InitKernel{}, heap, pool, memsize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like the kernel which is performing the on device init equal to the memset above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, damn! It's much worse! The above memset is completely redundant. -.-
ALPAKA_FN_INLINE ALPAKA_FN_ACC auto getAvailableMultiPages(auto const& /*acc*/, uint32_t const chunkSize) const | ||
-> uint32_t | ||
{ | ||
// TODO(lenz): This is not thread-safe! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should not forget this todo and handle it in a future PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of a minor issue: The informational methods getAvailableSlots
and its helpers provide information that cannot possibly be used in a thread-safe manner. So, calling them while other threads are write-accessing the same allocator is not a reasonable thing to do anyways. Mostly left it in for informational purposes but maybe I should drop the TODO(lenz)
part. It is documented in the Usage.md
document by the way.
TAcc const& acc, | ||
uint32_t const index, | ||
uint32_t const numBytes, | ||
uint32_t& chunkSizeCache) -> bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only a note: this should be refactored in the future and return the value together with the current return value as tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #274 to track this.
1bbb6fe
to
42a96e9
Compare
Okay. I've surgically removed the Catch2 commit and replaced it with a git submodule. I've removed the redundant |
This PR adds amongst a number of modernisations the new FlatterScatter creation policy.