Skip to content

Commit

Permalink
Apply suggestions from code review
Browse files Browse the repository at this point in the history
Co-authored-by: HJA Bird <[email protected]>
Co-authored-by: pgorlani <[email protected]>
  • Loading branch information
3 people authored Mar 26, 2024
1 parent f89fd4b commit 782f413
Show file tree
Hide file tree
Showing 2 changed files with 15 additions and 22 deletions.
14 changes: 7 additions & 7 deletions src/interface/blas1/backend/amd_gpu.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -35,13 +35,13 @@ typename sb_handle_t::event_t _asum(
sb_handle_t& sb_handle, index_t _N, container_0_t _vx, increment_t _incx,
container_1_t _rs, const typename sb_handle_t::event_t& _dependencies) {
/**
* This compile time check is absolutely necessary for AMD gpu.
* AMD atomic operations required a specific combination of hardware that we
*cannot check neither enforce to users. Since reduction operators kernel
*implementation useses atomic operation without that particular combination
*the operator may fail silently. This check enforce a different atomic
*address space causing a big performance degradation, but making the kernel
*behaves correctly also with managed memory (aka malloc_shared allocation).
* This compile time check is absolutely necessary for AMD GPUs.
* AMD's atomic operations require a specific combination of hardware that
* cannot be checked nor enforced. Since the reduction operator kernel
* implementation uses atomic operations, without that particular hardware combination
* the reduction may silently fail. This check enforces a different atomic
* address space causing a big performance degradation, but also making the kernel
* behave correctly with managed memory (aka malloc_shared allocation).
**/
#ifdef SB_ENABLE_USM
bool usm_managed_mem{false};
Expand Down
23 changes: 8 additions & 15 deletions src/operations/blas1/WGAtomicReduction.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -126,21 +126,14 @@ PORTBLAS_INLINE
cl::sycl::plus<value_t>());
}
if (ndItem.get_local_id()[0] == 0) {
if constexpr (!usmManagedMem) {
auto atomic_res =
cl::sycl::atomic_ref<value_t, cl::sycl::memory_order::relaxed,
cl::sycl::memory_scope::device,
cl::sycl::access::address_space::global_space>(
lhs_.get_data()[0]);
atomic_res += val;
} else {
auto atomic_res =
cl::sycl::atomic_ref<value_t, cl::sycl::memory_order::relaxed,
cl::sycl::memory_scope::device,
cl::sycl::access::address_space::generic_space>(
lhs_.get_data()[0]);
atomic_res += val;
}
constexpr cl::sycl::access::address_space addr_sp =
usmManagedMem ? cl::sycl::access::address_space::generic_space
: cl::sycl::access::address_space::global_space;
auto atomic_res =
cl::sycl::atomic_ref<value_t, cl::sycl::memory_order::relaxed,
cl::sycl::memory_scope::device, addr_sp>(
lhs_.get_data()[0]);
atomic_res += val;
}

return {};
Expand Down

0 comments on commit 782f413

Please sign in to comment.