Hello, I've recently noticed that GPUArrays doesn't define a Base.hash implementation and falls back to the default one. This requires `@allowscalar`, which is slow, and it also means one has to differentiate between CPU and GPU arrays when calling `hash`.
MWE:
```julia
using GPUArrays, CUDA

A = cu(rand(1024, 1024))
hash(A)                    # errors (scalar indexing is disallowed)
@allowscalar Base.hash(A)  # works, but slow
```
I'm not sure what a good implementation for GPU arrays would be. An inefficient GPU default could be:
```julia
Base.hash(arr::T, h::UInt) where {T<:AbstractGPUArray} =
    mapreduce(hash, hash, arr; init=hash(T, h))
```
That of course touches every element and works with UInt64 values, but it would still be faster than the scalar-indexing default.

From what I can tell, the default Base.hash for arrays only accesses O(log n) elements. I'm not sure how to neatly map such a pattern onto GPUs; if someone has any pointers, I'd be happy to implement it.
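One possible direction (purely a sketch; `sampled_hash` and its sampling scheme are my own assumptions, not anything GPUArrays provides): gather a logarithmic number of elements with a single bulk indexing operation, move them to the host in one copy, and fold `Base.hash` over them serially there.

```julia
# Sketch only: sample ~log2(n) elements at power-of-two positions with a
# single gather, copy them to the host once, and fold Base.hash over them.
# This avoids per-element scalar reads; the sampling pattern is arbitrary.
function sampled_hash(arr::AbstractArray, h::UInt=zero(UInt))
    n = length(arr)
    n == 0 && return hash(n, h)
    positions = unique(min.(2 .^ (0:floor(Int, log2(n))), n))
    sampled = Array(vec(arr)[positions])  # one bulk transfer for GPU arrays
    return foldl((acc, x) -> hash(x, acc), sampled; init=hash(n, h))
end
```

For a CuArray the indexed copy `vec(arr)[positions]` should run as one kernel, but whether this pattern is actually competitive (and which elements to sample) is exactly the open question.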
I don't think that fallback would work; CUDA.jl's mapreduce executes in a nondeterministic order, requiring an associative and commutative operator, which Base.hash isn't.
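To illustrate the constraint: a combine that is associative and commutative would be safe under a nondeterministic reduction order, but has different semantics. A minimal sketch (`unordered_hash` is a hypothetical name; this deliberately does not match `Base.hash`, so equal CPU and GPU arrays would hash differently):

```julia
# Hypothetical sketch: an order-insensitive hash that a nondeterministic
# mapreduce could legally compute, because xor is associative and commutative.
# Caveats: permutations of the same elements collide, and pairs of identical
# elements cancel each other out under xor.
function unordered_hash(arr::AbstractArray, h::UInt=zero(UInt))
    # hash each element independently, then combine with xor;
    # the combination order no longer matters
    combined = mapreduce(hash, xor, arr; init=zero(UInt))
    return hash(combined, hash(typeof(arr), h))
end
```

The price is a weaker hash (any permutation of the same elements gets the same value), which may or may not be acceptable depending on how `hash`/`isequal` consistency is expected to hold across device and host arrays.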