//
// This example provides a stdexec (senders/receivers) implementation for Cholesky decomposition code.
#include <algorithm>
#include <exec/any_sender_of.hpp>
Would recommend cleaning up any unused headers like any_sender_of.hpp
sum_vec[piece] = std::transform_reduce(
    std::execution::par,
    counting_iterator(start),
This is not correct since counting_iterator(start) and counting_iterator(start + N) are two separate objects and may not be iterable.
This is valid since nvhpc/22.9+, as per https://forums.developer.nvidia.com/t/internal-compiler-error-bad-sptr-in-var-refsym/253631. The error originated from cudart/11.7 -> cudatoolkit/11.7 in our default PM environment.
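For reference, a minimal sketch of why two separately constructed counting iterators can still delimit a range: if equality compares only the stored counter, then counting_iterator(start) reaches counting_iterator(start + N) after N increments. The counting_iterator type and the piece_sum_of_squares helper below are hypothetical stand-ins, not the types used in this PR, and the sketch calls the sequential overload of std::transform_reduce so that it builds without a parallel backend.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical minimal counting iterator: it stores an index and yields it on
// dereference. Two instances compare equal when their stored indices match,
// so separately constructed objects can mark the begin and end of a range.
struct counting_iterator {
    using iterator_category = std::input_iterator_tag;
    using value_type        = std::size_t;
    using difference_type   = std::ptrdiff_t;
    using pointer           = const std::size_t*;
    using reference         = std::size_t;

    std::size_t value = 0;

    explicit counting_iterator(std::size_t v) : value(v) {}

    reference operator*() const { return value; }
    counting_iterator& operator++() { ++value; return *this; }
    counting_iterator operator++(int) { auto tmp = *this; ++value; return tmp; }

    friend bool operator==(counting_iterator a, counting_iterator b) { return a.value == b.value; }
    friend bool operator!=(counting_iterator a, counting_iterator b) { return a.value != b.value; }
};

// Hypothetical helper: sum of squares of row[start .. start + n), computed by
// reducing over indices rather than over the elements themselves.
inline double piece_sum_of_squares(const std::vector<double>& row,
                                   std::size_t start, std::size_t n) {
    assert(start + n <= row.size());  // the index range must stay inside row
    return std::transform_reduce(
        counting_iterator(start), counting_iterator(start + n),
        0.0, std::plus<>{},
        [&row](std::size_t i) { return row[i] * row[i]; });
}
```

Whether the same holds under std::execution::par depends on the iterator meeting the forward-iterator requirements of the parallel overloads and on the toolchain in use, which this sketch does not cover.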
Never mind, this error was coming from libcudart/11.7, which is already loaded into modules and takes precedence even with the nvhpc/23.7 compiler. Running ml unload cudatoolkit and then rerunning cmake and make uses the latest cudart/12.x from the nvhpc/23.x module and works fine.
Great, thanks a lot!
Yes, I used ml unload cudatoolkit when loading modules. I build with the following options and see no build issues:
cmake .. -DCMAKE_CXX_COMPILER=$(which nvc++) -DCMAKE_C_COMPILER=$(which nvc) -DCMAKE_BUILD_TYPE=Release -DSTDPAR=gpu
Modules:
ml use /global/cfs/cdirs/m1759/wwei/nvhpc_23_7/modulefiles ; ml unload cudatoolkit ; ml nvhpc/23.1 cmake/3.24
[resolved] Issues:
[resolved] sync_wait() syntax issue:
[resolved] Last two diagonal results incorrect