This repository has been archived by the owner on Nov 2, 2023. It is now read-only.

sender_choleskey #28

Merged
5 commits merged into mhaseeb123:main on Oct 3, 2023

Conversation

@hcq9102 (Collaborator) commented Sep 26, 2023

  • using std::execution

Resolved issues:

  • [resolved] sync_wait() syntax issue (see the sketch below)

  • [resolved] last two diagonal results incorrect
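
For context, here is a minimal sketch of the stdexec sender pipeline and sync_wait() usage this PR relies on. This is not the PR's actual code; it only assumes the NVIDIA stdexec reference implementation headers and illustrates the general pattern:

// Minimal stdexec sender/receiver sketch: build a pipeline with just() and
// then(), then block on the result with sync_wait().
#include <stdexec/execution.hpp>

#include <cstdio>
#include <utility>

int main() {
  // just() starts the pipeline with a value; then() transforms it.
  auto work = stdexec::just(3.0)
            | stdexec::then([](double x) { return x * x; });

  // sync_wait() blocks until the sender completes and returns
  // std::optional<std::tuple<double>>.
  auto [result] = stdexec::sync_wait(std::move(work)).value();
  std::printf("result = %f\n", result);
  return 0;
}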

@mhaseeb123 merged commit b21fb35 into mhaseeb123:main on Oct 3, 2023
1 check passed
//
// This example provides a stdexec (senders/receivers) implementation of Cholesky decomposition.
#include <algorithm>
#include <exec/any_sender_of.hpp>

@mhaseeb123 (Owner) commented:

Would recommend cleaning up any unused headers like any_sender_of.hpp


sum_vec[piece] = std::transform_reduce(
    std::execution::par,
    counting_iterator(start),
@mhaseeb123 (Owner) commented Oct 3, 2023:

This is not correct since counting_iterator(start) and counting_iterator(start + N) are two separate objects and may not be iterable.

Update: this is actually valid since nvhpc/22.9+, as per https://forums.developer.nvidia.com/t/internal-compiler-error-bad-sptr-in-var-refsym/253631. The error was originating from cudart/11.7 (pulled in by cudatoolkit/11.7) in our default PM environment.
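
For reference, a minimal sketch of the counting-iterator pattern under discussion: summing f(i) over an index range [start, start + N) with std::transform_reduce and std::execution::par. This is not the PR's code; it assumes thrust::counting_iterator as the counting_iterator type, which is what nvhpc stdpar codes commonly use:

// Sketch: reduce over an index range delimited by two counting iterators.
#include <execution>
#include <numeric>
#include <vector>
#include <cstdio>

#include <thrust/iterator/counting_iterator.h>

int main() {
  std::vector<double> a(1000, 1.0), b(1000, 2.0);
  const double* pa = a.data();
  const double* pb = b.data();
  const int start = 0;
  const int N = static_cast<int>(a.size());

  // counting_iterator(start) and counting_iterator(start + N) are distinct
  // objects, but together they delimit a valid random-access index range
  // [start, start + N) that the parallel algorithm can walk.
  const double sum = std::transform_reduce(
      std::execution::par,
      thrust::counting_iterator<int>(start),
      thrust::counting_iterator<int>(start + N),
      0.0,
      std::plus<>{},
      [=](int i) { return pa[i] * pb[i]; });

  std::printf("sum = %f\n", sum);  // expected: 2000.0
  return 0;
}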

@mhaseeb123 (Owner) commented:

This is also being caught by the compiler; see below:

[screenshot of the compiler error]

@mhaseeb123 (Owner) commented:

Never mind, this error was coming from libcudart/11.7, which is already loaded in the modules and takes precedence even with the nvhpc/23.7 compiler. Doing ml unload cudatoolkit and rerunning cmake and make picks up the latest cudart/12.x from the nvhpc/23.x module, and everything works fine.

@hcq9102 (Collaborator, Author) commented Oct 4, 2023:

Great, thanks a lot!
Yes, I have used ml unload cudatoolkit when loading modules.

I build with the following options and have no build issues:
cmake .. -DCMAKE_CXX_COMPILER=$(which nvc++) -DCMAKE_C_COMPILER=$(which nvc) -DCMAKE_BUILD_TYPE=Release -DSTDPAR=gpu

modules:
ml use /global/cfs/cdirs/m1759/wwei/nvhpc_23_7/modulefiles ; ml unload cudatoolkit ; ml nvhpc/23.1 cmake/3.24
