Add initial support for async stack traces to Unifex #616

Merged
ispeters merged 9 commits into main from async_stack_traces
Jul 18, 2024

Conversation

ispeters
Contributor

@ispeters ispeters commented Jul 5, 2024

This PR copies the core of Folly's async stack trace support into include/unifex/tracing and builds on it to add support for generalized Senders.

When UNIFEX_NO_ASYNC_STACKS is falsey, unifex::connect returns a wrapped operation state that injects async stack tracing into the operation tree.

  • The wrapper operation:
    • stores an AsyncStackFrame for the wrapped operation; and
    • wraps the receiver.
  • In the wrapper operation's customization of unifex::start we:
    • create an AsyncStackRoot on the stack;
    • push the wrapper operation's AsyncStackFrame onto the current async stack;
    • activate the wrapper operation's AsyncStackFrame on the current AsyncStackRoot; and
    • start the wrapped operation.
  • In the wrapper receiver's completion methods we:
    • create an AsyncStackRoot on the stack;
    • copy the parent operation's AsyncStackFrame to the stack;
    • activate the parent AsyncStackFrame on the current AsyncStackRoot; and
    • invoke the parent operation's receiver.

The effect is that we build up a linked list (technically a DAG) of AsyncStackFrames pointing "up" toward the start of the operation as unifex::start recurses into the nested operation state and then unwind it on the way back out as the receiver completion methods are invoked. At any given time, the current thread's AsyncStackRoot is sitting on the most recently-activated "normal" stack frame that is participating in async stack management, allowing Folly's co_bt.py debugger extension to figure out when it should stop walking normal stack frames and start walking async stack frames.
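To make the push-on-start / unwind-on-completion shape concrete, here is a minimal sketch using toy types. `ToyFrame`, `ToyRoot`, `toyPushAndActivate`, `toyCompleteToParent`, and `toyWalk` are hypothetical names invented for this illustration; they are not the real `AsyncStackFrame`/`AsyncStackRoot` API from Unifex or Folly, and the real types carry more state.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; NOT the real Unifex/Folly types.
struct ToyFrame {
  const char* name;
  ToyFrame* parent = nullptr;  // points "up" toward the start of the operation
};

struct ToyRoot {
  ToyFrame* top = nullptr;  // most recently activated frame
};

// What a wrapper's start() would do: link the frame to the current top
// and make it the active frame.
inline void toyPushAndActivate(ToyRoot& root, ToyFrame& frame) noexcept {
  frame.parent = root.top;
  root.top = &frame;
}

// What a wrapper receiver's completion would do: re-activate the parent
// frame before invoking the parent operation's receiver.
inline void toyCompleteToParent(ToyRoot& root) noexcept {
  assert(root.top != nullptr);
  root.top = root.top->parent;
}

// What a debugger script would do: walk from the active frame upward.
inline std::vector<std::string> toyWalk(const ToyRoot& root) {
  std::vector<std::string> names;
  for (const ToyFrame* f = root.top; f != nullptr; f = f->parent) {
    names.emplace_back(f->name);
  }
  return names;
}
```

Starting three nested operations and then completing the innermost one leaves only the two still-suspended frames reachable from the root, which is the same reason completed operations do not appear in an async backtrace.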

As alluded to above, the behaviour of the async stack tracing machinery is controlled by the UNIFEX_NO_ASYNC_STACKS preprocessor macro. If it's truthy, async stacks are not traced; if it's falsey, they are traced. The default in unifex/config.hpp is to enable async stack tracing in non-Windows debug builds.

  • Why not Windows builds?
    • Because there's something weird about how any_sender_of<> builds on Windows (with both Clang and MSVC); the resolution is to land PR #619 (Make any_sender_of<> play nicer with MSVC), but that PR breaks an internal Meta build so I'll have to come back to it.
  • Why only debug builds?
    • The additional work done to track async stacks adds non-trivial binary size to the output so I figure it should default to off for release builds. You can turn it on by defining UNIFEX_NO_ASYNC_STACKS=0 in your release build script if the extra debuggability is worth the extra binary size in production.
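The defaulting rule described above could be expressed roughly as follows. This is a sketch of the idea only: the exact conditions in `unifex/config.hpp` may differ, and `toyAsyncStacksEnabled` is a hypothetical helper added here so the result can be inspected.

```cpp
#include <cassert>

// Hypothetical defaulting logic; the real unifex/config.hpp may differ.
// A value supplied on the command line always wins.
#ifndef UNIFEX_NO_ASYNC_STACKS
#  if defined(_WIN32) || defined(NDEBUG)
#    define UNIFEX_NO_ASYNC_STACKS 1  // off on Windows and in release builds
#  else
#    define UNIFEX_NO_ASYNC_STACKS 0  // on in non-Windows debug builds
#  endif
#endif

// Hypothetical helper: true when async stack tracing is compiled in.
constexpr bool toyAsyncStacksEnabled() noexcept {
#if UNIFEX_NO_ASYNC_STACKS
  return false;
#else
  return true;
#endif
}
```

With GCC or Clang, a release build would opt in by passing `-DUNIFEX_NO_ASYNC_STACKS=0` to the compiler for every translation unit.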

This iteration is an MVP:

  • only general senders are supported, not coroutines
  • the "return addresses" captured for each sender point to unifex::_get_return_address::default_return_address<T>(), where T is the type of the sender
    • this is better than nothing because the resulting symbol includes the sender's fully-qualified name, but it's not great
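One way to get a "return address" whose symbol mentions the sender type is sketched below. It assumes GCC or Clang (it relies on `__builtin_return_address`), and the names `toyReadReturnAddress`, `toyDefaultReturnAddress`, and `TOY_NOINLINE` are hypothetical; this is not a copy of Unifex's implementation.

```cpp
#include <cassert>

#if defined(__GNUC__) || defined(__clang__)
#  define TOY_NOINLINE __attribute__((noinline))
#else
#  define TOY_NOINLINE
#endif

// Hypothetical helper: returns the address this call will return to,
// which lies inside whichever function called it.
TOY_NOINLINE inline const void* toyReadReturnAddress() noexcept {
#if defined(__GNUC__) || defined(__clang__)
  return __builtin_return_address(0);
#else
  return nullptr;  // this sketch has no fallback for other compilers
#endif
}

// Hypothetical default: the captured address lies inside this
// instantiation, so a symbolizer prints a name that includes Sender.
template <typename Sender>
TOY_NOINLINE const void* toyDefaultReturnAddress() noexcept {
  // The volatile local keeps the call above from becoming a tail call,
  // which would make the address point into our caller instead.
  const void* volatile addr = toyReadReturnAddress();
  return addr;
}
```

Because the address resolves to a symbol such as `toyDefaultReturnAddress<SomeSender>()`, a backtrace at least names the sender type even though it says nothing about where the sender was constructed.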

Future PRs will:

  • add support for tracing the async stacks of coroutines
  • improve the rendering of async stack traces by making senders capture a pointer to the call site of their factory
  • maybe shrink the binary size overhead of enabling this feature if I can figure out how to eliminate some of the recursion

Co-authored-by: Ján Ondrušek
Co-authored-by: Jessica Wong
Co-authored-by: Deniz Evrenci

@ispeters ispeters requested review from janondrusek and jesswong July 5, 2024 23:35
@facebook-github-bot facebook-github-bot added the CLA Signed label Jul 5, 2024
@ispeters ispeters force-pushed the async_stack_traces branch 2 times, most recently from b578556 to 4f975a7 Compare July 9, 2024 18:22
@ispeters ispeters marked this pull request as ready for review July 9, 2024 18:27
@ispeters
Contributor Author

ispeters commented Jul 9, 2024

This isn't really ready for review; I'm marking it as such to trigger some Meta-internal automation.

@ispeters ispeters marked this pull request as draft July 9, 2024 18:59
@ispeters ispeters force-pushed the async_stack_traces branch 4 times, most recently from c357920 to 9d3358f Compare July 16, 2024 01:27
This diff, originally by @janondrusek and @jesswong, copies the core of
Folly's [async stack trace support](https://github.com/facebook/folly/tree/main/folly/tracing)
into `include/unifex/tracing`.
@ispeters ispeters force-pushed the async_stack_traces branch from 9d3358f to 711a18e Compare July 16, 2024 05:31
Stop using `void*` to represent both instruction pointers and stack
frame pointers and start using `unifex::instruction_ptr` and
`unifex::frame_ptr`.
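The benefit of distinct pointer types over a shared `void*` can be seen in a small sketch. `toy_instruction_ptr` and `toy_frame_ptr` are hypothetical stand-ins; the real `unifex::instruction_ptr` and `unifex::frame_ptr` may have a different interface.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical strong typedef for a code address.
class toy_instruction_ptr {
public:
  toy_instruction_ptr() = default;
  explicit toy_instruction_ptr(const void* p) noexcept : p_(p) {}
  const void* get() const noexcept { return p_; }
  friend bool operator==(toy_instruction_ptr a, toy_instruction_ptr b) noexcept {
    return a.p_ == b.p_;
  }

private:
  const void* p_ = nullptr;
};

// Hypothetical strong typedef for a stack frame address.
class toy_frame_ptr {
public:
  toy_frame_ptr() = default;
  explicit toy_frame_ptr(const void* p) noexcept : p_(p) {}
  const void* get() const noexcept { return p_; }
  friend bool operator==(toy_frame_ptr a, toy_frame_ptr b) noexcept {
    return a.p_ == b.p_;
  }

private:
  const void* p_ = nullptr;
};
```

With the explicit constructors, passing a frame pointer where an instruction pointer is expected (or a raw `void*` to either) becomes a compile error rather than a silent mix-up.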
@ispeters ispeters force-pushed the async_stack_traces branch 4 times, most recently from ac17292 to a37253f Compare July 16, 2024 20:00
ispeters added 7 commits July 16, 2024 21:38
We need a way to restore a `ScopedAsyncStackRoot` to the "no active
frame" state before destroying it on the way out of a customization of
`unifex::start` but the frame we want to deactivate is a member of the
operation state, which means it's likely already been destroyed. This
diff adds `ScopedAsyncStackRoot::ensureFrameDeactivated()`, which
performs most of the same actions as `deactivateAsyncStackFrame()` but
without touching the frame. I think this still technically invokes UB by
copying and comparing a zapped pointer, but it's better than what we had
before.
This diff adds a new receiver query CPO that is expected to return the
address of the `AsyncStackFrame` associated with the receiver's
operation.
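A receiver query of this kind might look roughly like the sketch below. It uses a plain ADL-dispatched function object rather than Unifex's actual customization machinery, and `toy::get_frame`, `toy_get_frame`, `ToyStackFrame`, and `ToyReceiver` are all hypothetical names.

```cpp
#include <cassert>

// Toy stand-in for AsyncStackFrame.
struct ToyStackFrame {};

namespace toy {
// Hypothetical query object: forwards to an ADL-found customization.
struct get_frame_fn {
  template <typename Receiver>
  auto operator()(const Receiver& r) const noexcept
      -> decltype(toy_get_frame(r)) {
    return toy_get_frame(r);
  }
};
constexpr get_frame_fn get_frame{};
}  // namespace toy

// A receiver opts in by providing the customization as a hidden friend.
struct ToyReceiver {
  ToyStackFrame* frame;

  friend ToyStackFrame* toy_get_frame(const ToyReceiver& r) noexcept {
    return r.frame;  // address of the frame owned by the operation state
  }
};
```

A child operation can then ask its receiver for the parent's frame without knowing the receiver's concrete type.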
This diff adds a new sender query CPO that is expected to return the
instruction pointer best representing the "return address" for the
sender; the default implementation returns the return address of a
function template instantiation that includes the sender's type in its
signature as a kind of "better than nothing" result.
The `instruction_ptr` type is best rendered by the debugger as an
"address", which will render as a symbol + offset rather than an
arbitrary hexadecimal value. This diff adds a comment to the type
documenting this fact.
This diff modifies `unifex::sync_wait()` to establish an
`AsyncStackRoot` on the stack while the awaited operation is running.
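Establishing a root for the duration of a blocking wait is naturally expressed with RAII. The following is a minimal sketch under that assumption; `ToyAsyncStackRoot`, `toyCurrentRoot`, `ToyScopedRoot`, and `toySyncWait` are hypothetical and much simpler than the real `AsyncStackRoot` handling.

```cpp
#include <cassert>

// Toy stand-in for AsyncStackRoot: roots on one thread form a list.
struct ToyAsyncStackRoot {
  ToyAsyncStackRoot* next = nullptr;
};

// Per-thread pointer to the innermost live root.
inline ToyAsyncStackRoot*& toyCurrentRoot() noexcept {
  thread_local ToyAsyncStackRoot* root = nullptr;
  return root;
}

// Installs a root on construction and restores the previous one on exit.
class ToyScopedRoot {
public:
  ToyScopedRoot() noexcept {
    root_.next = toyCurrentRoot();
    toyCurrentRoot() = &root_;
  }
  ~ToyScopedRoot() { toyCurrentRoot() = root_.next; }
  ToyScopedRoot(const ToyScopedRoot&) = delete;
  ToyScopedRoot& operator=(const ToyScopedRoot&) = delete;

private:
  ToyAsyncStackRoot root_;
};

// Hypothetical blocking wait: the root lives on this stack frame while
// the awaited work runs.
template <typename Work>
auto toySyncWait(Work work) -> decltype(work()) {
  ToyScopedRoot scoped;
  return work();
}
```

Any work run inside `toySyncWait` observes a non-null current root, and the previous state is restored once the wait returns.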
This diff modifies `unifex::connect` to inject async stack tracking into
every operation state as it's built.
The Unifex unit test suite won't build for Windows with async stack
injection enabled *unless* PR #619 (Make any_sender_of<> play nicer with
MSVC) is also merged, but that PR causes Windows + Clang + ASAN errors
in Meta-internal builds.

This diff works around the above conflict by disabling async stack
injection in Windows builds by default so we don't need PR #619. We can
change the default once we figure out a proper resolution to the ASAN
problem.
@ispeters ispeters force-pushed the async_stack_traces branch from a37253f to 2d1d85b Compare July 17, 2024 05:30
@ispeters ispeters changed the title Import Folly's async stack library Add initial support for async stack traces to Unifex Jul 17, 2024
@ispeters ispeters marked this pull request as ready for review July 17, 2024 05:58
@ericniebler
Collaborator

cool! would be extra super duper cool if it came with debugger scripts for dumping the backtrace. but maybe that belongs in a separate PR.

do you have an example of such a backtrace? i'm curious what it looks like.

@ispeters
Contributor Author

ispeters commented Jul 17, 2024

> cool! would be extra super duper cool if it came with debugger scripts for dumping the backtrace. but maybe that belongs in a separate PR.

This PR makes Unifex's runtime representation of async stacks compatible with Folly's so Folly's co_bt.py can dump Unifex's stacks, too. I've got agreement in principle with the relevant Folly folks that async stacks ought to live in some third library that both Unifex and Folly can depend upon; not sure when/if we'll get there, but I'd love to see other S/R libraries depend upon it, too.

> do you have an example of such a backtrace? i'm curious what it looks like.

Here's an lldb session debugging the Nested test (Let.Nested) in libunifex/test/let_value_test.cpp with some filename redactions:

(lldb) b let_value_test.cpp:139
Breakpoint 3: where = …`Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()::operator()() const + 20 at let_value_test.cpp:139:29, address = 0x0000000000eb62b4
(lldb) r
Process 197428 launched: '…' (x86_64)
Note: Google Test filter = Let.Nested
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from Let
[ RUN      ] Let.Nested
Process 197428 stopped
* thread #1, name = '…', stop reason = breakpoint 1.1
    frame #0: 0x0000000000ea9b0a unittest`Let_Nested_Test::TestBody(this=0x00007ffff601a7f0) at let_value_test.cpp:129:31
   126 	}
   127
   128 	TEST(Let, Nested) {
-> 129 	  timed_single_thread_context context;
   130 	  // More complicated 'let_value' example that shows recursive let_value-scopes,
   131 	  // additional
   132
(lldb) c
Process 197428 resuming
producing vector
Process 197428 stopped
* thread #5, name = '…', stop reason = breakpoint 3.1
    frame #0: 0x0000000000eb62b4 …`Let_Nested_Test::TestBody(this=0x00007fffffffd1c8)::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()::operator()() const at let_value_test.cpp:139:29
   136 	              asyncVector(context),
   137 	              [&](std::vector<int>& v) {
   138 	                return async(context, [&] {
-> 139 	                  std::cout << "printing vector" << std::endl;
   140 	                  for (int& x : v) {
   141 	                    std::cout << x << ", ";
   142 	                  }
(lldb) co_bt
#0  0x0000000000eb62b4 in Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()::operator()() const () at …unifex/test/let_value_test.cpp:139
#1  0x0000000000eb6295 in void std::__invoke_impl<void, Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()>(std::__invoke_other, Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()&&) () at …libgcc/include/c++/trunk/bits/invoke.h:61
#2  0x0000000000eb6275 in std::__invoke_result<Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()>::type std::__invoke<Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()>(Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()&&) () at …libgcc/include/c++/trunk/bits/invoke.h:96
#3  0x0000000000eb6205 in std::invoke_result<Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()>::type std::invoke<Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()>(Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()&&) () at …libgcc/include/c++/trunk/functional:97
#4  0x0000000000eb6169 in void unifex::_then::_receiver<unifex::_inject::_rcvr_wrapper<unifex::_let_v::_successor_receiver<unifex::_let_v::_op<unifex::_then::_sender<unifex::_timed_single_thread_context::_schedule_after_sender<std::chrono::duration<long, std::ratio<1l, 1000l>>>::type, auto (anonymous namespace)::$_3::operator()<unifex::timed_single_thread_context>(unifex::timed_single_thread_context&) const::'lambda'()>::type&&, Let_Nested_Test::TestBody()::$_4, unifex::_inject::_rcvr_wrapper<unifex::_when_all::_element_receiver<0ul, unifex::_inject::_rcvr_wrapper<unifex::_then::_receiver<unifex::_inject::_rcvr_wrapper<unifex::_sync_wait::_receiver<unifex::_unit::unit>::type>::type, Let_Nested_Test::TestBody()::$_6>::type>::type, unifex::_let_v::_sender<unifex::_then::_sender<unifex::_timed_single_thread_context::_schedule_after_sender<std::chrono::duration<long, std::ratio<1l, 1000l>>>::type, auto (anonymous namespace)::$_3::operator()<unifex::timed_single_thread_context>(unifex::timed_single_thread_context&) const::'lambda'()>::type, Let_Nested_Test::TestBody()::$_4>::type&&, unifex::_let_v::_sender<unifex::_just::_sender<int>::type, Let_Nested_Test::TestBody()::$_5>::type&&>::type>::type>::type, std::vector<int, std::allocator<int>>>::type>::type, Let_Nested_Test::TestBody()::Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()>::type::set_value<>() && () at …unifex/include/unifex/then.hpp:72
#5  0x0000000000eb56ad in unifex::instruction_ptr unifex::_get_return_address::default_return_address<unifex::_then::_sender<unifex::_timed_single_thread_context::_schedule_after_sender<std::chrono::duration<long, std::ratio<1l, 1000l>>>::type, Let_Nested_Test::TestBody()::$_4::operator()(std::vector<int, std::allocator<int>>&) const::'lambda'()>::type>() () at …unifex/include/unifex/tracing/get_return_address.hpp:57
#6  0x0000000000eb255d in unifex::instruction_ptr unifex::_get_return_address::default_return_address<unifex::_let_v::_sender<unifex::_then::_sender<unifex::_timed_single_thread_context::_schedule_after_sender<std::chrono::duration<long, std::ratio<1l, 1000l>>>::type, auto (anonymous namespace)::$_3::operator()<unifex::timed_single_thread_context>(unifex::timed_single_thread_context&) const::'lambda'()>::type, Let_Nested_Test::TestBody()::$_4>::type>() () at …unifex/include/unifex/tracing/get_return_address.hpp:57
#7  0x0000000000eb17bd in unifex::instruction_ptr unifex::_get_return_address::default_return_address<unifex::_when_all::_sender<unifex::_let_v::_sender<unifex::_then::_sender<unifex::_timed_single_thread_context::_schedule_after_sender<std::chrono::duration<long, std::ratio<1l, 1000l>>>::type, auto (anonymous namespace)::$_3::operator()<unifex::timed_single_thread_context>(unifex::timed_single_thread_context&) const::'lambda'()>::type, Let_Nested_Test::TestBody()::$_4>::type, unifex::_let_v::_sender<unifex::_just::_sender<int>::type, Let_Nested_Test::TestBody()::$_5>::type>::type>() () at …unifex/include/unifex/tracing/get_return_address.hpp:57
#8  0x0000000000eb155d in unifex::instruction_ptr unifex::_get_return_address::default_return_address<unifex::_then::_sender<unifex::_when_all::_sender<unifex::_let_v::_sender<unifex::_then::_sender<unifex::_timed_single_thread_context::_schedule_after_sender<std::chrono::duration<long, std::ratio<1l, 1000l>>>::type, auto (anonymous namespace)::$_3::operator()<unifex::timed_single_thread_context>(unifex::timed_single_thread_context&) const::'lambda'()>::type, Let_Nested_Test::TestBody()::$_4>::type, unifex::_let_v::_sender<unifex::_just::_sender<int>::type, Let_Nested_Test::TestBody()::$_5>::type>::type, Let_Nested_Test::TestBody()::$_6>::type>() () at …unifex/include/unifex/tracing/get_return_address.hpp:57
#9  0x0000000000ea9c46 in Let_Nested_Test::TestBody() () at …unifex/test/let_value_test.cpp:133
#10 0x0000000001351539 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) () at …gtest.cc:2677
#11 0x0000000001351269 in testing::Test::Run() () at …gtest.cc:2699
#12 0x0000000001352d93 in testing::TestInfo::Run() () at …gtest.cc:2844
#13 0x0000000001354b9c in testing::TestSuite::Run() () at …gtest.cc:3022
#14 0x0000000001368fdc in testing::internal::UnitTestImpl::RunAllTests() () at …gtest.cc:5926
#15 0x0000000001368b5e in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () at …gtest.cc:2675
#16 0x0000000001368722 in testing::UnitTest::Run() () at …gtest.cc:5489
#17 0x000000000130f231 in RUN_ALL_TESTS() () at …gtest.h:2317
#18 0x000000000130f162 in main () at …:20
#19 0x00007ffff7c2c657 in __libc_start_call_main () at ???:0
#20 0x00007ffff7c2c718 in __libc_start_main@@GLIBC_2.34 () at ???:0
#21 0x00000000009cdba1 in _start () at …glibc…/sysdeps/x86_64/start.S:118

Things to note:

  • the process is stopped on thread 5 (the test schedules some work onto a non-main thread), but the stack traces back to _start() on the main thread
  • frames 5-8 are using the ugly, default customizations of unifex::get_return_address to figure out what instruction pointer to use to represent the suspended operation; those frames will provide more useful information when the corresponding algorithms capture their call sites
  • frame 9 is the call site of sync_wait in the test body

@ericniebler
Collaborator

I'm looking for a frame that represents the transition from thread 1 to thread 5, something like a transfer. I'm not seeing it tho. Why?

@ispeters
Contributor Author

> I'm looking for a frame that represents the transition from thread 1 to thread 5, something like a transfer. I'm not seeing it tho. Why?

Because the sender that did that has already completed by the time the breakpoint I selected hits. Completed operations are, in this respect, analogous to functions that have returned—they're not on the stack because the stack represents the list of suspended operations waiting to be completed.

@ispeters ispeters merged commit 16740d0 into main Jul 18, 2024
151 checks passed
@ispeters ispeters deleted the async_stack_traces branch July 18, 2024 05:29
ispeters added a commit that referenced this pull request Aug 26, 2024
This change extends the work in #616 to support async stack frames in
`task<>` coroutines, including those that invoke `at_coroutine_exit()`.

In `task<>`, when `UNIFEX_NO_ASYNC_STACKS` is falsey, the awaiter returned from
`task<>`'s customization of `unifex::await_transform` stores an
`AsyncStackFrame`. The awaiter pushes its frame onto the current async stack in
`await_suspend()` and pops it again in `await_resume()`; since
`await_resume()` is only invoked for value and error completions, this
arrangement leaves it up to the waiting task to pop the awaiter's frame
when the awaited task completes with done. This can be expressed as a
new rule:

- when a coroutine completes with a value or an error, it is responsible
  for popping its own `AsyncStackFrame`; but
- when a coroutine completes with done, the *caller* is responsible for
  popping the callee's `AsyncStackFrame` as a part of the caller's
  `unhandled_done()` coroutine.

To support this new requirement of `unhandled_done()` (that it is
responsible for popping the callee's stack frame), this change
introduces `popAsyncStackFrameFromCaller`, which takes the caller's
stack frame by reference so that it can assert that, after popping the
current async frame (whatever it is), the new top frame is the caller's
frame.
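The two popping paths can be modeled with a toy stack, as in the sketch below. `CoroFrame`, `CoroStack`, `toyPushFrame`, `toyPopOwnFrame`, and `toyPopFromCaller` are hypothetical; in particular, the real `popAsyncStackFrameFromCaller` asserts internally, whereas this version returns the check as a `bool` so it can be observed.

```cpp
#include <cassert>

// Toy stand-ins; NOT the real AsyncStackFrame machinery.
struct CoroFrame {
  CoroFrame* parent = nullptr;
};

struct CoroStack {
  CoroFrame* top = nullptr;
};

inline void toyPushFrame(CoroStack& s, CoroFrame& f) noexcept {
  f.parent = s.top;
  s.top = &f;
}

// Value/error path: the callee pops its own frame.
inline void toyPopOwnFrame(CoroStack& s, CoroFrame& own) noexcept {
  assert(s.top == &own);
  s.top = own.parent;
}

// Done path: the caller pops whatever frame is on top, then checks that
// its own frame is the new top.
inline bool toyPopFromCaller(CoroStack& s, CoroFrame& callerFrame) noexcept {
  assert(s.top != nullptr);
  s.top = s.top->parent;
  return s.top == &callerFrame;
}
```

Either path leaves the caller's frame on top, which is the invariant the rule above is meant to preserve.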

A `task<>` promise has an `AsyncStackFrame*` that, when it's not
`nullptr`, points to the `AsyncStackFrame` in the awaiter waiting for
the task. This pointer exists even when `UNIFEX_NO_ASYNC_STACKS` is
truthy to help mitigate against ODR violations; linking together two TUs
with `UNIFEX_NO_ASYNC_STACKS` set differently is not explicitly
supported but, by ensuring this pointer always exists, some ODR problems
are avoided. When a `task<>` is awaited from a TU with async stack
support enabled, the awaited task's awaiter sets the promise's
`AsyncStackFrame*` to point to the awaiter's frame; when a `task<>` is
awaited from a TU with async stack support disabled, this assignment
never happens and the promise's pointer remains null.
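The layout trick described here, keeping the member unconditionally and only making the assignment conditional, can be sketched as follows. `TOY_NO_ASYNC_STACKS`, `ToyAwaiterFrame`, `ToyPromise`, and `ToyAwaiter` are hypothetical names standing in for the real macro and `task<>` internals.

```cpp
#include <cassert>

// Hypothetical stand-in for UNIFEX_NO_ASYNC_STACKS.
#ifndef TOY_NO_ASYNC_STACKS
#  define TOY_NO_ASYNC_STACKS 0
#endif

struct ToyAwaiterFrame {};

struct ToyPromise {
  // Always present, so the promise's layout is the same in every TU
  // regardless of how the macro is set.
  ToyAwaiterFrame* awaiterFrame = nullptr;
};

struct ToyAwaiter {
  ToyAwaiterFrame frame;

  // Only TUs built with tracing enabled publish the awaiter's frame.
  void attach(ToyPromise& p) noexcept {
#if !TOY_NO_ASYNC_STACKS
    p.awaiterFrame = &frame;
#else
    (void)p;  // pointer stays null when tracing is disabled
#endif
  }
};
```

Code that reads the pointer must still handle `nullptr`, since a task may have been awaited from a translation unit with tracing disabled.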

The above description of `task<>`'s async stack maintenance only covers
the recursive case of one coroutine awaiting another. The base case is
handled in `connect_awaitable()`, where an `AsyncStackRoot` is set up
before starting the connected awaitable.

`stop_if_requested` used to model both `sender` and `awaitable` so that
`co_await stop_if_requested();` could take advantage of symmetric
transfer. The `stop_if_requested` sender now customizes
`await_transform` to express its participation in async stack
management. This means of expressing async stack awareness is
unsatisfying but I don't have any better ideas right now.

Lastly, `unifex::await_transform()` now wraps naturally-awaitable
arguments in an `awaiter_wrapper` that ensures the `coroutine_handle<>`
passed to the wrapped awaitable is one that establishes an active
`AsyncStackRoot` before resuming the real waiting coroutine.
ispeters added a commit that referenced this pull request Aug 28, 2024
ispeters added a commit that referenced this pull request Aug 28, 2024
ispeters added a commit that referenced this pull request Aug 29, 2024
ispeters added a commit that referenced this pull request Aug 29, 2024
This change extends the work in #616 to support async stack frames in
`task<>` coroutines, including those that invoke `at_coroutine_exit()`.

In `task<>`, when `UNIFEX_NO_ASYNC_STACKS` is falsey, the awaiter returned from
`task<>`'s customization of `unifex::await_transform` stores an
`AsyncStackFrame`. The awaiter pushes its frame onto the current async stack in
`await_suspend()` and pops it again in `await_resume()`; since
`await_resume()` is only invoked for value and error completions, this
arrangement leaves it up to the waiting task to pop the awaiter's frame
when the awaited task completes with done. This can be expressed as a
new rule:

- when a coroutine completes with a value or an error, it is responsible
  for popping its own `AsyncStackFrame`; but
- when a coroutine completes with done, the *caller* is responsible for
  popping the callee's `AsyncStackFrame` as a part of the caller's
  `unhandled_done()` coroutine.

To support this new requirement of `unhandled_done()` (that it is
responsible for popping the callee's stack frame), this change
introduces `popAsyncStackFrameFromCaller`, which takes the caller's
stack frame by reference so that it can assert that, after popping the
current async frame (whatever it is), the new top frame is the caller's
frame.

A `task<>` promise has an `AsyncStackFrame*` that, when it's not
`nullptr`, points to the `AsyncStackFrame` in the awaiter waiting for
the task. This pointer exists even when `UNIFEX_NO_ASYNC_STACKS` is
truthy to help mitigate against ODR violations; linking together two TUs
with `UNIFEX_NO_ASYNC_STACKS` set differently is not explicitly
supported but, by ensuring this pointer always exists, some ODR problems
are avoided. When a `task<>` is awaited from a TU with async stack
support enabled, the awaited task's awaiter sets the promise's
`AsyncStackFrame*` to point to the awaiter's frame; when a `task<>` is
awaited from a TU with async stack support disabled, this assignment
never happens and the promise's pointer remains null.
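
The layout idea in the paragraph above can be sketched as follows. This is an assumption-laden illustration, not the Unifex source: `TOY_NO_ASYNC_STACKS`, `ToyAsyncFrame`, `ToyPromise`, and `ToyAwaiter` are hypothetical names. The point is that the pointer member is declared unconditionally, so the promise has the same layout in every TU, and only the assignment is guarded by the macro.

```cpp
// Hypothetical configuration macro standing in for UNIFEX_NO_ASYNC_STACKS.
#ifndef TOY_NO_ASYNC_STACKS
#define TOY_NO_ASYNC_STACKS 0
#endif

// Toy stand-in for an async stack frame.
struct ToyAsyncFrame {
  ToyAsyncFrame* parent = nullptr;
};

struct ToyPromise {
  // Declared regardless of the macro so that the promise's size and layout
  // are identical in TUs built with different settings.
  ToyAsyncFrame* awaiterFrame = nullptr;
};

struct ToyAwaiter {
  ToyAsyncFrame frame;

  // Called when the task is awaited; only a TU with async stacks enabled
  // points the promise at this awaiter's frame.
  void attach(ToyPromise& promise) noexcept {
#if !TOY_NO_ASYNC_STACKS
    promise.awaiterFrame = &frame;
#else
    (void)promise;  // pointer stays null when async stacks are disabled
#endif
  }
};

// Returns true if awaiting set the promise's pointer to the awaiter's frame.
inline bool toyAttached() {
  ToyPromise promise;
  ToyAwaiter awaiter;
  awaiter.attach(promise);
  return promise.awaiterFrame == &awaiter.frame;
}
```

With the macro left at its default of `0`, `toyAttached()` reports that the pointer was set; with it defined to `1`, the pointer stays null while `sizeof(ToyPromise)` is unchanged.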

The above description of `task<>`'s async stack maintenance only covers
the recursive case of one coroutine awaiting another. The base case is
handled in `connect_awaitable()`, where an `AsyncStackRoot` is set up
before starting the connected awaitable.

`stop_if_requested` used to model both `sender` and `awaitable` so that
`co_await stop_if_requested();` could take advantage of symmetric
transfer. The `stop_if_requested` sender now customizes
`await_transform` to express its participation in async stack
management. This means of expressing async stack awareness is
unsatisfying but I don't have any better ideas right now.

Lastly, `unifex::await_transform()` now wraps naturally-awaitable
arguments in an `awaiter_wrapper` that ensures the `coroutine_handle<>`
passed to the wrapped awaitable is one that establishes an active
`AsyncStackRoot` before resuming the real waiting coroutine.
ispeters added a commit that referenced this pull request Aug 29, 2024
ispeters added a commit that referenced this pull request Aug 29, 2024
ispeters added a commit that referenced this pull request Sep 1, 2024
ispeters added a commit that referenced this pull request Sep 11, 2024
ispeters added a commit that referenced this pull request Sep 11, 2024
ispeters added a commit that referenced this pull request Sep 11, 2024
ispeters added a commit that referenced this pull request Sep 12, 2024
ispeters added a commit that referenced this pull request Sep 26, 2024
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
4 participants