- Proposed
- Prototype
- Implementation
- Specification
C# has support for iterator methods and async methods, but no support for a method that is both an iterator and an async method. We should rectify this by allowing for await
to be used in a new form of async
iterator, one that returns an IAsyncEnumerable<T>
or IAsyncEnumerator<T>
rather than an IEnumerable<T>
or IEnumerator<T>
, with IAsyncEnumerable<T>
consumable in a new foreach await
. An IAsyncDisposable
interface is also used to enable asynchronous cleanup.
There has been much discussion of IAsyncDisposable
(e.g. dotnet/roslyn#114) and whether it's a good idea. However, it's a required concept to add in support of async iterators. Since finally
blocks may contain await
s, and since finally
blocks need to be run as part of disposing of iterators, we need async disposal. It's also just generally useful any time cleaning up of resources might take any period of time, e.g. closing files (requiring flushes), deregistering callbacks and providing a way to know when deregistration has completed, etc.
The following interface is added to the core .NET libraries (e.g. System.Private.CoreLib / System.Runtime):
namespace System
{
public interface IAsyncDisposable
{
ValueTask DisposeAsync();
}
}
As with Dispose
, invoking DisposeAsync
multiple times is acceptable, and subsequent invocations after the first should be treated as nops, returning a synchronously completed successful task (DisposeAsync
need not be thread-safe, though, and need not support concurrent invocation). Further, types may implement both IDisposable
and IAsyncDisposable
, and if they do, it's similarly acceptable to invoke Dispose
and then DisposeAsync
or vice versa, but only the first should be meaningful and subsequent invocations of either should be a nop. As such, if a type does implement both, consumers are encouraged to call once and only once the more relevant method based on the context, Dispose
in synchronous contexts and DisposeAsync
in asynchronous ones.
(I'm leaving discussion of how IAsyncDisposable
interacts with using
to a separate discussion. And coverage of how it interacts with foreach
is handled later in this proposal.)
Alternatives considered:
DisposeAsync
accepting aCancellationToken
: while in theory it makes sense that anything async can be canceled, disposal is about cleanup, closing things out, free'ing resources, etc., which is generally not something that should be canceled; cleanup is still important for work that's canceled. The sameCancellationToken
that caused the actual work to be canceled would typically be the same token passed toDisposeAsync
, makingDisposeAsync
worthless because cancellation of the work would causeDisposeAsync
to be a nop. If someone wants to avoid being blocked waiting for disposal, they can avoid waiting on the resultingValueTask
, or wait on it only for some period of time.DisposeAsync
returning aTask
: Now that a non-genericValueTask
exists and can be constructed from anIValueTaskSource
, returningValueTask
fromDisposeAsync
allows an existing object to be reused as the promise representing the eventual async completion ofDisposeAsync
, saving aTask
allocation in the case whereDisposeAsync
completes asynchronously.- Configuring
DisposeAsync
with abool continueOnCapturedContext
(ConfigureAwait
): While there may be issues related to how such a concept is exposed tousing
,foreach
, and other language constructs that consume this, from an interface perspective it's not actually doing anyawait
'ing and there's nothing to configure... consumers of theValueTask
can consume it however they wish. IAsyncDisposable
inheritingIDisposable
: Since only one or the other should be used, it doesn't make sense to force types to implement both.IDisposableAsync
instead ofIAsyncDisposable
: We've been following the naming that things/types are an "async something" whereas operations are "done async", so types have "Async" as a prefix and methods have "Async" as a suffix.
Two interfaces are added to the core .NET libraries:
namespace System.Collections.Generic
{
public interface IAsyncEnumerable<out T>
{
IAsyncEnumerator<T> GetAsyncEnumerator();
}
public interface IAsyncEnumerator<out T> : IAsyncDisposable
{
ValueTask<bool> MoveNextAsync();
T Current { get; }
}
}
Typical consumption (without additional language features) would look like:
IAsyncEnumerator<T> enumerator = enumerable.GetAsyncEnumerator();
try
{
while (await enumerator.MoveNextAsync())
{
Use(enumerator.Current);
}
}
finally { await enumerator.DisposeAsync(); }
Discarded options considered:
Task<bool> MoveNextAsync(); T current { get; }
: UsingTask<bool>
would support using a cached task object to represent synchronous, successfulMoveNextAsync
calls, but an allocation would still be required for asynchronous completion. By returningValueTask<bool>
, we enable the enumerator object to itself implementIValueTaskSource<bool>
and be used as the backing for theValueTask<bool>
returned fromMoveNextAsync
, which in turn allows for significantly reduced overheads.ValueTask<(bool, T)> MoveNextAsync();
: It's not only harder to consume, but it means thatT
can no longer be covariant.ValueTask<T?> TryMoveNextAsync();
: Not covariant.Task<T?> TryMoveNextAsync();
: Not covariant, allocations on every call, etc.ITask<T?> TryMoveNextAsync();
: Not covariant, allocations on every call, etc.ITask<(bool,T)> TryMoveNextAsync();
: Not covariant, allocations on every call, etc.Task<bool> TryMoveNextAsync(out T result);
: Theout
result would need to be set when the operation returns synchronously, not when it asynchronously completes the task potentially sometime long in the future, at which point there'd be no way to communicate the result.IAsyncEnumerator<T>
not implementingIAsyncDisposable
: We could choose to separate these. However, doing so complicates certain other areas of the proposal, as code must then be able to deal with the possibility that an enumerator doesn't provide disposal, which makes it difficult to write pattern-based helpers. Further, it will be common for enumerators to have a need for disposal (e.g. any C# async iterator that has a finally block, most things enumerating data from a network connection, etc.), and if one doesn't, it is simple to implement the method purely aspublic ValueTask DisposeAsync() => default(ValueTask);
with minimal additional overhead.
namespace System.Collections.Generic
{
public interface IAsyncEnumerable<out T>
{
IAsyncEnumerator<T> GetAsyncEnumerator();
}
public interface IAsyncEnumerator<out T> : IAsyncDisposable
{
ValueTask<bool> WaitForNextAsync();
T TryGetNext(out bool success);
}
}
TryGetNext
is used in an inner loop to consume items with a single interface call as long as they're available synchronously. When the next item can't be retrieved synchronously, it returns false, and any time it returns false, a caller must subsequently invoke WaitForNextAsync
to either wait for the next item to be available or to determine that there will never be another item. Typical consumption (without additional language features) would look like:
IAsyncEnumerator<T> enumerator = enumerable.GetAsyncEnumerator();
try
{
while (await enumerator.WaitForNextAsync())
{
while (true)
{
int item = enumerator.TryGetNext(out bool success);
if (!success) break;
Use(item);
}
}
}
finally { await enumerator.DisposeAsync(); }
The advantage of this is two-fold, one minor and one major:
- Minor: Allows for an enumerator to support multiple consumers. There may be scenarios where it's valuable for an enumerator to support multiple concurrent consumers. That can't be achieved when
MoveNextAsync
andCurrent
are separate such that an implementation can't make their usage atomic. In contrast, this approach provides a single methodTryGetNext
that supports pushing the enumerator forward and getting the next item, so the enumerator can enable atomicity if desired. However, it's likely that such scenarios could also be enabled by giving each consumer its own enumerator from a shared enumerable. Further, we don't want to enforce that every enumerator support concurrent usage, as that would add non-trivial overheads to the majority case that doesn't require it, which means a consumer of the interface generally couldn't rely on this any way. - Major: Performance. The
MoveNextAsync
/Current
approach requires two interface calls per operation, whereas the best case forWaitForNextAsync
/TryGetNext
is that most iterations complete synchronously, enabling a tight inner loop withTryGetNext
, such that we only have one interface call per operation. This can have a measurable impact in situations where the interface calls dominate the computation.
However, there are non-trivial downsides, including significantly increased complexity when consuming these manually, and an increased chance of introducing bugs when using them. And while the performance benefits show up in microbenchmarks, we don't believe they'll be impactful in the vast majority of real usage. If it turns out they are, we can introduce a second set of interfaces in a light-up fashion.
Discarded options considered:
ValueTask<bool> WaitForNextAsync(); bool TryGetNext(out T result);
:out
parameters can't be covariant. There's also a small impact here (an issue with the try pattern in general) that this likely incurs a runtime write barrier for reference type results.
There are several possible approaches to supporting cancellation:
IAsyncEnumerable<T>
/IAsyncEnumerator<T>
are cancellation-agnostic:CancellationToken
doesn't appear anywhere. Cancellation is achieved by logically baking theCancellationToken
into the enumerable and/or enumerator in whatever manner is appropriate, e.g. when calling an iterator, passing theCancellationToken
as an argument to the iterator method and using it in the body of the iterator, as is done with any other parameter.IAsyncEnumerator<T>.GetAsyncEnumerator(CancellationToken)
: You pass aCancellationToken
toGetAsyncEnumerator
, and subsequentMoveNextAsync
operations respect it however it can.IAsyncEnumerator<T>.MoveNextAsync(CancellationToken)
: You pass aCancellationToken
to each individualMoveNextAsync
call.- 1 && 2: You both embed
CancellationToken
s into your enumerable/enumerator and passCancellationToken
s intoGetAsyncEnumerator
. - 1 && 3: You both embed
CancellationToken
s into your enumerable/enumerator and passCancellationToken
s intoMoveNextAsync
.
From a purely theoretical perspective, (5) is the most robust, in that (a) MoveNextAsync
accepting a CancellationToken
enables the most fine-grained control over what's canceled, and (b) CancellationToken
is just any other type that can passed as an argument into iterators, embedded in arbitrary types, etc.
However, there are multiple problems with that approach:
- How does a
CancellationToken
passed toGetAsyncEnumerator
make it into the body of the iterator? We could expose a newiterator
keyword that you could dot off of to get access to theCancellationToken
passed toGetEnumerator
, but a) that's a lot of additional machinery, b) we're making it a very first-class citizen, and c) the 99% case would seem to be the same code both calling an iterator and callingGetAsyncEnumerator
on it, in which case it can just pass theCancellationToken
as an argument into the method. - How does a
CancellationToken
passed toMoveNextAsync
get into the body of the method? This is even worse, as if it's exposed off of aniterator
local object, its value could change across awaits, which means any code that registered with the token would need to unregister from it prior to awaits and then re-register after; it's also potentially quite expensive to need to do such registering and unregistering in everyMoveNextAsync
call, regardless of whether implemented by the compiler in an iterator or by a developer manually. - How does a developer cancel a
foreach
loop? If it's done by giving aCancellationToken
to an enumerable/enumerator, then either a) we need to supportforeach
'ing over enumerators, which raises them to being first-class citizens, and now you need to start thinking about an ecosystem built up around enumerators (e.g. LINQ methods) or b) we need to embed theCancellationToken
in the enumerable anyway by having someWithCancellation
extension method off ofIAsyncEnumerable<T>
that would store the provided token and then pass it into the wrapped enumerable'sGetAsyncEnumerator
when theGetAsyncEnumerator
on the returned struct is invoked (ignoring that token). Or, you can just use theCancellationToken
you have in the body of the foreach. - If/when query comprehensions are supported, how would the
CancellationToken
supplied toGetEnumerator
orMoveNextAsync
be passed into each clause? The easiest way would simply be for the clause to capture it, at which point whatever token is passed toGetAsyncEnumerator
/MoveNextAsync
is ignored.
Due to all of this, the simplest and most consistent solution is simply to do (1): IAsyncEnumerable<T>
/IAsyncEnumerator<T>
are cancellation-agnostic. If you want to cancel a foreach
loop, you can use a CancellationToken
in the body and in any methods you call:
CancellationToken ct = ...;
foreach await (var i in GetData())
{
ct.ThrowIfCancellationRequested();
await UseAsync(i, ct);
...
}
If you want to pass a CancellationToken
into an iterator, you simply do so as an argument, just as with other async
methods:
static async IAsyncEnumerable<T> GetData(CancellationToken cancellationToken = default)
{
using (cancellationToken.Register(...))
{
...
await ...
...
}
}
...
foreach await (T i in GetData(ct))
{
...
}
If you want to use a CancellationToken
in a query comprehension, you just capture it:
CancellationToken ct = ...
IAsyncEnumerable<string> results = from url in source
select DownloadAsync(url, ct);
Etc.
For now, we should pursue (1).
foreach
will be augmented to support IAsyncEnumerable<T>
in addition to its existing support for IEnumerable<T>
. And it will support the equivalent of IAsyncEnumerable<T>
as a pattern if the relevant members are exposed publicly, falling back to using the interface directly if not, in order to enable struct-based extensions that avoid allocating as well as using alternative awaitables as the return type of MoveNextAsync
and DisposeAsync
.
Using the syntax:
foreach (var i in enumerable)
C# will continue to treat enumerable
as a synchronous enumerable, such that even if it exposes the relevant APIs for async enumerables (exposing the pattern or implementing the interface), it will only consider the synchronous APIs.
To force foreach
to instead only consider the asynchronous APIs, await
is inserted as follows:
foreach await (var i in enumerable)
No syntax would be provided that would support using either the async or the sync APIs; the developer must choose based on the syntax used.
Discarded options considered:
foreach (var i in await enumerable)
: This is already valid syntax, and changing its meaning would be a breaking change. This means toawait
theenumerable
, get back something synchronously iterable from it, and then synchronously iterate through that.foreach (var i await in enumerable)
,foreach (var await i in enumerable)
,foreach (await var i in enumerable)
: These all suggest that we're awaiting the next item, but there are other awaits involved in foreach, in particular if the enumerable is anIAsyncDisposable
, we will beawait
'ing its async disposal. That await is as the scope of the foreach rather than for each individual element, and thus theawait
keyword deserves to be at theforeach
level. Further, having it associated with theforeach
gives us a way to describe theforeach
with a different term, e.g. a "foreach await". But more importantly, there's value in consideringforeach
syntax at the same time asusing
syntax, so that they remain consistent with each other, andusing (await ...)
is already valid syntax.await foreach (var i in enumerable)
: This suggests that the entireforeach
is somehow returning something that's beingawait
'd, but it's not.
Still to consider:
foreach
today does not support iterating through an enumerator. We expect it will be more common to haveIAsyncEnumerator<T>
s handed around, and thus it's tempting to supportforeach await
with bothIAsyncEnumerable<T>
andIAsyncEnumerator<T>
. But once we add such support, it introduces the question of whetherIAsyncEnumerator<T>
is a first-class citizen, and whether we need to have overloads of combinators that operate on enumerators in addition to enumerables? Do we want to encourage methods to return enumerators rather than enumerables? We should continue to discuss this. If we decide we don't want to support it, we might want to introduce an extension methodpublic static IAsyncEnumerable<T> AsEnumerable<T>(this IAsyncEnumerator<T> enumerator);
that would allow an enumerator to still beforeach
'd. If we decide we do want to support it, we'll need to also decide on whether theforeach await
would be responsible for callingDisposeAsync
on the enumerator, and the answer is likely "no, control over disposal should be handled by whoever calledGetEnumerator
."
The compiler will bind to the pattern-based APIs if they exist, preferring those over using the interface (the pattern may be satisfied with instance methods or extension methods). The requirements for the pattern are:
- The enumerable must expose a
GetAsyncEnumerator
method that may be called with no arguments and that returns an enumerator that meets the relevant pattern. - The enumerator must expose a
MoveNextAsync
method that may be called with no arguments and that returns something which may beawait
ed and whoseGetResult()
returns abool
. - The enumerator must also expose
Current
property whose getter returns aT
representing the kind of data being enumerated. - The enumerator may optionally expose a
DisposeAsync
method that may be invoked with no arguments and that returns something that can beawait
ed and whoseGetResult()
returnsvoid
.
This code:
var enumerable = ...;
foreach await (T item in enumerable)
{
...
}
is translated to the equivalent of:
var enumerable = ...;
var enumerator = enumerable.GetAsyncEnumerator();
try
{
while (await enumerator.MoveNextAsync())
{
T item = enumerator.Current;
...
}
}
finally
{
await enumerator.DisposeAsync(); // omitted, along with the try/finally, if the enumerator doesn't expose DisposeAsync
}
If the iterated type doesn't expose the right pattern, the interfaces will be used.
This pattern-based compilation will allow ConfigureAwait
to be used on all of the awaits, via a ConfigureAwait
extension method:
foreach await (T item in enumerable.ConfigureAwait(false))
{
...
}
This will be based on types we'll add to .NET as well, likely to System.Threading.Tasks.Extensions.dll:
// Approximate implementation, omitting arg validation and the like
namespace System.Threading.Tasks
{
public static class AsyncEnumerableExtensions
{
public static ConfiguredAsyncEnumerable<T> ConfigureAwait<T>(this IAsyncEnumerable<T> enumerable, bool continueOnCapturedContext) =>
new ConfiguredAsyncEnumerable<T>(enumerable, continueOnCapturedContext);
public struct ConfiguredAsyncEnumerable<T>
{
private readonly IAsyncEnumerable<T> _enumerable;
private readonly bool _continueOnCapturedContext;
internal ConfiguredAsyncEnumerable(IAsyncEnumerable<T> enumerable, bool continueOnCapturedContext)
{
_enumerable = enumerable;
_continueOnCapturedContext = continueOnCapturedContext;
}
public ConfiguredAsyncEnumerator<T> GetAsyncEnumerator() =>
new ConfiguredAsyncEnumerator<T>(_enumerable.GetAsyncEnumerator(), _continueOnCapturedContext);
public struct Enumerator
{
private readonly IAsyncEnumerator<T> _enumerator;
private readonly bool _continueOnCapturedContext;
internal Enumerator(IAsyncEnumerator<T> enumerator, bool continueOnCapturedContext)
{
_enumerator = enumerator;
_continueOnCapturedContext = continueOnCapturedContext;
}
public ConfiguredValueTaskAwaitable<bool> MoveNextAsync() =>
_enumerator.MoveNextAsync().ConfigureAwait(_continueOnCapturedContext);
public T Current => _enumerator.Current;
public ConfiguredValueTaskAwaitable DisposeAsync() =>
_enumerator.DisposeAsync().ConfigureAwait(_continueOnCapturedContext);
}
}
}
}
Note that this approach will not enable ConfigureAwait
to be used with pattern-based enumerables, but then again it's already the case that the ConfigureAwait
is only exposed as an extension on Task
/Task<T>
/ValueTask
/ValueTask<T>
and can't be applied to arbitrary awaitable things, as it only makes sense when applied to Tasks (it controls a behavior implemented in Task's continuation support), and thus doesn't make sense when using a pattern where the awaitable things may not be tasks. Anyone returning awaitable things can provide their own custom behavior in such advanced scenarios.
(If we can come up with some way to support a scope- or assembly-level ConfigureAwait
solution, then this won't be necessary.)
The language / compiler will support producing IAsyncEnumerable<T>
s and IAsyncEnumerator<T>
s in addition to consuming them. Today the language supports writing an iterator like:
static IEnumerable<int> MyIterator()
{
try
{
for (int i = 0; i < 100; i++)
{
Thread.Sleep(1000);
yield return i;
}
}
finally
{
Thread.Sleep(200);
Console.WriteLine("finally");
}
}
but await
can't be used in the body of these iterators. We will add that support.
The existing language support for iterators infers the iterator nature of the method based on whether it contains any yield
s. The same will be true for async iterators. Such async iterators will be demarcated and differentiated from synchronous iterators via adding async
to the signature, and must then also have either IAsyncEnumerable<T>
or IAsyncEnumerator<T>
as its return type. For example, the above example could be written as an async iterator as follows:
static async IAsyncEnumerable<int> MyIterator()
{
try
{
for (int i = 0; i < 100; i++)
{
await Task.Delay(1000);
yield return i;
}
}
finally
{
await Task.Delay(200);
Console.WriteLine("finally");
}
}
Alternatives considered:
- Not using
async
in the signature: Usingasync
is likely technically required by the compiler, as it uses it to determine whetherawait
is valid in that context. But even if it's not required, we've established thatawait
may only be used in methods marked asasync
, and it seems important to keep the consistency. - Enabling custom builders for
IAsyncEnumerable<T>
: That's something we could look at for the future, but the machinery is complicated and we don't support that for the synchronous counterparts. - Having an
iterator
keyword in the signature: Async iterators would useasync iterator
in the signature, andyield
could only be used inasync
methods that includediterator
;iterator
would then be made optional on synchronous iterators. Depending on your perspective, this has the benefit of making it very clear by the signature of the method whetheryield
is allowed and whether the method is actually meant to return instances of typeIAsyncEnumerable<T>
rather than the compiler manufacturing one based on whether the code usesyield
or not. But it is different from synchronous iterators, which don't and can't be made to require one. Plus some developers don't like the extra syntax. If we were designing it from scratch, we'd probably make this required, but at this point there's much more value in keeping async iterators close to sync iterators.
There are over ~200 overloads of methods on the System.Linq.Enumerable
class, all of which work in terms of IEnumerable<T>
; some of these accept IEnumerable<T>
, some of them produce IEnumerable<T>
, and many do both. Adding LINQ support for IAsyncEnumerable<T>
would likely entail duplicating all of these overloads for it, for another ~200. And since IAsyncEnumerator<T>
is likely to be more common as a standalone entity in the asynchronous world than IEnumerator<T>
is in the synchronous world, we could potentially need another ~200 overloads that work with IAsyncEnumerator<T>
. Plus, a large number of the overloads deal with predicates (e.g. Where
that takes a Func<T, bool>
), and it may be desirable to have IAsyncEnumerable<T>
-based overloads that deal with both synchronous and asynchronous predicates (e.g. Func<T, ValueTask<bool>>
in addition to Func<T, bool>
). While this isn't applicable to all of the now ~400 new overloads, a rough calculation is that it'd be applicable to half, which means another ~200 overloads, for a total of ~600 new methods.
That is a staggering number of APIs, with the potential for even more when extension libraries like Interactive Extensions (Ix) is considered. But Ix already has an implementation of many of these, and there doesn't seem to be a great reason to duplicate that work; we should instead help the community improve Ix and recommend it for when developers want to use LINQ with IAsyncEnumerable<T>
.
There is also the issue of query comprehension syntax. The pattern-based nature of query comprehensions would allow them to "just work" with some operators, e.g. if Ix provides the following methods:
public static IAsyncEnumerable<TResult> Select<TSource, TResult>(this IAsyncEnumerable<TSource> source, Func<TSource, TResult> func);
public static IAsyncEnumerable<T> Where(this IAsyncEnumerable<T> source, Func<T, bool> func);
then this C# code will "just work":
IAsyncEnumerable<int> enumerable = ...;
IAsyncEnumerable<int> result = from item in enumerable
where item % 2 == 0
select item * 2;
However, there is no query comprehension syntax that supports using await
in the clauses, so if Ix added, for example:
public static IAsyncEnumerable<TResult> Select<TSource, TResult>(this IAsyncEnumerable<TSource> source, Func<TSource, ValueTask<TResult>> func);
then this would "just work":
IAsyncEnumerable<string> result = from url in urls
where item % 2 == 0
select SomeAsyncMethod(item);
async ValueTask<int> SomeAsyncMethod(int item)
{
await Task.Yield();
return item * 2;
}
but there'd be no way to write it with the await
inline in the select
clause. As a separate effort, we could look into adding async { ... }
expressions to the language, at which point we could allow them to be used in query comprehensions and the above could instead be written as:
IAsyncEnumerable<int> result = from item in enumerable
where item % 2 == 0
select async
{
await Task.Yield();
return item * 2;
};
or to enabling await
to be used directly in expressions, such as by supporting async from
. However, it's unlikely a design here would impact the rest of the feature set one way or the other, and this isn't a particularly high-value thing to invest in right now, so the proposal is to do nothing additional here right now.
Integration with IObservable<T>
and other asynchronous frameworks (e.g. reactive streams) would be done at the library level rather than at the language level. For example, all of the data from an IAsyncEnumerator<T>
can be published to an IObserver<T>
simply by foreach await
'ing over the enumerator and OnNext
'ing the data to the observer, so an AsObservable<T>
extension method is possible. Consuming an IObservable<T>
in a foreach await
requires buffering the data (in case another item is pushed while the previous item is still being processing), but such a push-pull adapter can easily be implemented to enable an IObservable<T>
to be pulled from with an IAsyncEnumerator<T>
. Etc. Rx/Ix already provide prototypes of such implementations, and libraries like https://github.com/dotnet/corefx/tree/master/src/System.Threading.Channels provide various kinds of buffering data structures. The language need not be involved at this stage.