[RFC] Say something about asynchronous behavior on host #685
First, thanks for the PR :) I agree this needs some clarification! But I think we are mixing up two things:
|
I agree we need to be careful here. ISO C++ defines "block" (here) as:
The latest ISO C++ draft (C++26) additionally contains a new definition of "asynchronous operation" (here) which is too large to cut-and-paste. But it contains some things that I suspect we might not want to inherit: there's a note that asynchronous operations can execute synchronously in the host thread (which I don't think holds for SYCL commands); and asynchronous operations are defined entirely in terms of senders and receivers. I've not had a chance to read through the entire draft and I may be misinterpreting these new terms, but I've seen enough to convince myself that we need to pay close attention to what's happening in C++26 before making big changes to the way we describe SYCL's execution model. |
Interesting. I think you're thinking of a slightly different problem than what I had in mind (which is another good reason to try to clarify the terminology). What I had in mind is this: Assume an OpenCL backend and some OpenCL function, say In the specific case of error handling, what all this means is that an error reported by The way that the SYCL spec currently uses "member functions are asynchronous" and "calls will be non-blocking" is a related but different problem. IMO the salient piece of information the spec should convey is whether a function is blocking or non-blocking. What work a function performs synchronously and what work the function hands off to a worker thread to perform asynchronously should be implementation-defined. |
I agree with this. However, I don't see how the text you are proposing in this PR makes this more clear. It seems like the paragraph you are adding just gives some examples about how an implementation might work. It basically says that an implementation might have an internal thread that does some of the work. I don't see why we should put that in the spec. It's not required for an implementation to have such an internal thread, and I don't see how we could add any sort of CTS to verify the paragraph that you are adding. FWIW, I do think the spec needs to be more clear about blocking and non-blocking behavior and also about the forward progress guarantees of commands (kernel invocations). At present, it's unclear whether it is legal for a SYCL kernel to synchronize with another kernel or with a host thread via atomic operations (e.g. spinning on a lock). This is related to the blocking behavior of certain host APIs because you need to understand exactly what "blocking" means in these cases in order to avoid deadlock scenarios. |
That's my comment on the issue Thomas identified, which is different from the one I'd like to address in this PR.
I'm using the RFC 2119 definition of may. A SYCL implementation is allowed to manage a backend API both synchronously and asynchronously. A SYCL application is not allowed to make assumptions about whether a particular act of backend API management occurs synchronously or asynchronously (unless explicitly specified elsewhere in the spec or in a backend spec). Both of these statements are important, but ATM neither is clear in the spec. If a reader reads the preceding section (§3.7.1.1) on all the things that a SYCL implementation manages and combines that with the "All member functions of the [context] class are synchronous"-type statements, the reader could easily come away with some very confused ideas about what happens synchronously and asynchronously in SYCL. By far the clearest section of the spec that explains the asynchronous nature of SYCL is §4.13.1. Error handling rules. That information doesn't belong in a dark corner of the programming interface chapter; that information belongs front and center in the architecture chapter. |
I agree that C++26 alignment sounds hairy and we need to tread carefully. As a minimum, though, I think it should be possible to go through the SYCL spec as it is, collect the statements it already makes about asynchronous behavior, and summarize them in the SYCL Application Execution Model section where they belong. Most of the important information is currently spread throughout chapter 4 (in the queue and error handling sections), so a reader needs to know to look for that information and to piece it together themselves. |
I notice you say "act of backend API management". Are you concerned about applications that do interop between SYCL and the underlying backend? If that is the thing you want to clarify, then I think we need to add something to the backend interop specification(s). If you are instead talking about SYCL APIs, then I disagree. The SYCL spec should make it clear which APIs are synchronous and which are asynchronous. For those that are asynchronous, the spec should clearly state what aspect of the API is asynchronous. Applications should be able to rely on this, assuming that the implementation obeys the synchronous / asynchronous behavior that is specified. |
I'm concerned with default assumptions that users and implementers can make in the absence of a specific backend spec. I'm using backend API management in the same way that it's used in §3.7.1.1.
What are your definitions for synchronous API and asynchronous API? Where are these definitions coming from? |
I don't know. It goes back to the "forward-progress" guarantee that we are promising.
I think we want this code to work... Does this code work because we make some "promise" that (of course, the same question is old if we are using "normal" command + In short: It's complicated. At least we should clarify what we mean by "blocking / non-blocking" or "sync / async". As a first step, I think we need only one term and not the two we currently use. 🤷🏽 |
There are a bunch of corner-cases in this snippet, but I think what you're asking is really just about whether the Just as you said, I believe that the SYCL specification as written forbids "eager" execution in the host thread: Section 3.9.11 says that submission of work does not block the host, which I interpret to mean that the From a correctness standpoint, a developer has to assume that the The reason I'm worried is that the new definition of "asynchronous operation" in C++26 seems to imply that an asynchronous operation should be allowed to execute in the host thread. If anybody has more information about how the ISO C++ committee decided upon this wording, I'd be interested to hear about it. I'll also ask around to see if I'm misreading anything here. |
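To make "executes in the host thread" concrete, here is a minimal sketch in plain ISO C++ (not SYCL, and not the C++26 senders/receivers model). It uses `std::async` purely as an analogy: a deferred task runs on whichever thread asks for its result, while `std::launch::async` runs it on a separate thread. The helper name `ran_on_calling_thread` is hypothetical.

```cpp
#include <cassert>
#include <future>
#include <thread>

// Hypothetical helper: reports whether a task launched with the given policy
// ended up running on the thread that requested its result.
bool ran_on_calling_thread(std::launch policy) {
    const std::thread::id caller = std::this_thread::get_id();
    // The task only records the id of the thread it runs on.
    std::future<std::thread::id> f =
        std::async(policy, [] { return std::this_thread::get_id(); });
    // With std::launch::deferred the task is executed inside get(), on this thread.
    // With std::launch::async it is executed on a separate thread.
    return f.get() == caller;
}
```

Under these assumptions, `ran_on_calling_thread(std::launch::deferred)` is true and `ran_on_calling_thread(std::launch::async)` is false. Whether anything comparable is permitted for a SYCL command is exactly the open question in this thread; the sketch takes no position on it.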
I think we're getting a little sidetracked. To what extent SYCL should/will align with the new C++ term asynchronous operation is an interesting question, but we don't need to answer it here. The fact that the C++ term asynchronous operation exists doesn't prevent SYCL from using the word asynchronous in a more generic way. For example, the SYCL spec contains this statement:
I don't think the introduction of asynchronous operation in C++26 can or should change the meaning of this statement. I think the SYCL spec can continue to use the word asynchronous as it's used in this sentence (in quite a generic way) even when C++ has introduced the specific term asynchronous operation. I'll give some context for this PR: I'm working on the spec text for the SYCL SC error model, which needs to do a number of things differently from SYCL on account of functional safety requirements. Consequently, I need to rewrite Since moving the statements is an editorial change that could, in principle, be shared by SYCL and SYCL SC, I'm trying to work out what exactly that could look like. A straight copy-paste is possible, but I think it would be better to state the general design principles in the architecture chapter so that the programming interface chapter can refer back to them. Ideally, SYCL and SYCL SC will describe host-side asynchronous behavior with the same wording, but if SYCL really wants to explain decoupled and asynchronous execution in the error handling section, then the SYCL SC spec will probably need to diverge in this regard. |
I don't disagree, I'm just urging caution. The SYCL specification already contains five instances of "asynchronous operation", because that didn't previously mean anything -- if we're not careful, we might accidentally introduce more wording that is intended to clarify behavior but actually makes things more confusing. Colloquial use of "asynchronous" is probably fine, but introducing new terminology tied to "asynchronous" is probably a bad idea at this point.
I agree that moving the statements from Section 4.13 to earlier in the specification makes sense. Personally, I'd prefer that change to introducing new wording about "resource management". |
Oh, I didn't understand the PR then. The current PR adds implementation details; AFAIK it doesn't move the description of what I like the idea of having a special section about "asynchronously" ("resource management", for me, makes me think about buffers/accessors) to describe what we mean by it (or at least putting that in the glossary section). I guess I agree: we should move "SYCL applications are asynchronous in the sense that host and device code executions are decoupled from one another except at specific points. For example, device code executions often begin when dependencies in the SYCL task graph are satisfied, which occurs asynchronously from host code execution" to an earlier section of the spec. Are we all agreeing that we are using non-blocking / asynchronous interchangeably? |
I agree that people have a tendency to mix these up, but I don't think they should be used interchangeably. Whether an API is blocking or non-blocking is about providing a contract to let a developer know whether the calling thread waits within the function for a specified event to occur, or whether the calling thread runs the function and then calls The fact that |
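The blocking / non-blocking contract described above can be sketched in plain ISO C++ (an analogy only, not SYCL). The function names `submit_nonblocking` and `run_blocking` are hypothetical stand-ins: the first hands work to another thread and returns a handle straight away, the second keeps the calling thread inside the function until the result exists.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical non-blocking call: the work is handed to another thread and
// the caller gets a handle back without waiting for the work to finish.
std::future<int> submit_nonblocking(int x) {
    return std::async(std::launch::async, [x] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        return x * 2;
    });
}

// Hypothetical blocking call: the calling thread waits inside the function
// until the result is available.
int run_blocking(int x) {
    return submit_nonblocking(x).get();
}
```

In both cases the work itself runs asynchronously with respect to the caller; the only difference is whether the caller waits inside the function. That is why the two pairs of terms are related but not interchangeable.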
I see. (This is similar to concurrency versus parallelism, I suppose). Thanks for the clarification! |
I borrowed the term resource management from the preceding section ( What I'd like to achieve in this PR is that by the end of chapter 3, the reader understands that some interactions between the SYCL runtime and a backend happen synchronously and some interactions happen asynchronously (WRT a user thread). The general rule is that the SYCL user cannot make assumptions about when and where a SYCL runtime calls any given backend API function. There can be exceptions to this general rule, particularly when backend specs are involved. Then the error handling explanation in chapter 4 becomes a simple case of saying something to the effect of: "Some things happen synchronously and some happen asynchronously. This is why there are synchronous and asynchronous exceptions, and a SYCL application needs to deal with both. Except where explicitly specified, it is implementation-defined whether a given exception is thrown synchronously or asynchronously. See [text in chapter 3]." Is there consensus that |
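As a rough illustration of why both synchronous and asynchronous errors exist, here is a toy sketch in plain ISO C++. It is not the SYCL `queue` class and does not reflect how any SYCL implementation works; `toy_queue`, `submit`, and `wait_and_collect` are hypothetical names. Errors detected while the caller is still inside `submit` are thrown synchronously; errors raised later by the work are captured and surface only at a blocking wait.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <future>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical toy queue, for illustration only.
class toy_queue {
public:
    // Non-blocking: an invalid argument is reported synchronously; a failure
    // inside the task is captured by the future and reported later.
    void submit(std::function<void()> task) {
        if (!task) throw std::invalid_argument("empty task");  // synchronous error
        pending_.push_back(std::async(std::launch::async, std::move(task)));
    }

    // Blocking: waits for every submitted task and returns the errors that
    // were raised asynchronously while the tasks ran.
    std::vector<std::exception_ptr> wait_and_collect() {
        std::vector<std::exception_ptr> errors;
        for (auto& f : pending_) {
            try {
                f.get();
            } catch (...) {
                errors.push_back(std::current_exception());
            }
        }
        pending_.clear();
        return errors;
    }

private:
    std::vector<std::future<void>> pending_;
};
```

Which errors fall into which category is a design choice of this toy; in the wording proposed above, that split would be implementation-defined except where the spec says otherwise.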
I'm trying very hard to not open that can of worms in this PR 😁
I completely agree. IMO the terms blocking and non-blocking are the correct ones for describing SYCL functions. I don't think the terms synchronous and asynchronous provide a meaningful distinction when applied to an API function. SYCL is fundamentally an asynchronous API. Every SYCL API function has some effect synchronously. The majority of SYCL API functions will also have effects that are realized later, asynchronously. E.g., the options that a buffer is constructed with could later influence how a kernel using that buffer is submitted to a backend, so even though a buffer constructor doesn't block, its effects are realized both synchronously and asynchronously. Implementations need latitude to decide what happens synchronously and what happens asynchronously (for performance reasons, to accommodate different backends, etc), so SYCL needs to be careful not to over-specify this. |
Can you say in more detail which statements in section 4.13 you think are general in nature and don't belong in that section? This might help understand the motivation for this PR. |
I specifically have the following paragraph in mind (the second in the section). It's phrased WRT exception handling, but it's the most complete description of asynchronous (or "decoupled") behavior I've found in the spec. It's surprising that the SYCL Application Execution Model section doesn't contain this information.
|
I suggest that we instead tweak the introductory paragraph to section 3.7.1. SYCL application execution model. The last two sentences currently say:
Let's change this to:
|
I think this is a lot better than what we have. Being really nit-picky, I think the very last sentence here could say something like:
I read "once" as meaning "as soon as", and there's no guarantee that the command begins executing right away. |
Using "after" instead of "once" seems better. The part about "at some point in the future" seems unnecessarily verbose. How about just:
|
You're right, my proposal was unnecessarily verbose. This looks good to me. |
The idea that (most practical) SYCL implementations will do some things asynchronously on the host (either by using a worker thread spawned by the SYCL runtime, or by relying on a worker thread owned by a backend, or both) is central to SYCL. The idea is implicit throughout the whole spec, but there isn't a single place where it's authoritatively spelled out.
The first place where the word asynchronous appears is in §3.9.9 Error handling, and IMO this is logically too late in the spec. Error handling shouldn't be used to introduce the idea of asynchronicity; the reader should go into the error handling section already understanding that SYCL has asynchronicity.
I suspect that the idea of asynchronicity is so obvious to everyone who's worked on the SYCL spec that it's been taken for granted, but I don't think we should assume that it will be obvious to all readers.
I've taken a stab at adding a (hopefully uncontroversial) paragraph to §3.7.1. SYCL application execution model just to spell out that yes, some things might happen asynchronously. I think there's scope for folding the paragraph into the previous subsection, or expanding it, or moving it somewhere else entirely. There are a number of places in the spec (like §3.9.9 Error handling) that could be made clearer and more succinct if they were to link to this new paragraph. There could also be scope for tying in the wording on asynchronicity with some of the forward progress wording.
I'm curious to hear people's thoughts.