Non-blocking clEnqueueWriteBuffer does not work in the runtime #204
Comments
I suspect this may be because the [...]. In general, the specification claims that the application cannot use the target pointer of a non-blocking write before waiting on the event. I originally took this to be relevant only for host pointers, but the spec doesn't specify this. It just says "the application" and "the memory pointed". Link to relevant docs:
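To make the rule above concrete, here is a toy model in plain C. It is not the OpenCL API: `ToyBuffer`, `toy_write_nonblocking` and `toy_wait` are hypothetical names invented for illustration, and the model assumes the copy is deferred until the wait, which is only one of the behaviours a real implementation may show. It sketches why touching the host memory before the write has completed can put the wrong data on the device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model, NOT the OpenCL API: a "device buffer" plus one pending write
   that records the host pointer and copies from it only on completion. */
typedef struct {
    int device[4];       /* stand-in for device memory */
    const int *pending;  /* host pointer captured by the non-blocking write */
    size_t count;        /* number of ints still to be copied */
} ToyBuffer;

/* Models a non-blocking write: remembers the pointer, copies nothing yet. */
void toy_write_nonblocking(ToyBuffer *b, const int *host, size_t count) {
    assert(count <= 4);
    b->pending = host;
    b->count = count;
}

/* Models waiting on the write's event: the deferred copy happens here. */
void toy_wait(ToyBuffer *b) {
    if (b->pending != NULL) {
        memcpy(b->device, b->pending, b->count * sizeof(int));
        b->pending = NULL;
        b->count = 0;
    }
}

/* Host overwrites its data before waiting: the device sees the new value. */
int toy_racy_sequence(void) {
    ToyBuffer b = {{0}, NULL, 0};
    int host[4] = {1, 2, 3, 4};
    toy_write_nonblocking(&b, host, 4);
    host[0] = 99;        /* host memory reused before the write completed */
    toy_wait(&b);
    return b.device[0];
}

/* Host waits first and only then reuses its memory: the device keeps 1. */
int toy_safe_sequence(void) {
    ToyBuffer b = {{0}, NULL, 0};
    int host[4] = {1, 2, 3, 4};
    toy_write_nonblocking(&b, host, 4);
    toy_wait(&b);
    host[0] = 99;        /* safe: the copy has already happened */
    return b.device[0];
}
```

In this model `toy_racy_sequence()` yields 99 and `toy_safe_sequence()` yields 1. With a real non-blocking `clEnqueueWriteBuffer` the racy case is not deterministic; the model only picks one possible outcome.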
This corroborates what you said: https://community.intel.com/t5/OpenCL-for-CPU/clEnqueueWriteBuffer-does-not-finish-before-Kernel/td-p/1077032
If I understand your speculations here correctly, you think there might be an issue with the synchronization of the memory operation and the kernel execution. If you are using different command queues for memory operations and kernels, then you must use events to synchronize them. If you use only a single command queue for both (and you configure it to be in-order), then every command will wait until all prior commands have finished before it executes.
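The single in-order queue case described above can be sketched with another toy model in plain C. Again, this is not OpenCL: `ToyQueue`, `toy_enqueue`, `toy_finish` and the doubling "kernel" are hypothetical and exist only to illustrate the ordering claim, assuming the queue simply runs its commands first-in, first-out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, NOT the OpenCL API: three command kinds on one FIFO queue. */
typedef enum { CMD_WRITE, CMD_KERNEL, CMD_READ } CmdKind;

typedef struct {
    CmdKind kind;
    int *host;           /* host pointer for WRITE/READ, unused for KERNEL */
} Cmd;

typedef struct {
    Cmd cmds[8];
    size_t len;
    int device;          /* one-int stand-in for a device buffer */
} ToyQueue;

/* Enqueue only records the command; nothing executes yet. */
void toy_enqueue(ToyQueue *q, CmdKind kind, int *host) {
    assert(q->len < 8);
    q->cmds[q->len].kind = kind;
    q->cmds[q->len].host = host;
    q->len++;
}

/* Drain the queue strictly in FIFO order, as an in-order queue would. */
void toy_finish(ToyQueue *q) {
    for (size_t i = 0; i < q->len; i++) {
        Cmd c = q->cmds[i];
        switch (c.kind) {
        case CMD_WRITE:  q->device = *c.host; break;  /* host -> device */
        case CMD_KERNEL: q->device *= 2;      break;  /* hypothetical kernel */
        case CMD_READ:   *c.host = q->device; break;  /* device -> host */
        }
    }
    q->len = 0;
}

/* Write, kernel, read on one queue: the read observes both earlier commands. */
int toy_run_in_order(int input) {
    ToyQueue q;
    int in = input;
    int out = 0;
    q.len = 0;
    q.device = 0;
    toy_enqueue(&q, CMD_WRITE, &in);
    toy_enqueue(&q, CMD_KERNEL, NULL);
    toy_enqueue(&q, CMD_READ, &out);
    toy_finish(&q);
    return out;
}
```

Here `toy_run_in_order(21)` returns 42 because the write and the kernel both run before the read. Note that the host variable `in` is left untouched until the queue has drained, which is the part the ordering guarantee alone does not cover.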
Mh. I am getting contradictory statements from the documentation.
On the other hand, the documentation for https://www.khronos.org/registry/OpenCL/sdk/1.0/docs/man/xhtml/clEnqueueWriteBuffer.html states prosaically
Could it be that for kernel launches an in-order queue guarantees sequential execution, but the same is not true for a non-blocking [...]? I think I may experiment on the side and see.
EDIT: A bit later, the docs on the command queue state
Which is a bit vague (strictly speaking, it doesn't tell me what happens when OUT_OF_ORDER is disabled, or what happens if the events are BEFORE the kernel, as it said "after" the enqueuing of a kernel) but seems to imply @michel-steuwer a bit more. If that's the case, I am really not sure.
@fedepiz @michel-steuwer Here is a log of the runtime and associated OpenCL calls for this example:
In the "one copy" buffer runtime, `deviceBufferSync` uses `clEnqueueWriteBuffer`. I would have thought that we could use a non-blocking call there, but that produces bugs (e.g. in #203, output is fixed by changing to a blocking call 3eb189a).
The idea was that before accessing the ptr on the host, there should be a blocking call to `clEnqueueReadBuffer` via `hostBufferSync` (which should be ordered after the `clEnqueueWriteBuffer` and most likely a kernel call on the command queue).
Any idea why the non-blocking call does not work and how it could be fixed? @fedepiz @michel-steuwer
The code where things go wrong: