Skip to content

Commit f103b4a

Browse files
author
Ruyman
authored
[SYCL] Interop Task R2 (#105)
* `get_buffer` renamed to `get_mem` * Clarified wording on `get_queue` and `get_mem` * `interop_handle` is passed by value to the lambda instead of reference
1 parent ae38f0d commit f103b4a

File tree

1 file changed

+65
-14
lines changed

1 file changed

+65
-14
lines changed

interop_task/interop_task.md

Lines changed: 65 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -2,11 +2,12 @@
22
|-------------|--------|
33
| Name | Interop Task |
44
| Date of Creation | 16 January 2019 |
5+
| Revision | 0.2 |
56
| Target | Vendor extension |
6-
| Current Status | _Availalable since CE 1.0.5_ |
7+
| Current Status | 0.1 _Availalable since CE 1.0.5_, 0.2 TBD |
78
| Reply-to | Victor Lomüller <victor@codeplay.com> |
89
| Original author | Victor Lomüller <victor@codeplay.com>, Gordon Brown <gordon@codeplay.com>, Peter Zuzek <peter@codeplay.com> |
9-
| Contributors | Victor Lomüller <victor@codeplay.com>, Gordon Brown <gordon@codeplay.com>, Peter Zuzek <peter@codeplay.com> |
10+
| Contributors | Victor Lomüller <victor@codeplay.com>, Gordon Brown <gordon@codeplay.com>, Peter Zuzek <peter@codeplay.com>, Ruyman Reyes <ruyman@codeplay.com> |
1011

1112
# interop_task: Improving SYCL-OpenCL Interoperability
1213

@@ -16,6 +17,18 @@ SYCL does not allow a user to access cl_mem object out of an cl::sycl::accessor,
1617

1718
This proposal introduces a way for a user to retrieve the low-level objects associated with SYCL buffers and enqueue a host task that can execute an arbitrary portion of host code within the SYCL runtime, therefore taking advantage of SYCL dependency analysis and scheduling.
1819

20+
## Revisions
21+
22+
### 0.2
23+
24+
* `get_buffer` renamed to `get_mem`
25+
* Clarified wording on `get_queue` and `get_mem`
26+
* `interop_handle` is passed by value to the lambda instead of reference
27+
28+
### 0.1
29+
30+
Initial proposal
31+
1932
## Accessing low-level API functionality on SYCL queues
2033

2134
We introduce a new type of handler, the **codeplay::handler**, which includes a new
@@ -28,7 +41,7 @@ thread to continue submitting command groups).
2841
Other command groups enqueued in the same or different queues
2942
can be executed following the sequential consistency by guaranteeing the
3043
satisfaction of the requisites of this command group.
31-
It is the user's responsibility to ensure the lambda submitted via interop_task does not create race conditions with other command groups or with the host.
44+
It is the user's responsibility to ensure the lambda submitted via `interop_task` does not create race conditions with other command groups or with the host.
3245

3346
The possibility of enqueuing host tasks on SYCL queues also enables the
3447
runtime to perform further optimizations when available.
@@ -61,18 +74,33 @@ class handler : public cl::sycl::handler {
6174
6275
The `interop_task` allows users to submit tasks containing C++ statements with low-level API calls (e.g. OpenCL Host API entries).
6376
The command group that encapsulates the task will execute following the usual SYCL dataflow execution rules.
64-
The functor passed to the `interop_task` takes as input a const reference to a `cl::sycl::codeplay::interop_handle`. The handle can be used to retrieve underlying OpenCL objects relative to the execution of the task.
6577
66-
It is not allowed to allocate new SYCL object inside an `interop_task`.
78+
The SYCL event returned by the command group will be completed when the `interop_task`
79+
functor is completed. Note the SYCL event is completed regardless of the completion
80+
status of any OpenCL operation enqueued or performed inside the `interop_task`
81+
scope. In particular, dispatching of asynchronous OpenCL operations inside
82+
of the `interop_task` requires manual synchronization.
83+
84+
The functor passed to the `interop_task` takes as input a `cl::sycl::codeplay::interop_handle`. The handle can be used to retrieve underlying OpenCL objects relative to the execution of the task.
85+
86+
It is not allowed to allocate new SYCL objects inside a `interop_task` scope.
6787
It is the user's responsibility to ensure that all operations performed inside the `interop_task` are finished before returning from it.
88+
Since SYCL queues are out of order, and any underlying OpenCL queue can be as well,
89+
there is no guarantee that OpenCL commands enqueued inside the `interop_task`
90+
functor will execute on a particular order w.r.t other SYCL commands or
91+
`interop_task` once dispatched to the OpenCL queue, unless this is is
92+
explicitly handled by using OpenCL events or barriers.
6893
69-
Although the statements inside the lambda submitted to the `interop_task` are executed on the host, the requirements and actions for the command group are satisied for the device.
70-
This is the opposite of the `host_handler` vendor extension, where requisites are satisfied for the host since the statements on the lambda submited to the single task are meant to have side effects on the host only.
71-
The interop task lambda can have side effects on the host, but it is the programmer responsability to ensure requirements dont need to be satisfied for the host.
94+
Although the statements inside the lambda submitted to the `interop_task` are executed on the host, the requirements and actions for the command group are satisfied for the device.
95+
This is the opposite of the `host_handler` [vendor extension](https://github.com/codeplaysoftware/standards-proposals/blob/master/asynchronous-data-flow/sycl-2.2/03_interacting_with_data_on_the_host.md), where requisites are satisfied for the host since the statements on the lambda submitted to the single task are meant to have side effects on the host only.
96+
The `interop-task` lambda can have side effects on the host, but it is the programmer responsibility to ensure requirements don't need to be satisfied for the host.
97+
98+
Executing a `interop_task` in a host device is invalid, and the asynchronous
99+
exception `cl::sycl::feature_not_supported` is thrown.
72100
73101
## Accessing low-level API objects
74102
75-
We introduce the `interop_handle` class which provide access to underlying OpenCL objects during the execution of the `interop_task`.
103+
We introduce the `interop_handle` class which provides access to underlying OpenCL objects during the execution of the `interop_task`.
76104
`interop_handle` objects are immutable objects whose purpose is to enable users access to low-level API functionality.
77105
78106
The interface of the `interop_handle` is defined as follow:
@@ -88,27 +116,50 @@ class interop_handle {
88116
89117
public:
90118
/* Return the context */
91-
cl_context get_context() const;
119+
cl_context get_context() const noexcept;
92120
93121
/* Return the device id */
94-
cl_device_id get_device() const;
122+
cl_device_id get_device() const noexcept;
95123
96124
/* Return the command queue associated with this task */
97-
cl_command_queue get_queue() const;
125+
cl_command_queue get_queue() const noexcept;
98126
99127
/*
100128
Returns the underlying cl_mem object associated with a given accessor
101129
*/
102130
template <typename dataT, int dimensions, access::mode accessmode,
103131
access::target accessTarget,
104132
access::placeholder isPlaceholder>
105-
cl_mem get_buffer(const accessor<dataT, dimensions, accessmode, access::target accessTarget, access::placeholder isPlaceholder>&) const;
133+
cl_mem get_mem(const accessor<dataT, dimensions, accessmode, access::target accessTarget, access::placeholder isPlaceholder>&) const;
106134
};
107135
} // namespace codeplay
108136
} // namespace sycl
109137
} // namespace cl
110138
```
111139

140+
### Obtaining the underlying OpenCL queue
141+
142+
The `get_queue` method returns an underlying OpenCL queue for the
143+
SYCL queue used to submit the command group, or the fallback queue
144+
if this command-group is re-trying execution on an OpenCL queue.
145+
The OpenCL command queue returned is implementation-defined in cases
146+
where the SYCL queue maps to multiple underlying OpenCL objects.
147+
148+
It is responsibility of the SYCL runtime to ensure the OpenCL queue
149+
returned is in a state that can be used to dispatch work,
150+
and that other potential OpenCL command queues associated with the same
151+
SYCL command queues are not executing commands while the `interop_task`
152+
is being executed.
153+
154+
### Obtaining memory objects for interoperability
155+
156+
The `get_mem` method receives a SYCL accessor that has been defined as a
157+
requirement for the command group, and returns the underlying OpenCL
158+
memory object that is used by the SYCL runtime.
159+
If the accessor passed as parameter is not part of the command group
160+
requirements (e.g. it is an unregistered placeholder accessor),
161+
the exception `cl::sycl::invalid_object` is thrown asynchronously.
162+
112163
## Example using regular accessor
113164

114165
```cpp
@@ -167,7 +218,7 @@ int main( void )
167218
168219
/* Execute the plan. */
169220
cl_command_queue queue = handle.get_queue();
170-
cl_mem X_mem = handle.get_buffer(X_accessor);
221+
cl_mem X_mem = handle.get_mem(X_accessor);
171222
err = clfftEnqueueTransform(planHandle, CLFFT_FORWARD,
172223
1, &queue, 0, NULL, NULL,
173224
&X_mem, NULL, NULL);

0 commit comments

Comments
 (0)