spec-constant/index.md
98 additions & 21 deletions
Original file line number
Diff line number
Diff line change
@@ -19,7 +19,7 @@ This is especially true for highly tuned software that requires information abou
19
19
Since OpenCL C kernels are fully compiled at runtime, those constants are usually expressed as macros, and their values are passed to the online compiler when the kernel is compiled.
20
20
However, since SYCL is statically compiled, this approach is not possible. Template-based techniques might not be applicable, or come at the price of code size explosion.
21
21
22
-
SPIR-V, the standard intermediate representation for shader and compute kernels, introduced "specialization constants" as a way to replace this macro usage in statically compiled kernels.
22
+
SPIR-V, the standard intermediate representation for shader and compute kernels, introduced *specialization constants* as a way to replace this macro usage in statically compiled kernels.
23
23
Specialization constants in SPIR-V are treated as constants whose value is not known at the time of the SPIR-V module generation.
24
24
Providing these constants before building the module for the actual target provides the compiler with the opportunity to further optimize the program.
25
25
@@ -40,13 +40,19 @@ class runtime_const;
40
40
// Fetch a value at runtime.
41
41
float get_value();
42
42
43
+
// Declare a specialization constant id.
44
+
// The variable `runtime_const` will be used as the id.
auto acc = cgh.get_access<cl::sycl::access::mode::write>(buffer);
64
+
// Retrieve a placeholder object representing the spec constant.
65
+
auto my_constant = cgh.get_spec_constant<float, &runtime_const>();
66
+
58
67
cgh.single_task<specialized_kernel>(
59
68
program.get_kernel<specialized_kernel>(),
60
-
[=]() { acc[0] = my_constant.get(); });
69
+
[=]() {
70
+
acc[0] = my_constant.get(); // This should become a constant.
71
+
});
61
72
});
62
73
}
63
74
}
64
75
```
65
-
In this example, the call to `set_spec_constant` binds the value returned by the call to `get_value` to the SYCL `program`.
76
+
77
+
78
+
In this example, the construction of `runtime_const` creates a specialization constant id; its initializer is taken as the default value of the spec constant. The call to `set_spec_constant` binds the value returned by the call to `get_value` to the SYCL `program`.
66
79
At static compilation time, the value is unknown to the SYCL device compiler and therefore cannot be used for optimization.
67
80
At runtime, `get_value` is evaluated and its result is bound to the SYCL `program`, giving the underlying OpenCL runtime the opportunity to use it during the kernel build.
68
-
The function `set_spec_constant` returns a `spec_constant` object allowing the user to use the value inside the kernel.
69
-
After all runtime values are bounded to the program, the program is built.
70
81
82
+
Upon submission of the kernel `specialized_kernel`, the call to `get_spec_constant` returns a `spec_constant` object. This object is a placeholder that represents the specialization constant inside the kernel.
71
83
The specialization constant `my_constant` is later used inside `specialized_kernel` and the expression `my_constant.get()` returns the value returned by the call to `get_value()`.
72
84
If the target natively supports specialization constants, this value will be known by the underlying OpenCL consumer when it builds the kernel.
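
The binding behaviour described above (a default taken from the id's initializer, optionally overridden by a runtime value) can be modelled in standard C++ alone. This is a minimal sketch, not the proposed SYCL interface: `mock_spec_id`, `mock_program`, `gain_id` and `offset_id` are hypothetical names introduced only for illustration, and the mock resolves values on the host rather than at kernel build time.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a specialization constant id.
// The constructor argument plays the role of the default value.
template <typename T>
struct mock_spec_id {
  T default_value;
  constexpr explicit mock_spec_id(T v) : default_value(v) {}
  mock_spec_id(const mock_spec_id&) = delete;  // ids are identified by address
};

// Hypothetical stand-in for a program that has runtime values bound to ids.
class mock_program {
  std::map<const void*, float> bound_;  // id address -> bound runtime value

 public:
  // Bind a runtime value to the id, replacing any previous binding.
  void set_spec_constant(const mock_spec_id<float>& id, float value) {
    bound_[&id] = value;
  }

  // Resolve the value a kernel would observe: the bound value if one
  // exists, otherwise the default carried by the id.
  float get(const mock_spec_id<float>& id) const {
    auto it = bound_.find(&id);
    return it != bound_.end() ? it->second : id.default_value;
  }
};

// Ids declared at namespace scope, as `runtime_const` is in the example.
mock_spec_id<float> gain_id{1.0f};
mock_spec_id<float> offset_id{0.5f};
```

In this model, an id that was never set resolves to its default, while an id that was set resolves to the bound runtime value.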
73
85
@@ -156,6 +168,9 @@ It can even have an adverse effect as the value will use a register and the comp
156
168
157
169
Using specialization constants, the routine can be rewritten as:
auto blockSize = cgh.set_spec_constant<std::size_t, block_size>(blockSizeCst);
190
200
191
201
cgh.parallel_for<class mxm_kernel<T>>(
192
202
program.get_kernel<class mxm_kernel<T>>(),
193
203
nd_range<2>{range<2>(matSize, matSize),
194
-
range<2>(blockSize, blockSize)},
204
+
range<2>(blockSizeCst, blockSizeCst)},
195
205
[=](nd_item<2> it) {
196
206
// Current block
197
207
int blockX = it.get_group(0);
@@ -238,7 +248,7 @@ void mat_multiply(cl::sycl::queue& q, T* MA, T* MB, T* MC, int matSize) {
238
248
}
239
249
}
240
250
```
241
-
In this example, `blockSize` is now a specialization constant holding the value same value as before, meaning that the value is now injected inside the module, allow the OpenCL consumer to use the value in the optimizations.
251
+
In this example, `blockSize` is now a specialization constant holding the same value as before. The value is now injected into the module, allowing the OpenCL consumer to use it in its optimizations (loop unrolling, for instance).
242
252
Note that the specialization constant ID is independent of the template parameter `T` on which the kernel depends. This means that all kernel instances will share the same value.
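
The sharing noted above follows from the id being a single entity that is not parameterised on `T`. A minimal sketch in standard C++, assuming a hypothetical `mock_block_size_id` object and a hypothetical `mock_mxm_kernel` template (neither is part of the proposal):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the one `block_size` id: a single
// namespace-scope object, not a template.
struct mock_block_size_id {
  std::size_t value;
};
mock_block_size_id block_size{1};

// Hypothetical kernel template: every instantiation refers to the same id,
// so every instantiation observes the same value.
template <typename T>
struct mock_mxm_kernel {
  std::size_t tile() const { return block_size.value; }
};
```

Setting `block_size.value` once is therefore visible to `mock_mxm_kernel<float>` and `mock_mxm_kernel<double>` alike.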
243
253
244
254
@@ -255,13 +265,33 @@ namespace cl {
255
265
namespace sycl {
256
266
namespace experimental {
257
267
258
-
template <typename T, typename ID = T>
268
+
template <typename T>
269
+
class spec_id {
270
+
private:
271
+
// Implementation defined constructor.
272
+
spec_id(const spec_id&) = delete;
273
+
spec_id(spec_id&&) = delete;
274
+
public:
275
+
using type = T;
276
+
277
+
// Arguments `Args` are forwarded to the underlying T constructor.
278
+
// This allows the user to set up a default value for the spec_id instance.
279
+
// The initialization of T must be evaluated at compile time to be valid.
280
+
template<class... Args >
281
+
explicit constexpr spec_id(Args&&...);
282
+
};
283
+
284
+
template <class T, spec_id<T>& s>
259
285
class spec_constant {
260
286
private:
261
287
// Implementation defined constructor.
262
288
spec_constant(/* Implementation defined */);
289
+
spec_constant(spec_constant&&) = delete;
290
+
263
291
public:
264
-
spec_constant();
292
+
using type = T;
293
+
294
+
spec_constant(const spec_constant&) = default;
265
295
266
296
T get() const; // explicit access.
267
297
operator T() const; // implicit conversion.
@@ -300,8 +330,15 @@ namespace sycl {
300
330
class program {
301
331
// ...
302
332
public:
303
-
template <typename ID, typename T>
304
-
spec_constant<T, ID> set_spec_constant(T cst);
333
+
334
+
/**
335
+
* Returns true if the current program can support specialization constants natively.
336
+
*
337
+
*/
338
+
bool native_spec_constant() const noexcept;
339
+
340
+
template <typename T, experimental::spec_id<T>&>
341
+
void set_spec_constant(T cst);
305
342
// ...
306
343
};
307
344
@@ -325,6 +362,46 @@ For a same kernel, it is valid to set different specialization constants to diff
325
362
After the kernel is built, it is no longer possible to set new specialization constants.
326
363
A `cl::sycl::experimental::spec_const_error` exception will be thrown if the user attempts to change it after the kernel has been built.
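
The "no changes after build" rule can be illustrated with a small host-only model. This is a sketch under stated assumptions: `mock_spec_const_error` stands in for `cl::sycl::experimental::spec_const_error`, and `locked_program` and `set_after_build_is_rejected` are hypothetical helpers, not part of the proposed interface.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for cl::sycl::experimental::spec_const_error.
struct mock_spec_const_error : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical program model: constants may be set until build() is called.
class locked_program {
  int block_size_ = 1;
  bool built_ = false;

 public:
  void set_spec_constant(int value) {
    if (built_) {
      throw mock_spec_const_error("specialization constant set after build");
    }
    block_size_ = value;
  }
  void build() { built_ = true; }
  int block_size() const { return block_size_; }
};

// Returns true if a late set is rejected and the earlier value is preserved.
bool set_after_build_is_rejected() {
  locked_program p;
  p.set_spec_constant(16);
  p.build();
  try {
    p.set_spec_constant(32);  // too late: the program is already built
  } catch (const mock_spec_const_error&) {
    return p.block_size() == 16;
  }
  return false;
}
```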
327
364
365
+
## Getting Specialization Constants via the command group handler
366
+
367
+
The handler interface is extended to include a mechanism to get a specialization constant.
368
+
369
+
```cpp
370
+
namespace cl {
371
+
namespace sycl {
372
+
373
+
class handler {
374
+
// ...
375
+
public:
376
+
// Set a value for the specialization constant represented by `s`
377
+
// and return the associated spec_constant.
378
+
// Note, this call may require the underlying program to be rebuilt.
The templated member function `get_spec_constant` takes a runtime value of type `T` that will be used to set the specialization constant named `ID`.
394
+
Multiple specialization constants can be generated by calling `get_spec_constant` multiple times.
395
+
396
+
It is invalid to query a specialization constant with the same `ID` more than once for the same kernel.
397
+
398
+
Upon invocation of a `single_task`/`parallel_for`/`parallel_for_work_group` construct, the runtime will build the appropriate kernel if it has never been built for the set of specialization constants passed to the kernel.
399
+
The SYCL device compiler and runtime are responsible for ensuring that it is valid to build the module in which the invoked kernel is defined using only the provided specialization constants.
400
+
401
+
It is illegal to use this interface in conjunction with the `cl::sycl::program` interface.
402
+
403
+
It must be noted that setting a specialization constant has an underlying cost and that changing a constant value will force the OpenCL runtime to build a new kernel.
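
The "build only if never built for this set of constants" behaviour, and the cost of changing a value, can be pictured as a cache keyed by the constant values. The sketch below is only one possible model, written in standard C++; `mock_kernel_cache` and its members are hypothetical and say nothing about how an actual SYCL runtime stores its kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical cache: one built kernel per distinct set of constant values.
class mock_kernel_cache {
  std::map<std::vector<std::size_t>, int> built_;  // values -> kernel handle
  int builds_ = 0;                                 // number of builds performed

 public:
  // Return the handle for these values, building a new kernel only when
  // this exact set of values has not been seen before.
  int get_or_build(const std::vector<std::size_t>& spec_values) {
    auto it = built_.find(spec_values);
    if (it != built_.end()) {
      return it->second;  // already built: reuse, no extra cost
    }
    int handle = builds_++;  // stands in for the expensive build step
    built_.emplace(spec_values, handle);
    return handle;
  }

  int build_count() const { return builds_; }
};
```

In this model, submitting twice with the same values triggers a single build, while changing a value triggers another one.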