Commit 6af24fd

author
Victor Lomuller
committed
Update the spec constant interface to match the direction taken for SYCL module proposal.
This patch reworks the proposal to match the direction taken by the SYCL group. The API has been updated; the description is still a work in progress.
1 parent 0a6be2a commit 6af24fd

1 file changed

+98
-21
lines changed

spec-constant/index.md

Lines changed: 98 additions & 21 deletions
@@ -19,7 +19,7 @@ This is especially true for highly tuned software that requires information abou
1919
Since OpenCL C kernels are fully compiled at runtime, those constants are usually expressed as macros whose values are passed to the online compiler when the kernel is compiled.
2020
However, SYCL being statically compiled, it is not possible to use this approach. Template based techniques might not be possible or come at the price of code size explosion.
2121

22-
SPIR-V, the standard intermediate representation for shader and compute kernels, introduced "specialization constants" as a way to replace this macro usage in statically compiled kernels.
22+
SPIR-V, the standard intermediate representation for shader and compute kernels, introduced *specialization constants* as a way to replace this macro usage in statically compiled kernels.
2323
Specialization constants in SPIR-V are treated as constants whose value is not known at the time of the SPIR-V module generation.
2424
Providing these constants before building the module for the actual target provides the compiler with the opportunity to further optimize the program.
2525

@@ -40,13 +40,19 @@ class runtime_const;
4040
// Fetch a value at runtime.
4141
float get_value();
4242

43+
// Declare a specialization constant id.
44+
// The variable `runtime_const` will be used as the id.
45+
cl::sycl::experimental::spec_id<float> runtime_const(42.f);
46+
4347
int main() {
4448
cl::sycl::queue queue;
4549
cl::sycl::program program(queue.get_context());
4650

47-
// Create a specialization constant.
48-
cl::sycl::experimental::spec_constant<float, runtime_const> my_constant =
49-
program.set_spec_constant<runtime_const>(get_value());
51+
// Set the value of the specialization constant.
52+
program.set_spec_constant<float, &runtime_const>(get_value());
53+
// Build the program, the value set by set_spec_constant
54+
// will be used as a constant by the underlying JIT
55+
// if it has native support for specialization constants.
5056
program.build_with_kernel_type<specialized_kernel>();
5157

5258
std::vector<float> vec(1);
@@ -55,19 +61,25 @@ int main() {
5561

5662
queue.submit([&](cl::sycl::handler& cgh) {
5763
auto acc = cgh.get_access<cl::sycl::access::mode::write>(buffer);
64+
// Retrieve a placeholder object representing the spec constant.
65+
auto my_constant = cgh.get_spec_constant<float, &runtime_const>();
66+
5867
cgh.single_task<specialized_kernel>(
5968
program.get_kernel<specialized_kernel>(),
60-
[=]() { acc[0] = my_constant.get(); });
69+
[=]() {
70+
acc[0] = my_constant.get(); // This should become a constant.
71+
});
6172
});
6273
}
6374
}
6475
```
65-
In this example, the call to `set_spec_constant` binds the value returned by the call to `get_value` to the SYCL `program`.
76+
77+
78+
In this example, the construction of `runtime_const` creates a specialization constant id, and its initializer is taken as the default value for the spec constant. The call to `set_spec_constant` binds the value returned by the call to `get_value` to the SYCL `program`.
6679
At static compilation time, the value is unknown to the SYCL device compiler and thus cannot be used for optimization.
6780
At runtime, `get_value` is evaluated and its result is bound to the SYCL `program`, giving the underlying OpenCL runtime the opportunity to use it during the kernel build.
68-
The function `set_spec_constant` returns a `spec_constant` object allowing the user to use the value inside the kernel.
69-
After all runtime values are bounded to the program, the program is built.
7081
82+
Upon submission of the kernel `specialized_kernel`, the call to `get_spec_constant` returns a `spec_constant` object. This object is a placeholder that represents the specialization constant inside the kernel.
7183
The specialization constant `my_constant` is later used inside `specialized_kernel` and the expression `my_constant.get()` returns the value returned by the call to `get_value()`.
7284
If the target natively supports specialization constants, this value will be known by the underlying OpenCL consumer when it builds the kernel.
7385
@@ -156,6 +168,9 @@ It can even have an adverse effect as the value will use a register and the comp
156168

157169
Using specialization constants, the routine can be rewritten as:
158170
```cpp
171+
cl::sycl::experimental::spec_id<std::size_t> block_size(1);
172+
173+
159174
template <typename T>
160175
void mat_multiply(cl::sycl::queue& q, T* MA, T* MB, T* MC, int matSize) {
161176
auto device = q.get_device();
@@ -165,12 +180,6 @@ void mat_multiply(cl::sycl::queue& q, T* MA, T* MB, T* MC, int matSize) {
165180
auto blockSizeCst = prevPowerOfTwo(std::sqrt(maxBlockSize));
166181
blockSizeCst = std::min(matSize, blockSizeCst);
167182

168-
cl::sycl::program program(queue.get_context());
169-
170-
// Create a specialization constant to encapsulate blockSize.
171-
auto blockSize = program.set_spec_constant<class block_size>(blockSizeCst);
172-
program.build_with_kernel_type<class mxm_kernel<T>>();
173-
174183
{
175184
range<1> dimensions(matSize * matSize);
176185
buffer<T> bA(MA, dimensions);
@@ -181,17 +190,18 @@ void mat_multiply(cl::sycl::queue& q, T* MA, T* MB, T* MC, int matSize) {
181190
auto pA = bA.template get_access<access::mode::read>(cgh);
182191
auto pB = bB.template get_access<access::mode::read>(cgh);
183192
auto pC = bC.template get_access<access::mode::write>(cgh);
184-
auto localRange = range<1>(blockSize * blockSize);
193+
auto localRange = range<1>(blockSizeCst * blockSizeCst);
185194

186195
accessor<T, 1, access::mode::read_write, access::target::local> pBA(
187196
localRange, cgh);
188197
accessor<T, 1, access::mode::read_write, access::target::local> pBB(
189198
localRange, cgh);
199+
auto blockSize = cgh.set_spec_constant<std::size_t, block_size>(blockSizeCst);
190200

191201
cgh.parallel_for<class mxm_kernel<T>>(
192202
program.get_kernel<class mxm_kernel<T>>(),
193203
nd_range<2>{range<2>(matSize, matSize),
194-
range<2>(blockSize, blockSize)},
204+
range<2>(blockSizeCst, blockSizeCst)},
195205
[=](nd_item<2> it) {
196206
// Current block
197207
int blockX = it.get_group(0);
@@ -238,7 +248,7 @@ void mat_multiply(cl::sycl::queue& q, T* MA, T* MB, T* MC, int matSize) {
238248
}
239249
}
240250
```
241-
In this example, `blockSize` is now a specialization constant holding the value same value as before, meaning that the value is now injected inside the module, allow the OpenCL consumer to use the value in the optimizations.
251+
In this example, `blockSize` is now a specialization constant holding the same value as before, meaning that the value is now injected into the module, allowing the OpenCL consumer to use it in optimizations (loop unrolling, for instance).
242252
Note that the specialization constant ID is independent of the template parameter `T` on which the kernel depends. This means that all kernel instances will share the same value.
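
To illustrate why the ID does not depend on `T`, here is a minimal sketch in plain C++ with no SYCL runtime involved. The names `spec_id_sketch`, `spec_slot_sketch`, `block_size_id` and `kernel_block_size` are hypothetical stand-ins and are not part of the proposed interface. The sketch only shows that a namespace-scope object passed as a reference template argument identifies a single slot, whatever type the reading code is templated on.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for spec_id<T>: a non-copyable object whose
// address serves as the identity of one specialization constant.
template <typename T>
class spec_id_sketch {
 public:
  using type = T;
  explicit constexpr spec_id_sketch(T default_value)
      : default_(default_value) {}
  spec_id_sketch(const spec_id_sketch&) = delete;
  spec_id_sketch(spec_id_sketch&&) = delete;
  constexpr T default_value() const { return default_; }

 private:
  T default_;
};

// Hypothetical storage: exactly one slot per (T, Id) pair, initialized
// with the default value carried by the id object.
template <typename T, spec_id_sketch<T>& Id>
struct spec_slot_sketch {
  static T& value() {
    static T v = Id.default_value();
    return v;
  }
};

// The id lives at namespace scope so that it can be used as a
// reference template argument.
spec_id_sketch<std::size_t> block_size_id(1);

// Stand-in for a kernel templated on T: every instantiation reads the
// same slot, because the id does not depend on T.
template <typename T>
std::size_t kernel_block_size() {
  return spec_slot_sketch<std::size_t, block_size_id>::value();
}
```

Under these assumptions, writing 16 into the slot once makes both `kernel_block_size<float>()` and `kernel_block_size<double>()` return 16, which mirrors the sharing behaviour described above.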
243253
244254
@@ -255,13 +265,33 @@ namespace cl {
255265
namespace sycl {
256266
namespace experimental {
257267
258-
template <typename T, typename ID = T>
268+
template <typename T>
269+
class spec_id {
270+
private:
271+
// Implementation defined constructor.
272+
spec_id(const spec_id&) = delete;
273+
spec_id(spec_id&&) = delete;
274+
public:
275+
using type = T;
276+
277+
// The arguments `Args` are forwarded to the underlying T constructor.
278+
// This allows the user to set up a default value for the spec_id instance.
279+
// The initialization of T must be evaluated at compile time to be valid.
280+
template<class... Args >
281+
explicit constexpr spec_id(Args&&...);
282+
};
283+
284+
template <class T, spec_id<T>& s>
259285
class spec_constant {
260286
private:
261287
// Implementation defined constructor.
262288
spec_constant(/* Implementation defined */);
289+
spec_constant(spec_constant&&) = delete;
290+
263291
public:
264-
spec_constant();
292+
using type = T;
293+
294+
spec_constant(const spec_constant&) = default;
265295
266296
T get() const; // explicit access.
267297
operator T() const; // implicit conversion.
@@ -300,8 +330,15 @@ namespace sycl {
300330
class program {
301331
// ...
302332
public:
303-
template <typename ID, typename T>
304-
spec_constant<T, ID> set_spec_constant(T cst);
333+
334+
/**
335+
* Returns true if the current program can support specialization constants natively.
336+
*
337+
*/
338+
bool native_spec_constant() const noexcept;
339+
340+
template <typename T, experimental::spec_id<T>&>
341+
void set_spec_constant(T cst);
305342
// ...
306343
};
307344

@@ -325,6 +362,46 @@ For a same kernel, it is valid to set different specialization constants to diff
325362
After the kernel is built, it is no longer possible to set new specialization constants.
326363
A `cl::sycl::experimental::spec_const_error` exception will be thrown if the user attempts to change it after the kernel has been built.
327364
365+
## Getting Specialization Constants via the command group handler
366+
367+
The handler interface is extended to include a mechanism to get a specialization constant.
368+
369+
```cpp
370+
namespace cl {
371+
namespace sycl {
372+
373+
class handler {
374+
// ...
375+
public:
376+
// Set a value for the specialization constant represented by `s`
377+
// and return the associated spec_constant.
378+
// Note, this call may require the underlying program to be rebuilt.
379+
template <typename T, experimental::spec_id<T>& s>
380+
spec_constant<T, s> set_spec_constant(T cst);
381+
382+
// Retrieve a spec_constant object representing `s`
383+
template <typename T, experimental::spec_id<T>& s>
384+
spec_constant<T, s> get_spec_constant();
385+
386+
// ...
387+
};
388+
389+
} // namespace sycl
390+
} // namespace cl
391+
```
392+
393+
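The templated member function `set_spec_constant` takes a runtime value of type `T` that is used to set the specialization constant identified by `s`, and returns the associated `spec_constant` object. The member function `get_spec_constant` takes no argument and returns the `spec_constant` object representing `s`.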
394+
Multiple specialization constants can be generated by calling `get_spec_constant` multiple times.
395+
396+
It is invalid to query a specialization constant with the same `spec_id` more than once for the same kernel.
397+
398+
Upon invocation of a `single_task`/`parallel_for`/`parallel_for_work_group` construct, the runtime will build the appropriate kernel if it has never been built for the set of specialization constants passed to the kernel.
399+
The SYCL device compiler and runtime are responsible for ensuring that it is valid to build the module in which the invoked kernel is defined using only the provided specialization constants.
400+
401+
It is illegal to use this interface in conjunction with the `cl::sycl::program` interface.
402+
403+
It must be noted that setting a specialization constant has an underlying cost and that changing a constant value will force the OpenCL runtime to build a new kernel.
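
The rebuild cost described above can be pictured with a small cache keyed on the set of specialization constant values. This is only a sketch in plain C++ under stated assumptions: `built_kernel_sketch` and `kernel_cache_sketch` are hypothetical names, the proposal does not prescribe how a runtime caches kernels, and the "build" here is a counter standing in for the online compilation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a built kernel; it only records the values
// it was specialized with.
struct built_kernel_sketch {
  std::vector<std::size_t> spec_values;
};

// Hypothetical cache: one built kernel per distinct set of values.
class kernel_cache_sketch {
 public:
  // Returns the kernel specialized for `spec_values`, building it only
  // if this exact set of values has never been seen before.
  const built_kernel_sketch& get_or_build(
      const std::vector<std::size_t>& spec_values) {
    auto it = cache_.find(spec_values);
    if (it == cache_.end()) {
      ++build_count_;  // Stands in for the costly online compilation.
      it = cache_.emplace(spec_values, built_kernel_sketch{spec_values})
               .first;
    }
    return it->second;
  }

  std::size_t build_count() const { return build_count_; }

 private:
  std::map<std::vector<std::size_t>, built_kernel_sketch> cache_;
  std::size_t build_count_ = 0;
};
```

In this sketch, requesting the kernel twice with the value 16 triggers a single build, while a later request with 32 triggers a second one. This is why frequently changing a constant value can be expensive.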
404+
328405
## Build issue caused by Specialization Constants
329406

330407
The following error class is added:
