|
| 1 | +# Host access |
| 2 | + |
| 3 | +| Proposal ID | CP017 | |
| 4 | +|-------------|--------| |
| 5 | +| Name | Host access | |
| 6 | +| Date of Creation | 17 September 2017 | |
| 7 | +| Target | SYCL 1.2.1 vendor extension | |
| 8 | +| Current Status | _Work In Progress_ | |
| 9 | +| Reply-to | Ruyman Reyes <ruyman@codeplay.com> | |
| 10 | +| Original author | Ruyman Reyes <ruyman@codeplay.com> | |
| 11 | + |
| 12 | +## Overview |
| 13 | + |
| 14 | +In some situations, developers want to create memory that is only accessed from the device. This lets the underlying device driver place the allocation where the device can access it more efficiently than the host can. |
| 15 | +OpenCL 1.2 supports this through the `CL_MEM_HOST_NO_ACCESS`, `CL_MEM_HOST_READ_ONLY` and `CL_MEM_HOST_WRITE_ONLY` flags, which are passed at buffer creation. |
| 16 | + |
| 17 | +Without a hint from the developer, a SYCL implementation for OpenCL has to analyze the access pattern at compile time, or replace the underlying `cl_mem` object, in order to infer the host access pattern and create buffers with the appropriate flags. |
| 18 | + |
| 19 | +This proposal enables developers to specify the intended use pattern of a buffer, allowing |
| 20 | +SYCL implementations to take advantage of knowing whether a given buffer is accessible from the host. |
| 21 | + |
| 22 | +## Revisions |
| 23 | + |
| 24 | +This is the first revision of this proposal. |
| 25 | + |
| 26 | +## Interface changes |
| 27 | + |
| 28 | +This proposal adds a new property for buffers, *codeplay::property::buffer::host_access*. |
| 29 | +A SYCL buffer with the *host_access* property can only be used within one context, |
| 30 | +and its values are only accessible from queues that are associated with |
| 31 | +said context. |
| 32 | + |
| 33 | +```cpp |
| 34 | +namespace codeplay { |
| 35 | + |
| 36 | +enum class host_access_mode { |
| 37 | + none, |
| 38 | + read, |
| 39 | + read_write, |
| 40 | + write |
| 41 | +}; |
| 42 | + |
| 43 | +struct host_access { |
| 44 | + host_access(host_access_mode hostAccessMode); |
| 45 | + |
| 46 | + host_access_mode get_host_access_mode() const; |
| 47 | +}; |
| 48 | + |
| 49 | +} // namespace codeplay |
| 50 | +``` |
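As a rough illustration of how an implementation could act on the property, the sketch below maps each `host_access_mode` to the OpenCL 1.2 host-access flags named in the overview. The helper `to_cl_host_flags` and the `mem_flags` alias are hypothetical and not part of this proposal. The flag constants are local stand-ins that mirror the bit values in `CL/cl.h`, so the snippet compiles without an OpenCL SDK; a real implementation would include the header and use `cl_mem_flags` directly.

```cpp
#include <cassert>
#include <cstdint>

namespace codeplay {
enum class host_access_mode { none, read, read_write, write };
}  // namespace codeplay

// Stand-ins for the OpenCL 1.2 flags (assumption: same bit values as CL/cl.h).
using mem_flags = std::uint64_t;
const mem_flags HOST_WRITE_ONLY = 1u << 7;  // CL_MEM_HOST_WRITE_ONLY
const mem_flags HOST_READ_ONLY = 1u << 8;   // CL_MEM_HOST_READ_ONLY
const mem_flags HOST_NO_ACCESS = 1u << 9;   // CL_MEM_HOST_NO_ACCESS

// Hypothetical helper: returns the host-access bits to OR into the
// flags argument of clCreateBuffer for a given host_access_mode.
inline mem_flags to_cl_host_flags(codeplay::host_access_mode mode) {
  using codeplay::host_access_mode;
  switch (mode) {
    case host_access_mode::none:
      return HOST_NO_ACCESS;
    case host_access_mode::read:
      return HOST_READ_ONLY;
    case host_access_mode::write:
      return HOST_WRITE_ONLY;
    case host_access_mode::read_write:
      return 0;  // no restriction: OpenCL's default host access
  }
  assert(false && "unknown host_access_mode");
  return 0;
}
```

Note that `read_write` maps to no flag at all, since unrestricted host access is what OpenCL assumes when none of the three flags is set.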
| 51 | + |
| 52 | + |
| 53 | +## Example |
| 54 | + |
| 55 | +The following example is a simplification of a reduction kernel |
| 56 | +that uses temporary storage to avoid modifying the input buffer, and |
| 57 | +then proceeds to manipulate the data on device only. |
| 58 | + |
| 59 | + |
| 60 | +```cpp |
| 61 | +context deviceContext; |
| 62 | +queue q{deviceContext}; |
| 63 | +range<1> myRange{27}; |
| 64 | +/* A normal buffer that holds the data for |
| 65 | + * the execution. |
| 66 | + */ |
| 67 | +buffer<float> myBuffer{someHostPointer, myRange}; |
| 68 | + |
| 69 | +/** |
| 70 | + * We can use the normal buffer on any other queue, |
| 71 | + * SYCL will migrate transparently across contexts if required. |
| 72 | + */ |
| 73 | +std::for_each(sycl_named_policy<example>(otherQueue), |
| 74 | + std::begin(myBuffer), std::end(myBuffer), |
| 75 | + [=](float &elem) { |
| 76 | + elem = 1.0; |
| 77 | + }); |
| 78 | + |
| 79 | +/* This temporary memory only needs to exist |
| 80 | + * on the context where the operation will be |
| 81 | + * performed. |
| 82 | + */ |
| 83 | +buffer<float, 1> tmp{myRange, |
| 84 | + {property::buffer::context_bound(deviceContext), codeplay::property::buffer::host_access(host_access_mode::none)} }; |
| 85 | + |
| 86 | +bool firstIter = true; |
| 87 | + |
| 88 | +auto length = tmp.get_count(); |
| 89 | + |
| 90 | +do { |
| 91 | + auto f = [&](handler &h) mutable { |
| 92 | + auto aIn = myBuffer.template get_access<access::mode::read>(h); |
| 93 | + auto aInOut = tmp.template get_access<access::mode::read_write>(h); |
| 94 | + accessor<float, 1, access::mode::read_write, access::target::local> |
| 95 | + scratch(range<1>(local), h); |
| 96 | + |
| 97 | + if (firstIter) { |
| 98 | + h.parallel_for(r, ReductionKernel(aIn, aInOut)); |
| 99 | + firstIter = false; |
| 100 | + } else { |
| 101 | + h.parallel_for(r, ReductionKernel(aInOut, aInOut)); |
| 102 | + } |
| 103 | + }; |
| 104 | + q.submit(f); |
| 105 | + length = length / local; |
| 106 | +} while (length > 1); |
| 107 | + |
| 108 | +/** |
| 109 | + * Value can be used directly on a different command group |
| 110 | + */ |
| 111 | +q.submit([&](handler &h) { |
| 112 | + auto aIn = tmp.template get_access<access::mode::read>(h); |
| 113 | + auto aOut = myBuffer.template get_access<access::mode::read_write>(h); |
| 114 | + h.parallel_for(range<1>(local), [=](id<1> i) { |
| 115 | + aOut[i] = aIn[0] * aOut[i]; |
| 116 | + }); |
| 117 | +}); |
| 118 | + |
| 119 | +// Data is now in myBuffer and can be copied to the host as usual |
| 120 | + |
| 121 | +``` |
| 122 | + |
| 123 | +## Implementation notes |
| 124 | + |
| 125 | +* An implementation may choose not to take advantage of this hint and |
| 126 | +simply construct a normal buffer object underneath. |
| 127 | +However, accessing the buffer on the host in a way the property forbids must still raise an error. |
| 128 | + |
| 129 | +* Using a buffer that has host access restrictions on the SYCL host device results in undefined behavior. |
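The first note above implies a runtime check even when the hint is otherwise ignored. A minimal sketch of such a check follows, under several assumptions: `host_request`, `host_access_permitted` and `check_host_access` are hypothetical names, the `read` and `write` modes are taken to mean "host may only read" and "host may only write" (by analogy with the OpenCL flags), and `std::runtime_error` stands in for whatever error type an implementation would actually raise.

```cpp
#include <stdexcept>

namespace codeplay {
enum class host_access_mode { none, read, read_write, write };
}  // namespace codeplay

// Hypothetical: the kind of access the host is asking for.
enum class host_request { read, write, read_write };

// Returns true when a buffer created with `allowed` permits `requested`.
inline bool host_access_permitted(codeplay::host_access_mode allowed,
                                  host_request requested) {
  using codeplay::host_access_mode;
  switch (allowed) {
    case host_access_mode::none:
      return false;
    case host_access_mode::read:
      return requested == host_request::read;
    case host_access_mode::write:
      return requested == host_request::write;
    case host_access_mode::read_write:
      return true;
  }
  return false;
}

// Throws if the host requests an access the property forbids.
// std::runtime_error is a placeholder for the implementation's error type.
inline void check_host_access(codeplay::host_access_mode allowed,
                              host_request requested) {
  if (!host_access_permitted(allowed, requested)) {
    throw std::runtime_error("host access not permitted by host_access property");
  }
}
```

An implementation that builds a normal buffer underneath could call a check like this wherever a host accessor is requested, so the observable behavior matches that of an implementation that does honor the hint.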
| 130 | + |
| 131 | +## References |
| 132 | + |
| 133 | +* SYCL 1.2.1 specification |
| 134 | +* OpenCL 1.2 specification |