Commit 31701f2

Gordon authored and committed
Issue #33: Add wording for high-level interface.
* Introduce `this_thread::bind` & `this_thread::unbind`.
* Introduce high-level interface `bulk_execution_affinity` properties.
1 parent f1b2484 commit 31701f2

1 file changed

+76
-11
lines changed

affinity/cpp-20/d0796r2.md

Lines changed: 76 additions & 11 deletions
@@ -14,6 +14,11 @@
1414

1515
# Changelog
1616

17+
### P0796r2 (RAP)
18+
19+
* Introduce `this_thread::bind` & `this_thread::unbind` for binding a thread of execution to an execution resource.
20+
* Introduce high-level interface for execution binding via executor properties.
21+
1722
### P0796r1 (JAX)
1823

1924
* Introduce proposed wording.
@@ -170,6 +175,43 @@ This feature could be easily scaled to heterogeneous and distributed systems, as
170175
171176
In this paper we propose an interface for querying and representing the execution resources within a system, querying the relative affinity metric between those execution resources, and then using those execution resources to allocate memory and execute work with affinity to the underlying hardware. The interface described in this paper builds on the existing interface for executors and execution contexts defined in the executors proposal [[22]][p0443r4].
172177
178+
### Interface granularity
179+
180+
In this paper we propose both a low-level interface and a high-level interface:
181+
* The low-level interface consists of mechanisms for discovering detailed information about a system's topology and affinity properties which can be utilised to hand-optimise parallel applications and libraries for the best performance. The low-level interface has high granularity and is aimed at users who have a high knowledge of the system architecture.
182+
* The high-level interface consists of policies which describe desired behaviour when using parallel algorithms or libraries. The high-level interface has low granularity and is aimed at users who may have little or no knowledge of the system architecture.
183+
184+
## High-level interface
185+
186+
The high-level interface is a policy-based design which utilises the executor property mechanism to provide additional affinity-based requirements on executors.
187+
188+
### Bulk execution affinity
189+
190+
In this paper we propose an executor property group called `bulk_execution_affinity` which contains the sub-properties `none`, `balanced`, `scatter` and `compact`. Each of these properties, if applied to an *executor*, enforces a particular guarantee of execution agent binding to the *execution resources* associated with the *executor* in a particular pattern:
191+
* **none** makes no guarantee that *execution agents* created by the *executor* will be bound to specific *execution resources*.
192+
* **balanced** guarantees that *execution agents* created by the executor will be bound to the *execution resources* associated with the *executor* close together in sequence but with an even distribution across the *execution resources*.
193+
* **scatter** guarantees that *execution agents* created by the executor will be bound to the *execution resources* associated with the *executor* distributed with each *execution agent* far from each other *execution agent* in sequence.
194+
* **compact** guarantees that *execution agents* created by the executor will be bound to the *execution resources* associated with the *executor* close together in sequence.
195+
196+
Below *(Listing 2)* is an example of executing a parallel task over 8 threads using `bulk_execute`, with the affinity binding `bulk_execution_affinity.scatter`.
197+
198+
```cpp
{
  // Retrieve an executor from an existing execution context.
  auto exec = executionContext.executor();

  // Require bulk execution with the scatter affinity binding.
  auto affExec = execution::require(exec, execution::bulk,
    execution::bulk_execution_affinity.scatter);

  // Launch 8 execution agents, each calling func with its index.
  affExec.bulk_execute([](std::size_t i, shared s) {
    func(i);
  }, 8, sharedFactory);
}
```
211+
*Listing 2: Example of using the bulk_execution_affinity property*
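
The binding patterns described above can be illustrated with a small placement model. This is only a sketch of one possible interpretation, not wording from the proposal: the `topology` struct, the `affinity_pattern` enum and `place_agents` are hypothetical names, and the model assumes resources are arranged in equally sized groups (for example sockets) where a lower index distance means "closer". The `none` property is omitted because it makes no placement guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a machine with `groups` groups (e.g. sockets) of
// `slots` execution resources each; resource index = group * slots + slot.
struct topology {
  std::size_t groups;
  std::size_t slots;
};

// Hypothetical stand-in for the proposed sub-properties.
enum class affinity_pattern { balanced, scatter, compact };

// Returns, for each of `agents` execution agents, the index of the resource
// it would be bound to under this illustrative interpretation.
std::vector<std::size_t> place_agents(topology t, std::size_t agents,
                                      affinity_pattern pattern) {
  std::vector<std::size_t> placement(agents);
  const std::size_t total = t.groups * t.slots;
  for (std::size_t i = 0; i < agents; ++i) {
    switch (pattern) {
      case affinity_pattern::compact:
        // Consecutive agents occupy consecutive resources.
        placement[i] = i % total;
        break;
      case affinity_pattern::scatter: {
        // Consecutive agents alternate between groups.
        const std::size_t group = i % t.groups;
        const std::size_t slot = (i / t.groups) % t.slots;
        placement[i] = group * t.slots + slot;
        break;
      }
      case affinity_pattern::balanced: {
        // Agents are split into contiguous blocks, one block per group.
        const std::size_t group = (i * t.groups) / agents;
        const std::size_t first = (group * agents + t.groups - 1) / t.groups;
        placement[i] = group * t.slots + (i - first) % t.slots;
        break;
      }
    }
  }
  return placement;
}
```

In this model, four agents on two groups of four resources keep neighbours adjacent under `compact`, alternate between groups under `scatter`, and form one contiguous pair per group under `balanced`.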
212+
213+
## Low-level interface
214+
173215
### Execution resources
174216

175217
An `execution_resource` is a lightweight structure which acts as an identifier to a particular piece of hardware within a system. It can be queried for whether it can allocate memory via `can_place_memory`, for whether it can execute work via `can_place_agents`, and for its name via `name`. An `execution_resource` can also represent other `execution_resource`s; these are referred to as being *members of* that `execution_resource` and can be queried via `resources`. Additionally, the `execution_resource` which another is a *member of* can be queried via `member_of`. An `execution_resource` can also be queried for the concurrency it can provide: the total number of *threads of execution* supported by that `execution_resource` and all resources it represents.
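
The hierarchical nature of resources and the meaning of the concurrency query can be sketched with a small tree type. This is a minimal illustration under stated assumptions: `mock_resource`, its `own_threads` field and the free function `concurrency` are hypothetical and do not appear in the proposal, which leaves the representation of `execution_resource` unspecified.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the proposed execution_resource; the real type
// is a handle whose layout is not specified by the paper.
struct mock_resource {
  std::string name;
  std::size_t own_threads;             // threads provided by this node itself
  std::vector<mock_resource> members;  // resources that are members of this one
};

// Total threads of execution supported by a resource and all of its members,
// computed by walking the member tree recursively.
std::size_t concurrency(const mock_resource& r) {
  std::size_t total = r.own_threads;
  for (const auto& m : r.members) {
    total += concurrency(m);
  }
  return total;
}
```

For example, a socket with no threads of its own but two member cores of two threads each would report a concurrency of four in this model.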
@@ -182,7 +224,7 @@ An `execution_resource` is a light weight structure which acts as an identifier
182224

183225
The system topology is made up of a number of system-level `execution_resource`s, which can be queried through `this_system::resources`, which returns a `std::vector`. The `execution_resource`s available within the system can be initialised dynamically by a runtime library; however, this must be done before `main` is called, because after that point the system topology cannot change.
184226

185-
Below *(Listing 2)* is an example of iterating over the system level resources and printing out their capabilities.
227+
Below *(Listing 3)* is an example of iterating over the system level resources and printing out their capabilities.
186228

187229
```cpp
188230
for (auto res : execution::this_system::resources()) {
@@ -192,13 +234,13 @@ for (auto res : execution::this_system::resources()) {
192234
std::cout << res.concurrency() << '\n';
193235
}
194236
```
195-
*Listing 2: Example of querying all the system level execution resources*
237+
*Listing 3: Example of querying all the system level execution resources*
196238

197239
### Querying relative affinity
198240

199241
The `affinity_query` class template provides an abstraction for a relative affinity value between two `execution_resource`s, derived from a particular `affinity_operation` and `affinity_metric`. The `affinity_query` is templated by `affinity_operation` and `affinity_metric` and is constructed from two `execution_resource`s. An `affinity_query` does not mean much on its own; instead, a relative magnitude of affinity can be queried by using comparison operators. If necessary, the value of an `affinity_query` can also be queried through `native_affinity`, though the return value of this is implementation-defined.
200242

201-
Below *(Listing 3)* is an example of how you can query the relative affinity between two `execution_resource`s.
243+
Below *(Listing 4)* is an example of how you can query the relative affinity between two `execution_resource`s.
202244

203245
```cpp
204246
auto systemLevelResources = execution::this_system::resources();
@@ -212,15 +254,15 @@ auto relativeLatency02 = execution::affinity_query<execution::affinity_operation
212254

213255
auto relativeLatency = relativeLatency01 > relativeLatency02;
214256
```
215-
*Listing 3: Example of querying affinity between two `execution_resource`s.*
257+
*Listing 4: Example of querying affinity between two `execution_resource`s.*
216258

217259
> [*Note:* This interface for querying relative affinity is a very low-level interface designed to be abstracted by libraries and later affinity policies. *--end note*]
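
To show why comparison operators, rather than raw values, are the primary way to use such a query, the following is a minimal self-contained sketch. Everything in it is an assumption made for illustration: `mock_operation`, `mock_metric`, `mock_resource` and `mock_affinity_query` are hypothetical names, and the cost is modelled as a simple distance between integer positions, with a larger value meaning a higher cost. The proposal itself leaves the native value implementation-defined.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical tags standing in for affinity_operation and affinity_metric.
enum class mock_operation { read, write };
enum class mock_metric { latency, bandwidth };

// Hypothetical resource identified only by a position on a line.
struct mock_resource {
  int position;
};

// Queries are only comparable when they share the same operation and metric,
// because each template instantiation is a distinct type.
template <mock_operation Op, mock_metric Metric>
class mock_affinity_query {
 public:
  mock_affinity_query(mock_resource a, mock_resource b)
      : cost_(std::abs(a.position - b.position)) {}

  // Stand-in for native_affinity; here it is just the modelled distance.
  int native_affinity() const { return cost_; }

  friend bool operator<(const mock_affinity_query& l,
                        const mock_affinity_query& r) {
    return l.cost_ < r.cost_;
  }
  friend bool operator>(const mock_affinity_query& l,
                        const mock_affinity_query& r) {
    return r < l;
  }

 private:
  int cost_;
};
```

Encoding the operation and metric as template parameters means that comparing, say, a read-latency query against a write-bandwidth query fails to compile in this sketch, which matches the idea that only like-for-like magnitudes are meaningful.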
218260
219261
### Execution context
220262

221263
The `execution_context` class provides an abstraction for managing a number of lightweight execution agents executing work on an `execution_resource` and any `execution_resource`s encapsulated by it. An `execution_context` can then provide an executor for executing work and an allocator or polymorphic memory resource for allocating memory. The `execution_context` is constructed with an `execution_resource`; the `execution_context` then executes work or allocates memory for that `execution_resource` and any `execution_resource`s that it represents.
222264

223-
Below *(Listing 4)* is an example of how this extended interface could be used to construct an *execution context* from an *execution resource* which is retrieved from the *system’s resource topology*. Once an *execution context* is constructed it can then still be queried for its *execution resource* and then that *execution resource* can be further partitioned.
265+
Below *(Listing 5)* is an example of how this extended interface could be used to construct an *execution context* from an *execution resource* which is retrieved from the *system’s resource topology*. Once an *execution context* is constructed it can then still be queried for its *execution resource* and then that *execution resource* can be further partitioned.
224266

225267
```cpp
226268
auto resources = execution::this_system::resources(); // returned by value, so copy rather than bind a reference
@@ -235,11 +277,9 @@ for (auto res : systelLevelResource.resources()) {
235277
std::cout << res.name() << '\n';
236278
}
237279
```
238-
*Listing 4: Example of constructing an execution context from an execution resource*
280+
*Listing 5: Example of constructing an execution context from an execution resource*
239281
240-
### Binding execution and allocation to resources
241-
242-
When creating an `execution_context` from a given `execution_resource`, the executors and allocators associated with it are bound to that `execution_resource`. For example: when creating an `execution_context` from a CPU socket resource, all executors associated with the given socket will spawn execution agents with affinity to the socket partition of the system *(Listing 5)*.
282+
When creating an `execution_context` from a given `execution_resource`, the executors and allocators associated with it are bound to that `execution_resource`. For example: when creating an `execution_context` from a CPU socket resource, all executors associated with the given socket will spawn execution agents with affinity to the socket partition of the system *(Listing 6)*.
243283
244284
```cpp
245285
auto cList = std::execution::this_system::resources();
@@ -252,14 +292,20 @@ auto socketAllocator = eC.allocator(); // Retrieve an allocator to the closest m
252292
std::vector<int, decltype(socketAllocator)> v1(100, socketAllocator);
253293
std::generate(par.on(executor), std::begin(v1), std::end(v1), std::rand);
254294
```
255-
*Listing 5: Example of allocating with affinity to an execution resource*
295+
*Listing 6: Example of allocating with affinity to an execution resource*
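
The way a standard container can carry a resource-bound allocator can be sketched with a minimal custom allocator. This is an illustration under stated assumptions, not the allocator the proposal would provide: `resource_allocator` and its `resource_id` member are hypothetical, and the sketch simply falls back to the global heap where a real implementation would request memory close to the bound resource.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical allocator tagged with the id of the resource it is bound to.
// A real implementation would place memory near that resource; this sketch
// only records the id and allocates from the global heap.
template <class T>
struct resource_allocator {
  using value_type = T;

  int resource_id;

  explicit resource_allocator(int id) noexcept : resource_id(id) {}

  // Rebinding constructor required by the standard Allocator requirements.
  template <class U>
  resource_allocator(const resource_allocator<U>& other) noexcept
      : resource_id(other.resource_id) {}

  T* allocate(std::size_t n) {
    return static_cast<T*>(::operator new(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

// Two allocators are interchangeable only if bound to the same resource.
template <class T, class U>
bool operator==(const resource_allocator<T>& a,
                const resource_allocator<U>& b) noexcept {
  return a.resource_id == b.resource_id;
}
template <class T, class U>
bool operator!=(const resource_allocator<T>& a,
                const resource_allocator<U>& b) noexcept {
  return !(a == b);
}
```

A `std::vector<int, resource_allocator<int>>` constructed with such an allocator keeps a copy of it, so the binding travels with the container without any change to code that only uses the vector's element interface.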
256296

257297
The construction of an `execution_context` on a component implies affinity (where possible) to the given resource. This guarantees that all executors created from that `execution_context` can access the resources and the internal data structures required to guarantee the placement of the processor.
258298

259299
Only developers that care about resource placement need to care about obtaining executors and allocators from the correct `execution_context` object. Existing code for vectors and STL (including the Parallel STL interface) remains unaffected.
260300

261301
If a particular policy or algorithm needs to access placement information, the resources associated with the passed executor can be retrieved via the link to the `execution_context`.
262302

303+
### Binding to execution
304+
305+
A *thread of execution* can be bound to a particular `execution_resource` for a particular *execution agent* by calling `this_thread::bind`. After a successful call, the *execution resource* returned by `this_thread::get_resource` must be equal to the `execution_resource` provided to `this_thread::bind`. Subsequently, a *thread of execution* can be unbound by calling `this_thread::unbind`.
306+
307+
> [*Note:* Binding *threads of execution* can provide performance benefits when used in a way which complements the application; however, incorrect usage can lead to denial of service and therefore to a loss of performance. *--end note*]
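
The contract between binding, querying and unbinding can be sketched with a small mock that tracks the binding per thread. This is a sketch under stated assumptions: `mock_resource` and the `mock_this_thread` namespace are hypothetical, an id of 0 is used to mean "not bound", and nothing here touches real operating system affinity.

```cpp
#include <cassert>

// Hypothetical resource handle: an id, with 0 meaning "no resource".
struct mock_resource {
  int id;
  friend bool operator==(mock_resource a, mock_resource b) {
    return a.id == b.id;
  }
};

namespace mock_this_thread {

// Per-thread record of the current binding (an assumption of this sketch).
inline mock_resource& current() {
  thread_local mock_resource r{0};
  return r;
}

// Binds the calling thread; fails for the "no resource" handle.
inline bool bind(mock_resource r) noexcept {
  if (r.id == 0) return false;
  current() = r;
  return true;
}

// Unbinds the calling thread; fails if it is not bound to `r`.
inline bool unbind(mock_resource r) noexcept {
  if (!(current() == r)) return false;
  current() = mock_resource{0};
  return true;
}

// Returns the resource the calling thread is currently bound to.
inline mock_resource get_resource() noexcept { return current(); }

}  // namespace mock_this_thread
```

In this model a successful `bind` is immediately observable through `get_resource`, and an `unbind` naming a different resource reports failure rather than silently clearing the binding.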
308+
263309
## Header `<execution>` synopsis
264310

265311
namespace std {
@@ -349,11 +395,16 @@ If a particular policy or algorithm requires to access placement information, th
349395
std::vector<execution_resource> resources() noexcept;
350396
}
351397

398+
namespace this_thread {
399+
bool bind(execution_resource) noexcept;
400+
bool unbind(execution_resource) noexcept;
401+
}
402+
352403
} // execution
353404
} // experimental
354405
} // std
355406

356-
*Listing 6: Header synopsis*
407+
*Listing 7: Header synopsis*
357408

358409
## Class `execution_resource`
359410

@@ -515,6 +566,20 @@ The free function `this_system::resources` is provided for retrieving the `execu
515566

516567
> [*Note:* Returning a `std::vector` allows users to potentially manipulate the container of `execution_resource`s after it is returned. We may want to replace this with a more restrictive alternative type, such as a range, at a later date. *--end note*]
517568
569+
The free functions `this_thread::bind` and `this_thread::unbind` are provided for binding / unbinding the current *thread of execution* to / from a particular `execution_resource`.
570+
571+
bool bind(execution_resource) noexcept;
572+
573+
*Returns:* `true` if the requested binding was successful, otherwise `false`.
574+
575+
*Effects:* If successful, binds the current *thread of execution* to the specified `execution_resource`.
576+
577+
bool unbind(execution_resource) noexcept;
578+
579+
*Returns:* `true` if the requested unbinding was successful, otherwise `false`.
580+
581+
*Effects:* If successful, unbinds the current *thread of execution* from the specified `execution_resource`.
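
Because both functions report success through a `bool`, callers may want to pair them so that an unbind is attempted only after a successful bind. One way to do that is a scope guard, sketched below. The `scoped_binding` class is a hypothetical helper that is not part of the proposal; it is written generically over the bind and unbind callables so that it does not assume any particular implementation of `this_thread::bind` or `this_thread::unbind`.

```cpp
#include <cassert>
#include <utility>

// Hypothetical guard: calls bind(resource) on construction and, only if that
// call returned true, calls unbind(resource) on destruction.
template <class Resource, class Bind, class Unbind>
class scoped_binding {
 public:
  scoped_binding(Resource r, Bind bind, Unbind unbind)
      : resource_(std::move(r)),
        unbind_(std::move(unbind)),
        bound_(bind(resource_)) {}

  ~scoped_binding() {
    if (bound_) {
      unbind_(resource_);  // result ignored: a destructor cannot report it
    }
  }

  // Non-copyable so that a binding is released exactly once.
  scoped_binding(const scoped_binding&) = delete;
  scoped_binding& operator=(const scoped_binding&) = delete;

  // Whether the bind request made at construction succeeded.
  bool bound() const noexcept { return bound_; }

 private:
  Resource resource_;
  Unbind unbind_;
  bool bound_;
};
```

The guard exposes `bound()` so that a caller can fall back to unbound execution when the request is refused, instead of treating a failed bind as an error.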
582+
518583
# Future Work
519584

520585
## Migrating data from memory allocated in one partition to another
