Commit 7386f3d

author
Gordon Brown
committed
Issue #33 Apply changes based on feedback.
* Move the wording of the bulk_execution_affinity properties to the synopsis and proposed wording.
* Add additional notes discussing the behaviour of the bulk_execution_affinity properties.
* Remove the straw poll on having a high-level interface.
* Add a new straw poll discussing who should have control over the bulk_execution_affinity properties.
1 parent ea2ccd7 commit 7386f3d

1 file changed

+35
-14
lines changed

affinity/cpp-20/d0796r2.md

Lines changed: 35 additions & 14 deletions
@@ -132,7 +132,7 @@ The interface of `thread_execution_resource_t` proposed in the execution context
132132
133133
The interface for querying the *resource topology* of a *system* must be flexible enough to allow querying all *execution resources* available under an *execution context*, querying the *execution resources* available to the entire system, and constructing an *execution context* for a particular *execution resource*. This is important, as many standards such as OpenCL [[6]][opencl-2-2] and HSA [[7]][hsa] require the ability to query the *resource topology* available in a *system* before constructing an *execution context* for executing work.
134134
135-
> For example, an implementation may provide an execution context for a particular execution resource such as a static thread pool or a GPU context for a particular GPU device, or an implementation may provide a more generic execution context which can be constructed from a number of CPU and GPU devices queryable through the system resource topology.
135+
> For example, an implementation may provide an execution context for a particular execution resource such as a static thread pool or a GPU context for a particular GPU device, or an implementation may provide a more generic execution context which can be constructed from a number of CPU and GPU devices query-able through the system resource topology.
136136
137137
### Topology discovery & fault tolerance
138138
@@ -188,11 +188,7 @@ The high-level interface is a policy-based design which utilizes the executor pr
188188
189189
### Bulk execution affinity
190190
191-
In this paper we propose an executor property group called `bulk_execution_affinity` which contains the sub properties `none`, `balanced`, `scatter` or `compact`. Each of these properties, if applied to an *executor* enforce a particular guarantee of execution agent binding to the *execution resources* associated with the *executor* in a particular pattern:
192-
* **none** makes no guarantee that *execution agents* created by the *executor* will be bound to specific *execution resources*.
193-
* **balanced** guarantees that *execution agents* created by the executor will be bound to the *execution resources* associated with the *executor* close together in sequence but with an even distribution across the *execution resources*.
194-
* **scatter** guarantees that *execution agents* created by the executor will be bound to the *execution resources* associated with the *executor* distributed with each *execution agent* far from each other *execution agent* in sequence.
195-
* **compact** guarantees that *execution agents* created by the executor will be bound to the *execution resources* associated with the *executor* close together in sequence.
191+
In this paper we propose an executor property group called `bulk_execution_affinity` which contains the nested properties `none`, `balanced`, `scatter` and `compact`. Each of these properties, if applied to an *executor*, enforces a particular guarantee of execution agent binding to the *execution resource*s associated with the *executor* in a particular pattern.
196192
197193
Below *(Listing 2)* is an example of executing a parallel task over 8 threads using `bulk_execute`, with the affinity binding `bulk_execution_affinity.scatter`.
198194
@@ -211,8 +207,6 @@ Below *(Listing 2)* is an example of executing a parallel task over 8 threads us
211207
```
212208
*Listing 2: Example of using the bulk_execution_affinity property*
213209

214-
> [*Note:* The terms used for the `bulk_execution_affinity` property group are derived from the OpenMP properties [[33]][openmp-affinity] including the Intel specific balanced affinity binding [[[34]][intel-balanced-affinity] *--end note*]
215-
216210
## Low-level interface
217211

218212
### Execution resources
@@ -319,6 +313,12 @@ A *thread of execution* can be requested to bind to a particular `execution_reso
319313
namespace experimental {
320314
namespace execution {
321315

316+
/* Bulk execution affinity properties */
317+
318+
struct bulk_execution_affinity_t;
319+
320+
constexpr bulk_execution_affinity_t bulk_execution_affinity;
321+
322322
/* Execution resource */
323323

324324
class execution_resource {
@@ -418,6 +418,25 @@ A *thread of execution* can be requested to bind to a particular `execution_reso
418418

419419
*Listing 7: Header synopsis*
420420

421+
## Bulk execution affinity properties
422+
423+
The bulk_execution_affinity_t property describes what guarantees executors provide about the binding of *execution agent*s to the underlying *execution resource*s.
424+
425+
bulk_execution_affinity_t provides nested property types and objects as described below.
426+
427+
| Nested Property Type | Nested Property Name | Requirements |
428+
|----------------------|----------------------|--------------|
429+
| bulk_execution_affinity_t::none_t | bulk_execution_affinity_t::none | A call to an executor's bulk execution function may or may not bind the *execution agent*s to the underlying *execution resource*s. The affinity binding pattern may or may not be consistent across invocations of the executor's bulk execution function. |
430+
| bulk_execution_affinity_t::scatter_t | bulk_execution_affinity_t::scatter | A call to an executor's bulk execution function must bind the *execution agent*s to the underlying *execution resource*s such that they are distributed across the *execution resource*s, where each *execution agent* is far from its preceding and following *execution agent*s. The affinity binding pattern must be consistent across invocations of the executor's bulk execution function. |
431+
| bulk_execution_affinity_t::compact_t | bulk_execution_affinity_t::compact | A call to an executor's bulk execution function must bind the *execution agent*s to the underlying *execution resource*s such that they are in sequence across the *execution resource*s, where each *execution agent* is close to its preceding and following *execution agent*s. The affinity binding pattern must be consistent across invocations of the executor's bulk execution function. |
432+
| bulk_execution_affinity_t::balanced_t | bulk_execution_affinity_t::balanced | A call to an executor's bulk execution function must bind the *execution agent*s to the underlying *execution resource*s such that they are in sequence and evenly spread across the *execution resource*s, where each *execution agent* is close to its preceding and following *execution agent*s and all *execution resource*s are utilized. The affinity binding pattern must be consistent across invocations of the executor's bulk execution function. |
433+
434+
> [*Note:* The requirements of the `bulk_execution_affinity_t` nested properties do not enforce a specific binding, simply that the binding follows the requirements set out above and that the pattern is consistent across invocations of the bulk execution functions. *--end note*]
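
As an illustration only, the three binding patterns in the table above can be sketched as a mapping from agent index to resource index. The enum `affinity_pattern`, the function `place_agents` and the `slots_per_resource` parameter are hypothetical names introduced for this sketch; they are not part of the proposed interface, and an implementation may choose any binding that meets the stated requirements.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the scatter/compact/balanced nested properties.
enum class affinity_pattern { scatter, compact, balanced };

// Illustrative only: returns, for each agent index, the index of the
// resource it would be bound to. `slots_per_resource` is an assumed
// capacity and is only consulted for the compact pattern.
inline std::vector<std::size_t> place_agents(affinity_pattern pattern,
                                             std::size_t agents,
                                             std::size_t resources,
                                             std::size_t slots_per_resource) {
  std::vector<std::size_t> binding;
  if (resources == 0 || slots_per_resource == 0) return binding;
  binding.reserve(agents);
  for (std::size_t i = 0; i < agents; ++i) {
    switch (pattern) {
      case affinity_pattern::scatter:
        // Round-robin: neighbouring agents land on different resources.
        binding.push_back(i % resources);
        break;
      case affinity_pattern::compact:
        // Fill one resource to capacity before moving to the next.
        binding.push_back((i / slots_per_resource) % resources);
        break;
      case affinity_pattern::balanced:
        // Contiguous blocks of agents spread evenly over all resources.
        binding.push_back(i * resources / agents);
        break;
    }
  }
  return binding;
}
```

With 8 agents, 4 resources and 4 slots per resource, this sketch yields 0,1,2,3,0,1,2,3 for scatter, 0,0,0,0,1,1,1,1 for compact and 0,0,1,1,2,2,3,3 for balanced. Because the mapping depends only on its arguments, the pattern is the same on every invocation, which is the consistency the requirements ask for.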
435+
436+
> [*Note:* If two *executor*s `e1` and `e2` invoke a bulk execution function in order, where `execution::query(e1, execution::context) == query(e2, execution::context)` is `true` and `execution::query(e1, execution::bulk_execution_affinity) == query(e2, execution::bulk_execution_affinity)` is `false`, this will likely result in `e1` binding *execution agent*s if necessary to achieve the requested affinity pattern and then `e2` rebinding to achieve the new affinity pattern. *--end note*]
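
The cost described in the note above can be made concrete with a small model. The sketch below is purely illustrative: `requested_affinity`, `context_model` and `bulk_execute_with` are hypothetical names, and the model assumes that a shared execution context remembers the last pattern applied and rebinds only when a different pattern is requested.

```cpp
// Hypothetical stand-in for the bulk_execution_affinity_t nested properties.
enum class requested_affinity { none, scatter, compact, balanced };

// Toy model of an execution context shared by several executors.
struct context_model {
  requested_affinity bound = requested_affinity::none;  // last pattern applied
  int rebinds = 0;                                      // number of (re)bindings
};

// Models one bulk execution call from an executor with the given property.
// `none` imposes no requirement, so it never forces a rebind in this model.
inline void bulk_execute_with(context_model& ctx, requested_affinity requested) {
  if (requested != requested_affinity::none && requested != ctx.bound) {
    ctx.bound = requested;
    ++ctx.rebinds;
  }
}
```

In this model, alternating between two executors that request `scatter` and `compact` on the same context rebinds on every call, whereas repeated calls with a single pattern rebind only once.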
437+
438+
> [*Note:* The terms used for the `bulk_execution_affinity_t` nested properties are derived from the OpenMP properties [[33]][openmp-affinity] including the Intel-specific balanced affinity binding [[34]][intel-balanced-affinity] *--end note*]
439+
421440
## Class `execution_resource`
422441

423442
The `execution_resource` class provides an abstraction over a system's hardware capable to allocate memory, execute light weight execution agents or both. An `execution_resource` can represent further `execution_resource`s, these `execution_resource`s are said to be *members of* this `execution_resource`.
@@ -610,21 +629,23 @@ The free function `this_thread::get_resource` is provided for retrieving the `ex
610629

611630
# Future Work
612631

613-
## Migrating data from memory allocated in one partition to another
632+
## Who should have control over bulk execution affinity?
614633

615-
In some cases for performance it is important to bind a memory allocation to a memory region for the duration of an a tasks execution, however in other cases it’s important to be able to migrate the data from one memory region to another. This is outside the scope of this paper, however we would like to investigate this in a future paper.
634+
This paper currently proposes the `bulk_execution_affinity_t` property and its nested properties for allowing an *executor* to make guarantees as to how *execution agent*s are bound to the underlying *execution resource*s. However, providing control at this level may lead to *execution agent*s being bound to *execution resource*s within a critical path. A possible solution to this is to allow the *execution context* to be configured with `bulk_execution_affinity_t` nested properties, either instead of the *executor* property or in addition. This would allow the binding of *threads of execution* to be performed at the time of the *execution context* creation.
616635

617636
| Straw Poll |
618637
|------------|
619-
| Should the interface provide a way of migrating data between partitions? |
638+
| Should the *execution context* be able to manage the binding of all *threads of execution* which it manages using the `bulk_execution_affinity_t` nested properties? |
639+
| Should the *executor* be able to manage the binding of all *execution agent*s which it manages using the `bulk_execution_affinity_t` nested properties? |
640+
| Should both the *execution context* and the *executor* be able to manage the binding of *threads of execution* and subsequently *execution agent*s using the `bulk_execution_affinity_t` nested properties? |
620641

621-
## Defining memory placement algorithms or policies
642+
## Migrating data from memory allocated in one partition to another
622643

623-
With the ability to place memory with affinity comes the ability to define algorithms or memory policies which describe at a higher level how memory is distributed across large systems. Some examples of these are pinned, first touch and scatter. This is outside the scope of this paper, however we would like to investigate this in a future paper.
644+
In some cases, for performance, it is important to bind a memory allocation to a memory region for the duration of a task's execution; however, in other cases it’s important to be able to migrate the data from one memory region to another. This is outside the scope of this paper; however, we would like to investigate this in a future paper.
624645

625646
| Straw Poll |
626647
|------------|
627-
| Should the interface provide standard algorithms or policies for distributing memory? |
648+
| Should the interface provide a way of migrating data between partitions? |
628649

629650
## Level of abstraction
630651
