Commit 312a75c

Author: Gordon Brown
Issue #33: Minor corrections & additions based on feedback.
* Add minor corrections for typos. * Add additional wording and notes describing the behaviour of `bind` and `unbind`.
1 parent 31701f2 commit 312a75c

1 file changed

+36
-24
lines changed

affinity/cpp-20/d0796r2.md

Lines changed: 36 additions & 24 deletions
@@ -16,7 +16,7 @@
1616

1717
### P0796r2 (RAP)
1818

19-
* Introduce `this_thread::bind` & `this_thread::unbind` for binding a thread of execution to an execution resource.
19+
* Introduce `this_thread::bind` & `this_thread::unbind` for binding and unbinding a thread of execution to an execution resource.
2020
* Introduce high-level interface for execution binding via executor properties.
2121

2222
### P0796r1 (JAX)
@@ -173,21 +173,21 @@ This feature could be easily scaled to heterogeneous and distributed systems, as
173173
174174
## Overview
175175
176-
In this paper we propose an interface for querying and representing the execution resources within a system, queurying the relative affinity metric between those execution resources, and then using those execution resources to allocate memory and execute work with affinity to the underlying hardware. The interface described in this paper builds on the existing initerface for executors and execution contexts defined in the executors proposal [[22]][p0443r4].
176+
In this paper we propose an interface for querying and representing the execution resources within a system, queurying the relative affinity metric between those execution resources, and then using those execution resources to allocate memory and execute work with affinity to the underlying hardware. The interface described in this paper builds on the existing interface for executors and execution contexts defined in the executors proposal [[22]][p0443r4].
177177
178-
### Interface grandularity
178+
### Interface granularity
179179
180180
In this paper we propose both a low-level interface and a high-level interface:
181-
* The low-level interface cosnsists of mechanisms for discovering detailed information about a system's topology and affinity properties which can be utilised to hand optimise parallel applications and libraries for the best performance. The low-level interface has high granularity and is aimed at users who have a high knowledge of the system architecture.
182-
* The high-level interface consists of policies which describe desired behaviour when using parallel algorithms or libraries. The high-level interface has low granularity and is aimed at users who may have little or no knowledge of the system architecture.
181+
* The low-level interface consists of mechanisms for discovering detailed information about a system's topology and affinity properties which can be utilized to hand optimise parallel applications and libraries for the best performance. The low-level interface has high granularity and is aimed at users who have a high knowledge of the system architecture.
182+
* The high-level interface consists of policies which describe desired behavior when using parallel algorithms or libraries. The high-level interface has low granularity and is aimed at users who may have little or no knowledge of the system architecture.
183183
184184
## High-level interface
185185
186-
The high-level interface is a polcy-based design which utilises the executor property mechanism to provide additional affinity based requirements on executors.
186+
The high-level interface is a policy-based design which utilizes the executor property mechanism to provide additional affinity based requirements on executors.
187187
188188
### Bulk execution affinity
189189
190-
In this paper we propose an executor property group called `bulk_execution_affinity` which contains the sub properties `none`, `balanced`, `scatter` or `compact`. Each of these properties, if applied to an *executor* enforce a particular guarantee of execution agent binding to the *execution resources* associated with the *executor* in a partuclar pattern:
190+
In this paper we propose an executor property group called `bulk_execution_affinity` which contains the sub properties `none`, `balanced`, `scatter` or `compact`. Each of these properties, if applied to an *executor* enforce a particular guarantee of execution agent binding to the *execution resources* associated with the *executor* in a particular pattern:
191191
* **none** makes no guarantee that *execution agents* created by the *executor* will be bound to specific *execution resources*.
192192
* **balanced** guarantees that *execution agents* created by the executor will be bound to the *execution resources* associated with the *executor* close together in sequence but with an even distribution across the *execution resources*.
193193
* **scatter** guarantees that *execution agents* created by the executor will be bound to the *execution resources* associated with the *executor* distributed with each *execution agent* far from each other *execution agent* in sequence.
@@ -210,21 +210,23 @@ Below *(Listing 2)* is an example of executing a parallel task over 8 threads us
210210
```
211211
*Listing 2: Example of using the bulk_execution_affinity property*
212212

213+
> [*Note:* The terms used for the `bulk_execution_affinity` property group are derived from the OpenMP properties [[33]][openmp-affinity] including the Intel specific balanced affinity binding [[[34]][intel-balanced-affinity] *--end note*]
214+
213215
## Low-level interface
214216

215217
### Execution resources
216218

217-
An `execution_resource` is a light weight structure which acts as an identifier to particular piece of hardware within a system. It can be queried for whether it can allocate memory via `can_place_memory` and whether it can execute work via `can_place_agents`, and for it's name via `name`. An `execution_resource` can also represent other `execution_resource`s, these are refered to as being *members of* that `execution_resource` and can be queried via `resources`. Additionally the `execution_resource` which another is a *member of* can be queried vis `member_of`. An `execution_resource` can also be queried for the concurrency it can provide; the total number of *threads of execution* supported by that *execution_resource* and all resources it represents.
219+
An `execution_resource` is a light weight structure which acts as an identifier to particular piece of hardware within a system. It can be queried for whether it can allocate memory via `can_place_memory` and whether it can execute work via `can_place_agents`, and for it's name via `name`. An `execution_resource` can also represent other `execution_resource`s, these are referred to as being *members of* that `execution_resource` and can be queried via `resources`. Additionally the `execution_resource` which another is a *member of* can be queried vis `member_of`. An `execution_resource` can also be queried for the concurrency it can provide; the total number of *threads of execution* supported by that *execution_resource* and all resources it represents.
218220

219221
> [*Note:* Note that an execution resource is not limited to resources which execute work, but also a general resource where no execution can take place but memory can be allocated such as off-chip memory. *--end note*]
220222
221223
> [*Note:* The intention is that the actual implementation details of a resource topology are described in an execution context when required. This allows the execution resource objects to be lightweight objects that serve as identifiers that are only referenced. *--end note*]
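The membership relationship and the concurrency query described above can be illustrated with a small model. This is only a sketch under stated assumptions: `resource_node`, `own_threads`, `concurrency` and `make_example_topology` are hypothetical names, and the value-type tree below stands in for the lightweight identifier the paper proposes. It is not the proposed `execution_resource` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical, simplified model of a resource hierarchy. The proposed
// execution_resource is a lightweight handle; this tree is an illustration.
struct resource_node {
  std::string name;
  bool can_place_memory = false;       // can memory be allocated here?
  bool can_place_agents = false;       // can work be executed here?
  std::size_t own_threads = 0;         // threads supported directly (assumed field)
  std::vector<resource_node> members;  // resources that are "members of" this one
};

// Total number of threads of execution supported by a resource and by all
// of the resources it represents.
inline std::size_t concurrency(const resource_node& r) {
  std::size_t total = r.own_threads;
  for (const auto& m : r.members) {
    total += concurrency(m);
  }
  return total;
}

// Illustrative topology: one package with two cores and one memory-only
// resource. The names and counts are made up for the example.
inline resource_node make_example_topology() {
  resource_node core0{"core0", false, true, 2, {}};
  resource_node core1{"core1", false, true, 2, {}};
  resource_node memory{"off-chip-memory", true, false, 0, {}};
  return resource_node{"package0", true, true, 0, {core0, core1, memory}};
}

}  // namespace sketch
```

In this model the memory-only member contributes nothing to the concurrency of its parent, which mirrors the note that a resource may allow memory placement without being able to execute work.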
222224
223225
### System topology
224226

225-
The system topology is made up of a number of system level `execution_resource`s, which can be queried through `this_system::resource` which returns a `std::vector`. The `execution_resources` available within the system can be initialised dynamically by a runtime library, however must be done so before `main` is called, given that after that point the system topology cannot change.
227+
The system topology is made up of a number of system level `execution_resource`s, which can be queried through `this_system::resource` which returns a `std::vector`. The `execution_resources` available within the system can be initialized dynamically by a runtime library, however must be done so before `main` is called, given that after that point the system topology cannot change.
226228

227-
Below *(Listing 3)* is an example of iterating over the system level resources and priniting out it's capabilities.
229+
Below *(Listing 3)* is an example of iterating over the system level resources and printing out it's capabilities.
228230

229231
```cpp
230232
for (auto res : execution::this_system::resources()) {
@@ -238,7 +240,7 @@ for (auto res : execution::this_system::resources()) {
238240

239241
### Querying relative affinity
240242

241-
The `affinity_query` class template provides an abstraction for a relative affinity value between two `execution_resource`s, derived from a particular `affinity_operation` and `affinity_metric`. The `affinity_query` is templated by `affinity_operation` and `affinity_metric` and is constructed from two `execution_resource`s. An `affinity_query` does not mean much on it's own, instead a relative magnitude of affinity can be queried by using comparison operators. If nessesary the value of an `affinity_query` can also be queried through `native_affinity`, though the return value of this is implementation defined.
243+
The `affinity_query` class template provides an abstraction for a relative affinity value between two `execution_resource`s, derived from a particular `affinity_operation` and `affinity_metric`. The `affinity_query` is templated by `affinity_operation` and `affinity_metric` and is constructed from two `execution_resource`s. An `affinity_query` does not mean much on it's own, instead a relative magnitude of affinity can be queried by using comparison operators. If necessary the value of an `affinity_query` can also be queried through `native_affinity`, though the return value of this is implementation defined.
242244

243245
Below *(listing 4)* is an example of how you can query the relative affinity between two `execution_resource`s.
244246

@@ -302,7 +304,7 @@ If a particular policy or algorithm requires to access placement information, th
302304

303305
### Binding to execution
304306

305-
A *thread of execution* can be bound to a particular `execution_resource` for a particular *execution agent* by calling `this_thread::bind`. After which point the *execution resource* returned by `this_thread::get_resource` must be equal to the `execution_resource` provided to `this_thread::bind`. Subsequently a *thread of execution* can be unbound by calling `this_thread::unbind`.
307+
A *thread of execution* can be requested to bind to a particular `execution_resource` for a particular *execution agent* by calling `this_thread::bind` if that `execution_resource` is able to place agents. If the current *thread of execution* is successfully bound to the specified `execution_resource` it will return `true` otherwise it will return `false`. If the *thread of execution* is successfully bound to the specified `execution_resource` then `execution_resource` returned by `this_thread::get_resource` must be equal to the `execution_resource` provided to `this_thread::bind`. Subsequently a *thread of execution* can be unbound by calling `this_thread::unbind`.
306308

307309
> [*Note:* Binding *threads of execution* can provide performance benefits when used in a way which compliments the application, however incorrect usage can lead to denial of service and therefore can cause loss of performance. *--end note*]
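The revised `bind` / `unbind` wording can be sketched with a small mock. Everything in the `sketch` namespace is a hypothetical stand-in: no operating system affinity call is made, `get_resource` returns a `std::optional` purely for illustration, and a resource that cannot place agents is rejected with `false`, whereas the paper states this as a *Requires* precondition.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-ins for the proposed types; this is not a real
// standard library API and performs no actual thread binding.
namespace sketch {

class execution_resource {
 public:
  execution_resource(std::string name, bool can_place_agents)
      : name_(std::move(name)), can_place_agents_(can_place_agents) {}

  const std::string& name() const noexcept { return name_; }
  bool can_place_agents() const noexcept { return can_place_agents_; }

  friend bool operator==(const execution_resource& a,
                         const execution_resource& b) noexcept {
    return a.name_ == b.name_ && a.can_place_agents_ == b.can_place_agents_;
  }

 private:
  std::string name_;
  bool can_place_agents_;
};

namespace this_thread {

// Per-thread record of the current binding (empty when unbound).
inline std::optional<execution_resource>& current() {
  thread_local std::optional<execution_resource> bound;
  return bound;
}

// Returns true and records the binding only if eR can place agents.
inline bool bind(execution_resource eR) noexcept {
  if (!eR.can_place_agents()) {
    return false;
  }
  current() = std::move(eR);
  return true;
}

// Returns true only if this thread is currently bound to eR.
inline bool unbind(execution_resource eR) noexcept {
  if (!current().has_value() || !(*current() == eR)) {
    return false;
  }
  current().reset();
  return true;
}

// After a successful bind, compares equal to the resource passed to bind.
inline std::optional<execution_resource> get_resource() {
  return current();
}

}  // namespace this_thread
}  // namespace sketch
```

A caller would check the return value of `bind` before relying on `get_resource`, since the wording above only guarantees equality when the binding succeeded.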
308310
@@ -396,8 +398,8 @@ A *thread of execution* can be bound to a particular `execution_resource` for a
396398
}
397399

398400
namespace this_thread {
399-
bool bind(executon_resource) noexcept;
400-
bool unbind(executon_resource) noexcept;
401+
bool bind(execution_resource eR) noexcept;
402+
bool unbind(execution_resource eR) noexcept;
401403
}
402404

403405
} // execution
@@ -410,7 +412,7 @@ A *thread of execution* can be bound to a particular `execution_resource` for a
410412

411413
The `execution_resource` class provides an abstraction over a system's hardware capable to allocate memory, execute light weight execution agents or both. An `execution_resource` can represent further `execution_resource`s, these `execution_resource`s are said to be *members of* this `execution_resource`.
412414

413-
> [*Note:* The `execution_resource` is required to be implemented such that the underlying software abstraction is initialised when the `execution_resource` is constructed, maintained through reference counting and cleaned up on destruction of the final reference. *--end note*]
415+
> [*Note:* The `execution_resource` is required to be implemented such that the underlying software abstraction is initialized when the `execution_resource` is constructed, maintained through reference counting and cleaned up on destruction of the final reference. *--end note*]
414416
415417
### `execution_resource` constructors
416418

@@ -457,13 +459,13 @@ The `execution_resource` class provides an abstraction over a system's hardware
457459

458460
## Class `execution_context`
459461

460-
The `execution_context` class provides an abstraction for managing a number of light weight execution agents executing work on an `execution_resource` and any `execution_resource`s encapsulated by it. The `execution_resource` which an `execution_context` encapsulates is refered to as the *contained resource*.
462+
The `execution_context` class provides an abstraction for managing a number of light weight execution agents executing work on an `execution_resource` and any `execution_resource`s encapsulated by it. The `execution_resource` which an `execution_context` encapsulates is referred to as the *contained resource*.
461463

462464
### `execution_context` types
463465

464466
using executor_type = see-below;
465467

466-
*Requires:* `executor_type` is an implementation defined class which satifies the general executor requires, as specified by P0443r5.
468+
*Requires:* `executor_type` is an implementation defined class which satisfies the general executor requires, as specified by P0443r5.
467469

468470
using pmr_memory_resource_type = see-below;
469471

@@ -547,7 +549,7 @@ The `affinity_query` class template provides an abstraction for a relative affin
547549
friend expected<size_t, error_type> operator>=(const affinity_query&, const affinity_query&);
548550

549551
*Returns:* An `expected<size_t, error_type>` where,
550-
* if the affinity query was succesful, the value of type `size_t` represents the magnitude of the relative affinity;
552+
* if the affinity query was successful, the value of type `size_t` represents the magnitude of the relative affinity;
551553
* if the affinity query was not successful, the error is an error of type `error_type` which represents the reason for affinity query failed.
552554

553555
> [*Note:* An affinity query is permitted to fail if affinity between the two execution resources cannot be calculated for any reason, such as the resources are of different vendors or communication between the resources is not possible. *--end note*]
@@ -556,7 +558,7 @@ The `affinity_query` class template provides an abstraction for a relative affin
556558
557559
## Free functions
558560

559-
The free function `this_system::resources` is provided for retrieving the `execution_resource`s which encapsulate the hardware platforms available within the system, these are refered to as the *system level resources*.
561+
The free function `this_system::resources` is provided for retrieving the `execution_resource`s which encapsulate the hardware platforms available within the system, these are referred to as the *system level resources*.
560562

561563
std::vector<execution_resource> resources() noexcept;
562564

@@ -566,18 +568,22 @@ The free function `this_system::resources` is provided for retrieving the `execu
566568

567569
> [*Note:* Returning a `std::vector` allows users to potentially manipulate the container of `execution_resource`s after it is returned, we may want to replace this with an alternative type which is more restrictive at a later date such as a range. *--end note*]
568570
569-
The free functions `this_thread::bind` and `this_thread::unbind` are provided for binding / unbinding the current *thread of execution* to / from a particular `execution_reosurce`.
571+
The free functions `this_thread::bind` and `this_thread::unbind` are provided for binding / unbinding the current *thread of execution* to / from a particular `execution_resource`.
570572

571-
bool bind(executon_resource) noexcept;
573+
bool bind(execution_resource eR) noexcept;
572574

573575
*Returns:* `true` if the requested binding was successful, otherwise `false`.
574576

577+
*Requires:* `eR.can_place_agents() == true`.
578+
575579
*Effects:* If successful, binds the current *thread of execution* to the specified `execution_resource`.
576580

577-
bool unbind(executon_resource) noexcept;
581+
bool unbind(execution_resource eR) noexcept;
578582

579583
*Returns:* `true` if the requested unbinding was successful, otherwise `false`.
580584

585+
*Requires:* `eR.can_place_agents() == true`.
586+
581587
*Effects:* If successful, unbinds the current *thread of execution* from the specified `execution_resource`.
582588

583589
# Future Work
@@ -600,7 +606,7 @@ With the ability to place memory with affinity comes the ability to define algor
600606

601607
## Level of abstraction
602608

603-
The current proposal provides an interface for querying whether an `execution_resource` can allocate and/or execute work, it can provide the concurrency it supports and it can provide a name. We also provide the `affinity_query` structure for querying the relative affinity metrics between two `execution_resource`s. However this may not be enough information for users to take full advance of the system, they may also want to know what kind of memory is available or the properties by which work is executed. It was decided that attempting to enumerate the various hardware components would not be ideal as that would make it harder for implementers to support new hardware. It has been discussed that a better approach would be to parameterise the additional properties of hardware such that hardware queries could be much more generic.
609+
The current proposal provides an interface for querying whether an `execution_resource` can allocate and/or execute work, it can provide the concurrency it supports and it can provide a name. We also provide the `affinity_query` structure for querying the relative affinity metrics between two `execution_resource`s. However this may not be enough information for users to take full advance of the system, they may also want to know what kind of memory is available or the properties by which work is executed. It was decided that attempting to enumerate the various hardware components would not be ideal as that would make it harder for implementors to support new hardware. It has been discussed that a better approach would be to parameterize the additional properties of hardware such that hardware queries could be much more generic.
604610

605611
We may wish to mirror the design of the executors proposal and have a generic query interface using properties for querying information about an `execution_resource`. It’s expected that an implementation may provide additional nonstandard queries that are specific to that implementation.
606612

@@ -610,7 +616,7 @@ We may wish to mirror the design of the executors proposal and have a generic qu
610616

611617
## Dynamic topology discovery
612618

613-
The current proposal requires that all `execution_resource`s are initialised before `main` is called, therefore not allowing an `execution_resource` to become available or go offline at runtime. We may wish to support this in the future, however this is outside of the scope of this paper.
619+
The current proposal requires that all `execution_resource`s are initialized before `main` is called, therefore not allowing an `execution_resource` to become available or go offline at runtime. We may wish to support this in the future, however this is outside of the scope of this paper.
614620

615621
| Straw Poll |
616622
|------------|
@@ -712,3 +718,9 @@ The current proposal requires that all `execution_resource`s are initialised bef
712718

713719
[madness-journal]: http://dx.doi.org/10.1137/15M1026171
714720
[[32]][madness-journal] MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation
721+
722+
[openmp-affinity]: http://pages.tacc.utexas.edu/~eijkhout/pcse/html/omp-affinity.html
723+
[[33]][openmp-affinity] OpenMP topic: Affinity
724+
725+
[intel-balanced-affinity]: https://software.intel.com/en-us/node/522518
726+
[[34]][intel-balanced-affinity] Balanced Affinity Type
