Commit 1fc49ea

Merge pull request #73 from AerialMantis/issue-50
Issue #50: Make execution_resource iterable.
2 parents 392bcd8 + 92841d8 commit 1fc49ea

File tree

1 file changed

+104
-43
lines changed

affinity/cpp-20/d0796r3.md

Lines changed: 104 additions & 43 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,9 @@
1919
* Remove reference counting requirement from `execution_resource`.
2020
* Change lifetime model of `execution_resource`: it now either consistently identifies some underlying resource, or is invalid; context creation rejects an invalid resource.
2121
* Remove `this_thread::bind` & `this_thread::unbind` interfaces.
22+
* Make `execution_resource`s iterable by replacing `execution_resource::resources` with `execution_resource::begin` and `execution_resource::end`.
23+
* Add `size` and `operator[]` for `execution_resource`.
24+
* Rename `this_system::get_resources` to `this_system::discover_topology`.
2225

2326
### P0796r2 (RAP 2018)
2427

@@ -162,7 +165,13 @@ From a historic perspective, programming models for traditional high-performance
162165
163166
Some of these programming models also address *fault tolerance*. In particular, PVM has native support for this, providing a mechanism [[27]][pvm-callback] which can notify a program when a resource is added or removed from a system. MPI lacks a native *fault tolerance* mechanism, but there have been efforts to implement fault tolerance on top of MPI [[28]][mpi-post-failure-recovery] or by extensions[[29]][mpi-fault-tolerance].
164167
165-
Due to the complexity involved in standardizing *dynamic resource discovery* and *fault tolerance*, these are currently out of the scope of this paper. However, we leave open the possibility of accommodating both in the future, by not overconstraining *resources*' lifetimes (see next section).
168+
Due to the complexity involved in standardizing *dynamic resource discovery* and *fault tolerance*, these are currently out of the scope of this paper. However, we leave open the possibility of accommodating both in the future, by not over-constraining *resources*' lifetimes (see next section).
169+
170+
### Reporting errors in topology discovery
171+
172+
As querying the topology of a system can invoke a number of different system and third-party libraries, we have to consider what will happen when a call to one of these fails. Firstly, we want to be able to surface this failure so that it can be reported or handled in user code. Secondly, as there will often be more than one source of topology discovery, we have to avoid short-circuiting the discovery on an error and preventing potentially valid topology information from being reported to users. For example, if a system were to report both Hwloc and OpenCL execution resources and one of these failed, we want the other to still be able to return its resources.
173+
174+
A potential solution to this could be to support partial errors in topology discovery, where querying the system for its topology could be permitted to fail but still return a valid topology structure representing the topology that was discovered successfully. The way in which these errors are reported (i.e. exceptions or error values) would have to be decided. Exceptions could be problematic, as they could unwind the stack before important topology information is captured, so an error-value-based approach may be preferable.
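To make the error-value option more concrete, below is a minimal sketch of discovery that keeps going when one source fails. Every name in it (`backend`, `backend_result`, `discovery_report`, `discover_all`) is hypothetical and is not part of the proposed interface; resources are represented by plain strings purely for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result of querying one discovery source (e.g. Hwloc or OpenCL).
struct backend_result {
  std::vector<std::string> resources;  // names of the resources that were found
  std::string error;                   // empty when the query succeeded
};

// Hypothetical discovery source: a label plus a query function.
struct backend {
  std::string name;
  std::function<backend_result()> query;
};

// Hypothetical report: whatever was discovered, plus every error encountered.
struct discovery_report {
  std::vector<std::string> resources;
  std::vector<std::string> errors;
  bool complete() const noexcept { return errors.empty(); }
};

// Queries every source in order. A failing source is recorded as an error
// value and does not prevent the remaining sources from being queried.
discovery_report discover_all(const std::vector<backend> &backends) {
  discovery_report report;
  for (const backend &b : backends) {
    assert(static_cast<bool>(b.query));  // precondition assumed by this sketch
    backend_result result = b.query();
    if (!result.error.empty()) {
      report.errors.push_back(b.name + ": " + result.error);
      continue;  // no short-circuit: move on to the next source
    }
    for (std::string &resource : result.resources) {
      report.resources.push_back(std::move(resource));
    }
  }
  return report;
}
```

A caller would inspect `complete()` and the `errors` list after the call, while still being able to use the resources that were discovered successfully.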
166175
167176
### Resource lifetime
168177
@@ -267,26 +276,39 @@ Below *(Listing 2)* is an example of executing a parallel task over 8 threads us
267276

268277
## Execution resource topology
269278

270-
### Execution resources
279+
### System topology
271280

272-
An `execution_resource` is a lightweight structure which acts as an identifier to particular piece of hardware within a system. It can be queried for whether it can allocate memory via `can_place_memory`, whether it can execute work via `can_place_agents`, and for its name via `name`. An `execution_resource` can also represent other `execution_resource`s. We call these *members of* that `execution_resource`, and can be queried via `resources`. Additionally the `execution_resource` which another is a *member of* can be queried via `member_of`. An `execution_resource` can also be queried for the concurrency it can provide, the total number of *threads of execution* supported by that *execution_resource*, and all resources it represents.
281+
The **system topology** is comprised of a directed acyclic graph (DAG) of **execution resources**, representing all unique hardware and software components available within the system capable of executing work. The root node of the DAG is the **system execution resource** and represents the entire system. Each **execution resource** within the DAG may have any number of child **execution resources** representing a finer granularity of the parent **execution resource**. Every **execution resource** within the **system topology** is exposed via an `execution_resource` object.
273282

274-
> [*Note:* Note that an execution resource is not limited to resources which execute work, but also a general resource where no execution can take place but memory can be allocated such as off-chip memory. *--end note*]
283+
The **system topology** can be discovered by calling `this_system::discover_topology`. This will discover all **execution resources** available within the system and construct the **system topology** DAG, describing a read-only snapshot at the point of the call, and then return an `execution_resource` object exposing the **system execution resource**.
275284

276-
> [*Note:* The intention is that the actual implementation details of a resource topology are described in an execution context when required. This allows the execution resource objects to be lightweight objects that serve as identifiers that are only referenced. *--end note*]
285+
A call to `this_system::discover_topology` may invoke C++ library, system, or third-party library API calls required to discover certain **execution resources**. However, `this_system::discover_topology` must be thread safe and must initialize and finalize any OS or third-party state before returning.
277286

278-
### System topology
287+
### Execution resources
288+
289+
An `execution_resource` is a lightweight structure which identifies a particular **execution resource** within a snapshot of the **system topology**. It can be queried for whether the associated **execution resource** can allocate memory via `can_place_memory`, whether the associated **execution resource** can execute work via `can_place_agents`, and for a name via `name`.
279290

280-
The system topology is made up of a number of system-level `execution_resource`s, which can be queried through `this_system::get_resources` which returns a `std::vector`. A run-time library may initialize the `execution_resource`s available within the system dynamically. However, `this_system::get_resources` must be thread safe and must initialize and finalize any third-party or OS state before returning.
291+
An `execution_resource` object can be queried for a pointer to its parent `execution_resource` via `member_of`, and can also be iterated over for its child `execution_resource`s via `begin` and `end`.
281292

282-
Below *(Listing 3)* is an example of iterating over the system-level resources and printing out their capabilities.
293+
An `execution_resource` object can also be queried for the amount of concurrency it can provide, that is, the total number of **threads of execution** supported by the associated **execution resource**.
294+
295+
> [*Note:* An **execution resource** is not limited to resources which execute work, but also a general resource where no execution can take place but memory can be allocated, such as off-chip memory. *--end note*]
296+
297+
Below *(Listing 3)* is an example of iterating over every **execution resource** within the **system topology** and printing out their capabilities.
283298

284299
```cpp
285-
for (auto res : execution::this_system::get_resources()) {
286-
std::cout << res.name() `\n`;
287-
std::cout << res.can_place_memory() << `\n`;
288-
std::cout << res.can_place_agents() << `\n`;
289-
std::cout << res.concurrency() << `\n`;
300+
void print_topology(const execution::execution_resource &resource, int indent = 0) {
301+
for (int i = 0; i < indent; i++) { std::cout << " "; }
302+
std::cout << resource.name() << ": " << resource.can_place_memory() << ", "
303+
<< resource.can_place_agents() << ", " << resource.concurrency() << "\n";
304+
for (const execution::execution_resource &child : resource) {
305+
print_topology(child, indent + 1);
306+
}
307+
}
308+
309+
int main(int argc, char * argv[]) {
310+
auto systemResource = execution::this_system::discover_topology();
311+
print_topology(systemResource);
290312
}
291313
```
292314
*Listing 3: Example of querying all the system level execution resources*
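Since `execution_resource` is not available in any shipping implementation, the traversal in Listing 3 can be tried out against a stand-in type. The sketch below assumes a hypothetical `mock_resource` class that only mimics `name`, `size`, `begin`, `end` and `operator[]`; it stores its children by value and uses raw pointers as iterators, neither of which is required or implied by this paper.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the proposed execution_resource (not the real type).
class mock_resource {
 public:
  // A pointer into contiguous storage models a random access iterator.
  using const_iterator = const mock_resource *;

  explicit mock_resource(std::string name,
                         std::vector<mock_resource> children = {})
      : name_(std::move(name)), children_(std::move(children)) {}

  const std::string &name() const noexcept { return name_; }
  std::size_t size() const noexcept { return children_.size(); }
  const_iterator begin() const noexcept { return children_.data(); }
  const_iterator end() const noexcept {
    return children_.data() + children_.size();
  }
  const mock_resource &operator[](std::size_t child) const noexcept {
    assert(child < children_.size());  // precondition assumed by this sketch
    return children_[child];
  }

 private:
  std::string name_;
  std::vector<mock_resource> children_;
};

// Renders the topology depth-first: one resource per line, two spaces of
// indentation per level below the root.
std::string render_topology(const mock_resource &resource, int indent = 0) {
  std::string out(static_cast<std::size_t>(indent) * 2, ' ');
  out += resource.name();
  out += '\n';
  for (const mock_resource &child : resource) {
    out += render_topology(child, indent + 1);
  }
  return out;
}
```

Building the output as a string rather than writing to `std::cout` is only a convenience for checking the traversal order; the recursion itself has the same shape as `print_topology` in Listing 3.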
@@ -298,14 +320,13 @@ The `affinity_query` class template provides an abstraction for a relative affin
298320
Below *(Listing 4)* is an example of how to query the relative affinity between two `execution_resource`s.
299321
300322
```cpp
301-
auto systemLevelResources = execution::this_system::get_resources();
302-
auto memberResources = systemLevelResources.resources();
323+
auto systemResource = execution::this_system::discover_topology();
303324
304325
auto relativeLatency01 = execution::affinity_query<execution::affinity_operation::read,
305-
execution::affinity_metric::latency>(memberResources[0], memberResources[1]);
326+
execution::affinity_metric::latency>(systemResource[0], systemResource[1]);
306327
307328
auto relativeLatency02 = execution::affinity_query<execution::affinity_operation::read,
308-
execution::affinity_metric::latency>(memberResources[0], memberResources[2]);
329+
execution::affinity_metric::latency>(systemResource[0], systemResource[2]);
309330
310331
auto relativeLatency = relativeLatency01 > relativeLatency02;
311332
```
@@ -320,27 +341,27 @@ The `execution_context` class provides an abstraction for managing a number of l
320341
Below *(Listing 5)* is an example of how this extended interface could be used to construct an *execution context* from an *execution resource* which is retrieved from the *system’s resource topology*. Once an *execution context* is constructed it can then still be queried for its *execution resource*, and that *execution resource* can be further partitioned.
321342

322343
```cpp
323-
auto &resources = execution::this_system::get_resources();
344+
auto systemResource = execution::this_system::discover_topology();
324345

325-
execution::execution_context execContext(resources[0]);
346+
execution::execution_context execContext(systemResource[0]);
326347

327-
auto &systemLevelResource = execContext.resource();
348+
auto &execResource = execContext.resource();
328349

329-
// resource[0] should be equal to execResource
350+
// systemResource[0] should be equal to execResource
330351

331-
for (auto res : systemLevelResource.resources()) {
332-
std::cout << res.name() << `\n`;
352+
for (const execution::execution_resource &res : execResource) {
353+
std::cout << res.name() << "\n";
333354
}
334355
```
335356
*Listing 5: Example of constructing an execution context from an execution resource*
336357
337358
When creating an `execution_context` from a given `execution_resource`, the executors and allocators associated with it are bound to that `execution_resource`. For example, when creating an `execution_context` from a CPU socket resource, all executors associated with the given socket will spawn execution agents with affinity to the socket partition of the system *(Listing 6)*.
338359
339360
```cpp
340-
auto cList = std::execution::this_system::get_resources();
361+
auto systemResource = std::this_system::discover_topology();
341362
// findASocketResource is a user-defined function that finds a
342363
// resource that is a CPU socket in the given resource list
343-
auto& socket = findASocketResource(cList);
364+
auto& socket = findASocketResource(systemResource);
344365
execution_context eC{socket}; // Associated with the socket
345366
auto executor = eC.executor(); // By transitivity, associated with the socket too
346367
auto socketAllocator = eC.allocator(); // Retrieve an allocator to the closest memory node
@@ -378,18 +399,32 @@ The `execution_resource` which underlies the current thread of execution can be
378399
class execution_resource {
379400
public:
380401

402+
using value_type = execution_resource;
403+
using pointer = execution_resource *;
404+
using const_pointer = const execution_resource *;
405+
using iterator = see-below;
406+
using const_iterator = see-below;
407+
using reference = execution_resource &;
408+
using const_reference = const execution_resource &;
409+
using size_type = std::size_t;
410+
381411
execution_resource() = delete;
382412
execution_resource(const execution_resource &);
383413
execution_resource(execution_resource &&);
384414
execution_resource &operator=(const execution_resource &);
385415
execution_resource &operator=(execution_resource &&);
386416
~execution_resource();
387417

388-
size_t concurrency() const noexcept;
418+
size_type size() const noexcept;
419+
420+
const_iterator begin() const noexcept;
421+
const_iterator end() const noexcept;
389422

390-
std::vector<resource> resources() const noexcept;
423+
const_reference operator[](std::size_t child) const noexcept;
391424

392-
const execution_resource member_of() const noexcept;
425+
const_pointer member_of() const noexcept;
426+
427+
size_t concurrency() const noexcept;
393428

394429
std::string name() const noexcept;
395430

@@ -455,7 +490,7 @@ The `execution_resource` which underlies the current thread of execution can be
455490
/* This system */
456491

457492
namespace this_system {
458-
std::vector<execution_resource> resources() noexcept;
493+
const execution_resource discover_topology();
459494
}
460495

461496
/* This thread */
@@ -494,9 +529,21 @@ The `execution_resource` class provides an abstraction over a system's hardware,
494529

495530
> [*Note:* Creating an `execution_resource` may require initializing the underlying software abstraction when the `execution_resource` is constructed, in order to discover other `execution_resource`s accessible through it. However, an `execution_resource` is nonowning. *--end note*]
496531
532+
### `execution_resource` member types
533+
534+
iterator
535+
536+
*Requires:* `iterator` to model `RandomAccessIterator` with the value type `execution_resource::value_type`.
537+
538+
const_iterator
539+
540+
*Requires:* `const_iterator` to model `RandomAccessIterator` with the value type `execution_resource::value_type`.
541+
542+
iterator_traits<>iterator_category
543+
497544
### `execution_resource` constructors
498545

499-
execution_resource();
546+
execution_resource() = delete;
500547

501548
> [*Note:* An implementation of `execution_resource` is permitted to provide non-public constructors to allow other objects to construct them. *--end note*]
502549
@@ -517,31 +564,43 @@ The `execution_resource` class provides an abstraction over a system's hardware,
517564

518565
*Returns:* The total concurrency available to this resource. More specifically, the number of *threads of execution* collectively available to this `execution_resource` and any resources which are *members of*, recursively.
519566

520-
std::vector<resource> resources() const noexcept;
567+
size_type size() const noexcept;
521568

522-
*Returns:* All `execution_resource`s which are *members of* this resource.
569+
*Returns:* The number of child `execution_resource`s.
523570

524-
const execution_resource &member_of() const noexcept;
571+
const_iterator begin() const noexcept;
525572

526-
*Returns:* The `execution_resource` which this resource is a *member of*.
573+
*Returns:* A const iterator to the beginning of the child `execution_resource`s.
574+
575+
const_iterator end() const noexcept;
576+
577+
*Returns:* A const iterator to the end of the child `execution_resource`s.
578+
579+
const_reference operator[](std::size_t child) const noexcept;
580+
581+
*Returns:* A const reference to the specified child `execution_resource`.
582+
583+
const_pointer member_of() const noexcept;
584+
585+
*Returns:* The parent `execution_resource`.
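As an illustration of how a parent pointer might be used, the sketch below walks from a resource up to the root of the topology. The `resource_node` type is a hypothetical stand-in, and the convention that the root's `member_of` returns `nullptr` is an assumption of this sketch; the wording above does not specify what is returned for the **system execution resource**.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for execution_resource, reduced to a name and a parent.
struct resource_node {
  std::string name;
  const resource_node *parent;  // assumption: nullptr for the root resource

  const resource_node *member_of() const noexcept { return parent; }
};

// Counts how many parents lie between `resource` and the root.
std::size_t depth_of(const resource_node &resource) {
  std::size_t depth = 0;
  for (const resource_node *p = resource.member_of(); p != nullptr;
       p = p->member_of()) {
    ++depth;
    // Defensive guard with an arbitrary bound: the topology is acyclic,
    // so a very long parent chain suggests a malformed structure.
    assert(depth <= 1024);
  }
  return depth;
}

// Follows member_of until a resource without a parent is reached.
const resource_node &root_of(const resource_node &resource) {
  const resource_node *current = &resource;
  while (current->member_of() != nullptr) {
    current = current->member_of();
  }
  return *current;
}
```

Returning a pointer rather than a reference from `member_of` is what allows the loop to terminate cleanly at the root under this assumption.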
527586

528587
std::string name() const noexcept;
529588

530589
*Returns:* An implementation defined string.
531590

532591
bool can_place_memory() const noexcept;
533592

534-
*Returns:* If this resource is capable of allocating memory with affinity, 'true'.
593+
*Returns:* If the associated **execution resource** is capable of allocating memory with affinity, 'true'.
535594

536595
bool can_place_agent() const noexcept;
537596

538-
*Returns:* If this resource is capable of execute with affinity, 'true'.
597+
*Returns:* If the associated **execution resource** is capable of executing work with affinity, 'true'.
539598

540599
## Class `execution_context`
541600

542601
The `execution_context` class provides an abstraction for managing a number of lightweight execution agents executing work on an `execution_resource` and any `execution_resource`s encapsulated by it. The `execution_resource` which an `execution_context` encapsulates is referred to as the *contained resource*.
543602

544-
### `execution_context` types
603+
### `execution_context` member types
545604

546605
using executor_type = see-below;
547606

@@ -638,17 +697,19 @@ The `affinity_query` class template provides an abstraction for a relative affin
638697
639698
## Free functions
640699

641-
### `this_system::get_resources`
700+
### `this_system::discover_topology`
701+
702+
The free function `this_system::discover_topology` is provided for discovering the **system topology**.
642703

643-
The free function `this_system::get_resources` is provided for retrieving the `execution_resource`s which encapsulate the hardware platforms available within the system. We refer to these resources as the *system level resources*.
704+
const execution_resource discover_topology();
644705

645-
std::vector<execution_resource> resources() noexcept;
706+
*Returns:* An `execution_resource` object exposing the **system execution resource**.
646707

647-
*Returns:* An `std::vector` containing all *system level resources*.
708+
*Requires:* If `this_system::discover_topology().size() > 0`, `this_system::discover_topology()[0]` shall be the `execution_resource` used by `std::thread`. Calls to `this_system::discover_topology()` shall not introduce a data race with any other call to `this_system::discover_topology()`.
648709

649-
*Requires:* If `this_system::get_resources().size() > 0`, `this_system::get_resources()[0]` be the `execution_resource` use by `std::thread`. The value returned by `this_system::get_resources()` be the same at any point after the invocation of `main`.
710+
*Effects:* Discovers all **execution resources** available within the system and constructs the **system topology** DAG, describing a read-only snapshot at the point of the call.
650711

651-
> [*Note:* Returning a `std::vector` allows users to potentially manipulate the container of `execution_resource`s after it is returned. We may want to replace this at a later date with an alternative type which is more restrictive, such as a range or span. *--end note*]
712+
*Throws:* Any exception thrown as a result of **system topology** discovery.
652713

653714
### `this_thread::get_resource`
654715
