|
7 | 7 | = CPU partitioning and performance tuning |
8 | 8 |
|
9 | 9 | New in this release:: |
10 | | -* No reference design updates in this release |
| 10 | +* The `PerformanceProfile` and `TunedPerformancePatch` objects have been updated to fully support the aarch64 architecture. |
| 11 | +** If you have previously applied additional patches to the `TunedPerformancePatch` object, you must convert those patches to a new tuned profile that includes the `ran-du-performance` profile instead. See the "Engineering considerations" section. |
| 12 | + |
11 | 13 |
|
12 | 14 | Description:: |
13 | | -The RAN DU use model includes cluster performance tuning using `PerformanceProfile` CRs for low-latency performance. |
| 15 | +The RAN DU use model includes cluster performance tuning using `PerformanceProfile` CRs for low-latency performance, and a `TunedPerformancePatch` CR that adds RAN-specific tuning. |
| 16 | +A reference `PerformanceProfile` is provided for both x86_64 and aarch64 CPU architectures. |
| 17 | +The single `TunedPerformancePatch` object provided automatically detects the CPU architecture and performs the required additional tuning. |
14 | 18 | The RAN DU use case requires the cluster to be tuned for low-latency performance. |
15 | | -The Node Tuning Operator reconciles the `PerformanceProfile` CRs. |
| 19 | +The Node Tuning Operator reconciles the `PerformanceProfile` and `TunedPerformancePatch` CRs. |
| 20 | + |
16 | 21 | For more information about node tuning with the `PerformanceProfile` CR, see "Tuning nodes for low latency with the performance profile". |
17 | 22 |
|
18 | 23 | Limits and requirements:: |
19 | | -The Node Tuning Operator uses the `PerformanceProfile` CR to configure the cluster. |
20 | 24 | You must configure the following settings in the telco RAN DU profile `PerformanceProfile` CR: |
21 | 25 | + |
22 | 26 | -- |
23 | | -* Set a reserved `cpuset` of 4 or more, equating to 4 hyper-threads (2 cores) for either of the following CPUs: |
| 27 | +* Set a reserved `cpuset` of 4 or more, equating to 4 hyper-threads (2 cores) on x86_64, or 4 cores on aarch64 for any of the following CPUs: |
24 | 28 | ** Intel 3rd Generation Xeon (IceLake) 2.20 GHz, or newer, CPUs with host firmware tuned for maximum performance |
25 | 29 | ** AMD EPYC Zen 4 CPUs (Genoa, Bergamo) |
| 30 | +** ARM CPUs (Neoverse) |
26 | 31 | + |
27 | 32 | [NOTE] |
28 | 33 | ==== |
29 | | -AMD EPYC Zen 4 CPUs (Genoa, Bergamo) are fully supported. |
30 | | -Power consumption evaluations are ongoing. |
31 | 34 | It is recommended to evaluate features, such as per-pod power management, to determine any potential impact on performance. |
32 | 35 | ==== |
33 | 36 |
|
34 | | -* Set the reserved `cpuset` to include both hyper-thread siblings for each included core. |
35 | | -Unreserved cores are available as allocatable CPU for scheduling workloads. |
36 | | -* Ensure that hyper-thread siblings are not split across reserved and isolated cores. |
37 | | -* Ensure that reserved and isolated CPUs include all the threads for all cores in the CPU. |
38 | | -* Include Core 0 for each NUMA node in the reserved CPU set. |
39 | | -* Set the huge page size to 1G. |
| 37 | +* x86_64: |
| 38 | +** Set the reserved `cpuset` to include both hyper-thread siblings for each included core. |
| 39 | + Unreserved cores are available as allocatable CPU for scheduling workloads. |
| 40 | +** Ensure that hyper-thread siblings are not split across reserved and isolated cores. |
| 41 | +** Ensure that reserved and isolated CPUs include all the threads for all cores in the CPU. |
| 42 | +** Include Core 0 for each NUMA node in the reserved CPU set. |
| 43 | +** Set the hugepage size to 1G. |
| 44 | +* aarch64: |
| 45 | +** Use the first 4 cores for the reserved CPU set (or more). |
| 46 | +** Set the hugepage size to 512M. |
40 | 47 | * Only pin {product-title} pods that are by default configured as part of the management workload partition to reserved cores. |
41 | 48 | * When recommended by the hardware vendor, set the maximum CPU frequency for reserved and isolated CPUs using the `hardwareTuning` section. |
42 | 49 | -- |
43 | 50 |
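The x86_64 limits above can be illustrated with a minimal `PerformanceProfile` sketch. This is not the reference `x86_64/PerformanceProfile.yaml`. The CPU ranges assume a hypothetical single-socket host with 32 cores and 64 threads where the hyper-thread sibling of CPU `n` is CPU `n+32`, and the hugepage count is a placeholder that depends on the workload.

[source,yaml]
----
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: openshift-node-performance-profile
spec:
  cpu:
    # Assumed topology: 32 cores / 64 threads, sibling of CPU n is CPU n+32.
    # Reserved set: cores 0 and 1 with both hyper-thread siblings (4 threads).
    reserved: "0-1,32-33"
    # Isolated set: every remaining thread, so no core is split across sets.
    isolated: "2-31,34-63"
  hugepages:
    defaultHugepagesSize: 1G
    pages:
      - size: 1G
        count: 32 # placeholder; size this to the application workload
  nodeSelector:
    node-role.kubernetes.io/master: ""
  realTimeKernel:
    enabled: true
----

Verify the actual sibling layout on the target host, for example with `lscpu --extended`, before choosing the reserved and isolated sets.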
|
44 | 51 | Engineering considerations:: |
45 | | -* Meeting the full performance metrics requires use of the RT kernel. |
46 | | -If required, you can use the non-RT kernel with corresponding impact to performance. |
| 52 | + |
| 53 | +* Real-time (RT) kernel |
| 54 | +** Under x86_64, to reach the full performance metrics, you must use the RT kernel, which is the default in the `x86_64/PerformanceProfile.yaml` configuration. |
| 55 | +*** If required, you can select the non-RT kernel with corresponding impact to performance. |
| 56 | +** Under aarch64, only the 64k-pagesize non-RT kernel is recommended for RAN DU use cases, which is the default in the `aarch64/PerformanceProfile.yaml` configuration. |
47 | 57 | * The number of hugepages you configure depends on application workload requirements. |
48 | 58 | Variation in this parameter is expected and allowed. |
49 | 59 | * Variation is expected in the configuration of reserved and isolated CPU sets based on selected hardware and additional components in use on the system. |
50 | 60 | The variation must still meet the specified limits. |
51 | 61 | * Hardware without IRQ affinity support affects isolated CPUs. |
52 | 62 | To ensure that pods with guaranteed whole CPU QoS have full use of allocated CPUs, all hardware in the server must support IRQ affinity. |
53 | | -* If you enable workload partitioning by setting `cpuPartitioningMode` to `AllNodes` during deployment, you must use the `PerformanceProfile` CR to allocate enough CPUs to support the operating system, interrupts, and {product-title} pods. |
54 | | -* The reference performance profile includes additional kernel arguments settings for `vfio_pci`. |
| 63 | +* To enable workload partitioning, set `cpuPartitioningMode` to `AllNodes` during deployment, and then use the `PerformanceProfile` CR to allocate enough CPUs to support the operating system, interrupts, and {product-title} pods. |
| 64 | +* Under x86_64, the `PerformanceProfile` CR includes additional kernel argument settings for `vfio_pci`. |
55 | 65 | These arguments are included for support of devices such as the FEC accelerator. You can omit them if they are not required for your workload. |
| 66 | +* Under aarch64, the `PerformanceProfile` must be adjusted depending on the needs of the platform: |
| 67 | +** For Grace Hopper systems, the following kernel command-line arguments are required: |
| 68 | +*** `acpi_power_meter.force_cap_on=y` |
| 69 | +*** `module_blacklist=nouveau` |
| 70 | +*** `pci=realloc=off` |
| 71 | +*** `pci=pcie_bus_safe` |
| 72 | +** For other ARM platforms, you may need to enable `iommu.passthrough=1` or `pci=realloc`. |
| 73 | +* Extending and augmenting `TunedPerformancePatch.yaml`: |
| 74 | +** `TunedPerformancePatch.yaml` introduces a default top-level tuned profile named `ran-du-performance`, an architecture-aware RAN tuning profile named `ran-du-performance-architecture-common`, and additional architecture-specific child profiles that the common profile selects automatically. |
| 75 | +** By default, the `ran-du-performance` profile is set to `priority` level `18`, and it includes both the `PerformanceProfile`-created profile `openshift-node-performance-openshift-node-performance-profile` and `ran-du-performance-architecture-common`. |
| 76 | +** If you have customized the name of the `PerformanceProfile` object, you must create a new `Tuned` object that includes the renamed tuned profile created by the `PerformanceProfile` CR, as well as the `ran-du-performance-architecture-common` RAN tuning profile. This object must have a `priority` value lower than 18. |
| 77 | +For example, if the `PerformanceProfile` object is named `change-this-name`: |
| 78 | ++ |
| 79 | +[source,yaml] |
| 80 | +---- |
| 81 | +apiVersion: tuned.openshift.io/v1 |
| 82 | +kind: Tuned |
| 83 | +metadata: |
| 84 | + name: custom-performance-profile-override |
| 85 | + namespace: openshift-cluster-node-tuning-operator |
| 86 | +spec: |
| 87 | + profile: |
| 88 | + - name: custom-performance-profile-x |
| 89 | + data: | |
| 90 | + [main] |
| 91 | + summary=Override of the default ran-du performance tuning to adjust for our renamed PerformanceProfile |
| 92 | + include=openshift-node-performance-change-this-name,ran-du-performance-architecture-common |
| 93 | + recommend: |
| 94 | + - machineConfigLabels: |
| 95 | + machineconfiguration.openshift.io/role: "master" |
| 96 | + priority: 15 |
| 97 | + profile: custom-performance-profile-x |
| 98 | +---- |
| 99 | ++ |
| 100 | +** For further overrides, the optional `TunedPowerCustom.yaml` configuration file shows how to extend the provided `TunedPerformancePatch.yaml` without overlaying or editing it directly. |
| 101 | +Creating an additional tuned profile that includes the top-level tuned profile named `ran-du-performance` and has a lower `priority` number in the `recommend` section lets you add settings easily. |
| 102 | +** For additional information on the Node Tuning Operator, see "Using the Node Tuning Operator". |
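The extension pattern described for `TunedPowerCustom.yaml` can be sketched as follows. This is not the content of `TunedPowerCustom.yaml`. The object name `ran-du-custom-extension` and the `vm.stat_interval` setting are hypothetical placeholders chosen only to show the shape: the new profile includes `ran-du-performance` and uses a `priority` number lower than the default of 18 so that it is selected in its place.

[source,yaml]
----
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: ran-du-custom-extension # hypothetical name
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
    - name: ran-du-custom-extension
      data: |
        [main]
        summary=Site-specific additions layered on top of ran-du-performance
        include=ran-du-performance
        [sysctl]
        # Placeholder setting for illustration only
        vm.stat_interval=10
  recommend:
    - machineConfigLabels:
        machineconfiguration.openshift.io/role: "master"
      priority: 17 # lower number than the default 18, so this profile wins
      profile: ran-du-custom-extension
----

Because the custom profile only includes `ran-du-performance`, it inherits the provided tuning without any edits to `TunedPerformancePatch.yaml`.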