Commit 4f4c09e

Merge pull request #100428 from slovern/TELCODOCS-2436
TELCOCODS-2436 ARM updates - RAN RDS
2 parents d958b69 + f272e8e commit 4f4c09e

5 files changed

+148
-25
lines changed

modules/telco-ran-engineering-considerations-for-the-ran-du-use-model.adoc

Lines changed: 40 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,11 @@ The recommended topology for RAN DU workloads is {sno}.
2323
DU workloads may be run on other cluster topologies such as 3-node compact cluster, high availability (3 control plane + n worker nodes), or SNO+1 as needed.
2424
Multiple SNO clusters, or a highly-available 3-node compact cluster, are recommended over the SNO+1 topology.
2525

26+
In the standard cluster topology (3+n), a mixed-architecture cluster is allowed only if:
27+
28+
* All control plane nodes are x86_64.
29+
* All worker nodes are aarch64.
30+
2631
Remote worker node (RWN) cluster topologies are not recommended or included under this reference design specification.
2732
For workloads with high service level agreement requirements, such as RAN DU, the following drawbacks exclude RWN from consideration:
2833

@@ -35,11 +40,45 @@ For workloads with high service level agreement requirements such as RAN DU the
3540
3641
--
3742

43+
Supported cluster topologies for RAN DU::
44+
+
45+
.Supported cluster topologies for RAN DU
[cols="1,2,3,4,5,6", options="header"]
|===
|Architecture |SNO |SNO+1 |3-node |Standard |RWN

|x86_64 |Yes |Yes |Yes |Yes |No

|aarch64 |Yes |No |No |No |No

|mixed |N/A |No |No |Yes |No
|===
77+
3878
Workloads::
3979
. DU workloads are described in xref:../scalability_and_performance/telco-ran-du-rds.adoc#telco-ran-du-application-workloads_telco-ran-du[Telco RAN DU application workloads].
4080
. DU worker nodes are Intel 3rd Generation Xeon (IceLake) 2.20 GHz or newer with host firmware tuned for maximum performance.
4181

42-
4382
Resources::
4483
The maximum number of running pods in the system, inclusive of application workload and {product-title} pods, is 120.
4584

modules/telco-ran-node-tuning-operator.adoc

Lines changed: 64 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -7,49 +7,96 @@
77
= CPU partitioning and performance tuning
88

99
New in this release::
10-
* No reference design updates in this release
10+
* The `PerformanceProfile` and `TunedPerformancePatch` objects have been updated to fully support the aarch64 architecture.
11+
** If you have previously applied additional patches to the `TunedPerformancePatch` object, you must convert those patches to a new performance profile that includes the `ran-du-performance` profile instead. See the "Engineering considerations" section.
12+
1113

1214
Description::
13-
The RAN DU use model includes cluster performance tuning using `PerformanceProfile` CRs for low-latency performance.
15+
The RAN DU use model includes cluster performance tuning using `PerformanceProfile` CRs for low-latency performance, and a `TunedPerformancePatch` CR that applies additional RAN-specific tuning.
16+
A reference `PerformanceProfile` is provided for both x86_64 and aarch64 CPU architectures.
17+
The single `TunedPerformancePatch` object provided automatically detects the CPU architecture and performs the required additional tuning.
1418
The RAN DU use case requires the cluster to be tuned for low-latency performance.
15-
The Node Tuning Operator reconciles the `PerformanceProfile` CRs.
19+
The Node Tuning Operator reconciles the `PerformanceProfile` and `TunedPerformancePatch` CRs.
20+
1621
For more information about node tuning with the `PerformanceProfile` CR, see "Tuning nodes for low latency with the performance profile".
1722

1823
Limits and requirements::
19-
The Node Tuning Operator uses the `PerformanceProfile` CR to configure the cluster.
2024
You must configure the following settings in the telco RAN DU profile `PerformanceProfile` CR:
2125
+
2226
--
23-
* Set a reserved `cpuset` of 4 or more, equating to 4 hyper-threads (2 cores) for either of the following CPUs:
27+
* Set a reserved `cpuset` of 4 or more, equating to 4 hyper-threads (2 cores) on x86_64, or 4 cores on aarch64 for any of the following CPUs:
2428
** Intel 3rd Generation Xeon (IceLake) 2.20 GHz, or newer, CPUs with host firmware tuned for maximum performance
2529
** AMD EPYC Zen 4 CPUs (Genoa, Bergamo)
30+
** ARM CPUs (Neoverse)
2631
+
2732
[NOTE]
2833
====
29-
AMD EPYC Zen 4 CPUs (Genoa, Bergamo) are fully supported.
30-
Power consumption evaluations are ongoing.
3134
It is recommended to evaluate features, such as per-pod power management, to determine any potential impact on performance.
3235
====
3336

34-
* Set the reserved `cpuset` to include both hyper-thread siblings for each included core.
35-
Unreserved cores are available as allocatable CPU for scheduling workloads.
36-
* Ensure that hyper-thread siblings are not split across reserved and isolated cores.
37-
* Ensure that reserved and isolated CPUs include all the threads for all cores in the CPU.
38-
* Include Core 0 for each NUMA node in the reserved CPU set.
39-
* Set the huge page size to 1G.
37+
* x86_64:
38+
** Set the reserved `cpuset` to include both hyper-thread siblings for each included core.
39+
Unreserved cores are available as allocatable CPU for scheduling workloads.
40+
** Ensure that hyper-thread siblings are not split across reserved and isolated cores.
41+
** Ensure that reserved and isolated CPUs include all the threads for all cores in the CPU.
42+
** Include Core 0 for each NUMA node in the reserved CPU set.
43+
** Set the hugepage size to 1G.
44+
* aarch64:
45+
** Use the first 4 cores, or more, for the reserved CPU set.
46+
** Set the hugepage size to 512M.
4047
* Only pin {product-title} pods that are by default configured as part of the management workload partition to reserved cores.
4148
* When recommended by the hardware vendor, set the maximum CPU frequency for reserved and isolated CPUs using the `hardwareTuning` section.
4249
--
4350
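The x86_64 settings above could be expressed roughly as follows. This is a minimal sketch, not the reference `x86_64/PerformanceProfile.yaml`: the CPU ranges assume a hypothetical single-socket host with 64 hardware threads in which CPUs 0-1 and 32-33 are hyper-thread sibling pairs, and the hugepage count and node selector labels are placeholders. Check the actual sibling layout on your hardware before choosing the sets.

```yaml
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  # Name inferred from the generated tuned profile name used in this document
  name: openshift-node-performance-profile
spec:
  cpu:
    # Assumption: CPUs 0/32 and 1/33 are hyper-thread siblings on this host,
    # so the reserved set holds 4 threads (2 whole cores) and includes core 0.
    reserved: "0-1,32-33"
    # Every remaining thread is isolated, so no sibling pair is split.
    isolated: "2-31,34-63"
  hugepages:
    defaultHugepagesSize: 1G
    pages:
      - size: 1G
        count: 32   # Placeholder: depends on application workload requirements
  realTimeKernel:
    enabled: true
  # Placeholder selectors for a single-node cluster
  nodeSelector:
    node-role.kubernetes.io/master: ""
  machineConfigPoolSelector:
    pools.operator.machineconfiguration.openshift.io/master: ""
```

Together, the reserved and isolated sets cover every thread of every core, which is what the limits above require.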
4451
Engineering considerations::
45-
* Meeting the full performance metrics requires use of the RT kernel.
46-
If required, you can use the non-RT kernel with corresponding impact to performance.
52+
53+
* Real-time (RT) kernel
54+
** Under x86_64, to reach the full performance metrics, you must use the RT kernel, which is the default in the `x86_64/PerformanceProfile.yaml` configuration.
55+
*** If required, you can select the non-RT kernel, with a corresponding impact on performance.
56+
** Under aarch64, only the non-RT kernel with a 64k page size is recommended for RAN DU use cases. This kernel is the default in the `aarch64/PerformanceProfile.yaml` configuration.
4757
* The number of hugepages you configure depends on application workload requirements.
4858
Variation in this parameter is expected and allowed.
4959
* Variation is expected in the configuration of reserved and isolated CPU sets based on selected hardware and additional components in use on the system.
5060
The variation must still meet the specified limits.
5161
* Hardware without IRQ affinity support affects isolated CPUs.
5262
To ensure that pods with guaranteed whole CPU QoS have full use of allocated CPUs, all hardware in the server must support IRQ affinity.
53-
* If you enable workload partitioning by setting `cpuPartitioningMode` to `AllNodes` during deployment, you must use the `PerformanceProfile` CR to allocate enough CPUs to support the operating system, interrupts, and {product-title} pods.
54-
* The reference performance profile includes additional kernel arguments settings for `vfio_pci`.
63+
* To enable workload partitioning, set `cpuPartitioningMode` to `AllNodes` during deployment, and then use the `PerformanceProfile` CR to allocate enough CPUs to support the operating system, interrupts, and {product-title} pods.
64+
* Under x86_64, the `PerformanceProfile` CR includes additional kernel argument settings for `vfio_pci`.
5565
These arguments are included for support of devices such as the FEC accelerator. You can omit them if they are not required for your workload.
66+
* Under aarch64, the `PerformanceProfile` must be adjusted depending on the needs of the platform:
67+
** For Grace Hopper systems, the following kernel command-line arguments are required:
68+
*** `acpi_power_meter.force_cap_on=y`
69+
*** `module_blacklist=nouveau`
70+
*** `pci=realloc=off`
71+
*** `pci=pcie_bus_safe`
72+
** For other ARM platforms, you may need to enable `iommu.passthrough=1` or `pci=realloc`.
73+
* Extending and augmenting `TunedPerformancePatch.yaml`:
74+
** `TunedPerformancePatch.yaml` introduces a default top-level tuned profile named `ran-du-performance`, an architecture-aware RAN tuning profile named `ran-du-performance-architecture-common`, and additional architecture-specific child policies that the common policy selects automatically.
75+
** By default, the `ran-du-performance` profile is set to `priority` level `18`, and it includes both the profile created by the `PerformanceProfile` CR, `openshift-node-performance-openshift-node-performance-profile`, and `ran-du-performance-architecture-common`.
76+
** If you have customized the name of the `PerformanceProfile` object, you must create a new `Tuned` object that includes both the tuned profile generated from the renamed `PerformanceProfile` CR and the `ran-du-performance-architecture-common` RAN tuning profile. The new object must have a `priority` value lower than 18.
77+
For example, if the PerformanceProfile object is named `change-this-name`:
78+
+
79+
[source,yaml]
----
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: custom-performance-profile-override
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
    - name: custom-performance-profile-x
      data: |
        [main]
        summary=Override of the default ran-du performance tuning to adjust for our renamed PerformanceProfile
        include=openshift-node-performance-change-this-name,ran-du-performance-architecture-common
  recommend:
    - machineConfigLabels:
        machineconfiguration.openshift.io/role: "master"
      priority: 15
      profile: custom-performance-profile-x
----
99+
+
100+
** The optional `TunedPowerCustom.yaml` configuration file shows how to extend the provided `TunedPerformancePatch.yaml` without overlaying or editing it directly.
101+
To add further settings, create an additional tuned profile that includes the top-level `ran-du-performance` profile and has a lower `priority` number in the `recommend` section.
102+
** For additional information on the Node Tuning Operator, see "Using the Node Tuning Operator".
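The aarch64 guidance in this module (first 4 cores reserved, 512M hugepages, non-RT kernel with 64k pages, and the Grace Hopper kernel arguments) might combine into a profile fragment like the one below. This is a hedged sketch rather than the contents of `aarch64/PerformanceProfile.yaml`: the 72-core CPU range and the hugepage count are placeholders, and the `kernelPageSize` field is an assumption about how the 64k page size is selected that should be confirmed against the reference configuration.

```yaml
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: openshift-node-performance-profile
spec:
  cpu:
    reserved: "0-3"    # First 4 cores, as required for aarch64
    isolated: "4-71"   # Placeholder: assumes a hypothetical 72-core host
  # Assumption: field used to select the 64k page size kernel
  kernelPageSize: 64k
  hugepages:
    defaultHugepagesSize: 512M
    pages:
      - size: 512M
        count: 32      # Placeholder: depends on application workload requirements
  realTimeKernel:
    enabled: false     # Only the non-RT kernel is recommended on aarch64
  # Kernel command-line arguments listed above for Grace Hopper systems
  additionalKernelArgs:
    - acpi_power_meter.force_cap_on=y
    - module_blacklist=nouveau
    - pci=realloc=off
    - pci=pcie_bus_safe
```

For other ARM platforms, the Grace Hopper arguments would be dropped and, where needed, replaced with `iommu.passthrough=1` or `pci=realloc`.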

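As an illustration of extending the tuning without editing `TunedPerformancePatch.yaml`, the following sketch defines an extra tuned profile that includes `ran-du-performance` and is recommended at a lower `priority` number than 18. It is not the contents of `TunedPowerCustom.yaml`; the object name, profile name, and the `sysctl` setting are hypothetical placeholders.

```yaml
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: ran-du-performance-extension   # Hypothetical name
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
    - name: ran-du-performance-extension
      data: |
        [main]
        summary=Site-specific additions on top of ran-du-performance
        include=ran-du-performance

        [sysctl]
        # Placeholder setting for illustration only
        vm.stat_interval=10
  recommend:
    - machineConfigLabels:
        machineconfiguration.openshift.io/role: "master"
      priority: 17   # Lower number than 18, so this profile takes precedence
      profile: ran-du-performance-extension
```

Because the new profile includes `ran-du-performance`, it inherits both the `PerformanceProfile`-generated tuning and the architecture-aware RAN tuning, and only the added sections differ.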
modules/telco-ran-ptp-operator.adoc

Lines changed: 22 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -7,17 +7,32 @@
77
= PTP Operator
88

99
New in this release::
10-
* Dual-port NIC for PTP ordinary clock is enabled.
11-
* The PTP events REST API v1 and events consumer application sidecar support are removed.
12-
* A maximum of 3 Westport channel NIC configurations is now supported for T-GM.
10+
* No reference design updates in this release
1311

1412
Description::
15-
Configure PTP in cluster nodes with `PTPConfig` CRs for the RAN DU use case with features like Grandmaster clock (T-GM) support using GPS, ordinary clock (OC), boundary clocks (T-BC), dual boundary clocks, high availability (HA), and optional fast event notification over HTTP.
16-
PTP ensures precise timing and reliability in the RAN environment.
13+
Configure Precision Time Protocol (PTP) in cluster nodes.
14+
PTP provides more precise timing and greater reliability in the RAN environment than other clock synchronization protocols, such as NTP.
15+
Support includes::
16+
* Grandmaster clock (T-GM): use GPS to sync the local clock and provide time synchronization to other devices
17+
* Boundary clock (T-BC): receive time from another PTP source and redistribute it to other devices
18+
* Ordinary clock (T-TSC): synchronize the local clock from another PTP time provider
19+
20+
Configuration variations allow for multiple NIC configurations for greater time distribution and high availability (HA), and optional fast event notification over HTTP.
1721

1822
Limits and requirements::
19-
* Limited to 2 boundary clocks for nodes with dual NICs and HA
20-
* Limited to 3 Westport channel NIC configurations for T-GM
23+
24+
* Supports the PTP G.8275.1 profile for the following telco use-cases:
25+
** T-GM use-case:
26+
*** Limited to a maximum of 3 Westport channel NICs
27+
*** Requires GNSS input to one NIC, with SMA connections to synchronize additional NICs
28+
*** HA support is not applicable
29+
** T-BC use-case:
30+
*** Limited to a maximum of 2 NICs
31+
*** System clock HA support is optional in 2-NIC configuration.
32+
** T-TSC use-case:
33+
*** Limited to a single NIC
34+
*** System clock HA support is optional in active/standby 2-port configuration.
35+
* Log reduction must be enabled with `true` or `enhanced`.
2136
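The log reduction requirement can be pictured with a minimal ordinary clock `PtpConfig` sketch. This is an illustration under assumptions, not one of the RDS reference CRs: the object name, interface name, `ptp4l` and `phc2sys` options, and the `recommend` values are placeholders, and the `ptp4lConf` section carrying the G.8275.1 settings is omitted.

```yaml
apiVersion: ptp.openshift.io/v1
kind: PtpConfig
metadata:
  name: ordinary-clock-example   # Hypothetical name
  namespace: openshift-ptp
spec:
  profile:
    - name: ordinary-clock-example
      interface: ens5f0          # Placeholder: the single NIC port used for T-TSC
      ptp4lOpts: "-2 -s"         # Placeholder options
      phc2sysOpts: "-a -r -n 24" # Placeholder options
      ptpSettings:
        # Log reduction must be enabled, using either "true" or "enhanced"
        logReduce: "true"
  recommend:
    - profile: ordinary-clock-example
      priority: 4
      match:
        - nodeLabel: node-role.kubernetes.io/master
```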
2237
Engineering considerations::
2338
* RAN DU RDS configurations are provided for ordinary clocks, boundary clocks, grandmaster clocks, and highly available dual NIC boundary clocks.

modules/telco-ref-design-overview.adoc

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,3 +31,23 @@ The reference configurations in this document are deployed using a centrally man
3131

3232
.Telco RAN DU deployment architecture
3333
image::474_OpenShift_OpenShift_RAN_RDS_arch_updates_1023.png[A diagram showing two distinctive network far edge deployment processes, one showing how the hub cluster uses {ztp} to install managed clusters, and the other showing how the hub cluster uses TALM to apply policies to managed clusters]
34+
35+
== Supported CPU architectures for RAN DU
36+
37+
.Supported CPU architectures for RAN DU
[cols="1,2,3", options="header"]
|===
|Architecture |Real-time kernel |Non-real-time kernel

|x86_64 |Yes |Yes

|aarch64 |No |Yes
|===
53+

scalability_and_performance/telco-ran-du-rds.adoc

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -48,6 +48,8 @@ include::modules/telco-ran-node-tuning-operator.adoc[leveloffset=+2]
4848
4949
* xref:../scalability_and_performance/cnf-tuning-low-latency-nodes-with-perf-profile.adoc#cnf-tuning-low-latency-nodes-with-perf-profile[Tuning nodes for low latency with the performance profile]
5050
51+
* xref:../scalability_and_performance/using-node-tuning-operator.adoc#using-node-tuning-operator[Using the Node Tuning Operator]
52+
5153
include::modules/telco-ran-ptp-operator.adoc[leveloffset=+2]
5254

5355
include::modules/telco-ran-sr-iov-operator.adoc[leveloffset=+2]
