
Commit 07fd717

Merge pull request #98666 from kquinn1204/TELCODOCS-2269
Telcodocs 2269: NUMAResourcesOperator: Support for schedulable control-plane nodes
2 parents 4961864 + 48f7396 commit 07fd717

3 files changed: +245 -0 lines changed
Lines changed: 218 additions & 0 deletions
@@ -0,0 +1,218 @@
// Module included in the following assemblies:
//
// *scalability_and_performance/cnf-numa-aware-scheduling.adoc

:_mod-docs-content-type: PROCEDURE

[id="cnf-configuring-nrop-on-schedulable-cp-nodes_{context}"]
= Configuring NUMA Resources Operator on schedulable control plane nodes

[role="_abstract"]
You can configure the NUMA Resources Operator (NROP) to manage control plane nodes that you have configured to be schedulable. This is particularly useful in compact clusters, where control plane nodes also serve as worker nodes, and in multi-node OpenShift (MNO) clusters where control plane nodes are made schedulable to run workloads.

.Prerequisites

* Install the {oc-first}.
* Log in as a user with `cluster-admin` privileges.
* Install the NUMA Resources Operator.

.Procedure

. To enable Topology Aware Scheduling (TAS) on control plane nodes, first configure the nodes to be schedulable. This allows the NUMA Resources Operator to deploy and manage pods on them. Without this step, the Operator cannot deploy the pods required to gather NUMA topology information from these nodes. To make the control plane nodes schedulable, complete the following steps:

.. Edit the `schedulers.config.openshift.io` resource by running the following command:
+
[source,terminal]
----
$ oc edit schedulers.config.openshift.io cluster
----

.. In the editor, set the `mastersSchedulable` field to `true`, then save and exit the editor.
+
[source,yaml]
----
apiVersion: config.openshift.io/v1
kind: Scheduler
metadata:
  creationTimestamp: "2019-09-10T03:04:05Z"
  generation: 1
  name: cluster
  resourceVersion: "433"
  selfLink: /apis/config.openshift.io/v1/schedulers/cluster
  uid: a636d30a-d377-11e9-88d4-0a60097bee62
spec:
  mastersSchedulable: true
status: {}
#...
----

. To configure the NUMA Resources Operator, create a single `NUMAResourcesOperator` custom resource (CR) on the cluster. The `nodeGroups` configuration within this CR specifies the node pools that the Operator must manage.
+
[NOTE]
====
Before configuring `nodeGroups`, ensure that the specified node pool meets all prerequisites detailed in "Configuring a single NUMA node policy". The NUMA Resources Operator requires all nodes within a group to be identical. Non-compliant nodes prevent the NUMA Resources Operator from performing the expected topology-aware scheduling for the entire pool.

You can specify multiple non-overlapping node sets for the NUMA Resources Operator to manage. Each set should correspond to a different machine config pool (MCP). The NUMA Resources Operator then manages the schedulable control plane nodes within these node groups.
====

.. For a compact cluster, the master nodes are also the schedulable nodes, so specify only the `master` pool. Create the following `nodeGroups` configuration in the `NUMAResourcesOperator` CR:
+
[source,yaml]
----
apiVersion: nodetopology.openshift.io/v1
kind: NUMAResourcesOperator
metadata:
  name: numaresourcesoperator
spec:
  nodeGroups:
  - poolName: master
----
+
[NOTE]
====
Avoid configuring a compact cluster with a worker pool in addition to the `master` pool. Although this setup does not break the cluster or affect Operator functionality, it can lead to redundant or duplicate pods and create unnecessary noise in the system, because the worker pool is an empty MCP that serves no purpose in this context.
====

.. For an MNO cluster where both control plane and worker nodes are schedulable, you can configure the NUMA Resources Operator to manage multiple node groups. Specify which nodes to include by adding their corresponding MCPs to the `nodeGroups` list in the `NUMAResourcesOperator` CR. The configuration depends on your specific requirements. For example, to manage both the `master` and `worker-cnf` pools, create the following `nodeGroups` configuration in the `NUMAResourcesOperator` CR:
+
[source,yaml]
----
apiVersion: nodetopology.openshift.io/v1
kind: NUMAResourcesOperator
metadata:
  name: numaresourcesoperator
spec:
  nodeGroups:
  - poolName: master
  - poolName: worker-cnf
----
+
[NOTE]
====
You can customize this list to include any combination of node groups for management with Topology Aware Scheduling. To prevent duplicate, pending pods, ensure that each `poolName` in the configuration corresponds to an MCP with a unique node selector label. The label must be applied only to the nodes within that specific pool and must not overlap with labels on any other nodes in the cluster. The `worker-cnf` MCP designates a set of nodes that run telecommunications workloads. For an example of such a pool definition, see the sketch that follows this note.
====
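+
The following `MachineConfigPool` definition is a minimal sketch of a pool with a unique node selector label. The label names and values shown here are assumptions for illustration; use the labels that actually exist on your nodes:
+
[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-cnf
  labels:
    machineconfiguration.openshift.io/role: worker-cnf
spec:
  machineConfigSelector:
    # Select MachineConfigs for both the base worker role and the custom pool.
    matchExpressions:
    - key: machineconfiguration.openshift.io/role
      operator: In
      values: [worker, worker-cnf]
  nodeSelector:
    # This label must exist only on the nodes that belong to this pool.
    matchLabels:
      node-role.kubernetes.io/worker-cnf: ""
----
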

.. After you update the `nodeGroups` field in the `NUMAResourcesOperator` CR to reflect your cluster's configuration, apply the changes by running the following command, where `<filename>.yaml` is the name of your configuration file:
+
[source,terminal]
----
$ oc apply -f <filename>.yaml
----
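+
Optionally, confirm that the CR now exists on the cluster by running the following command. This is a quick check of the resource itself, ahead of the fuller verification steps that follow:
+
[source,terminal]
----
$ oc get numaresourcesoperators.nodetopology.openshift.io
----
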

.Verification

After applying the configuration, verify that the NUMA Resources Operator is correctly managing the schedulable control plane nodes by performing the following checks:

. Confirm that the control plane nodes have the worker role and are schedulable by running the following command:
+
[source,terminal]
----
$ oc get nodes
----
+
.Example output
[source,terminal]
----
NAME       STATUS   ROLES                         AGE    VERSION
worker-0   Ready    worker,worker-cnf             100m   v1.33.3
worker-1   Ready    worker                        93m    v1.33.3
master-0   Ready    control-plane,master,worker   108m   v1.33.3
master-1   Ready    control-plane,master,worker   107m   v1.33.3
master-2   Ready    control-plane,master,worker   107m   v1.33.3
worker-2   Ready    worker                        100m   v1.33.3
----

. Verify that the NUMA Resources Operator pods are running on the intended nodes by running the following command. You should see a `numaresourcesoperator` worker pod on each node in every node group that you specified in the CR:
+
[source,terminal]
----
$ oc get pods -n openshift-numaresources -o wide
----
+
.Example output
[source,terminal]
----
NAME                                               READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
numaresources-controller-manager-bdbdd574-xx6bw    1/1     Running   0          49m     10.130.0.17   master-0   <none>           <none>
numaresourcesoperator-master-lprrh                 2/2     Running   0          20m     10.130.0.20   master-0   <none>           2/2
numaresourcesoperator-master-qk6k4                 2/2     Running   0          20m     10.129.0.50   master-2   <none>           2/2
numaresourcesoperator-master-zm79n                 2/2     Running   0          20m     10.128.0.44   master-1   <none>           2/2
numaresourcesoperator-worker-cnf-gqlmd             2/2     Running   0          4m27s   10.128.2.21   worker-0   <none>           2/2
----

. Confirm that the NUMA Resources Operator has collected and reported the NUMA topology data for all nodes in the specified groups by running the following command:
+
[source,terminal]
----
$ oc get noderesourcetopologies.topology.node.k8s.io
----
+
.Example output
[source,terminal]
----
NAME       AGE
worker-0   6m11s
master-0   22m
master-1   21m
master-2   21m
----
+
The presence of a `NodeResourceTopology` resource for a node confirms that the NUMA Resources Operator was able to schedule a pod on it to collect the data, enabling topology-aware scheduling.

. Inspect a single `NodeResourceTopology` resource by running the following command:
+
[source,terminal]
----
$ oc get noderesourcetopologies <master_node_name> -o yaml
----
+
.Example output
[source,yaml]
----
apiVersion: topology.node.k8s.io/v1alpha2
attributes:
- name: nodeTopologyPodsFingerprint
  value: pfp0v001ef46db3751d8e999
- name: nodeTopologyPodsFingerprintMethod
  value: with-exclusive-resources
- name: topologyManagerScope
  value: container
- name: topologyManagerPolicy
  value: single-numa-node
kind: NodeResourceTopology
metadata:
  annotations:
    k8stopoawareschedwg/rte-update: periodic
    topology.node.k8s.io/fingerprint: pfp0v001ef46db3751d8e999
  creationTimestamp: "2025-09-23T10:18:34Z"
  generation: 1
  name: master-0
  resourceVersion: "58173"
  uid: 35c0d27e-7d9f-43d3-bab9-2ebc0d385861
zones:
- costs:
  - name: node-0
    value: 10
  name: node-0
  resources:
  - allocatable: "3"
    available: "2"
    capacity: "4"
    name: cpu
  - allocatable: "1476189952"
    available: "1378189952"
    capacity: "1576189952"
    name: memory
  type: Node
----
+
The presence of this resource for a node with the master role proves that the NUMA Resources Operator was able to deploy its discovery pods onto that node. These pods gather the NUMA topology data, and they can be scheduled only on nodes that are schedulable.
+
The output confirms that the procedure to make the master nodes schedulable was successful: the NUMA Resources Operator has now collected and reported the NUMA-related information for that control plane node.
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
// Module included in the following assemblies:
//
// *scalability_and_performance/cnf-numa-aware-scheduling.adoc

:_mod-docs-content-type: CONCEPT

[id="cnf-numa-resource-operator-support-scheduling-cp_{context}"]
= NUMA Resources Operator support for schedulable control-plane nodes

[role="_abstract"]
You can enable schedulable control plane nodes to run user-defined pods, effectively turning the nodes into hybrid control plane and worker nodes. This configuration is especially beneficial in resource-constrained environments, such as compact clusters. When enabled, the NUMA Resources Operator can apply its topology-aware scheduling to the nodes for guaranteed workloads, ensuring that pods are placed according to the best NUMA affinity.

Traditionally, control plane nodes in {product-title} are dedicated to running critical cluster services. Enabling schedulable control plane nodes allows user-defined pods to be scheduled on these nodes.

You can make control plane nodes schedulable by setting the `mastersSchedulable` field to `true` in the `schedulers.config.openshift.io` resource.
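
For example, one way to set this field is with a single patch command, as in the following sketch:

[source,terminal]
----
$ oc patch schedulers.config.openshift.io cluster --type merge -p '{"spec":{"mastersSchedulable":true}}'
----
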
The NUMA Resources Operator provides topology-aware scheduling for workloads that need a specific NUMA affinity. When control plane nodes are made schedulable, the Operator's management capabilities can be applied to them, just as they are to worker nodes. This ensures that NUMA-aware pods are placed on a node with the best NUMA topology, whether that node is a control plane node or a worker node.

When you configure the NUMA Resources Operator, its management scope is determined by the `nodeGroups` field in its custom resource (CR). This principle applies to both compact and multi-node clusters.

Compact clusters:: In a compact cluster, all nodes are configured as schedulable control plane nodes. You can configure the NUMA Resources Operator to manage all nodes in the cluster. Follow the deployment instructions for more details on the process.

Multi-Node OpenShift (MNO) clusters:: In an MNO cluster, control plane nodes are made schedulable in addition to the existing worker nodes. To manage these nodes, configure the NUMA Resources Operator by defining separate `nodeGroups` in the `NUMAResourcesOperator` CR for the control plane and worker nodes, as shown in the sketch that follows. This ensures that the NUMA Resources Operator correctly schedules pods on both sets of nodes based on resource availability and NUMA topology.
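
A minimal `NUMAResourcesOperator` CR for this scenario might look like the following sketch. The `worker-cnf` pool name is an assumption; substitute the MCP names that exist in your cluster:

[source,yaml]
----
apiVersion: nodetopology.openshift.io/v1
kind: NUMAResourcesOperator
metadata:
  name: numaresourcesoperator
spec:
  nodeGroups:
  # One entry per machine config pool that the Operator manages.
  - poolName: master
  - poolName: worker-cnf
----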
scalability_and_performance/cnf-numa-aware-scheduling.adoc

Lines changed: 4 additions & 0 deletions
@@ -73,6 +73,10 @@ include::modules/cnf-deploying-the-numa-aware-scheduler.adoc[leveloffset=+2]
 
 include::modules/cnf-scheduling-numa-aware-workloads.adoc[leveloffset=+2]
 
+include::modules/cnf-nrop-support-schedulable-resources.adoc[leveloffset=+1]
+
+include::modules/cnf-configuring-nrop-on-schedlable-control-planes.adoc[leveloffset=+2]
+
 include::modules/cnf-configuring-node-groups-for-the-numaresourcesoperator.adoc[leveloffset=+1]
 
 include::modules/cnf-troubleshooting-numa-aware-workloads.adoc[leveloffset=+1]
