Commit f6e32b5

OCPBUGS-64584 Modify DPU Operator Docs to reflect 4.20 code
formatting fix formatting fix 2 fixing vale errors 5 fixing vale errors 6 fixing vale errors 7
1 parent 9ca3860 commit f6e32b5

16 files changed

+242
-128
lines changed

_topic_maps/_topic_map.yml

Lines changed: 2 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -1489,16 +1489,8 @@ Topics:
14891489
Dir: dpu-operator
14901490
Distros: openshift-enterprise,openshift-origin
14911491
Topics:
1492-
- Name: About the DPU and the DPU Operator
1493-
File: about-dpu
1494-
- Name: Installing the DPU Operator
1495-
File: installing-dpu-operator
1496-
- Name: Configuring the DPU Operator
1497-
File: configuring-dpu-operator
1498-
- Name: Running a workload on the DPU
1499-
File: running-workload-on-dpu
1500-
- Name: Uninstalling the DPU Operator
1501-
File: uninstalling-dpu-operator
1492+
- Name: DPU Operator
1493+
File: dpu-operator
15021494
- Name: Network security
15031495
Dir: network_security
15041496
Distros: openshift-enterprise,openshift-origin

modules/nw-about-dpu.adoc

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6,13 +6,16 @@
66
[id="nw-about-dpu_{context}"]
77
= Orchestrating DPUs with the DPU Operator
88

9-
A Data Processing Unit (DPU) is a type of programmable processor that is considered one of the three fundamental pillars of computing, alongside CPUs and GPUs. While CPUs handle general computing tasks and GPUs accelerate specific workloads, the primary role of the DPU is to offload and accelerate data-centric workloads, such as networking, storage, and security functions.
9+
[role="_abstract"]
10+
You can use the Data Processing Unit (DPU) Operator to manage DPUs that offload networking, storage, and security workloads from host CPUs to improve cluster performance and efficiency.
1011

11-
DPUs are typically used in data centers and cloud environments to improve performance, reduce latency, and enhance security by offloading these tasks from the CPU. DPUs can also be used to create a more efficient and flexible infrastructure by enabling the deployment of specialized workloads closer to the data source.
12+
A DPU is a type of programmable processor that represents one of the three fundamental pillars of computing, alongside CPUs and GPUs. While CPUs handle general computing tasks and GPUs accelerate specific workloads, the primary role of the DPU is to offload and accelerate data-centric workloads, such as networking, storage, and security functions.
13+
14+
DPUs are typically used in data centers and cloud environments to improve performance, reduce latency, and enhance security by offloading these tasks from the CPU. You can also use DPUs to create a more efficient and flexible infrastructure by enabling the deployment of specialized workloads closer to the data source.
1215

1316
The DPU Operator is responsible for managing the DPU devices and network attachments. The DPU Operator deploys the DPU daemon onto {product-title} compute nodes that interface through an API controlling the DPU daemon running on the DPU. The DPU Operator is responsible for the life-cycle management of the `ovn-kube` components and the necessary host network initialization on the DPU.
1417

15-
The currently supported DPU devices are described in the following table.
18+
The following table describes the currently supported DPU devices.
1619

1720
.Supported devices
1821
[cols="1,1,1,2", options="header"]

modules/nw-dpu-configuring-operator.adoc

Lines changed: 28 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -4,37 +4,47 @@
44

55
:_mod-docs-content-type: PROCEDURE
66
[id="nw-dpu-configuring-operator_{context}"]
7-
= Configuring the DPU Operator
7+
= Configuring the DPU Operator
8+
9+
[role="_abstract"]
10+
You can configure the DPU Operator after installation to enable management of DPU devices and network attachments in both dual cluster and single cluster deployment modes.
11+
12+
You can configure the DPU Operator to manage the DPU devices and network attachments in your cluster.
813

914
To configure the DPU Operator, follow these steps:
1015

1116
.Procedure
1217

13-
. Create a `DpuOperatorConfig` custom resource (CR) on both the host cluster and on each of the DPU clusters. The DPU Operator in each cluster is activated after this CR is created.
18+
. Create the `DpuOperatorConfig` Custom Resource (CR) based on your deployment mode:
19+
20+
* Dual Cluster Deployment: You must create the `DpuOperatorConfig` CR on both the host {product-title} cluster and on each of the {ms} DPU clusters.
21+
* Single Cluster Deployment: This deployment uses a standard {product-title} cluster. You only need to create the `DpuOperatorConfig` CR once on this cluster.
22+
+
23+
The content of the CR is the same for all clusters.
1424
15-
. Create a file named `dpu-operator-host-config.yaml` by using the following YAML:
25+
. Create a file named `dpu-operator-config.yaml` by using the following YAML:
1626
+
1727
[source,yaml]
1828
----
1929
apiVersion: config.openshift.io/v1
2030
kind: DpuOperatorConfig
2131
metadata:
22-
name: dpu-operator-config <1>
32+
name: dpu-operator-config
2333
spec:
24-
mode: host <2>
34+
logLevel: 0
2535
----
2636
+
27-
<1> The name of the custom resource must be `dpu-operator-config`.
28-
<2> Set the value to `host` on the host cluster. On each DPU cluster, which runs a single MicroShift cluster per DPU, set the value to `dpu`.
37+
* `metadata.name`: Specifies the name of the Custom Resource, which must be `dpu-operator-config`.
38+
* `spec.logLevel`: Sets the desired logging verbosity in the operator container logs. The value `0` is the default setting.
2939
3040
. Create the resource by running the following command:
3141
+
3242
[source,terminal]
3343
----
34-
$ oc apply -f dpu-operator-host-config.yaml
44+
$ oc apply -f dpu-operator-config.yaml
3545
----
3646

37-
. You must label all nodes that either have an attached DPU or are functioning as a DPU. On the host cluster, this means labeling all compute nodes assuming each node has an attached DPU with `dpu=true`. On the DPU, where each MicroShift cluster consists of a single node, label that single node in each cluster with `dpu=true`. You can apply this label by running the following command:
47+
. Label all nodes that either have an attached DPU or are functioning as a DPU. You can apply this label by running the following command:
3848
+
3949
[source,terminal]
4050
----
@@ -44,3 +54,12 @@ $ oc label node <node_name> dpu=true
4454
where:
4555
+
4656
`node_name`:: Refers to the name of your node, such as `worker-1`.
57+
+
58+
[NOTE]
59+
====
60+
There are two ways to deploy clusters that are compatible with DPUs:
61+
62+
* Dual cluster deployment: This consists of {product-title} running on the hosts and {ms} running on the DPU. In this mode, the {ms} instance also needs to deploy the DPU Operator, and you must set the label `dpu=true` on the node.
63+
* Single cluster deployment: This consists of only {product-title} running on hosts, where the DPUs are integrated into the main cluster. Both the host nodes with DPUs installed and the DPU nodes themselves require only the label `dpu=true`. The DPU Operator automatically detects the role of each node, whether it is running as a DPU or as a host with an attached DPU.
64+
====
65+
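When several nodes need the label, the labeling step can be scripted. The following is a minimal shell sketch and is not part of this commit: the `label_dpu_nodes` helper, the `OC` override variable, and the node names in the usage comment are hypothetical. Only `oc label node <node_name> dpu=true` comes from the procedure. The `--overwrite` flag is added here so that a node carrying a different `dpu` value is updated rather than rejected.

```shell
#!/bin/sh
# Sketch only: label_dpu_nodes and OC are illustrative names, not part of the DPU Operator.
# OC defaults to the real client; set OC=echo for a dry run that prints the commands.
OC="${OC:-oc}"

label_dpu_nodes() {
  # Apply dpu=true to every node name passed as an argument; stop on the first failure.
  for node in "$@"; do
    "$OC" label node "$node" dpu=true --overwrite || return 1
  done
}

# Example (hypothetical node names):
#   label_dpu_nodes worker-1 worker-2
```

Running the helper with `OC=echo` first is a cheap way to review the exact commands before they touch the cluster.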

modules/nw-dpu-creating-a-sfc.adoc

Lines changed: 45 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -4,11 +4,14 @@
44

55
:_mod-docs-content-type: PROCEDURE
66
[id="nw-dpu-creating-a-sfc_{context}"]
7-
= Creating a service function chain on the DPU
7+
= Running a workload on the DPU
88

9-
Network service chaining, also known as service function chaining (SFC) is a capability that uses software-defined networking (SDN) capabilities to create a chain of connected network services, such as L4-7 services like firewalls, network address translation (NAT), and intrusion protection.
9+
[role="_abstract"]
10+
You can deploy network workloads directly on the DPU to improve performance, enhance security isolation, and reduce host CPU usage.
1011

11-
Follow this procedure on the DPU to create the network function `my-network-function` in the service function chain.
12+
The DPU offloads network workloads, such as security functions or virtualized appliances, to improve performance, enhance security isolation, and free host CPU resources.
13+
14+
Follow this procedure to deploy a simple pod directly onto the DPU.
1215

1316
.Prerequisites
1417

@@ -18,27 +21,55 @@ Follow this procedure on the DPU to create the network function `my-network-func
1821
1922
.Procedure
2023

21-
. Save the following YAML file example as `sfc.yaml`:
24+
. Save the following YAML file example as `dpu-pod.yaml`. This is an example of a simple pod that will be scheduled directly onto a DPU node by the Kubernetes default scheduler.
2225
+
2326
[source,yaml]
2427
----
25-
apiVersion: config.openshift.io/v1
26-
kind: ServiceFunctionChain
28+
apiVersion: v1
29+
kind: Pod
2730
metadata:
28-
name: sfc
31+
name: "my-network-function"
2932
namespace: openshift-dpu-operator
33+
annotations:
34+
k8s.v1.cni.cncf.io/networks: dpunfcni-conf, dpunfcni-conf
3035
spec:
31-
networkFunctions:
32-
- name: my-network-function <1>
33-
image: quay.io/example-org/my-network-function:latest <2>
36+
nodeSelector:
37+
dpu.config.openshift.io/dpuside: "dpu"
38+
containers:
39+
- name: "my-network-function"
40+
image: "quay.io/example-org/my-network-function:latest"
41+
resources:
42+
requests:
43+
openshift.io/dpu: "2"
44+
limits:
45+
openshift.io/dpu: "2"
46+
securityContext:
47+
privileged: true
48+
capabilities:
49+
drop:
50+
- ALL
51+
add:
52+
- NET_RAW
53+
- NET_ADMIN
3454
----
3555
+
36-
<1> The name of the network function. This name is used to identify the network function in the service function chain.
37-
<2> The URL to the container image that contains the network function. The image must be accessible from the DPU.
56+
* `metadata.annotations.k8s.v1.cni.cncf.io/networks`: The value `dpunfcni-conf` specifies the name of the `NetworkAttachmentDefinition` resource. The DPU Operator creates this resource during installation to configure the DPU networking.
57+
* `spec.nodeSelector`: The `nodeSelector` is the primary mechanism for scheduling this workload. The DPU Operator creates and maintains the label: `dpu.config.openshift.io/dpuside: "dpu"`. This label ensures the pod is scheduled directly onto the DPU's processing unit.
58+
* `spec.containers.name`: The name of the container.
59+
* `spec.containers.image`: The container image to pull and run.
3860
39-
. Create the chain by running the following command on the DPU nodes:
61+
. Create the pod by running the following command:
4062
+
4163
[source,terminal]
4264
----
43-
$ oc apply -f sfc.yaml
65+
$ oc apply -f dpu-pod.yaml
66+
----
67+
68+
. Verify the pod status by running the following command:
69+
+
70+
[source,terminal]
71+
----
72+
$ oc get pods -n openshift-dpu-operator
4473
----
74+
+
75+
Ensure the pod's status is `Running`.
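The verification step above can also be checked from a script instead of reading the table by eye. This is a hedged sketch, not part of the commit: the `pod_is_running` helper and the `OC` override variable are hypothetical names, and the pod name and namespace in the usage comment are taken from the example manifest. It relies only on `oc get pod ... -o jsonpath='{.status.phase}'`.

```shell
#!/bin/sh
# Sketch only: pod_is_running and OC are illustrative names.
# OC defaults to the real client and can be pointed at a stub for testing.
OC="${OC:-oc}"

pod_is_running() {
  # $1: pod name, $2: namespace. Succeeds only when .status.phase is "Running".
  phase="$("$OC" get pod "$1" -n "$2" -o jsonpath='{.status.phase}')" || return 1
  [ "$phase" = "Running" ]
}

# Example, using the names from the example manifest:
#   pod_is_running my-network-function openshift-dpu-operator && echo "pod is running"
```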

modules/nw-dpu-installing-operator-cli.adoc

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,9 @@
77
[id="nw-dpu-installing-operator-cli_{context}"]
88
= Installing the DPU Operator by using the CLI
99

10+
[role="_abstract"]
11+
You can install the DPU Operator by using the CLI. You can use the DPU Operator to simplify the installation process when setting up DPU device management on host clusters.
12+
1013
As a cluster administrator, you can install the DPU Operator by using the CLI.
1114

1215
[NOTE]

modules/nw-dpu-installing-operator-ui.adoc

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,9 @@
77
[id="nw-dpu-installing-operator-ui_{context}"]
88
= Installing the DPU Operator using the web console
99

10+
[role="_abstract"]
11+
You can install the DPU Operator by using the web console. You can use the DPU Operator to simplify the installation process when setting up DPU device management on host clusters.
12+
1013
As a cluster administrator, you can install the DPU Operator by using the web console.
1114

1215
.Prerequisites
@@ -28,7 +31,7 @@ As a cluster administrator, you can install the DPU Operator by using the web co
2831

2932
. Navigate to the *Ecosystem* -> *Installed Operators* page.
3033

31-
. Ensure that *DPU Operator* is listed in the *openshift-dpu-operator* project with a *Status* of *InstallSucceeded*.
34+
. Ensure that the *openshift-dpu-operator* project lists *DPU Operator* with a *Status* of *InstallSucceeded*.
3235
+
3336
[NOTE]
3437
====
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * networking/networking_operators/installing-dpu-operator.adoc
4+
5+
:_mod-docs-content-type: CONCEPT
6+
[id="overview-installing-dpu-operator_{context}"]
7+
= Installing the DPU Operator
8+
9+
[role="_abstract"]
10+
You can install the Data Processing Unit (DPU) Operator on both host and DPU clusters to manage device lifecycle and network attachments using the CLI or web console.
11+
12+
Cluster administrators can install the DPU Operator on the host cluster and all DPU clusters using the {product-title} CLI or the web console. The DPU Operator manages the lifecycle, DPU devices, and network attachments for all supported DPUs.
13+
14+
[NOTE]
15+
====
16+
You need to install the DPU Operator on the host cluster and each of the DPU clusters.
17+
====

modules/nw-dpu-operator-uninstall.adoc

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,10 @@
66
[id="nw-dpu-operator-uninstall_{context}"]
77
= Uninstalling the DPU Operator
88

9-
As a cluster administrator, you can uninstall the DPU Operator.
9+
[role="_abstract"]
10+
You can uninstall the DPU Operator from your cluster when you no longer need DPU device management, ensuring all workloads are deleted first.
11+
12+
To uninstall the DPU Operator, you must first delete any running DPU workloads. Follow this procedure to uninstall the DPU Operator.
1013

1114
.Prerequisites
1215

@@ -69,7 +72,7 @@ $ oc delete namespace openshift-dpu-operator
6972

7073
.Verification
7174

72-
. Verify that the DPU Operator is uninstalled by running the following command. An example of succesful command output is `No resources found in openshift-dpu-operator namespace`.
75+
. Verify that the DPU Operator is uninstalled by running the following command. An example of successful command output is `No resources found in openshift-dpu-operator namespace`.
7376
+
7477
[source,terminal]
7578
----

modules/nw-dpu-running-workloads.adoc

Lines changed: 16 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -4,9 +4,14 @@
44

55
:_mod-docs-content-type: PROCEDURE
66
[id="nw-running-workloads-dpu_{context}"]
7-
= Running a workload on the DPU
7+
= Running a workload on the host with DPU
88

9-
Follow these steps to deploy a workload on the DPU.
9+
[role="_abstract"]
10+
You can deploy workloads on the host with DPU to offload specialized infrastructure tasks and improve performance while freeing up host CPU resources.
11+
12+
Running workloads on a DPU enables offloading specialized infrastructure tasks such as networking, security, and storage to a dedicated processing unit. This improves performance, enforces a stronger security boundary between infrastructure and application workloads, and frees up host CPU resources.
13+
14+
Follow these steps to deploy a workload on the host with DPU. This is the standard deployment model where the application runs on the host's x86 CPU but utilizes the DPU for network acceleration and offload.
1015

1116
.Prerequisites
1217

@@ -16,7 +21,7 @@ Follow these steps to deploy a workload on the DPU.
1621
1722
.Procedure
1823

19-
. Create a sample workload on the host side by using the following YAML, save the file as `workload-host.yaml`:
24+
. Create a sample workload designed to run on the host-side worker node by using the following YAML. Save the file as `workload-host.yaml`:
2025
+
2126
[source,yaml]
2227
----
@@ -29,7 +34,7 @@ metadata:
2934
k8s.v1.cni.cncf.io/networks: default-sriov-net
3035
spec:
3136
nodeSelector:
32-
kubernetes.io/hostname: worker-237 <1>
37+
kubernetes.io/hostname: worker-237
3338
containers:
3439
- name: appcntr1
3540
image: registry.access.redhat.com/ubi9/ubi:latest
@@ -48,7 +53,13 @@ spec:
4853
openshift.io/dpu: '1'
4954
----
5055
+
51-
<1> The name of the node where the workload is deployed.
56+
`spec.nodeSelector`: The node selector schedules the pod on the node with the DPU resource. You can use any standard Kubernetes selector for this, such as `kubernetes.io/hostname`, to target a specific node as shown in the example YAML.
57+
+
58+
[NOTE]
59+
====
60+
For flexible scheduling, the DPU Operator creates the label `dpu.config.openshift.io/dpuside: "dpu-host"`. This label enables the default scheduler to place the workload on any host with a DPU. The workload automatically joins that DPU secondary network.
61+
When the label on the node is `dpu.config.openshift.io/dpuside: "dpu"`, this signifies that the node is the DPU itself. The DPU Operator creates and manages the `dpu.config.openshift.io/dpuside` label.
62+
====
5263

5364
. Create the workload by running the following command:
5465
+
Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * networking/networking_operators/nw-dpu-running-workloads.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="nw-dpu-monitoring-status_{context}"]
7+
= Monitoring the status of DPUs
8+
9+
[role="_abstract"]
10+
You can monitor the DPU infrastructure status to check the current state and health of your DPU devices across the cluster.
11+
12+
You can monitor the DPU status to see the current state of the DPU infrastructure.
13+
14+
The `oc get dpu` command shows the current state of the DPU infrastructure. Follow this procedure to monitor the status of various cards.
15+
16+
.Prerequisites
17+
18+
* The OpenShift CLI (`oc`) is installed.
19+
* An account with `cluster-admin` privileges is available.
20+
* The DPU Operator is installed.
21+
22+
.Procedure
23+
24+
. Run the following command to check the overall health of your nodes:
25+
+
26+
[source,terminal]
27+
----
28+
$ oc get nodes
29+
----
30+
+
31+
The example output provides a list of all nodes in the cluster along with their status. Ensure that all nodes are in the `Ready` state before proceeding.
32+
+
33+
[source,terminal]
34+
----
35+
NAME STATUS ROLES AGE VERSION
36+
ocpcluster-master-1 Ready master 10d v1.32.9
37+
ocpcluster-master-2 Ready master 10d v1.32.9
38+
ocpcluster-master-3 Ready master 10d v1.32.9
39+
ocpcluster-dpu-ipu-219 Ready worker 42h v1.32.9
40+
ocpcluster-dpu-marvell-41 Ready worker 3d23h v1.32.9
41+
ocpcluster-dpu-ptl-243 Ready worker 3d23h v1.32.9
42+
worker-host-ipu-219 Ready worker 3d19h v1.32.9
43+
worker-host-marvell-41 Ready worker 4d v1.32.9
44+
worker-host-ptl-243 Ready worker 3d23h v1.32.9
45+
----
46+
+
47+
This output shows three master nodes and three worker nodes identified by the `worker-host` prefix, for example, `worker-host-ipu-219`. Each worker node contains a DPU identified by the `ocpcluster-dpu` prefix, for example, `ocpcluster-dpu-ipu-219`.
48+
49+
. Run the following command to report on the status of the DPUs:
50+
+
51+
[source,terminal]
52+
----
53+
$ oc get dpu
54+
----
55+
+
56+
The example output provides a list of detected DPUs.
57+
+
58+
[source,terminal]
59+
----
60+
NAME                            DPU PRODUCT                DPU SIDE   MODE NAME                STATUS
61+
030001163eec00ff-host           Intel Netsec Accelerator   false      worker-host-ptl-243      True
62+
d4-e5-c9-00-ec-3v-dpu           Intel Netsec Accelerator   true       worker-dpu-ptl-243       True
63+
intel-ipu-0000-06-00.0-host     Intel IPU E2100            false      worker-host-ipu-219      False
64+
intel-ipu-dpu                   Intel IPU E2100            true       worker-dpu-ipu-219       False
65+
marvell-dpu-0000-87-00.0-host   Marvell DPU                false      worker-host-marvell-41   True
66+
marvell-dpu-ipu                 Marvell DPU                true       worker-dpu-marvell-41    True
67+
----
68+
* `DPU PRODUCT`: Displays the vendor or type of DPU, for example, Intel or Marvell.
69+
* `DPU SIDE`: Indicates whether the DPU is operating on the host side (`false`) or the DPU side (`true`). Each physical DPU is represented twice.
70+
* `MODE NAME`: The name of the node where the DPU is located. This is the host worker node for `false` entries and the DPU node for `true` entries.
71+
* `STATUS`: Indicates whether the DPU is functioning correctly (`True`) or has issues (`False`).
72+
+
73+
[NOTE]
74+
====
75+
Run `oc get dpu -o yaml` to get more details about the status.
76+
====
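Because `STATUS` is the last column of the `oc get dpu` table, the unhealthy entries can be picked out with a short filter. The following is a sketch under that assumption and is not part of the commit: the `unhealthy_dpus` helper name is hypothetical, and it expects the default table output shown in the procedure, not the `-o yaml` form.

```shell
#!/bin/sh
# Sketch only: unhealthy_dpus is an illustrative helper name.
unhealthy_dpus() {
  # Reads `oc get dpu` table output on stdin, skips the header row, and prints
  # the NAME column of every row whose last column (STATUS) is "False".
  awk 'NR > 1 && $NF == "False" { print $1 }'
}

# Example:
#   oc get dpu | unhealthy_dpus
```

Matching on the last field rather than a fixed column number keeps the filter working even though the `DPU PRODUCT` values contain spaces.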
