Commit 5ca4308

amolnar-gh authored and lcavalle committed
TELCODOCS-2171#Generalize Day2Ops for Telco
Conflicts solved
1 parent d66c44c commit 5ca4308

36 files changed

+138
-152
lines changed

_topic_maps/_topic_map.yml

Lines changed: 9 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -3562,23 +3562,23 @@ Topics:
35623562
File: telco-update-completing-the-y-stream-update
35633563
- Name: Completing the z-stream update
35643564
File: telco-update-completing-the-z-stream-update
3565-
- Name: Troubleshooting and maintaining telco core CNF clusters
3565+
- Name: Troubleshooting and maintaining OpenShift Container Platform clusters
35663566
Dir: troubleshooting
35673567
Topics:
3568-
- Name: Troubleshooting and maintaining telco core CNF clusters
3569-
File: telco-troubleshooting-intro
3568+
- Name: Troubleshooting and maintaining OpenShift Container Platform clusters
3569+
File: troubleshooting-intro
35703570
- Name: General troubleshooting
3571-
File: telco-troubleshooting-general-troubleshooting
3571+
File: troubleshooting-general-troubleshooting
35723572
- Name: Cluster maintenance
3573-
File: telco-troubleshooting-cluster-maintenance
3573+
File: troubleshooting-cluster-maintenance
35743574
- Name: Security
3575-
File: telco-troubleshooting-security
3575+
File: troubleshooting-security
35763576
- Name: Certificate maintenance
3577-
File: telco-troubleshooting-cert-maintenance
3577+
File: troubleshooting-cert-maintenance
35783578
- Name: Machine Config Operator
3579-
File: telco-troubleshooting-mco
3579+
File: troubleshooting-mco
35803580
- Name: Bare-metal node maintenance
3581-
File: telco-troubleshooting-bmn-maintenance
3581+
File: troubleshooting-bmn-maintenance
35823582
- Name: Observability
35833583
Dir: observability
35843584
Topics:

edge_computing/day_2_core_cnf_clusters/telco-day-2-welcome.adoc

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ You can use the following Day 2 operations to manage telco core CNF clusters.
1111
Updating a telco core CNF cluster:: Updating your cluster is a critical task that ensures that bugs and potential security vulnerabilities are patched.
1212
For more information, see xref:../day_2_core_cnf_clusters/updating/telco-update-welcome.adoc#telco-update-welcome[Updating a telco core CNF cluster].
1313

14-
Troubleshooting and maintaining telco core CNF clusters:: To maintain and troubleshoot a bare-metal environment where high-bandwidth network throughput is required, see xref:../day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-intro.adoc#telco-troubleshooting-intro[Troubleshooting and maintaining telco core CNF clusters].
14+
Troubleshooting and maintaining telco core CNF clusters:: To maintain and troubleshoot a bare-metal environment where high-bandwidth network throughput is required, see xref:../day_2_core_cnf_clusters/troubleshooting/troubleshooting-intro.adoc#troubleshooting-intro[Troubleshooting and maintaining {product-title} clusters].
1515

1616
Observability in telco core CNF clusters:: {product-title} generates a large amount of data, such as performance metrics and logs from the platform and the workloads running on it.
1717
As an administrator, you can use tools to collect and analyze the available data.

edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-cluster-maintenance.adoc

Lines changed: 0 additions & 21 deletions
This file was deleted.

edge_computing/day_2_core_cnf_clusters/troubleshooting/telco-troubleshooting-mco.adoc

Lines changed: 0 additions & 20 deletions
This file was deleted.
Original file line number | Diff line number | Diff line change
@@ -1,29 +1,29 @@
11
:_mod-docs-content-type: ASSEMBLY
2-
[id="telco-troubleshooting-bmn-maintenance"]
2+
[id="troubleshooting-bmn-maintenance"]
33
= Bare-metal node maintenance
44
include::_attributes/common-attributes.adoc[]
5-
:context: telco-troubleshooting-bmn-maintenance
5+
:context: troubleshooting-bmn-maintenance
66

77
toc::[]
88

99
You can connect to a node for general troubleshooting.
1010
However, in some cases, you need to perform troubleshooting or maintenance tasks on certain hardware components.
11-
This section discusses topics that you need to perform that hardware maintenance.
11+
This section discusses topics that you need to perform for hardware maintenance.
1212

13-
include::modules/telco-troubleshooting-bmn-connect-to-node.adoc[leveloffset=+1]
14-
include::modules/telco-troubleshooting-bmn-move-apps-to-pods.adoc[leveloffset=+1]
13+
include::modules/troubleshooting-bmn-connect-to-node.adoc[leveloffset=+1]
14+
include::modules/troubleshooting-bmn-move-apps-to-pods.adoc[leveloffset=+1]
1515

1616
[role="_additional-resources"]
1717
.Additional resources
1818

1919
* xref:../../../nodes/nodes/nodes-nodes-working.adoc#nodes-nodes-working_nodes-nodes-working[Working with nodes]
2020
21-
include::modules/telco-troubleshooting-bmn-replace-dimm.adoc[leveloffset=+1]
21+
include::modules/troubleshooting-bmn-replace-dimm.adoc[leveloffset=+1]
22+
include::modules/troubleshooting-bmn-replace-disk.adoc[leveloffset=+1]
2223

2324
[role="_additional-resources"]
2425
.Additional resources
2526

2627
* xref:../../../storage/index.adoc#storage-overview_storage-overview[{product-title} storage overview]
2728
28-
include::modules/telco-troubleshooting-bmn-replace-disk.adoc[leveloffset=+1]
29-
include::modules/telco-troubleshooting-bmn-replace-nw-card.adoc[leveloffset=+1]
29+
include::modules/troubleshooting-bmn-replace-nw-card.adoc[leveloffset=+1]
Lines changed: 9 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,8 @@
11
:_mod-docs-content-type: ASSEMBLY
2-
[id="telco-troubleshooting-cert-maintenance"]
2+
[id="troubleshooting-cert-maintenance"]
33
= Certificate maintenance
44
include::_attributes/common-attributes.adoc[]
5-
:context: telco-troubleshooting-cert-maintenance
5+
:context: troubleshooting-cert-maintenance
66

77
toc::[]
88

@@ -14,22 +14,22 @@ Learn about certificates in {product-title} and how to maintain them by using th
1414
* link:https://access.redhat.com/solutions/5018231[Which OpenShift certificates do rotate automatically and which do not in Openshift 4.x?]
1515
* link:https://access.redhat.com/solutions/7000968[Checking etcd certificate expiry in OpenShift 4]
1616
17-
include::modules/telco-troubleshooting-certs-manual.adoc[leveloffset=+1]
18-
include::modules/telco-troubleshooting-certs-manual-proxy.adoc[leveloffset=+2]
17+
include::modules/troubleshooting-certs-manual.adoc[leveloffset=+1]
18+
include::modules/troubleshooting-certs-manual-proxy.adoc[leveloffset=+2]
1919

2020
[role="_additional-resources"]
2121
.Additional resources
2222

2323
* xref:../../../security/certificate_types_descriptions/proxy-certificates.adoc#cert-types-proxy-certificates[Proxy certificates]
2424
25-
include::modules/telco-troubleshooting-certs-manual-user-provisioned.adoc[leveloffset=+2]
25+
include::modules/troubleshooting-certs-manual-user-provisioned.adoc[leveloffset=+2]
2626

2727
[role="_additional-resources"]
2828
.Additional resources
2929

3030
* xref:../../../security/certificate_types_descriptions/user-provided-certificates-for-api-server.adoc#cert-types-user-provided-certificates-for-the-api-server[User-provisioned certificates for the API server]
3131
32-
include::modules/telco-troubleshooting-certs-auto.adoc[leveloffset=+1]
32+
include::modules/troubleshooting-certs-auto.adoc[leveloffset=+1]
3333

3434
[role="_additional-resources"]
3535
.Additional resources
@@ -44,21 +44,21 @@ include::modules/telco-troubleshooting-certs-auto.adoc[leveloffset=+1]
4444
* xref:../../../security/certificate_types_descriptions/control-plane-certificates.adoc#cert-types-control-plane-certificates_cert-types-control-plane-certificates[Control plane certificates]
4545
* xref:../../../security/certificate_types_descriptions/ingress-certificates.adoc#cert-types-ingress-certificates_cert-types-ingress-certificates[Ingress certificates]
4646
47-
include::modules/telco-troubleshooting-certs-auto-etcd.adoc[leveloffset=+2]
47+
include::modules/troubleshooting-certs-auto-etcd.adoc[leveloffset=+2]
4848

4949
[role="_additional-resources"]
5050
.Additional resources
5151

5252
* xref:../../../security/certificate_types_descriptions/etcd-certificates.adoc#cert-types-etcd-certificates_cert-types-etcd-certificates[etcd certificates]
5353
54-
include::modules/telco-troubleshooting-certs-auto-node.adoc[leveloffset=+2]
54+
include::modules/troubleshooting-certs-auto-node.adoc[leveloffset=+2]
5555

5656
[role="_additional-resources"]
5757
.Additional resources
5858

5959
* xref:../../../security/certificate_types_descriptions/node-certificates.adoc#cert-types-node-certificates_cert-types-node-certificates[Node certificates]
6060
61-
include::modules/telco-troubleshooting-certs-auto-service-ca.adoc[leveloffset=+2]
61+
include::modules/troubleshooting-certs-auto-service-ca.adoc[leveloffset=+2]
6262

6363
[role="_additional-resources"]
6464
.Additional resources
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
[id="troubleshooting-cluster-maintenance"]
3+
= Cluster maintenance
4+
include::_attributes/common-attributes.adoc[]
5+
:context: troubleshooting-cluster-maintenance
6+
7+
toc::[]
8+
9+
When deploying {product-title} on bare-metal infrastructure, you must pay more attention to certain configurations which can have a significant impact on cluster stability.
10+
You can troubleshoot more effectively by completing these tasks:
11+
12+
* Monitor for failed or failing hardware components
13+
* Periodically check the status of the cluster Operators
14+
15+
[NOTE]
16+
====
17+
For hardware monitoring, contact your hardware vendor to find the appropriate logging tool for your specific hardware.
18+
====
19+
20+
include::modules/troubleshooting-clusters-check-cluster-operators.adoc[leveloffset=+1]
21+
include::modules/troubleshooting-clusters-check-for-failed-pods.adoc[leveloffset=+1]
Lines changed: 10 additions & 10 deletions
@@ -1,28 +1,28 @@
11
:_mod-docs-content-type: ASSEMBLY
2-
[id="telco-troubleshooting-general-troubleshooting"]
2+
[id="troubleshooting-general-troubleshooting"]
33
= General troubleshooting
44
include::_attributes/common-attributes.adoc[]
5-
:context: telco-troubleshooting-general-troubleshooting
5+
:context: troubleshooting-general-troubleshooting
66

77
toc::[]
88

99
When you encounter a problem, the first step is to find the specific area where the issue is happening.
10-
To narrow down the potential problematic areas, complete one or more tasks:
10+
To narrow down the potential problematic areas, complete one or more of the following tasks:
1111

1212
* Query your cluster
1313
* Check your pod logs
1414
* Debug a pod
1515
* Review events
1616
17-
include::modules/telco-troubleshooting-general-query-cluster.adoc[leveloffset=+1]
17+
include::modules/troubleshooting-general-query-cluster.adoc[leveloffset=+1]
1818

1919
[role="_additional-resources"]
2020
.Additional resources
2121

2222
* xref:../../../cli_reference/openshift_cli/developer-cli-commands.adoc#oc-get[oc get]
2323
* xref:../../../support/troubleshooting/investigating-pod-issues.adoc#reviewing-pod-status_investigating-pod-issues[Reviewing pod status]
2424
25-
include::modules/telco-troubleshooting-general-check-logs.adoc[leveloffset=+1]
25+
include::modules/troubleshooting-general-check-logs.adoc[leveloffset=+1]
2626

2727
[role="_additional-resources"]
2828
.Additional resources
@@ -32,37 +32,37 @@ include::modules/telco-troubleshooting-general-check-logs.adoc[leveloffset=+1]
3232
* xref:../../../support/troubleshooting/investigating-pod-issues.adoc#inspecting-pod-and-container-logs_investigating-pod-issues[Inspecting pod and container logs]
3333
3434
35-
include::modules/telco-troubleshooting-general-describe-pod.adoc[leveloffset=+1]
35+
include::modules/troubleshooting-general-describe-pod.adoc[leveloffset=+1]
3636

3737
[role="_additional-resources"]
3838
.Additional resources
3939

4040
* xref:../../../cli_reference/openshift_cli/developer-cli-commands.adoc#oc-describe[oc describe]
4141
42-
include::modules/telco-troubleshooting-general-review-events.adoc[leveloffset=+1]
42+
include::modules/troubleshooting-general-review-events.adoc[leveloffset=+1]
4343

4444
[role="_additional-resources"]
4545
.Additional resources
4646

4747
* xref:../../../security/container_security/security-monitoring.adoc#security-monitoring-events_security-monitoring[Watching cluster events]
4848
49-
include::modules/telco-troubleshooting-general-connect-to-pod.adoc[leveloffset=+1]
49+
include::modules/troubleshooting-general-connect-to-pod.adoc[leveloffset=+1]
5050

5151
[role="_additional-resources"]
5252
.Additional resources
5353

5454
* xref:../../../cli_reference/openshift_cli/developer-cli-commands.adoc#oc-rsh[oc rsh]
5555
* xref:../../../support/troubleshooting/investigating-pod-issues.adoc#accessing-running-pods_investigating-pod-issues[Accessing running pods]
5656
57-
include::modules/telco-troubleshooting-general-debug-pod.adoc[leveloffset=+1]
57+
include::modules/troubleshooting-general-debug-pod.adoc[leveloffset=+1]
5858

5959
[role="_additional-resources"]
6060
.Additional resources
6161

6262
* xref:../../../cli_reference/openshift_cli/developer-cli-commands.adoc#oc-debug[oc debug]
6363
* xref:../../../support/troubleshooting/investigating-pod-issues.adoc#starting-debug-pods-with-root-access_investigating-pod-issues[Starting debug pods with root access]
6464
65-
include::modules/telco-troubleshooting-general-run-command-on-pod.adoc[leveloffset=+1]
65+
include::modules/troubleshooting-general-run-command-on-pod.adoc[leveloffset=+1]
6666

6767
[role="_additional-resources"]
6868
.Additional resources
Lines changed: 6 additions & 7 deletions
@@ -1,24 +1,23 @@
11
:_mod-docs-content-type: ASSEMBLY
2-
[id="telco-troubleshooting-intro"]
3-
= Troubleshooting and maintaining telco core CNF clusters
2+
[id="troubleshooting-intro"]
3+
= Troubleshooting and maintaining {product-title} clusters
44
include::_attributes/common-attributes.adoc[]
5-
:context: telco-troubleshooting-intro
5+
:context: troubleshooting-intro
66

77
toc::[]
88

99
Troubleshooting and maintenance are weekly tasks that can be a challenge if you do not have the tools to reach your goal, whether you want to update a component or investigate an issue.
1010
Part of the challenge is knowing where and how to search for tools and answers.
1111

12-
To maintain and troubleshoot a bare-metal environment where high-bandwidth network throughput is required, see the following procedures.
12+
To maintain and troubleshoot a bare-metal environment with high performance requirements, see the following procedures.
1313

1414
[IMPORTANT]
1515
====
16-
This troubleshooting information is not a reference for configuring {product-title} or developing Cloud-native Network Function (CNF) applications.
16+
This troubleshooting information is not a reference for configuring {product-title} or developing cloud-native applications.
1717
18-
For information about developing CNF applications for telco, see link:https://redhat-best-practices-for-k8s.github.io/guide/[Red Hat Best Practices for Kubernetes].
18+
For information about developing cloud-native applications on {product-title}, see link:https://redhat-best-practices-for-k8s.github.io/guide/[Red Hat Best Practices for Kubernetes].
1919
====
2020

21-
include::modules/telco-troubleshooting-cnfs.adoc[leveloffset=+1]
2221
include::modules/support-getting-support.adoc[leveloffset=+1]
2322
include::modules/support-knowledgebase-about.adoc[leveloffset=+2]
2423
include::modules/support-knowledgebase-search.adoc[leveloffset=+2]
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
:_mod-docs-content-type: ASSEMBLY
2+
[id="troubleshooting-mco"]
3+
= Machine Config Operator
4+
include::_attributes/common-attributes.adoc[]
5+
:context: troubleshooting-mco
6+
7+
toc::[]
8+
9+
The Machine Config Operator provides useful information to cluster administrators and controls what is running directly on the bare-metal host.
10+
11+
The Machine Config Operator differentiates between groups of nodes in the cluster, allowing control plane nodes and worker nodes to run with different configurations.
12+
These groups of nodes run worker or application pods, which are called `MachineConfigPool` (`mcp`) groups.
13+
The same machine config is applied to all nodes or only to one MCP in the cluster.
14+
15+
For more information about the Machine Config Operator, see xref:../../../operators/operator-reference.adoc#machine-config-operator_cluster-operators-ref[Machine Config Operator].
16+
17+
include::modules/troubleshooting-mco-purpose.adoc[leveloffset=+1]
18+
include::modules/troubleshooting-mco-apply-several-mcs.adoc[leveloffset=+1]
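
Review note: this commit renames every `telco-troubleshooting-*` ID, `:context:` value, `include::` path, and xref target to `troubleshooting-*`, so a leftover reference anywhere in the repository would break the build or a link. The following is a minimal sketch of a check for such leftovers, not part of the commit. The helper names `find_stale_refs` and `scan_tree` are hypothetical, and the `*.adoc` and `*.yml` glob patterns are an assumption about which files carry references.

```python
import re
from pathlib import Path

# Old prefix removed by this commit, as seen in the diff above.
OLD_PREFIX = "telco-troubleshooting-"

# Matches any identifier that still starts with the old prefix, whether it
# appears in an include:: path, an xref: target, an [id="..."] anchor,
# a :context: value, or a topic map "File:" entry.
STALE_REF = re.compile(re.escape(OLD_PREFIX) + r"[A-Za-z0-9_-]+")


def find_stale_refs(text):
    """Return (line_number, name) pairs for every leftover old-prefix name."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in STALE_REF.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def scan_tree(root, patterns=("*.adoc", "*.yml")):
    """Map each file under root to its stale references (hypothetical helper)."""
    results = {}
    for pattern in patterns:
        for path in Path(root).rglob(pattern):
            text = path.read_text(encoding="utf-8", errors="replace")
            hits = find_stale_refs(text)
            if hits:
                results[str(path)] = hits
    return results
```

Running `scan_tree(".")` from the root of a docs checkout returns a dictionary of files that still mention the old names. An empty dictionary suggests that the rename is complete for the scanned file types.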
