61 commits
744e3e0
Updates
dockerymick Nov 12, 2025
ed6f37f
more small updates
dockerymick Nov 12, 2025
5f8d1e0
adding more changes
dockerymick Nov 12, 2025
50bc00c
Apply suggestions from code review
dockerymick Nov 13, 2025
d4c0d5a
Apply suggestion from @dockerymick
dockerymick Nov 13, 2025
ccc2581
update API descriptions
dockerymick Nov 14, 2025
74237e9
Configuring MCOA APIs
dockerymick Nov 17, 2025
5340907
More updates
dockerymick Nov 18, 2025
c0876e4
Apply suggestion from @dockerymick
dockerymick Nov 18, 2025
7ea9620
Update observability introduction documentation
dockerymick Nov 18, 2025
0848959
Update obs_mcoa_federate.adoc
dockerymick Nov 18, 2025
92b16d1
Add relabeling documentation to main.adoc
dockerymick Nov 18, 2025
6e3ac21
Document relabeling default metrics for MCOA
dockerymick Nov 18, 2025
084be98
Update links and comments in obs_mcoa_intro.adoc
dockerymick Nov 18, 2025
20de748
Adding link to intro
dockerymick Nov 18, 2025
e925b24
Add documentation for exporting metrics to external endpoints
dockerymick Nov 18, 2025
7bd1e77
Add export_metrics_ext_endpts.adoc to main.adoc
dockerymick Nov 18, 2025
6eaa135
Enhance observability documentation with cluster ID details
dockerymick Nov 18, 2025
6475970
Fix formatting of API descriptions in documentation
dockerymick Nov 18, 2025
a739b8f
Rename obs_mcoa-relabel.adoc to obs_mcoa_relabel.adoc
dockerymick Nov 18, 2025
a9d0fcb
Fix duplicate entry for RightSizingRecommendation
dockerymick Nov 18, 2025
49e5730
Revise MultiClusterObservability documentation and commands
dockerymick Nov 19, 2025
462f5d9
Updates and hidden comments after dev review
dockerymick Nov 19, 2025
913e41c
Updates after dev review
dockerymick Nov 19, 2025
41232e2
remove hidden comment
dockerymick Nov 19, 2025
4489634
Remove anchors from H2 sections
dockerymick Nov 19, 2025
3f9e16a
Live review with Thibault
dockerymick Nov 19, 2025
2f18cb7
More from the live review
dockerymick Nov 19, 2025
d9b9257
More updates during live review
dockerymick Nov 19, 2025
e926995
More updates during live review
dockerymick Nov 19, 2025
e810918
Apply suggestion from @thibaultmg
dockerymick Nov 24, 2025
1e75131
Update commands after dev review
dockerymick Nov 24, 2025
7aab063
Apply suggestion from @thibaultmg
dockerymick Nov 24, 2025
8c8bb38
Apply suggestion from @thibaultmg
dockerymick Nov 24, 2025
25e8d9c
Apply suggestion from @thibaultmg
dockerymick Nov 24, 2025
85d624d
Apply suggestion from @thibaultmg
dockerymick Nov 24, 2025
c1ed37b
Update ScrapeConfig for platform metrics alerts
dockerymick Nov 24, 2025
77c94bb
Revise export metrics documentation for clarity
dockerymick Nov 24, 2025
2ea769f
Updates after dev review
dockerymick Nov 24, 2025
2e41be6
Update obs_mcoa_config_apis.adoc
dockerymick Nov 24, 2025
11e90d6
changes after dev review
dockerymick Nov 24, 2025
53d70fe
update after dev review
dockerymick Nov 24, 2025
a02ffbe
Apply suggestion from @thibaultmg
dockerymick Nov 24, 2025
e3e0cd2
Apply suggestion from @dockerymick
dockerymick Nov 24, 2025
d4afbd2
updates after peer review
dockerymick Nov 24, 2025
320a3e6
Update obs_enable_mcoa.adoc
dockerymick Nov 24, 2025
3bf5941
Update obs_mcoa_add_custom_metrics.adoc
dockerymick Nov 24, 2025
7145825
Update observability/obs_mcoa_add_custom_metrics.adoc
dockerymick Nov 25, 2025
82fc602
Update observability/obs_mcoa_config_apis.adoc
dockerymick Nov 25, 2025
1396a6f
Apply suggestion from @dockerymick
dockerymick Nov 25, 2025
0d4069d
Merge branch '2.15_stage' into mj-ACM-23917
dockerymick Nov 25, 2025
cd8dd5d
Update obs_mcoa_config_apis.adoc
dockerymick Nov 25, 2025
375a1e9
Update obs_mcoa_federate.adoc
dockerymick Nov 25, 2025
5a45610
Apply suggestion from @dockerymick
dockerymick Nov 25, 2025
76fb870
Update obs_mcoa_intro.adoc
dockerymick Nov 25, 2025
8f0369e
Update obs_mcoa_relabel.adoc
dockerymick Nov 25, 2025
80efa3d
Update obs_mcoa_config_apis.adoc
dockerymick Nov 25, 2025
527a13a
Update observability/obs_mcoa_intro.adoc
dockerymick Nov 25, 2025
d02263d
Improving intro description
dockerymick Nov 26, 2025
8f18b7c
IMproving what's new entry
dockerymick Nov 26, 2025
92a1590
Update observability/obs_mcoa_config_apis.adoc
dockerymick Dec 1, 2025
48 changes: 48 additions & 0 deletions observability/export_metrics_ext_endpts.adoc
@@ -0,0 +1,48 @@
[#export-metrics-external]
= Exporting metrics to external endpoints for the multicluster observability add-on

To configure an external metrics endpoint, add your custom `remoteWrite` specification to your `PrometheusAgent` resources. When you configure your `remoteWrite` specification, the managed cluster sends metrics directly to the external endpoint.

Update related multicluster observability add-on configurations by relabeling default metrics. Exporting metrics from your managed cluster helps you improve resiliency during network partitions between your managed and hub clusters for up to two hours.

*Required access:* Cluster administrator

.Prerequisites

- You have installed and enabled the multicluster observability add-on. For more information, see xref:../observability/obs_enable_mcoa.adoc#enable-mcoa[Enabling the multicluster observability add-on].

.Procedure

Complete the following steps to export metrics to external endpoints for the `PrometheusAgent` resource:

. Create the TLS `Secret` resources within the `open-cluster-management-observability` namespace.
. Add the secret names to the `secrets` specification of the `PrometheusAgent` resource.
. Add the `remoteWrite` specification to the `PrometheusAgent` resource. Your `PrometheusAgent` resource might resemble the following YAML file, where the `up` metric is exported to a custom endpoint:

+
[source,yaml]
----
apiVersion: monitoring.rhobs/v1alpha1
kind: PrometheusAgent
metadata:
  name: mcoa-default-platform-metrics-collector-global
  namespace: open-cluster-management-observability
spec:
  secrets:
  - custom-endpoint-ca
  - custom-endpoint-cert
  remoteWrite:
  - name: custom-endpoint
    tlsConfig:
      caFile: /etc/prometheus/secrets/custom-endpoint-ca/ca.crt
      certFile: /etc/prometheus/secrets/custom-endpoint-cert/tls.crt
      keyFile: /etc/prometheus/secrets/custom-endpoint-cert/tls.key
    url: 'https://my-custom-remote-write-endpoint.io/api/v1/receive'
    writeRelabelConfigs:
    - action: keep
      regex: ^up$
      sourceLabels:
      - __name__
  - name: acm-observability
    ...
----
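
The secrets that the first two steps refer to can be sketched as follows. This is a minimal sketch rather than a definitive manifest. It assumes that the secret names and the `ca.crt`, `tls.crt`, and `tls.key` keys match the file paths in the preceding `tlsConfig` example, because secrets listed in the `secrets` specification are mounted under `/etc/prometheus/secrets/<secret-name>/`. The `<base64-...>` values are placeholders for your own data.

[source,yaml]
----
# CA bundle that is used to verify the external endpoint (placeholder value)
apiVersion: v1
kind: Secret
metadata:
  name: custom-endpoint-ca
  namespace: open-cluster-management-observability
type: Opaque
data:
  ca.crt: <base64-encoded-ca-certificate>
---
# Client certificate and key that are presented to the external endpoint (placeholder values)
apiVersion: v1
kind: Secret
metadata:
  name: custom-endpoint-cert
  namespace: open-cluster-management-observability
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded-client-certificate>
  tls.key: <base64-encoded-client-key>
----

Each key in a secret becomes a file name under the mount path, so the key names must stay aligned with the `caFile`, `certFile`, and `keyFile` values.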
17 changes: 12 additions & 5 deletions observability/main.adoc
@@ -5,11 +5,6 @@ include::modules/common-attributes.adoc[]
include::observe_environments_intro.adoc[leveloffset=+1]
include::observability_arch.adoc[leveloffset=+2]
include::obs_config.adoc[leveloffset=+2]
include::observability_enable.adoc[leveloffset=+2]
include::use_observability.adoc[leveloffset=+2]
include::design_grafana.adoc[leveloffset=+3]
include::grafana_labels.adoc[leveloffset=+3]
include::observability_alerts.adoc[leveloffset=+2]
include::adv_config_obs.adoc[leveloffset=+2]
include::obs_metrics.adoc[leveloffset=+3]
include::obs_scale.adoc[leveloffset=+3]
@@ -20,9 +15,21 @@ include::obs_update_mco.adoc[leveloffset=+3]
include::obs_pv_pvc.adoc[leveloffset=+3]
include::obs_custom_alert.adoc[leveloffset=+3]
include::obs_rbac.adoc[leveloffset=+3]
include::observability_enable.adoc[leveloffset=+2]
include::use_observability.adoc[leveloffset=+2]
include::design_grafana.adoc[leveloffset=+3]
include::grafana_labels.adoc[leveloffset=+3]
include::observability_alerts.adoc[leveloffset=+2]
include::obs_right_size_intro.adoc[leveloffset=+2]
include::obs_right_size_ns.adoc[leveloffset=+3]
include::obs_right_size_config_ns.adoc[leveloffset=+3]
include::obs_right_size_virt.adoc[leveloffset=+3]
include::obs_right_size_config_virt.adoc[leveloffset=+3]
include::obs_mcoa_intro.adoc[leveloffset=+2]
include::obs_enable_mcoa.adoc[leveloffset=+3]
include::obs_mcoa_config_apis.adoc[leveloffset=+3]
include::obs_mcoa_add_custom_metrics.adoc[leveloffset=+3]
include::obs_mcoa_federate.adoc[leveloffset=+3]
include::obs_mcoa_relabel.adoc[leveloffset=+3]
include::export_metrics_ext_endpts.adoc[leveloffset=+3]
include::insights_intro.adoc[leveloffset=+1]
62 changes: 62 additions & 0 deletions observability/obs_enable_mcoa.adoc
@@ -0,0 +1,62 @@
[#enable-mcoa]
= Enabling the multicluster observability add-on

Enable the multicluster observability add-on on your hub cluster to configure metrics collection for your platform and user workloads. Enabling platform metrics is required. Enabling user workload metrics is optional.

When you enable `platform` and `userWorkloads` specifications, the `MultiClusterObservability` operator stops deploying the metrics collectors to your managed clusters and deploys the `multicluster-observability-addon-manager` in the `open-cluster-management-observability` namespace. The `multicluster-observability-addon-manager` deploys the new metrics collectors based on the `PrometheusAgent` resource defined on your hub cluster.

*Required access:* Cluster administrator

.Prerequisites

- You must enable the Observability service on your hub cluster. For more details, see xref:../observability/observability_enable.adoc[Enabling the Observability service].
- You must install the Red Hat OpenShift Cluster Observability Operator. For more information, see link:https://docs.redhat.com/en/documentation/red_hat_openshift_cluster_observability_operator/1-latest/html-single/about_red_hat_openshift_cluster_observability_operator/index[Cluster Observability Operator overview].

.Procedure

Complete the following steps to enable the multicluster observability add-on on your hub cluster:

. To enable platform monitoring and user workload monitoring, add the `platform` and `userWorkloads` specification to your `MultiClusterObservability` resource. Run the following command:

+
[source,bash]
----
oc patch mco observability -n open-cluster-management-observability --type=merge -p '{"spec":{"capabilities":{"platform":{"metrics":{"default":{"enabled": true}}},"userWorkloads":{"metrics":{"default":{"enabled": true}}}}}}'
----
+
Your `MultiClusterObservability` resource might resemble the following file example:

+
[source,yaml]
----
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability
spec:
  capabilities:
    platform:
      metrics:
        default:
          enabled: true
    userWorkloads:
      metrics:
        default:
          enabled: true
----

. Verify that the default `PrometheusAgent` configuration resources for the multicluster observability add-on are created. Run the following command:

+
[source,bash]
----
oc get prometheusagents -n open-cluster-management-observability
----

. Verify that the default configurations are added to your placements. Run the following command:

+
[source,bash]
----
oc get cma multicluster-observability-addon -o yaml | yq '.spec.installStrategy.placements'
----
47 changes: 47 additions & 0 deletions observability/obs_mcoa_add_custom_metrics.adoc
@@ -0,0 +1,47 @@
[#add-custom-metrics-mcoa]
= Adding custom metrics for the multicluster observability add-on

Add your own custom metrics to be collected from your managed clusters by configuring the `ScrapeConfig` resource. The `ScrapeConfig` must be added to the `Placement` configurations of the `ClusterManagementAddOn` resource for deploying on the corresponding managed clusters. The Prometheus operator on each managed cluster adds the new `ScrapeConfig` resource to the `PrometheusAgent` resource.

*Required access:* Cluster administrator

.Prerequisites

- You have installed and enabled the multicluster observability add-on. For more information, see xref:../observability/obs_enable_mcoa.adoc#enable-mcoa[Enabling the multicluster observability add-on].

.Procedure

Complete the following steps to add custom metrics for the multicluster observability add-on:

. Create a new `ScrapeConfig` resource in the `open-cluster-management-observability` namespace that includes values for the required parameters, `jobName`, `metricsPath`, and `params`.

. Add the `app.kubernetes.io/component` label to specify whether the metrics are for platform monitoring or user workload monitoring. Use one of the following label values: `platform-metrics-collector` or `user-workload-metrics-collector`.

+
*Note:* When you use the `platform-metrics-collector` label, the multicluster observability add-on automatically sets the `scrapeClass` and `targets` parameters to enable federation from the platform Prometheus of your {ocp-short} managed cluster. You can override the `scrapeClass` and `targets` parameters by adding the value that you need.

. *Optional:* Manually set the `scrapeClass` and `staticConfigs` specifications for your `user-workload-metrics-collector` `ScrapeConfig` resource.

. Add the `ScrapeConfig` resource reference to the placements of the `ClusterManagementAddOn` resource where you want the resource to be deployed.

+
*Note:* Reference the `ScrapeConfig` resource only after you create it. Otherwise, the add-on status changes to `Deploying` because the referenced resource does not exist.

+
Your `ScrapeConfig` resource might resemble the following YAML file:

+
[source,yaml]
----
apiVersion: monitoring.rhobs/v1alpha1
kind: ScrapeConfig
metadata:
  name: add-custom-metrics
  namespace: open-cluster-management-observability
  labels:
    app.kubernetes.io/component: platform-metrics-collector
spec:
  jobName: some-job-name
  metricsPath: /federate
  params:
    match[]:
    - '{__name__="up"}'
----
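
The placement reference from the procedure can be sketched as the following `ClusterManagementAddOn` fragment. Treat it as an illustration under stated assumptions, not as the exact content of your resource: the `global` placement name and the `open-cluster-management-global-set` namespace are assumptions about your environment, and the `monitoring.rhobs` group and `scrapeconfigs` resource name are derived from the `apiVersion` and `kind` of the preceding `ScrapeConfig` example.

[source,yaml]
----
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: multicluster-observability-addon
spec:
  installStrategy:
    type: Placements
    placements:
    - name: global                                    # assumed placement name
      namespace: open-cluster-management-global-set   # assumed placement namespace
      configs:
      # Keep the existing default configuration references in this list.
      # Append the reference to the ScrapeConfig that you created:
      - group: monitoring.rhobs
        resource: scrapeconfigs
        name: add-custom-metrics
        namespace: open-cluster-management-observability
----

Adding the reference to only some placements limits the custom metrics to the managed clusters that those placements select.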
59 changes: 59 additions & 0 deletions observability/obs_mcoa_config_apis.adoc
@@ -0,0 +1,59 @@
[#config-apis-mcoa]
= Configuring APIs for the multicluster observability add-on

Configure the default metrics-specific APIs of the multicluster observability add-on. When you reference a new placement in your `ClusterManagementAddOn` resource, the `multicluster-observability-addon-manager` automatically creates default `PrometheusAgent` resources for that placement. The `multicluster-observability-addon-manager` then adds references to the `PrometheusAgent` resources in the related `Placement` configurations.

One `PrometheusAgent` resource is created for each placement, but the default `ScrapeConfigs` and `PrometheusRules` are common to all placements. The following resources are the default configuration resources for the multicluster observability add-on: `PrometheusAgent`, `ScrapeConfigs`, and `PrometheusRules`.

*Required access:* Cluster administrator

.Prerequisites

- You have installed and enabled the multicluster observability add-on. For more information, see xref:../observability/obs_enable_mcoa.adoc#enable-mcoa[Enabling the multicluster observability add-on].

.Procedure

Complete the following steps to configure the APIs for the multicluster observability add-on:

. *Optional:* Override the default scrape interval for your `PrometheusAgent` resource by changing the `scrapeInterval` parameter. The default value is `300s`. You can also override the `scrapeInterval` of the `ScrapeConfig` resource.

. Configure the `ScrapeConfig` resource to define a set of metrics for independent federation from Prometheus of your managed clusters. Complete the following steps:

.. Add the name of the job that you want to reference for the `jobName` parameter.
.. To ensure that Prometheus federates metrics, add the `/federate` URL path for the `metricsPath` parameter.
.. Add the metric name and labels that you want to collect. See the following YAML file example where the `ScrapeConfig` resource collects the `up` metric:

+
[source,yaml]
----
apiVersion: monitoring.rhobs/v1alpha1
kind: ScrapeConfig
metadata:
  name: some-metrics-to-collect
  namespace: open-cluster-management-observability
  labels:
    app.kubernetes.io/component: <platform-metrics-collector> or <user-workload-metrics-collector>
spec:
  jobName: some-job-name
  metricsPath: /federate
  params:
    match[]:
    - '{__name__="up"}'
----

. Configure the `PrometheusRule` resource to limit the cardinality of your collected metrics on your hub cluster. Complete the following steps:

.. Define alerting and recording rules for platform and user workload monitoring on your managed clusters.

.. To target user workloads in your `PrometheusRule` resource, add the following annotation to define the namespace where you want to deploy the resource: `observability.open-cluster-management.io/target-namespace`.

+
*Notes:*
- When the `observability.open-cluster-management.io/target-namespace` annotation is not set in your `PrometheusRule` resource, the `PrometheusRule` resource is deployed to the default installation namespace. If `openshift.io/cluster-monitoring` is set to `true`, the `PrometheusRule` resources are not monitored by the {ocp-short} user workload monitoring stack.
- Be sure to use the `monitoring.coreos.com` group for the `PrometheusRule` resource.

. Configure the `AddOnDeploymentConfig` resource to customize the deployment of the multicluster observability add-on on managed clusters. *Note:* The values in the `AddOnDeploymentConfig` resource override direct modifications of other resources. Complete the following steps:

.. Define the namespace where you install the `PrometheusAgent` resource for the `agentInstallNamespace` parameter. The default namespace is `open-cluster-management-agent-addon`.
.. Add the `nodePlacement` specification with the associated `nodeSelector` and `tolerations` parameters.
.. Edit the `proxyConfig` specification by updating the `httpProxy` and `noProxy` configurations.
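
The `PrometheusRule` and `AddOnDeploymentConfig` steps can be sketched with the following two manifests. Both are illustrative sketches under stated assumptions: the resource names, the `my-app-namespace` target namespace, the recording rule, the node selector, the toleration, and the proxy values are hypothetical. The procedure confirms only the annotation key, the `monitoring.coreos.com` group, and the `agentInstallNamespace`, `nodePlacement`, and `proxyConfig` parameters. Your environment might require additional labels that are not shown here.

[source,yaml]
----
# Recording rule for user workloads, deployed to a chosen namespace on the managed clusters
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-user-workload-rules            # hypothetical name
  namespace: open-cluster-management-observability
  annotations:
    observability.open-cluster-management.io/target-namespace: my-app-namespace  # hypothetical namespace
spec:
  groups:
  - name: example.rules
    rules:
    # Aggregating before collection keeps the cardinality on the hub cluster low
    - record: namespace:up:sum
      expr: sum by (namespace) (up)
---
# Deployment customization for the add-on agents on managed clusters
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: AddOnDeploymentConfig
metadata:
  name: example-mcoa-deploy-config             # hypothetical name
  namespace: open-cluster-management-observability
spec:
  agentInstallNamespace: open-cluster-management-agent-addon
  nodePlacement:
    nodeSelector:
      node-role.kubernetes.io/infra: ""        # hypothetical selector
    tolerations:
    - key: node-role.kubernetes.io/infra
      operator: Exists
      effect: NoSchedule
  proxyConfig:
    httpProxy: http://proxy.example.com:3128   # hypothetical proxy
    noProxy: .cluster.local,.svc
----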
34 changes: 34 additions & 0 deletions observability/obs_mcoa_federate.adoc
@@ -0,0 +1,34 @@
[#federate-uw-coo]
= Federating user workloads from Cluster Observability Operator

Federate your user workload metrics from the Red Hat OpenShift Cluster Observability Operator on your managed clusters. By default, user workload metrics are federated from the user workload Prometheus instance on your {ocp-short} cluster.

*Required access:* Cluster administrator

.Prerequisites

- User workload monitoring is enabled on your managed cluster.
- User workload monitoring is enabled in the `MultiClusterObservability` custom resource.
- You have created `ScrapeConfig` resources with the `app.kubernetes.io/component: <user-workload-metrics-collector>` label.
- The `ScrapeConfig` resources are referenced in the configurations list of the `ClusterManagementAddOn` for the target placements.

.Procedure

Complete the following steps to federate user workloads from the Cluster Observability Operator on your managed clusters:

. Update your `ScrapeConfig` resource by adding the endpoint of the Cluster Observability Operator `MonitoringStack` resource. Your resource might resemble the following YAML file:

+
[source,yaml]
----
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
spec:
  scrapeClass: ""
  scheme: HTTP
  staticConfigs:
  - targets:
    - my-monitoring-stack.my-monitoring-ns.svc:9090
----

. If you use a proxy with the Prometheus server, modify the `ScrapeConfig` resource to include your TLS configuration. Create a corresponding `scrapeClass` specification on the user workload `PrometheusAgent` resource. Then reference the `PrometheusAgent` resource within the `ScrapeConfig` resource.
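
The proxy case in the last step can be sketched as the following pair of fragments. This is a hedged illustration, not the confirmed configuration for the add-on: the `coo-proxy-tls` scrape class name, the `coo-proxy-ca` secret, the `<user-workload-prometheus-agent-name>` placeholder, and the target port are all assumptions. The sketch relies on the `scrapeClasses` field of the `PrometheusAgent` resource and the `scrapeClass` field of the `ScrapeConfig` resource, which the `ScrapeConfig` uses to select the class by name.

[source,yaml]
----
# Fragment of the user workload PrometheusAgent resource
apiVersion: monitoring.rhobs/v1alpha1
kind: PrometheusAgent
metadata:
  name: <user-workload-prometheus-agent-name>   # placeholder, use the name from your hub cluster
  namespace: open-cluster-management-observability
spec:
  secrets:
  - coo-proxy-ca                                # hypothetical secret that holds the CA bundle
  scrapeClasses:
  - name: coo-proxy-tls                         # hypothetical scrape class name
    tlsConfig:
      caFile: /etc/prometheus/secrets/coo-proxy-ca/ca.crt
---
# Fragment of the ScrapeConfig resource that selects the scrape class
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
spec:
  scrapeClass: coo-proxy-tls
  scheme: HTTPS
  staticConfigs:
  - targets:
    - my-monitoring-stack.my-monitoring-ns.svc:9090   # adjust the port to your proxy
----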