stolostron · dockerymick · Nov 12, 2025 · Nov 12, 2025 · Nov 12, 2025 · Nov 13, 2025
diff --git a/observability/export_metrics_ext_endpts.adoc b/observability/export_metrics_ext_endpts.adoc
@@ -0,0 +1,48 @@
+[#export-metrics-external]
+= Exporting metrics to external endpoints for the multicluster observability add-on
+
+To configure an external metrics endpoint, add your custom `remoteWrite` specification to your `PrometheusAgent` resources. When you configure your `remoteWrite` specification, the managed cluster sends metrics directly to the external endpoint.
+
+Update related multicluster observability add-on configurations by relabeling default metrics. Exporting metrics from your managed cluster helps you improve resiliency during network partitions between your managed and hub clusters for up to two hours.
+
+*Required access:* Cluster administrator
+
+.Prerequisites
+
+- You have installed and enabled the multicluster observability add-on. For more information, see xref:../observability/obs_enable_mcoa.adoc#enable-mcoa[Enabling the multicluster observability add-on].
+
+.Procedure
+
+Complete the following steps to export metrics to external endpoints for the `PrometheusAgent` resource:
+
+. Create the TLS `Secret` resources in the `open-cluster-management-observability` namespace.
+. Add the secret names to the `secrets` specification of the `PrometheusAgent` resource.
+. Add the `remoteWrite` specification to the `PrometheusAgent` resource. Your `PrometheusAgent` resource might resemble the following YAML file, where the `up` metric is exported to a custom endpoint:
+
++
+[source,yaml]
+----
+apiVersion: monitoring.rhobs/v1alpha1
+kind: PrometheusAgent
+metadata:
+  name: mcoa-default-platform-metrics-collector-global
+  namespace: open-cluster-management-observability
+spec:
+  secrets:
+    - custom-endpoint-ca
+    - custom-endpoint-cert
+  remoteWrite:
+    - name: custom-endpoint
+      tlsConfig:
+        caFile: /etc/prometheus/secrets/custom-endpoint-ca/ca.crt
+        certFile: /etc/prometheus/secrets/custom-endpoint-cert/tls.crt
+        keyFile: /etc/prometheus/secrets/custom-endpoint-cert/tls.key
+      url: 'https://my-custom-remote-write-endpoint.io/api/v1/receive'
+      writeRelabelConfigs:
+        - action: keep
+          regex: ^up$
+          sourceLabels:
+            - __name__
+    - name: acm-observability
+      ...
+----
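+
+The TLS secrets from the first step might resemble the following sketch. The secret names and keys are inferred from the `caFile`, `certFile`, and `keyFile` paths in the previous example, on the assumption that each secret is mounted at `/etc/prometheus/secrets/<secret-name>/`. The `data` values are placeholders that you replace with your own base64-encoded certificates and key:
+
+[source,yaml]
+----
+# Sketch only: the data values are placeholders, not real certificates.
+apiVersion: v1
+kind: Secret
+metadata:
+  name: custom-endpoint-ca
+  namespace: open-cluster-management-observability
+data:
+  ca.crt: <base64-encoded-ca-certificate>
+---
+apiVersion: v1
+kind: Secret
+metadata:
+  name: custom-endpoint-cert
+  namespace: open-cluster-management-observability
+type: kubernetes.io/tls
+data:
+  tls.crt: <base64-encoded-client-certificate>
+  tls.key: <base64-encoded-client-key>
+----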
diff --git a/observability/main.adoc b/observability/main.adoc
@@ -5,11 +5,6 @@ include::modules/common-attributes.adoc[]
 include::observe_environments_intro.adoc[leveloffset=+1]
 include::observability_arch.adoc[leveloffset=+2]
 include::obs_config.adoc[leveloffset=+2]
-include::observability_enable.adoc[leveloffset=+2]
-include::use_observability.adoc[leveloffset=+2]
-include::design_grafana.adoc[leveloffset=+3]
-include::grafana_labels.adoc[leveloffset=+3]
-include::observability_alerts.adoc[leveloffset=+2]
 include::adv_config_obs.adoc[leveloffset=+2]
 include::obs_metrics.adoc[leveloffset=+3]
 include::obs_scale.adoc[leveloffset=+3]
@@ -20,9 +15,21 @@ include::obs_update_mco.adoc[leveloffset=+3]
 include::obs_pv_pvc.adoc[leveloffset=+3]
 include::obs_custom_alert.adoc[leveloffset=+3]
 include::obs_rbac.adoc[leveloffset=+3]
+include::observability_enable.adoc[leveloffset=+2]
+include::use_observability.adoc[leveloffset=+2]
+include::design_grafana.adoc[leveloffset=+3]
+include::grafana_labels.adoc[leveloffset=+3]
+include::observability_alerts.adoc[leveloffset=+2]
 include::obs_right_size_intro.adoc[leveloffset=+2]
 include::obs_right_size_ns.adoc[leveloffset=+3]
 include::obs_right_size_config_ns.adoc[leveloffset=+3]
 include::obs_right_size_virt.adoc[leveloffset=+3]
 include::obs_right_size_config_virt.adoc[leveloffset=+3]
+include::obs_mcoa_intro.adoc[leveloffset=+2]
+include::obs_enable_mcoa.adoc[leveloffset=+3]
+include::obs_mcoa_config_apis.adoc[leveloffset=+3]
+include::obs_mcoa_add_custom_metrics.adoc[leveloffset=+3]
+include::obs_mcoa_federate.adoc[leveloffset=+3]
+include::obs_mcoa_relabel.adoc[leveloffset=+3]
+include::export_metrics_ext_endpts.adoc[leveloffset=+3]
 include::insights_intro.adoc[leveloffset=+1]
diff --git a/observability/obs_enable_mcoa.adoc b/observability/obs_enable_mcoa.adoc
@@ -0,0 +1,62 @@
+[#enable-mcoa]
+= Enabling the multicluster observability add-on
+
+Enable the multicluster observability add-on on your hub cluster to configure metrics collection for your platform and user workloads. You must enable platform workload metrics. Enabling user workload metrics is optional.
+
+When you enable `platform` and `userWorkloads` specifications, the `MultiClusterObservability` operator stops deploying the metrics collectors to your managed clusters and deploys the `multicluster-observability-addon-manager` in the `open-cluster-management-observability` namespace. The `multicluster-observability-addon-manager` deploys the new metrics collectors based on the `PrometheusAgent` resource defined on your hub cluster.
+
+*Required access:* Cluster administrator
+
+.Prerequisites
+
+- You must enable the Observability service on your hub cluster. For more details, see xref:../observability/observability_enable.adoc[Enabling the Observability service].
+- You must install the Red Hat OpenShift Cluster Observability Operator. For more information, see link:https://docs.redhat.com/en/documentation/red_hat_openshift_cluster_observability_operator/1-latest/html-single/about_red_hat_openshift_cluster_observability_operator/index[Cluster Observability Operator overview].
+
+.Procedure
+
+Complete the following steps to enable the multicluster observability add-on on your hub cluster:
+
+. To enable platform monitoring and user workload monitoring, add the `platform` and `userWorkloads` specifications to your `MultiClusterObservability` resource. Run the following command:
+
++
+[source,bash]
+----
+oc patch mco observability -n open-cluster-management-observability --type=merge -p '{"spec":{"capabilities":{"platform":{"metrics":{"default":{"enabled": true}}},"userWorkloads":{"metrics":{"default":{"enabled": true}}}}}}'
+----
++
+Your `MultiClusterObservability` resource might resemble the following file example:
+
++
+[source,yaml]
+----
+apiVersion: observability.open-cluster-management.io/v1beta2
+kind: MultiClusterObservability
+metadata:
+  name: observability
+spec:
+  capabilities:
+    platform:
+      metrics:
+        default:
+          enabled: true
+    userWorkloads:
+      metrics:
+        default:
+          enabled: true
+----
+
+. To verify that the default configuration resources for the multicluster observability add-on are created, list the `PrometheusAgent` resources in the `open-cluster-management-observability` namespace. Run the following command:
+
++
+[source,bash]
+----
+oc get prometheusagents -n open-cluster-management-observability
+----
+
+. Verify that the default configurations are added to the placements of your `multicluster-observability-addon` `ClusterManagementAddOn` resource. Run the following command:
+
++
+[source,bash]
+----
+oc get cma multicluster-observability-addon -o yaml | yq '.spec.installStrategy.placements'
+----
diff --git a/observability/obs_mcoa_add_custom_metrics.adoc b/observability/obs_mcoa_add_custom_metrics.adoc
@@ -0,0 +1,47 @@
+[#add-custom-metrics-mcoa]
+= Adding custom metrics for the multicluster observability add-on
+
+Add your own custom metrics to be collected from your managed clusters by configuring the `ScrapeConfig` resource. The `ScrapeConfig` must be added to the `Placement` configurations of the `ClusterManagementAddOn` resource for deploying on the corresponding managed clusters. The Prometheus operator on each managed cluster adds the new `ScrapeConfig` resource to the `PrometheusAgent` resource.
+
+*Required access:* Cluster administrator
+
+.Prerequisites
+
+- You have installed and enabled the multicluster observability add-on. For more information, see xref:../observability/obs_enable_mcoa.adoc#enable-mcoa[Enabling the multicluster observability add-on].
+
+.Procedure
+
+Complete the following steps to add custom metrics for the multicluster observability add-on:
+
+. Create a new `ScrapeConfig` resource in the `open-cluster-management-observability` namespace that includes values for the required parameters, `jobName`, `metricsPath`, and `params`.
+
+. Add the appropriate label for the `app.kubernetes.io/component` parameter to specify whether the metrics are for platform monitoring or user workload monitoring. Use one of the following label values: `platform-metrics-collector` or `user-workload-metrics-collector`.
+
++
+*Note:* When you use the `platform-metrics-collector` label, the multicluster observability add-on automatically sets the `scrapeClass` and `targets` parameters to enable federation from the platform Prometheus of your {ocp-short} managed cluster. You can override the `scrapeClass` and `targets` parameters by adding the value that you need.
+
+. *Optional:* Manually set the `scrapeClass` and `staticConfigs` specifications for your `user-workload-metrics-collector` `ScrapeConfig` resource.
+
+. Add the `ScrapeConfig` resource reference to the placements of the `ClusterManagementAddOn` resource where you want the resource to be deployed.
+
++
+*Note:* Ensure that you reference the `ScrapeConfig` resource only after you create it. Otherwise, the add-on status updates to a `Deploying` status because the resource does not exist.
+
++
+Your `ScrapeConfig` resource might resemble the following YAML file:
+
++
+[source,yaml]
+----
+apiVersion: monitoring.rhobs/v1alpha1
+kind: ScrapeConfig
+metadata:
+  name: add-custom-metrics
+  namespace: open-cluster-management-observability
+  labels:
+    app.kubernetes.io/component: platform-metrics-collector
+spec:
+  jobName: some-job-name
+  metricsPath: /federate
+  params:
+    match[]:
+    - '{__name__="up"}'
+----
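+
+The placement reference from the last step might resemble the following sketch of the `ClusterManagementAddOn` resource. The `global` placement in the `open-cluster-management-global-set` namespace is an assumption for illustration. Use the placements that exist in your environment and keep any configurations that are already listed for them:
+
+[source,yaml]
+----
+# Sketch only: the placement name and namespace are assumed values.
+apiVersion: addon.open-cluster-management.io/v1alpha1
+kind: ClusterManagementAddOn
+metadata:
+  name: multicluster-observability-addon
+spec:
+  installStrategy:
+    type: Placements
+    placements:
+      - name: global
+        namespace: open-cluster-management-global-set
+        configs:
+          # Reference to the ScrapeConfig resource from the previous example
+          - group: monitoring.rhobs
+            resource: scrapeconfigs
+            name: add-custom-metrics
+            namespace: open-cluster-management-observability
+----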
diff --git a/observability/obs_mcoa_config_apis.adoc b/observability/obs_mcoa_config_apis.adoc
@@ -0,0 +1,59 @@
+[#config-apis-mcoa]
+= Configuring APIs for the multicluster observability add-on
+
+Configure the default metrics-specific APIs of the multicluster observability add-on. When you reference a new placement in your `ClusterManagementAddOn` resource, the `multicluster-observability-addon-manager` automatically creates specific default `PrometheusAgent` resources. The `multicluster-observability-addon-manager` adds a reference to the `PrometheusAgent` resources in the related `Placement` configurations.
+
+While the `PrometheusAgent` resources are created for each placement, the default `ScrapeConfigs` and `PrometheusRules` are common to all placements. The following resources are the default configuration resources for the multicluster observability add-on: `PrometheusAgent`, `ScrapeConfigs`, and `PrometheusRules`.
+
+*Required access:* Cluster administrator
+
+.Prerequisites
+
+- You have installed and enabled the multicluster observability add-on. For more information, see xref:../observability/obs_enable_mcoa.adoc#enable-mcoa[Enabling the multicluster observability add-on].
+
+.Procedure
+
+Complete the following steps to configure the APIs for the multicluster observability add-on:
+
+. *Optional:* Override the default scrape interval for your `PrometheusAgent` resource by changing the `scrapeInterval` parameter. The default value is `300s`. You can also override the `scrapeInterval` of the `ScrapeConfig` resource.
+
+. Configure the `ScrapeConfig` resource to define a set of metrics for independent federation from the Prometheus instance on your managed clusters. Complete the following steps:
+
+.. Add the name of the job that you want to reference for the `jobName` parameter.
+.. To ensure that Prometheus federates metrics, add the `/federate` URL path for the `metricsPath` parameter.
+.. Add the metric name and labels that you want to collect. See the following YAML file example where the `ScrapeConfig` resource collects the `up` metric:
+
++
+[source,yaml]
+----
+apiVersion: monitoring.rhobs/v1alpha1
+kind: ScrapeConfig
+metadata:
+  name: some-metrics-to-collect
+  namespace: open-cluster-management-observability
+  labels:
+    app.kubernetes.io/component: platform-metrics-collector # or user-workload-metrics-collector
+spec:
+  jobName: some-job-name
+  metricsPath: /federate
+  params:
+    match[]:
+    - '{__name__="up"}'
+----
+
+. Configure the `PrometheusRule` resource to limit the cardinality of your collected metrics on your hub cluster. Complete the following steps:
+
+.. Define alerting and recording rules for platform and user workload monitoring on your managed clusters. 
+
+.. To target user workloads in your `PrometheusRule` resource, add the following annotation to define the namespace where you want to deploy the resource: `observability.open-cluster-management.io/target-namespace`. 
+
++
+*Notes:* 
+- When the `observability.open-cluster-management.io/target-namespace` annotation is not set in your `PrometheusRule` resource, the `PrometheusRule` resources are deployed to the default installation namespace. If `openshift.io/cluster-monitoring` is set to `true`, the `PrometheusRule` resources are not monitored by the {ocp-short} user workload monitoring stack.
+- Be sure to use the `monitoring.coreos.com` group for the `PrometheusRule` resource.
+
+. Configure the `AddOnDeploymentConfig` resource to customize the deployment of the multicluster observability add-on on managed clusters. *Note:* The values in the `AddOnDeploymentConfig` resource override direct modifications of other resources. Complete the following steps:
+
+.. Define the namespace where you install the `PrometheusAgent` resource for the `agentInstallNamespace` parameter. The default namespace is `open-cluster-management-agent-addon`.
+.. Add the `nodePlacement` specification with the associated `nodeSelector` and `tolerations` parameters.
+.. Edit the `proxyConfig` specification by updating the `httpProxy` and `noProxy` configurations.
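+
+An `AddOnDeploymentConfig` resource that combines the settings from the last step might resemble the following sketch. The resource name, its namespace, the node label, the toleration, and the proxy values are placeholders that are assumed for illustration, not product defaults:
+
+[source,yaml]
+----
+# Sketch only: names, labels, and proxy values are illustrative placeholders.
+apiVersion: addon.open-cluster-management.io/v1alpha1
+kind: AddOnDeploymentConfig
+metadata:
+  name: <addon-deployment-config-name>
+  namespace: open-cluster-management-observability
+spec:
+  # Namespace on the managed cluster where the PrometheusAgent resource is installed
+  agentInstallNamespace: open-cluster-management-agent-addon
+  nodePlacement:
+    nodeSelector:
+      node-role.kubernetes.io/infra: ""
+    tolerations:
+      - key: node-role.kubernetes.io/infra
+        operator: Exists
+        effect: NoSchedule
+  proxyConfig:
+    httpProxy: http://proxy.example.com:3128
+    noProxy: .cluster.local,.svc
+----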
diff --git a/observability/obs_mcoa_federate.adoc b/observability/obs_mcoa_federate.adoc
@@ -0,0 +1,34 @@
+[#federate-uw-coo]
+= Federating user workloads from Cluster Observability Operator
+
+Federate your user workloads from the Red Hat OpenShift Cluster Observability Operator on your managed clusters. By default, user workload metrics are federated from the user workload Prometheus instance on your {ocp-short} cluster.
+
+*Required access:* Cluster administrator
+
+.Prerequisites
+
+- User workload monitoring is enabled on your managed cluster.
+- User workload monitoring is enabled in the `MultiClusterObservability` custom resource.
+- You have created `ScrapeConfig` resources with the `app.kubernetes.io/component: user-workload-metrics-collector` label.
+- The `ScrapeConfig` resources are referenced in the configurations list of the `ClusterManagementAddOn` for the target placements.
+
+.Procedure
+
+Complete the following steps to federate user workloads from the Cluster Observability Operator on your managed clusters:
+
+. Update your `ScrapeConfig` resource by adding the endpoint of the Cluster Observability Operator `MonitoringStack` resource. Your resource might resemble the following YAML file:
+
++
+[source,yaml]
+----
+apiVersion: monitoring.rhobs/v1alpha1
+kind: ScrapeConfig
+spec:
+  scrapeClass: ""
+  scheme: HTTP
+  staticConfigs:
+  - targets:
+    - my-monitoring-stack.my-monitoring-ns.svc:9090
+----
+
+. If you use a proxy with the Prometheus server, modify the `ScrapeConfig` resource to include your TLS configuration. Create a corresponding `scrapeClass` specification on the user workload `PrometheusAgent` resource. Then reference the `PrometheusAgent` resource within the `ScrapeConfig` resource.
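+
+The proxy configuration from the last step might resemble the following sketch. The `coo-proxy-tls` scrape class name and the CA file path are hypothetical, and the sketch assumes that the `ScrapeConfig` resource selects the TLS settings through the name of the scrape class. Only the fields that are relevant to this step are shown:
+
+[source,yaml]
+----
+# Sketch only: the scrape class name and the file path are hypothetical.
+apiVersion: monitoring.rhobs/v1alpha1
+kind: PrometheusAgent
+spec:
+  scrapeClasses:
+    - name: coo-proxy-tls
+      tlsConfig:
+        caFile: /path/to/ca.crt
+---
+# Fragment of the ScrapeConfig resource that uses the scrape class
+kind: ScrapeConfig
+spec:
+  scrapeClass: coo-proxy-tls
+  scheme: HTTPS
+  staticConfigs:
+  - targets:
+    - my-monitoring-stack.my-monitoring-ns.svc:9090
+----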