https://issues.redhat.com/browse/ACM-23917 MCOA #8364
Open
dockerymick
wants to merge
61
commits into
2.15_stage
from
mj-ACM-23917
+376
−5
744e3e0
Updates
dockerymick ed6f37f
more small updates
dockerymick 5f8d1e0
adding more changes
dockerymick 50bc00c
Apply suggestions from code review
dockerymick d4c0d5a
Apply suggestion from @dockerymick
dockerymick ccc2581
update API descriptions
dockerymick 74237e9
Configuring MCOA APIs
dockerymick 5340907
More updates
dockerymick c0876e4
Apply suggestion from @dockerymick
dockerymick 7ea9620
Update observability introduction documentation
dockerymick 0848959
Update obs_mcoa_federate.adoc
dockerymick 92b16d1
Add relabeling documentation to main.adoc
dockerymick 6e3ac21
Document relabeling default metrics for MCOA
dockerymick 084be98
Update links and comments in obs_mcoa_intro.adoc
dockerymick 20de748
Adding link to intro
dockerymick e925b24
Add documentation for exporting metrics to external endpoints
dockerymick 7bd1e77
Add export_metrics_ext_endpts.adoc to main.adoc
dockerymick 6eaa135
Enhance observability documentation with cluster ID details
dockerymick 6475970
Fix formatting of API descriptions in documentation
dockerymick a739b8f
Rename obs_mcoa-relabel.adoc to obs_mcoa_relabel.adoc
dockerymick a9d0fcb
Fix duplicate entry for RightSizingRecommendation
dockerymick 49e5730
Revise MultiClusterObservability documentation and commands
dockerymick 462f5d9
Updates and hidden comments after dev review
dockerymick 913e41c
Updates after dev review
dockerymick 41232e2
remove hidden comment
dockerymick 4489634
Remove anchors from H2 sections
dockerymick 3f9e16a
Live review with Thibault
dockerymick 2f18cb7
More from the live review
dockerymick d9b9257
More updates during live review
dockerymick e926995
More updates during live review
dockerymick e810918
Apply suggestion from @thibaultmg
dockerymick 1e75131
Update commands after dev review
dockerymick 7aab063
Apply suggestion from @thibaultmg
dockerymick 8c8bb38
Apply suggestion from @thibaultmg
dockerymick 25e8d9c
Apply suggestion from @thibaultmg
dockerymick 85d624d
Apply suggestion from @thibaultmg
dockerymick c1ed37b
Update ScrapeConfig for platform metrics alerts
dockerymick 77c94bb
Revise export metrics documentation for clarity
dockerymick 2ea769f
Updates after dev review
dockerymick 2e41be6
Update obs_mcoa_config_apis.adoc
dockerymick 11e90d6
changes after dev review
dockerymick 53d70fe
update after dev review
dockerymick a02ffbe
Apply suggestion from @thibaultmg
dockerymick e3e0cd2
Apply suggestion from @dockerymick
dockerymick d4afbd2
updates after peer review
dockerymick 320a3e6
Update obs_enable_mcoa.adoc
dockerymick 3bf5941
Update obs_mcoa_add_custom_metrics.adoc
dockerymick 7145825
Update observability/obs_mcoa_add_custom_metrics.adoc
dockerymick 82fc602
Update observability/obs_mcoa_config_apis.adoc
dockerymick 1396a6f
Apply suggestion from @dockerymick
dockerymick 0d4069d
Merge branch '2.15_stage' into mj-ACM-23917
dockerymick cd8dd5d
Update obs_mcoa_config_apis.adoc
dockerymick 375a1e9
Update obs_mcoa_federate.adoc
dockerymick 5a45610
Apply suggestion from @dockerymick
dockerymick 76fb870
Update obs_mcoa_intro.adoc
dockerymick 8f0369e
Update obs_mcoa_relabel.adoc
dockerymick 80efa3d
Update obs_mcoa_config_apis.adoc
dockerymick 527a13a
Update observability/obs_mcoa_intro.adoc
dockerymick d02263d
Improving intro description
dockerymick 8f18b7c
IMproving what's new entry
dockerymick 92a1590
Update observability/obs_mcoa_config_apis.adoc
dockerymick
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| [#export-metrics-external] | ||
| = Exporting metrics to external endpoints for the multicluster observability add-on | ||
|
|
||
| To configure an external metrics endpoint, add your custom `remoteWrite` specification to your `PrometheusAgent` resources. When you configure your `remoteWrite` specification, the managed cluster sends metrics directly to the external endpoint. | ||
|
|
||
| Update related multicluster observability add-on configurations by relabeling default metrics. Exporting metrics from your managed cluster helps you improve resiliency during network partitions between your managed and hub clusters for up to two hours. | ||
|
|
||
| *Required access:* Cluster administrator | ||
|
|
||
| .Prerequisites | ||
|
|
||
| - You have installed and enabled the multicluster observability add-on. For more information, see xref:../observability/obs_enable_mcoa.adoc#enable-mcoa[Enabling the multicluster observability add-on]. | ||
|
|
||
| .Procedure | ||
|
|
||
| Complete the following steps to export metrics to external endpoints for the `PrometheusAgent` resource: | ||
|
|
||
| . Create the TLS `Secret` resources within the `open-cluster-management-observability` namespace. | ||
| . Add the secret names to the `secrets` specification of the `PrometheusAgent` resource. | ||
| . Add the `remoteWrite` specification to the `PrometheusAgent` resource. Your `PrometheusAgent` resource might resemble the following YAML file, where the `up` metric is exported to a custom endpoint: | ||
|
|
||
| + | ||
| [source,yaml] | ||
| ---- | ||
| apiVersion: monitoring.rhobs/v1alpha1 | ||
| kind: PrometheusAgent | ||
| metadata: | ||
|   name: mcoa-default-platform-metrics-collector-global | ||
|   namespace: open-cluster-management-observability | ||
| spec: | ||
|   secrets: | ||
|   - custom-endpoint-ca | ||
|   - custom-endpoint-cert | ||
|   remoteWrite: | ||
|   - name: custom-endpoint | ||
|     tlsConfig: | ||
|       caFile: /etc/prometheus/secrets/custom-endpoint-ca/ca.crt | ||
|       certFile: /etc/prometheus/secrets/custom-endpoint-cert/tls.crt | ||
|       keyFile: /etc/prometheus/secrets/custom-endpoint-cert/tls.key | ||
|     url: 'https://my-custom-remote-write-endpoint.io/api/v1/receive' | ||
|     writeRelabelConfigs: | ||
|     - action: keep | ||
|       regex: ^up$ | ||
|       sourceLabels: | ||
|       - __name__ | ||
|   - name: acm-observability | ||
|     ... | ||
| ---- | ||
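A minimal sketch of the `Secret` resources that the first procedure step refers to, assuming the secret names and key names implied by the `secrets` list and the `tlsConfig` file paths in the preceding `PrometheusAgent` example. The base64 placeholders and the choice of the `Opaque` and `kubernetes.io/tls` secret types are assumptions for illustration and are not part of the documented procedure:

```yaml
# Assumed manifests. Key names mirror the mount paths used in the example:
# /etc/prometheus/secrets/<secret-name>/<key>
apiVersion: v1
kind: Secret
metadata:
  name: custom-endpoint-ca
  namespace: open-cluster-management-observability
type: Opaque
data:
  ca.crt: <base64-encoded-CA-certificate>      # placeholder, supply your own value
---
apiVersion: v1
kind: Secret
metadata:
  name: custom-endpoint-cert
  namespace: open-cluster-management-observability
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded-client-certificate>  # placeholder, supply your own value
  tls.key: <base64-encoded-client-key>          # placeholder, supply your own value
```

Keeping the CA and the client certificate in separate secrets matches the two entries in the `secrets` list of the example, so each file path resolves under its own secret directory.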
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,62 @@ | ||
| [#enable-mcoa] | ||
| = Enabling the multicluster observability add-on | ||
|
|
||
| Enable the multicluster observability add-on on your hub cluster to configure your metrics for your platform and user workloads. You must enable platform metrics. Enabling user workload metrics is optional. | ||
|
|
||
| When you enable `platform` and `userWorkloads` specifications, the `MultiClusterObservability` operator stops deploying the metrics collectors to your managed clusters and deploys the `multicluster-observability-addon-manager` in the `open-cluster-management-observability` namespace. The `multicluster-observability-addon-manager` deploys the new metrics collectors based on the `PrometheusAgent` resource defined on your hub cluster. | ||
|
|
||
| *Required access:* Cluster administrator | ||
|
|
||
| .Prerequisites | ||
|
|
||
| - You must enable the Observability service on your hub cluster. For more details, see xref:../observability/observability_enable.adoc[Enabling the Observability service]. | ||
| - You must install the Red Hat OpenShift Cluster Observability Operator. For more information, see link:https://docs.redhat.com/en/documentation/red_hat_openshift_cluster_observability_operator/1-latest/html-single/about_red_hat_openshift_cluster_observability_operator/index[Cluster Observability Operator overview]. | ||
|
|
||
|
||
| .Procedure | ||
|
|
||
| Complete the following steps to enable the multicluster observability add-on on your hub cluster: | ||
|
|
||
| . To enable platform monitoring and user workload monitoring, add the `platform` and `userWorkloads` specifications to your `MultiClusterObservability` resource. Run the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc patch mco observability -n open-cluster-management-observability --type=merge -p '{"spec":{"capabilities":{"platform":{"metrics":{"default":{"enabled": true}}},"userWorkloads":{"metrics":{"default":{"enabled": true}}}}}}' | ||
| ---- | ||
| + | ||
| Your `MultiClusterObservability` resource might resemble the following file example: | ||
|
|
||
| + | ||
| [source,yaml] | ||
| ---- | ||
| apiVersion: observability.open-cluster-management.io/v1beta2 | ||
| kind: MultiClusterObservability | ||
| metadata: | ||
|   name: observability | ||
| spec: | ||
|   capabilities: | ||
|     platform: | ||
|       metrics: | ||
|         default: | ||
|           enabled: true | ||
|     userWorkloads: | ||
|       metrics: | ||
|         default: | ||
|           enabled: true | ||
| ---- | ||
|
|
||
| . To verify that the default configuration resources for the multicluster observability add-on are created, list the `PrometheusAgent` resources in the `open-cluster-management-observability` namespace. Run the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
|
||
| ---- | ||
| oc get prometheusagents -n open-cluster-management-observability | ||
| ---- | ||
|
|
||
| . Verify that the default configurations are added to your placements. Run the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc get cma multicluster-observability-addon -o yaml | yq '.spec.installStrategy.placements' | ||
| ---- | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| [#add-custom-metrics-mcoa] | ||
| = Adding custom metrics for the multicluster observability add-on | ||
|
|
||
| Add your own custom metrics to be collected from your managed clusters by configuring the `ScrapeConfig` resource. The `ScrapeConfig` must be added to the `Placement` configurations of the `ClusterManagementAddOn` resource for deploying on the corresponding managed clusters. The Prometheus operator on each managed cluster adds the new `ScrapeConfig` resource to the `PrometheusAgent` resource. | ||
|
|
||
| *Required access:* Cluster administrator | ||
|
|
||
| .Prerequisites | ||
|
|
||
| - You have installed and enabled the multicluster observability add-on. For more information, see xref:../observability/obs_enable_mcoa.adoc#enable-mcoa[Enabling the multicluster observability add-on]. | ||
|
|
||
| Complete the following steps to add custom metrics for the multicluster observability add-on: | ||
|
|
||
| . Create a new `ScrapeConfig` resource in the `open-cluster-management-observability` namespace that includes values for the required parameters, `jobName`, `metricsPath`, and `params`. | ||
|
|
||
| . Add the appropriate label for the `app.kubernetes.io/component` parameter to specify whether the metrics are for platform monitoring or user workload monitoring. Use one of the following label values: `platform-metrics-collector` or `user-workload-metrics-collector`. | ||
|
||
|
|
||
| + | ||
| *Note:* When you use the `platform-metrics-collector` label, the multicluster observability add-on automatically sets the `scrapeClass` and `targets` parameters to enable federation from the platform Prometheus of your {ocp-short} managed cluster. You can override the `scrapeClass` and `targets` parameters by adding the value that you need. | ||
|
|
||
| . *Optional:* Manually set the `scrapeClass` and `staticConfigs` specifications for your `user-workload-metrics-collector` `ScrapeConfig` resource. | ||
|
|
||
| . Add the `ScrapeConfig` resource reference to the placements of the `ClusterManagementAddOn` resource where you want the resource to be deployed. | ||
|
|
||
| + | ||
| *Note:* Ensure that you reference the `ScrapeConfig` resource only after you create it. Otherwise, the add-on status updates to a `Deploying` status because the resource does not exist. | ||
|
|
||
| + | ||
| Your `ScrapeConfig` resource might resemble the following YAML file: | ||
|
|
||
| + | ||
| [source,yaml] | ||
| ---- | ||
| apiVersion: monitoring.rhobs/v1alpha1 | ||
| kind: ScrapeConfig | ||
| metadata: | ||
|   name: add-custom-metrics | ||
|   namespace: open-cluster-management-observability | ||
|   labels: | ||
|     app.kubernetes.io/component: platform-metrics-collector | ||
| spec: | ||
|   jobName: some-job-name | ||
|   metricsPath: /federate | ||
|   params: | ||
|     match[]: | ||
|     - '{__name__="up"}' | ||
| ---- | ||
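The step that adds the `ScrapeConfig` reference to the placements of the `ClusterManagementAddOn` resource could look roughly like the following fragment. This is a sketch under assumptions: the `configs` entry layout (`group`, `resource`, `name`, `namespace`) follows the Open Cluster Management add-on API as I understand it, the `monitoring.rhobs` group and `scrapeconfigs` resource name are inferred from the `apiVersion` and `kind` of the preceding example, and the placement name and namespace are placeholders:

```yaml
# Assumed fragment of the ClusterManagementAddOn resource; only the new reference is shown.
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: multicluster-observability-addon
spec:
  installStrategy:
    type: Placements
    placements:
    - name: <placement-name>              # placeholder for your placement
      namespace: <placement-namespace>    # placeholder for your placement namespace
      configs:
      # Keep the default references that the add-on manager already added.
      - group: monitoring.rhobs           # inferred from apiVersion monitoring.rhobs/v1alpha1
        resource: scrapeconfigs           # assumed plural resource name for ScrapeConfig
        name: add-custom-metrics
        namespace: open-cluster-management-observability
```

Create the `ScrapeConfig` resource before you add this reference, as the note in the procedure explains.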
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,59 @@ | ||||||||||
| [#config-apis-mcoa] | ||||||||||
| = Configuring APIs for the multicluster observability add-on | ||||||||||
|
|
||||||||||
| Configure the default metrics-specific APIs of the multicluster observability add-on. When you reference a new placement in your `ClusterManagementAddOn` resource, the `multicluster-observability-addon-manager` automatically creates specific default `PrometheusAgent` resources. The `multicluster-observability-addon-manager` then adds references to the `PrometheusAgent` resources in the related `Placement` configurations. | ||||||||||
|
|
||||||||||
| While one `PrometheusAgent` resource is created for each placement, the default `ScrapeConfig` and `PrometheusRule` resources are common to all placements. The following resources are the default configuration resources for the multicluster observability add-on: `PrometheusAgent`, `ScrapeConfig`, and `PrometheusRule`. | ||||||||||
|
|
||||||||||
| *Required access:* Cluster administrator | ||||||||||
|
|
||||||||||
| .Prerequisites | ||||||||||
|
|
||||||||||
| - You have installed and enabled the multicluster observability add-on. For more information, see xref:../observability/obs_enable_mcoa.adoc#enable-mcoa[Enabling the multicluster observability add-on]. | ||||||||||
|
|
||||||||||
|
|
||||||||||
| .Procedure | ||||||||||
|
|
||||||||||
| Complete the following steps to configure the APIs for the multicluster observability add-on: | ||||||||||
|
|
||||||||||
| . *Optional:* Override the default scrape interval for your `PrometheusAgent` resource by changing the `scrapeInterval` parameter. The default value is `300s`. You can also override the `scrapeInterval` parameter of the `ScrapeConfig` resource. | ||||||||||
|
|
||||||||||
| . Configure the `ScrapeConfig` resource to define a set of metrics for independent federation from Prometheus of your managed clusters. Complete the following steps: | ||||||||||
|
|
||||||||||
| .. Add the name of the job that you want to reference for the `jobName` parameter. | ||||||||||
| .. To ensure that Prometheus federates metrics, add the `/federate` URL path for the `metricsPath` parameter. | ||||||||||
| .. Add the metric name and labels that you want to collect. See the following YAML file example where the `ScrapeConfig` resource collects the `up` metric: | ||||||||||
|
|
||||||||||
| + | ||||||||||
| [source,yaml] | ||||||||||
| ---- | ||||||||||
| apiVersion: monitoring.rhobs/v1alpha1 | ||||||||||
| kind: ScrapeConfig | ||||||||||
| metadata: | ||||||||||
|   name: some-metrics-to-collect | ||||||||||
|   namespace: open-cluster-management-observability | ||||||||||
|   labels: | ||||||||||
|     app.kubernetes.io/component: platform-metrics-collector # or user-workload-metrics-collector | ||||||||||
| spec: | ||||||||||
|   jobName: some-job-name | ||||||||||
|   metricsPath: /federate | ||||||||||
|   params: | ||||||||||
|     match[]: | ||||||||||
|     - '{__name__="up"}' | ||||||||||
| ---- | ||||||||||
|
|
||||||||||
| . Configure the `PrometheusRule` resource to limit the cardinality of your collected metrics on your hub cluster. Complete the following steps: | ||||||||||
|
|
||||||||||
| .. Define alerting and recording rules for platform and user workload monitoring on your managed clusters. | ||||||||||
|
|
||||||||||
| .. To target user workloads in your `PrometheusRule` resource, add the following annotation to define the namespace where you want to deploy the resource: `observability.open-cluster-management.io/target-namespace`. | ||||||||||
|
|
||||||||||
| + | ||||||||||
| *Notes:* | ||||||||||
| - When the `observability.open-cluster-management.io/target-namespace` annotation is not set in your `PrometheusRule` resource, the `PrometheusRule` resources are deployed to the default installation namespace. If `openshift.io/cluster-monitoring` is set to `true`, the `PrometheusRule` resources are not monitored by the {ocp-short} user workload monitoring stack. | ||||||||||
| - Be sure to use the `monitoring.coreos.com` group for the `PrometheusRule` resource. | ||||||||||
|
|
||||||||||
| . Configure the `AddOnDeploymentConfig` resource to customize the deployment of the multicluster observability add-on on managed clusters. *Note:* The values in the `AddOnDeploymentConfig` resource override direct modifications of other resources. Complete the following steps: | ||||||||||
|
|
||||||||||
| .. Define the namespace where you install the `PrometheusAgent` resource for the `agentInstallNamespace` parameter. The default namespace is `open-cluster-management-agent-addon`. | ||||||||||
| .. Add the `nodePlacement` specification with the associated `nodeSelector` and `tolerations` parameters. | ||||||||||
| .. Edit the `proxyConfig` specification by updating the `httpProxy` and `noProxy` configurations. | ||||||||||
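The `PrometheusRule` and add-on deployment configuration steps above can be sketched as follows. Treat both manifests as illustrations under assumptions: the resource names, the target namespace, the recording rule, the node selector, the toleration, and the proxy values are hypothetical, and the hub namespace for the `PrometheusRule` resource is assumed to match the namespace used for the `ScrapeConfig` examples. Any additional labels that your environment requires are not shown.

```yaml
# Sketch 1: a PrometheusRule in the monitoring.coreos.com group that targets
# a user workload namespace through the target-namespace annotation.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-recording-rules                      # hypothetical name
  namespace: open-cluster-management-observability   # assumed hub namespace
  annotations:
    observability.open-cluster-management.io/target-namespace: my-app-namespace  # hypothetical namespace
spec:
  groups:
  - name: example.rules
    rules:
    # Aggregating before collection is one way to limit cardinality.
    - record: job:up:sum
      expr: sum by (job) (up)
---
# Sketch 2: an AddOnDeploymentConfig with the three settings named in the procedure.
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: AddOnDeploymentConfig
metadata:
  name: example-mcoa-deployment-config               # hypothetical name
  namespace: open-cluster-management-observability   # assumed namespace
spec:
  agentInstallNamespace: open-cluster-management-agent-addon  # the documented default
  nodePlacement:
    nodeSelector:
      node-role.kubernetes.io/infra: ""              # hypothetical selector
    tolerations:
    - key: node-role.kubernetes.io/infra             # hypothetical toleration
      operator: Exists
      effect: NoSchedule
  proxyConfig:
    httpProxy: http://proxy.example.com:3128         # hypothetical proxy URL
    noProxy: .cluster.local,.svc                     # hypothetical exclusions
```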
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,34 @@ | ||||||||||
| [#federate-uw-coo] | ||||||||||
| = Federating user workloads from Cluster Observability Operator | ||||||||||
|
|
||||||||||
| Federate your user workloads from the Red Hat OpenShift Cluster Observability Operator on your managed clusters. By default, user workload metrics are federated from the user workload Prometheus on your {ocp-short} cluster. | ||||||||||
|
|
||||||||||
| *Required access:* Cluster administrator | ||||||||||
|
|
||||||||||
| .Prerequisites | ||||||||||
|
|
||||||||||
| - User workload monitoring is enabled on your managed cluster. | ||||||||||
| - User workload monitoring is enabled in the `MultiClusterObservability` custom resource. | ||||||||||
| - You have created `ScrapeConfig` resources with the `app.kubernetes.io/component: user-workload-metrics-collector` label. | ||||||||||
| - The `ScrapeConfig` resources are referenced in the configurations list of the `ClusterManagementAddOn` for the target placements. | ||||||||||
|
|
||||||||||
|
|
||||||||||
| .Procedure | ||||||||||
|
|
||||||||||
| Complete the following steps to federate user workloads from the Cluster Observability Operator on your managed clusters: | ||||||||||
|
|
||||||||||
| . Update your `ScrapeConfig` resource by adding the endpoint of the Cluster Observability Operator `MonitoringStack` resource. Your resource might resemble the following YAML file: | ||||||||||
|
|
||||||||||
| + | ||||||||||
| [source,yaml] | ||||||||||
| ---- | ||||||||||
| apiVersion: monitoring.coreos.com/v1alpha1 | ||||||||||
| kind: ScrapeConfig | ||||||||||
| spec: | ||||||||||
|   scrapeClass: "" | ||||||||||
|   scheme: HTTP | ||||||||||
|   staticConfigs: | ||||||||||
|   - targets: | ||||||||||
|     - my-monitoring-stack.my-monitoring-ns.svc:9090 | ||||||||||
| ---- | ||||||||||
|
|
||||||||||
| . If you use a proxy with the Prometheus server, modify the `ScrapeConfig` resource to include your TLS configuration. Create a corresponding `scrapeClass` specification on the user workload `PrometheusAgent` resource. Then reference the `PrometheusAgent` resource within the `ScrapeConfig` resource. | ||||||||||
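For the proxy case in the last step, one possible shape is a scrape class on the user workload `PrometheusAgent` resource that carries the TLS settings, which the `ScrapeConfig` resource then selects by name. This is a sketch, not the confirmed configuration: the agent name placeholder, the secret name, the scrape class name, and the CA file path are hypothetical, and the `scrapeClasses` and `tlsConfig` fields are assumed to be available in this API version as they are in the upstream Prometheus Operator API.

```yaml
# Assumed: the CA secret is mounted through the `secrets` list, following the
# /etc/prometheus/secrets/<secret-name>/<key> path pattern.
apiVersion: monitoring.rhobs/v1alpha1
kind: PrometheusAgent
metadata:
  name: <user-workload-prometheus-agent>             # placeholder for your agent name
  namespace: open-cluster-management-observability
spec:
  secrets:
  - my-monitoring-stack-ca                           # hypothetical secret name
  scrapeClasses:
  - name: my-monitoring-stack-tls                    # hypothetical scrape class name
    tlsConfig:
      caFile: /etc/prometheus/secrets/my-monitoring-stack-ca/ca.crt
---
apiVersion: monitoring.coreos.com/v1alpha1           # same apiVersion as the example in step 1
kind: ScrapeConfig
spec:
  scrapeClass: my-monitoring-stack-tls               # selects the scrape class defined on the agent
  scheme: HTTPS
  staticConfigs:
  - targets:
    - my-monitoring-stack.my-monitoring-ns.svc:9090
```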