docs/guides/all/heal-unhealthy-k8s-pods.md
12 additions & 14 deletions
@@ -1107,20 +1107,6 @@ Next, we will create an AI agent that analyzes pod health issues and creates com
   "properties": {
     "description": "AI agent specialized in diagnosing and automatically fixing unhealthy Kubernetes workloads. Monitors pod health, identifies root causes, and applies appropriate fixes.",
     "status": "active",
-    "allowed_blueprints": [
-      "k8s_workload",
-      "k8s_pod",
-      "k8s_replicaSet",
-      "k8s_namespace",
-      "k8s_cluster"
-    ],
-    "allowed_actions": [
-      "get_k8s_pod_logs",
-      "restart_k8s_workload",
-      "scale_k8s_workload",
-      "update_k8s_workload_config",
-      "create_k8s_fix_pr"
-    ],
     "prompt": "You are a Kubernetes healing AI agent with access to comprehensive SDLC data and pod logs.\n\n**Your healing process:**\n1. **FIRST - Get Logs**: Always start by retrieving pod logs using the get_k8s_pod_logs action to understand what's actually happening\n2. **Analyze with Context**: Combine log data with SDLC information (recent deployments, code changes, configuration updates) to build a complete picture\n3. **Intelligent Diagnosis**: Based on logs and context, determine the root cause (crashes, resource constraints, configuration issues, etc.)\n4. **Targeted Fix**: Execute only the specific action that will resolve the issue:\n - Restart for crashes and transient issues\n - Scale for resource constraints (CPU/memory)\n - Update config for resource limit issues\n - Create PR for complex fixes requiring code changes\n5. **Explain Your Actions**: Always explain what you found in the logs and why you chose the specific fix in your response",
     "execution_mode": "Automatic",
     "conversation_starters": [
@@ -1130,13 +1116,25 @@ Next, we will create an AI agent that analyzes pod health issues and creates com
       "Scale up resources for this memory-constrained workload",
       "What's causing this ImagePullBackOff error?",
       "Why are my pods stuck in Pending state?"
+    ],
+    "tools": [
+      "^(list|get|search|track|describe)_.*",
+      "run_get_k8s_pod_logs",
+      "run_restart_k8s_workload",
+      "run_scale_k8s_workload",
+      "run_update_k8s_workload_config",
+      "run_create_k8s_fix_pr"
     ]
   },
   "relations": {}
 }
 ```
 </details>
 
+:::tip MCP Enhanced Capabilities
+The AI agent uses MCP (Model Context Protocol) enhanced capabilities to automatically discover important and relevant blueprint entities via its tools. The `^(list|get|search|track|describe)_.*` pattern allows the agent to access and analyze related entities across your entire software catalog, including Kubernetes resources, recent deployments, code changes, runbooks, and history logs. This gives the agent rich SDLC context to make intelligent healing decisions. Additionally, we explicitly add the remediation action tools (`run_get_k8s_pod_logs`, `run_restart_k8s_workload`, etc.), which the agent calls sequentially to diagnose and fix unhealthy workloads.
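
Note on the new `tools` array: the sketch below illustrates how the entries might select tools, under the assumption that each entry is treated as a regular expression that must match the whole tool name. The diff does not say how the platform actually evaluates these entries, so the matching semantics, the `is_tool_allowed` helper, and the sample tool names `list_entities`, `delete_entity`, and `run_delete_k8s_namespace` are hypothetical and used only for illustration.

```python
import re

# Entries copied verbatim from the "tools" array added in this diff.
TOOL_PATTERNS = [
    r"^(list|get|search|track|describe)_.*",
    "run_get_k8s_pod_logs",
    "run_restart_k8s_workload",
    "run_scale_k8s_workload",
    "run_update_k8s_workload_config",
    "run_create_k8s_fix_pr",
]


def is_tool_allowed(tool_name, patterns=TOOL_PATTERNS):
    """Return True if tool_name fully matches at least one pattern.

    Assumption: whole-name regex matching. The real platform may
    evaluate the entries differently.
    """
    return any(re.fullmatch(pattern, tool_name) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical tool names, chosen only to exercise the patterns.
    for name in [
        "list_entities",
        "run_restart_k8s_workload",
        "delete_entity",
        "run_delete_k8s_namespace",
    ]:
        print(name, is_tool_allowed(name))
```

Under this assumption, the single read-only pattern covers any tool whose name starts with `list_`, `get_`, `search_`, `track_`, or `describe_`, while the `run_`-prefixed remediation actions are not covered by it and have to be listed one by one, which is consistent with the five explicit entries in the diff.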