Commit b92215c

Knative files and doc (scale gateway to 0) (#1390)
* Adding Knative docs and files

Signed-off-by: Arthur De Magalhaes <arthurdm@ca.ibm.com>

* Formatting

Signed-off-by: Arthur De Magalhaes <arthurdm@ca.ibm.com>

* Formatting

Signed-off-by: Arthur De Magalhaes <arthurdm@ca.ibm.com>

---------

Signed-off-by: Arthur De Magalhaes <arthurdm@ca.ibm.com>
1 parent 2821907 commit b92215c

4 files changed, +502 -0 lines changed
Lines changed: 284 additions & 0 deletions
@@ -0,0 +1,284 @@
# Knative Scale-to-Zero Setup for mcpgateway

## Overview
This document describes the Knative Serving configuration that enables scale-to-zero functionality for the mcpgateway application on Kubernetes clusters (including OpenShift).

## Prerequisites

- Kubernetes cluster (1.28+) or OpenShift (4.12+)
- Knative Serving installed ([installation guide](https://knative.dev/docs/install/))
- kubectl or oc CLI configured

## Components

### 1. PostgreSQL Configuration
**File:** [`postgres-config.yaml`](postgres-config.yaml)
**Namespace:** `mcp-gateway`

ConfigMap containing PostgreSQL connection settings. **Important:** Update these values before deploying:
- `POSTGRES_HOST`: PostgreSQL service hostname
- `POSTGRES_PORT`: PostgreSQL port (default: 5432)
- `POSTGRES_DB`: Database name
- `POSTGRES_USER`: Database username
- `POSTGRES_PASSWORD`: Database password (use Kubernetes Secrets in production)
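
As a quick sanity check of these five values, they can be combined into a standard PostgreSQL connection URI. This is a minimal sketch: the values are placeholders rather than the contents of `postgres-config.yaml`, `pg_uri` is a hypothetical helper, and how mcpgateway itself consumes the settings is not shown here.

```bash
#!/usr/bin/env bash
# Placeholder values (assumptions), not the real ConfigMap contents.
POSTGRES_HOST="postgres.mcp-gateway.svc.cluster.local"
POSTGRES_PORT="5432"
POSTGRES_DB="mcp"
POSTGRES_USER="mcp_user"
POSTGRES_PASSWORD="change-me"

# pg_uri: hypothetical helper that prints a connection URI of the form
# postgresql://user:password@host:port/dbname
pg_uri() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "$POSTGRES_USER" "$POSTGRES_PASSWORD" \
    "$POSTGRES_HOST" "$POSTGRES_PORT" "$POSTGRES_DB"
}

pg_uri
```

If `psql` is available on a machine that can reach the database, `psql "$(pg_uri)" -c 'SELECT 1'` confirms the settings before deploying. Passwords containing characters such as `@` or `/` would need percent-encoding first.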

### 2. KnativeServing Custom Resource
**File:** [`knative-serving.yaml`](knative-serving.yaml)
**Namespace:** `knative-serving`

This resource configures the Knative Serving platform with:
- **Scale-to-zero enabled**: Pods automatically scale down to 0 when idle
- **30-second grace period**: `scale-to-zero-grace-period` caps how long Knative waits for the scale-from-zero path to be in place before it removes the last pod
- **Control-plane replicas**: `high-availability.replicas` is set to 1, so each control plane component runs as a single replica (raise it for actual high availability)
- **Ingress configuration**: Commented out by default - configure based on your setup
- **Autoscaling parameters**:
  - Target concurrency: 100 requests per pod
  - Stable window: 60 seconds
  - Panic window: 6 seconds

**Note:** The ingress configuration is commented out. Uncomment and configure it based on your ingress controller (Kourier, Istio, or Contour). OpenShift users don't need to configure this, as it's handled automatically by the Serverless Operator.
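
To get a feel for how these parameters interact, the sketch below approximates the autoscaler's concurrency arithmetic: the per-pod target (100) is multiplied by the target percentage (70, from `container-concurrency-target-percentage` in `knative-serving.yaml`), and the observed concurrency is divided by the result and rounded up. `desired_pods` is a hypothetical helper; the real autoscaler also averages over the stable and panic windows and limits scaling rates, so treat this as a rough estimate only.

```bash
#!/usr/bin/env bash
# desired_pods OBSERVED [TARGET] [UTILIZATION_PCT] [MAX_SCALE]
# Hypothetical helper: a simplified model, not the Knative implementation.
desired_pods() {
  local observed=$1 target=${2:-100} pct=${3:-70} max=${4:-1}

  # Effective per-pod target, e.g. 100 * 70 / 100 = 70 concurrent requests.
  local effective=$(( target * pct / 100 ))
  if (( effective < 1 )); then
    effective=1
  fi

  # Ceiling division: pods needed to keep each pod at or below the target.
  local pods=$(( (observed + effective - 1) / effective ))

  # Clamp to the configured maximum scale.
  if (( pods > max )); then
    pods=$max
  fi
  echo "$pods"
}

desired_pods 0            # no traffic: scale to zero
desired_pods 150 100 70 10  # 150 in-flight requests, max scale 10
desired_pods 150          # same load, but capped by max scale 1
```

With a maximum scale of 1, any load above the effective target queues on the single pod instead of adding replicas.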

### 3. Knative Service for mcpgateway
**File:** [`mcpgateway-knative-service.yaml`](mcpgateway-knative-service.yaml)
**Namespace:** `mcp-gateway`

This replaces the traditional Deployment with a Knative Service that includes:
- **Min scale: 0** - Allows scaling to zero pods
- **Max scale: 1** - Maximum of 1 pod under load (adjust as needed)
- **Container concurrency: 100** - Up to 100 concurrent requests per pod
- **Scale-to-zero retention: 30s** - Keeps the last pod for at least 30 seconds after the autoscaler decides to scale to zero
- **Health checks**: Readiness and liveness probes for proper traffic routing
- **Database config**: References `postgres-config` ConfigMap for connection settings

## Deployment Steps

### 1. Install Knative Serving and Ingress Controller

**For vanilla Kubernetes:**
```bash
# Install Knative Serving
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.12.0/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.12.0/serving-core.yaml

# Install Kourier (recommended lightweight ingress)
kubectl apply -f https://github.com/knative/net-kourier/releases/download/knative-v1.12.0/kourier.yaml

# Configure Knative to use Kourier
kubectl patch configmap/config-network \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"ingress-class":"kourier.ingress.networking.knative.dev"}}'
```

**For OpenShift:**
```bash
# Install OpenShift Serverless Operator from OperatorHub
# Then create KnativeServing instance (ingress is auto-configured)
```

### 2. Create namespace
```bash
kubectl create namespace mcp-gateway
```

### 3. Deploy PostgreSQL configuration
```bash
# Edit postgres-config.yaml with your database credentials first!
kubectl apply -f postgres-config.yaml
```

**Security Note:** For production, use a Kubernetes Secret instead of a ConfigMap:
```bash
kubectl create secret generic postgres-credentials \
  --from-literal=POSTGRES_PASSWORD=your-secure-password \
  -n mcp-gateway
```

Then update the Knative Service to reference the Secret instead of the ConfigMap.

### 4. Deploy Knative Serving configuration (optional)
```bash
# This step is optional - only needed if you want to customize
# autoscaling parameters beyond defaults.
# KnativeServing is an operator-managed resource: applying it requires the
# Knative Operator (or OpenShift Serverless), not just the YAML install from step 1.
kubectl apply -f knative-serving.yaml
```

**Note:** For vanilla Kubernetes, you may need to uncomment and configure the `ingress-class` setting (line 48 of [`knative-serving.yaml`](knative-serving.yaml)) to match your installed ingress controller.

### 5. Deploy the mcpgateway service
```bash
kubectl apply -f mcpgateway-knative-service.yaml
```

### 6. Verify deployment
```bash
# Check service status
kubectl get ksvc mcpgateway -n mcp-gateway

# Check revisions
kubectl get revision -n mcp-gateway

# Expected output when idle (scale-to-zero active):
# NAME               CONFIG NAME   GENERATION   READY   ACTUAL REPLICAS   DESIRED REPLICAS
# mcpgateway-00001   mcpgateway    1            True    0                 0
```
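
In scripts or CI, polling the status by hand can be replaced with `kubectl wait`, which blocks until the Knative Service reports its `Ready` condition or the timeout expires. A minimal sketch follows; `wait_for_ksvc` is a hypothetical wrapper and the 120-second default is an assumption, not a value from this setup.

```bash
#!/usr/bin/env bash
# wait_for_ksvc NAME NAMESPACE [TIMEOUT]
# Hypothetical wrapper: returns non-zero if the service is not Ready in time.
wait_for_ksvc() {
  local name=$1 namespace=$2 timeout=${3:-120s}
  kubectl wait "ksvc/${name}" \
    --namespace "$namespace" \
    --for=condition=Ready \
    --timeout="$timeout"
}

# Usage:
#   wait_for_ksvc mcpgateway mcp-gateway 180s && echo "mcpgateway is ready"
```

A Knative Service can be `Ready` while scaled to zero, so this check confirms the revision was rolled out, not that a pod is currently running.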

## Checking Status

```bash
# For OpenShift:
$ oc get ksvc mcpgateway -n mcp-gateway

# For vanilla Kubernetes:
$ kubectl get ksvc mcpgateway -n mcp-gateway

# Check revisions:
$ kubectl get revision -n mcp-gateway
NAME               CONFIG NAME   GENERATION   READY   ACTUAL REPLICAS   DESIRED REPLICAS
mcpgateway-00001   mcpgateway    1            True    0                 0
```

**Scale-to-zero is active**: The service shows 0 actual and 0 desired replicas when idle.

## How It Works

1. **Idle State**: When no traffic arrives for the stable window, Knative decides to scale to zero and removes the last pod once the retention period has passed
2. **Cold Start**: When a request arrives, Knative automatically spins up a pod
3. **Active State**: Pods handle requests and scale based on concurrency
4. **Scale Down**: After traffic stops, the autoscaler decides to scale to zero and the last pod is removed once the 30-second retention period has passed

## Accessing the Service

The service is accessible via the Knative-managed route. The exact URL depends on your cluster's domain configuration:
- **OpenShift**: `https://mcpgateway-mcp-gateway.apps.<cluster-domain>`
- **Vanilla Kubernetes**: Depends on your ingress configuration and domain setup

When you make a request:
1. If scaled to zero, there will be a brief cold-start delay (typically 5-15 seconds)
2. The pod will start and handle the request
3. Subsequent requests will be fast while the pod is running
4. Once the pod has been idle long enough for the autoscaler to scale to zero (at least the 30-second retention period), it will terminate
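
The cold-start delay described above can be measured rather than estimated. The sketch below reads the route URL from the Knative Service status and times a request with curl's built-in timer. `ksvc_url` and `time_request` are hypothetical helpers, and requesting the base URL is an assumption; substitute whichever path your gateway serves.

```bash
#!/usr/bin/env bash
# ksvc_url NAME NAMESPACE: print the URL Knative assigned to the service.
ksvc_url() {
  kubectl get ksvc "$1" --namespace "$2" -o jsonpath='{.status.url}'
}

# time_request URL: print the total request time in seconds, as measured by curl.
time_request() {
  curl --silent --show-error --output /dev/null \
    --write-out '%{time_total}\n' "$1"
}

# Usage (wait until the pod count is 0 before the first call):
#   url=$(ksvc_url mcpgateway mcp-gateway)
#   echo "cold: $(time_request "$url")s"
#   echo "warm: $(time_request "$url")s"
```

The first request includes pod scheduling, image start-up, and readiness probing; the second shows the steady-state latency while the pod is still running.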

## Monitoring Scale-to-Zero

### Check current pod count:
```bash
kubectl get pods -n mcp-gateway -l serving.knative.dev/service=mcpgateway
```

### Watch pods scale up/down:
```bash
kubectl get pods -n mcp-gateway -l serving.knative.dev/service=mcpgateway -w
```
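
When you want to see how long pods actually linger after the last request, a timestamped pod count is easier to read than the watch output. This is a rough sketch: `pod_count` and `log_pod_count` are hypothetical helpers, the 5-second interval is an assumption, and the count includes pods that are still terminating.

```bash
#!/usr/bin/env bash
# pod_count NAMESPACE SERVICE: number of pods backing the Knative Service.
# kubectl prints nothing on stdout when no pods match, so the count is 0.
pod_count() {
  local namespace=$1 service=$2
  kubectl get pods --namespace "$namespace" \
    --selector "serving.knative.dev/service=${service}" \
    --no-headers 2>/dev/null | wc -l | tr -d '[:space:]'
}

# log_pod_count NAMESPACE SERVICE [INTERVAL_SECONDS]: print a timestamped
# count until interrupted with Ctrl-C.
log_pod_count() {
  local namespace=$1 service=$2 interval=${3:-5}
  while true; do
    printf '%s pods=%s\n' "$(date +%H:%M:%S)" "$(pod_count "$namespace" "$service")"
    sleep "$interval"
  done
}

# Usage:
#   log_pod_count mcp-gateway mcpgateway 5
```

Sending one request and then watching the log shows the gap between the last request and the count returning to 0.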

### Check revision status:
```bash
kubectl get revision -n mcp-gateway
```

### View Knative Service details:
```bash
kubectl describe ksvc mcpgateway -n mcp-gateway
```

**Note:** OpenShift users can use `oc` instead of `kubectl` for all commands.

## Configuration Parameters

Key autoscaling annotations in the Knative Service:

| Annotation | Value | Description |
|------------|-------|-------------|
| `autoscaling.knative.dev/min-scale` | `0` | Minimum pods (enables scale-to-zero) |
| `autoscaling.knative.dev/max-scale` | `10` | Maximum pods under load |
| `autoscaling.knative.dev/target` | `100` | Target concurrent requests per pod |
| `autoscaling.knative.dev/scale-to-zero-pod-retention-period` | `30s` | Minimum time to keep the last pod after the scale-to-zero decision |
| `autoscaling.knative.dev/metric` | `concurrency` | Metric used for scaling decisions |
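
These annotations belong on the revision template (`spec.template.metadata.annotations`), not on the Service's top-level metadata. The sketch below writes an illustrative manifest to a temporary file to show the placement; it is not the contents of `mcpgateway-knative-service.yaml`. The image name is hypothetical, the annotation values are copied from the table above, and nothing is applied to the cluster.

```bash
#!/usr/bin/env bash
# Write an illustrative manifest to a temporary file; nothing is applied.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: mcpgateway
  namespace: mcp-gateway
spec:
  template:
    metadata:
      annotations:
        # Values copied from the table above.
        autoscaling.knative.dev/min-scale: "0"
        autoscaling.knative.dev/max-scale: "10"
        autoscaling.knative.dev/target: "100"
        autoscaling.knative.dev/scale-to-zero-pod-retention-period: "30s"
        autoscaling.knative.dev/metric: "concurrency"
    spec:
      containers:
        - image: registry.example.com/mcpgateway:latest  # hypothetical image
          envFrom:
            - configMapRef:
                name: postgres-config
EOF

echo "Manifest sketch written to ${manifest}"
```

On a cluster with Knative installed, `kubectl apply --dry-run=server -f "$manifest"` asks the API server to validate the manifest without persisting it.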

## Troubleshooting

### Service not scaling to zero
```bash
# Check if there's active traffic
kubectl get podautoscaler -n mcp-gateway

# Check Knative autoscaler logs
kubectl logs -n knative-serving -l app=autoscaler
```

### Cold start taking too long
```bash
# Check pod startup time
kubectl get pods -n mcp-gateway -l serving.knative.dev/service=mcpgateway -w

# Review readiness probe configuration
kubectl describe ksvc mcpgateway -n mcp-gateway
```

### Service not ready
```bash
# Check Knative Service status
kubectl get ksvc mcpgateway -n mcp-gateway -o yaml

# Check revision status
kubectl describe revision -n mcp-gateway
```

## Reverting to Standard Deployment

If you need to revert to a standard Kubernetes Deployment:

1. Delete the Knative Service:
   ```bash
   kubectl delete ksvc mcpgateway -n mcp-gateway
   ```

2. Recreate the original Deployment and Service from your backup or version control

## Platform-Specific Notes

### Vanilla Kubernetes
- **Must install Knative Serving and an ingress controller** (Kourier recommended): [Installation Guide](https://knative.dev/docs/install/)
- Configure DNS or use Magic DNS (sslip.io or nip.io) for local development
- Uncomment and set `ingress-class` (line 48 of [`knative-serving.yaml`](knative-serving.yaml)) to match your ingress controller
- Supported ingress controllers:
  - **Kourier** (recommended): Lightweight, Knative-specific
  - **Istio**: Full service mesh with advanced features
  - **Contour**: Envoy-based, good balance of features and performance

### OpenShift
- **Install OpenShift Serverless Operator** from OperatorHub (includes Knative + Kourier)
- Ingress is automatically configured - no need to modify [`knative-serving.yaml`](knative-serving.yaml)
- OpenShift Routes are automatically created and managed
- Can use `oc` instead of `kubectl` for all commands
- No separate ingress controller installation needed

## Security Best Practices

1. **Use Secrets for sensitive data:**
   ```bash
   kubectl create secret generic postgres-credentials \
     --from-literal=POSTGRES_PASSWORD=secure-password \
     -n mcp-gateway
   ```

2. **Update the Knative Service to use Secrets:**
   ```yaml
   - name: POSTGRES_PASSWORD
     valueFrom:
       secretKeyRef:
         name: postgres-credentials
         key: POSTGRES_PASSWORD
   ```

3. **Use network policies to restrict database access**
4. **Enable TLS for the Knative Service route**
5. **Regularly rotate credentials**

## Additional Resources

- [Knative Serving Documentation](https://knative.dev/docs/serving/)
- [Knative Autoscaling](https://knative.dev/docs/serving/autoscaling/)
- [Knative Installation Guide](https://knative.dev/docs/install/)
- [OpenShift Serverless Documentation](https://docs.openshift.com/serverless/)
- [Kubernetes Secrets](https://kubernetes.io/docs/concepts/configuration/secret/)
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  # High availability configuration
  high-availability:
    replicas: 1

  # Ingress configuration
  # ingress:
  #   kourier:
  #     enabled: true

  # Configuration for scale-to-zero
  config:
    autoscaler:
      # Enable scale to zero
      enable-scale-to-zero: "true"
      # Time window for stable mode (default: 60s)
      stable-window: "60s"
      # Time window for panic mode (default: 6s)
      panic-window: "6s"
      # Target concurrency per pod (default: 100)
      container-concurrency-target-default: "100"
      # Percentage of target to maintain (default: 70)
      container-concurrency-target-percentage: "70"
      # Upper bound on waiting for the scale-from-zero path before the last pod is removed (default: 30s)
      scale-to-zero-grace-period: "30s"
      # Pod retention time after scale to zero decision (default: 0s)
      scale-to-zero-pod-retention-period: "0s"

    deployment:
      # Progress deadline for deployments
      progress-deadline: "600s"
      # QPS settings for Kubernetes API
      qps-burst: "200"
      qps: "100"

    network:
      # Ingress class - configure based on your ingress controller:
      # - Kourier: "kourier.ingress.networking.knative.dev"
      # - Istio: "istio.ingress.networking.knative.dev"
      # - Contour: "contour.ingress.networking.knative.dev"
      # OpenShift: Automatically configured by Serverless Operator
      # Comment out or remove if using default ingress
      # ingress-class: "kourier.ingress.networking.knative.dev"

      # Domain template for routes
      domain-template: "{{.Name}}-{{.Namespace}}.{{.Domain}}"
      # Enable HTTP2
      enable-http2: "true"
      # Autocreate cluster domain claims
      autocreate-cluster-domain-claims: "true"

    observability:
      # Enable request logging
      logging.enable-request-log: "true"
      # Request log template
      logging.request-log-template: >-
        {"httpRequest": {"requestMethod": "{{.Request.Method}}",
        "requestUrl": "{{js .Request.RequestURI}}",
        "requestSize": "{{.Request.ContentLength}}",
        "status": {{.Response.Code}},
        "responseSize": "{{.Response.Size}}",
        "userAgent": "{{js .Request.UserAgent}}",
        "remoteIp": "{{js .Request.RemoteAddr}}",
        "serverIp": "{{.Revision.PodIP}}",
        "referer": "{{js .Request.Referer}}",
        "latency": "{{.Response.Latency}}s",
        "protocol": "{{.Request.Proto}}"},
        "traceId": "{{index .Request.Header "X-B3-Traceid"}}"}
      # Metrics backend
      metrics.backend-destination: "prometheus"
      # Enable profiling
      profiling.enable: "false"