Commit 9fb0c40

welteki authored and alexellis committed
Add queue based scaling to autoscaling docs
Signed-off-by: Han Verstraete (OpenFaaS Ltd) <han@openfaas.com>
1 parent bb722d4 commit 9fb0c40

2 files changed, 30 additions and 7 deletions

docs/architecture/autoscaling.md

Lines changed: 28 additions & 5 deletions
@@ -121,7 +121,7 @@ mean per pod = 90 / 1 = 90
 * Capacity `capacity`
 
 Based upon inflight requests (or connections), ideal for: long-running functions or functions which can only handle a limited number of requests at once.
-
+
 A hard limit can be enforced through the `max_inflight` environment variable on the function, so the caller will need to retry the request some of the time. The OpenFaaS Pro queue-worker does this automatically, see also: [Retries](/openfaas-pro/retries).
 
 * RPS `rps`
@@ -134,6 +134,10 @@ mean per pod = 90 / 1 = 90
 
 Based upon CPU usage of the function, this strategy is idea for CPU-bound workloads, or where Capacity and RPS are not giving the optimal scaling profile. The value configured here is in milli-CPU, so 1000 accounts for *1 CPU core*.
 
+* Queue-depth `queue`
+
+Based upon the number of async invocations that are queued for a function. This allows you to scale functions rapidly and proactively to the desired number of replicas to process the queue as quickly as possible. Ideal for functions that are only invoked asynchronously.
+
 * Scaling to zero
 
 [Scaling to zero](/openfaas-pro/scale-to-zero) is an opt-in feature on a per function basis. It can be used in combination with any scaling mode, including *Static scaling*
@@ -175,7 +179,7 @@ hey -t 10 -z 3m -c 5 -q 5 \
 http://127.0.0.1:8080/function/sleep
 ```
 
-To apply a hard limit, add `--env max_inflight=5` to the `faas-cli store deploy` command.
+To apply a hard limit, add `--env max_inflight=5` to the `faas-cli store deploy` command.
 
 What if you need to limit a function to processing only one request at a time?
 
@@ -260,6 +264,26 @@ hey -m POST -d data -z 3m -c 5 -q 10 \
 
 Note that `com.openfaas.scale.zero=false` is a default, so this is not strictly required.
 
+**4) Queue-depth based scaling**
+
+When the number of incoming async invocation increases, the queue depth grows. By scaling functions based on this metric, you can proactively add more replicas to process messages faster.
+
+```bash
+faas-cli store deploy sleep \
+--label com.openfaas.scale.max=10 \
+--label com.openfaas.scale.target=10 \
+--label com.openfaas.scale.type=queue \
+--label com.openfaas.scale.target-proportion=1 \
+--env max_inflight=1
+
+hey -m POST -n 30 -c 30 \
+http://127.0.0.1:8080/async-function/sleep
+```
+
+This sleep function takes 2 seconds to complete, and has a *hard limit* on the number of invocations of 1 concurrent request.
+
+With the above scaling configuration, if 30 messages are submitted to the queue via async invocations, the sleep function will scale to 3 replicas immediately.
+
 ## Smoothing out scaling down with a stable window
 
 The `com.openfaas.scale.down.window` label can be set with a Go duration up to a maximum of `5m` or `300s`. When set, the autoscaler will record recommendations on each cycle, and only scale down a function to the highest recorded recommendation of replicas.
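
Reviewer note on the queue-depth example added in this hunk: the claim that 30 queued messages with `com.openfaas.scale.target=10` and `com.openfaas.scale.target-proportion=1` yield 3 replicas can be sanity-checked with a little arithmetic. The sketch below is only an approximation inferred from that example, not the autoscaler's actual code. The function name `desired_replicas`, the ceiling rounding, and the default minimum of 1 replica are all assumptions made for illustration.

```python
import math


def desired_replicas(queue_depth: int, target: float,
                     target_proportion: float = 1.0,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Hypothetical helper: estimate the replica count for a queue depth.

    Assumed formula: ceil(queue_depth / (target * target_proportion)),
    clamped to [min_replicas, max_replicas].
    """
    if target <= 0 or target_proportion <= 0:
        raise ValueError("target and target_proportion must be positive")
    wanted = math.ceil(queue_depth / (target * target_proportion))
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # Values taken from the example in the diff: 30 messages, target of 10.
    print(desired_replicas(30, 10))
    # A much deeper queue is capped by com.openfaas.scale.max=10.
    print(desired_replicas(250, 10))
```

Under these assumptions, 30 messages map to 3 replicas, matching the text in the diff, and a queue of 250 messages would be capped at the configured maximum of 10.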
@@ -306,7 +330,7 @@ Scaling functions to zero replicas can improve efficiency and reduce costs in yo
 
 1. **Cost Savings**: By scaling down to zero when idle, you can reduce the number of nodes required in your cluster, leading to lower infrastructure costs with fewer, or smaller nodes required.
 2. **Resource Efficiency**: Scaling down to zero helps to free up resources in your cluster, this also helps with on-premises clusters where the amount of nodes may be fixed.
-3. **Security**: By scaling functions down, the attack surface is also reduced to only active functions.
+3. **Security**: By scaling functions down, the attack surface is also reduced to only active functions.
 
 ### Scaling down to zero replicas
 
@@ -364,10 +388,9 @@ The minimum (initial) and maximum replica count can be set at deployment time by
 * `com.openfaas.scale.factor` by default this is set to `20%` and has to be a value between 0-100 (including borders)
 
 > Note:
-> Setting `com.openfaas.scale.min` and `com.openfaas.scale.max` to the same value, allows to disable the auto-scaling functionality of openfaas.
+> Setting `com.openfaas.scale.min` and `com.openfaas.scale.max` to the same value, allows to disable the auto-scaling functionality of openfaas.
 > Setting `com.openfaas.scale.factor=0` also allows to disable the auto-scaling functionality of openfaas.
 
 For each alert fired the auto-scaler will add a number of replicas, which is a defined percentage of the max replicas. This percentage can be set using `com.openfaas.scale.factor`. For example setting `com.openfaas.scale.factor=100` will instantly scale to max replicas. This label enables to define the overall scaling behavior of the function.
 
 > Note: Active alerts can be viewed in the "Alerts" tab of Prometheus which is deployed with OpenFaaS.
-
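
Reviewer note on the `com.openfaas.scale.factor` paragraph in the hunk above: the step-wise behaviour it describes (each fired alert adds a percentage of the maximum replicas) can be illustrated with a short sketch. This is a minimal model of the description, not the gateway's implementation. The function name `next_replicas`, the round-up of the step, and the sample values are assumptions.

```python
def next_replicas(current: int, max_replicas: int, scale_factor: int) -> int:
    """Hypothetical helper: replica count after one fired alert.

    Assumes the step is scale_factor percent of max_replicas, rounded up,
    and that the result never exceeds max_replicas.
    """
    if not 0 <= scale_factor <= 100:
        raise ValueError("scale_factor must be between 0 and 100")
    # Integer ceiling of max_replicas * scale_factor / 100.
    step = (max_replicas * scale_factor + 99) // 100
    return min(max_replicas, current + step)


if __name__ == "__main__":
    # Illustrative values only: 1 running replica and a maximum of 20.
    for factor in (20, 100, 0):
        print(factor, next_replicas(1, 20, factor))
```

With these sample values, a factor of 20 adds 4 replicas per alert, a factor of 100 jumps straight to the maximum, and a factor of 0 leaves the count unchanged, which is consistent with the note that `com.openfaas.scale.factor=0` disables auto-scaling.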
docs/openfaas-pro/comparison.md

Lines changed: 2 additions & 2 deletions
@@ -54,10 +54,10 @@ Did you know? OpenFaaS Pro's autoscaling engine can scale many different types o
 | Maximum replicas per function | 5 | 1 | No limit applied | as per Standard |
 | Scale from Zero | Not available | Supported | Supported, with additional checks for Istio | as per Standard |
 | Zero downtime updates | Not available | Not available | Supported with readiness probes and rolling updates | as per Standard |
-| Autoscaling strategy | RPS | Not applicable | [CPU utilization, Capacity (inflight requests), RPS and Custom](/architecture/autoscaling) | as per Standard |
+| Autoscaling strategy | RPS | Not applicable | [CPU utilization, Capacity (inflight requests), RPS, async queue-depth and Custom (e.g. Memory)](/architecture/autoscaling) | as per Standard |
 | Autoscaling granularity | One global rule | Not applicable | Configurable per function | as per Standard |
 
-Data-driven, intensive, or long running functions are best suited to capacity-based autoscaling, which is only available in OpenFaaS Pro.
+Data-driven, intensive, or long running functions are best suited to capacity-based or queue-based autoscaling, which is only available in OpenFaaS Pro.
 
 Scaling to zero is also a commercial feature, which can be opted into on a per function basis, with a custom idle threshold.
 