This repository was archived by the owner on Apr 28, 2025. It is now read-only.

Commit 47c4c25

Merge pull request #350 from grafana/playbook-for-CortexProvisioningTooManyActiveSeries
Add playbook for CortexProvisioningTooManyActiveSeries
2 parents fe7bd13 + efd72f1 commit 47c4c25

2 files changed: 16 additions and 7 deletions


cortex-mixin/alerts/alerts.libsonnet

Lines changed: 6 additions & 6 deletions
@@ -436,19 +436,19 @@
       },
       {
         alert: 'CortexProvisioningTooManyActiveSeries',
-        // 1.5 million active series per ingester max.
+        // We target each ingester to 1.5M in-memory series. This alert fires if the average
+        // number of series / ingester in a Cortex cluster is > 1.6M for 2h (we compact
+        // the TSDB head every 2h).
         expr: |||
           avg by (%s) (cortex_ingester_memory_series) > 1.6e6
-            and
-          sum by (%s) (rate(cortex_ingester_received_chunks[1h])) == 0
-        ||| % [$._config.alert_aggregation_labels, $._config.alert_aggregation_labels],
-        'for': '1h',
+        ||| % [$._config.alert_aggregation_labels],
+        'for': '2h',
         labels: {
           severity: 'warning',
         },
         annotations: {
           message: |||
-            Too many active series for ingesters, add more ingesters.
+            The number of in-memory series per ingester in {{ $labels.namespace }} is too high.
           |||,
         },
       },
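As a rough illustration of the new condition, the sketch below mirrors `avg by (...) (cortex_ingester_memory_series) > 1.6e6` for a single cluster. The function name and the sample values are hypothetical, and the `for: 2h` clause is not modelled: Prometheus itself requires the condition to hold continuously for that duration before the alert fires.

```python
ALERT_THRESHOLD = 1.6e6  # threshold taken from the alert expression


def condition_holds(series_per_ingester):
    """Return True when the average in-memory series per ingester exceeds the threshold.

    `series_per_ingester` is a hypothetical list holding one
    cortex_ingester_memory_series sample per ingester in a single cluster.
    """
    average = sum(series_per_ingester) / len(series_per_ingester)
    return average > ALERT_THRESHOLD


# Hypothetical samples: the first cluster averages about 1.63M series per ingester.
print(condition_holds([1.9e6, 1.5e6, 1.5e6]))
# The second cluster averages exactly 1.6M, which is not strictly above the threshold.
print(condition_holds([2.0e6, 1.4e6, 1.4e6]))
```

Because the comparison is on the average, a single overloaded ingester does not satisfy the condition unless it pulls the cluster-wide mean above 1.6M.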

cortex-mixin/docs/playbooks.md

Lines changed: 10 additions & 1 deletion
@@ -498,7 +498,16 @@ _This alert applies to Cortex chunks storage only._
 
 ### CortexProvisioningTooManyActiveSeries
 
-_TODO: this playbook has not been written yet._
+This alert fires if the average number of in-memory series per ingester is above our target (1.5M).
+
+How to **fix**:
+- Scale up ingesters
+  - To find out the Cortex clusters where ingesters should be scaled up and how many minimum replicas are expected:
+    ```
+    ceil(sum by(cluster, namespace) (cortex_ingester_memory_series) / 1.5e6) >
+    count by(cluster, namespace) (cortex_ingester_memory_series)
+    ```
+- After the scale up, the in-memory series are expected to be reduced at the next TSDB head compaction (occurring every 2h)
 
 ### CortexProvisioningTooManyWrites
 
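The arithmetic behind the playbook query can be sketched in Python. This is only an illustration under assumptions: the function names and the sample cluster are hypothetical, and the 1.5M target is the figure quoted in the playbook text.

```python
import math

TARGET_SERIES_PER_INGESTER = 1.5e6  # per-ingester target quoted in the playbook


def min_ingesters(series_per_ingester):
    """Minimum replica count that keeps the average at or below the target.

    Mirrors ceil(sum(cortex_ingester_memory_series) / 1.5e6) for one cluster.
    """
    return math.ceil(sum(series_per_ingester) / TARGET_SERIES_PER_INGESTER)


def needs_scale_up(series_per_ingester):
    """True when the minimum replica count exceeds the current ingester count."""
    return min_ingesters(series_per_ingester) > len(series_per_ingester)


# Hypothetical cluster: 10 ingesters, each holding 1.7M in-memory series.
current = [1.7e6] * 10
print(min_ingesters(current))   # 17M / 1.5M = 11.33..., rounded up to 12
print(needs_scale_up(current))  # True, since 12 > 10
```

In this example the cluster would need at least 12 ingesters, which matches the value the PromQL expression returns for clusters that appear in its result.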
