This repository was archived by the owner on Apr 28, 2025. It is now read-only.
2 files changed, +16 -7 lines changed
   },
   {
     alert: 'CortexProvisioningTooManyActiveSeries',
-    // 1.5 million active series per ingester max.
+    // We target each ingester to 1.5M in-memory series. This alert fires if the average
+    // number of series / ingester in a Cortex cluster is > 1.6M for 2h (we compact
+    // the TSDB head every 2h).
     expr: |||
       avg by (%s) (cortex_ingester_memory_series) > 1.6e6
-      and
-      sum by (%s) (rate(cortex_ingester_received_chunks[1h])) == 0
-    ||| % [$._config.alert_aggregation_labels, $._config.alert_aggregation_labels],
-    'for': '1h',
+    ||| % [$._config.alert_aggregation_labels],
+    'for': '2h',
     labels: {
       severity: 'warning',
     },
     annotations: {
       message: |||
-        Too many active series for ingesters, add more ingesters.
+        The number of in-memory series per ingester in {{ $labels.namespace }} is too high.
       |||,
     },
   },
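The new rule can be read as "the per-cluster average of `cortex_ingester_memory_series` stays above 1.6M for a full 2h". The Python sketch below is only a simplified model of that behaviour, not Prometheus code: `would_fire` and its sample format are hypothetical names introduced for illustration, and it assumes one sample per rule evaluation. It mirrors the usual `for` semantics, where the pending timer restarts as soon as one evaluation no longer satisfies the expression.

```python
THRESHOLD = 1.6e6           # threshold from the alert expression
FOR_SECONDS = 2 * 60 * 60   # 'for': '2h'


def would_fire(samples, threshold=THRESHOLD, for_seconds=FOR_SECONDS):
    """Return True if the modelled alert would fire at some evaluation.

    samples: list of (unix_timestamp, [series count per ingester]) tuples,
    sorted by time, one tuple per rule evaluation (hypothetical format).
    """
    pending_since = None
    for ts, per_ingester in samples:
        if not per_ingester:
            # No series means no result for the expression: the timer resets.
            pending_since = None
            continue
        average = sum(per_ingester) / len(per_ingester)
        if average > threshold:
            if pending_since is None:
                pending_since = ts
            if ts - pending_since >= for_seconds:
                return True
        else:
            # The condition stopped holding, so the timer starts over.
            pending_since = None
    return False
```

Under this model, a cluster that dips below the threshold even once (for example right after a TSDB head compaction) has to stay above it for another full 2h before the alert fires.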
@@ -498,7 +498,16 @@ _This alert applies to Cortex chunks storage only._

 ### CortexProvisioningTooManyActiveSeries

-_TODO: this playbook has not been written yet._
+This alert fires if the average number of in-memory series per ingester is above our target (1.5M).
+
+How to **fix**:
+- Scale up ingesters
+  - To find out the Cortex clusters where ingesters should be scaled up and how many minimum replicas are expected:
+    ```
+    ceil(sum by(cluster, namespace) (cortex_ingester_memory_series) / 1.5e6) >
+    count by(cluster, namespace) (cortex_ingester_memory_series)
+    ```
+- After the scale up, the in-memory series are expected to be reduced at the next TSDB head compaction (occurring every 2h)

 ### CortexProvisioningTooManyWrites

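The arithmetic behind the playbook query can be sketched in Python as a worked example. This is an illustration only: `min_replicas` and `needs_scale_up` are hypothetical helper names, and the input list stands in for the per-ingester values of `cortex_ingester_memory_series` within one cluster and namespace.

```python
import math

TARGET_SERIES_PER_INGESTER = 1.5e6  # per-ingester target stated in the playbook


def min_replicas(per_ingester_series, target=TARGET_SERIES_PER_INGESTER):
    """Smallest ingester count that keeps the average at or below the target."""
    return math.ceil(sum(per_ingester_series) / target)


def needs_scale_up(per_ingester_series, target=TARGET_SERIES_PER_INGESTER):
    """Mirror the query: expected minimum replicas > current ingester count."""
    return min_replicas(per_ingester_series, target) > len(per_ingester_series)


# Ten ingesters holding 1.7M series each: 17M / 1.5M = 11.33, rounded up to 12.
current = [1.7e6] * 10
print(min_replicas(current), needs_scale_up(current))
```

With these assumed numbers the cluster would need at least 12 ingesters. As the playbook notes, the per-ingester figure is only expected to drop at the next TSDB head compaction after the scale up.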