This repository was archived by the owner on Apr 28, 2025. It is now read-only.

Commit 464fcc8

Merge pull request #348 from grafana/improve-CortexIngesterReachingSeriesLimit-playbook
Improve CortexIngesterReachingSeriesLimit playbook
2 parents b266d7b + f40af5c commit 464fcc8

1 file changed: +4 -2 lines changed


cortex-mixin/docs/playbooks.md

Lines changed: 4 additions & 2 deletions
@@ -50,10 +50,12 @@ How the limit is **configured**:
 - The configured limit can be queried via `cortex_ingester_instance_limits{limit="max_series"}`
 
 How to **fix**:
+1. **Temporarily increase the limit**<br />
+If the actual number of series is very close or already hit the limit, or if you foresee the ingester will hit the limit before dropping the stale series as effect of the scale up, you should also temporarily increase the limit.
+1. **Check if shuffle-sharding shard size is correct**<br />
+When shuffle-sharding is enabled, we target to 100K series / tenant / ingester. You can run `avg by (user) (cortex_ingester_memory_series_created_total{namespace="<namespace>"} - cortex_ingester_memory_series_removed_total{namespace="<namespace>"}) > 100000` to find out tenants with > 100K series / ingester. You may want to increase the shard size for these tenants.
 1. **Scale up ingesters**<br />
 Scaling up ingesters will lower the number of series per ingester. However, the effect of this change will take up to 4h, because after the scale up we need to wait until all stale series are dropped from memory as the effect of TSDB head compaction, which could take up to 4h (with the default config, TSDB keeps in-memory series up to 3h old and it gets compacted every 2h).
-2. **Temporarily increase the limit**<br />
-If the actual number of series is very close or already hit the limit, or if you foresee the ingester will hit the limit before dropping the stale series as effect of the scale up, you should also temporarily increase the limit.
 
 ### CortexIngesterReachingTenantsLimit

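The shard-size check added in this diff can be read as a small computation: for each tenant, average `created_total - removed_total` across ingesters and keep the tenants above 100K. The following is a minimal Python sketch of that logic, assuming the two counters have already been fetched and flattened into tuples. The `samples` shape and the name `tenants_over_target` are illustrative assumptions, not part of Cortex or of this commit.

```python
from collections import defaultdict
from statistics import mean

# Target stated in the playbook: 100K series per tenant per ingester.
TARGET_SERIES_PER_TENANT_PER_INGESTER = 100_000


def tenants_over_target(samples, target=TARGET_SERIES_PER_TENANT_PER_INGESTER):
    """Return {tenant: average in-memory series per ingester} for tenants above target.

    `samples` is an iterable of (tenant, ingester, created_total, removed_total)
    tuples. This input shape is hypothetical; it stands in for the result of
    querying the two counters used in the playbook's PromQL expression.
    """
    per_tenant = defaultdict(list)
    for tenant, _ingester, created, removed in samples:
        # In-memory series on one ingester = series created minus series removed.
        per_tenant[tenant].append(created - removed)

    # Mirrors `avg by (user) (...) > 100000`: average per tenant, then filter.
    averages = {tenant: mean(values) for tenant, values in per_tenant.items()}
    return {tenant: avg for tenant, avg in averages.items() if avg > target}


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    example = [
        ("tenant-a", "ingester-1", 200_000, 50_000),
        ("tenant-a", "ingester-2", 180_000, 50_000),
        ("tenant-b", "ingester-1", 90_000, 10_000),
    ]
    print(tenants_over_target(example))
```

Tenants returned by this check are the candidates for a larger shuffle-sharding shard size. The comparison is strict, matching the `> 100000` in the query, so a tenant averaging exactly 100K series per ingester is not reported.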