✨ Multi cluster controller sharding #74
Conversation
Co-authored-by: Nelo-T. Wallus <10514301+ntnn@users.noreply.github.com>
Co-authored-by: Marvin Beckers <mail@embik.me>
feat: controller sharding
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: zachsmith1. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
Welcome @zachsmith1!
This looks amazing 🚀 With this implementation, is it possible to shard individual clusters too, or is it per cluster?
This is just for clusters, but I think there is another community project we could leverage for the intra-cluster sharding.
@sttts I fixed a startup ordering bug: providers were starting a remote cluster’s cache before registering controller watches and before publishing the cluster to the provider map. That meant the informer’s initial list/sync happened with no handlers attached (so initial “add” events were missed), and early reconciles could fail because mgr.GetCluster(req.ClusterName) wasn’t resolvable yet. We now publish the cluster first (so GetCluster works), then Engage (which registers watches), and only then start the cache and wait for sync. This mirrors controller-runtime’s contract (start event sources before cache sync), ensuring initial add events reach handlers and “existing objects” are reconciled immediately—making the test "runs the reconciler for existing objects" pass.
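For readers following along, a minimal sketch of that ordering, assuming controller-runtime's cluster.Cluster; the local `aware` interface, `engageThenStart`, and the error handling are illustrative, not the provider code in this PR, and publishing the cluster to the provider map is assumed to happen in the caller.

```go
package provider

import (
	"context"
	"fmt"

	"sigs.k8s.io/controller-runtime/pkg/cluster"
)

// aware is the subset of the multicluster "Aware" contract this sketch needs;
// in the real code this is the engaged multicluster manager.
type aware interface {
	Engage(ctx context.Context, clusterName string, cl cluster.Cluster) error
}

// engageThenStart sketches the ordering described above: the caller has already
// published the cluster (so GetCluster resolves), then we Engage so controller
// watches are registered, and only then start the cache and wait for it to sync.
func engageThenStart(ctx context.Context, name string, cl cluster.Cluster, a aware) error {
	// Engage first: event handlers get attached to the cluster's informers
	// before the initial list, so initial "add" events are not missed.
	if err := a.Engage(ctx, name, cl); err != nil {
		return err
	}

	// Start the cache only after handlers are in place, then block until it
	// has synced so existing objects are reconciled immediately.
	go func() {
		// Error handling/logging elided in this sketch.
		_ = cl.GetCache().Start(ctx)
	}()
	if !cl.GetCache().WaitForCacheSync(ctx) {
		return fmt.Errorf("cache for cluster %q did not sync", name)
	}
	return nil
}
```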
mjudeikis left a comment:
Just a few comments from last week's debugging session. I haven't looked at the other code yet.
```go
// Engage before starting cache so handlers are registered for initial adds.
if aware != nil {
	if err := aware.Engage(ctx, clusterName, cl); err != nil {
```
Will this not start the provider via the runnable and hence trigger the reconciler? So basically triggering the reconciler with informers that have not started? When adding the 2nd or 3rd cluster, you will trigger the reconciler before the cache is started?
We had these issues last week with @ntnn. How are you testing this? Meaning, which providers are you running so that it works?
In general this flow is a bit different for the 1st cluster, where the reconcile loop might not be running yet, and for the second, where the reconciler is already there. So once you engage the cluster it already pops up in the queue. And if you have non-shared informers and caches (not started before), this might fail?
I agree. Swapping this around feels wrong. I originally had it the other way around in clusters.Clusters with a similar intent of "first register handlers and reconcilers, then start" - but given that this applies to all Awares, it can lead to opaque errors. In our case the opaque error was that the mcmanager was waiting on the cache sync when engaging the cluster.
My guess is that a similar thing is happening here now, and swapping this around in the providers is just masking the actual error.
With the sync engine taking charge of the runnables, maybe this is a problem of the providers being part of the runnables?
Also, it looks a bit off that the engine is added as a runnable to the manager while the engine simultaneously takes charge of the runnables.
```go
p.log.Info("Added new cluster")

// ...

// engage manager.
// engage before starting cache so handlers are registered for initial adds
```
Same comment as above. I feel this will not work for follow-up clusters? It might work for the first one, but not the second or third.
```go
if err := mgr.Add(eng.Runnable()); err != nil {
	return nil, err
}
```
This should probably be mgr.GetLocalManager().Add so the engine isn't added as a runnable to itself?
Haven't looked closer, but that's what it looks like at a cursory glance.
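For illustration, a minimal sketch of the change this comment suggests, assuming the multicluster manager exposes its embedded controller-runtime manager via GetLocalManager(); this mirrors the review suggestion, not the code currently in the PR.

```go
// Register the engine's runnable with the embedded local manager rather than
// the multicluster manager, so the engine is not added as a runnable to itself
// (sketch of the review suggestion).
if err := mgr.GetLocalManager().Add(eng.Runnable()); err != nil {
	return nil, err
}
```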
What
- Deterministic sharding via HRW (rendezvous) hashing over live peers (see the sketch after this list).
- Peer leases mcr-peer-* (membership/weights) + shard leases mcr-shard- (per-cluster fencing, single writer).
- Clean handoff: watches detach/re-attach on ownership change.
- Rebalance on scale up/down.
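For illustration, a minimal sketch of HRW (rendezvous) hashing over live peers; the FNV hash and the `owner` helper are assumptions for the sketch, not necessarily what this PR implements.

```go
package sharding

import (
	"fmt"
	"hash/fnv"
)

// owner returns the peer with the highest rendezvous score for the given
// cluster. Every peer computes this independently over the same live-peer
// set, so ownership is deterministic without coordination. Returns "" if
// there are no peers.
func owner(clusterName string, peers []string) string {
	var best string
	var bestScore uint64
	for _, peer := range peers {
		h := fnv.New64a()
		fmt.Fprintf(h, "%s/%s", peer, clusterName)
		if score := h.Sum64(); best == "" || score > bestScore {
			best, bestScore = peer, score
		}
	}
	return best
}
```

Each peer runs the same computation over the live-peer set derived from the mcr-peer-* leases and engages a cluster only when it is the owner; when the peer set changes, only clusters whose owner changes are handed off.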
Why
- Prevent double reconciles and a cold-start stampede.
- Balanced, deterministic ownership with fast failover.
How to try
- Follow the README (examples/sharded-namespace) for build/deploy/observe steps.
Changes
- Manager: HRW + per-cluster Lease fencing, peer/fence prefix split (see the Lease sketch after this list).
- Controller: per-cluster engagement context, re-engage fix.
- Source: removable handler, re-register on new ctx.
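For context on the fencing piece, a minimal sketch of acquiring a per-cluster shard Lease with coordination.k8s.io/v1; the function name, lease naming, and 30-second duration are assumptions for illustration, and renewal and takeover of expired leases are omitted.

```go
package sharding

import (
	"context"
	"time"

	coordinationv1 "k8s.io/api/coordination/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/utils/ptr"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// tryAcquireShardLease attempts to create the per-cluster shard Lease so that
// only a single peer (the HRW owner) engages the cluster. An AlreadyExists
// conflict means another peer holds the lease and this peer must not engage.
// Sketch only; renewal and expiry handling are left out.
func tryAcquireShardLease(ctx context.Context, c client.Client, namespace, leaseName, holder string) (bool, error) {
	lease := &coordinationv1.Lease{
		ObjectMeta: metav1.ObjectMeta{Name: leaseName, Namespace: namespace},
		Spec: coordinationv1.LeaseSpec{
			HolderIdentity:       ptr.To(holder),
			LeaseDurationSeconds: ptr.To(int32(30)),
			AcquireTime:          &metav1.MicroTime{Time: time.Now()},
			RenewTime:            &metav1.MicroTime{Time: time.Now()},
		},
	}
	if err := c.Create(ctx, lease); err != nil {
		if apierrors.IsAlreadyExists(err) {
			return false, nil // another peer already fences this cluster
		}
		return false, err
	}
	return true, nil
}
```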