Commit 2f2a564

Article: Split Brain
1 parent c9df969 commit 2f2a564

1 file changed: +125 -0

content/blog/split-brain.md

Lines changed: 125 additions & 0 deletions

---
author: Nimendra
title: "System Design Notes: Split Brain in Docker Swarm"
date: 2025-07-14
description: "Understand the causes, consequences, and prevention of split-brain scenarios in Docker Swarm using Raft consensus and quorum-based leader election."
tags: ["docker", "distributed-systems", "raft", "quorum", "K8s"]
categories: ["Backend", "Docker"]
lastmod: 2025-07-14
showtoc: true
TocOpen: false
ShowReadingTime: true
ShowPostNavLinks: true
ShowBreadCrumbs: true
ShowCodeCopyButtons: true
editPost:
  URL: "https://github.com/nmdra/nmdra.github.io/tree/main/content"
  Text: "Suggest edit"
  appendFilePath: true
---

**Split Brain** in distributed systems such as Docker Swarm occurs when a network partition causes nodes to lose communication with one another.

This results in two or more subsets of nodes each believing they are the **leader** or **primary controller** of the cluster. This inconsistency can lead to:

- Data corruption
- Conflicting operations
- Duplicate tasks being executed

## How it Happens

1. **Network Partition**: A temporary network failure splits the nodes into two or more isolated groups.
2. **Leader Election Conflict**: Each isolated group might independently attempt to elect a leader.
3. **Independent Decisions**: Each group operates as a separate cluster, leading to inconsistent states.

> ### In a Docker Swarm cluster:
>
> - Nodes are classified into **managers** and **workers**.
> - **Managers** coordinate service orchestration and maintain the cluster state.
> - If a partition occurs:
>   - Each group of managers may elect its own leader.
>   - This results in multiple active leaders (**split brain**) and **service conflicts**, as the sketch below illustrates.
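
To make the failure mode concrete, here is a minimal Python sketch (a toy model I wrote for this post, not Swarm's actual code): without a quorum rule each isolated group happily elects its own leader, while a majority requirement lets at most one side do so.

```python
# Toy model of why naive leader election splits the brain,
# and how a quorum rule prevents it.

CLUSTER = ["mgr-1", "mgr-2", "mgr-3"]          # 3 manager nodes
PARTITIONS = [["mgr-1", "mgr-2"], ["mgr-3"]]   # the network splits them 2 / 1


def naive_leaders(partitions):
    """Each isolated group elects its own leader -> split brain."""
    return [min(group) for group in partitions]


def quorum_leaders(partitions, cluster_size):
    """A group may elect a leader only if it holds a strict majority."""
    majority = cluster_size // 2 + 1
    return [min(group) for group in partitions if len(group) >= majority]


print(naive_leaders(PARTITIONS))                 # ['mgr-1', 'mgr-3'] -> two leaders!
print(quorum_leaders(PARTITIONS, len(CLUSTER)))  # ['mgr-1']          -> single leader
```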
## Consequences of Split Brain

1. **Data Inconsistency**: Multiple leaders might make conflicting updates.
2. **Duplicate Workloads**: Services may be scheduled redundantly.
3. **Unrecoverable State**: Independent decisions by both partitions can be hard to reconcile.
4. **Reduced System Reliability**: The system becomes unpredictable or unusable.
## Prevention Techniques in Docker Swarm

Docker Swarm uses the following techniques to avoid split-brain scenarios (a small sketch of the majority-agreement rule follows the list):

1. **Raft Consensus Algorithm**
   Ensures only one leader exists by requiring majority agreement.

2. **Quorum Enforcement**
   A cluster will only elect a leader and make decisions if a majority (quorum) of manager nodes is reachable.

3. **Network Redundancy**
   Redundant network paths make partitions less likely, reducing the risk of manager isolation.
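
To illustrate the Raft-style majority rule (a simplified sketch, not Swarm's internals): a leader can only *commit* a change once a majority of **all** managers have acknowledged it, so a leader stranded in a minority partition can never make durable cluster-state changes.

```python
# Simplified majority-commit rule in the spirit of Raft (illustrative only).

MANAGERS = ["mgr-1", "mgr-2", "mgr-3", "mgr-4", "mgr-5"]


def can_commit(acks: set[str], managers: list[str]) -> bool:
    """An entry is committed only when a strict majority of ALL managers
    (not just the reachable ones) has acknowledged it."""
    return len(acks) >= len(managers) // 2 + 1


# Leader in the majority partition: 3 of 5 managers acknowledge -> committed.
print(can_commit({"mgr-1", "mgr-2", "mgr-3"}, MANAGERS))  # True

# Leader stranded in a minority partition gathers at most 2 acks -> blocked.
print(can_commit({"mgr-4", "mgr-5"}, MANAGERS))           # False
```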
## Quorum Explained

### 1. Majority Rule

A quorum is achieved when **more than half** of the manager nodes agree.

| Managers | Quorum (Majority) | Fault Tolerance |
|----------|-------------------|-----------------|
| 1        | 1                 | 0               |
| 2        | 2                 | 0               |
| 3        | 2                 | 1               |
| 4        | 3                 | 1               |
| 5        | 3                 | 2               |
| 6        | 4                 | 2               |
| 7        | 4                 | 3               |

- **Example**: In a 3-manager setup, at least 2 managers must be reachable to form a quorum. The snippet below reproduces this table.
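
The whole table follows from two small formulas, quorum = ⌊N/2⌋ + 1 and fault tolerance = N − quorum. A quick Python check (my own helper, not part of Docker):

```python
# Derive the quorum table: majority needed and how many managers may fail.

def quorum(n: int) -> int:
    """Smallest strict majority of n managers."""
    return n // 2 + 1


def fault_tolerance(n: int) -> int:
    """Managers that can be lost while a majority remains reachable."""
    return n - quorum(n)


for n in range(1, 8):
    print(f"{n} managers -> quorum {quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

Note that going from 3 to 4 managers raises the quorum without improving fault tolerance, which is why even-sized manager groups add little resilience.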
When a network partition occurs:

- The **partition with quorum** (majority) becomes the **active** cluster.
- The minority partition becomes **inactive** or **read-only**.

### 2. Leader Election and Heartbeats

- Manager nodes use **heartbeat messages** to monitor each other's health.
- When heartbeats stop arriving, managers assume the leader is down and initiate a **leader election** via Raft.

> ### What are heartbeats?
>
> A **heartbeat** in Docker Swarm is a periodic signal sent between manager nodes to detect node failure. This ensures only active, reachable nodes participate in orchestration.

#### Rules for Leader Election

1. A manager can **only become a leader if it has quorum**, i.e. votes from a majority of managers.
2. If quorum is not met, **no leader is elected**, and the cluster **pauses orchestration**; a simplified election sketch follows.
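
Below is a heavily simplified, single-round sketch of that election logic (illustrative Python, not Swarm's Raft implementation): a manager whose election timer fires becomes a candidate and wins only if it collects votes from a majority of the full manager set.

```python
import random

# Heavily simplified single-round Raft-style election (illustrative only).

MANAGERS = ["mgr-1", "mgr-2", "mgr-3"]


def request_votes(candidate: str, reachable: set[str]) -> int:
    """The candidate votes for itself and asks every reachable peer."""
    votes = 1  # self-vote
    for _peer in reachable - {candidate}:
        votes += 1  # peers grant the vote in this toy model
    return votes


def run_election(candidate: str, reachable: set[str], cluster_size: int) -> bool:
    """Election succeeds only with a strict majority of the FULL cluster."""
    majority = cluster_size // 2 + 1
    return request_votes(candidate, reachable) >= majority


# Leader heartbeats stop; mgr-2's election timeout fires first.
timeout_ms = random.randint(150, 300)  # randomized timeouts avoid tied elections
print(f"mgr-2 election timeout after {timeout_ms} ms")

# mgr-2 can still reach mgr-3 -> 2 of 3 votes -> becomes leader.
print(run_election("mgr-2", {"mgr-2", "mgr-3"}, len(MANAGERS)))  # True

# mgr-2 isolated alone -> only its self-vote -> no leader elected.
print(run_election("mgr-2", {"mgr-2"}, len(MANAGERS)))           # False
```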
## Example Scenarios

### Scenario 1: 3 Manager Nodes

- Partition A: 2 nodes → **quorum met**
- Partition B: 1 node → **no quorum**

➡ Partition A remains active; Partition B becomes read-only.

### Scenario 2: 4 Manager Nodes

- Partition A: 2 nodes
- Partition B: 2 nodes

➡ Neither side has quorum (majority = 3), so **no leader can be elected** and the cluster **pauses orchestration** until connectivity is restored; this is one reason an odd number of managers is recommended. The quick check below confirms both outcomes.
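
A tiny helper (hypothetical, written just for this post) applies the same majority rule to both scenarios:

```python
# Evaluate both scenarios with the majority rule from the quorum table.

def partition_status(partition_sizes: list[int]) -> list[str]:
    total = sum(partition_sizes)
    majority = total // 2 + 1
    return [
        "active (quorum)" if size >= majority else "read-only (no quorum)"
        for size in partition_sizes
    ]


print(partition_status([2, 1]))  # ['active (quorum)', 'read-only (no quorum)']
print(partition_status([2, 2]))  # ['read-only (no quorum)', 'read-only (no quorum)']
```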
## Split-Brain Scenarios in Kubernetes (K8s)

In Kubernetes, the control plane relies on **etcd**, a distributed key-value store that holds the entire cluster state: pods, configurations, secrets, and more. To ensure consistency and fault tolerance, etcd also uses the Raft consensus algorithm and the same majority-quorum rule, so a minority partition of etcd members cannot accept writes and split brain is prevented in the control plane.

👉 [Learn how etcd works](https://learnk8s.io/etcd-kubernetes)
