Commit e8d8128

Karthik-K-N and kishen-v committed
Update HotUnplug Scenario
Co-authored-by: kishen-v <kishen.viswanathan@ibm.com>
1 parent 40b8dec commit e8d8128

1 file changed: +30 -0 lines changed
keps/sig-node/3953-node-resource-hot-plug/README.md

Lines changed: 30 additions & 0 deletions
@@ -19,6 +19,7 @@ tags, and then generate with `hack/update-toc.sh`.
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [Handling HotUnplug Events](#handling-hotunplug-events)
- [Flow Control](#flow-control)
- [User Stories](#user-stories)
- [Story 1](#story-1)
- [Story 2](#story-2)
@@ -156,6 +157,35 @@ With this proposal its also necessary to recalculate and update OOMScoreAdj and
Though this KEP focuses only on resource hotplug, it will enable the kubelet to capture the current available capacity of the node, regardless of whether resources were hotplugged or hotunplugged.
For now, we will introduce an error mode in the kubelet to inform users about the shrink in available resources in case of hotunplug.

As hotunplug events are not completely handled in this KEP, it is imperative to move the node to the NotReady state when the current capacity of the node
is less than the initial capacity of the node. This only serves to flag that the resources on the node have shrunk and may need attention or intervention.

Once the node has transitioned to the NotReady state, it will revert to the Ready state once the node's capacity is reconfigured to match or exceed the last valid configuration.
Here, the valid configuration is either the previous hotplug capacity or, if there is no history of hotplug, the initial capacity.

#### Flow Control

```
T=0: Node initial Resources:
- Memory: 10G
- Node state: Ready

T=1: Resize Instance to Hotplug Memory
- Current Memory: 10G
- Updated Memory: 15G
- Node state: Ready

T=2: Resize Instance to HotUnplug Memory
- Current Memory: 15G
- Updated Memory: 5G
- Node state: NotReady

T=3: Resize Instance to Hotplug Memory
- Current Memory: 5G
- Updated Memory: 15G
- Node state: Ready
```

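The transitions in the timeline above can be modeled with a short sketch. It is illustrative Python rather than kubelet code: `NodeCapacityTracker`, `observe`, and `last_valid_bytes` are hypothetical names, and it assumes the comparison baseline is the last valid configuration (the initial capacity until a hotplug raises it).

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class NodeCapacityTracker:
    """Hypothetical model of the readiness rule; not part of the kubelet."""

    last_valid_bytes: int  # initial capacity, or the last hotplugged capacity
    ready: bool = True

    def observe(self, current_bytes: int) -> bool:
        """Record a newly observed capacity and return the resulting readiness."""
        if current_bytes >= self.last_valid_bytes:
            # Capacity matches or exceeds the last valid configuration:
            # accept it as the new baseline and (re)enter Ready.
            self.last_valid_bytes = current_bytes
            self.ready = True
        else:
            # Capacity shrank below the baseline (hotunplug): mark NotReady
            # and keep the old baseline for later comparisons.
            self.ready = False
        return self.ready


tracker = NodeCapacityTracker(last_valid_bytes=10 * GIB)        # T=0
states = [tracker.observe(size * GIB) for size in (15, 5, 15)]  # T=1..T=3
print(states)  # [True, False, True]
```

Keeping the old baseline while the node is NotReady is what lets the T=3 resize back to 15G restore readiness; a resize to, say, 12G would leave the node NotReady under this assumption.
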
A few of the concerns surrounding hotunplug are listed below:
* Pod re-admission:
    * Given that the current Pod resource usage may exceed the available capacity of the node, it is necessary to check whether the pod can continue running
