@@ -177,10 +177,9 @@ incorrectly or objects being garbage collected mistakenly.
 
 * Ensure discovery reports the same set of resources everywhere (not just group
   versions, as it does today)
-* Proxy client traffic to an apiserver that can service the request
-  * If traffic can't be proxied (e.g. if no network connection between
-    apiservers), return a 503 and not a 404 so clients can make the right
-    decision.
+* Ensure that every resource in discovery can be accessed successfully
+  * In the failure case (e.g. network not routable between apiservers), ensure
+    that unreachable resources are served 503 and not 404.
 
 ### Non-Goals
 
@@ -212,6 +211,17 @@ API server change:
 we will use the storage version objects to reconstruct a merged discovery
 document and serve that in all apiservers.
 
+Why so much work?
+* Note that merely serving 503s at the right times does not solve the problem,
+  for two reasons: controllers might get an incomplete discovery and therefore
+  not ask about all the correct resources; and when they get 503 responses,
+  although the controller can avoid doing something destructive, it also can't
+  make progress and is stuck for the duration of the upgrade.
+* Likewise proxying but not merging the discovery document, or merging the
+  discovery document but serving 503s instead of proxying, doesn't fix the
+  problem completely. We need both safety against destructive actions and the
+  ability for controllers to proceed and not block.
+
 ### User Stories (Optional)
 
 #### Garbage Collector
@@ -222,8 +232,9 @@ described above, could result in GC seeing a 404 and assuming an object has been
 deleted; this could result in it deleting a subsequent object that it should
 not.
 
-This proposal will cause the GC to see either the correct object or get a 503
-(which it handles safely).
+This proposal will cause the GC to see the complete list of resources in
+discovery, and when it requests specific objects, see either the correct object
+or get a 503 (which it handles safely).
 
 #### Namespace Lifecycle Controller
 