@@ -880,11 +880,16 @@ not added to the pod for a long enough time (for example 2 minutes).
880880
881881#### Marking pods as Failed
882882
883- When matching a failed pod against Job pod failure policy it is important that
883+ When matching a failed pod against Job pod failure policy, it is important that
884884the pod is actually in the terminal phase (`Failed`), to ensure their state is
885885not modified while Job controller matches them against the pod failure policy.
886886
887- However, there are scenarios in which a pod gets stuck in a non-terminal phase,
887+ Additionally, it is necessary to avoid the creation of a replacement Pod if the
888+ previously created Pod becomes terminating (has a `deletionTimestamp` but is
889+ not `Failed` nor `Succeeded` yet), or we might create replacement Pods that
890+ wouldn't be created if the pod failure policy was applied against the terminated Pod.
891+
892+ There are scenarios in which a pod gets stuck in a non-terminal phase,
888893but is doomed to be failed, as it is terminating (has `deletionTimestamp` set, also
889894known as the `DELETING` state, see:
890895[The API Object Lifecycle](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/object-lifecycle.md)).
@@ -1433,7 +1438,11 @@ type PodFailurePolicyRule struct {
14331438 OnPodConditions []PodFailurePolicyOnPodConditionsPattern
14341439}
14351440
1436- // PodFailurePolicy describes how failed pods influence the backoffLimit.
1441+ // podFailurePolicy describes how failed pods are accounted. In particular,
1442+ // how they influence the backoffLimit.
1443+ // When using podFailurePolicy, terminating Pods (those that have a `deletionTimestamp`)
1444+ // are not immediately replaced and don't count as failed until they reach
1445+ // a terminal phase (`Failed` or `Succeeded`).
14371446type PodFailurePolicy struct {
14381447 // A list of pod failure policy rules. The rules are evaluated in order.
14391448 // Once a rule matches a Pod failure, the remaining of the rules are ignored.
@@ -1507,9 +1516,18 @@ spec:
15071516### Evaluation
15081517
15091518We use the `syncJob` function of the Job controller to evaluate the specified
1510- `podFailurePolicy` rules against the failed pods. It is only the first rule with
1511- matching requirements which is applied as the rules are evaluated in order. If
1512- the pod failure does not match any of the specified rules, then default
1519+ `podFailurePolicy` rules against the failed pods.
1520+
1521+ Since terminating Pods (those that have a `deletionTimestamp` and are not `Failed` or
1522+ `Succeeded`) don't have an exit code yet and might actually succeed, the
1523+ controller will not evaluate them against the `podFailurePolicy`.
1524+ The job controller will also not create a replacement Pod until they reach the
1525+ `Failed` phase. This behavior is the same as
1526+ [`podReplacementPolicy: Failed`](../3939-allow-replacement-when-fully-terminated/).
1527+
1528+ When evaluating Failed Pods against the `podFailurePolicy`, only the first
1529+ rule with matching requirements is applied, because the rules are evaluated in order.
1530+ If the pod failure does not match any of the specified rules, then default
15131531handling of failed pods applies.
15141532
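The evaluation behavior described above can be sketched roughly as follows. This is a minimal illustration only, not the Job controller's actual code: the `Pod` and `Rule` types, the `isTerminating` and `evaluate` helpers, and the `"Default"` action label are all hypothetical simplifications, and rules here match on exit codes only.

```go
package main

import "fmt"

// Phase, Pod and Rule are simplified, hypothetical stand-ins for the real
// API types; they exist only to illustrate the evaluation order.
type Phase string

const (
	Running   Phase = "Running"
	Failed    Phase = "Failed"
	Succeeded Phase = "Succeeded"
)

type Pod struct {
	Name                 string
	Phase                Phase
	HasDeletionTimestamp bool
	ExitCode             int
}

// Rule is a reduced pod failure policy rule that matches on exit codes only.
type Rule struct {
	Action    string
	ExitCodes []int
}

// isTerminal reports whether the pod has reached a terminal phase.
func isTerminal(p Pod) bool {
	return p.Phase == Failed || p.Phase == Succeeded
}

// isTerminating reports whether the pod is being deleted but has not yet
// reached a terminal phase.
func isTerminating(p Pod) bool {
	return p.HasDeletionTimestamp && !isTerminal(p)
}

// evaluate returns the action for a pod and whether the pod was evaluated.
// Only pods in the Failed phase are evaluated; the first matching rule wins.
// "Default" is a hypothetical label for the default handling of failed pods.
func evaluate(p Pod, rules []Rule) (string, bool) {
	if p.Phase != Failed {
		return "", false
	}
	for _, r := range rules {
		for _, code := range r.ExitCodes {
			if code == p.ExitCode {
				return r.Action, true
			}
		}
	}
	return "Default", true
}

func main() {
	rules := []Rule{
		{Action: "FailJob", ExitCodes: []int{42}},
		{Action: "Ignore", ExitCodes: []int{42, 137}},
	}
	pods := []Pod{
		{Name: "a", Phase: Running, HasDeletionTimestamp: true},
		{Name: "b", Phase: Failed, HasDeletionTimestamp: true, ExitCode: 42},
		{Name: "c", Phase: Failed, ExitCode: 1},
	}
	for _, p := range pods {
		action, evaluated := evaluate(p, rules)
		fmt.Printf("%s terminating=%t evaluated=%t action=%q\n",
			p.Name, isTerminating(p), evaluated, action)
	}
}
```

In this sketch, a terminating pod is skipped until it reaches the `Failed` phase, and exit code 42 resolves to the first rule even though the second rule also lists it.
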
15151533If we limit this feature to use `onExitCodes` only when `restartPolicy=Never`
@@ -1708,14 +1726,17 @@ Below are some examples to consider, in addition to the aforementioned [maturity
17081726 [SSA](https://kubernetes.io/docs/reference/using-api/server-side-apply/) client.
17091727- The feature flag enabled by default
17101728
1711- Second iteration:
1729+ Second iteration (1.27):
17121730 - Extend Kubelet to mark as failed pending terminating pods (see: [Marking pods as Failed](#marking-pods-as-failed)).
17131731 - Extend the feature documentation to explain transitioning of pending and
17141732 terminating pods into `Failed` phase.
17151733
17161734Third iteration (1.28):
17171735- Add `DisruptionTarget` condition for pods which are preempted by Kubelet to make room for critical pods.
17181736 Also, backport this fix to 1.26 and 1.27 release branches, and update the user-facing documentation to reflect this change.
1737+ - Avoid creation of replacement Pods for terminating Pods until they reach
1738+ the terminal phase. Update user-facing documentation.
1739+ Might be considered for backport to 1.27.
17191740
17201741#### GA
17211742