
Inconsistent OOM kills in k8s #12282

@real-danm

Description

I am trying to get a better understanding of how gVisor handles memory management, and more specifically how it handles OOM kills, primarily for Python workloads.

Across both GKE and custom clusters I get very inconsistent OOM kills; see below for version numbers.
I have requests and limits set for the primary and sidecar containers. I have experimented with also setting requests/limits on the init container, but it doesn't seem to make a difference; by default I do not set these values for init containers.

My deployment YAML is templated, so the only differences between pods are the containers.
With this setup, some pods are correctly OOM-killed (exit 137) or evicted.
However, I often see pods where the memory reported by the k8s metrics API is 1.5x or more above the limit set in the deployment spec; this is confirmed by inspecting the process on the host (a rough sketch of what I check is shown after the kubectl output below).
e.g.:

kubectl get pod xxx -o jsonpath='{.spec.runtimeClassName}'
gvisor
kubectl get pod xxx -o jsonpath='{range .spec.containers[*]}{.name}{": limits="}{.resources.limits}{", requests="}{.resources.requests}{"\n"}{end}'
primary: limits={"cpu":"4","ephemeral-storage":"2Gi","memory":"8589934592"}, requests={"cpu":"2","ephemeral-storage":"2Gi","memory":"4294967296"}
sidecar: limits={"cpu":"100m","memory":"100Mi"}, requests={"cpu":"10m","memory":"50Mi"}

kubectl top pod xxx
NAME                                                       CPU(cores)   MEMORY(bytes)
xxx   93m          12560Mi
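
As a rough sketch of what I mean by "inspecting the process on the host" (the cgroup path below is illustrative, assuming a cgroups v2 node with the systemd driver and a Burstable pod; the real slice names depend on QoS class and pod UID):

# on the node: the gVisor sandbox typically shows up as a runsc-sandbox process
ps aux | grep runsc

# pod-level cgroup limit vs. current usage (substitute the real pod UID)
cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<POD_UID>.slice/memory.max
cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<POD_UID>.slice/memory.current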

I'm not sure whether this is a bug in gVisor or in my setup; I would appreciate any help debugging this.

Steps to reproduce

I set up a test pod that just continuously allocates memory (a rough sketch is below). With a single container it is OOM-killed as expected. With an init container added, I have seen a single instance where it was able to exceed the limit, but most of the time it is killed correctly, so I can't draw any conclusions from this.
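
For reference, a minimal sketch of the kind of test pod I mean (the name, image, sizes, and allocation script here are placeholders, not the exact manifest I run):

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gvisor-oom-test
spec:
  runtimeClassName: gvisor
  restartPolicy: Never
  initContainers:
  - name: init
    image: busybox
    command: ["sh", "-c", "sleep 1"]
  containers:
  - name: allocator
    image: python:3.11-slim
    resources:
      requests:
        memory: "256Mi"
      limits:
        memory: "512Mi"
    # allocate ~10 MiB every 100 ms and hold the references;
    # this should be OOM-killed (exit 137) near the 512Mi limit
    command:
    - python3
    - -c
    - |
      import time
      chunks = []
      while True:
          chunks.append(bytearray(10 * 1024 * 1024))
          time.sleep(0.1)
EOF

Whether the container was actually OOM-killed (OOMKilled, exit 137) can then be checked with kubectl describe pod gvisor-oom-test.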

runsc version

I have two different cluster types using gvisor.
GKE (1.33.5-gke.1080000) clusters:

/home/containerd/usr/local/sbin/runsc --version
runsc version google-785595836
spec: 1.2.1

custom cluster setup (systemd cgroup driver, cgroups v2):

runsc --version
runsc version release-20250820.0
spec: 1.2.0

docker version (if using docker)

No response

uname

No response

kubectl (if using Kubernetes)

custom node setup:

kubelet --version
Kubernetes v1.33.1

kubectl version
...
Server Version: v1.33.1

kubectl get nodes
NAME           STATUS   ROLES   AGE   VERSION
xxx   Ready    node    29d   v1.33.1
...

repo state (if built from source)

No response

runsc debug logs (if available)

No response