This repository has been archived by the owner on Jun 6, 2024. It is now read-only.
Kubernetes pod eviction may not trigger when a job container uses too much disk space.
According to the k8s documentation, the kubelet supports two filesystem partitions:
The nodefs filesystem, which the kubelet uses for volumes, daemon logs, etc.
The imagefs filesystem, which container runtimes use for storing images and container writable layers.
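The imagefs limit is optional, so it may not be configured by default. As a result, if nodefs and imagefs are on different filesystems, k8s may not evict a pod that consumes a large amount of disk space, because a container's writes land in its writable layer on imagefs, where no threshold is being checked.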
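The situation described above can be illustrated with a toy model. This is only a sketch of the reasoning in this issue, not the kubelet's actual eviction logic: the function name `triggered_signals`, the dictionaries, and all numbers are hypothetical and chosen for illustration.

```python
from typing import Dict, List


def triggered_signals(available: Dict[str, float],
                      thresholds: Dict[str, float]) -> List[str]:
    """Return the filesystems whose available fraction is below a configured threshold.

    Toy model only: a filesystem without a configured threshold is never reported.
    """
    return [fs for fs, limit in thresholds.items()
            if fs in available and available[fs] < limit]


# Hypothetical node: the container writable layer has nearly filled imagefs,
# while nodefs still has plenty of free space.
available = {"nodefs": 0.60, "imagefs": 0.02}

# Only a nodefs threshold is configured (assumed setup from this issue).
only_nodefs = {"nodefs": 0.10}
# Both thresholds are configured.
both = {"nodefs": 0.10, "imagefs": 0.15}

print(triggered_signals(available, only_nodefs))
print(triggered_signals(available, both))
```

With only the nodefs threshold set, nothing is reported even though imagefs is almost full; once an imagefs threshold is also set, the same disk state would be flagged.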
In addition, if a pod is evicted by k8s, PAI treats it as a system failure and retries the job without increasing the retry count, so the job will keep retrying indefinitely.
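The retry behavior can be sketched as follows. This is a simplified model written for this issue, not PAI's actual retry code: the function `attempts_before_giving_up`, the failure labels `"system"` and `"user"`, and the retry budget are all assumptions.

```python
from typing import List, Tuple


def attempts_before_giving_up(failure_kinds: List[str],
                              max_retries: int) -> Tuple[int, int]:
    """Replay a sequence of failures and return (attempts made, retries counted).

    Assumed policy: a "system" failure is retried without touching the retry
    budget, while any other failure consumes one retry.
    """
    retries_used = 0
    attempts = 0
    for kind in failure_kinds:
        attempts += 1
        if kind == "system":
            continue  # retried for free, the counter never moves
        if retries_used == max_retries:
            break  # budget exhausted, the job gives up
        retries_used += 1
    return attempts, retries_used


# Ten evictions in a row, each classified as a system failure (hypothetical).
print(attempts_before_giving_up(["system"] * 10, max_retries=3))
# Ten ordinary failures in a row, for comparison.
print(attempts_before_giving_up(["user"] * 10, max_retries=3))
```

In this model the evicted job uses every attempt it is given and never consumes its retry budget, whereas ordinary failures stop after the initial attempt plus three retries.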