Kubernetes deploys the coreDNS pods on nodes within the cluster, and there is no guarantee about which nodes will or won't receive one. Some PAI worker nodes may have a coreDNS pod while others may not.
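To see where the coreDNS pods actually landed, a quick check like the following should work on most clusters, assuming a standard kubeadm-style setup where coreDNS runs in `kube-system` under the `k8s-app=kube-dns` label:

```sh
# List the coreDNS pods along with the nodes they were scheduled on
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
```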
The coreDNS pod requests 1 CPU and 500 MiB of memory, which affects the resource calculation for the hived scheduler.
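The exact requests can be confirmed from the deployment itself; a minimal sketch, assuming the deployment is named `coredns` in `kube-system`:

```sh
# Print the CPU/memory requests declared by the coreDNS container
kubectl -n kube-system get deployment coredns \
  -o jsonpath='{.spec.template.spec.containers[0].resources.requests}'
```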
e.g. A cluster has one master node and two worker nodes, and every worker node has 10 allocatable CPUs. In the beginning the coreDNS pod is deployed on the master node, so the admin configures 10 CPUs for every worker node in the hived scheduler. But for some reason a coreDNS pod later gets deployed on one of the worker nodes, and that node's allocatable CPU count drops to 9. This can cause infinite job retries, because hived still believes the node has 10 CPUs and may keep scheduling jobs onto it that the node can no longer fit.
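One way to spot the mismatch is to compare a node's allocatable capacity against what is already requested on it; `<worker-node>` below is a placeholder:

```sh
# Show the resources already requested on a node; a coreDNS pod on the
# node shows up as an extra 1 CPU / 500Mi of requests
kubectl describe node <worker-node> | grep -A 8 "Allocated resources"
```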
Removing coredns may cause pods that are not using hostNetwork to fail to access the internet. We may need to move it to the master node instead.
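A minimal sketch of pinning coreDNS to the master, assuming a kubeadm-style cluster where master nodes carry the `node-role.kubernetes.io/master` label (newer releases use `node-role.kubernetes.io/control-plane` instead) and coreDNS already tolerates the master taint:

```sh
# Add a nodeSelector so coreDNS is only scheduled onto master nodes
kubectl -n kube-system patch deployment coredns --type merge \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"node-role.kubernetes.io/master":""}}}}}'
```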
I'm not sure about the AKS env, i.e. whether we can control coreDNS in AKS.