
aws-cni on vanilla K8s #508

Closed

tvalasek opened this issue Jun 18, 2019 · 7 comments

Comments

@tvalasek

tvalasek commented Jun 18, 2019

We build K8s clusters in AWS using CF/kubeadm from upstream vanilla K8s. For networking we use the aws-cni plugin. Our out-of-the-box setup is 3 masters (with etcd running on them) and 3 worker nodes.

The aws-cni plugin runs as a daemonset, and thus on all 6 members of the cluster (masters + workers).

Now, the behaviour I'm seeing is that the aws-cni plugin does not differentiate masters from workers.

The result (looking at the cni-metrics-helper stats) is that aws-cni creates new ENIs and a secondary IP address pool (warm pool) on demand on the master nodes as well, even though those have pod scheduling disabled by default (for obvious reasons). That leaves us with a huge number of unused warm-pool IPs on the master nodes that can never be allocated.
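
(Side note for anyone reproducing this: the warm pool held on a single node can also be inspected via the ipamd introspection endpoint of the aws-node pod on that node, assuming the default introspection port hasn't been changed:)

  # on a master node: dump the ENIs and secondary IPs that ipamd is currently holding
  curl -s http://localhost:61679/v1/enis | python -m json.tool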

I believe aws-cni was primarily built for EKS (where the control plane / masters and etcd are hidden from the EKS admin), but I wonder whether aws-cni has a feature to distinguish masters from workers (and thus apply different warm-pool behaviour) for those of us who decided not to use EKS.

E.g. labeling master nodes and annotating the aws-cni daemonset to act differently on labeled nodes (such as not creating new ENIs).
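
For illustration only, something along these lines (the master label is the standard one kubeadm applies; the daemonset annotation key below is made up, purely to sketch the idea):

  # masters already carry this label when bootstrapped with kubeadm
  kubectl get nodes -l node-role.kubernetes.io/master

  # hypothetical knob: tell aws-node not to grow the warm pool on labeled nodes
  kubectl -n kube-system annotate daemonset aws-node \
      example.aws-cni/skip-warm-pool-on-label=node-role.kubernetes.io/master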

Thanks

@mogren
Contributor

mogren commented Jun 18, 2019

Hi @tvalasek,

You are right that we have not yet optimized the plugin much for use outside of EKS, so there is more work to be done here. I think it sounds like a good idea to make the CNI more configurable in order to work better on the masters. Do you have any more concrete suggestions for what configuration options you would need?

@tvalasek
Author

How I see it, it would be something similar to what has already been done: #68

I'm not sure whether it is part of the aforementioned PR, but for our use case I would like a config option for the maximum number of ENIs that can be created for a given node (that way we could control how many warm-pool IPs can be created for that specific node).

Secondly, these config options could either work globally on all members of a cluster (as they do now) or only on nodes with specific labels at the K8s level (e.g. node-role.kubernetes.io) or specific tags at the AWS EC2 level. I reckon the latter sounds more like a generic AWS approach; I kinda like it.

Does it make sense?

@mogren
Contributor

mogren commented Jun 19, 2019

We do have the MAX_ENI setting already, but that would get applied to both master and worker nodes.

In this case, I guess it would be better to have a way to tag master nodes, or to make the CNI aware of common taints like node-role.kubernetes.io/master so it can behave differently in that case.
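
For reference, MAX_ENI is read from an environment variable on the aws-node container, so as things stand a single value applies to every node the daemonset runs on, roughly:

  # kube-system/aws-node daemonset, container spec (applies to all nodes today)
  env:
  - name: MAX_ENI
    value: "1"  # example cap; this would also limit the workers, which is the problem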

@danbeaulieu

@mogren is creating separate daemonsets an option? One for control plane nodes with the right tolerations and selectors, and one for non-control-plane nodes?
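
A minimal sketch of that idea, assuming the stock aws-node manifest is simply duplicated and the two copies are told apart by the kubeadm master label (the WARM_IP_TARGET value is just a placeholder):

  # aws-node-masters: tolerates the master taint, keeps the warm pool small
  spec:
    nodeSelector:
      node-role.kubernetes.io/master: ""
    tolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
    containers:
    - name: aws-node
      env:
      - name: WARM_IP_TARGET
        value: "1"

  # aws-node-workers: default settings, kept off the masters
  spec:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: node-role.kubernetes.io/master
              operator: DoesNotExist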

@jaypipes added the priority/P2 (Low priority, nice to have) and needs investigation labels and removed the priority/P2 (Low priority, nice to have) label on Oct 30, 2019
@jaypipes
Contributor

@tvalasek Hi Tomas, we're actually wondering what the specific feature request is for this. We're hoping you can elaborate. Are you asking for the CNI plugin to behave in a different way if it knows it's running on a master node (via inspection of, say, node-role annotation)? Or are you asking for a way to prevent the CNI plugin (via a daemonset taint/toleration) from running on master nodes?

@tvalasek
Author

@jaypipes Hi Jay.

Are you asking for the CNI plugin to behave in a different way if it knows it's running on a master node (via inspection of, say, node-role annotation)?

Yes, that's the correct one.

P.S.: If we prevented it from running on the master nodes, we would not be able to schedule any pods on them at all, because aws-cni is responsible for assigning IP addresses to pods.

@jaypipes
Contributor

Apologies for the long delay in getting back to you @tvalasek! This unfortunately dropped out of my email radar :(

The solution you are looking for is to modify the YAML manifest for the aws-k8s-cni Daemonset you are using to deploy the CNI plugin to include a nodeAffinity specification that prevents the Daemonset from being scheduled to specific nodes:

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/role
            operator: NotIn
            values:
            - master

Depending on how you are installing Kubernetes, the "key" above may be different (it's the label that is applied to the node). The key shown above is the one that kops applies, for example.
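
If you are not sure which role label your installer applies, you can check directly on the nodes:

  # list node labels and look for the role label your installer sets
  kubectl get nodes --show-labels

  # kubeadm-based clusters typically use node-role.kubernetes.io/master
  kubectl get nodes -l node-role.kubernetes.io/master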

haouc added a commit to haouc/amazon-vpc-cni-k8s that referenced this issue Apr 23, 2021
jayanthvn pushed a commit that referenced this issue Apr 23, 2021
* Cherry-pick the PR #458 from eks charts

* bumping chart version to sync with #508 from eks-charts