
Design doc for multiple worker node groups support #757

Merged · 9 commits · Dec 23, 2021

Conversation

@bnrjee (Contributor) commented Dec 3, 2021

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@bnrjee bnrjee requested a review from vivek-koppuru December 3, 2021 19:35
@eks-distro-bot eks-distro-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Dec 3, 2021
@bnrjee bnrjee requested a review from abhay-krishna December 3, 2021 20:11
@bnrjee bnrjee force-pushed the wng-doc branch 3 times, most recently from 8ba7400 to 4b5c5b7 on December 4, 2021 04:18
@eks-distro-bot eks-distro-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Dec 4, 2021
@g-gaston (Member) left a comment

This looks very good

Inline review threads on designs/multiple-worker-node-groups.md (resolved)

For each group, we will append these three fields corresponding to that group in the CAPI spec.

Right now, the CLI assumes that there will be only one group, treating the worker node group configuration array as a collection with a single element; as a result, the controller refers only to the first element of this array in different places in the code. We therefore need to perform the same operations in loops, including CAPI spec creation, cluster spec validation, etc. Once a CAPI spec is created with this approach, the workload cluster will be created with multiple worker node groups.
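To make the looping concrete, here is a minimal Go sketch of iterating over the worker node group configurations instead of indexing element 0. The type and function names are hypothetical stand-ins, not the actual eks-anywhere API; the real types live in the project's API packages.

```go
package main

import "fmt"

// WorkerNodeGroupConfiguration is a simplified stand-in for the EKS-A
// cluster spec field of the same name.
type WorkerNodeGroupConfiguration struct {
	Name  string
	Count int
}

// capiObjectNames sketches the per-group loop: for every group we emit
// one set of CAPI object names (KubeadmConfigTemplate, MachineDeployment,
// machine template) rather than assuming a single group at index 0.
// The naming scheme here is illustrative only.
func capiObjectNames(clusterName string, groups []WorkerNodeGroupConfiguration) []string {
	var names []string
	for _, g := range groups { // one set of CAPI objects per group
		names = append(names,
			fmt.Sprintf("%s-%s-kubeadm-config-template", clusterName, g.Name),
			fmt.Sprintf("%s-%s-machine-deployment", clusterName, g.Name),
			fmt.Sprintf("%s-%s-worker-node-template", clusterName, g.Name),
		)
	}
	return names
}

func main() {
	groups := []WorkerNodeGroupConfiguration{{Name: "md-0", Count: 3}, {Name: "md-1", Count: 2}}
	for _, n := range capiObjectNames("eksa-test", groups) {
		fmt.Println(n)
	}
}
```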
Member

It doesn't need to be part of this doc at all, but it might be a good idea to document all the places where this assumption is being made and how deeply they go (at least before starting the execution). Maybe something we can do in parallel to this review.

I fear this refactor might be a bit more complex/lengthy than it seems

Contributor Author

We can discuss this separately.

@bnrjee bnrjee requested a review from g-gaston December 7, 2021 18:52
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  name: eksa-test-worker-node-template-1638469395669
Member

Should we be mapping the VSphereMachineTemplate to the worker node group, which means we would have a eksa-test-1-worker-node-template-* and eksa-test-2-worker-node-template-*? I think it gets a little complicated when we think about maintaining a mapping from worker node groups and machine config objects to the capi template, especially when introducing/removing a machine config. Unless we just say that every worker node group configuration warrants a new capi template spec, regardless of whether it references the same machine config or not.

Based on what it is, would like to see a note about that as a sentence or two in the design doc here.

Member

Curious about this as well
In particular: especially when introducing/removing a machine config, super interesting problem I didn't think about


For each group, we will append these three fields corresponding to that group in the CAPI spec.

Right now, the CLI assumes that there will be only one group, treating the worker node group configuration array as a collection with a single element; as a result, the controller refers only to the first element of this array in different places in the code. We therefore need to perform the same operations in loops, including CAPI spec creation, cluster spec validation, etc. Once a CAPI spec is created with this approach, the workload cluster will be created with multiple worker node groups.
Member

When we introduce multiple machine configs, are we going to use Go array templating to loop or maintain a default capi spec containing worker node configuration, and append to the resultant capi spec depending on how many worker node groups that we have? I would prefer to do the latter so that we control generating new capi worker node specs based on the number of machine config objects we have configured.

Member

A third option: use the CAPI API (Go) structs.
This one has my vote

Contributor Author

We will use Go structs.


Also, it must be ensured that at least one of the worker node groups does not have a `NoExecute` or `NoSchedule` taint. This validation will be done at the preflight validation stage.
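A minimal Go sketch of that preflight check follows. The types are simplified stand-ins (the real validation operates on the eks-anywhere cluster spec types and `corev1.Taint`), and the function name is hypothetical.

```go
package main

import "fmt"

// Taint is a simplified stand-in for corev1.Taint.
type Taint struct {
	Key    string
	Effect string // e.g. "NoSchedule", "NoExecute"
}

// WorkerNodeGroup is a simplified stand-in for a worker node group
// configuration from the cluster spec.
type WorkerNodeGroup struct {
	Name   string
	Taints []Taint
}

// validateSchedulableGroup sketches the preflight rule: at least one
// worker node group must carry no NoExecute or NoSchedule taint, so
// ordinary workloads have somewhere to run.
func validateSchedulableGroup(groups []WorkerNodeGroup) error {
	for _, g := range groups {
		schedulable := true
		for _, t := range g.Taints {
			if t.Effect == "NoExecute" || t.Effect == "NoSchedule" {
				schedulable = false
				break
			}
		}
		if schedulable {
			return nil // found an untainted (schedulable) group
		}
	}
	return fmt.Errorf("at least one worker node group must have no NoExecute or NoSchedule taint")
}

func main() {
	groups := []WorkerNodeGroup{
		{Name: "md-0", Taints: []Taint{{Key: "dedicated", Effect: "NoSchedule"}}},
		{Name: "md-1"}, // untainted, so validation passes
	}
	fmt.Println(validateSchedulableGroup(groups))
}
```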

The examples in this design are for the vSphere provider, but the same strategy applies to other providers as well.
Member

Does this mean that for Docker, only changing the taints for a worker node group would warrant a new CAPI template spec, or does each worker node group correspond to a separate CAPI template spec even if the values are exactly the same?

Contributor Author

Yes, for Docker also, we will be adding a KubeadmConfigTemplate, MachineDeployment, and VSphereMachineTemplate to the CAPI spec file for each worker node group.


Right now, the CLI assumes that there will be only one group, treating the worker node group configuration array as a collection with a single element; as a result, the controller refers only to the first element of this array in different places in the code. We therefore need to perform the same operations in loops, including CAPI spec creation, cluster spec validation, etc. Once a CAPI spec is created with this approach, the workload cluster will be created with multiple worker node groups. We will use an array of CAPI objects to store the worker node group configurations and then generate the CAPI spec file using that array.
Member

Can you mention that one element of the array of worker node group configurations corresponds to a set of CAPI objects consisting of KubeadmConfig, MachineDeployment, and whatever else is there? Just gives us an understanding as to what to expect, even if there are repeated VSphereMachineConfigs for each of worker node groups.

@bnrjee (Contributor, Author) commented Dec 15, 2021

I am not sure if there is a single data structure encompassing all three, but these three types are well defined in the CAPI and CAPV code bases. What I plan to do is create a struct of these three elements and then create an array of that struct. I will update the design doc.
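The struct-of-three idea described above could be sketched as follows. The types here are simplified stand-ins for the real CAPI/CAPV API structs (which live in cluster-api and cluster-api-provider-vsphere), and all names are hypothetical.

```go
package main

import "fmt"

// Simplified stand-ins for the CAPI/CAPV API types.
type KubeadmConfigTemplate struct{ Name string }
type MachineDeployment struct {
	Name     string
	Replicas int
}
type VSphereMachineTemplate struct{ Name string }

// workerGroupObjects bundles the three per-group CAPI objects into one
// struct, as the comment above proposes; the struct name is hypothetical.
type workerGroupObjects struct {
	KubeadmConfigTemplate  KubeadmConfigTemplate
	MachineDeployment      MachineDeployment
	VSphereMachineTemplate VSphereMachineTemplate
}

// buildWorkerGroups creates one bundle per worker node group; the CAPI
// spec file would then be generated from this array.
func buildWorkerGroups(clusterName string, groupNames []string, replicas []int) []workerGroupObjects {
	out := make([]workerGroupObjects, 0, len(groupNames))
	for i, g := range groupNames {
		out = append(out, workerGroupObjects{
			KubeadmConfigTemplate:  KubeadmConfigTemplate{Name: clusterName + "-" + g},
			MachineDeployment:      MachineDeployment{Name: clusterName + "-" + g, Replicas: replicas[i]},
			VSphereMachineTemplate: VSphereMachineTemplate{Name: clusterName + "-" + g + "-worker-node-template"},
		})
	}
	return out
}

func main() {
	groups := buildWorkerGroups("eksa-test", []string{"md-0", "md-1"}, []int{3, 2})
	fmt.Println(len(groups), groups[1].MachineDeployment.Name)
}
```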

@bnrjee bnrjee mentioned this pull request Dec 15, 2021
@eks-distro-bot (Collaborator)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bnrjee, g-gaston, vivek-koppuru

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [bnrjee,g-gaston,vivek-koppuru]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@g-gaston (Member)

/lgtm

Labels
approved lgtm size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants