-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add apis for machine-class #488
Conversation
@@ -39,5 +42,25 @@ type ProviderConfig struct { | |||
// ProviderConfigSource represents a source for the provider-specific | |||
// resource configuration. | |||
type ProviderConfigSource struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mvladev: https://github.com/kubernetes/kube-deploy/pull/659/files#r177866614
roberthbailey: I don't think it should be necessary to specify the capacity / allocatable on a single machine if you inline the provider config. I think of the class is a separate way to bring the provider specific data, not something inherent to the spec of the machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @mvladev - additional thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@roberthbailey does that mean machine class is just a collection of provider specific configuration?
Advantages of machine classes:
- less configuration lines in the machine objects by keeping common bits in a machine class object
Any other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
less configuration lines in the machine objects by keeping common bits in a machine class object
- more precisely, externalize the
providerConfig
to separate objects. - as discussed in wg-call, it could also be used for defaulting/versioning via apiserver-webhooks, which could be hard to achieve in only inlined.
} | ||
|
||
// MachineClassRef is a reference to the MachineClass object. Controllers should find the right MachineClass using this reference. | ||
type MachineClassRef struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mvladev: What about namespace?
roberthbailey: Storage classes aren't namespaced, and I used that same pattern here. But it's worth discussing whether they should be part of a namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have entertained the namespaces- should be helpful especially keeping in mind the sensitivity of the data in machine-classes and our approach of isolating the clusters based on namespaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StorageClass are basically templates. Here, we're using namespace for isolating actual clusters. Do we need to also isolate machine templates? I don't see a need for namespace here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned, machine-classes going forward would be expected to have specific details towards machines and in-turn clusters, and also sensitive information of underlying infra. Separating machine-classes via namespaces will help to avoid unintended consumers of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tend to agree with @hardikdr on keeping machine class as namespaced object.
// parameters is 512, with a cumulative max size of 256K. | ||
// TODO(roberthbailey): Should this be a json-patch? | ||
// +optional | ||
Parameters map[string]string `json:parameters,omitempty` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
krousey: What kinds of things is this meant to override? Capacity? Allocatable? Or stuff in the provider config? All of these seem like they could be deeply nested structures that a map would not be able to fully express.
If ProviderConfig, perhaps we consider json patches?
roberthbailey: My thinking was in the provider config (e.g. parameterize the zone of a class to stamp out identical machine sets in multiple zones).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should leave this out for now. If we find a need for it, we can figure out how to add it later (templating vs. overlays will be contentious). But until there is a concrete need, less is more.
// this field is consistent with the underlying machine that will | ||
// be provisioned when this class is used, to inform higher level | ||
// automation (e.g. the cluster autoscaler). | ||
Allocatable corev1.ResourceList `json:"allocatable"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
krousey: From what I understand, this is the capacity minus the amount that kubelet is going to reserve on the node. I don't think you can know what that reserve is going to be just given a machine class. I know for GKE, we vary the reserve for each release, and possibly by machine size.
Perhaps the best way to represent this would be to officially model the Kubelet reserved resources in the MachineSpec (and therefore MachineSet and MachineDeployment). If we did that, we could drop Allocatable here, and autoscalers of deployments could pick the capacity from machine class and subtract reserved from machine spec.
mvladev: I agree with @krousey. There is no way to calculate in advance the allocated resources - the Machine class doesn't have a knowledge of which kubelet version / container runtime you are going to use and those can affect the kubelet's --kube-reserved and --system-reserved flags.
roberthbailey: The intent is that this would indicate to the cluster autoscaler how much actual capacity would exist once the "node allocatable" overhead was subtracted. The fact that the overhead varies by version makes putting this variable here... difficult, since it would tightly couple a machineclass to a specific k8s version if you wanted to adhere to the warning text.
@krousey - Is your suggestion of putting reserved in the machine spec to put it next to the kubelet version, since it is tightly coupled with it?
// TODO: should this use an api.ObjectReference to a 'MachineTemplate' instead? | ||
// A link to the MachineTemplate that will be used to create provider | ||
// specific configuration for Machines of this class. | ||
// MachineTemplate corev1.ObjectReference `json:machineTemplate` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mladev: Why is this needed?
roberthbailey: I added this comment as an alternate design -- one thing we'd thought about was splitting this data across two objects (class + template) and the template could potentially be used directly by machine in addition to machine class. Not sure if that's valuable or not though, so I put this here to foster discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for keeping capacity
and allocatable
as MahcineStatus
fields. cluster-api controller would initialize these fields once linking with the kubelet node is established. These are "observed" values and mary vary from machine to machine due to some discovery failure or hw failure etc. Therefore should be updated in MachineStatus whatever is actually observed at kubelet node.
+1 for keeping taints in MachineClassSpec. Taints are generally used for scheduling domain isolation and therefore generally applies to a bunch of machines.
/ok-to-test |
13eedd0
to
dba6f7a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth waiting to merge this until the kubebuilder stuff lands. I know it will mean a significant rebase, but I think the important thing is to agree on the initial api types. Hopefully putting those into the kubebuilder framework will be straightforward.
@@ -39,5 +42,25 @@ type ProviderConfig struct { | |||
// ProviderConfigSource represents a source for the provider-specific | |||
// resource configuration. | |||
type ProviderConfigSource struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @mvladev - additional thoughts on this?
// parameters is 512, with a cumulative max size of 256K. | ||
// TODO(roberthbailey): Should this be a json-patch? | ||
// +optional | ||
Parameters map[string]string `json:parameters,omitempty` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should leave this out for now. If we find a need for it, we can figure out how to add it later (templating vs. overlays will be contentious). But until there is a concrete need, less is more.
// MachineClassRef is a reference to the MachineClass object. Controllers should find the right MachineClass using this reference. | ||
type MachineClassRef struct { | ||
// +optional | ||
metav1.ObjectMeta `json:"metadata,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there a "reference" type that we should use instead of objectmeta? Something like LocalReference or ObjectReference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my bad, made it objectreference.
Also removed the Parameters
for now.
// across multiple Machines / MachineSets / MachineDeployments. | ||
// +k8s:openapi-gen=true | ||
// +resource:path=machineclasses | ||
type MachineClass struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we leave out both allocatable and capacity, then machine class == provider config. I think it's important that we address @mvladev's comment above about whether we should have both of those fields on all machines (via an inline class) or only available to the autoscaler on machines that were created via a reference to a machine class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I believe these decisions would be easier to take once requirements from cluster-autosclaer requirements are clearer. Shall we evolve the APIs for such fields as and when needed?
See: Initial work has started here kubernetes/community#2653
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if allocatable
and capacity
is something which should be put in the status of a machine. How can one define allocatable
?
At the moment this is calculated by summing kube-reserved
, system-reserved
and eviction-thresholds
https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are going to chat with some folks that work on the autoscaler tomorrow; maybe that will help clarify what is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code on how allocatable is calculated is here https://github.com/kubernetes/kubernetes/blob/426ef9d349bb3a277c3e8826fb772b6bdb008382/pkg/kubelet/cm/node_container_manager.go#L174:33
We could assume a reasonable margin so allocatable is set based on the given capacity when creating a new machineClass. This would be used for the case of scaling out from zero, otherwise the info from running machines/nodes would be used.
Also should taints belongs to the machineClass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems strange to have allocatable and capacity part of machineclass. Is this defining how many pods we're going to allow to run on this machine? How can we know that without knowing the size of OS, and additional software stack? Shouldn't we let the control plane calculate that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these fields are expected to be available only after certain calculation, but autoscaler probably would be asking for it from a machine-api stack. We are in talks with autoscaler folks and try to decide right place/design, and hence I have intentionally kept those fields out for the first cut.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@enxebre I am not sure about the specific requirements around taints, but I would expect Taints to be available rather at MachineObjects than at class. Class should be seen more for representation of set of machines under MachineDeployment/Set, where subset of them could or could not have taints, which could be best identified by taints on specific MachineObjects.
sample/machineclass.yaml
Outdated
metadata: | ||
name: small | ||
namespace: foo | ||
capacity: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leave out capacity / allocatable since those are currently commented out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for pointing out, I missed it.
sample/machineclass.yaml
Outdated
apiVersion: "gceproviderconfig/v1alpha1" | ||
kind: "GCEProviderConfig" | ||
project: "$GCLOUD_PROJECT" | ||
zone: "${ZONE:-us-central1-f}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we are going to remove the parameters part, make this just us-central1-f
instead of my bash-style substitution syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
dba6f7a
to
9277250
Compare
|
||
// Provider is the name of the cloud-provider which MachineClass is intended for. | ||
// +optional | ||
Provider string `json:"provider,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this is needed as the MachineClass
should contain all the information for that specific machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mvladev - where does the machine class contain that info? Right now it's contained in this string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume Provider is similar to StorageClass' "provisioner"? If so, shouldn't we also have a name field for the MachineClass? Or is that already built in because of corev1.ObjectReference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, ObjectReference should provide us with the name field in it.
// across multiple Machines / MachineSets / MachineDeployments. | ||
// +k8s:openapi-gen=true | ||
// +resource:path=machineclasses | ||
type MachineClass struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if allocatable
and capacity
is something which should be put in the status of a machine. How can one define allocatable
?
At the moment this is calculated by summing kube-reserved
, system-reserved
and eviction-thresholds
https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable
// TODO(roberthbailey): Fill these in later | ||
// The machine class from which the provider config should be sourced. | ||
// +optional | ||
MachineClass *MachineClassRef `json:machineClass,omitempty` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to point out I believe this is different from the way StorageClass and RuntimeClass handles ref. I guess it's ok since we're not handling the class the same way as them. They use just a string. Their controller matches the class' string name to match up the provider. So if we were following the same pattern, we would have some controller that matched up the string with a provider known to the controller. It's been a few months since I looked at this, but I believe that is the pattern.
providerConfig: | ||
valueFrom: | ||
machineClass: | ||
provider: gcp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is an example, but can provider be defined in the machine class object above? With the way it was defined in the code above, can provider be provided in either machine class or machine objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have kept provider
at machine-object level, keeping in mind the usecase where we might not want to fetch the complete MachineClass when MachineObject is seen/updated. In the future iterations, we can think if it's needed at any other layers as well to be populated explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the usecase where we might not want to fetch the complete MachineClass when MachineObject is seen/updated.
I am struggling to understand this use case. Who is "we"? autoscaler?
If it is intended that provider should use default provider config for this machine object and should not refer machine class, may be following will make more sense:
spec:
providerConfig:
provider: gcp
valueFrom:
machineClass:
name: ""
namepsace: ""
i.e if machine class is mentioned it will be refereed otherwise should be left blank for default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inline providerConfig will still work as it is currently, classes will only be a parallel mechanism for users to inject the providerConfig via reference. Above comment was explicitly for provider
field at MachineObject layer, and not the existence of classes as such :)
Trying to compare machine classes with storage classes. Motivation behind storage classes was the portability across different storage providers. Wondering what is the motivation behind machine class? Can you please link me to a user story that how autoscaler will consume machine classes? |
@vikaschoudhary16 Though we learn from storage-class, we may have our divergent-points, as underlying usecase is different, for instance - namespaced machineclasses. |
I think it is a common pattern in kubernetes projects to have a separate commit for generated code. This helps reviewers to focus on the actual changes. If not so late already, you might want to rearrange commits to do so. |
@hardikdr - this needs a rebase (for the generated file). |
9277250
to
28568d9
Compare
@roberthbailey I updated the commits to align with both comment from @vikaschoudhary16 and kubebuilder pr, can you please have a look at the changes made. |
28568d9
to
a68fc1e
Compare
/lgtm This has been waiting for comments for longe enough. Let's get it in and then we can tweak it going forward if we need to (since the api is still alpha). |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: hardikdr, roberthbailey The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it: Adding APIs for Machine-class. Machineclass is a way of externalizing the ProviderConfig from MachineSpec using reference.
This PR is basically re-initiating the work done here: kubernetes-retired/kube-deploy#659
Which issue(s) this PR fixes : This PR partially resolves- #22 .
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
Release note: