This describes a setup where the following Cluster API core components and Metal3-io components are deployed:
- Cluster API manager
- Cluster API Bootstrap Provider Kubeadm (CABPK) manager
- Cluster API Kubeadm Control Plane manager
- Baremetal Operator (including the Ironic setup)
- Cluster API Provider Metal3 (CAPM3)
The BareMetalHost is an object from baremetal-operator. Each CR represents a physical host, with its BMC credentials, hardware status, etc.
BareMetalHost exposes the following fields, which are secret references:
- userData: a cloud-init user-data in a secret, under the key userData
- metaData: a cloud-init metadata in a secret, under the key metaData
- networkData: a cloud-init network data in a secret, under the key networkData
For the metaData, some values are set by default to maintain compatibility:
- uuid: the BareMetalHost UID
- metal3-namespace: the namespace of the BareMetalHost
- metal3-name: the name of the BareMetalHost
- local-hostname: the name of the BareMetalHost
- local_hostname: the name of the BareMetalHost
However, setting any of those values in the metaData secret will override those default values.
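The override behaviour can be pictured as a dictionary merge in which user-provided values win. The following is a minimal sketch, not the baremetal-operator implementation: the function name `build_meta_data` and the sample values are hypothetical, and it assumes that metal3-namespace carries the namespace while the hostname keys carry the name of the BareMetalHost.

```python
def build_meta_data(bmh_name, bmh_namespace, bmh_uid, user_meta_data=None):
    """Merge the default metaData values with user-provided ones (user wins)."""
    defaults = {
        "uuid": bmh_uid,
        "metal3-namespace": bmh_namespace,
        "metal3-name": bmh_name,
        "local-hostname": bmh_name,
        "local_hostname": bmh_name,
    }
    # Values coming from the metaData secret override the defaults.
    return {**defaults, **(user_meta_data or {})}


meta = build_meta_data("node-0", "metal3", "68be298f", {"local-hostname": "custom"})
print(meta["local-hostname"])  # custom
print(meta["metal3-name"])     # node-0
```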
In CAPM3 API version v1alpha4 (which has been removed from the main branch but
still exists in previous release branches) and onwards, it is possible to mark
a BareMetalHost object as unhealthy by adding the annotation
capi.metal3.io/unhealthy. This annotation prevents CAPM3 from selecting the
unhealthy BareMetalHost for a newly created Metal3Machine. Removing the
annotation re-enables normal operations.
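The effect of the annotation on host selection can be sketched as a simple filter. This is an illustration only, not CAPM3 code: hosts are modelled as plain dictionaries and `selectable_hosts` is a hypothetical helper.

```python
UNHEALTHY_ANNOTATION = "capi.metal3.io/unhealthy"


def selectable_hosts(hosts):
    """Keep only the hosts that do not carry the unhealthy annotation."""
    healthy = []
    for host in hosts:
        annotations = host.get("metadata", {}).get("annotations") or {}
        # Presence of the annotation key is enough; its value is not inspected here.
        if UNHEALTHY_ANNOTATION not in annotations:
            healthy.append(host)
    return healthy


hosts = [
    {"metadata": {"name": "node-0", "annotations": {UNHEALTHY_ANNOTATION: ""}}},
    {"metadata": {"name": "node-1"}},
]
print([h["metadata"]["name"] for h in selectable_hosts(hosts)])  # ['node-1']
```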
A Cluster is a Cluster API core object representing a Kubernetes cluster.
Example cluster:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: cluster
  namespace: metal3
spec:
  clusterNetwork:
    services:
      cidrBlocks:
        - 10.96.0.0/12
    pods:
      cidrBlocks:
        - 192.168.0.0/18
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: Metal3Cluster
    name: m3cluster
    namespace: metal3
  controlPlaneRef:
    kind: KubeadmControlPlane
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    name: m3cluster-controlplane
    namespace: metal3
The Metal3Cluster object contains information related to the deployment of the cluster on bare metal. It currently has two specification fields:
- controlPlaneEndpoint: contains the target cluster API server address and port
- noCloudProvider: (true/false) whether the cluster is deployed without an external cloud provider. If set to true, CAPM3 patches the target cluster Node objects to add a providerID. This allows the CAPI process to continue even if the cluster is deployed without a cloud provider.
Example Metal3Cluster:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3Cluster
metadata:
  name: m3cluster
  namespace: metal3
spec:
  controlPlaneEndpoint:
    host: 192.168.111.249
    port: 6443
  noCloudProvider: true
The KubeadmControlPlane object contains all information related to the control plane configuration. It references an infrastructureTemplate that must be a Metal3MachineTemplate in this case.
For example:
kind: KubeadmControlPlane
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
metadata:
  name: m3cluster-controlplane
  namespace: metal3
spec:
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: Metal3MachineTemplate
      name: m3cluster-controlplane
      namespace: metal3
    nodeDrainTimeout: 0s
  replicas: 3
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
    type: RollingUpdate
  version: v1.31.2
  kubeadmConfigSpec:
    joinConfiguration:
      controlPlane: {}
      nodeRegistration:
        name: "{{ ds.meta_data.name }}"
        kubeletExtraArgs:
          node-labels: "metal3.io/uuid={{ ds.meta_data.uuid }}"
    initConfiguration:
      nodeRegistration:
        name: "{{ ds.meta_data.name }}"
        kubeletExtraArgs:
          node-labels: "metal3.io/uuid={{ ds.meta_data.uuid }}"
The KubeadmConfig object is for CABPK. It contains the node Kubeadm configuration and additional commands to run on the node for the setup.
In order to deploy Kubernetes successfully, you need to know the cluster API server address before deployment. However, if you are deploying an HA cluster, or deploying without static IP addresses, the API server address is unknown. One way to work around the problem is to deploy Keepalived, which lets you set up a virtual IP that is defined beforehand and shared by the nodes. Hence, the commands to set up Keepalived have to run before kubeadm.
The content of a KubeadmConfig can contain Jinja2 template elements, since
cloud-init renders the cloud-config as a Jinja2 template. Metadata from
cloud-init can be used with the following syntax: {{ ds.meta_data.<key> }}.
The keys and values are passed to cloud-init through a Metal3DataTemplate
object (see below).
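To make the substitution concrete, the sketch below mimics how a {{ ds.meta_data.<key> }} placeholder is replaced by a metadata value. It is a simplified stand-in written with a regular expression, not cloud-init or Jinja2 code; the `render` helper and the sample metadata are hypothetical.

```python
import re

# Matches placeholders such as {{ ds.meta_data.uuid }}.
PLACEHOLDER = re.compile(r"\{\{\s*ds\.meta_data\.(\w+)\s*\}\}")


def render(template, meta_data):
    """Substitute ds.meta_data.<key> placeholders with values from meta_data."""
    # A missing key raises KeyError, which makes template mistakes visible.
    return PLACEHOLDER.sub(lambda m: str(meta_data[m.group(1)]), template)


meta_data = {"name": "controlplane-0", "uuid": "68be298f-ed11-439e-9d51-6c5260faede6"}
print(render("metal3.io/uuid={{ ds.meta_data.uuid }}", meta_data))
```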
Example KubeadmConfig:
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfig
metadata:
  name: controlplane-0
  namespace: metal3
spec:
  initConfiguration:
    nodeRegistration:
      name: "{{ ds.meta_data.name }}"
      kubeletExtraArgs:
        node-labels: "metal3.io/uuid={{ ds.meta_data.uuid }}"
  preKubeadmCommands:
    - netplan apply
    - systemctl enable --now crio kubelet
    - if (curl -sk --max-time 10 https://192.168.111.249:6443/healthz); then
      echo "keepalived already running";else systemctl start keepalived; fi
    - systemctl link /lib/systemd/system/monitor.keepalived.service
    - systemctl enable monitor.keepalived.service
    - systemctl start monitor.keepalived.service
  postKubeadmCommands:
    - mkdir -p /home/metal3/.kube
    - chown metal3:metal3 /home/metal3/.kube
    - cp /etc/kubernetes/admin.conf /home/metal3/.kube/config
    - systemctl enable --now keepalived
    - chown metal3:metal3 /home/metal3/.kube/config
  files:
    - path: /etc/keepalived/keepalived.conf
      content: |
        ! Configuration File for keepalived
        global_defs {
            notification_email {
                sysadmin@example.com
                support@example.com
            }
            notification_email_from lb@example.com
            smtp_server localhost
            smtp_connect_timeout 30
        }
        vrrp_instance VI_2 {
            state MASTER
            interface enp2s0
            virtual_router_id 2
            priority 101
            advert_int 1
            virtual_ipaddress {
                192.168.111.249
            }
        }
A Machine is a Cluster API core object representing a Kubernetes node. A Machine has a reference to a KubeadmConfig and a reference to a Metal3Machine.
Example Machine:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Machine
metadata:
  name: controlplane-0
  namespace: metal3
  labels:
    cluster.x-k8s.io/control-plane: "true"
    cluster.x-k8s.io/cluster-name: "cluster"
spec:
  bootstrap:
    configRef:
      apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
      kind: KubeadmConfig
      name: controlplane-0
      namespace: metal3
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: Metal3Machine
    name: controlplane-0
    namespace: metal3
  nodeDrainTimeout: 0s
  providerID: metal3://68be298f-ed11-439e-9d51-6c5260faede6
  version: v1.31.2
The Metal3Machine contains information related to the deployment of the BareMetalHost, such as the image and the host selector. For each Machine, there must be a Metal3Machine.
The fields are:
- image -- This includes two sub-fields, url and checksum, which contain the URL to the image and the URL to a checksum for that image. These fields are required. The image will be used for provisioning of the BareMetalHost chosen by the Machine actuator.
- userData -- This includes two sub-fields, name and namespace, which reference a Secret that contains base64-encoded user-data to be written to a config drive on the provisioned BareMetalHost. This field is optional and is automatically set by CAPM3 with the userData from the Machine object. If you want to overwrite the userData, this should be done in the CAPI Machine.
- dataTemplate -- This is a reference to a Metal3DataTemplate object containing the metadata and network data templates. It includes two fields, name and namespace.
- metaData -- This is a reference to a secret containing the metadata rendered automatically from the Metal3DataTemplate metadata template. If the secret is not managed by the Metal3DataTemplate controller, for example if it is provided by the user, the owner reference should be set properly to ensure that the secret belongs to the cluster ownerReference tree (see doc).
- networkData -- This is a reference to a secret containing the network data rendered automatically from the Metal3DataTemplate network data template. If the secret is not managed by the Metal3DataTemplate controller, for example if it is provided by the user, the owner reference should be set properly to ensure that the secret belongs to the cluster ownerReference tree (see doc). The content of the secret should be a YAML equivalent of a JSON object that follows the format definition that can be found here.
- hostSelector -- Specify criteria for matching labels on BareMetalHost objects. This can be used to limit the set of available BareMetalHost objects chosen for this Machine.
- automatedCleaningMode -- An interface to enable or disable Ironic automated cleaning during provisioning or deprovisioning of a host. When set to disabled, automated cleaning is skipped, whereas the metadata value enables it. It is recommended to tune the cleaning via the Metal3MachineTemplate rather than the Metal3Machine. When the spec.template.spec.automatedCleaningMode field of the Metal3MachineTemplate is updated, the Metal3MachineTemplate controller will update all the Metal3Machines generated from that Metal3MachineTemplate, and eventually the BareMetalHosts, with the same value.
The metaData and networkData fields in the spec section allow the user to
directly provide a secret to use as metaData or networkData. The userData,
metaData and networkData fields in the status section are for the controller
to store the reference to the secret that is actually being used, whether it
comes from one of the spec fields or is generated. This is aimed at making a
clear difference between the desired state from the user (whether it is with a
DataTemplate reference, or direct metaData or userData secrets) and what the
controller is actually using.
The dataTemplate field consists of an object reference to a Metal3DataTemplate
object containing the templates for the metadata and network data generation
for this Metal3Machine. The renderedData field is a reference to the Metal3Data
object created for this machine. If the dataTemplate field is set but any of
the renderedData, metaData or networkData fields in the status are unset, then
the Metal3Machine controller will wait until it can find the Metal3Data object
and the rendered secrets. It will then populate those fields.
When the CAPM3 controller sets the different fields in the BareMetalHost, it
references the metadata secret and the network data secret in the
BareMetalHost. If any of the metaData or networkData status fields are unset,
that field will also remain unset on the BareMetalHost.
When the Metal3Machine gets deleted, the CAPM3 controller will remove its owner reference from the data template object. This will trigger the deletion of the generated Metal3Data object and the secrets generated for this machine.
The hostSelector field has two possible optional sub-fields:
- matchLabels -- Key/value pairs of labels that must match exactly.
- matchExpressions -- A set of expressions that must evaluate to true for the labels on a BareMetalHost.
Valid operators include:
- ! -- Key does not exist. Values ignored.
- = -- Key equals specified value. There must only be one value specified.
- == -- Key equals specified value. There must only be one value specified.
- in -- Value is a member of a set of possible values
- != -- Key does not equal the specified value. There must only be one value specified.
- notin -- Value not a member of the specified set of values.
- exists -- Key exists. Values ignored.
- gt -- Value is greater than the one specified. Value must be an integer.
- lt -- Value is less than the one specified. Value must be an integer.
Example 1: Only consider a BareMetalHost with label key1 set to value1.
spec:
  providerSpec:
    value:
      hostSelector:
        matchLabels:
          key1: value1
Example 2: Only consider a BareMetalHost with both key1 set to value1 AND key2 set to value2.
spec:
  providerSpec:
    value:
      hostSelector:
        matchLabels:
          key1: value1
          key2: value2
Example 3: Only consider a BareMetalHost with key3 set to either a, b, or c.
spec:
  providerSpec:
    value:
      hostSelector:
        matchExpressions:
          - key: key3
            operator: in
            values: ['a', 'b', 'c']
Example 4: Only consider a BareMetalHost with key1 set to value1 AND key2 set to value2 AND key3 set to either a, b, or c.
spec:
  providerSpec:
    value:
      hostSelector:
        matchLabels:
          key1: value1
          key2: value2
        matchExpressions:
          - key: key3
            operator: in
            values: ['a', 'b', 'c']
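The matching rules can be sketched as follows. This is an illustrative sketch rather than the CAPM3 implementation: `matches_expression` and `host_matches` are hypothetical helpers, and the handling of absent keys is assumed to follow the usual Kubernetes label selector semantics.

```python
def matches_expression(labels, expr):
    """Evaluate one matchExpressions entry against a host's labels."""
    key, op = expr["key"], expr["operator"].lower()
    values = expr.get("values", [])
    if op == "exists":
        return key in labels
    if op == "!":
        return key not in labels
    if op in ("=", "=="):
        return labels.get(key) == values[0]
    if op == "!=":
        return labels.get(key) != values[0]
    if op == "in":
        return labels.get(key) in values
    if op == "notin":
        return labels.get(key) not in values
    if op in ("gt", "lt"):
        try:
            actual, wanted = int(labels[key]), int(values[0])
        except (KeyError, ValueError):
            return False  # missing key or non-integer value never matches
        return actual > wanted if op == "gt" else actual < wanted
    raise ValueError(f"unsupported operator: {op}")


def host_matches(labels, selector):
    """A host matches when all matchLabels and all matchExpressions hold."""
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    return all(matches_expression(labels, e)
               for e in selector.get("matchExpressions", []))


selector = {
    "matchLabels": {"key1": "value1", "key2": "value2"},
    "matchExpressions": [{"key": "key3", "operator": "in", "values": ["a", "b", "c"]}],
}
print(host_matches({"key1": "value1", "key2": "value2", "key3": "b"}, selector))  # True
```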
Example Metal3Machine:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3Machine
metadata:
  name: controlplane-0
  namespace: metal3
spec:
  automatedCleaningMode: metadata
  image:
    checksum: http://172.22.0.1/images/UBUNTU_22.04_NODE_IMAGE_K8S_v1.31.2-raw.img.sha256sum
    checksumType: sha256
    format: raw
    url: http://172.22.0.1/images/UBUNTU_22.04_NODE_IMAGE_K8S_v1.31.2-raw.img
  hostSelector:
    matchLabels:
      key1: value1
    matchExpressions:
      - key: key2
        operator: in
        values: ['abc', '123', 'value2']
  dataTemplate:
    name: controlplane-metadata
  metaData:
    name: controlplane-0-metadata-0
MachineDeployment is a core Cluster API object that is similar to a Deployment for pods. It refers to a KubeadmConfigTemplate and to a Metal3MachineTemplate.
Example MachineDeployment:
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: md-0
  namespace: metal3
  labels:
    cluster.x-k8s.io/cluster-name: cluster
    nodepool: nodepool-0
spec:
  clusterName: cluster
  replicas: 1
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: cluster
      nodepool: nodepool-0
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: cluster
        nodepool: nodepool-0
    spec:
      bootstrap:
        configRef:
          name: md-0
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
      infrastructureRef:
        name: md-0
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: Metal3MachineTemplate
      version: v1.31.2
The KubeadmConfigTemplate contains a template used to generate KubeadmConfig objects.
Example KubeadmConfigTemplate:
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: controlplane-0
  namespace: metal3
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          name: "{{ ds.meta_data.name }}"
          kubeletExtraArgs:
            node-labels: "metal3.io/uuid={{ ds.meta_data.uuid }}"
      preKubeadmCommands:
        - netplan apply
        - systemctl enable --now crio kubelet
The Metal3MachineTemplate contains the following two specification fields:
- nodeReuse: (true/false) whether the same pool of BareMetalHosts is reused during upgrade/remediation operations. It is set to false by default. If set to true, the CAPM3 Machine controller picks, for the next provisioning phase, the same pool of BareMetalHosts that were released during the upgrade/remediation.
- template: a template containing the data needed to create a Metal3Machine.
Node reuse can be desirable in scenarios such as upgrade or node remediation. For example, the same pool of hosts may need to be used after a cluster upgrade, without losing any data on secondary storage. To achieve that:
- The spec.nodeReuse field of the Metal3MachineTemplate must be set to True. This indicates that the same hosts should be reused after the upgrade or, to be exact, that the same BareMetalHosts should be provisioned.
- The spec.template.spec.automatedCleaningMode field of the Metal3MachineTemplate must be set to disabled. This indicates that secondary/hosted storage data should persist even after the upgrade.
These field changes need to be made before you start upgrading your cluster.
When the spec.nodeReuse field of the Metal3MachineTemplate is set to True, the CAPM3 Machine controller:
- Sets the infrastructure.cluster.x-k8s.io/node-reuse label to the corresponding CAPI object name (a controlplane.cluster.x-k8s.io object such as KubeadmControlPlane, or a MachineDeployment) on the BareMetalHost during deprovisioning;
- Selects, during the next provisioning, the BareMetalHost that carries the infrastructure.cluster.x-k8s.io/node-reuse label with the exact same CAPI object name set in the previous step.
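The two steps can be sketched as a label write followed by a label lookup. This is a simplified illustration, not the CAPM3 Machine controller: hosts are modelled as a mapping from host name to labels, and `mark_for_reuse` and `pick_reusable_host` are hypothetical names.

```python
NODE_REUSE_LABEL = "infrastructure.cluster.x-k8s.io/node-reuse"


def mark_for_reuse(host_labels, owner_name):
    """Deprovisioning: record the owning CAPI object name on the host."""
    return {**host_labels, NODE_REUSE_LABEL: owner_name}


def pick_reusable_host(hosts, owner_name):
    """Provisioning: return the name of a host labelled for this owner, if any."""
    for name, labels in hosts.items():
        if labels.get(NODE_REUSE_LABEL) == owner_name:
            return name
    return None


hosts = {"node-0": {}, "node-1": {}}
hosts["node-1"] = mark_for_reuse(hosts["node-1"], "m3cluster-controlplane")
print(pick_reusable_host(hosts, "m3cluster-controlplane"))  # node-1
```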
Example Metal3MachineTemplate:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
  name: m3mt-0
  namespace: metal3
spec:
  nodeReuse: false
  template:
    spec:
      automatedCleaningMode: metadata
      image:
        checksum: http://172.22.0.1/images/UBUNTU_22.04_NODE_IMAGE_K8S_v1.31.2-raw.img.sha256sum
        checksumType: sha256
        format: raw
        url: http://172.22.0.1/images/UBUNTU_22.04_NODE_IMAGE_K8S_v1.31.2-raw.img
      hostSelector:
        matchLabels:
          key1: value1
        matchExpressions:
          - key: key2
            operator: in
            values: ['abc', '123', 'value2']
      dataTemplate:
        name: m3mt-0-metadata
Example Metal3DataTemplate:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3DataTemplate
metadata:
  name: nodepool-1
  namespace: default
  ownerReferences:
    - apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      controller: true
      kind: Metal3Cluster
      name: cluster-1
spec:
  templateReference: old-template
  metaData:
    strings:
      - key: abc
        value: def
    objectNames:
      - key: name_m3m
        object: metal3machine
      - key: name_machine
        object: machine
      - key: name_bmh
        object: baremetalhost
    indexes:
      - key: index
        offset: 0
        step: 1
    ipAddressesFromIPPool:
      - key: ip
        name: pool-1
    prefixesFromIPPool:
      - key: ip
        name: pool-1
    gatewaysFromIPPool:
      - key: gateway
        name: pool-1
    dnsServersFromIPPool:
      - key: dns
        name: pool-1
    fromHostInterfaces:
      - key: mac
        interface: "eth0"
    fromLabels:
      - key: label-1
        object: machine
        label: mylabelkey
    fromAnnotations:
      - key: annotation-1
        object: machine
        annotation: myannotationkey
  networkData:
    links:
      ethernets:
        - type: "phy"
          id: "enp1s0"
          mtu: 1500
          macAddress:
            fromAnnotation:
              object: machine
              annotation: primary-mac
        - type: "phy"
          id: "enp2s0"
          mtu: 1500
          macAddress:
            fromHostInterface: "eth1"
      bonds:
        - id: "bond0"
          mtu: 1500
          macAddress:
            string: "XX:XX:XX:XX:XX:XX"
          bondMode: "802.3ad"
          bondLinks:
            - enp1s0
            - enp2s0
      vlans:
        - id: "vlan1"
          mtu: 1500
          macAddress:
            string: "YY:YY:YY:YY:YY:YY"
          vlanID: 1
          vlanLink: bond0
    networks:
      ipv4DHCP:
        - id: "provisioning"
          link: "bond0"
      ipv4:
        - id: "Baremetal"
          link: "vlan1"
          ipAddressFromIPPool: pool-1
          routes:
            - network: "0.0.0.0"
              netmask: 0
              gateway:
                fromIPPool: pool-1
              services:
                dns:
                  - "8.8.4.4"
                dnsFromIPPool: pool-1
      ipv6DHCP:
        - id: "provisioning6"
          link: "bond0"
      ipv6SLAAC:
        - id: "provisioning6slaac"
          link: "bond0"
      ipv6:
        - id: "Baremetal6"
          link: "vlan1"
          ipAddressFromIPPool: pool6-1
          routes:
            - network: "0::0"
              netmask: 0
              gateway:
                string: "2001:0db8:85a3::8a2e:0370:1"
              services:
                dns:
                  - "2001:4860:4860::8844"
                dnsFromIPPool: pool6-1
    services:
      dns:
        - "8.8.8.8"
        - "2001:4860:4860::8888"
status:
  indexes:
    "0": "machine-1"
  dataNames:
    "machine-1": nodepool-1-0
  lastUpdated: "2020-04-02T06:36:09Z"
This object will be reconciled by its own controller. When reconciled, the
controller will add a label pointing to the Metal3Cluster that has nodes linking
to this object. The spec contains a metaData and a networkData field that
contain a template of the values that will be rendered for all nodes.
The metaData field will be rendered into a map of strings in YAML format,
while networkData will be rendered into a map equivalent of
Nova network_data.json.
On the target node, the network data will be rendered as a JSON object that
follows the format definition that can be found here.
The metaData field contains a list of items that will render data in different
ways. The following types of objects are available and accept lists:
- strings: renders the given string as value in the metadata. It takes a value attribute.
- objectNames: renders the name of the object that matches the type given. It takes an object attribute, containing the type of the object.
- indexes: renders the index of the current object, with the offset from the offset field and using the step from the step field. The following conditions must be met: offset >= 0 and step >= 1. If the step is unspecified (default value being 0), the controller will automatically change it to 1. The prefix and suffix attributes provide a prefix and a suffix for the rendered index.
- ipAddressesFromIPPool: renders an IP address from an IPPool object. The IPPool objects are defined in the IP Address manager repo
- prefixesFromIPPool: renders a network prefix from an IPPool object. The IPPool objects are defined in the IP Address manager repo
- gatewaysFromIPPool: renders a network gateway from an IPPool object. The IPPool objects are defined in the IP Address manager repo
- dnsServersFromIPPool: renders a DNS servers list from an IPPool object. The IPPool objects are defined in the IP Address manager repo
- fromHostInterfaces: renders the MAC address of the BareMetalHost interface that matches the name given in the interface attribute.
- fromLabels: renders the content of a label on an object, or an empty string if the label is absent. It takes an object attribute to specify the type of the object from which to fetch the label, and a label attribute that contains the label key.
- fromAnnotations: renders the content of an annotation on an object, or an empty string if the annotation is absent. It takes an object attribute to specify the type of the object from which to fetch the annotation, and an annotation attribute that contains the annotation key.
For each object, the attribute key is required.
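As an illustration of the indexes entry, the sketch below renders an index value from the node index, offset, step, prefix and suffix. The formula offset + index * step is an assumption about how the fields combine, and `render_index` is a hypothetical helper rather than controller code.

```python
def render_index(index, offset=0, step=1, prefix="", suffix=""):
    """Render the metadata value for an `indexes` entry (assumed formula)."""
    if offset < 0:
        raise ValueError("offset must be >= 0")
    if step == 0:
        step = 1  # an unspecified step (0) is changed to 1
    if step < 1:
        raise ValueError("step must be >= 1")
    return f"{prefix}{offset + index * step}{suffix}"


# Third node (index 2) of a template using offset 10 and step 5.
print(render_index(2, offset=10, step=5, prefix="node-"))  # node-20
```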
The networkData field will contain three items:
- links: a list of layer 2 interfaces
- networks: a list of layer 3 networks
- services: a list of services (DNS)
The object for the links section list can be:
- ethernets: a list of ethernet interfaces
- bonds: a list of bond interfaces
- vlans: a list of vlan interfaces
The links/ethernets objects contain the following:
- type: Type of the ethernet interface
- id: Interface name
- mtu: Interface MTU
- macAddress: an object to render the MAC Address
The links/ethernets/type can be one of:
- bridge
- dvs
- hw_veb
- hyperv
- ovs
- tap
- vhostuser
- vif
- phy
The links/ethernets/macAddress object can be one of:
- string: with the desired MAC address given as a string
- fromAnnotation: with the desired MAC address retrieved from an annotation. It takes an object attribute to specify the type of the object from which to fetch the annotation, and an annotation attribute that contains the annotation key.
- fromHostInterface: with the interface name from BareMetalHost hardware details.
The links/bonds object contains the following:
- id: Interface name
- mtu: Interface MTU
- macAddress: an object to render the MAC Address
- bondMode: The bond mode
- bondLinks: a list of links to use for the bond
The links/bonds/bondMode can be one of:
- 802.3ad
- balance-rr
- active-backup
- balance-xor
- broadcast
- balance-tlb
- balance-alb
The links/vlans object contains the following:
- id: Interface name
- mtu: Interface MTU
- macAddress: an object to render the MAC Address
- vlanID: The vlan ID
- vlanLink : The link on which to create the vlan
The object for the networks section can be:
- ipv4: a list of ipv4 static allocations
- ipv4DHCP: a list of ipv4 DHCP based allocations
- ipv6: a list of ipv6 static allocations
- ipv6DHCP: a list of ipv6 DHCP based allocations
- ipv6SLAAC: a list of ipv6 SLAAC based allocations
The networks/ipv4 object contains the following:
- id: the network name
- link: The name of the link to configure this network for
- ipAddressFromIPPool: renders an ip address from an IPPool object. The IPPool objects are defined in the IP Address manager repo
- routes: the list of route objects
Each entry of networks/ipv*/routes is a route object containing:
- network: the subnet to reach
- netmask: the mask of the subnet as an integer
- gateway: the gateway to use; it can be given either as a string in string or as an IPPool name in fromIPPool
- services: a services object, as defined later
The networks/ipv4DHCP object contains the following:
- id: the network name
- link: The name of the link to configure this network for
- routes: the list of route objects
The networks/ipv6 object contains the following:
- id: the network name
- link: The name of the link to configure this network for
- ipAddressFromIPPool: renders an ip address from an IPPool object. The IPPool objects are defined in the IP Address manager repo
- routes: the list of route objects
The networks/ipv6DHCP object contains the following:
- id: the network name
- link: The name of the link to configure this network for
- routes: the list of route objects
The networks/ipv6SLAAC object contains the following:
- id: the network name
- link: The name of the link to configure this network for
- routes: the list of route objects
The object for the services section can be:
- dns: a list of DNS server IP addresses
- dnsFromIPPool: the IPPool from which to fetch the DNS servers list
The metaData and networkData parts of the data template must be immutable: the BareMetalHost references the generated secrets, which are used at provisioning time, so the secrets cannot be updated if the node is to be reprovisioned in the exact same state. This means that updates have to be done by creating a new template and referencing it in the new/updated Metal3MachineTemplate.
The process to allow updating the metaData and networkData is then to create a
new Metal3DataTemplate and reference the new one in the Metal3MachineTemplate.
This requires the Metal3Data to be linked to both Metal3DataTemplates. This is
achieved using the templateReference field of the Metal3DataTemplate. When a
Metal3Data corresponding to the Metal3DataTemplate is created, the reconciler
will set its templateReference to the value set on the Metal3DataTemplate, if
any.
The Metal3Data objects are linked to a Metal3DataTemplate in three ways:
- They directly reference the template in the template field of the Metal3Data spec.
- They have the same templateReference key as the template.
- The template's templateReference matches the name of the template field of the spec of the Metal3Data.
The third way ensures backward compatibility with previous versions of the
implementation, in which there was no templateReference in Metal3Data objects.
Since the templateReference field is an addition to the existing API, the
default behaviour of the controller will be to list the Metal3Data objects by
matching their template field if the templateReference is left empty on the
template object. However, if the templateReference is set on the
Metal3DataTemplate object, but not on the Metal3Data object, then the controller
will match the templateReference field with the name of the template field of
the Metal3Data object. This is performed by setting the templateReference value
on a new Metal3DataTemplate object to the name of an old Metal3DataTemplate
object. This allows a transition from the old template object, whose Metal3Data
objects were created without templateReference set, to the new one, which uses
the templateReference.
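The three linking rules can be sketched as a single predicate. This is an illustration under simplifying assumptions, not the controller code: the Metal3Data spec is a plain dictionary and `data_belongs_to_template` is a hypothetical helper.

```python
def data_belongs_to_template(data_spec, template_name, template_ref=""):
    """Return True if a Metal3Data spec is linked to the given template."""
    data_template_name = data_spec.get("template", {}).get("name", "")
    data_ref = data_spec.get("templateReference", "")
    # 1. The Metal3Data references the template directly.
    if data_template_name == template_name:
        return True
    # 2. Both carry the same templateReference.
    if template_ref and data_ref == template_ref:
        return True
    # 3. Backward compatibility: the template's templateReference names the
    #    (old) template referenced by the Metal3Data.
    if template_ref and data_template_name == template_ref:
        return True
    return False


old_data = {"template": {"name": "old-template"}}
print(data_belongs_to_template(old_data, "nodepool-1", "old-template"))  # True
```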
A new object would be created, a Metal3DataClaim type.
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3DataClaim
metadata:
  name: machine-1
  namespace: default
  ownerReferences:
    - apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      controller: true
      kind: Metal3Machine
      name: machine-1
spec:
  template:
    name: nodepool-1
status:
  renderedData:
    name: nodepool-1-0
  errorMessage: ""
The Metal3DataClaim object will reference its target Metal3DataTemplate object. In its status, the renderedData would reference the Metal3Data object when it would be generated. In case of error, the errorMessage would contain a description of the error.
The output of the controller would be a Metal3Data object, one per node, linking to the Metal3DataTemplate object and the associated secrets.
The Metal3Data object would be:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3Data
metadata:
  name: nodepool-1-0
  namespace: default
  ownerReferences:
    - apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      controller: true
      kind: Metal3DataTemplate
      name: nodepool-1
spec:
  templateReference: old-template
  index: 0
  claim:
    name: machine-1
    namespace: metal3
  metaData:
    name: machine-1-metadata
    namespace: metal3
  networkData:
    name: machine-1-networkdata
    namespace: metal3
  template:
    name: test1-workers-template
    namespace: metal3
status:
  ready: true
  error: false
  errorMessage: ""
The Metal3Data will contain the index of this node, and links to the secrets generated and to the Metal3Machine using this Metal3Data object.
If the Metal3DataTemplate object is updated, the generated secrets will not be updated, to allow for reprovisioning of the nodes in the exact same state as they were initially provisioned. Hence, to do an update, it is necessary to do a rolling upgrade of all nodes.
The reconciliation of the Metal3DataTemplate object will also be triggered by
changes on Metal3Machines. In the case that a Metal3Machine gets modified, if
the dataTemplate references a Metal3DataTemplate, that Metal3DataClaim object
will be reconciled. There will be two cases:
- An already generated Metal3Data object exists for that Metal3DataClaim. If the reference is not in the Metal3DataClaim object, the reconciler will add it. The reconciler will also verify that the required secrets exist. If they do not, they will be created.
- If no Metal3Data exists for that Metal3DataClaim, then the reconciler will create one and fill the respective field with the secret name.
To create a Metal3Data object, the Metal3DataClaim controller will select an
index for that Metal3Machine. The selection happens by selecting the lowest
available index that is not in use. To do that, the controller will list all
existing Metal3Data objects linked to this Metal3DataTemplate to get the
unavailable indexes. The indexes always start from 0 and increment by 1. The
lowest available index is to be used next. The dataNames field contains the
map of Metal3Machine to Metal3Data and the indexes field contains the map of
allocated indexes and claims.
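The index selection described above amounts to finding the smallest non-negative integer that is not already in use. A minimal sketch, with `lowest_available_index` as a hypothetical helper name:

```python
def lowest_available_index(used_indexes):
    """Return the smallest index >= 0 that is not already allocated."""
    used = set(used_indexes)
    index = 0
    while index in used:
        index += 1
    return index


# Indexes 0, 1 and 3 are taken by existing Metal3Data objects.
print(lowest_available_index([0, 1, 3]))  # 2
```

Freed indexes are therefore reused before the sequence grows.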
Once the next lowest available index is found, it will create the Metal3Data
object. The name would be a concatenation of the Metal3DataTemplate name and
the index. Upon conflict, it will fetch the list again to consider the new list
of Metal3Data and try to create the new object with the new index. This will be
repeated until the new object is created successfully. Upon success, it will
render the content values and create the secrets containing the rendered data.
The controller will generate the content based on the metaData or networkData
field of the Metal3DataTemplate spec. The ready field in renderedData will
then be set accordingly. If any error happens during the rendering, an error
message will be added.
The name of the secret will be made of a prefix and the index. The Metal3Machine
object name will be used as the prefix. A -metadata- or -networkdata- will
be added between the prefix and the index.
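The naming scheme can be sketched as plain string formatting. The helper names are hypothetical, and the hyphen between the template name and the index is inferred from the nodepool-1-0 example rather than stated in the text.

```python
def metal3data_name(template_name, index):
    """Name of the Metal3Data object: template name followed by the index."""
    return f"{template_name}-{index}"


def secret_names(machine_name, index):
    """Names of the generated secrets: <Metal3Machine name>-<kind>-<index>."""
    return {
        "metaData": f"{machine_name}-metadata-{index}",
        "networkData": f"{machine_name}-networkdata-{index}",
    }


print(metal3data_name("nodepool-1", 0))            # nodepool-1-0
print(secret_names("controlplane-0", 0)["metaData"])  # controlplane-0-metadata-0
```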
In the case where the Metal3Machine is created without a dataTemplate value,
if the metaData or networkData fields are set (one or both), the
Metal3Machine reconciler will fetch the secret, set the status field and
directly start the provisioning of the BareMetalHost using the secrets if given.
If one of the secrets does not exist, the controller will wait to start the
provisioning of the BareMetalHost until it exists.
In the case where the Metal3Machine is created with a dataTemplate value, the
Metal3Machine reconciler will create a Metal3DataClaim for that object.
The Metal3DataClaim would then be reconciled, and its controller will create an index for this Metal3DataClaim if it does not exist yet, and create a Metal3Data object with the index. Upon success, it will set the ready field to true, and the renderedData field to reference the Metal3Data object.
The Metal3Data reconciler will then generate the secrets, based on the index,
the Metal3DataTemplate and the machine. Once created, it will set the status
field ready to True.
Once the Metal3Data object is ready, the Metal3Machine controller will fetch the secrets that have been created (one or both) and use them to start provisioning the BareMetalHost.
If the Metal3Machine object is created with a dataTemplate field set, but one
of metaData or networkData is also set in the spec, the user-provided secret
will override the template generation for that specific secret. That is, if the
user sets all three fields, the controller will use the user-provided secrets
for both.
This means that some hybrid scenarios are supported, where the user can
directly provide the metaData secret and let the controller render the
networkData secret through the Metal3DataTemplate object.
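The precedence between user-provided and rendered secrets can be sketched as follows. This is an illustration only: `resolve_secrets` is a hypothetical helper, and secrets are represented by their names.

```python
def resolve_secrets(spec, rendered):
    """Pick, per field, the user-provided secret if set, else the rendered one."""
    resolved = {}
    for field in ("metaData", "networkData"):
        # A secret given in the Metal3Machine spec overrides template rendering.
        resolved[field] = spec.get(field) or rendered.get(field)
    return resolved


# Hybrid scenario: the user gives metaData, the template renders networkData.
spec = {"metaData": "user-metadata"}
rendered = {"metaData": "machine-1-metadata-0", "networkData": "machine-1-networkdata-0"}
print(resolve_secrets(spec, rendered))
```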
You can find CR examples in the Metal3-io dev env project, in the template folder.