
Sync a set of labels placed on a BMH to the corresponding Kubernetes Node object #146

Closed · Arvinderpal opened this issue Oct 27, 2020 · 13 comments
Labels: kind/feature, lifecycle/stale

Arvinderpal commented Oct 27, 2020

There are use cases where certain information is inherently tied to the BMH but is also valuable to users, operators, and schedulers in the Kubernetes workload cluster. For example:

  1. As a user, I would like to place my workloads across hosts that are in different failure zones. For example, I may want replicas of a specific workload spread across different server racks or geographical locations.

  2. As a user, I would like to place my security-sensitive workloads on machines that meet specific security requirements. For example, certain hosts may be in a fortified zone within a data center, or certain hosts may have strong hardware-based security mechanisms in place (e.g. hardware attestation via a TPM).

In Kubernetes, labels on Node objects are the primary means of solving this problem. The ask here is for CAPM3 to synchronize a specific set of labels placed on a BMH object with labels on the corresponding Kubernetes Node running on that BMH. CAPM3 is already capable of mapping BMH<->Node, so a controller that keeps the labels in sync may be a straightforward addition. The synchronization would be limited to the set of labels matching a certain prefix; the user may specify, say, my-prefix.metal3.io/ as their prefix (e.g. via a command-line flag). Labels placed on the BMH that match this prefix would be synchronized with the labels on the Kubernetes Node object. For example:

apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: node-0
  labels:
    my-prefix.metal3.io/rack: xyz-123
    my-prefix.metal3.io/zone: security-level-0
...
---
apiVersion: v1
kind: Node
metadata:
  name: worker-node-0
  labels:
    my-prefix.metal3.io/rack: xyz-123
    my-prefix.metal3.io/zone: security-level-0
...
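
For illustration only, a minimal Go sketch of the prefix filtering this implies (the package and function names are hypothetical, not part of CAPM3):

package labelsync

import "strings"

// filterByPrefix returns only the labels whose keys begin with the
// user-configured prefix (e.g. "my-prefix.metal3.io/").
func filterByPrefix(labels map[string]string, prefix string) map[string]string {
	out := map[string]string{}
	for k, v := range labels {
		if strings.HasPrefix(k, prefix) {
			out[k] = v
		}
	}
	return out
}

A sync controller would apply a filter like this to the BMH's labels on every reconcile, leaving all other labels untouched.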

Proposal doc: https://docs.google.com/document/d/1qMCkggaLGQLHNPnEVGYjpabSrO-DTvSsr8uIW06W0ck/edit?usp=sharing

Related:
There is a related issue in the CAPI community, linked below. The proposal there is for CAPI to synchronize labels placed on MachineDeployment objects with the Kubernetes Nodes created from that deployment. While similar, the two proposals address different things. Notably, the proposed CAPI approach also uses prefixes to limit the scope of the synchronization.

CAPI: Support syncing a set of labels from MachineDeployment/MachineSet/Machine to Nodes

Arvinderpal commented

@maelk @dhellmann PTAL.

dhellmann commented

I like the idea of doing this. I'm not sure about using a command line flag to configure it, but we can work through that.

Is there any reason to limit which labels are copied? What would happen if we just take them all?

dhellmann added the kind/feature label Oct 27, 2020
Arvinderpal commented

Is there any reason to limit which labels are copied? What would happen if we just take them all?

The primary concern is that we may be stepping on labels that are managed by some other entity (this was brought up in the CAPI issue linked above). With the prefix approach, CAPM3 would own only the set of labels that match the prefix.

Also, the ask is for synchronization, not just a one-time copy from BMH to Node on creation: if a label (say my-prefix.metal3.io/rack) were accidentally removed from the Node, CAPM3 would reapply it; similarly, if the user added or removed a label on a BMH, CAPM3 would ensure the Node was updated appropriately.
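
As a rough sketch of those semantics in Go (assuming the same hypothetical helper package as above), the desired Node label set could be computed like this:

package labelsync

import "strings"

// syncLabels computes the desired label set for a Node: all of the
// Node's non-prefixed labels are kept untouched, while the prefixed
// labels are made to exactly mirror the BMH. A prefixed label removed
// from the Node is reapplied; one removed from the BMH is dropped.
func syncLabels(bmhLabels, nodeLabels map[string]string, prefix string) map[string]string {
	desired := map[string]string{}
	for k, v := range nodeLabels {
		if !strings.HasPrefix(k, prefix) {
			desired[k] = v // leave labels owned by others alone
		}
	}
	for k, v := range bmhLabels {
		if strings.HasPrefix(k, prefix) {
			desired[k] = v // CAPM3 owns everything under the prefix
		}
	}
	return desired
}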

maelk commented Oct 27, 2020

I also think this would be a very valuable feature for CAPM3. Would you mind starting a proposal on this?

Arvinderpal commented

If we have consensus that this is a good feature to have, then I can definitely write up a proposal.

dhellmann commented

Yes, I think we definitely want this. We've been looking at how to do something similar with the hardware-classification-controller output.

It would be ideal if this were a new controller in capm3, so we could import the code into capbm downstream. If not a controller, then maybe a library function we could invoke separately.
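
For illustration, such a controller might be wired up with controller-runtime roughly as follows; the reconciler type, its fields, and the BareMetalHost import path are all assumptions, not actual CAPM3 code:

package labelsync

import (
	"context"

	// The exact import path for the BareMetalHost types depends on the
	// baremetal-operator version in use; this one is an assumption.
	metal3v1alpha1 "github.com/metal3-io/baremetal-operator/apis/metal3.io/v1alpha1"
	ctrl "sigs.k8s.io/controller-runtime"
)

// LabelSyncReconciler is a hypothetical reconciler that watches
// BareMetalHost objects and mirrors prefixed labels onto Nodes.
type LabelSyncReconciler struct {
	Prefix string
	// A client for the management cluster and a way to resolve the
	// BMH's Node in the workload cluster would also live here.
}

func (r *LabelSyncReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Fetch the BMH, look up its Node, apply syncLabels (sketched in an
	// earlier comment), and patch the Node. Omitted for brevity.
	return ctrl.Result{}, nil
}

func (r *LabelSyncReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&metal3v1alpha1.BareMetalHost{}).
		Complete(r)
}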

Arvinderpal commented

Great. I'll put together a proposal.

dhellmann commented

@Arvinderpal if that document is ready for review, please move it to a pull request on this repository.

Arvinderpal commented

@dhellmann #149

metal3-io-bot commented

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

metal3-io-bot added the lifecycle/stale label Feb 7, 2021
metal3-io-bot commented

Stale issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle stale.

/close

metal3-io-bot commented

@metal3-io-bot: Closing this issue.

