Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🏃add draft of multi-tenancy proposal #809

Merged
merged 1 commit into from
Aug 3, 2020

Conversation

devigned
Copy link
Contributor

What this PR does / why we need it:

We need to provide the Cluster an identity to use for provisioning besides the default identity of the controller. This will enable the CAPZ controllers to work as a single instance with multiple tenants.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #586

see also: kubernetes-sigs/cluster-api-provider-aws#1674

Big thank you to @randomvariable and @andrewmyhre! I cribbed much of this document from the AWS provider proposal.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 20, 2020
@k8s-ci-robot k8s-ci-robot requested a review from juan-lee July 20, 2020 21:50
@k8s-ci-robot k8s-ci-robot added area/provider/azure Issues or PRs related to azure provider sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 20, 2020
the principal types. Upon receiving an update event, the controller will update lookup the key in
the cache and update the relevant `Authorizer`. This may be implemented as its own interface.
Mutexes will ensure in-flight updates are completed prior to SDK calls are made. This would require
changes to RBAC, and maintaining a watch on secrets of a specific type will require further
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maintaining a watch on secrets of a specific type will require further
investigation as to feasibility

this sounds fine to me, anything you're worried about? if there's any identifying metadata you can use it to write a filter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not, but the CAPA authors were. Not sure if I missed something... Wanted to leave it in for someone smarter than me to spot the hole.

* The controller will set the Identity resource as one of the OwnerReferences of the AzureCluster.

### Clusterctl changes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related issue in cluster-api kubernetes-sigs/cluster-api#3354

Copy link
Contributor

@CecileRobertMichon CecileRobertMichon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the proposal is sane, I like the proposed API changes. Mostly added some clarifying questions/comments.

docs/proposals/20200720-single-controller-multitenancy.md Outdated Show resolved Hide resolved
<a name="FR3">FR3.</a> CAPZ MUST prevent privilege escalation allowing users to create clusters in Azure accounts they should
not be able to.

<a name="FR4">FR4.</a> CAPZ SHOULD support credential refreshing modified principal data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is an example of this being able to refresh an expired service principal secret to be able to keep managing an existing cluster?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more about someone changing the content of the secret, which should cause the Authorizer cache to revoke the existing Authorizer and rebuild it from the changed secret.


<a name="FR4">FR4.</a> CAPZ SHOULD support credential refreshing modified principal data.

<a name="FR5">FR5.</a> CAPZ SHOULD provide validation for principal data submitted by users.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what kind of validation? Does that include validating that the credentials are valid/not expired?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to attempt to get a token from AAD. If we are unable to get a token from AAD, perhaps we get a 401, we should reject the secret as invalid.


### Non-Functional

<a name="NFR1">NFR1.</a> Each instance of CAPZ SHOULD be able to support 200 clusters using role assumption.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why 200?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a reasonable aspiration and was part of the CAPA proposal.

docs/proposals/20200720-single-controller-multitenancy.md Outdated Show resolved Hide resolved
docs/proposals/20200720-single-controller-multitenancy.md Outdated Show resolved Hide resolved

<em>Cluster scoped resources</em>

* `AzureServicePrincipal` represents an AAD Service Principal. Should support both
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related to #778

imperfect. For example, a change to an Azure Identity resource could affect the validity of
corresponding AzureCluster.

#### Principal Type Credential Provider Behaviour
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conversion changes? What happens if you have a v1alpha3 cluster using SystemAssigned identity and you try to convert to v1alpha2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great question. I think that fails.

We could add the identity information to the metadata of the v1alpha2 version of the AzureCluster. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should at least document that as breaking. Wonder how CAPA is handling conversion... we should probably align on that.

@devigned
Copy link
Contributor Author

I think I've addressed all of the feedback thus far. PTAL when you all have a min.

@devigned devigned requested a review from alexeldeib July 24, 2020 12:58
Name string `json:"name"`
}

type AzureClusterSpec struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just thought about this, what's the impact on the subscriptionID field in AzureClusterSpec? None right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes should have no impact on subscriptionID.

An AAD identity has nothing to do with a subscription. Auth only needs the tenant, client ID, a secret, and resource or scopes.

If one day in the future we decide to build multi-subscription clusters, then we'd have to make a change to subscription ID.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks for the clarification.

@CecileRobertMichon
Copy link
Contributor

/hold

Starting lazy consensus for a week from now (7/31)

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 24, 2020
@CecileRobertMichon
Copy link
Contributor

another quick question: I assume this has no impact on #481, correct? Since this is for identity for infra provisioning

@devigned
Copy link
Contributor Author

another quick question: I assume this has no impact on #481, correct? Since this is for identity for infra provisioning

@CecileRobertMichon, nope. I think this is more for user auth to the cluster (kubectl auth).

@devigned devigned changed the title add first draft of multitenancy proposal 🏃add draft of multi-tenancy proposal Jul 30, 2020
@CecileRobertMichon
Copy link
Contributor

Last chance for comments, will remove the hold Monday morning

Copy link
Contributor

@CecileRobertMichon CecileRobertMichon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 31, 2020
@nader-ziada
Copy link
Contributor

lgtm

@CecileRobertMichon
Copy link
Contributor

/hold cancel
/approve

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 3, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 3, 2020
@devigned
Copy link
Contributor Author

devigned commented Aug 3, 2020

/retest

@k8s-ci-robot k8s-ci-robot merged commit c28cdfc into kubernetes-sigs:master Aug 3, 2020
@k8s-ci-robot k8s-ci-robot added this to the v0.4.7 milestone Aug 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multi-tenancy within one manager instance
5 participants