Skip to content
This repository has been archived by the owner on Oct 24, 2023. It is now read-only.

fix: de-dup vnet roleAssignment deployments #3857

Merged
merged 10 commits into from
Sep 29, 2020

Conversation

jackfrancis
Copy link
Member

@jackfrancis jackfrancis commented Sep 23, 2020

Reason for Change:

This PR addresses and edge case resulting from the "master VM roleAssignment" change introduced in #2583. Specifically, we want to create a VNET roleAssignment per master VM for each VNET referenced by any node pool. In the event that we are sharing a common VNET among several node pools, we want to ensure that we only create one Deployment resource per VNET.

The CI jobs in this repro PR demonstrate that system-assigned identity is currently broken w/ multiple node pools + custom VNET:

#3861

In addition, this PR only includes the above roll assignment Deployment resources if the master VNET is in fact distinct as compared to any one node pool VNET.

Also, we standardize to the 2018-09-01-preview API version for defining system-assigned ID-derived role assignments, so that we can declare the "principalType": "ServicePrincipal" configuration.

Also includes changes to the cluster configuration used to test PRs so that we're testing system-assigned identity-enabled clusters regularly.

Issue Fixed:

Fixes #3863

Credit Where Due:

Does this change contain code from or inspired by another project?

  • No
  • Yes

If "Yes," did you notify that project's maintainers and provide attribution?

  • No
  • Yes

Requirements:

Notes:

@acs-bot acs-bot added size/L and removed size/M labels Sep 23, 2020
@jackfrancis
Copy link
Member Author

cc @dennis-benzinger-hybris

@codecov
Copy link

codecov bot commented Sep 23, 2020

Codecov Report

Merging #3857 into master will increase coverage by 0.00%.
The diff coverage is 69.44%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3857   +/-   ##
=======================================
  Coverage   72.83%   72.83%           
=======================================
  Files         149      149           
  Lines       23171    23201   +30     
=======================================
+ Hits        16876    16898   +22     
- Misses       5178     5182    +4     
- Partials     1117     1121    +4     
Impacted Files Coverage Δ
pkg/engine/armtype.go 85.00% <ø> (ø)
pkg/engine/masterarmresources.go 65.90% <59.09%> (+0.11%) ⬆️
pkg/engine/systemroleassignments.go 97.95% <85.71%> (-2.05%) ⬇️
pkg/engine/templates_generated.go 48.65% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f2a577d...e2a2600. Read the comment docs.

@acs-bot acs-bot added size/XXL and removed size/L labels Sep 23, 2020
log.Printf("Command Output: %s\n", out)
}
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an example of the result of this change:

$ az deployment group create --name kubernetes-eastus-37551 --resource-group kubernetes-eastus-37551 --template-file /Users/jackfrancis/work/src/github.com/Azure/aks-engine/_output/kubernetes-eastus-37551/azuredeploy.json --parameters /Users/jackfrancis/work/src/github.com/Azure/aks-engine/_output/kubernetes-eastus-37551/azuredeploy.parameters.json
.......2020/09/23 17:26:56 
Error from deployment for kubernetes-eastus-37551 in resource group kubernetes-eastus-37551:exit status 1
2020/09/23 17:26:56 Command Output: Deployment failed. Correlation ID: 0934f995-914a-47ec-a60c-cbf5774220ba. {
  "error": {
    "code": "PrincipalNotFound",
    "message": "Principal 4aaa67de8ac644c5bf234a1d069a995e does not exist in the directory 72f988bf-86f1-41af-91ab-2d7cd011db47."
  }
}


$ az deployment group create --name kubernetes-eastus-37551 --resource-group kubernetes-eastus-37551 --template-file /Users/jackfrancis/work/src/github.com/Azure/aks-engine/_output/kubernetes-eastus-37551/azuredeploy.json --parameters /Users/jackfrancis/work/src/github.com/Azure/aks-engine/_output/kubernetes-eastus-37551/azuredeploy.parameters.json
.

And then the E2E execution proceeded successfully. Basically, this change enables us to functionally test system-assigned identity reliably, as we need to assume that such a config will regularly produce an error response from the initial ARM template deployment based on observed behaviors.

Additionally, documentation outlining this "redeploy your ARM template" guidance will be added to this PR:

#3837

Copy link
Member

@mboersma mboersma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

pkg/engine/systemroleassignments.go Show resolved Hide resolved
@acs-bot
Copy link

acs-bot commented Sep 29, 2020

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis, mboersma

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [jackfrancis,mboersma]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jackfrancis jackfrancis merged commit fa9bb91 into Azure:master Sep 29, 2020
@jackfrancis jackfrancis deleted the vnet-role-assignments branch September 29, 2020 20:13
penggu pushed a commit to penggu/aks-engine that referenced this pull request Oct 28, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

system-assigned identity broken in multiple node pools w/ common VNET scenario
4 participants