azure: Add support for dns=none #15627

Merged (5 commits) on Jul 16, 2023

Member

@hakman hakman commented Jul 13, 2023

/cc @justinsb
/assign @justinsb
/kind office-hours

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. area/api area/kops-controller area/nodeup area/provider/azure Issues or PRs related to azure provider labels Jul 13, 2023
@@ -30,84 +30,6 @@ func TestPrecreateDNSNames(t *testing.T) {
cluster *kops.Cluster
expected []recordKey
}{
{
Member Author

Azure doesn't do any DNS operations, so these test cases are useless.

@@ -1321,7 +1321,7 @@ func setupDNSTopology(opt *NewClusterOptions, cluster *api.Cluster) error {
switch strings.ToLower(opt.DNSType) {
case "":
switch cluster.Spec.GetCloudProvider() {
- case api.CloudProviderHetzner, api.CloudProviderDO:
+ case api.CloudProviderHetzner, api.CloudProviderDO, api.CloudProviderAzure:
Member Author

We should move most, if not all, cloud providers to dns=none soon.
Gossip is a much worse choice, and Public DNS is only available on a few cloud providers.
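
Editor's sketch of the defaulting behaviour discussed above, assuming a simplified stand-in for the cloud provider type. `CloudProvider`, the `Provider*` constants, and `defaultDNSType` are hypothetical names, and the `"public"` fallback is an assumption made only for illustration; this is not the kOps implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// CloudProvider is a hypothetical stand-in for the kOps cloud provider ID type.
type CloudProvider string

const (
	ProviderAWS     CloudProvider = "aws"
	ProviderAzure   CloudProvider = "azure"
	ProviderDO      CloudProvider = "digitalocean"
	ProviderHetzner CloudProvider = "hetzner"
)

// defaultDNSType returns the DNS mode to use. An explicit request always wins;
// otherwise the providers in the first case default to "none".
// ASSUMPTION: the "public" fallback for other providers is illustrative only.
func defaultDNSType(requested string, provider CloudProvider) string {
	if t := strings.ToLower(requested); t != "" {
		return t
	}
	switch provider {
	case ProviderHetzner, ProviderDO, ProviderAzure:
		return "none"
	default:
		return "public"
	}
}

func main() {
	fmt.Println(defaultDNSType("", ProviderAzure))
	fmt.Println(defaultDNSType("", ProviderAWS))
	fmt.Println(defaultDNSType("Private", ProviderAzure))
}
```

Moving another provider to dns=none would, in a layout like this, amount to adding it to the first `case` list.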

@@ -151,6 +179,7 @@ func (*LoadBalancer) RenderAzure(t *azure.AzureAPITarget, a, e, changes *LoadBal
ID: to.StringPtr(fmt.Sprintf("/%s/virtualNetworks/%s/subnets/%s", idPrefix, *e.Subnet.VirtualNetwork.Name, *e.Subnet.Name)),
}
}
// TODO: Move hardcoded values to the model
Member Author

Almost everything about the LB is hardcoded. We should move this to the model soon. For now it is good enough.

@@ -24,7 +24,7 @@ import (

// UseKopsControllerForNodeBootstrap is true if nodeup should use kops-controller for bootstrapping.
func UseKopsControllerForNodeBootstrap(cloudProvider kops.CloudProviderID) bool {
- return cloudProvider != kops.CloudProviderAzure
+ return true
Member

👍 👍 👍

Member

I'll take on removing the dead code this leaves behind.

}, nil
}

func (a azureVerifier) VerifyToken(ctx context.Context, rawRequest *http.Request, token string, body []byte, useInstanceIDForNodeName bool) (*bootstrap.VerifyResult, error) {
Member

This doesn't validate that the VM should actually be allowed to join the cluster; it seems to accept a token from any VM in the resource group. For comparison, the AWS verifier checks that the assumed role in the GetCallerIdentity response matches an IAM role used by any of the cluster's instance groups.

Can we do something similar here?

Member Author

True. It at least checks that the VM is in the same resource group. I will see if I can lock things down further here.

Member Author

There may be a more secure solution using the attested data, though that will be something for the future.
For now I added the VM ID, which should be pretty hard to guess without access to the resource group.
https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=linux#attested-data

Member

Can we verify the VM's VMSS is of a known kOps InstanceGroup?

Member Author

That should be checked further along in the process. This is why we set result.InstanceGroupName.

Member Author

It may be safer to take the IG name from the VMSS name, instead of the tag. WDYT?
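
Editor's sketch of the idea raised in the last two comments: derive the instance group from the VMSS name and reject anything that does not map to a known group. It assumes, purely for illustration, that a VMSS is named `<instance-group>.<cluster-name>`; that naming scheme and the function `instanceGroupFromVMSS` are hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// instanceGroupFromVMSS derives an instance group name from a VMSS name and
// checks it against the set of known instance groups.
// ASSUMPTION: the VMSS is named "<instance-group>.<cluster-name>"; this naming
// scheme is hypothetical and used only for the sketch.
func instanceGroupFromVMSS(vmssName, clusterName string, knownIGs map[string]bool) (string, error) {
	suffix := "." + clusterName
	if !strings.HasSuffix(vmssName, suffix) {
		return "", fmt.Errorf("VMSS %q does not belong to cluster %q", vmssName, clusterName)
	}
	ig := strings.TrimSuffix(vmssName, suffix)
	if !knownIGs[ig] {
		return "", fmt.Errorf("VMSS %q does not map to a known instance group", vmssName)
	}
	return ig, nil
}

func main() {
	known := map[string]bool{"nodes": true, "control-plane": true}

	ig, err := instanceGroupFromVMSS("nodes.example.k8s.local", "example.k8s.local", known)
	fmt.Println(ig, err)

	_, err = instanceGroupFromVMSS("rogue.example.k8s.local", "example.k8s.local", known)
	fmt.Println(err)
}
```

A VMSS name is set when the scale set is created, which is the reasoning behind preferring it over a tag that could be edited afterwards.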

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 14, 2023
@hakman
Member Author

hakman commented Jul 14, 2023

/retest

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 15, 2023
@justinsb
Member

This LGTM, and I think it is already a good net improvement for security.

I do think we should check that the Azure VM is part of our cluster, but maybe we do that as a separate PR? WDYT @rifelpet / @hakman ?

/approve
/lgtm
/hold to agree whether we check the cluster as part of this PR or another

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Jul 15, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 15, 2023
@rifelpet
Member

I'm fine with the security improvements happening in a follow-up.

@hakman
Member Author

hakman commented Jul 16, 2023

Thanks @justinsb & @rifelpet. I will create a separate PR with a small idea, and we can continue the discussion there.
In short, my impression was that after we set the IG name in the response, it is used to fetch the config for that IG name. If that IG doesn't exist, the request fails.
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 16, 2023
@k8s-ci-robot k8s-ci-robot merged commit 2a0cc8a into kubernetes:master Jul 16, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.28 milestone Jul 16, 2023
@hakman hakman deleted the azure_dns_none branch July 18, 2023 02:25
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api area/kops-controller area/nodeup area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/office-hours lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
5 participants