Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bugs related to secret management and improve documentation #1358

Merged
merged 11 commits into from
Feb 5, 2021

Conversation

matthchr
Copy link
Member

@matthchr matthchr commented Jan 13, 2021

Closes #1356
Closes #1360

What this PR does / why we need it:
Add new AZURE_SECRET_NAMING_VERSION mode

The new mode allows us to fix inconsistencies in how secrets
were named without making a breaking change.

  • AppInsights created secrets in the same namespace as the resource but with name: "appinsights--"
  • Storage created secrets in the same namespace as the resource but with name: "storage--"
  • AzureSQL resources created resources with a different naming scheme as well.
  • Other resources created a secret in the same namespace with the secret name being the resource name.

The new V2 mode ensures that all resources create secrets in KeyVault and/or Kubernetes with a consistent naming pattern.

Return proper error if we cannot deserialize secret.

Improved secrets documentation

If applicable:

  • this PR contains documentation
  • this PR contains tests

@@ -444,3 +442,61 @@ func GenerateRandomSshPublicKeyString() string {
sshPublicKeyData := string(ssh.MarshalAuthorizedKey(publicRsaKey))
return sshPublicKeyData
}

//CreateVaultWithAccessPolicies creates a new key vault and provides access policies to the specified user - used in test
func CreateVaultWithAccessPolicies(ctx context.Context, creds config.Credentials, groupName string, vaultName string, location string, clientID string) error {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was copied from resourcemanager/keyvaults/keyvault.go as it's only used in test and it kept confusing me by being in the reconcile file.

controllers/suite_test.go Outdated Show resolved Hide resolved
docs/howto/secrets.md Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
}
} else {
err = sqlUserSecretClient.Delete(
Copy link
Member Author

@matthchr matthchr Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to continue to ignore this error because some customers are expecting only best effort deletes?

@matthchr matthchr force-pushed the feature/sql-user-failure branch from f84d1e3 to f634091 Compare January 28, 2021 19:38
Copy link
Member

@theunrepentantgeek theunrepentantgeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review, I got about half way through.

Does this actually need the versioning config for the secrets config?

Could we instead default to the new one with fallback to the old style if it's missing? Back compat for existing users, new features for new users, or amy I missing something important?

api/v1alpha1/storageaccount_types.go Outdated Show resolved Hide resolved
charts/azure-service-operator/templates/secret.yaml Outdated Show resolved Hide resolved
config/default/manager_image_patch.yaml Show resolved Hide resolved
controllers/appinsights_controller_test.go Outdated Show resolved Hide resolved
controllers/async_controller.go Show resolved Hide resolved
controllers/azuresql_combined_test.go Outdated Show resolved Hide resolved
controllers/azuresqlaction_controller_test.go Outdated Show resolved Hide resolved
controllers/suite_test.go Outdated Show resolved Hide resolved
@matthchr matthchr force-pushed the feature/sql-user-failure branch from f02753e to 4f831f5 Compare February 1, 2021 18:35
Copy link
Member

@theunrepentantgeek theunrepentantgeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good - within the limits of my understanding of the code anyway.
A handful of comments.

Something feels repetitive - as though there's a missing abstraction, but I can't put my finger on it. I'll have another review with fresh eyes tomorrow.

pkg/resourcemanager/config/env.go Outdated Show resolved Hide resolved
pkg/resourcemanager/config/env.go Outdated Show resolved Hide resolved
pkg/secrets/keyvault/client.go Outdated Show resolved Hide resolved
pkg/secrets/keyvault/client.go Outdated Show resolved Hide resolved
pkg/secrets/keyvault/client.go Outdated Show resolved Hide resolved
The new mode allows us to fix inconsitencies in how secrets
were named without making a breaking change.

  - AppInsights created secrets in the same namespace
    as the resource but with name:
    "appinsights-<resourceGroup>-<resourceName>"
  - Storage created secrets in the same namespace
    as the resource but with name:
    "storage-<resourceGroup>-<resourceName>"
  - AzureSQL resources created resources with
    a different naming scheme as well.
  - Other resources created a secret in the same
    namespace with the secret name being the
    resource name.

The new V2 mode ensures that all resources create secrets
in KeyVault and/or Kubernetes with a consistent naming pattern.
  - This would prevent secrets from being created in Kubernetes
@matthchr matthchr force-pushed the feature/sql-user-failure branch from 4f831f5 to a65cf8d Compare February 3, 2021 19:00
pkg/resourcemanager/config/env.go Outdated Show resolved Hide resolved
pkg/secrets/keyvault/client.go Outdated Show resolved Hide resolved
babbageclunk
babbageclunk previously approved these changes Feb 4, 2021
Copy link
Member

@babbageclunk babbageclunk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, my only concern was how we ensure that the V1 secret naming path is tested. Thanks for discussion about that!


r.Telemetry.LogInfoByInstance("status", "ensuring vault", req.String())

// TODO: It's really awkward that we do this so often?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow yeah, that method seems really heavyweight! Is this to handle the case where the vault isn't accessible yet and the object is being created for the first time, the individual resource managers don't have to do the work of tearing down the resource they created if the secret can't be stored?

Although I don't think that really holds does it? There's still a possibility of the secret write failing even if we've looked-before-leaping. I checked a number of resources' Ensure methods and they almost all write the secret first (having generated keys/passwords/etc) to avoid that scenario, or in the case of redis the secret content is queryable (not very secret) after creation so it just requeues if it has a problem writing it. So possibly we don't need to do the check here?

This could be one of those "you're only allowed to remove this if you can tell me why it's there" situations though - maybe there's something I'm missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think this is more about error code handling and setting the right message on the resource if we can't access the KeyVault, although doing a pre-check so that resources don't need to might be part of it as well.

Obviously as you point out, no amount of looking can guarantee your leap succeeds so pre-checking feels sorta pointless, and an alternative where we have an error handling library that can help us determine error causes presumably would mean that rather than doing this check once we can just farm it out to the error checker in all the places we interact with KV at which point there's not much value in doing this it seems to me as well? Especially since we need error handling in those other places already so it's not like we're forcing some there when previously we had none...

I've filed #1369 to track this as I don't think we want to bloat this PR further by me doing it here.

controllers/azuresql_combined_test.go Outdated Show resolved Hide resolved
return secrets.SecretKey{Name: instance.Name, Namespace: instance.Namespace, Kind: instance.TypeMeta.Kind}
}

// TODO: Ignoring prefix here... we ok with that?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assumption is that the prefix was being used to prevent collisions, so we don't need it in the new version? That seems reasonable to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It's also taking an opportunity to remove some of the "extra" secrets configuration that only AzureSQLUser/AzureSQLManagedUser have that don't match the other types and doesn't really seem to add a lot of value to what customers can do (since it's not needed to avoid conflicts anymore it seems just extra complexity on our side to support it without much in the way of customer value add)

pkg/resourcemanager/config/env.go Outdated Show resolved Hide resolved
pkg/secrets/keyvault/client.go Outdated Show resolved Hide resolved
pkg/secrets/keyvault/client.go Outdated Show resolved Hide resolved
Comment on lines +28 to +34
// NOTE: Tests in this file are intended to be run twice, once in the v1 secret naming mode
// and once in the v2 secret naming mode. In some instances they parallel tests
// which don't assert on secret name or shape.
// The naming of the tests is important! It's how the makefile knows to run just these tests
// in the old V1 mode.
// Not every resource is included here, only resources which for legacy reasons in the old v1
// secret naming mode had secret naming schemes which were different than the standard.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

@matthchr matthchr merged commit 876b1c7 into Azure:master Feb 5, 2021
@matthchr matthchr deleted the feature/sql-user-failure branch February 5, 2021 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants