-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SDK Migration - V2 Beta #306
Conversation
if claims.ExpiresAt < time.Now().Add(5*time.Minute).Unix() { | ||
errs = packersdk.MultiErrorAppend(errs, fmt.Errorf("client_jwt will expire within 5 minutes, please use a JWT that is valid for at least 5 minutes")) | ||
} | ||
//if claims.ExpiresAt < time.Now().Add(5*time.Minute).Unix() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OIDC Access Tokens have a short validity period (60 seconds), so throwing this error prevents proper OIDC support, removing it fixes this, I have a slack link with some more context from the provider folks if needed
1ec98f0
to
841570a
Compare
…et it, switch from MeClient to ServicePrincipals client to get objectID
505414e
to
04a4a81
Compare
return c.authType | ||
} | ||
|
||
// TODO I still need to port this to the new SDK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is used by every builder, I'll port is towards the end of the process and leave this todo until then
@@ -305,12 +314,13 @@ func (s *StepDeployTemplate) deleteDeploymentResources(ctx context.Context, depl | |||
defer wg.Done() | |||
retryConfig := retry.Config{ | |||
Tries: 10, | |||
RetryDelay: (&retry.Backoff{InitialBackoff: 10 * time.Second, MaxBackoff: 600 * time.Second, Multiplier: 2}).Linear, | |||
RetryDelay: (&retry.Backoff{InitialBackoff: 5 * time.Second, MaxBackoff: 60 * time.Second, Multiplier: 1.5}).Linear, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change seems to speed up builds quite a bit, if the delete failed a few times here we could end up waiting for quite a long time, this cuts off the max backoff at 1 minute, and lowers the multiplier that increases the backoff, anecdotally acceptance tests feel way faster, and the OIDC test build only took 5 minutes.
@@ -107,11 +107,11 @@ func (s *StepDeployTemplate) Cleanup(state multistep.StateBag) { | |||
// delete the disk as the image request will return a 404 | |||
computeName := state.Get(constants.ArmComputeName).(string) | |||
isManagedDisk := state.Get(constants.ArmIsManagedImage).(bool) | |||
imageType, imageName, err := s.disk(context.TODO(), subscriptionId, resourceGroupName, computeName) | |||
imageType, imageName, err := s.disk(ctx, subscriptionId, resourceGroupName, computeName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch
@@ -65,7 +65,7 @@ func (b *Builder) Run(ctx context.Context, ui packersdk.Ui, hook packersdk.Hook) | |||
|
|||
ui.Say("Running builder ...") | |||
|
|||
ctx, cancel := context.WithCancel(ctx) | |||
ctx, cancel := context.WithTimeout(ctx, time.Minute*60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to make this timeout configurable via some configuration option?
if b.config.ClientConfig.ObjectID == "" { | ||
b.config.ClientConfig.ObjectID = getObjectIdFromToken(ui, spnCloud) | ||
b.config.ClientConfig.ObjectID = *objectID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there ever a case where objectID is nil and this throws a nil pointer deference error?
if err == nil { | ||
err = f.WaitForCompletionRef(ctx, azureClient.ImagesClient.Client) | ||
} | ||
err := azureClient.ImagesClient.DeleteThenPoll(ctx, imageId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
images, err := client.ImagesClient.ListByResourceGroupComplete(context.TODO(), resourceGroupName) | ||
func findManagedImageByName(client *AzureClient, name, subscriptionId, resourceGroupName string) (*hashiImagesSDK.Image, error) { | ||
id := commonids.NewResourceGroupID(subscriptionId, resourceGroupName) | ||
images, err := client.ImagesClient.ListByResourceGroupComplete(context.TODO(), id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really part of the review but I'm curious why the code has context.TODO() as opposed to reusing the ctx created during runtime? Guessing there is no context being passed in would be my first guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! So the context for the clients is mostly used for canceling and polling deadlines, I have not passed it through to here yet since it was only one get request used outside of a step, happy to add a real context here if you think it'd be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed ~half for now, will continue this tomorrow, overall there's a ton of changes and I left more questions than proper comments, but this looks good to me at first glance.
I notice that only the arm
builder is updated now, will you refactor the rest later?
@@ -56,7 +53,7 @@ type Artifact struct { | |||
StateData map[string]interface{} | |||
} | |||
|
|||
func NewManagedImageArtifact(osType, resourceGroup, name, location, id, osDiskSnapshotName, dataDiskSnapshotPrefix string, generatedData map[string]interface{}, keepOSDisk bool, template *CaptureTemplate, getSasUrl func(name string) string) (*Artifact, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very ignorant of this plugin, but what's a SAS URL, and why do we remove that here? I presume this is not necessary anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related but not too much, I see the only other place where there's a reference to a getSasUrl
is in the NewArtifact
function for the dtl
builder (in builder/azure/dtl/artifact.go
line 56), which is never used in the code according to gopls
, so I presume we could remove this as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, a SAS URL is a link to an Azure Storage account resource with an embedded token, we create SAS tokens in VHD builds and when using managed image builds without deleting the disk I believe, however we've been asked by a lot of our users to remove this here #250, because of the security concerns of using these they are not recommended by Microsoft, we are doing them in every build. To top it off the new SDK does not provide the same helper we currently use to generate these tokens.
So for v2 after chatting with Wilken, and given the above issue, I am proposing removing them, they can still be created by users outside of Packer, and this was never used in the resulting Azure Artifact, but was simply printed in the terminal output of the build.
Yes we should remove it from the DTL builder but I would like to hold of any changes to the DTL builder until we have signed off on the way we've handling this for the ARM builder
azureClient.DisksClient.UserAgent = fmt.Sprintf("%s %s", useragent.String(version.AzurePluginVersion.FormattedVersion()), azureClient.DisksClient.UserAgent) | ||
// Clients that have been ported to hashicorp/go-azure-sdk | ||
azureClient.DisksClient = hashiDisksSDK.NewDisksClientWithBaseURI(cloud.ResourceManagerEndpoint) | ||
azureClient.DisksClient.Client.Authorizer = authWrapper.AutorestAuthorizer(resourceManagerAuthorizer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if that's the same package referred to here, but from our discussions I remember autorest
being a contentious choice, is this the same package referred to here? Won't we still experience crashes because of this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So autorest is an intermediary library used in track 1 SDKs, hashicorp/go-azure-sdk is still migrating from auto-rest to its own transit layer.
This code has been converted to use a different auth layer than Autorest, but track 1 SDKs still rely on autorest, so we can use the helper method in autoWrapper to convert this for SDKs not converted, then each time a client is moved to track 2 we will just replace authWrapper.AutorestAuthorizer(resourceManagerAuthorizer) with resourceManagerAuthorizer, as that helper function just turns the "new" auth object into the old one
} | ||
|
||
// Returns an Azure Client used for the Azure Resource Manager | ||
// Also returns the Azure object ID for the authentication method used in the build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder, why is the object ID returned as a separate value here? Shouldn't it be embedded in the AzureClient
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It used to be defined at the builder level and then passed into the client so I just kept it that way, also historically users could add the ObjectID in their templates instead of relying on us to grab it from the token claims
return "", err | ||
} | ||
return claims["oid"].(string), nil | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: superfluous empty line
@@ -80,28 +80,33 @@ func (b *Builder) Run(ctx context.Context, ui packersdk.Ui, hook packersdk.Hook) | |||
b.stateBag.Put(constants.ArmManagedImageSubscription, b.config.ClientConfig.SubscriptionID) | |||
} | |||
|
|||
b.stateBag.Put(constants.ArmSubscription, b.config.ClientConfig.SubscriptionID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is part of the config, do we need to keep this in the state? Are the steps that use this unable to access the full config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I see some other config options are being stored in the state so if that's a convention for this plugin feel free to disregard my comment.
@@ -56,6 +57,7 @@ func TestBuilderAcc_SharedImageGallery_ARM64SpecializedLinuxSIG(t *testing.T) { | |||
Setup: func() error { | |||
createSharedImageGalleryDefinition(t, CreateSharedImageGalleryDefinitionParameters{ | |||
galleryImageName: "arm-linux-specialized-sig", | |||
subscriptionId: os.Getenv("ARM_SUBSCRIPTION_ID"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why this is part of the environment rather than a constant for this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These acceptance tests rely on env variables for most of the Azure related fields that are in interpreted in the templates, this just grabs them to use in these functions
} | ||
|
||
return nil, fmt.Errorf("Cannot find an image named '%s' in the resource group '%s'", name, resourceGroupName) | ||
} | ||
|
||
func findVirtualNetworkResourceGroup(client *AzureClient, name string) (string, error) { | ||
virtualNetworks, err := client.VirtualNetworksClient.ListAllComplete(context.TODO()) | ||
func findVirtualNetworkResourceGroup(client *AzureClient, subscriptionId, name string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To fuel my curiosity, I see that a bunch of calls now include the subscriptionId
, for those of us unitiated to Azure's offering, what's that and why do we need it now in so many places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Azure Subscriptions are user created groups where resources are organized. Every resource (VM, Managed Image) has a Subscription, an a Resource Group which are sub-divisions of subscriptions.
In the old SDK we passed in the subscription ID into the Azure Client, in the new SDK they ask for the subscription ID on every call but the resulting API call is the same regardless.
|
||
return multistep.ActionHalt | ||
// HACK(chrboum): I do not like this. The capture method should be returning this value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who's cherboum
and paulmey
? Are they working on the SDK?
Oh nevermind that's an older comment that was moved, still, is this hack still relevant with the refactor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but I haven't fully vetted that, there are two helper methods we use in azure client that still rely on autorest specific libraries, these are working on the new SDK currently but I need to migrate them still
s.say(s.client.LastError.Error()) | ||
} | ||
|
||
return exists.Response.StatusCode != 404, err | ||
return exists.HttpResponse.StatusCode != 404, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that happen? If the error might be linked to a 404
, should we do the check in both places?
if err != nil { | ||
if exists.HttpResponse.StatusCode == 404 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a chance that HttpResponse
might be nil
here? If so this will panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but this was true in the old SDK and it never happened, definitely worth guarding against but unlikely to happen I believe
@@ -1,6 +1,3 @@ | |||
// Copyright (c) HashiCorp, Inc. | |||
// SPDX-License-Identifier: MPL-2.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm mistaken, this should stay part of the file for compliance
"github.com/hashicorp/packer-plugin-azure/builder/azure/common/constants" | ||
"github.com/hashicorp/packer-plugin-sdk/multistep" | ||
packersdk "github.com/hashicorp/packer-plugin-sdk/packer" | ||
) | ||
|
||
type StepPublishToSharedImageGallery struct { | ||
client *AzureClient | ||
publish func(ctx context.Context, sourceID string, sharedImageGallery SharedImageGalleryDestination, miSGImageVersionEndOfLifeDate string, miSGImageVersionExcludeFromLatest bool, miSigReplicaCount int32, location string, diskEncryptionSetId string, tags map[string]*string) (string, error) | ||
publish func(ctx context.Context, subscriptionID string, sourceID string, sharedImageGallery SharedImageGalleryDestination, miSGImageVersionEndOfLifeDate string, miSGImageVersionExcludeFromLatest bool, miSigReplicaCount int64, location string, diskEncryptionSetId string, tags map[string]string) (string, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what's the mi
in miSGImageVersion...
referring to?
Machine image maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess so, not 100% sure tbh
@@ -130,7 +130,7 @@ | |||
"listeners": [ | |||
{ | |||
"certificateUrl": "", | |||
"protocol": "https" | |||
"protocol": "Https" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird casing here, is this voluntary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what the new SDK sends, they resolve the same but we do a file compare to check in the test that causes failures cause https != https, Azure interprets them the same
package common | ||
|
||
// StringPtr returns a pointer to the passed string. | ||
func StringPtr(s string) *string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised we have to make our own, isn't there something like this in the Azure SDK already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a good idea to push it to the Hashicorp Azure SDK if not, especially if there's a bunch of string pointers in the API calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was in the old SDK as an annoying helper tbh, and I didn't want to have to have a bunch of duplicated code. The new SDK does not have this and tbh I wasn't sure how it'd be received as a PR so I felt this was simpler
@@ -4,8 +4,14 @@ | |||
package template |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity again, what is being templated here? I'm not sure I understand what's the use for this package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Azure ARM templates are kinda like packer/terraform templates, in that they are codied infrastructure, but its an Azure concept for defining certain types of deployments https://learn.microsoft.com/en-us/azure/azure-resource-manager/templates/overview
The way the Azure Packer plugin works is that it first deploys a template to create a VM, and then it captures that VM into an image artifact
"CHINACLOUD": "china", | ||
"AZURECHINACLOUD": "china", | ||
|
||
// TODO Implement AzureGermanCloud |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI Azure Germany doesn't exist anymore since October 2021. There are Azure regions in Germany but they are part of the general public cloud. Ref: Azure Germany
rc1 with ARM builder support available here: https://github.com/JenGoldstrich/packer-plugin-azure/releases/tag/v2.0.0-betarc1 |
Description
This PR switches from the ARM Builder from the "track 1" SDK we've been using to an SDK maintained by the Terraform Azure Provider team, this new SDK is located here. It has drop in support for track 2 SDKs to help us address the deprecation of the existing SDK with fewer breaking changes. As the new SDK adds track 2 support we can migrate them over time
Related to #243
Breaking Changes
az login --use-device-code
instead to replicate this functionalityOther Changes
What's not done