Deploying new revisions of Cloud Run resources yields 409: Conflict #350

Closed
floyd-may opened this issue May 20, 2020 · 33 comments · Fixed by #2622
Labels
kind/bug Some behavior is incorrect or out of spec resolution/fixed This issue was fixed

Comments

@floyd-may

Deploying new revisions of Cloud Run resources is failing due to the GCP API giving an error response of 409: Conflict. This occurs whether or not the AutoGenerateRevisionName field is set (as mentioned by this Terraform provider issue: hashicorp/terraform-provider-google#5898). I've tried AutoGenerateRevisionName and manually generating a random name as well.

@leezen leezen assigned leezen and stack72 and unassigned leezen May 20, 2020
@floyd-may
Author

@stack72 Get with me on the Slack org to set up a video call if you'd like me to walk you through what I'm experiencing.

@jaxxstorm
Contributor

Can you post the code you're using to deploy so we can try to reproduce this?

@floyd-may
Author

You bet -

var svc = new cloudrun.Service("my-service-name", new cloudrun.ServiceArgs
{
    Name = "my-service-name",
    Traffics = new InputList<cloudrun.Inputs.ServiceTrafficArgs>
    {
        new cloudrun.Inputs.ServiceTrafficArgs
        {
            LatestRevision = true,
            Percent = 100
        }
    },
    Template = new cloudrun.Inputs.ServiceTemplateArgs
    {
        Spec = new cloudrun.Inputs.ServiceTemplateSpecArgs
        {
            Containers = new InputList<cloudrun.Inputs.ServiceTemplateSpecContainerArgs> {
                new cloudrun.Inputs.ServiceTemplateSpecContainerArgs {
                    Image = DockerImage, // a gcr.io docker image URL
                    Envs = new InputList<cloudrun.Inputs.ServiceTemplateSpecContainerEnvArgs> {
                        new cloudrun.Inputs.ServiceTemplateSpecContainerEnvArgs
                        {
                            Name = "Authentication__Google__ClientId",
                            Value = GoogleOauthClientId
                        },
                        new cloudrun.Inputs.ServiceTemplateSpecContainerEnvArgs
                        {
                            Name = "Authentication__AzureAD__ClientId",
                            Value = AzureOauthClientId
                        },
                        new cloudrun.Inputs.ServiceTemplateSpecContainerEnvArgs
                        {
                            Name = "Authentication__AzureAD__TenantId",
                            Value = AzureOauthTenantId
                        },
                    }
                }
            },
            ContainerConcurrency = 1,
        },
    },
    Location = "us-central1",
    AutogenerateRevisionName = true
});

@Sytten

Sytten commented May 21, 2020

I am also having this issue. It happens when you modify something outside of Pulumi and then try to modify it in Pulumi. The strange thing is that my stack is now corrupted, because Pulumi thinks it successfully applied the change.
It seems the update does not first fetch the latest revision before trying to upgrade it. I think it's probably an issue in the provider itself.

@Sytten

Sytten commented May 21, 2020

Also I tried deleting the service in GCP and pulumi simply did not detect that the service was gone... I would have expected it to try to recreate it.

@Sytten

Sytten commented May 21, 2020

Edit: I tried doing pure terraform and I don't see this problem, so this is a problem of Pulumi not using the provider correctly and not fetching the latest resource.

@floyd-may
Author

I wonder if it has to do with Terraform's concept of "virtual fields". I searched for "virtual" in the codebase (both the GCP provider and core Pulumi) and didn't find much.

@lukehoban
Member

> Edit: I tried doing pure terraform and I don't see this problem, so this is a problem of Pulumi not using the provider correctly and not fetching the latest resource.

One key thing that is different by default between Pulumi and Terraform is that Terraform does a "refresh" by default (but can opt-out), and Pulumi does not (but can opt-in). It may be that the Cloud Run resource was designed to require having a refresh done prior to being updated?

Can you try running pulumi refresh and then retrying pulumi up to see if that works?

In general it is a "bug" in a provider if it cannot be used correctly without a refresh - as users can and often do opt-out of refresh by default in Terraform as well - but there may be cases where it is unavoidable.

Note that we are considering changing this default in pulumi/pulumi#2247.

@floyd-may
Author

Doing a refresh worked once, but when I added pulumi refresh --yes to my CD script prior to pulumi up --yes it failed again with a 409, so I think there's something else at play here as well. Are there any considerations I should weigh (other than it takes longer to refresh then deploy) when adding pulumi refresh --yes to my CD scripts?

@kaisellgren

I have this exact same issue (409).

return new gcp.cloudrun.Service(
    `${prefix}-app`,
    {
      name: `${prefix}-app`,
      location,
      template: {
        spec: {
          containers: [
            {
              image: imageUrl,
              envs: [
                {
                  name: 'PUBLIC_BUCKET_URL',
                  value: publicBucketName,
                },
              ],
              resources: {
                requests: {
                  memory: '64Mi',
                  cpu: '200m',
                },
                limits: {
                  memory: '256Mi',
                  cpu: '1000m',
                },
              },
            },
          ],
          containerConcurrency: 80,
        },
      },
    },
    { dependsOn: enableCloudRun },
  )

When I try to add a new env variable and run pulumi up it fails to update with:

Diagnostics:
  gcp:cloudrun:Service (x-dev-app):
    error: 1 error occurred:
        * updating urn:pulumi:dev::x::gcp:cloudrun/service:Service::x-dev-app: Error updating Service "locations/europe-north1/namespaces/x/services/x-dev-app": googleapi: Error 409: Revision named 'x-dev-app-00022-jot' with different configuration already exists.

If I add this:

      autogenerateRevisionName: true,

and run pulumi up:

Diagnostics:
  gcp:cloudrun:Service (x-dev-app):
    error: 1 error occurred:
        * updating urn:pulumi:dev::x::gcp:cloudrun/service:Service::x-dev-app: Error updating Service "locations/europe-north1/namespaces/x/services/x-dev-app": googleapi: Error 409: Conflict for resource 'x-dev-app' for version 'xxx'.

Running pulumi refresh will take a moment to update the state, but ultimately it has no effect on this issue and the issue continues to persist.

@floyd-may
Author

Any update on this @jaxxstorm?

@leezen
Contributor

leezen commented May 29, 2020

We're taking a look. It'd be helpful if anyone running into this is able to post detailed logs (https://www.pulumi.com/docs/troubleshooting/#verbose-logging) from running pulumi refresh when updating state and what pulumi refresh shows in its diff.

@floyd-may
Copy link
Author

Hi @leezen. I'm glad to provide logs. With the verbosity turned all the way up, will the logs contain secrets or other sensitive information that shouldn't be public?

@Sytten

Sytten commented May 29, 2020

I will do that this weekend. Yes, it will contain sensitive stuff, @floyd-may. @leezen, can you provide an email address so we can send the logs to you securely? Thanks!

@leezen
Contributor

leezen commented May 29, 2020

Yes, if you don't want to clean it, can you please DM it to me on slack.pulumi.com? Alternatively, lee@ via email works, too.

@leezen leezen added this to the current milestone Jun 23, 2020
@jaxxstorm jaxxstorm self-assigned this Jul 1, 2020
@jaxxstorm
Contributor

jaxxstorm commented Jul 1, 2020

We believe this is now resolved with some upstream changes to the terraform provider. @stack72 and I could not reproduce this. If anyone else has this problem, please make sure you're using the latest version of the provider and if it persists, feel free to reopen this issue.

@Sytten

Sytten commented Jul 1, 2020

I will test and confirm. Can you link the issue/PR of the upstream provider for posterity, @jaxxstorm? Thanks!

@leezen leezen modified the milestones: current, 0.39 Jul 14, 2020
@jonsherrard

jonsherrard commented Jul 29, 2020

I've had this issue in the past, and everything's been fine for a while.

It's just started happening again.

The only thing I can think of is that, during the deployment process, a Cloud Function failed to deploy (an issue with the resource reference), which failed the whole process. Am I then ending up in a partially deployed state that causes the errors on the Cloud Run side?

googleapi: Error 409: Conflict for resource 'redacted-41a830d': version '1595922945261662' was specified but current version is '1595922945369000'.

@kaisellgren

Does a pulumi refresh help or is it stuck in this conflict state?

@idoshamun

refresh worked for me!

@Sytten

Sytten commented Nov 2, 2020

Refresh usually works yes

@OliverHGray

How can this issue be fixed when a refresh doesn't help? The problem I'm having is the service has been manually deployed and it doesn't seem to be able to come back under Pulumi control.

I've tried refreshing, setting the revision name explicitly, and also exporting the stack configuration and removing occurrences of the offending revision name from the inputs section. None of this has helped.

I'm also interested to know why Pulumi (although I guess it's probably Terraform) even tries to upgrade using the current revision name? Won't that always fail?

@yonathan06

Using import with the resource id once did the trick for me: https://www.pulumi.com/docs/guides/adopting/import/

@pierskarsenbarg
Member

Got someone hitting this again by deploying a container that wouldn't start. Subsequent updates failed until a refresh was run.

For Pulumi engineers: see https://pulumi.slack.com/archives/CBVJAP46L/p1715511335035519 for logs

@pierskarsenbarg pierskarsenbarg added the kind/bug Some behavior is incorrect or out of spec label May 13, 2024
@alexhwoods

Same thing as @pierskarsenbarg. If the container won't start, you can't update the service anymore.

Here's my error

Error 409: Conflict for resource 'makeswift': version '1715743123779128' was specified but current version is '1715744121398301'.

@alexhwoods

A pulumi refresh does fix it. The Pulumi provider seems to assume that the previous deployment succeeded.

@mjeffryes mjeffryes modified the milestones: 0.39, 0.99 Jun 14, 2024
@daaain

daaain commented Jun 27, 2024

A simple pulumi refresh didn't work for me, but pulumi refresh -t urn:pulumi:xxx::yyy::gcp:cloudrun/service:Service::zzz did
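For anyone scripting this workaround in CD, here is a minimal sketch of how the refresh-then-deploy sequence could be assembled, with an optional targeted refresh as described above. The helper name `deployCommands` and the stack name are hypothetical. The sketch only builds argument lists and does not run anything. Other comments in this thread report that a refresh did not always clear the 409, so treat it as a workaround rather than a fix.

```typescript
// Hypothetical helper: builds the CLI invocations for a refresh-then-deploy run.
// It returns argument arrays only; nothing is executed here.
function deployCommands(stack: string, targetUrn?: string): string[][] {
  const refresh = ["pulumi", "refresh", "--yes", "--stack", stack];
  if (targetUrn !== undefined) {
    // Limit the refresh to a single resource, e.g. the Cloud Run service URN.
    refresh.push("--target", targetUrn);
  }
  const up = ["pulumi", "up", "--yes", "--stack", stack];
  return [refresh, up];
}

// Example: refresh only the Cloud Run service, then deploy the whole stack.
const commands = deployCommands(
  "dev", // assumed stack name
  "urn:pulumi:xxx::yyy::gcp:cloudrun/service:Service::zzz",
);
for (const args of commands) {
  console.log(args.join(" "));
}
```

Each array can be passed to something like Node's `child_process.execFileSync(args[0], args.slice(1))` so that the URN does not need shell quoting.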

@mjeffryes mjeffryes modified the milestones: 0.39, 0.107 Jul 24, 2024
@dipasqualew

The issue also happens when deploying an image change from outside pulumi, e.g. deploying the infra changes via pulumi, but an update in the image as part of another pipeline.

@rshade rshade self-assigned this Nov 12, 2024
rshade added a commit that referenced this issue Nov 13, 2024
rshade added a commit that referenced this issue Nov 14, 2024
…2622)

This change fixes an issue with the `cloud_run_service` resource which
causes a 409 conflict with the service whenever there is a change
outside of pulumi. The conflict is caused by the `resourceVersion`
property in the `metadata` blob, which is used for optimistic locking.

This change adds a `TransformFromState` hook to the resource which
deletes the `resourceVersion` from the state. This disables the
optimistic locking behaviour and prevents the 409 conflicts caused by
changes to the resource.

fixes #350
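
To illustrate the mechanism the commit message describes, here is a minimal, self-contained sketch of optimistic locking on a `resourceVersion` field. It is a toy in-memory model, not the provider's actual `TransformFromState` code: `FakeCloudRun`, `stripResourceVersion`, and `demo` are hypothetical names, and the version counter is a simplification of whatever the real API uses.

```typescript
// Toy model of a service whose writes are guarded by optimistic locking.
interface Metadata {
  name: string;
  resourceVersion?: string;
}

interface ServiceState {
  metadata: Metadata;
  image: string;
}

class ConflictError extends Error {
  status = 409;
}

class FakeCloudRun {
  private version = 1;
  private image = "v1";

  constructor(private readonly name: string) {}

  get(): ServiceState {
    return {
      metadata: { name: this.name, resourceVersion: String(this.version) },
      image: this.image,
    };
  }

  // Rejects the write if the caller sends a resourceVersion that is stale.
  // A write without a resourceVersion is accepted unconditionally.
  replace(desired: ServiceState): ServiceState {
    const sent = desired.metadata.resourceVersion;
    if (sent !== undefined && sent !== String(this.version)) {
      throw new ConflictError(
        "Conflict for resource '" + this.name + "': version '" + sent +
          "' was specified but current version is '" + this.version + "'.",
      );
    }
    this.version += 1;
    this.image = desired.image;
    return this.get();
  }
}

// Mimics the idea of the fix: drop resourceVersion from the saved state
// so the next update is not tied to a version that may have moved on.
function stripResourceVersion(state: ServiceState): ServiceState {
  return { ...state, metadata: { name: state.metadata.name } };
}

function demo(): { staleStatus: number; finalImage: string } {
  const api = new FakeCloudRun("my-service");
  const saved = api.get(); // state recorded after the last deploy

  // An out-of-band change (e.g. another pipeline) bumps the version.
  api.replace({ ...api.get(), image: "v2" });

  let staleStatus = 0;
  try {
    api.replace({ ...saved, image: "v3" }); // still carries the old version
  } catch (e) {
    staleStatus = (e as { status?: number }).status ?? 0;
  }

  // With resourceVersion removed from the saved state, the update goes through.
  const result = api.replace({ ...stripResourceVersion(saved), image: "v3" });
  return { staleStatus, finalImage: result.image };
}
```

The trade-off is the one the commit message states: once the version is no longer sent, the write cannot be rejected as stale, so out-of-band changes are overwritten instead of triggering a conflict.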
@pulumi-bot pulumi-bot added the resolution/fixed This issue was fixed label Nov 14, 2024
@pulumi-bot
Contributor

This issue has been addressed in PR #2622 and shipped in release v8.9.2.
