
(@aws/cx-api): metadata deployed when no Stack differences #15322

Closed
codeedog opened this issue Jun 25, 2021 · 12 comments · Fixed by #16300
Labels
@aws-cdk/cx-api Related to the Cloud Executable API bug This issue is a bug. effort/medium Medium work item – several days of effort p2

Comments

@codeedog

codeedog commented Jun 25, 2021

While reviewing the fix to bug 15023 we found that metadata reporting ("versionReporting": true) causes cdk deploy to upload metadata even when cdk diff reports no differences and no files have changed. This occurs when the commands are run on different operating systems.

Reproduction Steps

Craft any stable stack that is JavaScript-only (no native code) and ensure all module versions are explicit. This repo abides by that. Note: versionReporting has been set to false in that repo's cdk.json; it would need to be set to true to reproduce the issue.

# On a macOS
> git clone git@github.com:codeedog/aws-test-hello-world.git
> cd aws-test-hello-world
> node --version && npm --version
> npm ci      # this will also build the subdirectory
> cdk deploy

# On Ubuntu
> git clone git@github.com:codeedog/aws-test-hello-world.git
> cd aws-test-hello-world
> node --version && npm --version
> npm ci      # this will also build the subdirectory
> cdk diff    # Should show no diffs
> cdk deploy  # Will deploy 2 "changes"; metadata only

What did you expect to happen?

I expected there to be no deploy on the second machine due to the lack of any code changes.

What actually happened?

Metadata was collected, making it look like there were changes to the Stack that needed to be deployed.

Environment

  • CDK CLI Version : 1.110.0
  • Framework Version: 1.110.0
  • Node.js Version: v14.17.0 (note: also, npm Version: 7.17.0)
  • OS : macOS Catalina 10.15.7 (19H524) & Ubuntu (20.04.2 LTS, on GitHub)
  • Language (Version): TypeScript (3.9.9, from package-lock.json)

Other

Notes from my comment in the now fixed Bug 15023:

I don't mind turning off metadata to stop this. On the other hand, I'm more than happy to share build results with the AWS CDK team. It seems to me that if there are no differences in a Stack (according to cdk diff), metadata should not be pushed. At least, this should be a configuration option ("versionReporting": "no-deploy-on-no-diffs" | true | false). Pick a better name for it... Anyway, I'd prefer the first be the default, but at least make this available to folks.

When I see the system rebuild despite my knowing no files have changed, I worry something has failed or some merge has or hasn't happened. I suspect others may feel this way, too.

To me this feels like a bug (trigger a deploy when no code changed and all modules are stable). You don't send metadata from the same OS when cdk deploy is called with no differences. Why should you do so when the command is run on a different operating system and still there are no differences?


This is 🐛 Bug Report

@codeedog codeedog added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jun 25, 2021
@github-actions github-actions bot added the @aws-cdk/cx-api Related to the Cloud Executable API label Jun 25, 2021
@rix0rrr
Contributor

rix0rrr commented Jun 28, 2021

Can you share the Metadata part of the templates, the one that changes between systems? I agree with you that it shouldn't change, but since it does we have to figure out where the change is coming from.

@rix0rrr rix0rrr added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jun 28, 2021
@codeedog
Author

@rix0rrr It looks like they re-opened 15023. I'm going to focus on this issue there.

@jogold
Contributor

jogold commented Jun 28, 2021

Looks like gzip compression doesn't give the same results on Linux vs macOS...

See #15023 (comment)

@nija-at
Contributor

nija-at commented Jun 28, 2021

copying over comment from @codeedog

Looks like there's a single character difference in the compressed metadata and the uncompressed metadata has no differences. Compression algorithmic variation between the two OSes? You might find this discussion on Stack Overflow useful.

Here's the program I wrote to analyze the metadata differences:

/*
Output:

Deflated: false
Inflated: true
MacOS   : 1.110.0!@aws-cdk/{core.{Stack,AssetStaging,Construct,CfnParameter,CfnOutput,CfnResource},aws-{lambda.{Function,CfnFunction,LayerVersion,CfnLayerVersion,CfnPermission},iam.{Role,CfnRole},s3-assets.Asset,s3.BucketBase,lambda-nodejs.NodejsFunction,apigateway.{LambdaRestApi,CfnRestApi,CfnAccount,Deployment,CfnDeployment,Stage,CfnStage,ResourceBase,ProxyResource,CfnResource,Method,CfnMethod}}},node.js/v14.17.0!jsii-runtime.Runtime
Ubuntu  : 1.110.0!@aws-cdk/{core.{Stack,AssetStaging,Construct,CfnParameter,CfnOutput,CfnResource},aws-{lambda.{Function,CfnFunction,LayerVersion,CfnLayerVersion,CfnPermission},iam.{Role,CfnRole},s3-assets.Asset,s3.BucketBase,lambda-nodejs.NodejsFunction,apigateway.{LambdaRestApi,CfnRestApi,CfnAccount,Deployment,CfnDeployment,Stage,CfnStage,ResourceBase,ProxyResource,CfnResource,Method,CfnMethod}}},node.js/v14.17.0!jsii-runtime.Runtime

Compressed Data Difference Indices:
12

*/

const zlib = require('zlib');

const mac = "H4sIAAAAAAAAE11RwU7DMAz9Fu5ptgokrnRDnAZUReJuUrOlbZIqThhV1H8nSddp4uT3nq28Z6fkZbnl27snOFMh2n4ThLHIw4cD0bOKCF2ER6mPbG80OeuFY/tvXYMFhQ5tIu/ejT7LDZLxVuDM0nthAPXVAg8vXgsnjU4jV3yACe0nWro0/vMarZKU2MwkKB4aM2A2iXVmdF9Aikc8p4yc77zo0e2AkC3OhTYtdsTfcrk6wyiP4PAMEw+HPBhzu2qUlxVWWAlhvHbsGcfBTAp13vGGpdPkSAtYt88Jamt+p1W5vQ17RXcybZIWNM8zS0F5R5uf8oGXj/FDOpKysNFdKuTNUv8A3xYryK0BAAA="

const ubu = "H4sIAAAAAAAAA11RwU7DMAz9Fu5ptgokrnRDnAZUReJuUrOlbZIqThhV1H8nSddp4uT3nq28Z6fkZbnl27snOFMh2n4ThLHIw4cD0bOKCF2ER6mPbG80OeuFY/tvXYMFhQ5tIu/ejT7LDZLxVuDM0nthAPXVAg8vXgsnjU4jV3yACe0nWro0/vMarZKU2MwkKB4aM2A2iXVmdF9Aikc8p4yc77zo0e2AkC3OhTYtdsTfcrk6wyiP4PAMEw+HPBhzu2qUlxVWWAlhvHbsGcfBTAp13vGGpdPkSAtYt88Jamt+p1W5vQ17RXcybZIWNM8zS0F5R5uf8oGXj/FDOpKysNFdKuTNUv8A3xYryK0BAAA="

const mac_u = zlib.gunzipSync(Buffer.from(mac, 'base64')).toString()
const ubu_u = zlib.gunzipSync(Buffer.from(ubu, 'base64')).toString()

console.log("Deflated:", mac == ubu);
console.log("Inflated:", mac_u == ubu_u);
console.log("MacOS   :", mac_u);
console.log("Ubuntu  :", ubu_u);

console.log("\nCompressed Data Difference Indices:");
for (let i = 0; i < mac.length; i++) {
    if (mac[i] != ubu[i]) console.log(i);
}
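Decoding just the start of the two base64 strings shows why they diverge at index 12: that position maps to byte 9 of the gzip stream, which is the "OS" field defined by RFC 1952. The sketch below (not CDK code; prefixes copied from the two strings above) prints the differing values. The value 3 is the RFC 1952 code for Unix; 19 is the value zlib emits on macOS, which is not one of the codes listed in the RFC itself.

```javascript
// Decode only the first 16 base64 characters (12 bytes): enough to reach
// the gzip header's OS field at byte offset 9 (RFC 1952).
const macHeader = Buffer.from('H4sIAAAAAAAAE11R', 'base64');
const ubuHeader = Buffer.from('H4sIAAAAAAAAA11R', 'base64');

console.log('gzip magic    :', macHeader.subarray(0, 2)); // <Buffer 1f 8b>
console.log('macOS OS byte :', macHeader[9]);  // 19 (zlib's value on macOS)
console.log('Ubuntu OS byte:', ubuHeader[9]);  // 3  (RFC 1952: Unix)
```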

@njlynch
Contributor

njlynch commented Jun 28, 2021

With zlib, it looks like one option might be to convert to deflateRawSync, which intentionally omits the standard headers in favor of just the raw compressed data.

The server side would need to be tested to verify that a corresponding inflate method is available, and we would likely also need to bump the version prefix to indicate that the incoming data is header-less.

Alternatively, we could potentially just hack the resulting output (from zlib.gzipSync) to set the OS byte to "unknown". Slightly uglier implementation, but certainly also less involved.

@njlynch
Contributor

njlynch commented Jun 28, 2021

POC of the OS-byte hack:

diff --git a/packages/@aws-cdk/core/lib/private/metadata-resource.ts b/packages/@aws-cdk/core/lib/private/metadata-resource.ts
index 1cccc4f24..6d4713abc 100644
--- a/packages/@aws-cdk/core/lib/private/metadata-resource.ts
+++ b/packages/@aws-cdk/core/lib/private/metadata-resource.ts
@@ -74,7 +74,9 @@ export function formatAnalytics(infos: ConstructInfo[]) {
   infos.forEach(info => insertFqnInTrie(`${info.version}!${info.fqn}`, trie));

   const plaintextEncodedConstructs = prefixEncodeTrie(trie);
-  const compressedConstructs = zlib.gzipSync(Buffer.from(plaintextEncodedConstructs)).toString('base64');
+  const compressedConstructsBuffer = zlib.gzipSync(Buffer.from(plaintextEncodedConstructs));
+  compressedConstructsBuffer[9] = 255;
+  const compressedConstructs = compressedConstructsBuffer.toString('base64');

   return `v2:deflate64:${compressedConstructs}`;
 }

@codeedog
Author

codeedog commented Jun 28, 2021

There is a third option: compare the decompressed data rather than assume the compressed data is stable. This will (1) provide you the guarantee you're looking for across OSes, (2) won't disrupt every deployment like the deflateSync change and (3) won't run the risk of some future standard change causing a problem (blitting over the OS byte might - even if this is very unlikely).

The above fix still assumes the compression algorithm is stable across all implementations, OSes and individual runs. I do not believe this assumption is valid and, that only by comparing the inflated values do you have a guarantee.

@jogold
Contributor

jogold commented Jun 29, 2021

For the diff vs deploy part, there's the question of normalizing the different implementations:

// filter out 'AWS::CDK::Metadata' resources from the template
if (diff.resources && !strict) {
  diff.resources = diff.resources.filter(change => {
    if (!change) { return true; }
    if (change.newResourceType === 'AWS::CDK::Metadata') { return false; }
    if (change.oldResourceType === 'AWS::CDK::Metadata') { return false; }
    return true;
  });
}

vs

// Template has changed (assets taken into account here)
if (JSON.stringify(deployStackOptions.stack.template) !== JSON.stringify(await cloudFormationStack.template())) {
  debug(`${deployName}: template has changed`);
  return false;
}

For the metadata string (and I don't know your backend of course) I wonder if compression makes sense... I have a template with around 140 resources with constructs from various packages and the difference in bytes is 868... also doesn't it go against the principle that CF templates are "human readable" (to understand diffs and change sets)?

@rix0rrr
Contributor

rix0rrr commented Jul 5, 2021

I wonder if compression makes sense... [...] doesn't it go against the principle that CF templates are "human readable"

Sure, but at the same time the metadata isn't load-bearing, and if we listed out the data in there in full it would be distractingly huge (taking up a lot of vertical space in a viewer program). Plus, we would be eating 10x more space in a space-constrained template. So I think compression is an acceptable trade-off.

It's a good point that this might cause unintended updates, which don't really harm anyone but take up unnecessary time.

I think a combination of:

  • hacking out the OS byte like Nick suggested, plus
  • having the short-circuit function disregard the metadata resource

are the best we can do (apart from you configuring --no-version-reporting).

@rix0rrr rix0rrr added effort/medium Medium work item – several days of effort p2 and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. needs-triage This issue or PR still needs to be triaged. labels Jul 5, 2021
@rix0rrr rix0rrr removed their assignment Jul 5, 2021
@codeedog
Author

codeedog commented Jul 7, 2021

@rix0rrr I don't agree that "unintended" Layer rebuilds resulting in new versions are entirely "harmless".

@eladb
Contributor

eladb commented Aug 31, 2021

This also affects GitHub Workflows support for CDK Pipelines because the asset ID of the template that's encoded in the deploy.yml workflow (generated on the developer machine) is inconsistent with the asset ID generated during the workflow run.

Currently we have to disable version reporting to make that work.

eladb pushed a commit that referenced this issue Aug 31, 2021
The gzip header includes an “OS” byte that indicates which operating system was used to perform the operation. This means that the CDK Metadata analytics string would be different for the same CDK app in two different operating systems.

To address this, we explicitly set the gzip OS flag to 255 (unknown).

Fixes #15322
@mergify mergify bot closed this as completed in #16300 Aug 31, 2021
mergify bot pushed a commit that referenced this issue Aug 31, 2021
…6300)

The gzip header includes an “OS” byte that indicates which operating system was used to perform the operation. This means that the CDK Metadata analytics string would be different for the same CDK app in two different operating systems.

To address this, we explicitly set the [gzip OS flag] to 255 (unknown).

Fixes #15322

[gzip OS flag]: https://datatracker.ietf.org/doc/html/rfc1952


----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@github-actions

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

TikiTDO pushed a commit to TikiTDO/aws-cdk that referenced this issue Sep 6, 2021
david-doyle-as24 pushed a commit to david-doyle-as24/aws-cdk that referenced this issue Sep 7, 2021