
Decompress GZIP'd user data #1762

Merged: cartermckinnon merged 1 commit into main from gzip-userdata on May 23, 2024

Conversation

cartermckinnon (Member) commented Apr 11, 2024

Issue #, if available:

Fixes #1734

Description of changes:

Adds support for GZIP-compressed user data.

The following scenarios are supported (a short sketch of scenario 3 follows the list):

  1. User data consisting of a NodeConfig compressed with GZIP.
  2. User data that is a multi-part MIME document, compressed with GZIP.
  3. User data that is a multi-part MIME document containing parts that are individually GZIP'd (and carry the Content-Encoding: gzip header).
  4. A combination of 2 and 3.
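
As a rough sketch of scenario 3 (illustrative only, not code from this PR), the example below assembles such user data with Go's standard library. The NodeConfig YAML body and the application/node.eks.aws media type are assumed example values.

```go
// Illustrative sketch: build a multi-part MIME user data document whose
// NodeConfig part is individually gzip'd and labeled with Content-Encoding: gzip.
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"mime/multipart"
	"net/textproto"
)

func main() {
	nodeConfigYAML := []byte(`---
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: my-cluster
`)

	// Compress just the NodeConfig part (scenario 3).
	var compressed bytes.Buffer
	zw := gzip.NewWriter(&compressed)
	if _, err := zw.Write(nodeConfigYAML); err != nil {
		panic(err)
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}

	// Wrap it in a multi-part MIME document and mark the part's encoding.
	var userData bytes.Buffer
	mw := multipart.NewWriter(&userData)
	header := textproto.MIMEHeader{}
	header.Set("Content-Type", "application/node.eks.aws")
	header.Set("Content-Encoding", "gzip")
	part, err := mw.CreatePart(header)
	if err != nil {
		panic(err)
	}
	if _, err := part.Write(compressed.Bytes()); err != nil {
		panic(err)
	}
	if err := mw.Close(); err != nil {
		panic(err)
	}

	fmt.Printf("user data: %d bytes, boundary %q\n", userData.Len(), mw.Boundary())
}
```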

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

ndbaker1 (Member) left a comment

The change LGTM. It doesn't seem like we're going to adopt decompressing individual parts, so can we do a doc update before merging?

cartermckinnon force-pushed the gzip-userdata branch 2 times, most recently from 3fceead to d741996 on May 16, 2024 00:32
cartermckinnon (Member, Author) replied:

> doesn't seem like we are going to adopt decompressing individual parts

@ndbaker1 latest rev adds support for this, PTAL.

cartermckinnon (Member, Author) commented May 16, 2024

I rewrote the unit tests for this config provider; they're more verbose, but I think being explicit about the input is clearer and makes it easier to add test cases in the future.
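
A hypothetical sketch of that explicit-input, table-driven style (not the PR's actual test code); the package name, helper, and assertions below are invented for illustration.

```go
// Hypothetical sketch: each test case spells out the raw user data bytes
// rather than building them through MIME helpers.
package configprovider

import (
	"bytes"
	"compress/gzip"
	"testing"
)

// gzipBytes is a helper invented for this sketch.
func gzipBytes(t *testing.T, data []byte) []byte {
	t.Helper()
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		t.Fatal(err)
	}
	if err := zw.Close(); err != nil {
		t.Fatal(err)
	}
	return buf.Bytes()
}

func TestUserDataInputs(t *testing.T) {
	rawNodeConfig := []byte("---\napiVersion: node.eks.aws/v1alpha1\nkind: NodeConfig\n")
	cases := []struct {
		name     string
		userData []byte
	}{
		{name: "plain NodeConfig", userData: rawNodeConfig},
		{name: "gzip'd NodeConfig", userData: gzipBytes(t, rawNodeConfig)},
	}
	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			// In the real tests, tc.userData would be handed to the user data
			// config provider and the parsed NodeConfig asserted on.
			if len(tc.userData) == 0 {
				t.Fatal("expected non-empty user data")
			}
		})
	}
}
```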

Member commented:

Did you do any integration testing here? Or is that covered by the existing integration tests?

Member commented:

I like the change; thanks for skipping over the boilerplate needed to set up MIME documents and testing the actual provider interface.

(Resolved review threads on nodeadm/internal/configprovider/userdata.go)
Comment on lines 108 to 119
if part.Header.Get(contentEncodingHeader) == "gzip" {
	decompressedNodeConfigPart, err := decompressIfGZIP(nodeConfigPart)
	if err != nil {
		return nil, err
	}
	nodeConfigPart = decompressedNodeConfigPart
}
Member commented:

Why do you need to make a check that it's gzip and then gracefully handle it not being gzip?

Member commented:

Wouldn't just calling decompressIfGZIP in all cases work better, since it's fallible in constant time and safe either way?

To not hit this branch you'd have to set the nodeConfig mediaType but forget to add the gzip encoding header, which I'm sure would only happen by accident. Maybe handling it explicitly is more "proper", but I'm not sure it's necessary.

cartermckinnon (Member, Author) replied:

MIME semantics dictate that if the section is GZIP'd, this header needs to be there. If a user (or library) creates MIME documents that use GZIP but omit this header, that should fail, because it isn't conformant to the spec.

Generally, folks only use GZIP compression via a library, which should implement this properly; folks aren't hand-writing and hand-compressing their MIME docs.

cartermckinnon (Member, Author) replied:

Turns out cloud-init doesn't care what the spec says; it just tries to base64-decode and gzip-decompress pretty much anything it handles, at any scope. It's more important for us to match the cloud-init behavior than to follow the spec, for better or worse. Updated in the latest rev.
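
A minimal sketch of that lenient, cloud-init-style handling, for illustration only (not the code merged in this PR): base64 decoding and gzip decompression are both attempted opportunistically, and the raw bytes are kept whenever either step doesn't apply. The function names mirror the ones discussed above but the bodies are assumptions.

```go
// Sketch only: opportunistic base64 + gzip handling in the spirit of cloud-init.
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"fmt"
	"io"
)

// decodeIfBase64 returns the decoded bytes if the input is valid base64,
// otherwise the input unchanged.
func decodeIfBase64(data []byte) []byte {
	decoded := make([]byte, base64.StdEncoding.DecodedLen(len(data)))
	n, err := base64.StdEncoding.Decode(decoded, data)
	if err != nil {
		return data
	}
	return decoded[:n]
}

// decompressIfGZIP returns the decompressed bytes if the input starts with the
// gzip magic header, otherwise the input unchanged.
func decompressIfGZIP(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return data, nil
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	out, err := decompressIfGZIP(decodeIfBase64([]byte("plain text passes through unchanged")))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```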

cartermckinnon (Member, Author) commented:

/ci
+workflow:os_distros al2023

Contributor commented:

@cartermckinnon roger that! I've dispatched a workflow. 👍

Contributor commented:

@cartermckinnon the workflow that you requested has completed. 🎉

AMI variant     Build       Test
1.23 / al2023   success ✅  failure ❌
1.24 / al2023   success ✅  success ✅
1.25 / al2023   success ✅  success ✅
1.26 / al2023   success ✅  success ✅
1.27 / al2023   success ✅  success ✅
1.28 / al2023   success ✅  success ✅
1.29 / al2023   success ✅  success ✅
1.30 / al2023   success ✅  success ✅

cartermckinnon (Member, Author) commented May 16, 2024

One flake:

[Fail] [sig-api-machinery] CustomResourcePublishOpenAPI [Privileged:ClusterAdmin] [It] updates the published spec when one version gets renamed [Conformance]

Not related to this change.

awslabs deleted a comment from github-actions bot on May 16, 2024
cartermckinnon force-pushed the gzip-userdata branch 2 times, most recently from 4936741 to 2900e9d on May 22, 2024 21:09
cartermckinnon (Member, Author) commented:

/ci
+workflow:os_distros al2023

Contributor commented:

@cartermckinnon roger that! I've dispatched a workflow. 👍

Contributor commented:

@cartermckinnon the workflow that you requested has completed. 🎉

AMI variant     Build       Test
1.23 / al2023   success ✅  success ✅
1.24 / al2023   success ✅  success ✅
1.25 / al2023   success ✅  success ✅
1.26 / al2023   success ✅  success ✅
1.27 / al2023   success ✅  success ✅
1.28 / al2023   success ✅  success ✅
1.29 / al2023   success ✅  success ✅
1.30 / al2023   success ✅  success ✅

Comment on lines +113 to +118
if err != nil {
	return nil, err
}
nodeConfigPart, err = decompressIfGZIP(nodeConfigPart)
if err != nil {
	return nil, err
Member commented:

Nit: there's a custom error message around the calls above but not here :(
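
For illustration (the message wording is a hypothetical example, not taken from this PR), the same wrapped-error style could look like this:

```go
// Hypothetical illustration of the error-wrapping style; the message text is
// an example, not wording from the PR.
package main

import (
	"compress/gzip"
	"fmt"
	"strings"
)

func main() {
	// Force a gzip failure so the wrapped message is visible.
	_, err := gzip.NewReader(strings.NewReader("not gzip"))
	if err != nil {
		err = fmt.Errorf("failed to decompress node config part: %w", err)
	}
	fmt.Println(err)
	// Output: failed to decompress node config part: gzip: invalid header
}
```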

ndbaker1 (Member) commented May 23, 2024:

The more I think on this, should we be more tolerant of these errors so that we can try the remaining parts? Maybe not: if there's just a simple mistake, the user would probably prefer it doesn't continue if their config isn't included.

Comment on lines +145 to +147
decodedLen, err := e.Decode(decodedData, data)
if err != nil {
	return data, nil
Member commented:

So none of the possible errors here end up mattering?

Member commented:

Oh, I guess it can only return CorruptInputError 🤔
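
For reference, a small standalone illustration (not from the PR) of why that fallback is safe: the standard library's base64 decoder reports failures as base64.CorruptInputError, so treating any decode error as "this isn't base64" and keeping the raw data is a deliberate fallback rather than a swallowed failure.

```go
// Illustration: base64 decode failures surface as base64.CorruptInputError.
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

func main() {
	data := []byte("definitely not base64!")
	decoded := make([]byte, base64.StdEncoding.DecodedLen(len(data)))
	_, err := base64.StdEncoding.Decode(decoded, data)

	var corrupt base64.CorruptInputError
	fmt.Println(errors.As(err, &corrupt)) // prints true
}
```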

ndbaker1 (Member) left a comment:

LGTM as-is; one nit comment.


cartermckinnon merged commit d87c6c4 into main on May 23, 2024
10 checks passed
cartermckinnon deleted the gzip-userdata branch on May 23, 2024 19:21
atmosx pushed a commit to gathertown/amazon-eks-ami that referenced this pull request Jun 18, 2024
Successfully merging this pull request may close these issues:

nodeadm doesn't support userdata content-encoded with gzip (#1734)