HIP Support Helm release data stored in multiple K8s Secrets #256

ralfv · 2022-06-20T16:30:09Z

Helm release data encoded into the release field in a Kubernetes Secret can cause the Secret as a whole to exceed 1 MiB (as defined by MaxSecretSize in kubernetes/pkg/apis/core/types.go).

When this happens, the Kubernetes Secret cannot be stored in Etcd and the Helm install or upgrade fails.

Splitting the Helm release data across multiple Kubernetes Secrets resolves this problem.

Helm release data encoded into the release field in a Kubernetes Secret can cause the Secret as a whole to exceed 1 MiB (as defined by MaxSecretSize in kubernetes/pkg/apis/core/types.go). When this happens, the Kubernetes Secret cannot be stored in Etcd and the Helm install or upgrade fails. Splitting the Helm release data across multiple Kubernetes Secrets resolves this problem. Signed-off-by: Ralf Van Dorsselaer <ralfvandorsselaer@gmail.com>

Signed-off-by: Ralf Van Dorsselaer <ralfvandorsselaer@gmail.com>

… Secrets PR helm/community#256

PR helm/helm#11087 Signed-off-by: Ralf Van Dorsselaer <ralfvandorsselaer@gmail.com>

bacongobbler · 2022-06-27T17:28:04Z

hips/hip-9999.md

+
+## Backwards compatibility
+
+If the chunking is implemented in the secret/secrets HELM_DRIVER then backward compatibility to previously installed Helm releases is guaranteed. If no chunking information is encoded in the release secret then the driver can behave as before chunking support was added. If no chunking is needed then the release secret can be created without any additional metadata and with compatible name.


What about compatibility with third-party tooling? For example https://github.com/helm/helm-mapkubeapis.

What I understand from looking at the https://github.com/helm/helm-mapkubeapis code is that it is not impacted as it works as a Helm plugin and its code seems to be agnostic to how the Helm release is actually stored.

For tools/scripts that read K8s Secrets directly (and are expecting to find a single encoded blob that describes the Helm release) things are different.

In the scenario where the encoded release data size is < (1000 *1024) bytes there is no problem for third parties to read and parse the Helm release data. Only if the Helm release data is split into multiple parts these tools/scripts would be broken.

I have following question:

Is the Helm release information encoded into a K8s Secret considered a public interface? Is there public documentation from the Helm project that describes the format of the data inside a Kubernetes Secret? Is the Helm project encouraging reading K8s Secrets directly and developing tooling around this?

If the internal format is not documented publicly and considered a private interface then any third party tool that relies on it will have reverse engineered it from looking at K8s Secrets and Helm code and is therefore vulnerable to any changes to the internal format.

If the internal format is publicly documented and considered a public interface then that does not change much because any change to support storing a Helm release differently in K8s Secrets would result in incompatible changes for consuming third parties. The breaking change for the Helm release storage format for said third parties would have to be communicated early as part of the major release change log.

While the internal format is not documented publicly, it is expected that older versions of Helm 3's SDK is compatible until the release of Helm 4.

If the format of the secret changes to the point that older versions of Helm 3 cannot support read/write operations against a release secret, that is considered a breaking change. This could have unintended issues, like Helm 3.0.0's helm list or helm upgrade corrupting the release ledger because it cannot read chunked secrets.

As a hypothetical situation... Say you started using Helm 3, and upgraded to Helm 3.x with chunked release support. You now have three releases:

v1 - 512B
v2 - 512B
v3 - 3.2MB

A newer version of Helm will be able to understand that v3 is the latest version. But what would be the behaviour of Helm 3.0.0 in this situation? It is undetermined what will happen. Best case scenario is that it returns an error and exits without affecting the system. Worst case scenario is that it either

Updates kubernetes objects, but fails to update the release, or

Corrupts the storage system because it only reads the first chunk of v3, or

Reads v2 as the latest, because that's the "latest" release it can read without returning an error.

This is why we've generally steered users towards the database driver for Helm 3, because it is documented to support release types larger than 1MB in size.

Is that a realistic expectation that any old 3.x client works with any release of any chart? We have been publishing various releases of our product as a helm chart over the past years, and I recall having to advice customers several times that we need a certain minimum version of helm, e.g. to avoid running into some defect that was fixed only in a newer version - and that did not create problems with customers, they seemed to be on reasonably up-to-date versions anyway.

As far as I understand your concern, such a worst-case corruption problem might only occur for an application that depends on the chunking support (i.e. a respective version of helm), and there is no risk for a broader corruption at the overall cluster. In other words, if a helm-based applications has a large chart, it needs to be the application's concern that they need to instruct their customers to be on a respective version of helm, and as said, we had to that anyway in the past because of defect fixes we depended on.

bacongobbler · 2022-08-04T17:00:19Z

hips/hip-9999.md

+
+## Rejected ideas
+
+n/a


What about using the database driver? That is a known supported driver that works with releases larger than 1MB in size.

https://helm.sh/docs/topics/advanced/#sql-storage-backend

In our case, we are a software vendor who provide helm-based applications for customers to deploy in their cluster, we are not the user of helm itself. With that, we have to work with typical helm setups; asking for a certain minimum version of helm (e.g. for defect fixes in helm) has never been a problem and customers seemed to be current anyways, but asking to configured a PG only for helm would introduce a level of complexity that I'm not sure customers would accept, and I have not seen a single customer who has configured their helm with a PG storage backend.

However, if you're worried about changing the size-constrained etcd-storage backend, is it worth considering to introduce a parallel, chunked-etcd-storage backend that is in parallel? That would avoid the customer complexity of having to deploy a Postgresql and allows to use the storage that is guaranteed to be available. I believe the approach that @ralfv suggested with extending the existing etcd backend is more elegant and simpler, though, but it might address your 3.0.0 compatiblity concern.

What about using the database driver? That is a known supported driver that works with releases larger than 1MB in size.

https://helm.sh/docs/topics/advanced/#sql-storage-backend

@bacongobbler A big part of Helm (certainly v3) is that it does not require any additional infrastructure besides a Kubernetes cluster. I really want to avoid introducing additional infrastructure to satisfy the requirement to store Secrets that are larger than 1MB.

might address your 3.0.0 compatiblity concern.

@fravoss I think the compatibility concern goes away if support for chunked secrets is implemented starting with Helm v4 and converting is transparently handled between Secrets < 1MB and Secrets > 1MB - as is curently done in the initial implementation see helm/helm#11087 - as Helm v3 will simply be unable to process such Secrets there is no harm. It would also be possible to backport to Helm v3 obviously.

bacongobbler · 2022-08-04T17:05:47Z

hips/hip-9999.md

+
+## Security implications
+
+Theoretically a MTM Man in The Middle attack is possible where requests to the Kubernetes API server are intercepted. But this type of attack applies to any access to the Kubernetes API server.


etcd relies on the fact that key/values are no larger than 1MB in size. By chunking key/values into smaller sizes just so they can fit in etcd does not address the root issue: introducing more read requests will increase the latency of other read requests. It is theoretically possible that a release secret that is >1MB in size could potentially DDOS etcd. Because etcd has to dedicate a large amount of resources to read several keys, other incoming read requests could grind to a halt.

Also worth keeping in mind that etcd has a default hard storage limit at 2GB. Even if you allow release secrets larger than 1MB, you will very quickly run out of space.

I am incredibly wary of "working around" a limitation that was explicitly put in place by the etcd engineers.

True, but IMO only in theory. The idea is not to make the chunks smaller but to distribute a Helm release that comes out to 1.5MB across two Secrets one being 1MB the other 512K. In implementation I would not make the chunksize configurable in any way. (It is for testing in the current PR)

@bacongobbler I checked the etcd-dev discussion around this topic.

The main reason what they state for their size limit is MTTR and not performance during regular calls.
https://groups.google.com/g/etcd-dev/c/vCeSLBKC_M8/m/2pPBXpc9BwAJ

Also, they suggest you can go for a replicated etcd cluster if you are choosing higher size(>8GB) and MTTR is not that much of a concern for you.

I believe this is good indication from the etcd-dev themselves.

Which leads that this split mechanism is a good solution given that the sql backend storage has no migration support (existing installation to sql backed storage).

Thanks.

Hammond95 · 2024-05-23T12:40:14Z

will this ever be implemented? Could you please provide a solution to this problem?

carlossg · 2024-06-21T07:52:52Z

hips/hip-9999.md

+## Rejected ideas
+
+n/a
+


what about changing the compression algorithm ?
I have tested with one Helm secret that is 284K gzip compressed. Decompressed it is 2.0M

with bzip2 179K 37% savings

with brotli 163K 42% savings

Valid consideration, @carlossg. Compared with the original proposal to store the release in chunked secrets, I'd prefer the chunked storage from the original proposal, for the following reasons:

chunked storage has no upper limit (for realistic charts - I ignore the theoretical 2GB limit of etcd), whereas a 40% compression gain is IMHO too close to the current limit, i.e. the risk remains that a chart that would need double the size is already out of the picture

the chunked storage can be backwards-compatible at least in the way that if the release fits into a single secret, the storage can behave exactly as before, and only large charts that could not be installed would need the new version, whereas a compression change would impact all stored releases.

@Hammond95 , the chunked storage is implemented in helm/helm#12277 and pending review.

I prefer the compression :)

backwards compatibility: you could do the same check, if size under limit use gzip, else use brotli (or force brotli in all if you wish)

forward compatibility: if secret cannot be decompressed with brotli, use gzip

but the main reason, with chunked storage you still have the data in etcd and etcd has storage limits. helm charts account to a big portion of our etcd storage and anyone doing 100s/1000s of helm releases will have this problem

@carlossg , I believe our difference in preference comes from trying to solve different problems.

This HIP / PR is addressing the limitations that a single release of a large application (== helm chart) can exceed the size limit of the secret storage. A ~40% increase only shifts the limit of what "too large chart" means, but strategically, I wouldn't be comfortable that this addresses the conceptual problem of storing in a single secret.

You seem to face rather the problem that you have MANY releases of helm charts, that individually don't run into the 1MB limit of etcd, but in total sum up to exceed etcd overall storage limits. With that, I'd assume you'd ALWAYS want the improved compression, not only for releases that run into the 1MB limit. That's a valid problem, and I agree that improved compression can move the needle somewhat as long as the underlying release encoding is not optimized by itself - but as said, IMHO this is not the problem that's in the focus of this HIP.

ralfv added 2 commits June 20, 2022 18:26

Updated submit date

9ee6866

Signed-off-by: Ralf Van Dorsselaer <ralfvandorsselaer@gmail.com>

pull-request-size bot added the size/M label Jun 20, 2022

Rename hip-9999 to hip-9999.md

12790d0

Signed-off-by: Ralf Van Dorsselaer <ralfvandorsselaer@gmail.com>

ralfv added a commit to ralfv/helm that referenced this pull request Jun 27, 2022

Initial code for HIP Support Helm release data stored in multiple K8s…

c1b15f9

… Secrets PR helm/community#256

ralfv mentioned this pull request Jun 27, 2022

WIP: Initial code for HIP Support Helm release data stored in multiple K8s… helm/helm#11087

Closed

3 tasks

Add PR link for implementation

0a7455b

PR helm/helm#11087 Signed-off-by: Ralf Van Dorsselaer <ralfvandorsselaer@gmail.com>

bacongobbler reviewed Jun 27, 2022

View reviewed changes

bacongobbler reviewed Aug 4, 2022

View reviewed changes

yxxhero mentioned this pull request Aug 30, 2022

feat(affinity): support affinity goharbor/harbor-operator#956

Draft

This was referenced Feb 6, 2023

ConfigMap size limit prevents storing release helm/helm#1413

Closed

Support for splitting release manifest with Secrets storage driver helm/helm#11791

Open

HIP-0019: Split manifest with Secrets storage driver #284

Open

ralfv mentioned this pull request Aug 6, 2023

Support Helm release data stored in multiple K8s Secrets helm/helm#12277

Open

3 tasks

Haleygo mentioned this pull request Nov 9, 2023

release secret size optimization helm/helm#11493

Open

carlossg reviewed Jun 21, 2024

View reviewed changes

shahar-h mentioned this pull request Sep 23, 2024

CRDs upgrade support envoyproxy/gateway#4001

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HIP Support Helm release data stored in multiple K8s Secrets #256

HIP Support Helm release data stored in multiple K8s Secrets #256

ralfv commented Jun 20, 2022

bacongobbler Jun 27, 2022

ralfv Jun 30, 2022

bacongobbler Aug 4, 2022 •

edited

Loading

fravoss Aug 10, 2022

bacongobbler Aug 4, 2022 •

edited

Loading

fravoss Aug 10, 2022

ralfv Aug 25, 2022 •

edited

Loading

ralfv Aug 25, 2022 •

edited

Loading

bacongobbler Aug 4, 2022 •

edited

Loading

ralfv Aug 25, 2022

rajeshjsl Jul 16, 2024 •

edited

Loading

Hammond95 commented May 23, 2024

carlossg Jun 21, 2024

fravoss Jun 21, 2024

carlossg Jun 21, 2024 •

edited

Loading

fravoss Jun 21, 2024 •

edited

Loading


		## Backwards compatibility

		If the chunking is implemented in the secret/secrets HELM_DRIVER then backward compatibility to previously installed Helm releases is guaranteed. If no chunking information is encoded in the release secret then the driver can behave as before chunking support was added. If no chunking is needed then the release secret can be created without any additional metadata and with compatible name.


		## Security implications

		Theoretically a MTM Man in The Middle attack is possible where requests to the Kubernetes API server are intercepted. But this type of attack applies to any access to the Kubernetes API server.

HIP Support Helm release data stored in multiple K8s Secrets #256

Are you sure you want to change the base?

HIP Support Helm release data stored in multiple K8s Secrets #256

Conversation

ralfv commented Jun 20, 2022

bacongobbler Jun 27, 2022

Choose a reason for hiding this comment

ralfv Jun 30, 2022

Choose a reason for hiding this comment

bacongobbler Aug 4, 2022 • edited Loading

Choose a reason for hiding this comment

fravoss Aug 10, 2022

Choose a reason for hiding this comment

bacongobbler Aug 4, 2022 • edited Loading

Choose a reason for hiding this comment

fravoss Aug 10, 2022

Choose a reason for hiding this comment

ralfv Aug 25, 2022 • edited Loading

Choose a reason for hiding this comment

ralfv Aug 25, 2022 • edited Loading

Choose a reason for hiding this comment

bacongobbler Aug 4, 2022 • edited Loading

Choose a reason for hiding this comment

ralfv Aug 25, 2022

Choose a reason for hiding this comment

rajeshjsl Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Hammond95 commented May 23, 2024

carlossg Jun 21, 2024

Choose a reason for hiding this comment

fravoss Jun 21, 2024

Choose a reason for hiding this comment

carlossg Jun 21, 2024 • edited Loading

Choose a reason for hiding this comment

fravoss Jun 21, 2024 • edited Loading

Choose a reason for hiding this comment

bacongobbler Aug 4, 2022 •

edited

Loading

bacongobbler Aug 4, 2022 •

edited

Loading

ralfv Aug 25, 2022 •

edited

Loading

ralfv Aug 25, 2022 •

edited

Loading

bacongobbler Aug 4, 2022 •

edited

Loading

rajeshjsl Jul 16, 2024 •

edited

Loading

carlossg Jun 21, 2024 •

edited

Loading

fravoss Jun 21, 2024 •

edited

Loading