Pass SecurityContext in all the AppRepo Controller's jobs #6605
Conversation
✅ Deploy Preview for kubeapps-dev canceled. Built without sensitive environment variables.
Ouch - sorry that it wasn't as easy a fix as I'd originally thought.
Even if the sync jobs were given a security context (by means of each AppRepo CRD), this information was not available for the cleanup jobs. Note that those jobs are spun up once a NotFound error is thrown when fetching an AppRepo.
Right, that signals that it's too late to be running the clean-up job at that point. Kubernetes has a solution for that (see below).
The approach applied here is simply to pass this information via command-line arguments. However, to maximize the usefulness of the feature, it is also taken into account for the sync jobs (if a config is passed via the AppRepo CRD, it overrides the default one).
Why would we provide an AppRepo CRD-specific security context if the controller is going to apply a default security context anyway?
See what you think, but I reckon the native way to solve this would be for us to instead use finalizers for the AppRepo CRDs (which allow the controller to run clean-up tasks before the resource is deleted).
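For context, a rough sketch of what the finalizer-based flow looks like on the resource itself (assuming the usual `kubeapps.com/v1alpha1` AppRepository API; the finalizer name, repo name and URL below are hypothetical, not necessarily what the controller would use):

```yaml
apiVersion: kubeapps.com/v1alpha1
kind: AppRepository
metadata:
  name: example-repo                          # hypothetical name
  namespace: kubeapps
  finalizers:
    - apprepositories.kubeapps.com/cleanup    # hypothetical finalizer name
spec:
  url: https://charts.example.com             # placeholder URL
```

With a finalizer present, `kubectl delete` only sets `metadata.deletionTimestamp`; the object stays around until the controller finishes its cleanup and removes the finalizer, so the cleanup job can still read the AppRepo (including its security context) instead of reacting to a NotFound error.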
Force-pushed from 7dfcaff to eeada4f
Force-pushed from eeada4f to 5f83dd8
It does not really make much sense, I agree. The point is that, currently, we do support customizing the sync job's pod template (and thus its security context) per AppRepo. Whether or not using a custom security context in each repo actually makes sense... I don't know; I haven't thought about it thoroughly, though I have the impression it could be useful in some contexts where the sync job is run in a certain restricted namespace somehow. Coming back to this PR, we would now have two different ways of specifying a security context: 1) the existing per-repo configuration in the AppRepo CRD, and 2) the new controller-wide default passed via command-line arguments.
In this PR, I opted for option D, as I see it as the least disruptive one while maximizing backward compatibility.
+1000 - the current approach based upon syncJobs + cleanupJobs seems like a bad smell given the current myriad of k8s-native solutions... but if we start improving it, we would end up with something like what Flux does.
Sorry, my thought was more that the AppRepo CRD-specific security context makes sense at the moment, but if we add a default security context then it no longer makes sense (to me?). In a multi-tenant use-case, having an AppRepo-specific security context allows different users to utilise different resources. But if they all have access to the default context, then you no longer have multi-tenancy for those resources (nodes, or whatever else is protected by the security context).
Yes.
Yes, it's less disruptive, but I'm worried it's adding tech-debt in a direction away from our goal, when it's a minor problem that could be solved in a way that moves us towards our goal.
Yeah - the question is: is there a positive step forward here that we could take that's simple and doesn't add something moving us away from our goal? Not sure, but one idea that comes to mind: we could instead update the code to run the cleanup when the app repo is deleted by Kubeapps. It's not perfect, as it won't cover the case where the app repo is deleted via the CLI, but it's a step towards doing the work in the controller, triggered (later) by a finalizer. (I don't think the cleanup is a huge amount of work.) I'm going to leave a +1 on this, because I know that you'll think about it in more detail than I have, considering this info, and make a good decision either way. I just wanted to be sure you understood my concern about adding something that may hinder us in the future, when there may be other steps forward more aligned with the goal of using a finalizer.
Haven't had the time to reply yet (will do it tomorrow), but TL;DR: I gave the finalizers approach a try... and it seems to work pretty well :)

apprepos.mp4
Yep, I agree. I was a bit reluctant to add more features related to the CR lifecycle and just went with the simplistic approach. But if we keep doing that, we will never get around to fixing that tech debt.
I don't think that fully works, since it can even lead to some inconsistent jobs. I'm not sure that diverging the behavior between what Kubeapps does and what the controller is expected to do when interacting with the k8s resources directly is a good idea. So, in the end, I just added the finalizers; it wasn't that difficult to add: #6647 :) Let's move the technical discussion there. I have also moved the extra changes I made here to #6646.
Excellent, thanks @antgamdia. I'll close this one after reviewing your others (so that we don't accidentally land it). Or maybe I'll switch the review instead.
#6646:

### Description of the change

While working on #6605, I noticed I had a bunch of unrelated changes. Instead of overloading the PR, I have extracted them into this one. The main changes are:

- Apply the same fix in the Go Dockerfiles as we did for kubeapps-apis (avoid downloading linters if `lint=false`). It helps reduce the build time locally.
- Remove some duplicated keys in the yamls, as we already discussed.
- Add some missing apprepo-controller args in the tests, mainly just for completeness.
- Fix some tests in the apprepo-controller, mainly just swapping some misplaced `get, want`.
- Handle cronjob v1beta1 vs v1 in a missing place.
- Pass the `podSecurityContext` and `containerSecurityContext` in the pod template properly.
- Update a missing copyright header.
- Fix wrong values.yaml metadata (mainly `ocicatalog/ociCatalog`).

### Benefits

Get the aforementioned issues solved.

### Possible drawbacks

N/A (not adding new logic here).

### Applicable issues

- (partially) related to #6545

### Additional information

N/A

Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
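As a side note on the `podSecurityContext` / `containerSecurityContext` item above: the two live at different levels of the Job's pod template, which is what the chart fix is about. A minimal, generic sketch (names and values are examples, not the chart's actual defaults):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: apprepo-sync-example            # hypothetical name
spec:
  template:
    spec:
      securityContext:                  # pod-level (podSecurityContext)
        runAsNonRoot: true
        fsGroup: 1001
      containers:
        - name: sync
          image: example/asset-syncer   # placeholder image
          securityContext:              # container-level (containerSecurityContext)
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
      restartPolicy: OnFailure
```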
### Description of the change

Even if the sync jobs were given a security context (by means of each AppRepo CRD), this information was not available for the cleanup jobs. This is mainly due to the fact that those jobs are spun up once a NotFound error is thrown when fetching an AppRepo. However, Kubernetes does have a native approach for dealing with these scenarios: finalizers. In #6605 we proposed a simplistic workaround based on adding more params to the controller... but as suggested in #6605 (comment), moving to finalizers is a better long-term solution.

### Benefits

Cleanup jobs are now handled within an existing AppRepo context... meaning we have the whole syncJobTemplate available to be used (i.e., securityPolicies and so on).

### Possible drawbacks

When dealing with finalizers in the past, I often found it really annoying when they get stuck and prevent the resource from being deleted. I wonder if we should add some info to the FAQ on how to manually remove the finalizers.

Additionally, and this might be something important: for the AppRepo controller to be able to `update` AppRepos in namespaces other than kubeapps (to add the finalizer), it now needs extra RBAC. Before, we were just granting `...-appprepos-read`, but now we would need to grant `...-write` as well... and I'm not sure we really want to do so. WDYT, @absoludity? Another idea is using an admission policy... but not sure again if we want to deal with that... ~(I haven't modified the RBAC yet in this PR)~ Changes have been performed finally.

### Applicable issues

- fixes #6545

### Additional information

This PR is based on top of #6646, but the main change to review is 6e70910. The rest is just moving code into separate files, mostly.

Also, I have been taking a look at `kubebuilder` to create a new controller, based on `sigs.k8s.io/controller-runtime` rather than on the workqueues we currently have. While it is pretty easy to start with ([see quickstart](https://book.kubebuilder.io/quick-start)), I think it is adding too much complexity (using kustomize, adding RBAC proxies, Prometheus metrics, etc.). I also quickly tried the k8s codegen scripts, but ran into some issues with my setup... perhaps that's the best option, though. IMO, at some point we should start thinking about moving towards a new state-of-the-art k8s controller boilerplate.

Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
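To make the RBAC concern above concrete: adding and removing a finalizer is an update of the AppRepository object itself, so a read-only role is no longer enough. A rough sketch of the kind of extra rule involved (the role name is illustrative, not the chart's actual one, and the API group/resource assume the usual `kubeapps.com` / `apprepositories` CRD):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeapps-apprepositories-write   # illustrative name
rules:
  - apiGroups: ["kubeapps.com"]
    resources: ["apprepositories"]
    # update/patch are what the controller needs in order to add the
    # finalizer on creation/sync and remove it again during cleanup.
    verbs: ["get", "list", "watch", "update", "patch"]
```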
Description of the change
Even if the sync jobs were given a security context (by means of each AppRepo CRD), this information was not available for the cleanup jobs. Note that those jobs are spun up once a NotFound error is thrown when fetching an AppRepo.
The approach applied here is simply to pass this information via command-line arguments. However, to maximize the usefulness of the feature, it is also taken into account for the sync jobs (if a config is passed via the AppRepo CRD, it overrides the default one).
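To illustrate the intended override behaviour, a minimal sketch (assuming the usual `kubeapps.com/v1alpha1` AppRepository API and its existing sync-job pod template field; the repo name, URL and values are made up, and the controller-wide default would come from the new command-line arguments, whose exact names this PR defines):

```yaml
# Illustrative sketch only: an AppRepository carrying its own security
# context, which per this PR would take precedence over the controller-wide
# default passed via command-line arguments.
apiVersion: kubeapps.com/v1alpha1
kind: AppRepository
metadata:
  name: example-repo                  # hypothetical name
  namespace: kubeapps
spec:
  url: https://charts.example.com     # placeholder URL
  syncJobPodTemplate:
    spec:
      securityContext:                # per-repo override
        runAsUser: 1001
        runAsNonRoot: true
```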
Besides, this PR is adding some minor fixes, namely:
- adding some recommended security context params (more or less, the ones used in other Bitnami charts - but Trivy still complains). I have removed those changes since CI was failing... so I think they deserve their own PR.

Benefits
Users will be able to use SecurityContext in all the jobs created by Kubeapps.
Possible drawbacks
N/A (let's see what CI thinks :P)

It seems there are some issues during the chart installation; I will look at it next week.

After reverting some changes in the actual security context, CI is passing again.

Applicable issues
Additional information
Here is a diff of the helm template before/after the changes in this PR:
Details (collapsed helm template diff)