Cannot configure securityContext of sync and cleanup jobs #6545

Closed
RGPosadas opened this issue Jul 31, 2023 · 8 comments · Fixed by #6647
Assignees: antgamdia
Labels: component/apprepository (Issue related to kubeapps apprepository), kind/bug (An issue that reports a defect in an existing feature)

Comments

@RGPosadas

RGPosadas commented Jul 31, 2023

Summary
Sync and cleanup AppRepo jobs currently cannot be configured with a container/pod SecurityContext.

Background and rationale
We are in the middle of a PSP-to-PSA migration.
Currently, all core Kubeapps components allow us to configure the container/pod SecurityContext, except for the sync and cleanup apprepository jobs.
Since we cannot configure the SecurityContext, the sync and cleanup jobs will be blocked from spawning pods if the namespace Kubeapps runs in enforces the restricted Pod Security Standard.
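For context, a namespace enforcing the restricted profile carries a Pod Security Admission label like the following (the namespace name is illustrative):

apiVersion: v1
kind: Namespace
metadata:
  name: kubeapps        # illustrative namespace name
  labels:
    # Reject pods that do not satisfy the "restricted" Pod Security Standard
    pod-security.kubernetes.io/enforce: restricted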

Description/Acceptance Criteria
Allow users to configure container/pod SecurityContext of the sync and cleanup jobs.

Extra Info
For apprepos that are defined as initialRepos, we are able to patch the AppRepository manifests, which allows us to define container/pod securityContext for the sync jobs. However, we do not have the same ability for the cleanup jobs.
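For illustration, this is roughly what such a patch ends up looking like on the AppRepository; it is a sketch only, with the repository name and the securityContext fields chosen as examples that satisfy the restricted profile:

apiVersion: kubeapps.com/v1alpha1
kind: AppRepository
metadata:
  name: my-initial-repo        # hypothetical repository name
spec:
  syncJobPodTemplate:
    spec:
      # Pod-level security context for the generated sync job
      securityContext:
        runAsNonRoot: true
        runAsUser: 10001
        seccompProfile:
          type: RuntimeDefault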

@RGPosadas changed the title from "Sync and Cleanup jobs not allowed" to "Cannot configure securityContext of sync and cleanup jobs" on Jul 31, 2023
@absoludity
Contributor

I just want to confirm that, when creating the AppRepository in Kubeapps, you did specify the sync-job pod template. I'm asking because I can see in the code that we do intend to use it as the basis for the pod template used by the cron job that's created:

// Get the predefined pod spec for the apprepo definition if exists
podTemplateSpec := apprepo.Spec.SyncJobPodTemplate
// Add labels
if len(podTemplateSpec.ObjectMeta.Labels) == 0 {
    podTemplateSpec.ObjectMeta.Labels = map[string]string{}
}
for k, v := range jobLabels(apprepo, config) {
    podTemplateSpec.ObjectMeta.Labels[k] = v
}
podTemplateSpec.ObjectMeta.Annotations = config.ParsedCustomAnnotations
// If there's an issue, will restart pod until successful or replaced
// by another instance of the job scheduled by the cronjob
// see: cronJobSpec.concurrencyPolicy
podTemplateSpec.Spec.RestartPolicy = corev1.RestartPolicyOnFailure
// Populate container spec
if len(podTemplateSpec.Spec.Containers) == 0 {
    podTemplateSpec.Spec.Containers = []corev1.Container{{}}
}
// Populate ImagePullSecrets spec
podTemplateSpec.Spec.ImagePullSecrets = append(podTemplateSpec.Spec.ImagePullSecrets, config.ImagePullSecretsRefs...)
podTemplateSpec.Spec.Containers[0].Name = "sync"
podTemplateSpec.Spec.Containers[0].Image = config.RepoSyncImage
podTemplateSpec.Spec.Containers[0].ImagePullPolicy = corev1.PullIfNotPresent
podTemplateSpec.Spec.Containers[0].Command = []string{config.RepoSyncCommand}
podTemplateSpec.Spec.Containers[0].Args = apprepoSyncJobArgs(apprepo, config)
podTemplateSpec.Spec.Containers[0].Env = append(podTemplateSpec.Spec.Containers[0].Env, apprepoSyncJobEnvVars(apprepo, config)...)
podTemplateSpec.Spec.Containers[0].VolumeMounts = append(podTemplateSpec.Spec.Containers[0].VolumeMounts, volumeMounts...)
// Add volumes
podTemplateSpec.Spec.Volumes = append(podTemplateSpec.Spec.Volumes, volumes...)

See https://kubeapps.dev/docs/latest/tutorials/managing-package-repositories/#modifying-the-synchronization-job for more info about customising the sync job pod template.

Can you please paste your AppRepository yaml to compare? (redacted if necessary)

Thanks

@ppbaena added the awaiting-more-evidence (Need more info to actually get it done.) label on Aug 1, 2023
@RGPosadas
Author

Sample values definition:

  apprepository:
    containerSecurityContext:
      runAsUser: 10001
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
      runAsNonRoot: true
      seccompProfile:
        type: RuntimeDefault

Templated AppRepository:

apiVersion: kubeapps.com/v1alpha1
kind: AppRepository
metadata:
  labels:
    app.kubernetes.io/instance: apps
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kubeapps
    helm.sh/chart: kubeapps-12.4.6
  name: test-apprepo
  namespace: test-apprepo
spec:
  auth:
    header:
      secretKeyRef:
        key: authorizationHeader
        name: test-apprepo
  ociRepositories:
  - <redacted>
  syncJobPodTemplate:
    spec:
      securityContext:
        runAsUser: 10001
  type: oci
  url: <redacted>

Only the securityContext.runAsUser gets propagated down, as seen in https://github.com/vmware-tanzu/kubeapps/blob/main/chart/kubeapps/templates/apprepository/apprepositories.yaml#L42-L45

@RGPosadas
Author

RGPosadas commented Aug 1, 2023

Sorry I misread what you wrote, but I'll leave my previous comment.

Yes, we already modify the AppRepository sync job via Kustomize, so that is handled fine (although it would be nice to have this configurable in the Helm chart instead of patching the CR); a sketch of the kind of patch we apply is below. However, we are unable to do the same patch for the cleanup job.
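For reference, roughly the kind of Kustomize patch we apply (file names are illustrative; the patch mirrors the syncJobPodTemplate shown above):

# kustomization.yaml (illustrative sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - kubeapps-manifests.yaml   # hypothetical file holding the rendered chart output
patches:
  - patch: |-
      apiVersion: kubeapps.com/v1alpha1
      kind: AppRepository
      metadata:
        name: test-apprepo
        namespace: test-apprepo
      spec:
        syncJobPodTemplate:
          spec:
            securityContext:
              runAsUser: 10001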

Asking because I can see in the code that we do intend to use that as the basis for the pod template used by the cron job that's created:

Did you mean that the cleanup and sync jobs both use the same podTemplate? Or does this only affect the sync job?

@absoludity
Contributor

Yes: the chart tries to provide simple options that the user can use for specific tasks, and it does not currently allow specifying a general syncPodTemplate. But as you've seen, you can specify that yourself by writing the AppRepository in full (you don't need to patch the CR; most people simply don't use the chart to define the CR in this case, but rather write the CR themselves, or generate it with kustomize, for their AppRepository).

And you are absolutely correct: the specified syncPodTemplate on the AppRepository CR is used successfully when generating the Job for the syncs, but is not currently used for generating the Job for the clean-ups. That is an easy fix that we should include :)

@RGPosadas
Author

Great, thanks for confirming!

@ppbaena added the kind/bug (An issue that reports a defect in an existing feature) label and removed the awaiting-more-evidence (Need more info to actually get it done.) label on Aug 3, 2023
@ppbaena added this to the Technical debt milestone on Aug 3, 2023
@antgamdia added the component/apprepository (Issue related to kubeapps apprepository) label on Aug 9, 2023
@antgamdia self-assigned this on Aug 10, 2023
@antgamdia
Contributor

That is an easy fix that we should include :)

Well... it turned out not to be that simple :P; the details of the affected AppRepo are no longer available in the cluster (which is why a clean-up job exists in the first place). I have proposed an alternative approach in #6605. See what you all think!

antgamdia added a commit that referenced this issue Aug 21, 2023
(#6646)

### Description of the change

While working on #6605, I noticed I had a bunch of unrelated changes.
Instead of overloading the PR, I have extracted them into this one. The
main changes are:

- Apply the same fix in go Dockerfiles as we did for kubeapps-apis
(avoid downloading linters if `lint=false`). It helps reduce the build
time locally.
- Remove some duplicated keys in the yamls, as we already discussed.
- Add some missing apprepo-controller args in the tests, mainly just for
completeness.
- Fix some tests in the apprepo-controller. Mainly just swapping some
misplaced `get, want`.
- Handle cronjob v1beta1 vs v1 in a missing place.
- Pass the `podSecurityContext` and `containerSecurityContext` in the
pod template properly.
- Update a missing copyright header.
- Fix wrong values.yaml metadata (mainly `ocicatalog/ociCatalog`)

### Benefits

Get the aforementioned issues solved.

### Possible drawbacks

N/A (not adding new logic here)

### Applicable issues

- (partially) related to #6545

### Additional information

N/A

---------

Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
antgamdia added a commit that referenced this issue Aug 22, 2023
### Description of the change

Even though the sync jobs could be given a security context (by means of each AppRepo CR), this information was not available to the cleanup jobs. This is mainly because those jobs are spun up once a NotFound error is thrown when fetching an AppRepo.
However, Kubernetes does have a native approach for dealing with these scenarios: finalizers.

In #6605 we proposed a
simplistic workaround based on adding more params to the controller...
but as suggested in
#6605 (comment),
moving to finalizers is a better long-term solution.


### Benefits

Cleanup jobs are now handled within an existing AppRepo context, meaning the whole syncJobTemplate is available to them (i.e., the securityContext and so on).
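For reference, this is roughly how the finalizer shows up on an AppRepository once the controller registers it; the finalizer name here is an assumption for illustration, not necessarily the exact string used by the controller:

apiVersion: kubeapps.com/v1alpha1
kind: AppRepository
metadata:
  name: test-apprepo
  namespace: test-apprepo
  finalizers:
    # Assumed finalizer name (illustrative only); the controller removes it
    # once the clean-up Job, built from syncJobPodTemplate, has been created.
    - apprepositories.kubeapps.com/finalizer
spec:
  syncJobPodTemplate:
    spec:
      securityContext:
        runAsUser: 10001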

### Possible drawbacks

When dealing with finalizers in the past, I have often found it really annoying when they get stuck and prevent the resource from being deleted. I wonder if we should add some info to the FAQ on how to manually remove the finalizers.

Additionally, and this might be important: for the AppRepo controller to be able to `update` AppRepos in namespaces other than the kubeapps one (to add the finalizer), it now needs extra RBAC. Before, we were just granting `...-appprepos-read`, but now we would need to grant `...-write` as well, and I'm not sure we really want to do so.
WDYT, @absoludity?
Another idea is using an admission policy... but again, I'm not sure we want to deal with that...

~(I haven't modified the RBAC yet in this PR)~ The RBAC changes have finally been included.
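As a rough sketch of the extra permissions being discussed (an illustrative Role, not the chart's actual RBAC template; the name and namespace are placeholders):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubeapps-apprepositories-write   # placeholder name
  namespace: test-apprepo                # namespace where the AppRepository lives
rules:
  - apiGroups: ["kubeapps.com"]
    resources: ["apprepositories"]
    # update/patch are what the controller needs in order to add and later remove the finalizer
    verbs: ["get", "list", "watch", "update", "patch"]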

### Applicable issues

- fixes #6545

### Additional information

This PR is based on top of #6646, but the main change to review is 6e70910. The rest is mostly just moving code into separate files.

Also, I have been taking a look at `kubebuilder` to create a new controller based on `sigs.k8s.io/controller-runtime` rather than on the workqueues we currently have. While it is pretty easy to get started with ([see quickstart](https://book.kubebuilder.io/quick-start)), I think it adds too much complexity (using kustomize, adding RBAC proxies, Prometheus metrics, etc.).
I also quickly tried the k8s codegen scripts, but ran into some issues with my setup... though perhaps that's the best option.

IMO, at some point we should start thinking about moving towards a new
state-of-the-art k8s controller boilerplate.

---------

Signed-off-by: Antonio Gamez Diaz <agamez@vmware.com>
@antgamdia
Contributor

Hi @RGPosadas, we ended up doing some refactoring on the AppRepository controller: instead of the controller itself being responsible for creating the clean-up jobs, a finalizer now handles that. That means all the information already available in the AppRepo CR (like the pod template) can now be inherited by the clean-up jobs. In short: those jobs will have the security context set!

Feel free to use our dev chart (not Bitnami's, but the one in this repo) and the latest kubeapps/apprepository-controller container image if you want to give it a try now. In any case, the next Kubeapps release will include this change.

Thanks again for reporting the issue!

@RGPosadas
Author

@antgamdia Thanks for the quick work! Hopefully I'll have time to test it out before the official release to report back 🚀
