Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Porch private authenticated registries functionality #126

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

Catalin-Stratulat-Ericsson
Copy link
Contributor

@Catalin-Stratulat-Ericsson Catalin-Stratulat-Ericsson commented Oct 22, 2024

Tackles #637

This PR adds the functionality for porch to use private authenticated container registries for its KPT functions in the porch packages.
It does this by mounting a docker config.json file as a secret which holds the authenticated information for the private repositories used by the user in their porch packages.
This secret information is then used as the ImagePullSecret on the KPT function pods created by the function runner at render time.

A documentation PR will soon follow which will document the configuration and i will link to it in this PR
Documentation PR #178
RBAC changes must be propagated in the catalog PR #126

func/internal/podevaluator.go Outdated Show resolved Hide resolved
func/internal/podevaluator.go Show resolved Hide resolved
func/internal/podevaluator.go Outdated Show resolved Hide resolved
func/internal/podevaluator.go Outdated Show resolved Hide resolved
func/internal/podevaluator.go Outdated Show resolved Hide resolved
func/internal/podevaluator.go Show resolved Hide resolved
func/internal/podevaluator.go Outdated Show resolved Hide resolved
func/internal/podevaluator.go Outdated Show resolved Hide resolved
deployments/porch/5-rbac.yaml Outdated Show resolved Hide resolved
@Catalin-Stratulat-Ericsson
Copy link
Contributor Author

/retest presubmit-nephio-go-test

Copy link
Contributor

nephio-prow bot commented Oct 29, 2024

@Catalin-Stratulat-Ericsson: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

  • /test presubmit-nephio-go-test

Use /test all to run all jobs.

In response to this:

/retest presubmit-nephio-go-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Catalin-Stratulat-Ericsson
Copy link
Contributor Author

/test presubmit-nephio-go-test

1 similar comment
@Catalin-Stratulat-Ericsson
Copy link
Contributor Author

/test presubmit-nephio-go-test

@JamesMcDermott
Copy link
Contributor

/approve

Copy link
Contributor

nephio-prow bot commented Oct 30, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Catalin-Stratulat-Ericsson, JamesMcDermott
Once this PR has been reviewed and has the lgtm label, please assign liamfallon for approval by writing /assign @liamfallon in a comment. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@arora-sagar
Copy link

arora-sagar commented Oct 30, 2024

Just a side question: should all the functions use the same image pull secret? Probably that will be the most used scenario.

But how about adding an imagePullSecret row in the kptfile? This will provide more flexibility.

    - image: gcr.io/kpt-fn/apply-replacements:v0.1.1
      configPath: apply-replacements-namespace.yaml
      imagePullSecret: XXXXX

@Catalin-Stratulat-Ericsson
Copy link
Contributor Author

Catalin-Stratulat-Ericsson commented Oct 31, 2024

Just a side question: should all the functions use the same image pull secret? Probably that will be the most used scenario.

But how about adding an imagePullSecret row in the kptfile? This will provide more flexibility.

    - image: gcr.io/kpt-fn/apply-replacements:v0.1.1
      configPath: apply-replacements-namespace.yaml
      imagePullSecret: XXXXX
  1. you are correct this would be a more flexible way however it comes with a major drawback and that is backwards compatibility.
  2. if we implement this new section in the kpt file we must modify the kpt file schema which has not been altered since we took it into porch.
  3. For example if currently whatever we add into the porch pkg kpt file will be rendered just fine by kpt commands e.g. kpt live init, kpt render etc, if we make fundamental changes to the porch kpt file schema for example adding a new variable "imagePullSecret" to the schema, attempting to do the kpt render will fail due to a new unknown and unaccounted for variable whilst our porch render which occurs when we push the pkg will work and render just fine.
  4. This was a problem encountered by myself and Istvan before in a previous ticket about perhaps adding the render status in the kpt file which caused the exact same error and we had to go back to the drawing board for this same reason.

@arora-sagar
Copy link

Just a side question: should all the functions use the same image pull secret? Probably that will be the most used scenario.
But how about adding an imagePullSecret row in the kptfile? This will provide more flexibility.

    - image: gcr.io/kpt-fn/apply-replacements:v0.1.1
      configPath: apply-replacements-namespace.yaml
      imagePullSecret: XXXXX
  1. you are correct this would be a more flexible way however it comes with a major drawback and that is backwards compatibility.
  2. if we implement this new section in the kpt file we must modify the kpt file schema which has not been altered since we took it into porch.
  3. For example if currently whatever we add into the porch pkg kpt file will be rendered just fine by kpt commands e.g. kpt live init, kpt render etc, if we make fundamental changes to the porch kpt file schema for example adding a new variable "imagePullSecret" to the schema, attempting to do the kpt render will fail due to a new unknown and unaccounted for variable whilst our porch render which occurs when we push the pkg will work and render just fine.
  4. This was a problem encountered by myself and Istvan before in a previous ticket about perhaps adding the render status in the kpt file which caused the exact same error and we had to go back to the drawing board for this same reason.

I understand the points thanks for explaining. In this case making changes in KPT file will result in making quite big changes in all the files.

@kispaljr
Copy link
Collaborator

kispaljr commented Nov 4, 2024

First of all thanks for this much needed feature!
A quick clarification question: do I understand correctly that the user is asked to create a secret of type kubernetes.io/dockerconfigjson that is mounted to the function-runner pods, then the function runner creates an almost identical secret (as far as I can see the only difference is the name of the secret) that is in turn specified as one of the ImagePullSecrets of the KRM function pods?
If my understanding is correct, why isn't the user simply asked to create the secret with a fixed name, or why isn't the name of the user-created secret(s) is/are passed to the function-runner (instead of mounting it)? Then we could append the user-created secret(s) directly to the ImagePullSecrets of the KRM function pods.

@Catalin-Stratulat-Ericsson
Copy link
Contributor Author

Catalin-Stratulat-Ericsson commented Nov 4, 2024

First of all thanks for this much needed feature! A quick clarification question: do I understand correctly that the user is asked to create a secret of type kubernetes.io/dockerconfigjson that is mounted to the function-runner pods, then the function runner creates an almost identical secret (as far as I can see the only difference is the name of the secret) that is in turn specified as one of the ImagePullSecrets of the KRM function pods? If my understanding is correct, why isn't the user simply asked to create the secret with a fixed name, or why isn't the name of the user-created secret(s) is/are passed to the function-runner (instead of mounting it)? Then we could append the user-created secret(s) directly to the ImagePullSecrets of the KRM function pods.

You are correct Istvan.
Ill try and explain it in sections.

  1. firstly the easy one to explain the reason for the naming differences was mainly uncertainties (as you can tell by the comment above the secret name variable in the code). in my eyes the easiest would be to have another argument passed to the fn-runner stating the name of the secret so that could be carried on.
  2. The reason a user is not asked for a fixed secret name is simply that its impractical. we shouldn't use hard coded naming for resources that porch uses etc in the code never mind in the cluster in my opinion or at least try not too even if it makes things a bit more difficult to work around.
  3. The reason the name is not given but mounted needs a deeper explanation. The secret information is needed in 2 places. firstly inside the code on the fn-runner such that authn which runs inside porch can authenticate for image digestion & entrypoint, secondly it must be present on the porch-fn-system namespace to mount it as an imagePullSecrets to the pod which requires it for use.
  4. The simplest solution i saw when doing this was to simply ask the user to create 2 identical secrets on the 2 namespaces. but that seemed like a weird ask if i was a user of this. so i ended up going for a "give me one ill replicate it" approach. however i didnt add a variable which one could pass the secret name with because i didnt want to add too much argument bloat if i could help it.

@kispaljr
Copy link
Collaborator

kispaljr commented Nov 5, 2024

Thank you for the detailed clarification. I haven't realized before that we need the same secret in two different namespaces.

That properly explains the copying in the code.

I think this also explains what my follow-up question would have been, namely why aren't we just adding the pull secretes to the appropriate service accounts (as explained here). But now I understand that wouldn't solve the authentication problem when the function-runner itself pulls the container in order to inspect its metadata. Can you confirm that this is true?

/lgtm

deployments/porch/5-rbac.yaml Show resolved Hide resolved
@@ -57,6 +58,9 @@ const (
fieldManagerName = "krm-function-runner"
functionContainerName = "function"
defaultManagerNamespace = "porch-system"
defaultRegistry = "gcr.io/kpt-fn/"
// perhaps should try and get the name of the dockerconfig secret given by user and match this secret name to that to avoid hard coded value?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no this will be removed with a new commit.

@@ -532,6 +650,7 @@ func (pm *podManager) retrieveOrCreatePod(ctx context.Context, image string, ttl
// TODO: It's possible to set up a Watch in the fn runner namespace, and always try to maintain a up-to-date local cache.
podList := &corev1.PodList{}
podTemplate, templateVersion, err := pm.getBasePodTemplate(ctx)
pm.appendImagePullSecret(image, registryAuthSecretPath, podTemplate)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this happen after the err check below?

@@ -41,6 +41,7 @@ var (
port = flag.Int("port", 9445, "The server port")
functions = flag.String("functions", "./functions", "Path to cached functions.")
config = flag.String("config", "./config.yaml", "Path to the config file.")
registryAuthSecretPath = flag.String("registry-auth-secret-path", "", "Path to means of authentication for using images from custom registries e.g. docker config file")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we provide a default path here?

@Catalin-Stratulat-Ericsson
Copy link
Contributor Author

Catalin-Stratulat-Ericsson commented Nov 5, 2024

Thank you for the detailed clarification. I haven't realized before that we need the same secret in two different namespaces.

That properly explains the copying in the code.

I think this also explains what my follow-up question would have been, namely why aren't we just adding the pull secretes to the appropriate service accounts (as explained here). But now I understand that wouldn't solve the authentication problem when the function-runner itself pulls the container in order to inspect its metadata. Can you confirm that this is true?

/lgtm

  • i believe it is true Istvan particularly in this (function)[https://github.com/nephio-project/porch/blob/419b898c7a94ac3404c52c0f0ca67a8074aa3a76/func/internal/podevaluator.go#L539].
  • The comment describing it even details that it inspects the metadata to store in the cache at first use of that image
  • I will also be including a commit soon which will fix the hard coded nature of that copied secret's name to follow what the user given secrets name is using an environment variable. i will update the documentation accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants