Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vault Injector NetworkPolicy requires additional from rules in OCP 4.13 #953

Closed
komish opened this issue Sep 7, 2023 · 6 comments
Closed
Labels
bug Something isn't working injector Area: mutating webhook service openshift

Comments

@komish
Copy link

komish commented Sep 7, 2023

This is a copy of the issue created openshift-helm-charts/charts#934. That issue will be closed in favor of this one.

Describe the bug

We had issues in our testing with vault's MutatingWebhookConfiguration not being accessible by kube-apiserver in OCP 4.13. Per my testing, this is just because the NetworkPolicy that's created needs an extra match criteria.

The policy looks like this in your latest chart:
https://github.com/openshift-helm-charts/charts/blob/main/charts/partners/hashicorp/vault/0.24.0/src/templates/injector-network-policy.yaml

The from items needs to include an additional entry allowing host network workloads to reach it.

...
spec:
  podSelector:
    ... removed for brevity ...
  ingress:
    - from:
        - namespaceSelector: {}
        - namespaceSelector:      # add this and the following lines
            matchLabels:
              policy-group.network.openshift.io/host-network: ""
...

This is documented here:
https://docs.openshift.com/container-platform/4.13/networking/network_policy/about-network-policy.html#nw-networkpolicy-allow-from-hostnetwork_about-network-policy

We discovered it when chart-testing would run. The webhook would be live and the chart-test would immediately fail because the pod it creates would fail at admission (due to the webhook being inaccessible from the kube-apiserver).

To Reproduce
Steps to reproduce the behavior:

  1. Run chart-testing against the existing chart on an OCP 4.13 cluster.

Expected behavior
chart-testing would pass

Environment
OCP 4.13

Additional context
Add any other context about the problem here.

@komish
Copy link
Author

komish commented Sep 7, 2023

One additional note, @tvoran currently has a PR to certify a new version that appears to be failing because of this issue.

openshift-helm-charts/charts#1036

@tvoran tvoran added openshift injector Area: mutating webhook service labels Sep 14, 2023
@tvoran
Copy link
Member

tvoran commented Sep 14, 2023

Hi @komish, thanks for looking into this! So far I'm unable to reproduce this on an OpenShift 4.13 cluster (using openshift-installer on GCP) with chart-verifier or a manual injector example, so it would be great to get more details on the setup that's failing.

It looks to me like our existing NetworkPolicy should allow incoming traffic from all namespaces (like the example in the these docs)? So I'm a little puzzled why the extra rule is required.

@komish
Copy link
Author

komish commented Sep 18, 2023

Interesting @tvoran. I'm running an openshift-installer provisioned cluster on AWS on 4.13, which is where the issue presented. Let me see if I can identify any cluster-specific features, and I'll report back.

@komish
Copy link
Author

komish commented Sep 18, 2023

Just a quick note that I was able to replicate a successful chart-test execution without the host-network related NetworkPolicy addition on 4.13/GCP.

I didn't observe any CNI specific differences in the cluster I provisioned vs. the AWS cluster I used for testing. I also confirmed 4.13/AWS still has the problem.

I have open questions internally that I'm pursuing to get insights on if there are any logical platform differences at play here, and will report back when I have more on it.

I did get explicit confirmation that this is how you allow access from the APIServer to the admission webhook. The only caveat would be colocated webhook workloads with the requesting source (APIServer), in which case the network policy may not apply, but I don't think that's the case you are I encounter in our testing.

More to come as I have it.

@komish
Copy link
Author

komish commented Sep 19, 2023

My apologies.

It seems like there's some sort of bug that exists in 4.13.4 that must have been resolved in newer Z-stream version of OCP. Testing against 4.13.12, I saw successful installations on both GCP and AWS, which prompted me to go try GCP on 4.13.4 (matches our certification cluster). I confirmed things failed there. I haven't identified an exact bug that's in scope here, but I'll peruse the bug reports at a later time.

We'll roll out updated infrastructure, test the chart to make sure it installs with this exact network policy, and then if all is well, I'll go ahead and close this out. I'll also follow up with your PR to certify the next version of your chart.

Thanks for taking a look.

@komish
Copy link
Author

komish commented Sep 20, 2023

I rolled out the newer infrastructure and your PR cleared up openshift-helm-charts/charts#1036. Thanks for checking this out, and apologies for the confusion.

@komish komish closed this as completed Sep 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working injector Area: mutating webhook service openshift
Projects
None yet
Development

No branches or pull requests

2 participants