linkerd-cni does not chain correctly with OCI/OKE flannel cni #10413
Comments
Hi @blabu23, do you happen to know if your distribution configured flannel in a different directory than the default? You might need to tell Can you confirm this is not the case with your distribution? You can either ssh onto one of your hosts or run a pod that attaches a
nope, afaik, everything is where it belongs:
The IP address is passed into linkerd-cni via the |
AFAIK this is the cluster internal service address of the kubernetes api... The Pods are in the network 10.244.0.0/16 |
@blabu23 does your pod have IP connectivity otherwise? |
Is the pod already alive when the FailedCreatePodSandBox error is thrown? |
@stevej - 10.96.0.1 is in the default Kubernetes services subnet used by an out-of-the-box OKE cluster running with the flannel network overlay mode. It's not in a pod range. |
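For anyone following along, the pod-range vs. service-range distinction can be checked with a few lines of Python. This is only a sketch: the pod CIDR 10.244.0.0/16 is the one quoted earlier in this thread, while the service CIDR 10.96.0.0/16 is an assumption (nobody here stated the exact range), and `classify` is a hypothetical helper name.

```python
import ipaddress

# Pod network as quoted earlier in this thread.
POD_CIDR = ipaddress.ip_network("10.244.0.0/16")
# ASSUMPTION: the exact service CIDR is not stated in the thread.
SERVICE_CIDR = ipaddress.ip_network("10.96.0.0/16")


def classify(ip: str) -> str:
    """Return which of the two ranges an address falls into, if any."""
    addr = ipaddress.ip_address(ip)
    if addr in POD_CIDR:
        return "pod"
    if addr in SERVICE_CIDR:
        return "service"
    return "other"


# The address from the FailedCreatePodSandBox error message.
print(classify("10.96.0.1"))
```

Under those assumed ranges, the address in the error lands in the service range, which matches the point that it is a cluster-internal service address and not a pod IP.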
Thanks for that. There's clearly some assumption we're making in linkerd-cni that isn't satisfied by OKE. We suspect it's KUBERNETES_SERVICE_HOST as used in the installer script https://github.com/linkerd/linkerd2-proxy-init/blob/main/cni-plugin/deployment/scripts/install-cni.sh#L186, but we don't have any contacts at Oracle or any credits to track this down. |
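To make the suspicion concrete, here is a rough Python sketch of how an API server URL of the shape seen in the error (`https://[10.96.0.1]:443`) could be assembled from the standard in-cluster environment variables. This is not the installer script's actual code: `api_server_url` is a hypothetical helper, and the bracketed host form is simply copied from the error message in this issue.

```python
import os


def api_server_url(env=os.environ) -> str:
    """Build an API server URL from the in-cluster service env vars.

    Hypothetical helper for illustration; the bracketed host mirrors
    the URL shape shown in the error message of this issue.
    """
    host = env["KUBERNETES_SERVICE_HOST"]
    # ASSUMPTION: fall back to 443 when the port variable is missing.
    port = env.get("KUBERNETES_SERVICE_PORT", "443")
    return f"https://[{host}]:{port}"


example_env = {
    "KUBERNETES_SERVICE_HOST": "10.96.0.1",
    "KUBERNETES_SERVICE_PORT": "443",
}
print(api_server_url(example_env))
```

If the value baked in at install time is not reachable from wherever the plugin later runs, every pod lookup would fail in the way reported here.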
Do you have a specific issue you're trying to locate @stevej or specific inputs you need to guide this? I posted #10531 that has more logging information when I was trying to use the CNI. I switched back to flannel as I was on a deadline, but I can probably task an engineer to offside for this issue, or if there's a semi-commitment to actually look at this from the Buoyant end, I'm happy to pay the cost of running an OKE cluster for you to work against to test the issue for a while. |
Any news on this topic? Is there anything I can do to help? |
@blabu23 sorry, we haven't really made any progress here. We've put this on the back burner for now. It's a bit of a weird problem: it sounds like some networking assumptions we have made do not hold in Oracle environments. If you want to investigate and come up with a solution here, I'm happy to assist you with pointers on how our CNI plugin works. I'd probably start by checking the connection to the API server, as that seems to be the culprit here. The CNI plugin seems to just error out when retrieving pods. |
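As a possible starting point for that connectivity check, something like the following could be run on an affected worker node. It is a minimal sketch that only tests whether a TCP connection can be opened; it says nothing about TLS or authentication. `can_connect` is a hypothetical helper, and the environment variables are only set inside pods, so on a node the address would have to be passed explicitly.

```python
import os
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    host = os.environ.get("KUBERNETES_SERVICE_HOST")
    port = int(os.environ.get("KUBERNETES_SERVICE_PORT", "443"))
    if host is None:
        print("KUBERNETES_SERVICE_HOST is not set; pass the address explicitly")
    else:
        print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

For example, `can_connect("10.96.0.1", 443)` would probe the address from the error message. Comparing the result on the node itself with the result from inside a running pod might show where the path to the API server breaks.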
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions. |
What is the issue?
In our Oracle OCI/OKE Kubernetes environment we need to use linkerd-cni because we are required to use PodSecurity.
By default, Oracle installs the flannel CNI right after provisioning the control plane.
I then installed linkerd-cni with Helm, and after that tried to install cert-manager.
How can it be reproduced?
Set up an Oracle OKE cluster with a node pool and a number of worker nodes
Install linkerd-cni with Helm
Install cert-manager with Helm
Logs, error output, etc
Warning FailedCreatePodSandBox 15m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_cert-manager-59bf757d77-kszm7_cert-manager_bf83dc3e-020c-42bd-8177-591fad2e8f3e_0(1f8fc4890a98b7d553c6ac5b335a8f58a41fbf9b31cf4d5e8e2460517d3c540b): error adding pod cert-manager_cert-manager-59bf757d77-kszm7 to CNI network "cbr0": plugin type="linkerd-cni" name="linkerd-cni" failed (add): Get "https://[10.96.0.1]:443/api/v1/namespaces/cert-manager/pods/cert-manager-59bf757d77-kszm7": cannotconnect
output of
linkerd check -o short
no linkerd installed yet... so
Environment
Possible solution
I really wish I had one...
Additional context
No response
Would you like to work on fixing this bug?
maybe