portmap plugin's iptables rules intercept kubernetes service traffic #222

Closed
danwinship opened this issue Oct 25, 2018 · 9 comments

@danwinship
Contributor

The portmap code assumes that --dst-type LOCAL will only match traffic addressed to the node's own IP addresses. However, at least under some configurations, it also matches traffic to addresses in the Kubernetes service range. As a result, if you have, e.g., a pod that claims hostport 443, it might end up receiving traffic that pods on the same node sent to 172.30.0.1:443.

As seen in kubernetes/kubernetes#66103. It's not clear if this depends in some way on how the network plugin sets up routing to the service network, or if it happens all the time for any plugin that uses the portmap plugin.

@squeed
Member

squeed commented Oct 26, 2018

--dst-type LOCAL exactly translates to "is there an entry in the local routing table." 99% of the time, this means there's an interface with that address assigned to it. However, nothing stops you from adding more entries to the local table. I know that the crazy hackers at GCP do it for their cluster IPs.
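To make the distinction concrete, here is a minimal sketch that models the local routing table as a list of prefixes and treats an address as LOCAL when it falls inside any of them. The function name, the node address 10.0.0.5, and the 172.30.0.0/16 service range are assumptions chosen for illustration (the range is guessed from the 172.30.0.1 address mentioned in the report); this is a toy model, not the kernel's lookup code.

```python
import ipaddress


def matches_dst_type_local(dst, local_table):
    """Return True if dst falls inside any prefix of the modelled local table.

    This mirrors the "is there an entry in the local routing table" reading
    of --dst-type LOCAL; it does not check whether an interface owns dst.
    """
    addr = ipaddress.ip_address(dst)
    return any(addr in ipaddress.ip_network(prefix) for prefix in local_table)


# Hypothetical node: loopback plus a single interface address.
interface_only = ["127.0.0.0/8", "10.0.0.5/32"]

# The same node after something adds the (assumed) service range
# to the local table.
with_service_range = interface_only + ["172.30.0.0/16"]

print(matches_dst_type_local("10.0.0.5", interface_only))        # node IP
print(matches_dst_type_local("172.30.0.1", interface_only))      # service IP
print(matches_dst_type_local("172.30.0.1", with_service_range))  # service IP
```

In this model the service address only matches once its range has been added to the local table, which is the situation in which a hostport rule keyed on --dst-type LOCAL would also catch service traffic.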

The challenge is to figure out an iptables condition that matches only the exact traffic we want. That's why I added the conditionsV4 parameter, so the admin could tweak exactly what traffic gets mapped. If anyone comes up with a safe condition to add to the default, we can definitely make that change. So far, we haven't found one that works.
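As an illustration of the conditionsV4 parameter, the sketch below builds a portmap plugin configuration fragment that adds an extra iptables match excluding a service range from port mapping. The 172.30.0.0/16 CIDR is an assumption (taken from the 172.30.0.1 address in the report) and must be replaced with the cluster's real service range. Whether such an exclusion is safe as a default is exactly the open question in this thread, so treat it as a per-cluster workaround rather than a fix.

```python
import json

# Assumption: the cluster's service CIDR. Adjust for the real cluster.
SERVICE_CIDR = "172.30.0.0/16"

# Fragment of a CNI network configuration for the portmap plugin.
# conditionsV4 is a list of iptables match arguments added to the
# port-mapping rule; here it excludes destinations in the service range.
portmap_conf = {
    "type": "portmap",
    "capabilities": {"portMappings": True},
    "conditionsV4": ["!", "-d", SERVICE_CIDR],
}

print(json.dumps(portmap_conf, indent=2))
```

The printed JSON object would go in the plugins list of the node's CNI network configuration, alongside the main network plugin.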

@danwinship
Contributor Author

> --dst-type LOCAL exactly translates to "is there an entry in the local routing table."

Hm... it doesn't seem like the service CIDR route should be local... (OpenShift SDN's route-to-serviceCIDR is unicast.) So maybe this is a Calico bug?

> I know that the crazy hackers at GCP do it for their cluster IPs.

I guess that depends on whether you consider those to be alternative local IPs or ExternalIPs...

> The challenge is to figure out an iptables condition that matches only the exact traffic we want.

I think what you want is "is addressed to the IP address of any network interface on the host", but if --dst-type LOCAL doesn't do that, I'm not sure what would.

@dghubble

> maybe this is a Calico bug

In my testing, I noticed that after enabling kube-proxy IPVS mode, a pod with a hostport (i.e. portmap) running on a node prevents service IP access (from both the host node and any pods on it). I've reproduced this with both Calico and Flannel (both of which use portmap) and on different clouds, so my inclination is that the problem is unlikely to be in the CNI provider itself.
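One way to check this on an affected node is to look at which devices contribute entries to the local routing table, as printed by `ip route show table local`. As background that is not stated in this thread: kube-proxy's IPVS mode binds service IPs to a dummy interface (named kube-ipvs0 by default), which would put them in the local table; whether that is the cause of this report is unconfirmed. The sketch below parses such output and groups the local entries by device. The sample text and the helper name are made up for illustration, not captured from a real node.

```python
from collections import defaultdict

# Illustrative sample in the format of `ip route show table local`.
# Addresses are hypothetical.
SAMPLE = """\
local 10.0.0.5 dev eth0 proto kernel scope host src 10.0.0.5
broadcast 10.0.0.255 dev eth0 proto kernel scope link src 10.0.0.5
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
local 172.30.0.1 dev kube-ipvs0 proto kernel scope host src 172.30.0.1
"""


def local_entries_by_device(route_output):
    """Group the 'local' routes by the device they are attached to."""
    by_dev = defaultdict(list)
    for line in route_output.splitlines():
        fields = line.split()
        # Only 'local' entries matter here; skip broadcast and other types.
        if len(fields) < 2 or fields[0] != "local" or "dev" not in fields:
            continue
        dev_index = fields.index("dev") + 1
        if dev_index < len(fields):
            by_dev[fields[dev_index]].append(fields[1])
    return dict(by_dev)


for device, prefixes in local_entries_by_device(SAMPLE).items():
    print(device, prefixes)
```

On a real node, the same function could be fed the output of `subprocess.run(["ip", "route", "show", "table", "local"], capture_output=True, text=True).stdout`. Service IPs listed under a device such as kube-ipvs0 would be consistent with them matching --dst-type LOCAL.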

@squeed
Member

squeed commented Oct 29, 2018

Aha, IPVS - that's something I hadn't considered. I wonder if we're racing on rules.

I suspect we'll either need to write a separate k8s-portmap plugin that is more opinionated, so we can hook in the right place, or just add some sort of k8s mode to the existing plugin.

@danwinship
Contributor Author

Well, but then you're just potentially broken on every other platform that uses CNI besides kubernetes...

Ideally, the portmap plugin (and each other iptables user) would intercept only the packets it actually wanted, and then ordering wouldn't matter.

Alternatively, maybe something like my KUBE-BEFORE-POSTROUTING, etc, suggestion on the sig-net list should be implemented at the CNI level rather than the k8s level; the actual chains would be more or less the same, but it would be a requirement of CNI that the environment set them up for plugins to use, rather than being a k8s-specific thing.

@dannyk81

@danwinship any word if there was any progress with this issue? 🙏

@danwinship
Contributor Author

No. (I'm not working on this, I was just reporting the bug after having reviewed a kubernetes PR that was trying to fix the problem in the wrong way.)

Presumably if there was progress there would be updates here or in the linked kube bug, so you can subscribe to those.

@liqlin2015

@squeed Do you know if there is any plan to fix this issue?

@dannyk81

ping @squeed 😄
