Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packets from unknown interface are allowed in table 0 #195

Closed
wenyingd opened this issue Dec 6, 2019 · 8 comments · Fixed by #199
Closed

Packets from unknown interface are allowed in table 0 #195

wenyingd opened this issue Dec 6, 2019 · 8 comments · Fixed by #199
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@wenyingd
Copy link
Contributor

wenyingd commented Dec 6, 2019

Describe the bug
The openflow entries in table0 are as following show. Antrea has added marks to the register according to packet's in_port. So the packet sent from Pod/gateway/tunnel should be resubmitted by flow entries 1~5, and the packets from unknown source might be handled by flow 6 or 7. It might introduce security risks because of the unknown packets.

  1. cookie=0x0, table=0, priority=200,in_port=gw0 actions=load:0x1->NXM_NX_REG0[0..15],resubmit(,10)
  2. cookie=0x0, table=0, priority=200,in_port=tun0 actions=load:0->NXM_NX_REG0[0..15],resubmit(,30)
  3. cookie=0x0, table=0, priority=190,in_port="coredns5-b17870" actions=load:0x2->NXM_NX_REG0[0..15],resubmit(,10)
  4. cookie=0x0, table=0, priority=190,in_port="coredns5-b1c57e" actions=load:0x2->NXM_NX_REG0[0..15],resubmit(,10)
  5. cookie=0x0, table=0, priority=190,in_port="apachebe-cd800d" actions=load:0x2->NXM_NX_REG0[0..15],resubmit(,10)
  6. cookie=0x0, table=0, priority=80,ip actions=resubmit(,10)
  7. cookie=0x0, table=0, priority=0 actions=NORMAL

To Reproduce
Deploy Antrea and create Pods.

Expected
Packets fronm unknown source are dropped in table 0.

Actual behavior
Accordint to flow 6 and 7, ip packets from unknown interface are resubmitted to table 10, and non-ip packets from unknown interface are forwarding normal.

Versions:
0.1/0.2

@wenyingd
Copy link
Contributor Author

wenyingd commented Dec 6, 2019

Run conformance test after change the default action as "drop", and find following cases might fail. It might be a race between Openflow realization and GARP packets. Go binding is async, it returns when the Openflow entry is sent out to the outbound stream, but not wait for the ACK from OVS. Actually, if the FlowMod message is accepted by OFSwitch without any error, there is no reply message sent back to the controller. More investigation are needed to find a valid solution.

[sig-apps] StatefulSet [k8s.io] Basic StatefulSet functionality [StatefulSetBasic] should perform rolling updates and roll backs of template modifications [Conformance]
[sig-api-machinery] AdmissionWebhook [Privileged:ClusterAdmin] should mutate custom resource with pruning [Conformance]
[sig-scheduling] SchedulerPredicates [Serial] validates resource limits of pods that are allowed to run [Conformance]
[sig-api-machinery] CustomResourceConversionWebhook [Privileged:ClusterAdmin] should be able to convert a non homogeneous list of CRs [Conformance]

@wenyingd
Copy link
Contributor Author

wenyingd commented Dec 9, 2019

If wait 50ms before sending the GARP, the failed cases could pass in Conformance. But it adds extra waiting time for K8S to know Pod is running.

@tnqn
Copy link
Member

tnqn commented Dec 11, 2019

I would suggest to start a goroutine to send several GARPs with a short interval, like 3 packets with 100ms interval. Then it wouldn't extend the time agent processes CNI ADD, and it leaves 200ms to realize the flows.

@antoninbas
Copy link
Contributor

@wenyingd does libOpenflow support flow monitors & notifications? I think longer term that's what we may want to do instead of relying on a timer. It may be useful for other things to if we have a simple framework in libOpenflow / Antrea to just register a Go channel that can get notified when a specific flow is added to a specific table. Then this Go routine here could just wait for the notification and stop sending GARPs only when the notification is received.

@wenyingd
Copy link
Contributor Author

wenyingd commented Dec 16, 2019

For a single FlowMod message, OVS didn't support to send back an ACK to the controller(ofnet/Antrea) actively. We could leverage Multipart message to check FlowStats, ofnet could decode the Multipart reply message to ensure if a flow entry is installed on the OVS.

For each Multipart request message, Antrea could define the Match and cookieIDs of the expeced Flow entry. If we want to leverage this feature to check the realization of every flow entry, for we didn't have a specific cookieID for each flow, we should also use the Match condition in the Multipart message. After OVS has sent back the multipart reply message, we also need to decode the message, and mapping the results in the flow list to the required flow entry. Do you think it satisfy the requirement @antoninbas ?

I have some quick questions about it:

  1. as one Pod is installing multiple Openflow entries, should we wait for the realization of all the Openflow entries? Does the wait take in another goroutine or in the originel namespace that configuring Pod's address? If in the same goroutine, will it delay the CNI Add reply?
  2. We can't ensure the Multipart request message is sent after OVS has realized Openflow entries. What if it doesn't realized the flow entry yet? Shall we retry with Multipart request? What is the max number of the retry?

@antoninbas
Copy link
Contributor

I am not too familiar with OF Multipart messages, but that sounds like a big change. I thought Multipart messages were to support very large messages.

I think what I was suggesting was more something like this:

  • send all FlowMod messages for the Pod
  • send a barrier message
  • in libOpenflow, wait for the corresponding barrier reply and notify Antrea by writing to a Go channel

We can also wait for bundle support:

  • send a bundle for all the Pod flows
  • wait for a OFPBCT_DISCARD_REPLY message with the correct bundle id, which indicates that the bundle has been committed (similar to a barrier message I believe), and notify Antrea by writing to a Go channel

The bundle solution is probably best but the barrier one works too. I believe that by the time we get the bundle / barrier reply, we are guaranteed that flows have been installed (so no need for a "retry" mechanism).

There is no need to block and delay the CNI Add reply if we don't want to do so. Sending the GARP will be done asynchronously in a separate goroutine. The goroutine will use select to wait on the channel mentioned above. When libOpenflow notifies us after receiving the matching barrier / bundle reply, we send one last GARP message and terminate the goroutine.

I like this solution but it's not a "must-have" at the moment. We can merge your PR and open an issue to track this (and wait for bundle support).

@wenyingd
Copy link
Contributor Author

I agree to use bundle feature if we want to make the flow installation work like a "sync" mode.

@antoninbas
Copy link
Contributor

@wenyingd it's not really synchronous from the client perspective: the call to libOpenflow won't block and we don't necessarily need to wait for the reply message before we return the CNI Add reply.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants