Skip to content

Commit

Permalink
Fix EIP inconsistency and ESVC inconsistency
Browse files Browse the repository at this point in the history
This PR fixes two issues:

EgressIP inconsistency:

If a pod is on an egressNode and it tries to reach
another node in the cluster OR it tries to reach
a service backed by a host-net backend which is another
node in the cluster, then packet is SNAT-ed to egressIP.
If a pod is on a non-egressNode packet is SNAT-ed to
nodeIP.

Desired solution: in-cluster packets shouldn't be SNAT-ed
to egressIP.

EgressSVC inconsistency:

If a backendpod is on an egressSVC node and it tries to reach
another node in the cluster OR it tries to reach
a service backed by a host-net backend which is another
node in the cluster, then packet is SNAT-ed to svcVIP of LB svc.
If a pod is on a non-egressNode packet is SNAT-ed to
nodeIP.

This totally breaks the traffic flow for pod2othernode because
if the traffic is SNAT-ed to svcVIP, then reply traffic goes to
serviceVIP which could end up in any backend and conntrack will
not be able to match it to the existing onward entry.

Desired solution: in-cluster packets shouldn't be SNAT-ed
to egressSVC.

Design: Change the default allow 102 priority policies on
the ovn_cluster_router to look like this:

102 (ip4.src == $a12749576804119081385 || ip4.src == $a16335301576733828072) && ip4.dst == $a11079093880111560446           allow               pkt_mark=1008

Here first address-set will contain IPs of all egressIPPods.
Second address-set will contain IPs of all egressSVCPods.
Destination = all the nodeIPs in the cluster.
In addition to simply "allow"-ing so that reroute policies
are not matched upon, we also mark these packets with the 1008 mark.

To ensure packets going out for egressIP (SGW MODE topology for both
modes) are SNAT-ed to nodeIP, we add a flow on br-ex:

 cookie=0xdeff105, duration=759.429s, table=0, n_packets=0, n_bytes=0, idle_age=759, priority=105,pkt_mark=0x3f0,ip,in_port=2 actions=ct(commit,zone=64000,nat(src=172.19.0.2),exec(load:0x1->NXM_NX_CT_MARK[])),output:1

This will ensure that even if the SNAT towards egressIP is done
we do another SNAT on top.

To ensure packets going out for egressSVC (LGW mode topology for
both modes) are SNAT-ed to nodeIP, we add a new iptable rule on
OVN-KUBE-EGRESS-SVC chain:

[1:64] -A OVN-KUBE-EGRESS-SVC -m mark --mark 0x3f0 -j RETURN

This ensures the other SNAT rules are skipped.

NOTE that 0x3f0 = 1008.
NOTE2: EIP SNAT issue is only in sgw while ESVC
SNAT issue is only in lgw the vice versa combo works due to
the way egress traffic topology is.

Does this seem like a design change/feature for a bugfix? YES
Bug is complicated so unless we ask for a conditionalSNAT
change in OVN which the last time we spoke about was deemed
unnecessary for this small corner case, we need this fix.

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
  • Loading branch information
tssurya committed Jan 27, 2023
1 parent a37c4f9 commit e2c981a
Show file tree
Hide file tree
Showing 22 changed files with 710 additions and 194 deletions.
4 changes: 2 additions & 2 deletions go-controller/pkg/node/gateway_init_linux_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -328,7 +328,7 @@ func shareGatewayInterfaceTest(app *cli.App, testNS ns.NetNS,
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1037,7 +1037,7 @@ OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0`,
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {
"FORWARD": []string{
Expand Down
13 changes: 13 additions & 0 deletions go-controller/pkg/node/gateway_iptables.go
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,7 @@ func getGatewayInitRules(chain string, proto iptables.Protocol) []iptRule {
args: []string{"-j", chain},
protocol: proto,
},
egressSVCIPTDefaultReturnRule(),
}
}
if chain == iptableITPChain {
Expand Down Expand Up @@ -578,3 +579,15 @@ func egressSVCIPTRulesForEndpoints(svc *kapi.Service, v4Eps, v6Eps []string) []i

return rules
}

func egressSVCIPTDefaultReturnRule() iptRule {
return iptRule{
table: "nat",
chain: iptableESVCChain,
args: []string{
"-m", "mark", "--mark", string(ovnKubeNodeSNATMark),
"-m", "comment", "--comment", "Do not SNAT to SVC VIP",
"-j", "RETURN",
},
}
}
59 changes: 30 additions & 29 deletions go-controller/pkg/node/gateway_localnet_linux_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -352,7 +352,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -429,7 +429,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -508,7 +508,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -601,7 +601,7 @@ var _ = Describe("Node Operations", func() {
fmt.Sprintf("-p %s -m addrtype --dst-type LOCAL --dport %v -j DNAT --to-destination %s:%v", service.Spec.Ports[0].Protocol, service.Spec.Ports[0].NodePort, types.V4HostETPLocalMasqueradeIP, service.Spec.Ports[0].NodePort),
},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -697,7 +697,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -801,7 +801,7 @@ var _ = Describe("Node Operations", func() {
fmt.Sprintf("-p %s -m addrtype --dst-type LOCAL --dport %v -j DNAT --to-destination %s:%v", service.Spec.Ports[0].Protocol, service.Spec.Ports[0].NodePort, types.V4HostETPLocalMasqueradeIP, service.Spec.Ports[0].NodePort),
},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -936,7 +936,7 @@ var _ = Describe("Node Operations", func() {
fmt.Sprintf("-p %s -d %s --dport %v -j DNAT --to-destination %s:%d -m statistic --mode random --probability 1.0000000000", service.Spec.Ports[0].Protocol, externalIP, service.Spec.Ports[0].Port, ep2.Addresses[0], int32(service.Spec.Ports[0].TargetPort.IntValue())),
},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1040,7 +1040,7 @@ var _ = Describe("Node Operations", func() {
},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1158,7 +1158,7 @@ var _ = Describe("Node Operations", func() {
fmt.Sprintf("-p %s -m addrtype --dst-type LOCAL --dport %v -j DNAT --to-destination %s:%v", service.Spec.Ports[0].Protocol, service.Spec.Ports[0].NodePort, types.V4HostETPLocalMasqueradeIP, service.Spec.Ports[0].NodePort),
},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1262,7 +1262,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1359,7 +1359,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1451,7 +1451,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1532,7 +1532,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1620,7 +1620,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1659,7 +1659,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1742,7 +1742,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-NODEPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1848,7 +1848,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1887,7 +1887,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -1981,7 +1981,7 @@ var _ = Describe("Node Operations", func() {
fmt.Sprintf("-p %s -m addrtype --dst-type LOCAL --dport %v -j DNAT --to-destination %s:%v", service.Spec.Ports[0].Protocol, service.Spec.Ports[0].NodePort, types.V4HostETPLocalMasqueradeIP, service.Spec.Ports[0].NodePort),
},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2022,7 +2022,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2120,7 +2120,7 @@ var _ = Describe("Node Operations", func() {
fmt.Sprintf("-p %s -m addrtype --dst-type LOCAL --dport %v -j DNAT --to-destination %s:%v", service.Spec.Ports[0].Protocol, service.Spec.Ports[0].NodePort, types.V4HostETPLocalMasqueradeIP, service.Spec.Ports[0].NodePort),
},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2166,7 +2166,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2265,7 +2265,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2312,7 +2312,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2409,7 +2409,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-ETP": []string{
fmt.Sprintf("-p %s -m addrtype --dst-type LOCAL --dport %v -j DNAT --to-destination %s:%v", service.Spec.Ports[0].Protocol, service.Spec.Ports[0].NodePort, types.V4HostETPLocalMasqueradeIP, service.Spec.Ports[0].NodePort),
},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2457,7 +2457,7 @@ var _ = Describe("Node Operations", func() {
},
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2555,7 +2555,7 @@ var _ = Describe("Node Operations", func() {
fmt.Sprintf("-p %s -d %s --dport %d -j REDIRECT --to-port %d", service.Spec.Ports[0].Protocol, service.Spec.ClusterIP, service.Spec.Ports[0].Port, int32(service.Spec.Ports[0].TargetPort.IntValue())),
},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2602,7 +2602,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down Expand Up @@ -2723,6 +2723,7 @@ var _ = Describe("Node Operations", func() {
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-ITP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{
"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN",
"-s 10.128.0.3 -m comment --comment namespace1/service1 -j SNAT --to-source 5.5.5.5",
},
},
Expand Down Expand Up @@ -2763,7 +2764,7 @@ var _ = Describe("Node Operations", func() {
},
"OVN-KUBE-SNAT-MGMTPORT": []string{},
"OVN-KUBE-ETP": []string{},
"OVN-KUBE-EGRESS-SVC": []string{},
"OVN-KUBE-EGRESS-SVC": []string{"-m mark --mark 0x3f0 -m comment --comment Do not SNAT to SVC VIP -j RETURN"},
},
"filter": {},
"mangle": {
Expand Down
Loading

0 comments on commit e2c981a

Please sign in to comment.