[chassis/multi-asic] Make sure iBGP session established as directly connected #16777

abdosi · 2023-10-04T20:55:41Z

What I did:
Make sure internal iBGP peers are one hop away (directly connected) by using the Generalized TTL Security Mechanism (GTSM).

Why I did:
Without this change, on a packet chassis an iBGP session can be established even when there is no direct connection between the peers. For example:

- Say we have three line cards (LC1, LC2, LC3), each with an iBGP session to the others over Loopback4096.
- Each LC has static routes towards the other LCs' Loopback4096 addresses to establish the iBGP sessions.
- LC1 learns the default route 0.0.0.0/0 from its eBGP peers and advertises it to LC2 and LC3 over iBGP.
- Now suppose the static route on LC2 towards LC3 is removed or missing for some reason. We expect the iBGP session between LC2 and LC3 to go down.
- However, the session does not go down, because with the ip nht resolve-via-default feature LC2 uses the default route to reach the Loopback4096 of LC3. Since it is using the default route, BGP packets from LC2 towards LC3 are first routed to LC1 and only then go to LC3.

This scenario can result in packets being misforwarded in the data plane.
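The fallback described above can be sketched with a small longest-prefix-match model. This is only an illustration, not how zebra resolves next hops: the addresses (10.0.0.1, 10.0.0.3) and the `lookup` helper are hypothetical and are not taken from this PR.

```python
import ipaddress


def lookup(routes, destination):
    """Return (network, next_hop) for the longest matching prefix, or None."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in routes.items():
        network = ipaddress.ip_network(prefix)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, next_hop)
    return best


# Hypothetical Loopback4096 address of LC3 (assumption for illustration).
LC3_LOOPBACK = "10.0.0.3"

# LC2's view: static routes to the other LCs plus the default learned from LC1.
lc2_routes = {
    "10.0.0.1/32": "LC1",  # static route to LC1's loopback
    "10.0.0.3/32": "LC3",  # static route to LC3's loopback
    "0.0.0.0/0": "LC1",    # default route received from LC1 over iBGP
}

healthy = lookup(lc2_routes, LC3_LOOPBACK)

# The static route towards LC3 goes missing on LC2.
broken_routes = dict(lc2_routes)
del broken_routes["10.0.0.3/32"]
degraded = lookup(broken_routes, LC3_LOOPBACK)

print("healthy next hop:", healthy[1])
print("degraded next hop:", degraded[1])
```

With the static route present, LC3's loopback resolves directly; once it is gone, the only match left is the default route, so the peer stays reachable through LC1 with one extra routing hop.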

How I fixed it:

To make sure BGP packets between iBGP peers do not take an extra routing hop, enable the GTSM feature:

neighbor PEER ttl-security hops NUMBER
This command enforces the Generalized TTL Security Mechanism (GTSM), as specified in RFC 5082. With this command, only neighbors that are the specified number of hops away will be allowed to become neighbors. This command is mutually exclusive with ebgp-multihop.

We set the hop count to 1, which makes FRR reject the BGP connection if it receives BGP packets with TTL < 255. Setting this attribute also ensures that iBGP packets are originated with an IP TTL of 255.
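The TTL arithmetic behind this check can be sketched as follows. This is a simplified model of the GTSM rule from RFC 5082, not FRR's actual code; the function names are hypothetical, and the minimum-TTL formula (255 - hops + 1) is stated here as an assumption about how a hop count maps to a threshold.

```python
MAX_TTL = 255  # GTSM senders originate packets with the maximum IP TTL


def gtsm_min_ttl(hops):
    """Lowest TTL accepted from a peer expected to be `hops` hops away."""
    if not 1 <= hops <= 254:
        raise ValueError("hops must be in the range 1..254")
    return MAX_TTL - hops + 1


def gtsm_accepts(received_ttl, hops=1):
    """True if a packet with `received_ttl` passes the GTSM check."""
    return received_ttl >= gtsm_min_ttl(hops)


# Directly connected peer: the packet arrives with the TTL it was sent with.
print(gtsm_accepts(255, hops=1))  # True

# One extra routing hop (for example LC2 -> LC1 -> LC3) decrements the TTL once.
print(gtsm_accepts(254, hops=1))  # False
```

With `hops=1` the threshold is 255, so any packet that has been routed through an intermediate LC is dropped, which is the behavior this change relies on.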

How I verify:

Manual verification of the above scenario. When BGP packets are received with IP TTL 254 (one additional routing hop), we see TCP FIN flags because BGP is rejecting the connection.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

abdosi · 2023-10-06T00:01:06Z

@arlakshm can you please help with review of this

arlakshm

lgtm

abdosi · 2023-10-06T17:15:43Z

@lguohan / @StormLiangMS can you please help with merge of this.

mssonicbld · 2023-10-17T07:19:14Z

@abdosi PR conflicts with 202305 branch

gechiang · 2023-10-24T19:53:31Z

@yxieca can you review if this can be approved for 202205?
Thanks!

mssonicbld · 2023-10-25T00:35:14Z

Cherry-pick PR to 202205: #16997

saksarav-nokia · 2023-10-27T21:37:14Z

@abdosi,
On Broadcom-based chassis, the TTL of iBGP packets will be 254, because the packets are sent via recycle ports and the TTL is decremented in the ingress ASIC. As a result, the iBGP neighbors are not coming up in our chassis after this PR was merged.

What I did:
Revert the GTSM feature for VOQ iBGP sessions done as part of #16777.

Why I did:
On a VOQ chassis, BGP packets go over the recycle port and are then routed in the ingress pipeline, which makes the TTL 254 and fails the single-hop check.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
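One way to picture the resulting behavior (GTSM kept for packet chassis, not applied to VOQ chassis) is a small helper that decides whether to emit the ttl-security line. This is a hypothetical sketch only: the function name and the `is_voq_chassis` flag are assumptions, and the actual change lives in the BGP configuration templates, which are not reproduced here.

```python
def ttl_security_lines(peer, is_voq_chassis, hops=1):
    """Return the GTSM config lines for an internal peer, or none on VOQ.

    On a VOQ chassis the recycle-port path decrements the TTL to 254,
    so a single-hop GTSM check would reject every iBGP packet.
    """
    if is_voq_chassis:
        return []
    return [f"neighbor {peer} ttl-security hops {hops}"]


# PEER is a placeholder, as in the command syntax quoted in the description.
print(ttl_security_lines("PEER", is_voq_chassis=False))
print(ttl_security_lines("PEER", is_voq_chassis=True))
```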

abdosi · 2023-11-20T11:52:51Z

@abdosi PR conflicts with 202305 branch

@StormLiangMS : #17237

* [chassis/multi-asic] Make sure iBGP session established as directly connected (#16777)
* Update peer-group.conf.j2
* Update result_all.conf
* Update result_base.conf

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

abdosi added 12 commits August 3, 2023 04:47

- 7a3d7e5: Fix the Loopback0 IPv6 address of LCs in chassis not reachable from peer devices. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
- b5b08cf: Merge remote-tracking branch 'upstream/master'
- 8d9dbb6: Added change to have flag. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
- d04bc54: Merge remote-tracking branch 'upstream/master'
- 264d912: Merge remote-tracking branch 'upstream/master'
- 957cd71: Merge remote-tracking branch 'upstream/master'
- 4e8b101: Assign the metric value for IPv6 default route learnt via RA message to higher value so that BGP learnt default route is higher priority. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
- 05ec92a: Merge remote-tracking branch 'upstream/master'
- fcbd38d: Add alternate name for bridge interface on supervisor in chassis system. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
- 76019a7: Merge remote-tracking branch 'upstream/master'
- f6ad6f2: Merge remote-tracking branch 'upstream/master'
- 7f4b36a: Make sure for internal iBGP we are one hop away (directly connected) by using Generic TTL security mechanism. Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

abdosi requested review from StormLiangMS and lguohan as code owners October 4, 2023 20:55

abdosi requested review from arlakshm and judyjoseph October 4, 2023 20:56

abdosi added Request for 202205 Branch Request for 202305 Branch labels Oct 4, 2023

abdosi changed the title ~~[chassis/mulit-asic] Make sure iBGP session established as directly connected~~ [chassis/multi-asic] Make sure iBGP session established as directly connected Oct 5, 2023

arlakshm approved these changes Oct 6, 2023

judyjoseph approved these changes Oct 6, 2023

lguohan approved these changes Oct 10, 2023

lguohan merged commit 7059f42 into sonic-net:master Oct 10, 2023
19 checks passed

StormLiangMS added the Approved for 202305 Branch label Oct 17, 2023

mssonicbld added the Cherry Pick Conflict_202305 label Oct 17, 2023

yxieca added the Approved for 202205 Branch label Oct 25, 2023

mssonicbld added the Created PR to 202205 Branch label Oct 25, 2023

mssonicbld mentioned this pull request Oct 25, 2023

[action] [PR:16777] [chassis/multi-asic] Make sure iBGP session established as directly connected #16997

Merged

mssonicbld added Included in 202205 Branch and removed Approved for 202205 Branch Created PR to 202205 Branch labels Oct 25, 2023

abdosi mentioned this pull request Oct 28, 2023

Revert iBGP GTSM feature for VOQ Chassis #17037

Merged

abdosi mentioned this pull request Nov 20, 2023

[202305] PR to make BGP GTSM feature for packet-chassis #17237

Merged

StormLiangMS added Included in 202305 Branch and removed Cherry Pick Conflict_202305 labels Nov 22, 2023

mssonicbld mentioned this pull request Nov 29, 2023

[action] [PR:17037] Revert iBGP GTSM feature for VOQ Chassis #17347

Merged
