Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[chassis/multi-asic] Make sure iBGP session established as directly connected #16777

Merged
merged 12 commits into from
Oct 10, 2023

Conversation

abdosi
Copy link
Contributor

@abdosi abdosi commented Oct 4, 2023

What I did:
Make Sure for internal iBGP we are one-hop away (directly connected) by using Generic TTL security mechanism.

Why I did:
Without this change it's possible on packet chassis i-BGP can be established even if there no direct connection. Below is the example

  • Let's say we have 3 LC's LC1/LC2/LC3 each having i-BGP session session with each other over Loopback4096
  • Each LC's have static route towards other LC's Loopback4096 to establish i-BGP session
  • LC1 learn default route 0.0.0.0/0 from it's e-BGP peers and send it over to LC2 and LC3 over i-BGP
  • Now for some reason on LC2 static route towards LC3 is removed/not-present/some-issue we expect i-BGP session should go down between LC2 and LC3
  • However i-BGP between LC2 and LC3 does not go down because of feature ip nht-resolve-via-default  where LC2 will use default route to reach Loopback4096 of LC3. As it's using default route BGP packets from LC2 towards LC3 will first route to LC1 and then go to LC3 from there.
  • Above scenario can result in packet mis-forwarding on data plane

How I fixed it:-

  • To make sure BGP packets between i-BGP peers are not going with extra routing hop enable using GTSM feature
neighbor PEER ttl-security hops NUMBER
This command enforces Generalized TTL Security Mechanism (GTSM), as specified in RFC 5082. With this command, only neighbors that are the specified number of hops away will be allowed to become neighbors. This command is mutually exclusive with ebgp-multihop.
  • We set hop count as 1 which makes FRR to reject BGP connection if we receive BGP packets if it's TTL < 255. Also setting this attribute make sure i-BGP frames are originated with IP TTL of 255.

How I verfiy:

Manual Verification of above scenario. See blow BGP packets receive with IP TTL 254 (additional routing hop) we are seeing FIN TCP flags as BGP is rejecting the connection

image

@abdosi abdosi changed the title [chassis/mulit-asic] Make sure iBGP session established as directly connected [chassis/multi-asic] Make sure iBGP session established as directly connected Oct 5, 2023
@abdosi
Copy link
Contributor Author

abdosi commented Oct 6, 2023

@arlakshm can you please help with review of this

Copy link
Contributor

@arlakshm arlakshm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@abdosi
Copy link
Contributor Author

abdosi commented Oct 6, 2023

@lguohan / @StormLiangMS can you please help with merge of this.

@lguohan lguohan merged commit 7059f42 into sonic-net:master Oct 10, 2023
19 checks passed
@mssonicbld
Copy link
Collaborator

@abdosi PR conflicts with 202305 branch

@gechiang
Copy link
Collaborator

@yxieca can you review if this can be approved for 202205?
Thanks!

mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Oct 25, 2023
…onnected (sonic-net#16777)

What I did:
Make Sure for internal iBGP we are one-hop away (directly connected) by using Generic TTL security mechanism.

Why I did:
Without this change it's possible on packet chassis i-BGP can be established even if there no direct connection. Below is the example

- Let's say we have 3 LC's LC1/LC2/LC3 each having i-BGP session session with each other over Loopback4096
- Each LC's have static route towards other LC's Loopback4096 to establish i-BGP session
- LC1 learn default route 0.0.0.0/0 from it's e-BGP peers and send it over to LC2 and LC3 over i-BGP
- Now for some reason on LC2 static route towards LC3 is removed/not-present/some-issue we expect i-BGP session should go down between LC2 and LC3
- However i-BGP between LC2 and LC3 does not go down because of feature ip nht-resolve-via-default  where LC2 will use default route to reach Loopback4096 of LC3. As it's using default route BGP packets from LC2 towards LC3 will first route to LC1 and then go to LC3 from there.

Above scenario can result in packet mis-forwarding on data plane

How I fixed it:-

To make sure BGP packets between i-BGP peers are not going with extra routing hop enable using GTSM feature

neighbor PEER ttl-security hops NUMBER

This command enforces Generalized TTL Security Mechanism (GTSM), as specified in RFC 5082. With this command, only neighbors that are the specified number of hops away will be allowed to become neighbors. This command is mutually exclusive with ebgp-multihop.

We set hop count as 1 which makes FRR to reject BGP connection if we receive BGP packets if it's TTL < 255. Also setting this attribute make sure i-BGP frames are originated with IP TTL of 255.

How I verify:

Manual Verification of above scenario. See blow BGP packets receive with IP TTL 254 (additional routing hop) we are seeing FIN TCP flags as BGP is rejecting the connection

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202205: #16997

mssonicbld pushed a commit that referenced this pull request Oct 25, 2023
…onnected (#16777)

What I did:
Make Sure for internal iBGP we are one-hop away (directly connected) by using Generic TTL security mechanism.

Why I did:
Without this change it's possible on packet chassis i-BGP can be established even if there no direct connection. Below is the example

- Let's say we have 3 LC's LC1/LC2/LC3 each having i-BGP session session with each other over Loopback4096
- Each LC's have static route towards other LC's Loopback4096 to establish i-BGP session
- LC1 learn default route 0.0.0.0/0 from it's e-BGP peers and send it over to LC2 and LC3 over i-BGP
- Now for some reason on LC2 static route towards LC3 is removed/not-present/some-issue we expect i-BGP session should go down between LC2 and LC3
- However i-BGP between LC2 and LC3 does not go down because of feature ip nht-resolve-via-default  where LC2 will use default route to reach Loopback4096 of LC3. As it's using default route BGP packets from LC2 towards LC3 will first route to LC1 and then go to LC3 from there.

Above scenario can result in packet mis-forwarding on data plane

How I fixed it:-

To make sure BGP packets between i-BGP peers are not going with extra routing hop enable using GTSM feature

neighbor PEER ttl-security hops NUMBER

This command enforces Generalized TTL Security Mechanism (GTSM), as specified in RFC 5082. With this command, only neighbors that are the specified number of hops away will be allowed to become neighbors. This command is mutually exclusive with ebgp-multihop.

We set hop count as 1 which makes FRR to reject BGP connection if we receive BGP packets if it's TTL < 255. Also setting this attribute make sure i-BGP frames are originated with IP TTL of 255.

How I verify:

Manual Verification of above scenario. See blow BGP packets receive with IP TTL 254 (additional routing hop) we are seeing FIN TCP flags as BGP is rejecting the connection

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
@saksarav-nokia
Copy link
Contributor

@abdosi ,
In Broadcom based Chassis, the TTL for ibgp packets will be 254 since the packets are sent via recycle ports and the TTL will be decremented in the ingress asic. So the ibgp neighbors are not coming up in our chassis after this PR is merged

lguohan pushed a commit that referenced this pull request Nov 18, 2023
What I did:

Revert the GTSM feature for VOQ iBGP session done as part of #16777.

Why I did:
On VOQ chassis BGP packets go over Recycle Port and then for Ingress Pipeline Routing making ttl as 254 and failing single hop check.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
abdosi added a commit to abdosi/sonic-buildimage that referenced this pull request Nov 20, 2023
…onnected (sonic-net#16777)

What I did:
Make Sure for internal iBGP we are one-hop away (directly connected) by using Generic TTL security mechanism.

Why I did:
Without this change it's possible on packet chassis i-BGP can be established even if there no direct connection. Below is the example

- Let's say we have 3 LC's LC1/LC2/LC3 each having i-BGP session session with each other over Loopback4096
- Each LC's have static route towards other LC's Loopback4096 to establish i-BGP session
- LC1 learn default route 0.0.0.0/0 from it's e-BGP peers and send it over to LC2 and LC3 over i-BGP
- Now for some reason on LC2 static route towards LC3 is removed/not-present/some-issue we expect i-BGP session should go down between LC2 and LC3
- However i-BGP between LC2 and LC3 does not go down because of feature ip nht-resolve-via-default  where LC2 will use default route to reach Loopback4096 of LC3. As it's using default route BGP packets from LC2 towards LC3 will first route to LC1 and then go to LC3 from there.

Above scenario can result in packet mis-forwarding on data plane

How I fixed it:-

To make sure BGP packets between i-BGP peers are not going with extra routing hop enable using GTSM feature

neighbor PEER ttl-security hops NUMBER

This command enforces Generalized TTL Security Mechanism (GTSM), as specified in RFC 5082. With this command, only neighbors that are the specified number of hops away will be allowed to become neighbors. This command is mutually exclusive with ebgp-multihop.

We set hop count as 1 which makes FRR to reject BGP connection if we receive BGP packets if it's TTL < 255. Also setting this attribute make sure i-BGP frames are originated with IP TTL of 255.

How I verify:

Manual Verification of above scenario. See blow BGP packets receive with IP TTL 254 (additional routing hop) we are seeing FIN TCP flags as BGP is rejecting the connection

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
@abdosi
Copy link
Contributor Author

abdosi commented Nov 20, 2023

@abdosi PR conflicts with 202305 branch

@StormLiangMS : #17237

StormLiangMS pushed a commit that referenced this pull request Nov 22, 2023
* [chassis/multi-asic] Make sure iBGP session established as directly connected  (#16777)

What I did:
Make Sure for internal iBGP we are one-hop away (directly connected) by using Generic TTL security mechanism.

Why I did:
Without this change it's possible on packet chassis i-BGP can be established even if there no direct connection. Below is the example

- Let's say we have 3 LC's LC1/LC2/LC3 each having i-BGP session session with each other over Loopback4096
- Each LC's have static route towards other LC's Loopback4096 to establish i-BGP session
- LC1 learn default route 0.0.0.0/0 from it's e-BGP peers and send it over to LC2 and LC3 over i-BGP
- Now for some reason on LC2 static route towards LC3 is removed/not-present/some-issue we expect i-BGP session should go down between LC2 and LC3
- However i-BGP between LC2 and LC3 does not go down because of feature ip nht-resolve-via-default  where LC2 will use default route to reach Loopback4096 of LC3. As it's using default route BGP packets from LC2 towards LC3 will first route to LC1 and then go to LC3 from there.

Above scenario can result in packet mis-forwarding on data plane

How I fixed it:-

To make sure BGP packets between i-BGP peers are not going with extra routing hop enable using GTSM feature

neighbor PEER ttl-security hops NUMBER

This command enforces Generalized TTL Security Mechanism (GTSM), as specified in RFC 5082. With this command, only neighbors that are the specified number of hops away will be allowed to become neighbors. This command is mutually exclusive with ebgp-multihop.

We set hop count as 1 which makes FRR to reject BGP connection if we receive BGP packets if it's TTL < 255. Also setting this attribute make sure i-BGP frames are originated with IP TTL of 255.

How I verify:

Manual Verification of above scenario. See blow BGP packets receive with IP TTL 254 (additional routing hop) we are seeing FIN TCP flags as BGP is rejecting the connection

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>

* Update peer-group.conf.j2

* Update result_all.conf

* Update result_base.conf

---------

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Nov 29, 2023
What I did:

Revert the GTSM feature for VOQ iBGP session done as part of sonic-net#16777.

Why I did:
On VOQ chassis BGP packets go over Recycle Port and then for Ingress Pipeline Routing making ttl as 254 and failing single hop check.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
yxieca pushed a commit that referenced this pull request Nov 30, 2023
What I did:

Revert the GTSM feature for VOQ iBGP session done as part of #16777.

Why I did:
On VOQ chassis BGP packets go over Recycle Port and then for Ingress Pipeline Routing making ttl as 254 and failing single hop check.

Signed-off-by: Abhishek Dosi <abdosi@microsoft.com>
Co-authored-by: abdosi <58047199+abdosi@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

9 participants