Reduce route selection deferral timer for bgp graceful restart #7533

Merged: 6 commits, Jul 26, 2021

shi-su
Contributor

@shi-su shi-su commented May 6, 2021

Why I did it

There are scenarios in which End-of-RIB from some of the peers arrives after reconciliation. In such scenarios, if the route selection deferral timer has its default value of 360 seconds, FRR does not set up routes, and all routes are removed after reconciliation. This PR reduces the route selection deferral timer so that at least the routes to some of the peers are restored by the time reconciliation happens.

Fix #7488

How I did it

Reduce route selection deferral timer for bgp graceful restart to 45 seconds.

How to verify it

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012

Description for the changelog


@shi-su shi-su marked this pull request as ready for review May 6, 2021 05:18
@shi-su shi-su requested a review from lguohan as a code owner May 6, 2021 05:18
@shi-su shi-su requested a review from yxieca May 6, 2021 05:19
@zhenggen-xu
Collaborator

If we disable the "route selection deferral timer", the restarting speaker could end up receiving EOR from only some peers, doing route selection too early, and then generating unnecessary route updates at the BGP layer. These could propagate to peers and impact traffic.

Can we just reduce this timer (route selection deferral timer) to a value that fits your environment? The default seems to be 360 seconds.

Also, the original issue (#7488), where we don't get EOR from one peer until after reconciliation (120 seconds), is a bit odd, as that timer value is already very large.

@lguohan
Collaborator

lguohan commented May 7, 2021

I agree with @zhenggen-xu: if the deferral time is too small, it could have a negative impact by sending routes to the peers too early. I think this timer can be around 90 seconds, so that it times out before the fpmsync reconciliation timeout while not being so small that peers have no chance to send EOR.

Meanwhile, I do think we need to understand why EOR is not received after 120 seconds.

@shi-su
Contributor Author

shi-su commented May 7, 2021

this timer can be around 90 seconds, so that it times out before the fpmsync reconciliation timeout while not being so small that peers have no chance to send EOR.

One tricky thing about this selection deferral timer is that it seems to take effect per peer and starts counting when a peer gets established. That is to say, we cannot control when it times out relative to the reconciliation timeline, since we do not know when the first peer gets established.

I am updating the selection deferral timer to 15 seconds: as per my observation, the EOR from all peers is likely to arrive within 15 seconds from the first one to get established. Assuming the reconciliation timer leaves a 15-second margin, that is, the first peer gets established at least 15 seconds before reconciliation happens, such a deferral timer should address the issue without imposing a negative effect.

In the meantime, we can put some effort into understanding why EOR comes after 120 seconds.
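
The margin argument above can be sketched in a few lines of Python. This is a simplified model, not FRR's actual behavior: it assumes a single deferral timer that starts when the first peer is established, and that route selection runs either when the last EOR arrives or when that timer expires, whichever comes first. The names `WarmRebootTimeline`, `route_selection_time`, and `routes_ready_before_reconciliation` are hypothetical, and apart from the 360-, 120-, and 15-second values taken from this thread, the example times are made up for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional, Sequence


@dataclass(frozen=True)
class WarmRebootTimeline:
    """Hypothetical model; all times are seconds since the reconciliation timer started."""
    reconciliation_timeout: float          # 120 s is the value mentioned in the thread
    select_defer_time: float               # route selection deferral timer
    first_peer_established: float          # when the first BGP session comes up
    eor_times: Sequence[Optional[float]]   # per-peer EOR arrival; None means it never arrives


def route_selection_time(t: WarmRebootTimeline) -> float:
    """Time at which deferred route selection runs under the simplified model."""
    # Assumption: one timer, started when the first peer is established.
    timer_expiry = t.first_peer_established + t.select_defer_time
    arrived = [eor for eor in t.eor_times if eor is not None]
    if not arrived or len(arrived) < len(t.eor_times):
        # At least one peer never sends EOR, so only the timer can trigger selection.
        return timer_expiry
    # Selection runs at the last EOR or at timer expiry, whichever is earlier.
    return min(max(arrived), timer_expiry)


def routes_ready_before_reconciliation(t: WarmRebootTimeline) -> bool:
    """True if route selection happens no later than reconciliation."""
    return route_selection_time(t) <= t.reconciliation_timeout


# Illustrative timelines (made-up arrival times, not measurements).
default_timer = WarmRebootTimeline(120.0, 360.0, 70.0, [80.0, 130.0])
short_timer = replace(default_timer, select_defer_time=15.0)
late_first_peer = replace(short_timer, first_peer_established=110.0,
                          eor_times=[115.0, 130.0])

for name, timeline in [("default 360 s", default_timer),
                       ("15 s timer", short_timer),
                       ("15 s timer, late first peer", late_first_peer)]:
    print(name, route_selection_time(timeline),
          routes_ready_before_reconciliation(timeline))
```

In this model, a short timer only helps when the first peer comes up early enough: the third timeline violates the 15-second margin, so selection still happens after reconciliation.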

@shi-su shi-su changed the title Disable route selection deferral timer for bgp graceful restart Reduce route selection deferral timer for bgp graceful restart May 7, 2021
@lguohan
Collaborator

lguohan commented May 7, 2021

as per my observation, the EOR from all peers is likely to arrive within 15 seconds from the first one to get established

is this valid? should we give more tolerance here?

@shi-su
Contributor Author

shi-su commented May 7, 2021

as per my observation, the EOR from all peers is likely to arrive within 15 seconds from the first one to get established

is this valid? should we give more tolerance here?

I think we need a more detailed analysis of the timing to find a good value for the timer here. Let me collect more data and circle back with my findings.

@shi-su
Contributor Author

shi-su commented May 11, 2021

I collected some data on the KVM platform about two timings: how long after the reconciliation timer starts the first peer gets established, and the interval between the first peer getting established and the last EOR being received. The distribution is presented in the attached figure. The results show that the first peer is typically established about 70 seconds after the reconciliation timer starts, whereas the interval between the first peer getting established and the last EOR being received ranges from 5 to 20 seconds. So I think 25 seconds would be a reasonable choice for the selection deferral timer to ensure some routes can be selected while avoiding unnecessary route updates.

I think another method might be extending the reconciliation timer to the same value as the selection deferral timer. This could better prevent running into the empty route table issue. The price is that the control plane might come back a bit later.

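[Figure: distribution of the time from reconciliation timer start to first peer establishment, and of the interval from first peer establishment to the last EOR received]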
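
The arithmetic behind the 25-second proposal can be checked with a short sketch. It uses only the figures quoted in this thread (first peer established at about 70 seconds, EOR interval between 5 and 20 seconds, reconciliation at 120 seconds) and assumes, as a simplification, that the deferral timer starts when the first peer is established. The names `DeferTimeCheck` and `check_select_defer_time` are hypothetical, and the two interval values are just the endpoints of the reported range, not the measured samples.

```python
from typing import Iterable, NamedTuple


class DeferTimeCheck(NamedTuple):
    covers_slowest_eor: bool            # timer is at least as long as the slowest observed EOR interval
    expiry: float                       # timer expiry, seconds after the reconciliation timer starts
    slack_before_reconciliation: float  # negative means the timer expires after reconciliation


def check_select_defer_time(defer_time: float,
                            first_peer_established: float,
                            eor_intervals: Iterable[float],
                            reconciliation_timeout: float = 120.0) -> DeferTimeCheck:
    """Evaluate a candidate deferral timer against observed timings (simplified model)."""
    slowest = max(eor_intervals)
    # Assumption: the timer starts when the first peer is established.
    expiry = first_peer_established + defer_time
    return DeferTimeCheck(
        covers_slowest_eor=defer_time >= slowest,
        expiry=expiry,
        slack_before_reconciliation=reconciliation_timeout - expiry,
    )


# Endpoints of the 5 to 20 second range reported above; 70 s is the typical
# first-peer establishment time from the same comment.
observed_range = [5.0, 20.0]

for candidate in (15.0, 25.0, 360.0):
    print(candidate, check_select_defer_time(candidate, 70.0, observed_range))
```

Under these assumptions, 25 seconds covers the slowest observed EOR and expires at 95 seconds, 25 seconds before reconciliation. A 15-second timer would expire before the slowest EOR in the reported range, and the 360-second default would expire long after reconciliation.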

@shi-su shi-su merged commit 8a48be9 into sonic-net:master Jul 26, 2021
carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this pull request Aug 7, 2021
…-net#7533)

Why I did it
There are scenarios in which End-of-RIB from some of the peers arrives after reconciliation. In such scenarios, if the route selection deferral timer has its default value of 360 seconds, FRR does not set up routes, and all routes are removed after reconciliation. This PR reduces the route selection deferral timer so that at least the routes to some of the peers are restored by the time reconciliation happens.

Fix sonic-net#7488

How I did it
Reduce route selection deferral timer for bgp graceful restart to 15 seconds.
@yxieca yxieca added the "Request for 202111 Branch" label Dec 17, 2021
qiluo-msft pushed a commit that referenced this pull request Dec 20, 2021
@judyjoseph
Contributor

Please raise a new PR for 202111 branch, as this patch cannot be cleanly cherry-picked.

@shi-su
Contributor Author

shi-su commented Jan 4, 2022

Please raise a new PR for 202111 branch, as this patch cannot be cleanly cherry-picked.

This PR is already included in 202111 branch, no need for cherry-picking.

@shi-su shi-su removed the "Request for 202111 Branch" label Jan 4, 2022
Development

Successfully merging this pull request may close these issues.

BGP fails to restore route table before reconciliation in warm reboot
6 participants