Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[routesync] Stale neighbor fix (PR #2553) review comment fix #3007

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

vganesan-nokia
Copy link
Contributor

What I did

Fixed a review comment for the commits made in PR #2553

Why I did it

The original fix in PR #2553 deleted stale neighbor from APPL_DB and ASIC_DB even for new route (if the route has only one next hop on interface eth0 or docker0). This is incorrect. The stale neighbor should be cleared only for the route delete command. There was review comment identifying this requirement. To address the review comment, changes are done to stale neighbor fix PR #2553. Changes include adding check to delete the stale neighbor from ASIC_DB and APPL_DB only if the kernel command is RTM_DELROUTE

How I verified it

vs test

Details if related

PR #2553

Changes done to stale neighbor fix PR#2553 (sonic-net#2553)
Changes include adding check to delete the stale neighbor from ASIC_DB
and APPL_DB only if the kernel command is RTM_DELROUTE

Signed-off-by: vedganes <veda.ganesan@nokia.com>
@@ -782,7 +782,7 @@ void RouteSync::onRouteMsg(int nlmsg_type, struct nl_object *obj, char *vrf)
// In this case since we do not want the route with next hop on eth0/docker0, we return.
// But still we need to clear the route from the APPL_DB. Otherwise the APPL_DB and data
// path will be left with stale route entry
if(alsv.size() == 1)
if((nlmsg_type == RTM_DELROUTE) && (alsv.size() == 1))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vganesan-nokia , bit confused with the comments above this line. I thought the whole idea was for a route which had previously some nexthops is now changed to only have 'eth0' in which case we have to delete it from ASIC even if its a RTM_ADD command. So with this, how do we handle that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny, before PR #2553, when delete route update comes with only next hop on eth0/docker0, the update will not be sent to APPL_DB and ASIC_DB. This resulted in out-of-sync route table situation between kernel and ASIC_DB. The idea of PR #2553 is to avoid leaving these routes un-deleted in APPL_DB and ASIC_DB when the "delete" route update comes with all the next hops removed except next hop on eth0/docker0. For "add" route updates, due to timing/sequence of next hop updates, we may get a valid route update with a next hop on interfaces other than eth0/docker0 which will be sent to APPL_DB and ASIC_DB. Later if we receive add route update with next hop on eth0/docker0 (only one next hop), the changes in PR #2553 wrongly delets the already programmed valid route - as commented by @peter-yu. This is fixed by doing the delete only for delete route updates.

@yuxuehong
Copy link

Due to PR #2553, when route with muti nexthops which not include eth0, frr has feture which when one nexthop invalid,will resolve to default; thus zebra will update route with muti nexthop, and one of them via eth0, thus we can not handle this situation;
so, handle muti nexthop which include eth0, we sholud update the non-eth0 routes to APPDB?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants