Routes with IPv6 link-local address as nexthop are not propagating to hardware #430

Open
kirankella opened this issue Jan 9, 2019 · 5 comments

Comments

@kirankella
Contributor

kirankella commented Jan 9, 2019

Description

The routing stack is FRR.
I am advertising BGP IPv6 routes from a peer node with an IPv6 link-local NEXTHOP address. The routes are learned in the BGP database and added to the Zebra RIB/FIB with the IPv6 link-local address as next hop.
However, orchagent does not propagate these routes to SAI.

orchagent logs the following error:
[ Jan 1 14:56:18.463637 sonic INFO swss#orchagent: :- addRoute: Failed to get next hop fe80::20c:9fff:fe02:203 for 1096::/98 ]

I see two issues that need to be fixed to resolve this:

  1. neighsyncd ignores the kernel netlink notifications for new IPv6 link-local neighbors, so they are never pushed to NEIGH_TABLE in the app DB. Until they are, the neighorch module cannot populate its next hop table with the link-local IPv6 neighbors. Once this is fixed, the addRoute operation in orchagent succeeds and the route is eventually pushed to SAI.

  2. Change 1 addresses only half of the problem. m_syncdNextHops (next hops) in neighorch is indexed only by IPAddress, and m_syncdNextHopGroups (next hop groups) in routeorch is indexed by IPAddresses. IMO the next hop index should be an {IP address + interface name} pair, not just the IP address; otherwise an IPv6 link-local address cannot be used as a next hop, because it is ambiguous without the interface. In DC scenarios with IPv6 link-local BGP peering on multiple links between the same two nodes, an ECMP group contains the same link-local next hop on different links. If the NextHopGroup is indexed by IPAddresses only, we cannot uniquely add all the link-local next hops of such an ECMP group.
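
The indexing problem in the second issue can be illustrated with a minimal Python sketch. It is not orchagent code: `NextHopKey`, the two builder functions, and the interface names `Ethernet0` and `Ethernet4` are hypothetical, chosen only to show what happens when the same link-local next hop is learned over two parallel links.

```python
from collections import namedtuple

# Hypothetical key type: a next hop identified by IP address plus interface name.
NextHopKey = namedtuple("NextHopKey", ["ip_address", "alias"])

# The same link-local next hop learned over two parallel links.
# The interface names are illustrative assumptions.
learned = [
    ("fe80::20c:9fff:fe02:203", "Ethernet0"),
    ("fe80::20c:9fff:fe02:203", "Ethernet4"),
]


def build_group_by_ip(next_hops):
    """Index ECMP members by IP address only; identical addresses collapse."""
    return frozenset(ip for ip, _ in next_hops)


def build_group_by_ip_and_interface(next_hops):
    """Index ECMP members by (IP address, interface); each link stays distinct."""
    return frozenset(NextHopKey(ip, alias) for ip, alias in next_hops)


by_ip = build_group_by_ip(learned)
by_pair = build_group_by_ip_and_interface(learned)

print(len(by_ip), len(by_pair))  # prints "1 2"
```

With the IP-only index the two ECMP members collapse into a single entry, while the composite key keeps one entry per link.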

I am planning to do the above 2 changes in orchagent code in the next few days.
Please let me know if you think this is not the right direction to address these issues.
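
The first change, no longer dropping link-local neighbors before they reach NEIGH_TABLE, might look roughly like the sketch below. This is an assumption about the shape of the filter, not the actual neighsyncd logic; the function name `should_sync_neighbor` and the `allow_link_local` flag are hypothetical.

```python
import ipaddress


def should_sync_neighbor(ip_str, allow_link_local):
    """Decide whether a kernel neighbor entry is written to NEIGH_TABLE.

    Hypothetical filter: IPv6 link-local neighbors (fe80::/10) are
    skipped unless allow_link_local is set.
    """
    addr = ipaddress.ip_address(ip_str)
    if addr.version == 6 and addr.is_link_local and not allow_link_local:
        return False
    return True


# Behavior described in the issue: the link-local neighbor is ignored.
print(should_sync_neighbor("fe80::20c:9fff:fe02:203", allow_link_local=False))
# Proposed behavior: the link-local neighbor is synced.
print(should_sync_neighbor("fe80::20c:9fff:fe02:203", allow_link_local=True))
```

Global addresses pass the filter in both modes, so only link-local handling changes.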

Steps to reproduce the issue

  1. Add IPv6 static route with IPv6 link-local next hop (or)
  2. Advertise IPv6 BGP NLRI with IPv6 link-local next hop. Check 'show bgp ipv6' and 'show ipv6 route'.
  3. Test IPv6 traffic L3 forwarding destined to the routes.

Describe the results you received

  • addRoute failure logs are observed in orchagent.
  • Testing traffic to check whether the routes were added to hardware shows that forwarding fails.

Describe the results you expected

  • Traffic forwarding should work over IPv6 link-local next hops too.

Additional information you deem important (e.g. issue happens only occasionally)

Output of show version

root@sonic:/home/admin# show version
SONiC Software Version: SONiC.master.0-dirty-20190108.121803
Distribution: Debian 9.6
Kernel: 4.9.0-7-amd64
Build commit: d9c076d
Build date: Tue Jan  8 07:03:52 UTC 2019
Built by: kiran@kiran-virtualBox-128GB

Docker images:
REPOSITORY                 TAG                              IMAGE ID            SIZE
docker-orchagent-brcm      latest                           f082a242236c        289 MB
docker-orchagent-brcm      master.0-dirty-20190108.121803   f082a242236c        289 MB
docker-teamd               latest                           cfc11fc65414        277.2 MB
docker-teamd               master.0-dirty-20190108.121803   cfc11fc65414        277.2 MB
docker-fpm-frr             latest                           b5d1d5af58e8        284 MB
docker-fpm-frr             master.0-dirty-20190108.121803   b5d1d5af58e8        284 MB
docker-syncd-brcm          latest                           3016f7047ed2        364.4 MB
docker-syncd-brcm          master.0-dirty-20190108.121803   3016f7047ed2        364.4 MB
docker-lldp-sv2            latest                           6ef8ec64c92e        277.2 MB
docker-lldp-sv2            master.0-dirty-20190108.121803   6ef8ec64c92e        277.2 MB
docker-platform-monitor    latest                           4e887e16193d        289.6 MB
docker-platform-monitor    master.0-dirty-20190108.121803   4e887e16193d        289.6 MB
docker-dhcp-relay          latest                           6e2f78b2ba84        258 MB
docker-dhcp-relay          master.0-dirty-20190108.121803   6e2f78b2ba84        258 MB
docker-database            latest                           abd05c0c74e2        256.6 MB
docker-database            master.0-dirty-20190108.121803   abd05c0c74e2        256.6 MB
docker-snmp-sv2            latest                           df6b224e9b37        295.5 MB
docker-snmp-sv2            master.0-dirty-20190108.121803   df6b224e9b37        295.5 MB
docker-router-advertiser   latest                           b36c74e0762a        254.3 MB
docker-router-advertiser   master.0-dirty-20190108.121803   b36c74e0762a        254.3 MB
@prsunny
Contributor

prsunny commented Jan 10, 2019

@zhenggen-xu to review

@zhenggen-xu
Collaborator

To support ipv6 link-local, we do have a PR available:
sonic-net/sonic-swss#437

It essentially enables link-local addresses for neighbors and deals with the overlapping cases for ip2me and host routes.

There were some issues in the SAI implementation that could cause a crash with link-local addresses on VLAN member ports, but those should be fixed in recent SAI, so this PR will be rebased and resumed soon.

If you really do have the same link-local address for next hops on different ports, I agree that we need to change the m_syncdNextHops map so that it can uniquely identify next hops sharing that IP; currently we don't have this scenario in our DC. The nexthop_ids from the SAI API call, on the other hand, should already take the interface into account.

BTW: This issue should be created against sonic-swss.

@kirankella
Contributor Author

Thanks for the information.
We see that packets destined to our interface's link-local IPv6 unicast address (ping ipv6 ) do not reach the CPU, so ping to the link-local address fails.
For the same reason, IPv6 BGP peers over link-local addresses (which are essential in DC deployments that auto-detect IPv6 neighbors over link-local addresses) won't form an adjacency.
That's because the link-local interface address (fe80::) routes are not pushed to the hardware.

I believe this link-local ping issue would also be addressed once the sonic-net/sonic-swss#437 fixes are merged. Right?

@zhenggen-xu
Collaborator

Yes, that PR should fix the issue you mentioned above.

@kirankella
Contributor Author

Any tentative timeline when this PR may be merged?
