all: use ->text when parsing protocol argument #10
and match on full protocol name in proto_redistnum() Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Continuous Integration Result: FAILED
See below for issues. This is a comment from an EXPERIMENTAL automated CI system.
Get source and apply patch from patchwork: Successful
Building Stage: Failed
Fedora24 amd64 build: Successful
CentOS6 amd64 build: Failed
CentOS6 amd64 build: Unknown Log <log_configure.txt>
FreeBSD10 amd64 build: Failed
DejaGNU Unittests (make check) failed for FreeBSD10 amd64 build
Ubuntu1204 amd64 build: Failed
Ubuntu1204 amd64 build: Unknown Log <log_configure.txt>
Continuous Integration Result: SUCCESSFUL
Congratulations, this patch passed basic tests
Tested-by: NetDEF / OpenSourceRouting.org CI System
CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-23/
This is a comment from an EXPERIMENTAL automated CI system.
proto_redistnum() now accepts full protocol strings and not partial names per FRRouting#10 Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
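The difference between partial and full matching of protocol names can be sketched as follows. This is an illustration only, not the FRR implementation: the table contents, the numeric values, and the function name `redistnum_exact` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical protocol table; names and numbers are illustrative only. */
struct proto_entry {
	const char *name;
	int num;
};

static const struct proto_entry proto_table[] = {
	{"kernel", 1}, {"connected", 2}, {"static", 3}, {"rip", 4},
	{"ripng", 5},  {"ospf", 6},      {"ospf6", 7},  {"bgp", 8},
};

/* Return the protocol number only when `s` equals a full protocol name.
 * Prefixes such as "osp" or "ri" are rejected with -1. */
static int redistnum_exact(const char *s)
{
	assert(s != NULL);

	for (size_t i = 0; i < sizeof(proto_table) / sizeof(proto_table[0]); i++) {
		if (strcmp(s, proto_table[i].name) == 0)
			return proto_table[i].num;
	}
	return -1;
}
```

With exact matching, "rip" and "ripng" resolve to different entries without any ordering tricks, and an abbreviated token is treated as an error instead of silently matching the first protocol that shares its prefix.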
If path->net is NULL in the bgp_path_info_free() function, then bgpd would crash in bgp_addpath_free_info_data() with the following backtrace:
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007ff7b267a42a in __GI_abort () at abort.c:89
#2 0x00007ff7b39c1ca0 in core_handler (signo=11, siginfo=0x7ffff66414f0, context=<optimized out>) at lib/sigevent.c:249
#3 <signal handler called>
#4 idalloc_free_to_pool (pool_ptr=pool_ptr@entry=0x0, id=3) at lib/id_alloc.c:368
#5 0x0000560096246688 in bgp_addpath_free_info_data (d=d@entry=0x560098665468, nd=0x0) at bgpd/bgp_addpath.c:100
#6 0x00005600961bb522 in bgp_path_info_free (path=0x560098665400) at bgpd/bgp_route.c:252
#7 bgp_path_info_unlock (path=0x560098665400) at bgpd/bgp_route.c:276
#8 0x00005600961bb719 in bgp_path_info_reap (rn=rn@entry=0x5600986b2110, pi=pi@entry=0x560098665400) at bgpd/bgp_route.c:320
#9 0x00005600961bf4db in bgp_process_main_one (safi=SAFI_MPLS_VPN, afi=AFI_IP, rn=0x5600986b2110, bgp=0x560098587320) at bgpd/bgp_route.c:2476
#10 bgp_process_wq (wq=<optimized out>, data=0x56009869b8f0) at bgpd/bgp_route.c:2503
#11 0x00007ff7b39d5fcc in work_queue_run (thread=0x7ffff6641e10) at lib/workqueue.c:294
#12 0x00007ff7b39ce3b1 in thread_call (thread=thread@entry=0x7ffff6641e10) at lib/thread.c:1606
#13 0x00007ff7b39a3538 in frr_run (master=0x5600980795b0) at lib/libfrr.c:1011
#14 0x000056009618a5a3 in main (argc=3, argv=0x7ffff6642078) at bgpd/bgp_main.c:481
Add a null-check protection to fix this problem.
Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
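A minimal sketch of the kind of guard described above, assuming simplified stand-in structures. The types `dest_sketch` and `path_sketch` and the function `path_release_id` are hypothetical and do not mirror the real bgpd definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the bgpd structures; field names are illustrative. */
struct dest_sketch {
	int ids_in_use; /* number of IDs currently handed out for this destination */
};

struct path_sketch {
	struct dest_sketch *net; /* may be NULL, as in the reported crash */
	bool has_id;
};

/* Release the path's ID back to its destination, if there is one.
 * Returns true when an ID was actually released. */
static bool path_release_id(struct path_sketch *path)
{
	/* Guard first: without a destination there is no pool to return the ID to. */
	if (path == NULL || path->net == NULL || !path->has_id)
		return false;

	path->net->ids_in_use--;
	path->has_id = false;
	return true;
}
```

The point of the guard is that the cleanup path stays safe to call on a path that was never attached to a destination, or that was already detached from it.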
The following ASAN issue has been observed:
> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
> #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
> #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
> #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
> #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
> #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
> #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
> #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
> #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
> #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
> #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
> #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
> #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
> #12 0x7f26f275bae4 in route_table_free lib/table.c:111
> #13 0x7f26f275b749 in route_table_finish lib/table.c:46
> #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
> #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
> #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
> #17 0x7f26f2777108 in event_call lib/event.c:2011
> #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
> #19 0x55910c4f49cb in main zebra/main.c:531
> #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)
It happens with FRR using the kernel. During shutdown, zebra attempts to obtain the namespace identifier in order to prepare zebra dataplane nexthop messages. Fix this by accessing the ns structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The following heap-use-after-free message appears during the teardown of a topotest that uses protocol nexthop-groups.
> ==739645==ERROR: AddressSanitizer: heap-use-after-free on address 0x60e00004df48 at pc 0x558966dbd6d1 bp 0x7ffdfc1e0ec0 sp 0x7ffdfc1e0eb0
> READ of size 8 at 0x60e00004df48 thread T0
> #0 0x558966dbd6d0 in dplane_ctx_route_init zebra/zebra_dplane.c:3447
> #1 0x558966dbd8f5 in dplane_route_update_internal zebra/zebra_dplane.c:4237
> #2 0x558966e5eb99 in rib_uninstall_kernel zebra/zebra_rib.c:778
> #3 0x558966e685f8 in rib_process_del_fib zebra/zebra_rib.c:1023
> #4 0x558966e685f8 in rib_process zebra/zebra_rib.c:1489
> #5 0x558966e6ab55 in process_subq_route zebra/zebra_rib.c:2792
> #6 0x558966e6ab55 in process_subq zebra/zebra_rib.c:3356
> #7 0x558966e6ab55 in meta_queue_process zebra/zebra_rib.c:3395
> #8 0x7f7fd771207f in work_queue_run lib/workqueue.c:282
> #9 0x7f7fd76f3d3b in event_call lib/event.c:2011
> #10 0x7f7fd761b897 in frr_run lib/libfrr.c:1212
> #11 0x558966d270b6 in main zebra/main.c:533
> #12 0x7f7fd7029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #13 0x7f7fd7029e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #14 0x558966d29ed4 in _start (/usr/lib/frr/zebra+0x1b4ed4)
>
> 0x60e00004df48 is located 40 bytes inside of 160-byte region [0x60e00004df20,0x60e00004dfc0)
> freed by thread T0 here:
> #0 0x7f7fd7ab4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> #1 0x558966e6b38b in process_subq_nhg zebra/zebra_rib.c:2730
> #2 0x558966e6b38b in process_subq zebra/zebra_rib.c:3342
> #3 0x558966e6b38b in meta_queue_process zebra/zebra_rib.c:3395
> #4 0x7f7fd771207f in work_queue_run lib/workqueue.c:282
> #5 0x7f7fd76f3d3b in event_call lib/event.c:2011
> #6 0x7f7fd761b897 in frr_run lib/libfrr.c:1212
> #7 0x558966d270b6 in main zebra/main.c:533
> #8 0x7f7fd7029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
A ROUTE_DELETE message is sent with an NHE identifier, in addition to NHG_DELETE. The latter message triggers the deletion of the NHE, but no check is done for the former message. Fix this by checking if the NHE ID exists before sending it to the dataplane.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
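The "check that the ID still exists before using it" idea can be sketched with a small registry of live IDs. Everything here is hypothetical: the fixed-size `nhe_registry`, its helpers, and the name `route_delete_may_reference` are made up for illustration, and what zebra does when the check fails is not shown in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NHE_SLOTS 8

/* Hypothetical registry of live nexthop-group IDs; 0 marks an empty slot. */
struct nhe_registry {
	uint32_t ids[NHE_SLOTS];
};

/* Record an ID as live. Returns false if the registry is full or id is 0. */
static bool nhe_registry_add(struct nhe_registry *reg, uint32_t id)
{
	if (id == 0)
		return false;
	for (size_t i = 0; i < NHE_SLOTS; i++) {
		if (reg->ids[i] == 0) {
			reg->ids[i] = id;
			return true;
		}
	}
	return false;
}

/* Forget an ID, as a group delete would. */
static void nhe_registry_remove(struct nhe_registry *reg, uint32_t id)
{
	for (size_t i = 0; i < NHE_SLOTS; i++) {
		if (reg->ids[i] == id)
			reg->ids[i] = 0;
	}
}

/* A route delete may reference the group only while the ID is still live. */
static bool route_delete_may_reference(const struct nhe_registry *reg, uint32_t id)
{
	if (id == 0)
		return false;
	for (size_t i = 0; i < NHE_SLOTS; i++) {
		if (reg->ids[i] == id)
			return true;
	}
	return false;
}
```

Looking the entry up by ID at the moment of use, instead of holding on to a pointer obtained earlier, is what keeps the route delete from touching memory that the group delete has already released.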
…d: fix show bgp all with evpn
Merge in HARDWARE/frr from psuchy/fix_show_bgp_all to akamai/debian/frr-8.4.2
Squashed commit of the following:
commit 094f403d1c900e232ac009f3ac0047dfd652c58e
Author: Louis Scalbert <louis.scalbert@6wind.com>
Date: Thu Dec 29 16:50:54 2022 +0100
bgpd: fix show bgp all with evpn
Fix crash on "show bgp all" when BGP EVPN is set.
> #0 raise (sig=11) at ../sysdeps/unix/sysv/linux/raise.c:50
> #1 0x00007fdfe03cf53c in core_handler (signo=11, siginfo=0x7ffdebbffe30, context=0x7ffdebbffd00) at lib/sigevent.c:261
> #2 <signal handler called>
> #3 0x00000000004d4fec in bgp_attr_get_community (attr=0x41) at bgpd/bgp_attr.h:553
> #4 0x00000000004eee84 in bgp_show_table (vty=0x1a790d0, bgp=0x19d0a00, safi=SAFI_EVPN, table=0x19f6010, type=bgp_show_type_normal, output_arg=0x0, rd=0x0, is_last=1, output_cum=0x0,
> total_cum=0x0, json_header_depth=0x7ffdebc00bf8, show_flags=4, rpki_target_state=RPKI_NOT_BEING_USED) at bgpd/bgp_route.c:11329
> #5 0x00000000004f7765 in bgp_show (vty=0x1a790d0, bgp=0x19d0a00, afi=AFI_L2VPN, safi=SAFI_EVPN, type=bgp_show_type_normal, output_arg=0x0, show_flags=4,
> rpki_target_state=RPKI_NOT_BEING_USED) at bgpd/bgp_route.c:11814
> #6 0x00000000004fb53b in show_ip_bgp_magic (self=0x6752b0 <show_ip_bgp_cmd>, vty=0x1a790d0, argc=3, argv=0x19cb050, viewvrfname=0x0, all=0x1395390 "all", aa_nn=0x0, community_list=0,
> community_list_str=0x0, community_list_name=0x0, as_path_filter_name=0x0, prefix_list=0x0, accesslist_name=0x0, rmap_name=0x0, version=0, version_str=0x0, alias_name=0x0,
> orr_group_name=0x0, detail_routes=0x0, uj=0x0, detail_json=0x0, wide=0x0) at bgpd/bgp_route.c:13040
> #7 0x00000000004fa322 in show_ip_bgp (self=0x6752b0 <show_ip_bgp_cmd>, vty=0x1a790d0, argc=3, argv=0x19cb050) at ./bgpd/bgp_route_clippy.c:519
> #8 0x00007fdfe033ccc8 in cmd_execute_command_real (vline=0x19c9300, filter=FILTER_RELAXED, vty=0x1a790d0, cmd=0x0, up_level=0) at lib/command.c:996
> #9 0x00007fdfe033c739 in cmd_execute_command (vline=0x19c9300, vty=0x1a790d0, cmd=0x0, vtysh=0) at lib/command.c:1056
> #10 0x00007fdfe033cdf5 in cmd_execute (vty=0x1a790d0, cmd=0x19c9eb0 "show bgp all", matched=0x0, vtysh=0) at lib/command.c:1223
> #11 0x00007fdfe03f65c6 in vty_command (vty=0x1a790d0, buf=0x19c9eb0 "show bgp all") at lib/vty.c:486
> #12 0x00007fdfe03f603b in vty_execute (vty=0x1a790d0) at lib/vty.c:1249
> #13 0x00007fdfe03f533b in vtysh_read (thread=0x7ffdebc03838) at lib/vty.c:2148
> #14 0x00007fdfe03e815d in thread_call (thread=0x7ffdebc03838) at lib/thread.c:2006
> #15 0x00007fdfe0379b54 in frr_run (master=0x1246880) at lib/libfrr.c:1198
> #16 0x000000000042b2a8 in main (argc=7, argv=0x7ffdebc03af8) at bgpd/bgp_main.c:520
Link: FRRouting#12576
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
After a recursive protocol NHG has been refreshed, a heap-use-after-free happens on the NHG dependencies.
> READ of size 4 at 0x60e000074cc0 thread T0
> #0 0x555ea629eef0 in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1904
> #1 0x555ea62a2748 in zebra_nhg_proto_add zebra/zebra_nhg.c:3981
> #2 0x555ea62ccf6c in process_subq_nhg zebra/zebra_rib.c:2737
> #3 0x555ea62ccf6c in process_subq zebra/zebra_rib.c:3342
> #4 0x555ea62ccf6c in meta_queue_process zebra/zebra_rib.c:3395
> #5 0x7fd799f1207f in work_queue_run lib/workqueue.c:282
> #6 0x7fd799ef3d3b in event_call lib/event.c:2011
> #7 0x7fd799e1b897 in frr_run lib/libfrr.c:1212
> #8 0x555ea61860b6 in main zebra/main.c:533
> #9 0x7fd799829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #10 0x7fd799829e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #11 0x555ea6188ed4 in _start (/usr/lib/frr/zebra+0x1b4ed4)
>
> 0x60e000074cc0 is located 96 bytes inside of 160-byte region [0x60e000074c60,0x60e000074d00)
> freed by thread T0 here:
> #0 0x7fd79a2b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> #1 0x555ea629ef69 in nhg_connected_tree_decrement_ref zebra/zebra_nhg.c:187
> #2 0x555ea629eec7 in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1920
> #3 0x555ea62bc110 in route_entry_update_nhe zebra/zebra_rib.c:454
> #4 0x555ea62bc3fb in rib_handle_nhg_replace zebra/zebra_rib.c:478
> #5 0x555ea62a22f8 in zebra_nhg_proto_add zebra/zebra_nhg.c:3966
This happens when 'debug zebra nexthop detail' is enabled: the debug code tries to display the nhg_depends list, whose NHEs have already been flushed. Fix this by removing the nhg_depends list itself, before sending it to zebra_nhg_free().
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
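The general pattern behind this fix, detaching a dependency list from its owner before releasing the entries, can be sketched as below. The types and function names are hypothetical, and a plain singly linked list is assumed in place of zebra's own containers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical dependency list: a plain singly linked list of IDs. */
struct dep_sketch {
	struct dep_sketch *next;
	unsigned int id;
};

struct group_sketch {
	struct dep_sketch *depends;
};

/* Push a new dependency on the group. Returns 0 on success, -1 on allocation failure. */
static int group_add_depend(struct group_sketch *g, unsigned int id)
{
	struct dep_sketch *d = malloc(sizeof(*d));

	if (d == NULL)
		return -1;
	d->id = id;
	d->next = g->depends;
	g->depends = d;
	return 0;
}

/* Count dependencies; stands in for debug code that walks the list. */
static size_t group_depend_count(const struct group_sketch *g)
{
	size_t n = 0;

	for (const struct dep_sketch *d = g->depends; d != NULL; d = d->next)
		n++;
	return n;
}

/* Detach the list from the group first, then free the detached entries.
 * Any later walk over g->depends sees an empty list, never freed memory. */
static void group_flush_depends(struct group_sketch *g)
{
	struct dep_sketch *d = g->depends;

	g->depends = NULL;
	while (d != NULL) {
		struct dep_sketch *next = d->next;

		free(d);
		d = next;
	}
}
```

Because the owner's pointer is cleared before any entry is freed, code that later iterates over the group, such as a debug dump, has nothing stale to dereference.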
When a failover happens on ECMP paths that share a recursively resolved nexthop, ZEBRA replaces the old NHG with a new one, and updates the pointer of all routes using that nexthop. If only the recursive nexthop changed, there is no need to replace the old NHG. Modify the zebra_nhg_proto_add() function to update the recursive nexthop on the original NHG. This replaces the old method, which allocated a new NHE. This change triggers an ASAN error in the bgp_nhg_zapi_scalability test, function test_bgp_ipv4_simulate_r5_machine_going_down().
> ==1195107==ERROR: AddressSanitizer: heap-use-after-free on address 0x60e0000de580 at pc 0x55b6b7d55d8e bp 0x7fffd81977a0 sp 0x7fffd8197790
> READ of size 4 at 0x60e0000de580 thread T0
> #0 0x55b6b7d55d8d in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1858
> #1 0x55b6b7d55fee in zebra_nhg_free_members zebra/zebra_nhg.c:1752
> #2 0x55b6b7d55fee in zebra_nhg_free zebra/zebra_nhg.c:1772
> #3 0x55b6b7d59215 in zebra_nhg_proto_add zebra/zebra_nhg.c:3883
> #4 0x55b6b7d83615 in process_subq_nhg zebra/zebra_rib.c:2738
> #5 0x55b6b7d83615 in process_subq zebra/zebra_rib.c:3344
> #6 0x55b6b7d83615 in meta_queue_process zebra/zebra_rib.c:3397
> #7 0x7fe57a916fef in work_queue_run lib/workqueue.c:282
> #8 0x7fe57a8f863b in event_call lib/event.c:1996
> #9 0x7fe57a81e527 in frr_run lib/libfrr.c:1237
> #10 0x55b6b7c40c74 in main zebra/main.c:526
> #11 0x7fe57a229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #12 0x7fe57a229e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #13 0x55b6b7c43b84 in _start (/usr/lib/frr/zebra+0x1adb84)
>
> 0x60e0000de580 is located 96 bytes inside of 160-byte region [0x60e0000de520,0x60e0000de5c0)
> freed by thread T0 here:
> #0 0x7fe57acb4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> #1 0x55b6b7d59628 in zebra_nhg_proto_add zebra/zebra_nhg.c:3876
> #2 0x55b6b7d83615 in process_subq_nhg zebra/zebra_rib.c:2738
> #3 0x55b6b7d83615 in process_subq zebra/zebra_rib.c:3344
> #4 0x55b6b7d83615 in meta_queue_process zebra/zebra_rib.c:3397
> #5 0x7fe57a916fef in work_queue_run lib/workqueue.c:282
> #6 0x7fe57a8f863b in event_call lib/event.c:1996
> #7 0x7fe57a81e527 in frr_run lib/libfrr.c:1237
> #8 0x55b6b7c40c74 in main zebra/main.c:526
> #9 0x7fe57a229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
> #0 0x7fe57acb4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7fe57a83e98e in qcalloc lib/memory.c:106
> #2 0x55b6b7d5149e in zebra_nhg_alloc zebra/zebra_nhg.c:392
> #3 0x55b6b7d5149e in zebra_nhe_copy zebra/zebra_nhg.c:499
> #4 0x55b6b7d5181f in zebra_nhg_hash_alloc zebra/zebra_nhg.c:538
> #5 0x7fe57a7fbf0d in hash_get lib/hash.c:147
> #6 0x55b6b7d542ea in zebra_nhe_find zebra/zebra_nhg.c:832
> #7 0x55b6b7d5495f in zebra_nhg_find zebra/zebra_nhg.c:1014
> #8 0x55b6b7d54dcd in zebra_nhg_find_nexthop zebra/zebra_nhg.c:1031
> #9 0x55b6b7d535e8 in depends_find_recursive zebra/zebra_nhg.c:1514
> #10 0x55b6b7d535e8 in depends_find zebra/zebra_nhg.c:1563
> #11 0x55b6b7d535e8 in depends_find_add zebra/zebra_nhg.c:1602
> #12 0x55b6b7d59884 in zebra_nhg_update_nhe zebra/zebra_nhg.c:3738
> #13 0x55b6b7d59884 in zebra_nhg_proto_add zebra/zebra_nhg.c:3844
> #14 0x55b6b7d83615 in process_subq_nhg zebra/zebra_rib.c:2738
> #15 0x55b6b7d83615 in process_subq zebra/zebra_rib.c:3344
> #16 0x55b6b7d83615 in meta_queue_process zebra/zebra_rib.c:3397
> #17 0x7fe57a916fef in work_queue_run lib/workqueue.c:282
> #18 0x7fe57a8f863b in event_call lib/event.c:1996
> #19 0x7fe57a81e527 in frr_run lib/libfrr.c:1237
> #20 0x55b6b7c40c74 in main zebra/main.c:526
> #21 0x7fe57a229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> SUMMARY: AddressSanitizer: heap-use-after-free zebra/zebra_nhg.c:1858 in zebra_nhg_decrement_ref
> Shadow bytes around the buggy address:
> 0x0c1c80013c60: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
> 0x0c1c80013c70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
> 0x0c1c80013c80: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
> 0x0c1c80013c90: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
> 0x0c1c80013ca0: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
> =>0x0c1c80013cb0:[fd]fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
> 0x0c1c80013cc0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c1c80013cd0: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
> 0x0c1c80013ce0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
> 0x0c1c80013cf0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
> 0x0c1c80013d00: 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
> Addressable: 00
> Partially addressable: 01 02 03 04 05 06 07
> Heap left redzone: fa
> Freed heap region: fd
> Stack left redzone: f1
> Stack mid redzone: f2
> Stack right redzone: f3
> Stack after return: f5
> Stack use after scope: f8
> Global redzone: f9
> Global init order: f6
> Poisoned by user: f7
> Container overflow: fc
> Array cookie: ac
> Intra object redzone: bb
> ASan internal: fe
> Left alloca redzone: ca
> Right alloca redzone: cb
> Shadow gap: cc
> ==1195107==ABORTING
>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A general flush is done on the nhg_depends of the protocol nexthop group. However, the NHG should not be removed if there are routes attached to it. At the same time, it seems the route count does not propagate to the nhg_depends. The drawback of this method is that an ASAN error still occurs and, compared with the refcount values of the old way (allocation), the count is lower than expected for nexthop groups with route count only:
Allocation method in proto_add():
> 2024/10/14 10:57:24.915401 ZEBRA: [VB8P9-5F2GE] zebra_nhg_proto_add: BEFORE NHE 71428576, (71428576[39/49/59]) cnt 2002
> 2024/10/14 10:57:24.915510 ZEBRA: [HCTBK-W37K2] zebra_nhg_proto_add: NHE 71428576, (71428576[49/59/65]) cnt 1
> 2024/10/14 10:57:24.915513 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 49, (49[50]) cnt 2012
> 2024/10/14 10:57:24.915515 ZEBRA: [VP9H1-EV2BN] (71428573)
> 2024/10/14 10:57:24.915515 ZEBRA: [VP9H1-EV2BN] (71428574)
> 2024/10/14 10:57:24.915516 ZEBRA: [VP9H1-EV2BN] (71428576)
> 2024/10/14 10:57:24.915517 ZEBRA: [VP9H1-EV2BN] (71428578)
> 2024/10/14 10:57:24.915517 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 59, (59[60]) cnt 2007
> 2024/10/14 10:57:24.915519 ZEBRA: [VP9H1-EV2BN] (71428575)
> 2024/10/14 10:57:24.915519 ZEBRA: [VP9H1-EV2BN] (71428576)
> 2024/10/14 10:57:24.915520 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 65, (65[42]) cnt 4
> 2024/10/14 10:57:24.915521 ZEBRA: [VP9H1-EV2BN] (71428571)
> 2024/10/14 10:57:24.915522 ZEBRA: [VP9H1-EV2BN] (71428576)
Method using general flush, but keep old pointer:
> 2024/10/14 10:51:17.229799 ZEBRA: [VB8P9-5F2GE] zebra_nhg_proto_add: BEFORE NHE 71428576, (71428576[39/49/59]) cnt 2002
> 2024/10/14 10:51:17.229909 ZEBRA: [HCTBK-W37K2] zebra_nhg_proto_add: NHE 71428576, (71428576[49/59/65]) cnt 2002
> 2024/10/14 10:51:17.229912 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 49, (49[50]) cnt 2011
> 2024/10/14 10:51:17.229914 ZEBRA: [VP9H1-EV2BN] (71428573)
> 2024/10/14 10:51:17.229915 ZEBRA: [VP9H1-EV2BN] (71428574)
> 2024/10/14 10:51:17.229915 ZEBRA: [VP9H1-EV2BN] (71428576)
> 2024/10/14 10:51:17.229916 ZEBRA: [VP9H1-EV2BN] (71428578)
> 2024/10/14 10:51:17.229916 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 59, (59[60]) cnt 2006
> 2024/10/14 10:51:17.229918 ZEBRA: [VP9H1-EV2BN] (71428575)
> 2024/10/14 10:51:17.229918 ZEBRA: [VP9H1-EV2BN] (71428576)
> 2024/10/14 10:51:17.229919 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 65, (65[42]) cnt 4
> 2024/10/14 10:51:17.229920 ZEBRA: [VP9H1-EV2BN] (71428571)
> 2024/10/14 10:51:17.229921 ZEBRA: [VP9H1-EV2BN] (71428576)
Resulting ASAN error when running bgp_nhg_zapi_notification, on the test_bgp_ipv4_simulate_r5_machine_going_down() function:
> r1: zebra triggered an exception by AddressSanitizer
> AddressSanitizer error in topotest `test_bgp_nhg_zapi_scalability.py`, test `teardown_module`, router `r1`
>
> ERROR: AddressSanitizer: heap-use-after-free on address 0x60e0000de580 at pc 0x558a7d98cd8e bp 0x7fff4915a6e0 sp 0x7fff4915a6d0
> READ of size 4 at 0x60e0000de580 thread T0
> #0 0x558a7d98cd8d in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1858
> #1 0x558a7d98cfee in zebra_nhg_free_members zebra/zebra_nhg.c:1752
> #2 0x558a7d98cfee in zebra_nhg_free zebra/zebra_nhg.c:1772
> #3 0x558a7d9901ff in zebra_nhg_proto_add zebra/zebra_nhg.c:3861
> #4 0x558a7d9ba365 in process_subq_nhg zebra/zebra_rib.c:2738
> #5 0x558a7d9ba365 in process_subq zebra/zebra_rib.c:3344
> #6 0x558a7d9ba365 in meta_queue_process zebra/zebra_rib.c:3397
> #7 0x7fa262f16fef in work_queue_run lib/workqueue.c:282
> #8 0x7fa262ef863b in event_call lib/event.c:1996
> #9 0x7fa262e1e527 in frr_run lib/libfrr.c:1237
> #10 0x558a7d877c74 in main zebra/main.c:526
> #11 0x7fa262829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #12 0x7fa262829e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #13 0x558a7d87ab84 in _start (/usr/lib/frr/zebra+0x1acb84)
>
> 0x60e0000de580 is located 96 bytes inside of 160-byte region [0x60e0000de520,0x60e0000de5c0)
> freed by thread T0 here:
> #0 0x7fa2632b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> #1 0x558a7d9908a1 in zebra_nhg_proto_add zebra/zebra_nhg.c:3854
> #2 0x558a7d9ba365 in process_subq_nhg zebra/zebra_rib.c:2738
> #3 0x558a7d9ba365 in process_subq zebra/zebra_rib.c:3344
> #4 0x558a7d9ba365 in meta_queue_process zebra/zebra_rib.c:3397
> #5 0x7fa262f16fef in work_queue_run lib/workqueue.c:282
> #6 0x7fa262ef863b in event_call lib/event.c:1996
> #7 0x7fa262e1e527 in frr_run lib/libfrr.c:1237
> #8 0x558a7d877c74 in main zebra/main.c:526
> #9 0x7fa262829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
> #0 0x7fa2632b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7fa262e3e98e in qcalloc lib/memory.c:106
> #2 0x558a7d98849e in zebra_nhg_alloc zebra/zebra_nhg.c:392
> #3 0x558a7d98849e in zebra_nhe_copy zebra/zebra_nhg.c:499
> #4 0x558a7d98881f in zebra_nhg_hash_alloc zebra/zebra_nhg.c:538
> #5 0x7fa262dfbf0d in hash_get lib/hash.c:147
> #6 0x558a7d98b2ea in zebra_nhe_find zebra/zebra_nhg.c:832
> #7 0x558a7d98b95f in zebra_nhg_find zebra/zebra_nhg.c:1014
> #8 0x558a7d98bdcd in zebra_nhg_find_nexthop zebra/zebra_nhg.c:1031
> #9 0x558a7d98a5e8 in depends_find_recursive zebra/zebra_nhg.c:1514
> #10 0x558a7d98a5e8 in depends_find zebra/zebra_nhg.c:1563
> #11 0x558a7d98a5e8 in depends_find_add zebra/zebra_nhg.c:1602
> #12 0x558a7d990378 in zebra_nhg_update_nhe zebra/zebra_nhg.c:3739
> #13 0x558a7d990378 in zebra_nhg_proto_add zebra/zebra_nhg.c:3822
> #14 0x558a7d9ba365 in process_subq_nhg zebra/zebra_rib.c:2738
> #15 0x558a7d9ba365 in process_subq zebra/zebra_rib.c:3344
> #16 0x558a7d9ba365 in meta_queue_process zebra/zebra_rib.c:3397
> #17 0x7fa262f16fef in work_queue_run lib/workqueue.c:282
> #18 0x7fa262ef863b in event_call lib/event.c:1996
> #19 0x7fa262e1e527 in frr_run lib/libfrr.c:1237
> #20 0x558a7d877c74 in main zebra/main.c:526
> #21 0x7fa262829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> SUMMARY: AddressSanitizer: heap-use-after-free zebra/zebra_nhg.c:1858 in zebra_nhg_decrement_ref
> Shadow bytes around the buggy address:
> 0x0c1c80013c60: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
> 0x0c1c80013c70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
> 0x0c1c80013c80: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
> 0x0c1c80013c90: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
> 0x0c1c80013ca0: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
> =>0x0c1c80013cb0:[fd]fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
> 0x0c1c80013cc0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c1c80013cd0: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
> 0x0c1c80013ce0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
> 0x0c1c80013cf0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
> 0x0c1c80013d00: 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
> Addressable: 00
> Partially addressable: 01 02 03 04 05 06 07
> Heap left redzone: fa
> Freed heap region: fd
> Stack left redzone: f1
> Stack mid redzone: f2
> Stack right redzone: f3
> Stack after return: f5
> Stack use after scope: f8
> Global redzone: f9
> Global init order: f6
> Poisoned by user: f7
> Container overflow: fc
> Array cookie: ac
> Intra object redzone: bb
> ASan internal: fe
> Left alloca redzone: ca
> Right alloca redzone: cb
> Shadow gap: cc
>
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
A general flush is done on the depends (nhg_depends) of the protocol nexthop group. However, an NHG should not be removed if routes are still attached to it. At the same time, the route count does not seem to propagate to the nhg_depends.

The drawback of this method is that the ASAN error is still present. Also, compared with the refcount values of the old way (allocation), the count is lower than expected for nexthop groups that only carry a route count.

Allocation method in proto_add():

> 2024/10/14 10:57:24.915401 ZEBRA: [VB8P9-5F2GE] zebra_nhg_proto_add: BEFORE NHE 71428576, (71428576[39/49/59]) cnt 2002
> 2024/10/14 10:57:24.915510 ZEBRA: [HCTBK-W37K2] zebra_nhg_proto_add: NHE 71428576, (71428576[49/59/65]) cnt 1
> 2024/10/14 10:57:24.915513 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 49, (49[50]) cnt 2012
> 2024/10/14 10:57:24.915515 ZEBRA: [VP9H1-EV2BN] (71428573)
> 2024/10/14 10:57:24.915515 ZEBRA: [VP9H1-EV2BN] (71428574)
> 2024/10/14 10:57:24.915516 ZEBRA: [VP9H1-EV2BN] (71428576)
> 2024/10/14 10:57:24.915517 ZEBRA: [VP9H1-EV2BN] (71428578)
> 2024/10/14 10:57:24.915517 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 59, (59[60]) cnt 2007
> 2024/10/14 10:57:24.915519 ZEBRA: [VP9H1-EV2BN] (71428575)
> 2024/10/14 10:57:24.915519 ZEBRA: [VP9H1-EV2BN] (71428576)
> 2024/10/14 10:57:24.915520 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 65, (65[42]) cnt 4
> 2024/10/14 10:57:24.915521 ZEBRA: [VP9H1-EV2BN] (71428571)
> 2024/10/14 10:57:24.915522 ZEBRA: [VP9H1-EV2BN] (71428576)

Method using a general flush, but keeping the old pointer:

> 2024/10/14 10:51:17.229799 ZEBRA: [VB8P9-5F2GE] zebra_nhg_proto_add: BEFORE NHE 71428576, (71428576[39/49/59]) cnt 2002
> 2024/10/14 10:51:17.229909 ZEBRA: [HCTBK-W37K2] zebra_nhg_proto_add: NHE 71428576, (71428576[49/59/65]) cnt 2002
> 2024/10/14 10:51:17.229912 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 49, (49[50]) cnt 2011
> 2024/10/14 10:51:17.229914 ZEBRA: [VP9H1-EV2BN] (71428573)
> 2024/10/14 10:51:17.229915 ZEBRA: [VP9H1-EV2BN] (71428574)
> 2024/10/14 10:51:17.229915 ZEBRA: [VP9H1-EV2BN] (71428576)
> 2024/10/14 10:51:17.229916 ZEBRA: [VP9H1-EV2BN] (71428578)
> 2024/10/14 10:51:17.229916 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 59, (59[60]) cnt 2006
> 2024/10/14 10:51:17.229918 ZEBRA: [VP9H1-EV2BN] (71428575)
> 2024/10/14 10:51:17.229918 ZEBRA: [VP9H1-EV2BN] (71428576)
> 2024/10/14 10:51:17.229919 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add: NHE 65, (65[42]) cnt 4
> 2024/10/14 10:51:17.229920 ZEBRA: [VP9H1-EV2BN] (71428571)
> 2024/10/14 10:51:17.229921 ZEBRA: [VP9H1-EV2BN] (71428576)

Resulting ASAN error when running bgp_nhg_zapi_notification, in the test_bgp_ipv4_simulate_r5_machine_going_down() function:

> r1: zebra triggered an exception by AddressSanitizer
> AddressSanitizer error in topotest `test_bgp_nhg_zapi_scalability.py`, test `teardown_module`, router `r1`
>
> ERROR: AddressSanitizer: heap-use-after-free on address 0x60e0000de580 at pc 0x558a7d98cd8e bp 0x7fff4915a6e0 sp 0x7fff4915a6d0
> READ of size 4 at 0x60e0000de580 thread T0
> #0 0x558a7d98cd8d in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1858
> #1 0x558a7d98cfee in zebra_nhg_free_members zebra/zebra_nhg.c:1752
> #2 0x558a7d98cfee in zebra_nhg_free zebra/zebra_nhg.c:1772
> #3 0x558a7d9901ff in zebra_nhg_proto_add zebra/zebra_nhg.c:3861
> #4 0x558a7d9ba365 in process_subq_nhg zebra/zebra_rib.c:2738
> #5 0x558a7d9ba365 in process_subq zebra/zebra_rib.c:3344
> #6 0x558a7d9ba365 in meta_queue_process zebra/zebra_rib.c:3397
> #7 0x7fa262f16fef in work_queue_run lib/workqueue.c:282
> #8 0x7fa262ef863b in event_call lib/event.c:1996
> #9 0x7fa262e1e527 in frr_run lib/libfrr.c:1237
> #10 0x558a7d877c74 in main zebra/main.c:526
> #11 0x7fa262829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #12 0x7fa262829e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #13 0x558a7d87ab84 in _start (/usr/lib/frr/zebra+0x1acb84)
>
> 0x60e0000de580 is located 96 bytes inside of 160-byte region [0x60e0000de520,0x60e0000de5c0)
> freed by thread T0 here:
> #0 0x7fa2632b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> #1 0x558a7d9908a1 in zebra_nhg_proto_add zebra/zebra_nhg.c:3854
> #2 0x558a7d9ba365 in process_subq_nhg zebra/zebra_rib.c:2738
> #3 0x558a7d9ba365 in process_subq zebra/zebra_rib.c:3344
> #4 0x558a7d9ba365 in meta_queue_process zebra/zebra_rib.c:3397
> #5 0x7fa262f16fef in work_queue_run lib/workqueue.c:282
> #6 0x7fa262ef863b in event_call lib/event.c:1996
> #7 0x7fa262e1e527 in frr_run lib/libfrr.c:1237
> #8 0x558a7d877c74 in main zebra/main.c:526
> #9 0x7fa262829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
> #0 0x7fa2632b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7fa262e3e98e in qcalloc lib/memory.c:106
> #2 0x558a7d98849e in zebra_nhg_alloc zebra/zebra_nhg.c:392
> #3 0x558a7d98849e in zebra_nhe_copy zebra/zebra_nhg.c:499
> #4 0x558a7d98881f in zebra_nhg_hash_alloc zebra/zebra_nhg.c:538
> #5 0x7fa262dfbf0d in hash_get lib/hash.c:147
> #6 0x558a7d98b2ea in zebra_nhe_find zebra/zebra_nhg.c:832
> #7 0x558a7d98b95f in zebra_nhg_find zebra/zebra_nhg.c:1014
> #8 0x558a7d98bdcd in zebra_nhg_find_nexthop zebra/zebra_nhg.c:1031
> #9 0x558a7d98a5e8 in depends_find_recursive zebra/zebra_nhg.c:1514
> #10 0x558a7d98a5e8 in depends_find zebra/zebra_nhg.c:1563
> #11 0x558a7d98a5e8 in depends_find_add zebra/zebra_nhg.c:1602
> #12 0x558a7d990378 in zebra_nhg_update_nhe zebra/zebra_nhg.c:3739
> #13 0x558a7d990378 in zebra_nhg_proto_add zebra/zebra_nhg.c:3822
> #14 0x558a7d9ba365 in process_subq_nhg zebra/zebra_rib.c:2738
> #15 0x558a7d9ba365 in process_subq zebra/zebra_rib.c:3344
> #16 0x558a7d9ba365 in meta_queue_process zebra/zebra_rib.c:3397
> #17 0x7fa262f16fef in work_queue_run lib/workqueue.c:282
> #18 0x7fa262ef863b in event_call lib/event.c:1996
> #19 0x7fa262e1e527 in frr_run lib/libfrr.c:1237
> #20 0x558a7d877c74 in main zebra/main.c:526
> #21 0x7fa262829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> SUMMARY: AddressSanitizer: heap-use-after-free zebra/zebra_nhg.c:1858 in zebra_nhg_decrement_ref
> Shadow bytes around the buggy address:
> 0x0c1c80013c60: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
> 0x0c1c80013c70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
> 0x0c1c80013c80: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
> 0x0c1c80013c90: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
> 0x0c1c80013ca0: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
> =>0x0c1c80013cb0:[fd]fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
> 0x0c1c80013cc0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c1c80013cd0: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
> 0x0c1c80013ce0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
> 0x0c1c80013cf0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
> 0x0c1c80013d00: 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
> Addressable: 00
> Partially addressable: 01 02 03 04 05 06 07
> Heap left redzone: fa
> Freed heap region: fd
> Stack left redzone: f1
> Stack mid redzone: f2
> Stack right redzone: f3
> Stack after return: f5
> Stack use after scope: f8
> Global redzone: f9
> Global init order: f6
> Poisoned by user: f7
> Container overflow: fc
> Array cookie: ac
> Intra object redzone: bb
> ASan internal: fe
> Left alloca redzone: ca
> Right alloca redzone: cb
> Shadow gap: cc

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
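The lifetime problem behind this report, a nexthop group entry being freed while something still points at it, can be illustrated with a minimal reference-counted entry. The sketch below is not zebra's code: `struct nhe_sketch`, `nhe_new()`, `nhe_ref()`, `nhe_unref()` and `nhe_freed_count` are hypothetical names chosen for illustration, and each entry has at most one dependency for brevity. It only shows the invariant that an entry is released when its last holder (a route or a parent group) drops its reference, and that a parent gives back the reference it took on its dependency.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a nexthop group entry (not zebra's structure). */
struct nhe_sketch {
	unsigned int id;
	int refcnt;                /* routes and parent groups holding this entry */
	struct nhe_sketch *depend; /* a single dependency, for brevity */
};

/* Instrumentation for the example only: number of entries released so far. */
int nhe_freed_count;

/* Allocate an entry; the new parent takes one reference on its dependency. */
struct nhe_sketch *nhe_new(unsigned int id, struct nhe_sketch *depend)
{
	struct nhe_sketch *nhe = calloc(1, sizeof(*nhe));

	if (nhe == NULL)
		abort();
	nhe->id = id;
	nhe->depend = depend;
	if (depend != NULL)
		depend->refcnt++;
	return nhe;
}

/* A route or an owner starts using the entry. */
void nhe_ref(struct nhe_sketch *nhe)
{
	nhe->refcnt++;
}

/*
 * Drop one reference. The entry is freed only when nobody holds it anymore,
 * and only then is the reference on its dependency given back.
 * Returns 1 if the entry was freed, 0 otherwise.
 */
int nhe_unref(struct nhe_sketch *nhe)
{
	struct nhe_sketch *depend;

	assert(nhe->refcnt > 0);
	if (--nhe->refcnt > 0)
		return 0;

	depend = nhe->depend; /* read before free(): never touch nhe afterwards */
	free(nhe);
	nhe_freed_count++;
	if (depend != NULL)
		nhe_unref(depend);
	return 1;
}
```

Under this rule, flushing a parent group cannot free a dependency that routes still use; the dependency is released only once its own count reaches zero.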
Fix a heap-use-after-free that causes zebra to crash even without address-sanitizer.

To reproduce:

> echo "100 my_table" | tee -a /etc/iproute2/rt_tables
> ip route add blackhole default table 100
> ip route show table 100
> ip l add red type vrf table 100
> ip l del red
> ip route del blackhole default table 100

Zebra manages routing tables for all existing Linux RT tables, regardless of whether they are assigned to a VRF interface. When a table is not assigned to any VRF, zebra arbitrarily assigns it to the default VRF, even though this is not strictly accurate (the code expects this behavior).

When an RT table is created after a VRF, zebra correctly assigns the table to the VRF. However, if a VRF interface is assigned to an existing RT table, zebra does not update the table owner, which remains the default VRF. As a result, existing routing entries remain under the default VRF, while new entries are correctly assigned to the VRF. The code does not expect this VRF mismatch, which causes crashes and memory-related issues.

Furthermore, Linux does not automatically delete RT tables when they are unassigned from a VRF. It is incorrect to delete these tables from zebra.

Instead, when a VRF is disabled, do not release the table but reassign it to the default VRF. When a VRF is enabled, change the table owner back to the appropriate VRF.

> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
> #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
> #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
> #2 0x7fa32474d783 in route_node_get lib/table.c:283
> #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
> #4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
> #5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
> #6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
> #7 0x7fa32476689c in event_call lib/event.c:1996
> #8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
> #9 0x55b0e4e6c32a in main zebra/main.c:526
> #10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
> #11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
> #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
> #1 0x7fa324668d8f in qfree lib/memory.c:130
> #2 0x7fa32474c421 in route_table_free lib/table.c:126
> #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
> #4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
> #5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
> #6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
> #7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
> #8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
> #9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
> #10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
> #11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
> #12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
> #13 0x7fa32476689c in event_call lib/event.c:1996
> #14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
> #15 0x55b0e4e6c32a in main zebra/main.c:526
> #16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
> #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7fa324668c4d in qcalloc lib/memory.c:105
> #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
> #3 0x7fa32474e73c in route_table_init lib/table.c:512
> #4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
> #5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
> #6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
> #7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
> #8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
> #9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
> #10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
> #11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
> #12 0x7fa32476689c in event_call lib/event.c:1996
> #13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
> #14 0x55b0e4e6c32a in main zebra/main.c:526
> #15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
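The ownership rule described in this commit message (reassign the table to the default VRF on disable, hand it back on enable, never free it while the kernel table still exists) can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual patch: `struct rt_table_sketch`, `table_get()`, `table_lookup()`, `vrf_enable_sketch()`, `vrf_disable_sketch()` and the fixed-size array are all hypothetical stand-ins for zebra's real table tracking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VRF_DEFAULT_SKETCH 0u /* hypothetical id of the default VRF */
#define MAX_TABLES_SKETCH 8

/* Hypothetical record for one kernel RT table tracked by the daemon. */
struct rt_table_sketch {
	int in_use;
	uint32_t table_id;
	uint32_t owner_vrf;
	int route_count;
};

struct rt_table_sketch tables_sketch[MAX_TABLES_SKETCH];

/* Find a tracked table by kernel table id, whoever owns it. */
struct rt_table_sketch *table_lookup(uint32_t table_id)
{
	size_t i;

	for (i = 0; i < MAX_TABLES_SKETCH; i++)
		if (tables_sketch[i].in_use &&
		    tables_sketch[i].table_id == table_id)
			return &tables_sketch[i];
	return NULL;
}

/* Return the existing table, or start tracking it under vrf_id. */
struct rt_table_sketch *table_get(uint32_t table_id, uint32_t vrf_id)
{
	struct rt_table_sketch *t = table_lookup(table_id);
	size_t i;

	if (t != NULL)
		return t;
	for (i = 0; i < MAX_TABLES_SKETCH; i++) {
		if (!tables_sketch[i].in_use) {
			t = &tables_sketch[i];
			t->in_use = 1;
			t->table_id = table_id;
			t->owner_vrf = vrf_id;
			t->route_count = 0;
			return t;
		}
	}
	return NULL; /* no free slot in this toy registry */
}

/* VRF enable: take ownership of the table, even if it already existed. */
void vrf_enable_sketch(uint32_t vrf_id, uint32_t table_id)
{
	struct rt_table_sketch *t = table_get(table_id, vrf_id);

	if (t != NULL)
		t->owner_vrf = vrf_id;
}

/* VRF disable: keep the table and its routes, hand it to the default VRF. */
void vrf_disable_sketch(uint32_t vrf_id, uint32_t table_id)
{
	struct rt_table_sketch *t = table_lookup(table_id);

	if (t != NULL && t->owner_vrf == vrf_id)
		t->owner_vrf = VRF_DEFAULT_SKETCH;
}
```

Replaying the reproduction steps against this model, table 100 keeps the same storage and its blackhole route across `ip l add red ...` and `ip l del red`; only the owner field changes, so a later route deletion never touches freed memory.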
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
> #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
> #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
> #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
> #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
> #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
> #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
> #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
> #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
> #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
> #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
> #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
> #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
> #12 0x7f26f275bae4 in route_table_free lib/table.c:111
> #13 0x7f26f275b749 in route_table_finish lib/table.c:46
> #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
> #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
> #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
> #17 0x7f26f2777108 in event_call lib/event.c:2011
> #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
> #19 0x55910c4f49cb in main zebra/main.c:531
> #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

It happens with FRR using the kernel. During shutdown, zebra attempts to obtain the namespace identifier in order to prepare zebra dataplane nexthop messages. Fix this by accessing the ns structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
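The commit message does not show the patch itself, so the following is only one possible reading of "accessing the ns structure": during shutdown, take the namespace identifier from an object that is still alive rather than from per-daemon data that has already been freed. Every name here (`struct ns_sketch`, `struct zns_sketch`, `struct ctx_sketch`, `zns_attach()`, `zns_shutdown()`, `ctx_ns_init()`) is hypothetical and does not correspond to FRR's real definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical long-lived namespace object. */
struct ns_sketch {
	uint32_t ns_id;
	void *info; /* per-daemon data, released early during shutdown */
};

/* Hypothetical per-daemon namespace data that duplicates the id. */
struct zns_sketch {
	uint32_t ns_id;
};

/* Hypothetical dataplane context that needs the namespace id. */
struct ctx_sketch {
	uint32_t ns_id;
};

/* Attach per-daemon data to the namespace. Returns 0 on success. */
int zns_attach(struct ns_sketch *ns)
{
	struct zns_sketch *zns = calloc(1, sizeof(*zns));

	if (zns == NULL)
		return -1;
	zns->ns_id = ns->ns_id;
	ns->info = zns;
	return 0;
}

/* Shutdown step: the per-daemon data goes away, the namespace stays. */
void zns_shutdown(struct ns_sketch *ns)
{
	free(ns->info);
	ns->info = NULL; /* no dangling pointer left behind */
}

/*
 * Fill the context from the namespace object itself, so late cleanup
 * messages never dereference the already-freed per-daemon data.
 */
void ctx_ns_init(struct ctx_sketch *ctx, const struct ns_sketch *ns)
{
	ctx->ns_id = ns->ns_id;
}
```

The design point is about lifetimes rather than about these particular structures: whatever table cleanup still runs at the end of shutdown should depend only on objects that are guaranteed to outlive it.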
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
> #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
> #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
> #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
> #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
> #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
> #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
> #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
> #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
> #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
> #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
> #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
> #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
> #12 0x7f26f275bae4 in route_table_free lib/table.c:111
> #13 0x7f26f275b749 in route_table_finish lib/table.c:46
> #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
> #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
> #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
> #17 0x7f26f2777108 in event_call lib/event.c:2011
> #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
> #19 0x55910c4f49cb in main zebra/main.c:531
> #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

It happens with FRR using the kernel. During shutdown, zebra attempts to obtain the namespace identifier in order to prepare zebra dataplane nexthop messages. Fix this by accessing the ns structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 7ae70eb)
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
> #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
> #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
> #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
> #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
> #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
> #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
> #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
> #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
> #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
> #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
> #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
> #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
> #12 0x7f26f275bae4 in route_table_free lib/table.c:111
> #13 0x7f26f275b749 in route_table_finish lib/table.c:46
> #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
> #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
> #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
> #17 0x7f26f2777108 in event_call lib/event.c:2011
> #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
> #19 0x55910c4f49cb in main zebra/main.c:531
> #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

It happens with FRR using the kernel. During shutdown, zebra attempts to obtain the namespace identifier in order to prepare zebra dataplane nexthop messages. Fix this by accessing the ns structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 7ae70eb)

# Conflicts:
# zebra/main.c
# zebra/zebra_ns.h
Fix a heap-after-free that causes zebra to crash even without address-sanitizer. To reproduce: > echo "100 my_table" | tee -a /etc/iproute2/rt_tables > ip route add blackhole default table 100 > ip route show table 100 > ip l add red type vrf table 100 > ip l del red > ip route del blackhole default table 100 Zebra manages routing tables for all existing Linux RT tables, regardless of whether they are assigned to a VRF interface. When a table is not assigned to any VRF, zebra arbitrarily assigns it to the default VRF, even though this is not strictly accurate (the code expects this behavior). When an RT table is created after a VRF, zebra correctly assigns the table to the VRF. However, if a VRF interface is assigned to an existing RT table, zebra does not update the table owner, which remains as the default VRF. As a result, existing routing entries remain under the default VRF, while new entries are correctly assigned to the VRF. The VRF mismatch is unexpected in the code and creates crashes and memory related issues. Furthermore, Linux does not automatically delete RT tables when they are unassigned from a VRF. It is incorrect to delete these tables from zebra. Instead, at VRF disabling, do not release the table but reassign it to the default VRF. At VRF enabling, change the table owner back to the appropriate VRF. 
> ==2866266==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000154f54 at pc 0x7fa32474b83f bp 0x7ffe94f67d90 sp 0x7ffe94f67d88
> READ of size 1 at 0x606000154f54 thread T0
> #0 0x7fa32474b83e in rn_hash_node_const_find lib/table.c:28
> #1 0x7fa32474bab1 in rn_hash_node_find lib/table.c:28
> #2 0x7fa32474d783 in route_node_get lib/table.c:283
> #3 0x7fa3247328dd in srcdest_rnode_get lib/srcdest_table.c:231
> #4 0x55b0e4fa8da4 in rib_find_rn_from_ctx zebra/zebra_rib.c:1957
> #5 0x55b0e4fa8e31 in rib_process_result zebra/zebra_rib.c:1988
> #6 0x55b0e4fb9d64 in rib_process_dplane_results zebra/zebra_rib.c:4894
> #7 0x7fa32476689c in event_call lib/event.c:1996
> #8 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
> #9 0x55b0e4e6c32a in main zebra/main.c:526
> #10 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
> #11 0x55b0e4e2d649 in _start (/usr/lib/frr/zebra+0x1a1649)
>
> 0x606000154f54 is located 20 bytes inside of 56-byte region [0x606000154f40,0x606000154f78)
> freed by thread T0 here:
> #0 0x7fa324ca9b6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
> #1 0x7fa324668d8f in qfree lib/memory.c:130
> #2 0x7fa32474c421 in route_table_free lib/table.c:126
> #3 0x7fa32474bf96 in route_table_finish lib/table.c:46
> #4 0x55b0e4fbca3a in zebra_router_free_table zebra/zebra_router.c:191
> #5 0x55b0e4fbccea in zebra_router_release_table zebra/zebra_router.c:214
> #6 0x55b0e4fd428e in zebra_vrf_disable zebra/zebra_vrf.c:219
> #7 0x7fa32476fabf in vrf_disable lib/vrf.c:326
> #8 0x7fa32476f5d4 in vrf_delete lib/vrf.c:231
> #9 0x55b0e4e4ad36 in interface_vrf_change zebra/interface.c:1478
> #10 0x55b0e4e4d5d2 in zebra_if_dplane_ifp_handling zebra/interface.c:1949
> #11 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
> #12 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
> #13 0x7fa32476689c in event_call lib/event.c:1996
> #14 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
> #15 0x55b0e4e6c32a in main zebra/main.c:526
> #16 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308
>
> previously allocated by thread T0 here:
> #0 0x7fa324caa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7fa324668c4d in qcalloc lib/memory.c:105
> #2 0x7fa32474bf33 in route_table_init_with_delegate lib/table.c:38
> #3 0x7fa32474e73c in route_table_init lib/table.c:512
> #4 0x55b0e4fbc353 in zebra_router_get_table zebra/zebra_router.c:137
> #5 0x55b0e4fd4da0 in zebra_vrf_table_create zebra/zebra_vrf.c:358
> #6 0x55b0e4fd3d30 in zebra_vrf_enable zebra/zebra_vrf.c:140
> #7 0x7fa32476f9b2 in vrf_enable lib/vrf.c:286
> #8 0x55b0e4e4af76 in interface_vrf_change zebra/interface.c:1533
> #9 0x55b0e4e4d612 in zebra_if_dplane_ifp_handling zebra/interface.c:1968
> #10 0x55b0e4e4fb89 in zebra_if_dplane_result zebra/interface.c:2268
> #11 0x55b0e4fb9f26 in rib_process_dplane_results zebra/zebra_rib.c:4954
> #12 0x7fa32476689c in event_call lib/event.c:1996
> #13 0x7fa32463b7b2 in frr_run lib/libfrr.c:1232
> #14 0x55b0e4e6c32a in main zebra/main.c:526
> #15 0x7fa32424fd09 in __libc_start_main ../csu/libc-start.c:308

Fixes: d8612e6 ("zebra: Track tables allocated by vrf and cleanup")
Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
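The ownership rule described in this commit message (reassign the table instead of freeing it) can be illustrated with a small, self-contained sketch. This is not zebra's code: `struct demo_table`, `demo_vrf_enable()`, `demo_vrf_disable()` and `DEMO_VRF_DEFAULT` are hypothetical names chosen for illustration, and the real change operates on zebra's own table and VRF structures.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_VRF_DEFAULT 0u /* hypothetical id for the default VRF */

/* Hypothetical, simplified stand-in for a zebra routing table. */
struct demo_table {
	uint32_t table_id;  /* Linux RT table id, e.g. 100 */
	uint32_t owner_vrf; /* VRF that currently owns the table */
	int route_count;    /* number of routes stored in the table */
};

/* A VRF is enabled on table_id: hand the existing table over to it. */
void demo_vrf_enable(struct demo_table *tables, size_t n,
		     uint32_t table_id, uint32_t vrf_id)
{
	for (size_t i = 0; i < n; i++)
		if (tables[i].table_id == table_id)
			tables[i].owner_vrf = vrf_id;
}

/* A VRF is disabled: keep its tables and their routes, but give
 * ownership back to the default VRF instead of freeing anything. */
void demo_vrf_disable(struct demo_table *tables, size_t n, uint32_t vrf_id)
{
	for (size_t i = 0; i < n; i++)
		if (tables[i].owner_vrf == vrf_id)
			tables[i].owner_vrf = DEMO_VRF_DEFAULT;
}
```

In this sketch the table object stays allocated across enable and disable, so a route that was present before the VRF existed is never left pointing at freed memory.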
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
> #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
> #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
> #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
> #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
> #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
> #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
> #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
> #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
> #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
> #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
> #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
> #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
> #12 0x7f26f275bae4 in route_table_free lib/table.c:111
> #13 0x7f26f275b749 in route_table_finish lib/table.c:46
> #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
> #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
> #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
> #17 0x7f26f2777108 in event_call lib/event.c:2011
> #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
> #19 0x55910c4f49cb in main zebra/main.c:531
> #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

This happens when FRR uses the kernel. During shutdown, zebra tries to obtain the namespace identifier while preparing zebra dataplane nexthop messages. Fix this by accessing the ns structure.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com> (cherry picked from commit 7ae70eb)
The nexthop group entry returned when looking up pic contexts is not checked. A pic context can in fact resolve over itself, which may lead to a stack overflow. The trace below can be reproduced by generalizing the search of the pic nhe to all nexthops, and not only to srv6 contexts:

> root@ubuntu2204hwe:~/frr# AddressSanitizer:DEADLYSIGNAL
> =================================================================
> ==247856==ERROR: AddressSanitizer: stack-overflow on address 0x7ffe4e6dcff8 (pc 0x561e05bb5653 bp 0x7ffe4e6dd020 sp 0x7ffe4e6dd000 T0)
> #0 0x561e05bb5653 in zebra_nhg_install_kernel zebra/zebra_nhg.c:3310
> #1 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #2 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #3 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #4 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #5 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #6 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #7 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #8 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #9 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #10 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #11 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #12 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #13 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329
> #14 0x561e05bb572d in zebra_nhg_install_kernel zebra/zebra_nhg.c:3329

Fix this by not returning a nexthop group entry when creation is necessary for the pic context. Also add a check for the case where pic creation is not needed and the returned nhe has the same identifier as the requested nhe.
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
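The identifier check described in this commit message can be sketched as follows. This is only an illustration under stated assumptions: `struct demo_nhe`, `demo_pic_candidate()` and `demo_install_depth()` are hypothetical names, not zebra's API, and the real nexthop group entries carry far more state.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified nexthop group entry. */
struct demo_nhe {
	uint32_t id;              /* nexthop group identifier */
	struct demo_nhe *pic_nhe; /* optional PIC context entry */
};

/* Accept a looked-up PIC entry only when it is not the requesting
 * entry itself; an entry with the same identifier is rejected. */
struct demo_nhe *demo_pic_candidate(const struct demo_nhe *requested,
				    struct demo_nhe *found)
{
	if (requested == NULL || found == NULL)
		return NULL;
	if (found->id == requested->id)
		return NULL; /* would resolve over itself */
	return found;
}

/* Count the entries visited when following pic_nhe links. This only
 * terminates because demo_pic_candidate() never links an entry to
 * itself; a self-link here would loop forever. */
int demo_install_depth(const struct demo_nhe *nhe)
{
	int depth = 0;

	while (nhe != NULL) {
		depth++;
		nhe = nhe->pic_nhe;
	}
	return depth;
}
```

The guard only covers the direct self-reference case named in the commit message; longer cycles are outside the scope of this sketch.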
When a failover happens on ECMP paths that use the same recursively resolved nexthop, zebra replaces the old NHG with a new one and updates the pointer of all routes using that nexthop. However, if only the recursive nexthop changed, there is no need to replace the old NHG.

Modify the zebra_nhg_proto_add() function to update the recursive nexthop on the original NHG. This replaces the old method, which consisted of allocating a new nhe.

This change triggers an ASAN error in the bgp_nhg_zapi_scalability test, function test_bgp_ipv4_simulate_r5_machine_going_down().

> r1: zebra triggered an exception by AddressSanitizer
> AddressSanitizer error in topotest `test_bgp_nhg_zapi_scalability.py`, test `teardown_module`, router `r1`
>
> ERROR: AddressSanitizer: heap-use-after-free on address 0x60e00230afa0 at pc 0x55bfebc9681e bp 0x7ffd657ceb40 sp 0x7ffd657ceb30
> READ of size 4 at 0x60e00230afa0 thread T0
> #0 0x55bfebc9681d in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1855
> #1 0x55bfebc967f7 in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1868
> #2 0x55bfebcb32f6 in route_entry_update_nhe zebra/zebra_rib.c:460
> #3 0x55bfebcb352f in rib_handle_nhg_replace zebra/zebra_rib.c:486
> #4 0x55bfebc99c14 in zebra_nhg_proto_add zebra/zebra_nhg.c:3836
> #5 0x55bfebcc4035 in process_subq_nhg zebra/zebra_rib.c:2763
> #6 0x55bfebcc4035 in process_subq zebra/zebra_rib.c:3369
> #7 0x55bfebcc4035 in meta_queue_process zebra/zebra_rib.c:3422
> #8 0x7f458a518bff in work_queue_run lib/workqueue.c:282
> #9 0x7f458a4fa24b in event_call lib/event.c:2019
> #10 0x7f458a41f717 in frr_run lib/libfrr.c:1238
> #11 0x55bfebb82cb4 in main zebra/main.c:528
> #12 0x7f4589e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #13 0x7f4589e29e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #14 0x55bfebb85c34 in _start (/usr/lib/frr/zebra+0x1abc34)
>
> 0x60e00230afa0 is located 96 bytes inside of 160-byte region [0x60e00230af40,0x60e00230afe0)
> freed by thread T0 here:
> #0 0x7f458a8b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> #1 0x55bfebc967f7 in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1868
> #2 0x55bfebcb32f6 in route_entry_update_nhe zebra/zebra_rib.c:460
> #3 0x55bfebcb352f in rib_handle_nhg_replace zebra/zebra_rib.c:486
> #4 0x55bfebc99c14 in zebra_nhg_proto_add zebra/zebra_nhg.c:3836
> #5 0x55bfebcc4035 in process_subq_nhg zebra/zebra_rib.c:2763
> #6 0x55bfebcc4035 in process_subq zebra/zebra_rib.c:3369
> #7 0x55bfebcc4035 in meta_queue_process zebra/zebra_rib.c:3422
> #8 0x7f458a518bff in work_queue_run lib/workqueue.c:282
> #9 0x7f458a4fa24b in event_call lib/event.c:2019
> #10 0x7f458a41f717 in frr_run lib/libfrr.c:1238
> #11 0x55bfebb82cb4 in main zebra/main.c:528
> #12 0x7f4589e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
> #0 0x7f458a8b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
> #1 0x7f458a43fb7e in qcalloc lib/memory.c:106
> #2 0x55bfebc91f2e in zebra_nhg_alloc zebra/zebra_nhg.c:392
> #3 0x55bfebc91f2e in zebra_nhe_copy zebra/zebra_nhg.c:499
> #4 0x55bfebc922af in zebra_nhg_hash_alloc zebra/zebra_nhg.c:538
> #5 0x7f458a3fd0bd in hash_get lib/hash.c:147
> #6 0x55bfebc94d7a in zebra_nhe_find zebra/zebra_nhg.c:831
> #7 0x55bfebc953ef in zebra_nhg_find zebra/zebra_nhg.c:1013
> #8 0x55bfebc9585d in zebra_nhg_find_nexthop zebra/zebra_nhg.c:1030
> #9 0x55bfebc94078 in depends_find_recursive zebra/zebra_nhg.c:1511
> #10 0x55bfebc94078 in depends_find zebra/zebra_nhg.c:1560
> #11 0x55bfebc94078 in depends_find_add zebra/zebra_nhg.c:1599
> #12 0x55bfebc99e40 in zebra_nhg_update_nhe zebra/zebra_nhg.c:3732
> #13 0x55bfebc99e40 in zebra_nhg_proto_add zebra/zebra_nhg.c:3819
> #14 0x55bfebcc4035 in process_subq_nhg zebra/zebra_rib.c:2763
> #15 0x55bfebcc4035 in process_subq zebra/zebra_rib.c:3369
> #16 0x55bfebcc4035 in meta_queue_process zebra/zebra_rib.c:3422
> #17 0x7f458a518bff in work_queue_run lib/workqueue.c:282
> #18 0x7f458a4fa24b in event_call lib/event.c:2019
> #19 0x7f458a41f717 in frr_run lib/libfrr.c:1238
> #20 0x55bfebb82cb4 in main zebra/main.c:528
> #21 0x7f4589e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> SUMMARY: AddressSanitizer: heap-use-after-free zebra/zebra_nhg.c:1855 in zebra_nhg_decrement_ref
> Shadow bytes around the buggy address:
> 0x0c1c804595a0: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
> 0x0c1c804595b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c1c804595c0: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
> 0x0c1c804595d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
> 0x0c1c804595e0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
> =>0x0c1c804595f0: fd fd fd fd[fd]fd fd fd fd fd fd fd fa fa fa fa
> 0x0c1c80459600: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c1c80459610: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
> 0x0c1c80459620: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
> 0x0c1c80459630: fd fd fd fa fa fa fa fa fa fa fa fa 00 00 00 00
> 0x0c1c80459640: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
> Addressable: 00
> Partially addressable: 01 02 03 04 05 06 07
> Heap left redzone: fa
> Freed heap region: fd
> Stack left redzone: f1
> Stack mid redzone: f2
> Stack right redzone: f3
> Stack after return: f5
> Stack use after scope: f8
> Global redzone: f9
> Global init order: f6
> Poisoned by user: f7
> Container overflow: fc
> Array cookie: ac
> Intra object redzone: bb
> ASan internal: fe
> Left alloca redzone: ca
> Right alloca redzone: cb
> Shadow gap: cc
>

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
The following ASAN error can be seen.

> ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x608000036c20
> #0 0x7f3d7a4b5425 in __interceptor_malloc_usable_size ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:198
> #1 0x7f3d7a426a16 in __sanitizer::BufferedStackTrace::Unwind(unsigned long, unsigned long, void*, bool, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_stacktrace.h:122
> #2 0x7f3d7a426a16 in __asan::asan_malloc_usable_size(void const*, unsigned long, unsigned long) ../../../../src/libsanitizer/asan/asan_allocator.cpp:1074
> #3 0x7f3d7a03f330 in mt_count_free lib/memory.c:78
> #4 0x7f3d7a03f330 in qfree lib/memory.c:130
> #5 0x7f3d76ccf89b in bmp_peer_status_changed bgpd/bgp_bmp.c:982
> #6 0x560ae2aa6a94 in hook_call_peer_status_changed bgpd/bgp_fsm.c:47
> #7 0x560ae2aa6a94 in bgp_fsm_change_status bgpd/bgp_fsm.c:1287
> #8 0x560ae2c4f2e5 in peer_delete bgpd/bgpd.c:2777
> #9 0x560ae2c58d24 in bgp_delete bgpd/bgpd.c:4140
> #10 0x560ae2bbb47e in no_router_bgp bgpd/bgp_vty.c:1764
> #11 0x7f3d79fb74ed in cmd_execute_command_real lib/command.c:1003
> #12 0x7f3d79fb78a3 in cmd_execute_command lib/command.c:1062
> #13 0x7f3d79fb7e03 in cmd_execute lib/command.c:1228
> #14 0x7f3d7a107b53 in vty_command lib/vty.c:625
> #15 0x7f3d7a109902 in vty_execute lib/vty.c:1388
> #16 0x7f3d7a10cc32 in vtysh_read lib/vty.c:2400
> #17 0x7f3d7a0f848b in event_call lib/event.c:2019
> #18 0x7f3d7a01e627 in frr_run lib/libfrr.c:1232
> #19 0x560ae29e0037 in main bgpd/bgp_main.c:555
> #20 0x7f3d79a29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
> #21 0x7f3d79a29e3f in __libc_start_main_impl ../csu/libc-start.c:392
> #22 0x560ae29e4ef4 in _start (/usr/lib/frr/bgpd+0x2eeef4)
>
> 0x608000036c20 is located 0 bytes inside of 81-byte region [0x608000036c20,0x608000036c71)
> freed by thread T0 here:
> #0 0x7f3d7a4b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
> #1 0x7f3d76ccf85f in bmp_peer_status_changed bgpd/bgp_bmp.c:981
> #2 0x560ae2aa6a94 in hook_call_peer_status_changed bgpd/bgp_fsm.c:47
> #3 0x560ae2aa6a94 in bgp_fsm_change_status bgpd/bgp_fsm.c:1287
> #4 0x560ae2c4f2e5 in peer_delete bgpd/bgpd.c:2777
> #5 0x560ae2c58d24 in bgp_delete bgpd/bgpd.c:4140
> #6 0x560ae2bbb47e in no_router_bgp bgpd/bgp_vty.c:1764
> #7 0x7f3d79fb74ed in cmd_execute_command_real lib/command.c:1003
> #8 0x7f3d79fb78a3 in cmd_execute_command lib/command.c:1062
> #9 0x7f3d79fb7e03 in cmd_execute lib/command.c:1228
> #10 0x7f3d7a107b53 in vty_command lib/vty.c:625
> #11 0x7f3d7a109902 in vty_execute lib/vty.c:1388
> #12 0x7f3d7a10cc32 in vtysh_read lib/vty.c:2400
> #13 0x7f3d7a0f848b in event_call lib/event.c:2019
> #14 0x7f3d7a01e627 in frr_run lib/libfrr.c:1232
> #15 0x560ae29e0037 in main bgpd/bgp_main.c:555
> #16 0x7f3d79a29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
> #0 0x7f3d7a4b4887 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
> #1 0x7f3d7a03f0e9 in qmalloc lib/memory.c:101
> #2 0x7f3d76cd0166 in bmp_bgp_peer_vrf bgpd/bgp_bmp.c:2194
> #3 0x7f3d76cd0166 in bmp_bgp_update_vrf_status bgpd/bgp_bmp.c:2236
> #4 0x7f3d76cd29b8 in bmp_vrf_state_changed bgpd/bgp_bmp.c:3479
> #5 0x560ae2c45b34 in hook_call_bgp_instance_state bgpd/bgpd.c:88
> #6 0x560ae2c4d158 in bgp_instance_up bgpd/bgpd.c:3936
> #7 0x560ae29e5ed1 in bgp_vrf_enable bgpd/bgp_main.c:299
> #8 0x7f3d7a0ff8b1 in vrf_enable lib/vrf.c:286
> #9 0x7f3d7a0ff8b1 in vrf_enable lib/vrf.c:275
> #10 0x7f3d7a12ab66 in zclient_vrf_add lib/zclient.c:2561
> #11 0x7f3d7a12eb43 in zclient_read lib/zclient.c:4624
> #12 0x7f3d7a0f848b in event_call lib/event.c:2019
> #13 0x7f3d7a01e627 in frr_run lib/libfrr.c:1232
> #14 0x560ae29e0037 in main bgpd/bgp_main.c:555
> #15 0x7f3d79a29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
and match on full protocol name in proto_redistnum()
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Fixes #9
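
To illustrate what matching on the full protocol name means, here is a minimal sketch that compares the whole argument with `strcmp()` rather than accepting a prefix. It is not the actual `proto_redistnum()` from FRR: the `demo_` names, the table contents and the numeric values are all assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical protocol numbers; FRR's real route type values differ. */
enum demo_proto {
	DEMO_PROTO_KERNEL,
	DEMO_PROTO_CONNECTED,
	DEMO_PROTO_STATIC,
	DEMO_PROTO_RIP,
	DEMO_PROTO_RIPNG,
	DEMO_PROTO_OSPF,
	DEMO_PROTO_BGP,
};

struct demo_proto_name {
	const char *name;
	int num;
};

static const struct demo_proto_name demo_names[] = {
	{ "kernel", DEMO_PROTO_KERNEL },
	{ "connected", DEMO_PROTO_CONNECTED },
	{ "static", DEMO_PROTO_STATIC },
	{ "rip", DEMO_PROTO_RIP },
	{ "ripng", DEMO_PROTO_RIPNG },
	{ "ospf", DEMO_PROTO_OSPF },
	{ "bgp", DEMO_PROTO_BGP },
};

/* Return the protocol number whose name equals s exactly, or -1.
 * A partial name such as "ri" or "o" is rejected. */
int demo_proto_redistnum(const char *s)
{
	if (s == NULL)
		return -1;

	for (size_t i = 0; i < sizeof(demo_names) / sizeof(demo_names[0]); i++)
		if (strcmp(s, demo_names[i].name) == 0)
			return demo_names[i].num;

	return -1;
}
```

With exact comparison, "rip" and "ripng" cannot be confused with each other, and the caller is expected to pass the fully expanded token text instead of whatever abbreviation the user typed.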