Pressing TAB crashes vtysh #17

Closed
NetDEF-CI opened this issue Dec 19, 2016 · 0 comments
Issue by rwestphal
Thursday Dec 15, 2016 at 11:35 GMT
Originally opened as https://github.com/opensourcerouting/cumulus-private_quagga/issues/18


This only happens when we are in configure mode and start the command with "do ...". All daemons are affected as well (when we telnet to them directly).

Bug can be reproduced on the 'master' branch.

Backtrace below:
ubuntu# conf t
ubuntu(config)# do sh mpl2016/12/15 09:27:35 unknown: Assertion `last_token' failed in file command_match.c, line 373, function command_complete
2016/12/15 09:27:35 unknown: Backtrace for 17 stack frames:
2016/12/15 09:27:35 unknown: [bt 0] /usr/local/lib/libzebra.so.0(zlog_backtrace+0x34) [0x7fa9cc848d8d]
2016/12/15 09:27:35 unknown: [bt 1] /usr/local/lib/libzebra.so.0(_zlog_assert_failed+0xe1) [0x7fa9cc849576]
2016/12/15 09:27:35 unknown: [bt 2] /usr/local/lib/libzebra.so.0(command_complete+0x1f1) [0x7fa9cc8230bf]
2016/12/15 09:27:35 unknown: [bt 3] /usr/local/lib/libzebra.so.0(+0x2a816) [0x7fa9cc825816]
2016/12/15 09:27:35 unknown: [bt 4] /usr/local/lib/libzebra.so.0(cmd_complete_command+0xed) [0x7fa9cc825a86]
2016/12/15 09:27:35 unknown: [bt 5] vtysh() [0x405f0d]
2016/12/15 09:27:35 unknown: [bt 6] /lib/x86_64-linux-gnu/libreadline.so.6(rl_completion_matches+0x96) [0x7fa9cc5d1436]
2016/12/15 09:27:35 unknown: [bt 7] vtysh() [0x405f99]
2016/12/15 09:27:35 unknown: [bt 8] /lib/x86_64-linux-gnu/libreadline.so.6(+0x1c527) [0x7fa9cc5d1527]
2016/12/15 09:27:35 unknown: [bt 9] /lib/x86_64-linux-gnu/libreadline.so.6(rl_complete_internal+0x132) [0x7fa9cc5d1702]
2016/12/15 09:27:35 unknown: [bt 10] /lib/x86_64-linux-gnu/libreadline.so.6(_rl_dispatch_subseq+0x260) [0x7fa9cc5c8990]
2016/12/15 09:27:35 unknown: [bt 11] /lib/x86_64-linux-gnu/libreadline.so.6(readline_internal_char+0x92) [0x7fa9cc5c8e12]
2016/12/15 09:27:35 unknown: [bt 12] /lib/x86_64-linux-gnu/libreadline.so.6(readline+0x55) [0x7fa9cc5c9545]
2016/12/15 09:27:35 unknown: [bt 13] vtysh() [0x403abe]
2016/12/15 09:27:35 unknown: [bt 14] vtysh(main+0x720) [0x4043f3]
2016/12/15 09:27:35 unknown: [bt 15] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7fa9cc20c830]
2016/12/15 09:27:35 unknown: [bt 16] vtysh(_start+0x29) [0x403819]
2016/12/15 09:27:35 unknown: Current thread not known/applicable
log: showing active allocations in memory group libzebra
log: memstats: Vector : 46157 * 16
log: memstats: Vector index : 46157 * (variably sized)
log: memstats: Link List : 4 * 40
log: memstats: Link Node : 17 * 24
log: memstats: VTY : 3 * (variably sized)
log: memstats: Graph : 38 * 8
log: memstats: Graph Node : 23038 * 32
log: memstats: Command Tokens : 5 * (variably sized)
log: memstats: String vector : 3 * (variably sized)
log: memstats: Command desc : 8811 * (variably sized)
log: memstats: Buffer : 1 * 24
log: memstats: Hash : 39 * 40
log: memstats: Hash Bucket : 2972 * 24
log: memstats: Hash Index : 39 * (variably sized)
log: memstats: Temporary memory : 16861 * (variably sized)
log: showing active allocations in memory group vtysh
Aborted (core dumped)

@NetDEF-CI NetDEF-CI added the bug label Dec 19, 2016
cfra referenced this issue in opensourcerouting/frr Nov 29, 2018
@louberger louberger mentioned this issue May 1, 2019
ton31337 pushed a commit that referenced this issue Oct 17, 2020
When zebra is running with debugs turned on there
is a use after free reported by the address sanitizer:

2020/10/16 12:58:02 ZEBRA: rib_delnode: (0:254):4.5.6.16/32: rn 0x60b000026f20, re 0x6080000131a0, removing
2020/10/16 12:58:02 ZEBRA: rib_meta_queue_add: (0:254):4.5.6.16/32: queued rn 0x60b000026f20 into sub-queue 3
=================================================================
==3101430==ERROR: AddressSanitizer: heap-use-after-free on address 0x608000011d28 at pc 0x555555705ab6 bp 0x7fffffffdab0 sp 0x7fffffffdaa8
READ of size 8 at 0x608000011d28 thread T0
    #0 0x555555705ab5 in re_list_const_first zebra/rib.h:222
    #1 0x555555705b54 in re_list_first zebra/rib.h:222
    #2 0x555555711a4f in process_subq_route zebra/zebra_rib.c:2248
    #3 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
    #4 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
    #5 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
    #6 0x7ffff7450e9c in thread_call lib/thread.c:1581
    #7 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
    #8 0x55555561a578 in main zebra/main.c:455
    #9 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
    #10 0x5555555e3429 in _start (/usr/lib/frr/zebra+0x8f429)
0x608000011d28 is located 8 bytes inside of 88-byte region [0x608000011d20,0x608000011d78)
freed by thread T0 here:
    #0 0x7ffff768bb6f in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.6+0xa9b6f)
    #1 0x7ffff739ccad in qfree lib/memory.c:129
    #2 0x555555709ee4 in rib_gc_dest zebra/zebra_rib.c:746
    #3 0x55555570ca76 in rib_process zebra/zebra_rib.c:1240
    #4 0x555555711a05 in process_subq_route zebra/zebra_rib.c:2245
    #5 0x555555711d2e in process_subq zebra/zebra_rib.c:2286
    #6 0x555555711ec7 in meta_queue_process zebra/zebra_rib.c:2320
    #7 0x7ffff74701f7 in work_queue_run lib/workqueue.c:291
    #8 0x7ffff7450e9c in thread_call lib/thread.c:1581
    #9 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
    #10 0x55555561a578 in main zebra/main.c:455
    #11 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
    #0 0x7ffff768c037 in calloc (/lib/x86_64-linux-gnu/libasan.so.6+0xaa037)
    #1 0x7ffff739cb98 in qcalloc lib/memory.c:110
    #2 0x555555712ace in zebra_rib_create_dest zebra/zebra_rib.c:2515
    #3 0x555555712c6c in rib_link zebra/zebra_rib.c:2576
    #4 0x555555712faa in rib_addnode zebra/zebra_rib.c:2607
    #5 0x555555715bf0 in rib_add_multipath_nhe zebra/zebra_rib.c:3012
    #6 0x555555715f56 in rib_add_multipath zebra/zebra_rib.c:3049
    #7 0x55555571788b in rib_add zebra/zebra_rib.c:3327
    #8 0x5555555e584a in connected_up zebra/connected.c:254
    #9 0x5555555e42ff in connected_announce zebra/connected.c:94
    #10 0x5555555e4fd3 in connected_update zebra/connected.c:195
    #11 0x5555555e61ad in connected_add_ipv4 zebra/connected.c:340
    #12 0x5555555f26f5 in netlink_interface_addr zebra/if_netlink.c:1213
    #13 0x55555560f756 in netlink_information_fetch zebra/kernel_netlink.c:350
    #14 0x555555612e49 in netlink_parse_info zebra/kernel_netlink.c:941
    #15 0x55555560f9f1 in kernel_read zebra/kernel_netlink.c:402
    #16 0x7ffff7450e9c in thread_call lib/thread.c:1581
    #17 0x7ffff738eaf7 in frr_run lib/libfrr.c:1099
    #18 0x55555561a578 in main zebra/main.c:455
    #19 0x7ffff7079cc9 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: heap-use-after-free zebra/rib.h:222 in re_list_const_first

This happens because we use the dest pointer after a call into
rib_gc_dest.  In process_subq_route, we call rib_process(), and if the
dest is deleted, the dest pointer is now garbage.  We must reload the
dest pointer in this case.
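
The reload pattern described above can be sketched roughly as follows. This is only an illustration, not the actual zebra code: `struct dest`, `struct node`, `process_node()` and `process_and_count()` are hypothetical stand-ins, and it assumes the callee clears the node's pointer when it frees the dest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for zebra's route node and rib dest. */
struct dest {
	int route_count;
};

struct node {
	struct dest *info; /* owned by the node; NULL once collected */
};

/* Stand-in for rib_process(): may garbage-collect the dest. */
static void process_node(struct node *rn)
{
	struct dest *d = rn->info;

	if (d && d->route_count == 0) {
		free(d);
		rn->info = NULL; /* assumption: the callee clears the link */
	}
}

/* Returns the number of routes left, or -1 if the dest is gone. */
static int process_and_count(struct node *rn)
{
	struct dest *d = rn->info;

	if (!d)
		return -1;

	process_node(rn);

	/* Reload: the earlier value of d may point at freed memory. */
	d = rn->info;
	if (!d)
		return -1;

	return d->route_count;
}
```

The point is that the cached `d` is never dereferenced after `process_node()` returns; it is always re-read from the node first.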

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
ton31337 pushed a commit that referenced this issue Oct 18, 2020
chiragshah6 pushed a commit to chiragshah6/frr that referenced this issue Oct 27, 2020
louis-6wind added a commit to louis-6wind/frr that referenced this issue Dec 15, 2020
Temporary fix

Thread 2.1 "bgpd" received signal SIGSEGV, Segmentation fault.
0x00007ffff7b14180 in route_top (table=0x0) at lib/table.c:401
401		if (table->top == NULL)
(gdb) bt
#0  0x00007ffff7b14180 in route_top (table=0x0) at lib/table.c:401
#1  0x0000555555657286 in bgp_table_top (table=0x55555629c440) at ./bgpd/bgp_table.h:203
#2  0x0000555555666dd0 in bgp_soft_reconfig_table_flag (srta=0x55555bc68fd0, flag=false) at bgpd/bgp_route.c:4669
#3  0x0000555555666f5e in bgp_soft_reconfig_table_thread_cancel (nsrta=0x0, bgp=0x5555562767a0) at bgpd/bgp_route.c:4698
#4  0x00005555556e9463 in bgp_delete (bgp=0x5555562767a0) at bgpd/bgpd.c:3482
#5  0x00005555556f9ae5 in bgp_router_destroy (args=0x7fffffff6b90) at bgpd/bgp_nb_config.c:176
#6  0x00007ffff7ad985d in nb_callback_destroy (context=0x7fffffff7180, nb_node=0x555555c0c580, event=NB_EV_APPLY, dnode=0x5555563cdbf0, errmsg=0x7fffffff7190 "", errmsg_len=8192) at lib/northbound.c:970
#7  0x00007ffff7ada17a in nb_callback_configuration (context=0x7fffffff7180, event=NB_EV_APPLY, change=0x55555d5aa560, errmsg=0x7fffffff7190 "", errmsg_len=8192) at lib/northbound.c:1195
#8  0x00007ffff7ada564 in nb_transaction_process (event=NB_EV_APPLY, transaction=0x55556a6ed510, errmsg=0x7fffffff7190 "", errmsg_len=8192) at lib/northbound.c:1312
#9  0x00007ffff7ad900b in nb_candidate_commit_apply (transaction=0x55556a6ed510, save_transaction=true, transaction_id=0x0, errmsg=0x7fffffff7190 "", errmsg_len=8192) at lib/northbound.c:745
#10 0x00007ffff7ad912e in nb_candidate_commit (context=0x7fffffff7180, candidate=0x555555bddd00, save_transaction=true, comment=0x0, transaction_id=0x0, errmsg=0x7fffffff7190 "", errmsg_len=8192) at lib/northbound.c:777
#11 0x00007ffff7ae0249 in nb_cli_classic_commit (vty=0x555557b62790) at lib/northbound_cli.c:64
#12 0x00007ffff7ae0cce in nb_cli_apply_changes (vty=0x555557b62790, xpath_base_fmt=0x7fffffffb730 "/frr-routing:routing/control-plane-protocols/control-plane-protocol[type='frr-bgp:bgp'][name='bgp'][vrf='default']/frr-bgp:bgp") at lib/northbound_cli.c:281
#13 0x00005555556a01e6 in no_router_bgp (self=0x555555a28140 <no_router_bgp_cmd>, vty=0x555557b62790, argc=3, argv=0x555560be1bd0) at bgpd/bgp_vty.c:1466
#14 0x00007ffff7a90ebc in cmd_execute_command_real (vline=0x55556635c140, filter=FILTER_RELAXED, vty=0x555557b62790, cmd=0x0) at lib/command.c:938
#15 0x00007ffff7a91031 in cmd_execute_command (vline=0x55556635c140, vty=0x555557b62790, cmd=0x0, vtysh=0) at lib/command.c:997
#16 0x00007ffff7a91586 in cmd_execute (vty=0x555557b62790, cmd=0x555557b68f20 "no router bgp", matched=0x0, vtysh=0) at lib/command.c:1162
#17 0x00007ffff7b228f9 in vty_command (vty=0x555557b62790, buf=0x555557b68f20 "no router bgp") at lib/vty.c:517
#18 0x00007ffff7b2465b in vty_execute (vty=0x555557b62790) at lib/vty.c:1282
#19 0x00007ffff7b2656e in vtysh_read (thread=0x7fffffffe2e0) at lib/vty.c:2120
#20 0x00007ffff7b1bd23 in thread_call (thread=0x7fffffffe2e0) at lib/thread.c:1681
#21 0x00007ffff7ac7fc2 in frr_run (master=0x555555a6aab0) at lib/libfrr.c:1110
#22 0x00005555555d88b2 in main (argc=4, argv=0x7fffffffe518) at bgpd/bgp_main.c:523

Signed-off-by: Louis Scalbert <louis.scalbert@6wind.com>
sworleys pushed a commit to sworleys/frr that referenced this issue Mar 19, 2021
ranjanyash54 pushed a commit to ranjanyash54/frr that referenced this issue Aug 14, 2021
cmgd: Add a sanity check for the db-name in the get-data and get-confi…
gpnaveen pushed a commit to gpnaveen/frr that referenced this issue May 31, 2022
riw777 pushed a commit that referenced this issue Dec 6, 2022
When changing the peer's sockunion structure, the bgp->peer
list was not being updated properly.  Because the peer's su
is used as the key for a sorted insert, changing it requires
that the peer be pulled out of the bgp->peer list and then
put back in.

Additionally ensure that the hash is always released on peer
deletion.
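
A minimal sketch of the remove, update, reinsert sequence for a key change in a sorted list is shown here. It is not the bgpd code: `struct peer` below is a hypothetical cut-down type, an unsigned integer stands in for the sockunion, and all function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical peer; an integer stands in for the sockunion key. */
struct peer {
	unsigned int addr; /* sort key */
	struct peer *next;
};

/* Insert p so the list stays sorted by addr. */
static void peer_list_add_sorted(struct peer **head, struct peer *p)
{
	struct peer **pp = head;

	while (*pp && (*pp)->addr < p->addr)
		pp = &(*pp)->next;
	p->next = *pp;
	*pp = p;
}

/* Unlink p from the list if it is present. */
static void peer_list_del(struct peer **head, struct peer *p)
{
	struct peer **pp = head;

	while (*pp && *pp != p)
		pp = &(*pp)->next;
	if (*pp) {
		*pp = p->next;
		p->next = NULL;
	}
}

/* Change the key: remove first, update, then reinsert in order. */
static void peer_set_addr(struct peer **head, struct peer *p,
			  unsigned int addr)
{
	peer_list_del(head, p);
	p->addr = addr;
	peer_list_add_sorted(head, p);
}

/* Returns 1 if the list is in ascending addr order, else 0. */
static int peer_list_is_sorted(const struct peer *head)
{
	for (; head && head->next; head = head->next)
		if (head->addr > head->next->addr)
			return 0;
	return 1;
}
```

Updating `addr` in place without the removal would leave the element at its old position, so later sorted inserts and lookups could see an inconsistent order.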

This was found from the following decode in an address sanitizer run.

=================================================================
==30778==ERROR: AddressSanitizer: heap-use-after-free on address 0x62a0000d8440 at pc 0x7f48c9c5c547 bp 0x7ffcba272cb0 sp 0x7ffcba272ca8
READ of size 2 at 0x62a0000d8440 thread T0
    #0 0x7f48c9c5c546 in sockunion_same lib/sockunion.c:425
    #1 0x55cfefe3000f in peer_hash_same bgpd/bgpd.c:890
    #2 0x7f48c9bde039 in hash_release lib/hash.c:209
    #3 0x55cfefe3373f in bgp_peer_conf_if_to_su_update bgpd/bgpd.c:1541
    #4 0x55cfefd0be7a in bgp_stop bgpd/bgp_fsm.c:1631
    #5 0x55cfefe4028f in peer_delete bgpd/bgpd.c:2362
    #6 0x55cfefdd5e97 in no_neighbor_interface_config bgpd/bgp_vty.c:4267
    #7 0x7f48c9b9d160 in cmd_execute_command_real lib/command.c:949
    #8 0x7f48c9ba1112 in cmd_execute_command lib/command.c:1009
    #9 0x7f48c9ba1573 in cmd_execute lib/command.c:1162
    #10 0x7f48c9c87402 in vty_command lib/vty.c:526
    #11 0x7f48c9c87832 in vty_execute lib/vty.c:1291
    #12 0x7f48c9c8e741 in vtysh_read lib/vty.c:2130
    #13 0x7f48c9c7a66d in thread_call lib/thread.c:1585
    #14 0x7f48c9bf64e7 in frr_run lib/libfrr.c:1123
    #15 0x55cfefc75a15 in main bgpd/bgp_main.c:540
    #16 0x7f48c96b009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #17 0x55cfefc787f9 in _start (/usr/lib/frr/bgpd+0xe27f9)

0x62a0000d8440 is located 576 bytes inside of 23376-byte region [0x62a0000d8200,0x62a0000ddd50)
freed by thread T0 here:
    #0 0x7f48c9eb9fb0 in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.5+0xe8fb0)
    #1 0x55cfefe3fe42 in peer_free bgpd/bgpd.c:1113
    #2 0x55cfefe3fe42 in peer_unlock_with_caller bgpd/bgpd.c:1144
    #3 0x55cfefe4092e in peer_delete bgpd/bgpd.c:2457
    #4 0x55cfefdd5e97 in no_neighbor_interface_config bgpd/bgp_vty.c:4267
    #5 0x7f48c9b9d160 in cmd_execute_command_real lib/command.c:949
    #6 0x7f48c9ba1112 in cmd_execute_command lib/command.c:1009
    #7 0x7f48c9ba1573 in cmd_execute lib/command.c:1162
    #8 0x7f48c9c87402 in vty_command lib/vty.c:526
    #9 0x7f48c9c87832 in vty_execute lib/vty.c:1291
    #10 0x7f48c9c8e741 in vtysh_read lib/vty.c:2130
    #11 0x7f48c9c7a66d in thread_call lib/thread.c:1585
    #12 0x7f48c9bf64e7 in frr_run lib/libfrr.c:1123
    #13 0x55cfefc75a15 in main bgpd/bgp_main.c:540
    #14 0x7f48c96b009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
taspelund pushed a commit to taspelund/frr that referenced this issue Dec 15, 2022
ASAN reported the following memleak:
```
Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x4d4342 in calloc (/usr/lib/frr/bgpd+0x4d4342)
    #1 0xbc3d68 in qcalloc /home/sharpd/frr8/lib/memory.c:116:27
    #2 0xb869f7 in list_new /home/sharpd/frr8/lib/linklist.c:64:9
    #3 0x5a38bc in bgp_evpn_remote_ip_hash_alloc /home/sharpd/frr8/bgpd/bgp_evpn.c:6789:24
    #4 0xb358d3 in hash_get /home/sharpd/frr8/lib/hash.c:162:13
    #5 0x593d39 in bgp_evpn_remote_ip_hash_add /home/sharpd/frr8/bgpd/bgp_evpn.c:6881:7
    #6 0x59dbbd in install_evpn_route_entry_in_vni_common /home/sharpd/frr8/bgpd/bgp_evpn.c:3049:2
    #7 0x59cfe0 in install_evpn_route_entry_in_vni_ip /home/sharpd/frr8/bgpd/bgp_evpn.c:3126:8
    #8 0x59c6f0 in install_evpn_route_entry /home/sharpd/frr8/bgpd/bgp_evpn.c:3318:8
    #9 0x59bb52 in install_uninstall_route_in_vnis /home/sharpd/frr8/bgpd/bgp_evpn.c:3888:10
    #10 0x59b6d2 in bgp_evpn_install_uninstall_table /home/sharpd/frr8/bgpd/bgp_evpn.c:4019:5
    #11 0x578857 in install_uninstall_evpn_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4051:9
    #12 0x58ada6 in bgp_evpn_import_route /home/sharpd/frr8/bgpd/bgp_evpn.c:6049:9
    #13 0x713794 in bgp_update /home/sharpd/frr8/bgpd/bgp_route.c:4842:3
    #14 0x583fa0 in process_type2_route /home/sharpd/frr8/bgpd/bgp_evpn.c:4518:9
    #15 0x5824ba in bgp_nlri_parse_evpn /home/sharpd/frr8/bgpd/bgp_evpn.c:5732:8
    #16 0x6ae6a2 in bgp_nlri_parse /home/sharpd/frr8/bgpd/bgp_packet.c:363:10
    #17 0x6be6fa in bgp_update_receive /home/sharpd/frr8/bgpd/bgp_packet.c:2020:15
    #18 0x6b7433 in bgp_process_packet /home/sharpd/frr8/bgpd/bgp_packet.c:2929:11
    #19 0xd00146 in thread_call /home/sharpd/frr8/lib/thread.c:2006:2
```

The list itself was not being cleaned up when the final list entry was
removed, so make sure we do that instead of leaking memory.
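
As a rough illustration of that cleanup, the sketch below frees a lazily allocated list as soon as its last entry is removed. The types and functions (`struct cache_entry`, `entry_add()`, `entry_del()`) are hypothetical and use plain libc allocation rather than FRR's linklist and memory helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct item {
	int value;
	struct item *next;
};

struct item_list {
	struct item *head;
	size_t count;
};

/* Hypothetical hash entry that lazily owns a list. */
struct cache_entry {
	struct item_list *items; /* NULL while there are no entries */
};

/* Returns 0 on success, -1 on allocation failure. */
static int entry_add(struct cache_entry *e, int value)
{
	struct item *it;

	if (!e->items) {
		e->items = calloc(1, sizeof(*e->items));
		if (!e->items)
			return -1;
	}
	it = calloc(1, sizeof(*it));
	if (!it) {
		/* Do not keep an empty container around. */
		if (e->items->count == 0) {
			free(e->items);
			e->items = NULL;
		}
		return -1;
	}
	it->value = value;
	it->next = e->items->head;
	e->items->head = it;
	e->items->count++;
	return 0;
}

static void entry_del(struct cache_entry *e, int value)
{
	struct item **pp;

	if (!e->items)
		return;
	for (pp = &e->items->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->value == value) {
			struct item *victim = *pp;

			*pp = victim->next;
			free(victim);
			e->items->count--;
			break;
		}
	}
	/* Free the container itself once the last entry is gone. */
	if (e->items->count == 0) {
		free(e->items);
		e->items = NULL;
	}
}
```

Without the final check in `entry_del()`, the empty `item_list` would stay allocated, which is the kind of 40-byte direct leak a leak checker reports.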

Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
pguibert6WIND added a commit to pguibert6WIND/frr that referenced this issue May 24, 2023
A BGP crash happens when the 'show bgp label-nexthop' is executed, on a
BGP VRF configuration with the 'label vpn export allocation-mode per-nexthop'
command configured.

> (gdb) bt
> #0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140039420118912) at ./nptl/pthread_kill.c:44
> #1  __pthread_kill_internal (signo=6, threadid=140039420118912) at ./nptl/pthread_kill.c:78
> #2  __GI___pthread_kill (threadid=140039420118912, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
> #3  0x00007f5d78063476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
> #4  0x00007f5d78448705 in core_handler (signo=6, siginfo=0x7ffffe2e0f70, context=0x7ffffe2e0e40)
>     at /build/make-pkg/output/_packages/cp-routing/src/lib/sigevent.c:262
> #5  <signal handler called>
> #6  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140039420118912) at ./nptl/pthread_kill.c:44
> #7  __pthread_kill_internal (signo=6, threadid=140039420118912) at ./nptl/pthread_kill.c:78
> #8  __GI___pthread_kill (threadid=140039420118912, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
> #9  0x00007f5d78063476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
> #10 0x00007f5d780497f3 in __GI_abort () at ./stdlib/abort.c:79
> #11 0x00007f5d7848872f in _zlog_assert_failed (xref=0x5605f59a4ec0 <_xref.9>, extra=0x0)
>     at /build/make-pkg/output/_packages/cp-routing/src/lib/zlog.c:557
> #12 0x00005605f57ad3ab in show_bgp_nexthop_label_afi (vty=0x5605f79138e0, afi=AFI_IP, bgp=0x5605f7855090, detail=true)
>     at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_labelpool.c:1071
> #13 0x00005605f57ad62a in show_bgp_nexthop_label (self=0x5605f59a4920 <show_bgp_nexthop_label_cmd>, vty=0x5605f79138e0, argc=6, argv=0x5605f77c5cc0)
>     at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_labelpool.c:1116
> #14 0x00007f5d783bb858 in cmd_execute_command_real (vline=0x5605f791c060, filter=FILTER_RELAXED, vty=0x5605f79138e0, cmd=0x0, up_level=0)
>     at /build/make-pkg/output/_packages/cp-routing/src/lib/command.c:1070
> #15 0x00007f5d783bb9dd in cmd_execute_command (vline=0x5605f791c060, vty=0x5605f79138e0, cmd=0x0, vtysh=0)
>     at /build/make-pkg/output/_packages/cp-routing/src/lib/command.c:1130
> #16 0x00007f5d783bbf8a in cmd_execute (vty=0x5605f79138e0, cmd=0x5605f791a010 "show bgp vrf vrf1 label-nexthop detail", matched=0x0, vtysh=0)
>     at /build/make-pkg/output/_packages/cp-routing/src/lib/command.c:1294
> #17 0x00007f5d7846965c in vty_command (vty=0x5605f79138e0, buf=0x5605f791a010 "show bgp vrf vrf1 label-nexthop detail")
>     at /build/make-pkg/output/_packages/cp-routing/src/lib/vty.c:530
> #18 0x00007f5d7846b52c in vty_execute (vty=0x5605f79138e0) at /build/make-pkg/output/_packages/cp-routing/src/lib/vty.c:1296
> #19 0x00007f5d7846d508 in vtysh_read (thread=0x7ffffe2e4410) at /build/make-pkg/output/_packages/cp-routing/src/lib/vty.c:2137
> #20 0x00007f5d78461fe6 in thread_call (thread=0x7ffffe2e4410) at /build/make-pkg/output/_packages/cp-routing/src/lib/thread.c:1825
>
> (gdb) frame 12
> #12 0x00005605f57ad3ab in show_bgp_nexthop_label_afi (vty=0x5605f79138e0, afi=AFI_IP, bgp=0x5605f7855090, detail=true)
>     at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_labelpool.c:1071
>
>

This crash is a segmentation fault: the 'path' pointer is not a valid pointer;
consequently, the 'dest' and the 'table' pointers are also invalid. The 'path'
pointer refers to a 'bgp_path_info' structure that was freed earlier, when a
peer down event occurred. The 'show bgp label-nexthop' command was attempting to
dump the 'bgp_path_info' entries referenced in the 'bgp_label_per_nexthop_cache'
entries. Because those 'bgp_path_info' entries were invalid, the command crashed.
To illustrate, the dump below shows 3 path entries still linked to the
'192.0.2.11' next-hop, even though the '192.0.2.11' peer has been removed.

> dut-vm# show bgp vrf vrf1 label-nexthop
> Current BGP label nexthop cache for IPv4, VRF VRF vrf1
>  192.0.2.11, label 20 #paths 3
>   if r1-eth1
>   Last update: Wed May 24 15:16:21 2023
> dut-vm# show bgp vrf vrf1 label-nexthop detail
> <-- crash

When the 'bgp_path_info' entries are freed, the 'bgp_mplsvpn_path_nh_label_unlink()'
function is called. The 'pi->net' pointer is needed to check that the BGP RIB table
is the SAFI_UNICAST routing table, and to unlink the 'bgp_label_per_nexthop_cache'
entry from the 'bgp_path_info' entry. In this case, 'pi->net' was unset, so the
unlink did not happen.

Fix this by introducing a new 'mplsvpn_usage' field to determine if the 'bgp_path_info'
structure contains a 'bgp_mplsvpn_label_nh' structure.
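
The idea can be sketched as follows. This is a minimal illustration, not the bgpd code: the structure layouts and the helper names 'path_link()' and 'path_unlink()' are hypothetical, and only the 'mplsvpn_usage' field name comes from the description above. The point is that the unlink decision depends on a flag stored in the path itself, so it still works when 'net' is unset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real FRR structures. */
struct label_cache {
	unsigned int path_count; /* number of paths linked to this next hop */
};

struct path_info {
	void *net;                 /* owning RIB node; may be unset at free time */
	bool mplsvpn_usage;        /* true while 'cache' is linked */
	struct label_cache *cache; /* back-reference into the next-hop cache */
};

/* Link a path to a cache entry and record that the link exists. */
static void path_link(struct path_info *pi, struct label_cache *c)
{
	pi->cache = c;
	pi->mplsvpn_usage = true;
	c->path_count++;
}

/* Unlink based on the flag only; 'pi->net' is never consulted. */
static void path_unlink(struct path_info *pi)
{
	if (!pi->mplsvpn_usage)
		return;
	assert(pi->cache != NULL);
	pi->cache->path_count--;
	pi->cache = NULL;
	pi->mplsvpn_usage = false;
}
```

Calling 'path_unlink()' a second time is a no-op, which keeps the free path safe even if it is reached more than once.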

Fixes: ("bgpd: allocate label bound to received mpls vpn routes")
Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
Keelan10 added a commit to Keelan10/frr that referenced this issue Jun 26, 2023
This commit ensures proper cleanup by deleting the gm_join_list when a PIM interface is deleted. The gm_join_list was previously not being freed, causing a memory leak.
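
A minimal sketch of the ownership rule behind this fix, assuming a plain singly linked list rather than FRR's 'struct list'. All names here ('struct pim_iface', 'iface_join_add()', 'iface_delete()') are hypothetical and only illustrate that the interface must release the list it owns before it is freed itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for a PIM interface and its join list. */
struct join_node {
	struct join_node *next;
};

struct pim_iface {
	struct join_node *join_list; /* static group joins owned by the interface */
};

/* Prepend one join entry; returns 0 on success, -1 on allocation failure. */
static int iface_join_add(struct pim_iface *ifp)
{
	struct join_node *n = calloc(1, sizeof(*n));

	if (n == NULL)
		return -1;
	n->next = ifp->join_list;
	ifp->join_list = n;
	return 0;
}

/* Free every join entry and return how many were released. */
static size_t iface_join_list_free(struct pim_iface *ifp)
{
	size_t freed = 0;

	while (ifp->join_list != NULL) {
		struct join_node *n = ifp->join_list;

		ifp->join_list = n->next;
		free(n);
		freed++;
	}
	return freed;
}

/* Interface teardown: release owned lists before the interface itself. */
static size_t iface_delete(struct pim_iface *ifp)
{
	size_t freed = iface_join_list_free(ifp);

	assert(ifp->join_list == NULL);
	free(ifp);
	return freed;
}
```

Skipping the 'iface_join_list_free()' call in 'iface_delete()' would leave the nodes unreachable, which is the kind of indirect leak the sanitizer reports.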

The ASan leak log for reference:
```
***********************************************************************************
Address Sanitizer Error detected in multicast_mld_join_topo1.test_multicast_mld_local_join/r1.asan.pim6d.28070

=================================================================
==28070==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f3605dbfd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x56230373dd6b in qcalloc lib/memory.c:105
    #2 0x56230372180f in list_new lib/linklist.c:49
    #3 0x56230361b589 in pim_if_gm_join_add pimd/pim_iface.c:1313
    #4 0x562303642247 in lib_interface_gmp_address_family_static_group_create pimd/pim_nb_config.c:2868
    #5 0x562303767280 in nb_callback_create lib/northbound.c:1235
    #6 0x562303767280 in nb_callback_configuration lib/northbound.c:1579
    #7 0x562303768a1d in nb_transaction_process lib/northbound.c:1710
    #8 0x56230376904a in nb_candidate_commit_apply lib/northbound.c:1104
    #9 0x5623037692ba in nb_candidate_commit lib/northbound.c:1137
    #10 0x562303769dec in nb_cli_classic_commit lib/northbound_cli.c:49
    #11 0x56230376fb79 in nb_cli_pending_commit_check lib/northbound_cli.c:88
    #12 0x5623036c5bcb in cmd_execute_command_real lib/command.c:991
    #13 0x5623036c5f1b in cmd_execute_command lib/command.c:1053
    #14 0x5623036c6392 in cmd_execute lib/command.c:1221
    #15 0x5623037e75da in vty_command lib/vty.c:591
    #16 0x5623037e7a74 in vty_execute lib/vty.c:1354
    #17 0x5623037f0253 in vtysh_read lib/vty.c:2362
    #18 0x5623037db4e8 in event_call lib/event.c:1995
    #19 0x562303720f97 in frr_run lib/libfrr.c:1213
    #20 0x56230368615d in main pimd/pim6_main.c:184
    #21 0x7f360461bc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

Indirect leak of 192 byte(s) in 4 object(s) allocated from:
    #0 0x7f3605dbfd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x56230373dd6b in qcalloc lib/memory.c:105
    #2 0x56230361b91d in gm_join_new pimd/pim_iface.c:1288
    #3 0x56230361b91d in pim_if_gm_join_add pimd/pim_iface.c:1326
    #4 0x562303642247 in lib_interface_gmp_address_family_static_group_create pimd/pim_nb_config.c:2868
    #5 0x562303767280 in nb_callback_create lib/northbound.c:1235
    #6 0x562303767280 in nb_callback_configuration lib/northbound.c:1579
    #7 0x562303768a1d in nb_transaction_process lib/northbound.c:1710
    #8 0x56230376904a in nb_candidate_commit_apply lib/northbound.c:1104
    #9 0x5623037692ba in nb_candidate_commit lib/northbound.c:1137
    #10 0x562303769dec in nb_cli_classic_commit lib/northbound_cli.c:49
    #11 0x56230376fb79 in nb_cli_pending_commit_check lib/northbound_cli.c:88
    #12 0x5623036c5bcb in cmd_execute_command_real lib/command.c:991
    #13 0x5623036c5f1b in cmd_execute_command lib/command.c:1053
    #14 0x5623036c6392 in cmd_execute lib/command.c:1221
    #15 0x5623037e75da in vty_command lib/vty.c:591
    #16 0x5623037e7a74 in vty_execute lib/vty.c:1354
    #17 0x5623037f0253 in vtysh_read lib/vty.c:2362
    #18 0x5623037db4e8 in event_call lib/event.c:1995
    #19 0x562303720f97 in frr_run lib/libfrr.c:1213
    #20 0x56230368615d in main pimd/pim6_main.c:184
    #21 0x7f360461bc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

Indirect leak of 96 byte(s) in 4 object(s) allocated from:
    #0 0x7f3605dbfd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x56230373dd6b in qcalloc lib/memory.c:105
    #2 0x562303721651 in listnode_new lib/linklist.c:71
    #3 0x56230372182b in listnode_add lib/linklist.c:92
    #4 0x56230361ba9a in gm_join_new pimd/pim_iface.c:1295
    #5 0x56230361ba9a in pim_if_gm_join_add pimd/pim_iface.c:1326
    #6 0x562303642247 in lib_interface_gmp_address_family_static_group_create pimd/pim_nb_config.c:2868
    #7 0x562303767280 in nb_callback_create lib/northbound.c:1235
    #8 0x562303767280 in nb_callback_configuration lib/northbound.c:1579
    #9 0x562303768a1d in nb_transaction_process lib/northbound.c:1710
    #10 0x56230376904a in nb_candidate_commit_apply lib/northbound.c:1104
    #11 0x5623037692ba in nb_candidate_commit lib/northbound.c:1137
    #12 0x562303769dec in nb_cli_classic_commit lib/northbound_cli.c:49
    #13 0x56230376fb79 in nb_cli_pending_commit_check lib/northbound_cli.c:88
    #14 0x5623036c5bcb in cmd_execute_command_real lib/command.c:991
    #15 0x5623036c5f1b in cmd_execute_command lib/command.c:1053
    #16 0x5623036c6392 in cmd_execute lib/command.c:1221
    #17 0x5623037e75da in vty_command lib/vty.c:591
    #18 0x5623037e7a74 in vty_execute lib/vty.c:1354
    #19 0x5623037f0253 in vtysh_read lib/vty.c:2362
    #20 0x5623037db4e8 in event_call lib/event.c:1995
    #21 0x562303720f97 in frr_run lib/libfrr.c:1213
    #22 0x56230368615d in main pimd/pim6_main.c:184
    #23 0x7f360461bc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

Indirect leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7f3605dbfd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x56230373dd6b in qcalloc lib/memory.c:105
    #2 0x56230361b91d in gm_join_new pimd/pim_iface.c:1288
    #3 0x56230361b91d in pim_if_gm_join_add pimd/pim_iface.c:1326
    #4 0x562303642247 in lib_interface_gmp_address_family_static_group_create pimd/pim_nb_config.c:2868
    #5 0x562303767280 in nb_callback_create lib/northbound.c:1235
    #6 0x562303767280 in nb_callback_configuration lib/northbound.c:1579
    #7 0x562303768a1d in nb_transaction_process lib/northbound.c:1710
    #8 0x56230376904a in nb_candidate_commit_apply lib/northbound.c:1104
    #9 0x5623037692ba in nb_candidate_commit lib/northbound.c:1137
    #10 0x562303769dec in nb_cli_classic_commit lib/northbound_cli.c:49
    #11 0x56230376fb79 in nb_cli_pending_commit_check lib/northbound_cli.c:88
    #12 0x5623036c5bcb in cmd_execute_command_real lib/command.c:991
    #13 0x5623036c5f6f in cmd_execute_command lib/command.c:1072
    #14 0x5623036c6392 in cmd_execute lib/command.c:1221
    #15 0x5623037e75da in vty_command lib/vty.c:591
    #16 0x5623037e7a74 in vty_execute lib/vty.c:1354
    #17 0x5623037f0253 in vtysh_read lib/vty.c:2362
    #18 0x5623037db4e8 in event_call lib/event.c:1995
    #19 0x562303720f97 in frr_run lib/libfrr.c:1213
    #20 0x56230368615d in main pimd/pim6_main.c:184
    #21 0x7f360461bc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7f3605dbfd28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
    #1 0x56230373dd6b in qcalloc lib/memory.c:105
    #2 0x562303721651 in listnode_new lib/linklist.c:71
    #3 0x56230372182b in listnode_add lib/linklist.c:92
    #4 0x56230361ba9a in gm_join_new pimd/pim_iface.c:1295
    #5 0x56230361ba9a in pim_if_gm_join_add pimd/pim_iface.c:1326
    #6 0x562303642247 in lib_interface_gmp_address_family_static_group_create pimd/pim_nb_config.c:2868
    #7 0x562303767280 in nb_callback_create lib/northbound.c:1235
    #8 0x562303767280 in nb_callback_configuration lib/northbound.c:1579
    #9 0x562303768a1d in nb_transaction_process lib/northbound.c:1710
    #10 0x56230376904a in nb_candidate_commit_apply lib/northbound.c:1104
    #11 0x5623037692ba in nb_candidate_commit lib/northbound.c:1137
    #12 0x562303769dec in nb_cli_classic_commit lib/northbound_cli.c:49
    #13 0x56230376fb79 in nb_cli_pending_commit_check lib/northbound_cli.c:88
    #14 0x5623036c5bcb in cmd_execute_command_real lib/command.c:991
    #15 0x5623036c5f6f in cmd_execute_command lib/command.c:1072
    #16 0x5623036c6392 in cmd_execute lib/command.c:1221
    #17 0x5623037e75da in vty_command lib/vty.c:591
    #18 0x5623037e7a74 in vty_execute lib/vty.c:1354
    #19 0x5623037f0253 in vtysh_read lib/vty.c:2362
    #20 0x5623037db4e8 in event_call lib/event.c:1995
    #21 0x562303720f97 in frr_run lib/libfrr.c:1213
    #22 0x56230368615d in main pimd/pim6_main.c:184
    #23 0x7f360461bc86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)

SUMMARY: AddressSanitizer: 400 byte(s) leaked in 11 allocation(s).
***********************************************************************************
```

Signed-off-by: Keelan Cannoo <keelan.cannoo@icloud.com>
pguibert6WIND added a commit to pguibert6WIND/frr that referenced this issue Oct 7, 2024
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
>     #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
>     #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
>     #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
>     #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
>     #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
>     #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
>     #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
>     #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
>     #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
>     #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
>     #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
>     #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
>     #12 0x7f26f275bae4 in route_table_free lib/table.c:111
>     #13 0x7f26f275b749 in route_table_finish lib/table.c:46
>     #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
>     #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
>     #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
>     #17 0x7f26f2777108 in event_call lib/event.c:2011
>     #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
>     #19 0x55910c4f49cb in main zebra/main.c:531
>     #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

This happens when FRR uses the kernel. During shutdown, zebra
attempts to obtain the namespace identifier while preparing
zebra dataplane nexthop messages.

Fix this by accessing the ns structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
pguibert6WIND added a commit to pguibert6WIND/frr that referenced this issue Oct 8, 2024
louis-6wind pushed a commit to louis-6wind/frr that referenced this issue Oct 9, 2024
pguibert6WIND added a commit to pguibert6WIND/frr that referenced this issue Oct 10, 2024
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
>     #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
>     #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
>     #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
>     #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
>     #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
>     #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
>     #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
>     #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
>     #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
>     #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
>     #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
>     #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
>     #12 0x7f26f275bae4 in route_table_free lib/table.c:111
>     #13 0x7f26f275b749 in route_table_finish lib/table.c:46
>     #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
>     #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
>     #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
>     #17 0x7f26f2777108 in event_call lib/event.c:2011
>     #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
>     #19 0x55910c4f49cb in main zebra/main.c:531
>     #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

It happens with FRR using the kernel. During shutdown, zebra tries
to obtain the namespace identifier while preparing dataplane
nexthop messages.

Fix this by accessing the ns structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
pguibert6WIND added a commit to pguibert6WIND/frr that referenced this issue Oct 14, 2024
When a failover happens on ECMP paths that share the same
recursively resolved nexthop, zebra replaces the old NHG with a
new one and updates the pointer in every route that uses that
nexthop.

If only the recursive nexthop changed, there is no need to
replace the old NHG. Modify the zebra_nhg_proto_add() function
to update the recursive nexthop on the original NHG instead.

This replaces the previous method, which allocated a new nhe.
The change triggers an ASAN error in the bgp_nhg_zapi_scalability
test, in the test_bgp_ipv4_simulate_r5_machine_going_down()
function:

> ==1195107==ERROR: AddressSanitizer: heap-use-after-free on address 0x60e0000de580 at pc 0x55b6b7d55d8e bp 0x7fffd81977a0 sp 0x7fffd8197790
> READ of size 4 at 0x60e0000de580 thread T0
>     #0 0x55b6b7d55d8d in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1858
>     #1 0x55b6b7d55fee in zebra_nhg_free_members zebra/zebra_nhg.c:1752
>     #2 0x55b6b7d55fee in zebra_nhg_free zebra/zebra_nhg.c:1772
>     #3 0x55b6b7d59215 in zebra_nhg_proto_add zebra/zebra_nhg.c:3883
>     #4 0x55b6b7d83615 in process_subq_nhg zebra/zebra_rib.c:2738
>     #5 0x55b6b7d83615 in process_subq zebra/zebra_rib.c:3344
>     #6 0x55b6b7d83615 in meta_queue_process zebra/zebra_rib.c:3397
>     #7 0x7fe57a916fef in work_queue_run lib/workqueue.c:282
>     #8 0x7fe57a8f863b in event_call lib/event.c:1996
>     #9 0x7fe57a81e527 in frr_run lib/libfrr.c:1237
>     #10 0x55b6b7c40c74 in main zebra/main.c:526
>     #11 0x7fe57a229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #12 0x7fe57a229e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #13 0x55b6b7c43b84 in _start (/usr/lib/frr/zebra+0x1adb84)
>
> 0x60e0000de580 is located 96 bytes inside of 160-byte region [0x60e0000de520,0x60e0000de5c0)
> freed by thread T0 here:
>     #0 0x7fe57acb4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
>     #1 0x55b6b7d59628 in zebra_nhg_proto_add zebra/zebra_nhg.c:3876
>     #2 0x55b6b7d83615 in process_subq_nhg zebra/zebra_rib.c:2738
>     #3 0x55b6b7d83615 in process_subq zebra/zebra_rib.c:3344
>     #4 0x55b6b7d83615 in meta_queue_process zebra/zebra_rib.c:3397
>     #5 0x7fe57a916fef in work_queue_run lib/workqueue.c:282
>     #6 0x7fe57a8f863b in event_call lib/event.c:1996
>     #7 0x7fe57a81e527 in frr_run lib/libfrr.c:1237
>     #8 0x55b6b7c40c74 in main zebra/main.c:526
>     #9 0x7fe57a229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
>     #0 0x7fe57acb4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fe57a83e98e in qcalloc lib/memory.c:106
>     #2 0x55b6b7d5149e in zebra_nhg_alloc zebra/zebra_nhg.c:392
>     #3 0x55b6b7d5149e in zebra_nhe_copy zebra/zebra_nhg.c:499
>     #4 0x55b6b7d5181f in zebra_nhg_hash_alloc zebra/zebra_nhg.c:538
>     #5 0x7fe57a7fbf0d in hash_get lib/hash.c:147
>     #6 0x55b6b7d542ea in zebra_nhe_find zebra/zebra_nhg.c:832
>     #7 0x55b6b7d5495f in zebra_nhg_find zebra/zebra_nhg.c:1014
>     #8 0x55b6b7d54dcd in zebra_nhg_find_nexthop zebra/zebra_nhg.c:1031
>     #9 0x55b6b7d535e8 in depends_find_recursive zebra/zebra_nhg.c:1514
>     #10 0x55b6b7d535e8 in depends_find zebra/zebra_nhg.c:1563
>     #11 0x55b6b7d535e8 in depends_find_add zebra/zebra_nhg.c:1602
>     #12 0x55b6b7d59884 in zebra_nhg_update_nhe zebra/zebra_nhg.c:3738
>     #13 0x55b6b7d59884 in zebra_nhg_proto_add zebra/zebra_nhg.c:3844
>     #14 0x55b6b7d83615 in process_subq_nhg zebra/zebra_rib.c:2738
>     #15 0x55b6b7d83615 in process_subq zebra/zebra_rib.c:3344
>     #16 0x55b6b7d83615 in meta_queue_process zebra/zebra_rib.c:3397
>     #17 0x7fe57a916fef in work_queue_run lib/workqueue.c:282
>     #18 0x7fe57a8f863b in event_call lib/event.c:1996
>     #19 0x7fe57a81e527 in frr_run lib/libfrr.c:1237
>     #20 0x55b6b7c40c74 in main zebra/main.c:526
>     #21 0x7fe57a229d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> SUMMARY: AddressSanitizer: heap-use-after-free zebra/zebra_nhg.c:1858 in zebra_nhg_decrement_ref
> Shadow bytes around the buggy address:
>   0x0c1c80013c60: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
>   0x0c1c80013c70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
>   0x0c1c80013c80: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
>   0x0c1c80013c90: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
>   0x0c1c80013ca0: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
> =>0x0c1c80013cb0:[fd]fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
>   0x0c1c80013cc0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>   0x0c1c80013cd0: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
>   0x0c1c80013ce0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
>   0x0c1c80013cf0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c1c80013d00: 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
>   Addressable:           00
>   Partially addressable: 01 02 03 04 05 06 07
>   Heap left redzone:       fa
>   Freed heap region:       fd
>   Stack left redzone:      f1
>   Stack mid redzone:       f2
>   Stack right redzone:     f3
>   Stack after return:      f5
>   Stack use after scope:   f8
>   Global redzone:          f9
>   Global init order:       f6
>   Poisoned by user:        f7
>   Container overflow:      fc
>   Array cookie:            ac
>   Intra object redzone:    bb
>   ASan internal:           fe
>   Left alloca redzone:     ca
>   Right alloca redzone:    cb
>   Shadow gap:              cc
> ==1195107==ABORTING
>

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
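
Reading the trace in the commit message above: the 160-byte entry is freed from zebra_nhg_proto_add() (zebra/zebra_nhg.c:3876) and is then read again by zebra_nhg_decrement_ref(), reached through zebra_nhg_free() and zebra_nhg_free_members() from zebra/zebra_nhg.c:3883. The toy model below is only a sketch of that ordering problem, written in Python rather than C, and it is not zebra's code. The `Nhe` class, its fields, and the `UseAfterFree` exception are hypothetical names invented for illustration.

```python
class UseAfterFree(Exception):
    """Raised when the toy model touches an entry that was already freed."""


class Nhe:
    """Toy stand-in for a nexthop group entry (hypothetical, not zebra's struct)."""

    def __init__(self, nhe_id, depends=()):
        self.id = nhe_id
        self.refcnt = 0
        self.freed = False
        self.depends = list(depends)
        for dep in self.depends:
            dep.refcnt += 1  # each parent holds one reference on its depends

    def decrement_ref(self):
        if self.freed:
            # This is the read that ASAN reports in the trace above.
            raise UseAfterFree(f"NHE {self.id} read after free")
        self.refcnt -= 1
        if self.refcnt <= 0:
            self.free()

    def free(self):
        if self.freed:
            raise UseAfterFree(f"NHE {self.id} freed twice")
        # Release the members first, then mark this entry as gone.
        for dep in self.depends:
            dep.decrement_ref()
        self.depends.clear()
        self.freed = True


# Safe order: the parent drops its reference, which frees the depend.
dep = Nhe(2)
parent = Nhe(1, depends=[dep])
parent.free()

# Unsafe order: the depend is freed directly while the parent still
# lists it, so freeing the parent later touches freed memory.
stale_dep = Nhe(4)
stale_parent = Nhe(3, depends=[stale_dep])
stale_dep.free()
try:
    stale_parent.free()
except UseAfterFree as err:
    print("detected:", err)
```

In this model the error disappears once every release goes through the owner's reference count instead of freeing a depend that another entry still lists.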
pguibert6WIND added a commit to pguibert6WIND/frr that referenced this issue Oct 14, 2024
A general flush is done on the nhg_depends of the protocol nexthop
group. However, the NHG should not be removed if routes are still
attached to it. At the same time, it seems the route count does not
propagate to the nhg_depends.

The drawback of this method is that the ASAN error is still present.
Also, compared with the old method (allocation), the refcount is lower
than expected for nexthop groups with a route count only:

Allocation method in proto_add():

> 2024/10/14 10:57:24.915401 ZEBRA: [VB8P9-5F2GE] zebra_nhg_proto_add: BEFORE NHE 71428576, (71428576[39/49/59]) cnt 2002
> 2024/10/14 10:57:24.915510 ZEBRA: [HCTBK-W37K2] zebra_nhg_proto_add: NHE 71428576, (71428576[49/59/65]) cnt 1
> 2024/10/14 10:57:24.915513 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add:            NHE 49, (49[50]) cnt 2012
> 2024/10/14 10:57:24.915515 ZEBRA: [VP9H1-EV2BN] 	(71428573)
> 2024/10/14 10:57:24.915515 ZEBRA: [VP9H1-EV2BN] 	(71428574)
> 2024/10/14 10:57:24.915516 ZEBRA: [VP9H1-EV2BN] 	(71428576)
> 2024/10/14 10:57:24.915517 ZEBRA: [VP9H1-EV2BN] 	(71428578)
> 2024/10/14 10:57:24.915517 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add:            NHE 59, (59[60]) cnt 2007
> 2024/10/14 10:57:24.915519 ZEBRA: [VP9H1-EV2BN] 	(71428575)
> 2024/10/14 10:57:24.915519 ZEBRA: [VP9H1-EV2BN] 	(71428576)
> 2024/10/14 10:57:24.915520 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add:            NHE 65, (65[42]) cnt 4
> 2024/10/14 10:57:24.915521 ZEBRA: [VP9H1-EV2BN] 	(71428571)
> 2024/10/14 10:57:24.915522 ZEBRA: [VP9H1-EV2BN] 	(71428576)

Method using a general flush while keeping the old pointer:

> 2024/10/14 10:51:17.229799 ZEBRA: [VB8P9-5F2GE] zebra_nhg_proto_add: BEFORE NHE 71428576, (71428576[39/49/59]) cnt 2002
> 2024/10/14 10:51:17.229909 ZEBRA: [HCTBK-W37K2] zebra_nhg_proto_add: NHE 71428576, (71428576[49/59/65]) cnt 2002
> 2024/10/14 10:51:17.229912 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add:            NHE 49, (49[50]) cnt 2011
> 2024/10/14 10:51:17.229914 ZEBRA: [VP9H1-EV2BN] 	(71428573)
> 2024/10/14 10:51:17.229915 ZEBRA: [VP9H1-EV2BN] 	(71428574)
> 2024/10/14 10:51:17.229915 ZEBRA: [VP9H1-EV2BN] 	(71428576)
> 2024/10/14 10:51:17.229916 ZEBRA: [VP9H1-EV2BN] 	(71428578)
> 2024/10/14 10:51:17.229916 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add:            NHE 59, (59[60]) cnt 2006
> 2024/10/14 10:51:17.229918 ZEBRA: [VP9H1-EV2BN] 	(71428575)
> 2024/10/14 10:51:17.229918 ZEBRA: [VP9H1-EV2BN] 	(71428576)
> 2024/10/14 10:51:17.229919 ZEBRA: [RM3ZQ-V7JN5] zebra_nhg_proto_add:            NHE 65, (65[42]) cnt 4
> 2024/10/14 10:51:17.229920 ZEBRA: [VP9H1-EV2BN] 	(71428571)
> 2024/10/14 10:51:17.229921 ZEBRA: [VP9H1-EV2BN] 	(71428576)
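
The two excerpts can be compared mechanically. The sketch below is a minimal Python helper, assuming log lines shaped like the ones quoted above. `nhe_counts` and `count_deltas` are hypothetical names, and the sample strings are trimmed copies of the quoted lines with the timestamps and log IDs dropped.

```python
import re

# Matches the "NHE <id>, (<members>) cnt <n>" part of the quoted log lines.
NHE_LINE = re.compile(r"NHE (\d+), \(.*?\) cnt (\d+)")


def nhe_counts(log_text):
    """Return {nhe_id: cnt}, keeping the last value logged for each NHE."""
    counts = {}
    for line in log_text.splitlines():
        match = NHE_LINE.search(line)
        if match:
            counts[int(match.group(1))] = int(match.group(2))
    return counts


def count_deltas(old_log, new_log):
    """Return {nhe_id: new_cnt - old_cnt} for NHEs present in both logs."""
    old, new = nhe_counts(old_log), nhe_counts(new_log)
    return {nhe: new[nhe] - old[nhe] for nhe in old if nhe in new}


ALLOCATION = """\
zebra_nhg_proto_add: BEFORE NHE 71428576, (71428576[39/49/59]) cnt 2002
zebra_nhg_proto_add: NHE 71428576, (71428576[49/59/65]) cnt 1
zebra_nhg_proto_add:            NHE 49, (49[50]) cnt 2012
zebra_nhg_proto_add:            NHE 59, (59[60]) cnt 2007
zebra_nhg_proto_add:            NHE 65, (65[42]) cnt 4
"""

FLUSH = """\
zebra_nhg_proto_add: BEFORE NHE 71428576, (71428576[39/49/59]) cnt 2002
zebra_nhg_proto_add: NHE 71428576, (71428576[49/59/65]) cnt 2002
zebra_nhg_proto_add:            NHE 49, (49[50]) cnt 2011
zebra_nhg_proto_add:            NHE 59, (59[60]) cnt 2006
zebra_nhg_proto_add:            NHE 65, (65[42]) cnt 4
"""

print(count_deltas(ALLOCATION, FLUSH))
# {71428576: 2001, 49: -1, 59: -1, 65: 0}
```

In this data the dependent entries 49 and 59 each end up one reference lower with the flush method, while the protocol NHE keeps its count of 2002 because the old pointer is retained.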

Resulting ASAN error when running bgp_nhg_zapi_notification, on the
test_bgp_ipv4_simulate_r5_machine_going_down() function:

> r1: zebra triggered an exception by AddressSanitizer
> AddressSanitizer error in topotest `test_bgp_nhg_zapi_scalability.py`, test `teardown_module`, router `r1`
>
> ERROR: AddressSanitizer: heap-use-after-free on address 0x60e0000de580 at pc 0x558a7d98cd8e bp 0x7fff4915a6e0 sp 0x7fff4915a6d0
> READ of size 4 at 0x60e0000de580 thread T0
>     #0 0x558a7d98cd8d in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1858
>     #1 0x558a7d98cfee in zebra_nhg_free_members zebra/zebra_nhg.c:1752
>     #2 0x558a7d98cfee in zebra_nhg_free zebra/zebra_nhg.c:1772
>     #3 0x558a7d9901ff in zebra_nhg_proto_add zebra/zebra_nhg.c:3861
>     #4 0x558a7d9ba365 in process_subq_nhg zebra/zebra_rib.c:2738
>     #5 0x558a7d9ba365 in process_subq zebra/zebra_rib.c:3344
>     #6 0x558a7d9ba365 in meta_queue_process zebra/zebra_rib.c:3397
>     #7 0x7fa262f16fef in work_queue_run lib/workqueue.c:282
>     #8 0x7fa262ef863b in event_call lib/event.c:1996
>     #9 0x7fa262e1e527 in frr_run lib/libfrr.c:1237
>     #10 0x558a7d877c74 in main zebra/main.c:526
>     #11 0x7fa262829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #12 0x7fa262829e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #13 0x558a7d87ab84 in _start (/usr/lib/frr/zebra+0x1acb84)
>
> 0x60e0000de580 is located 96 bytes inside of 160-byte region [0x60e0000de520,0x60e0000de5c0)
> freed by thread T0 here:
>     #0 0x7fa2632b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
>     #1 0x558a7d9908a1 in zebra_nhg_proto_add zebra/zebra_nhg.c:3854
>     #2 0x558a7d9ba365 in process_subq_nhg zebra/zebra_rib.c:2738
>     #3 0x558a7d9ba365 in process_subq zebra/zebra_rib.c:3344
>     #4 0x558a7d9ba365 in meta_queue_process zebra/zebra_rib.c:3397
>     #5 0x7fa262f16fef in work_queue_run lib/workqueue.c:282
>     #6 0x7fa262ef863b in event_call lib/event.c:1996
>     #7 0x7fa262e1e527 in frr_run lib/libfrr.c:1237
>     #8 0x558a7d877c74 in main zebra/main.c:526
>     #9 0x7fa262829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
>     #0 0x7fa2632b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7fa262e3e98e in qcalloc lib/memory.c:106
>     #2 0x558a7d98849e in zebra_nhg_alloc zebra/zebra_nhg.c:392
>     #3 0x558a7d98849e in zebra_nhe_copy zebra/zebra_nhg.c:499
>     #4 0x558a7d98881f in zebra_nhg_hash_alloc zebra/zebra_nhg.c:538
>     #5 0x7fa262dfbf0d in hash_get lib/hash.c:147
>     #6 0x558a7d98b2ea in zebra_nhe_find zebra/zebra_nhg.c:832
>     #7 0x558a7d98b95f in zebra_nhg_find zebra/zebra_nhg.c:1014
>     #8 0x558a7d98bdcd in zebra_nhg_find_nexthop zebra/zebra_nhg.c:1031
>     #9 0x558a7d98a5e8 in depends_find_recursive zebra/zebra_nhg.c:1514
>     #10 0x558a7d98a5e8 in depends_find zebra/zebra_nhg.c:1563
>     #11 0x558a7d98a5e8 in depends_find_add zebra/zebra_nhg.c:1602
>     #12 0x558a7d990378 in zebra_nhg_update_nhe zebra/zebra_nhg.c:3739
>     #13 0x558a7d990378 in zebra_nhg_proto_add zebra/zebra_nhg.c:3822
>     #14 0x558a7d9ba365 in process_subq_nhg zebra/zebra_rib.c:2738
>     #15 0x558a7d9ba365 in process_subq zebra/zebra_rib.c:3344
>     #16 0x558a7d9ba365 in meta_queue_process zebra/zebra_rib.c:3397
>     #17 0x7fa262f16fef in work_queue_run lib/workqueue.c:282
>     #18 0x7fa262ef863b in event_call lib/event.c:1996
>     #19 0x7fa262e1e527 in frr_run lib/libfrr.c:1237
>     #20 0x558a7d877c74 in main zebra/main.c:526
>     #21 0x7fa262829d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> SUMMARY: AddressSanitizer: heap-use-after-free zebra/zebra_nhg.c:1858 in zebra_nhg_decrement_ref
> Shadow bytes around the buggy address:
>   0x0c1c80013c60: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
>   0x0c1c80013c70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
>   0x0c1c80013c80: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
>   0x0c1c80013c90: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
>   0x0c1c80013ca0: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
> =>0x0c1c80013cb0:[fd]fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
>   0x0c1c80013cc0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>   0x0c1c80013cd0: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
>   0x0c1c80013ce0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
>   0x0c1c80013cf0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
>   0x0c1c80013d00: 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
>   Addressable:           00
>   Partially addressable: 01 02 03 04 05 06 07
>   Heap left redzone:       fa
>   Freed heap region:       fd
>   Stack left redzone:      f1
>   Stack mid redzone:       f2
>   Stack right redzone:     f3
>   Stack after return:      f5
>   Stack use after scope:   f8
>   Global redzone:          f9
>   Global init order:       f6
>   Poisoned by user:        f7
>   Container overflow:      fc
>   Array cookie:            ac
>   Intra object redzone:    bb
>   ASan internal:           fe
>   Left alloca redzone:     ca
>   Right alloca redzone:    cb
>   Shadow gap:              cc
>

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
louis-6wind pushed a commit to louis-6wind/frr that referenced this issue Oct 15, 2024
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
>     #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
>     #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
>     #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
>     #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
>     #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
>     #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
>     #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
>     #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
>     #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
>     #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
>     #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
>     #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
>     #12 0x7f26f275bae4 in route_table_free lib/table.c:111
>     #13 0x7f26f275b749 in route_table_finish lib/table.c:46
>     #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
>     #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
>     #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
>     #17 0x7f26f2777108 in event_call lib/event.c:2011
>     #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
>     #19 0x55910c4f49cb in main zebra/main.c:531
>     #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

This happens when FRR is using the kernel. During shutdown, zebra
tries to obtain the namespace identifier while preparing zebra
dataplane nexthop messages.

Fix this by accessing the ns structure.
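
As a rough illustration of this kind of shutdown-order bug, here is a minimal sketch. It is not the actual zebra code: every `demo_*` name and struct layout below is hypothetical. It assumes a long-lived namespace record that also carries the identifier, so a late teardown path can read the id from that record after the per-daemon data has been freed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real FRR definitions. */
struct demo_ns {                /* long-lived namespace record */
	uint32_t ns_id;
	void *info;             /* per-daemon data, may be freed first */
};

struct demo_zns {               /* per-daemon data hung off demo_ns */
	uint32_t ns_id;
	struct demo_ns *ns;
};

struct demo_ctx {               /* message context that needs the id */
	uint32_t ns_id;
};

/* Only valid while zns is still allocated. */
void demo_ctx_init_from_zns(struct demo_ctx *ctx, const struct demo_zns *zns)
{
	assert(ctx != NULL && zns != NULL);
	ctx->ns_id = zns->ns_id;
}

/* Reads the id from the record that outlives the per-daemon data. */
void demo_ctx_init_from_ns(struct demo_ctx *ctx, const struct demo_ns *ns)
{
	assert(ctx != NULL && ns != NULL);
	ctx->ns_id = ns->ns_id;
}

/*
 * Assumed shutdown order: the per-daemon data is released first, then a
 * late teardown step still needs the namespace id for a final message.
 */
uint32_t demo_shutdown(struct demo_ns *ns)
{
	struct demo_ctx ctx = { 0 };

	free(ns->info);
	ns->info = NULL;        /* no dangling pointer left behind */

	demo_ctx_init_from_ns(&ctx, ns);
	return ctx.ns_id;
}
```

In this sketch, `demo_shutdown()` returns the id without touching the freed block, whereas calling `demo_ctx_init_from_zns()` at that point would read freed memory.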

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
louis-6wind pushed a commit to louis-6wind/frr that referenced this issue Oct 16, 2024
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
>     #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
>     #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
>     #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
>     #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
>     #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
>     #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
>     #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
>     #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
>     #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
>     #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
>     #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
>     #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
>     #12 0x7f26f275bae4 in route_table_free lib/table.c:111
>     #13 0x7f26f275b749 in route_table_finish lib/table.c:46
>     #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
>     #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
>     #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
>     #17 0x7f26f2777108 in event_call lib/event.c:2011
>     #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
>     #19 0x55910c4f49cb in main zebra/main.c:531
>     #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

This happens when FRR is using the kernel. During shutdown, zebra
tries to obtain the namespace identifier while preparing zebra
dataplane nexthop messages.

Fix this by accessing the ns structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
mergify bot pushed a commit that referenced this issue Oct 16, 2024
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
>     #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
>     #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
>     #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
>     #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
>     #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
>     #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
>     #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
>     #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
>     #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
>     #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
>     #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
>     #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
>     #12 0x7f26f275bae4 in route_table_free lib/table.c:111
>     #13 0x7f26f275b749 in route_table_finish lib/table.c:46
>     #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
>     #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
>     #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
>     #17 0x7f26f2777108 in event_call lib/event.c:2011
>     #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
>     #19 0x55910c4f49cb in main zebra/main.c:531
>     #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

This happens when FRR is using the kernel. During shutdown, zebra
tries to obtain the namespace identifier while preparing zebra
dataplane nexthop messages.

Fix this by accessing the ns structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 7ae70eb)
mergify bot pushed a commit that referenced this issue Oct 16, 2024
The following ASAN issue has been observed:

> ERROR: AddressSanitizer: heap-use-after-free on address 0x6160000acba4 at pc 0x55910c5694d0 bp 0x7ffe3a8ac850 sp 0x7ffe3a8ac840
> READ of size 4 at 0x6160000acba4 thread T0
>     #0 0x55910c5694cf in ctx_info_from_zns zebra/zebra_dplane.c:3315
>     #1 0x55910c569696 in dplane_ctx_ns_init zebra/zebra_dplane.c:3331
>     #2 0x55910c56bf61 in dplane_ctx_nexthop_init zebra/zebra_dplane.c:3680
>     #3 0x55910c5711ca in dplane_nexthop_update_internal zebra/zebra_dplane.c:4490
>     #4 0x55910c571c5c in dplane_nexthop_delete zebra/zebra_dplane.c:4717
>     #5 0x55910c61e90e in zebra_nhg_uninstall_kernel zebra/zebra_nhg.c:3413
>     #6 0x55910c615d8a in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1919
>     #7 0x55910c6404db in route_entry_update_nhe zebra/zebra_rib.c:454
>     #8 0x55910c64c904 in rib_re_nhg_free zebra/zebra_rib.c:2822
>     #9 0x55910c655be2 in rib_unlink zebra/zebra_rib.c:4212
>     #10 0x55910c6430f9 in zebra_rtable_node_cleanup zebra/zebra_rib.c:968
>     #11 0x7f26f275b8a9 in route_node_free lib/table.c:75
>     #12 0x7f26f275bae4 in route_table_free lib/table.c:111
>     #13 0x7f26f275b749 in route_table_finish lib/table.c:46
>     #14 0x55910c65db17 in zebra_router_free_table zebra/zebra_router.c:191
>     #15 0x55910c65dfb5 in zebra_router_terminate zebra/zebra_router.c:244
>     #16 0x55910c4f40db in zebra_finalize zebra/main.c:249
>     #17 0x7f26f2777108 in event_call lib/event.c:2011
>     #18 0x7f26f264180e in frr_run lib/libfrr.c:1212
>     #19 0x55910c4f49cb in main zebra/main.c:531
>     #20 0x7f26f2029d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #21 0x7f26f2029e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #22 0x55910c4b0114 in _start (/usr/lib/frr/zebra+0x1ae114)

This happens when FRR is using the kernel. During shutdown, zebra
tries to obtain the namespace identifier while preparing zebra
dataplane nexthop messages.

Fix this by accessing the ns structure.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
(cherry picked from commit 7ae70eb)

# Conflicts:
#	zebra/main.c
#	zebra/zebra_ns.h
pguibert6WIND added a commit to pguibert6WIND/frr that referenced this issue Oct 21, 2024
When a failover happens on ECMP paths that share a recursively
resolved nexthop, zebra replaces the old NHG with a new one and
updates the pointer in every route that uses that nexthop.

If only the recursive nexthop changed, there is no need to
replace the old NHG. Modify the zebra_nhg_proto_add() function
to update the recursive nexthop on the original NHG instead.
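
The difference between the two approaches can be sketched with a simplified refcounted model. This is only an illustration under assumed names: `demo_nhg`, `demo_route` and the functions below are hypothetical and do not mirror the real zebra structures. Replacing the group repoints every route and drops the old reference, which may free the old group. Updating in place leaves pointers and reference counts untouched.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified model; not the real zebra data structures. */
struct demo_nhg {
	int refcnt;
	int resolved_via;       /* stand-in for the recursively resolved nexthop */
};

struct demo_route {
	struct demo_nhg *nhg;
};

/* Drop one reference; the group is freed when the last one goes away. */
void demo_nhg_unref(struct demo_nhg *nhg)
{
	assert(nhg != NULL && nhg->refcnt > 0);
	if (--nhg->refcnt == 0)
		free(nhg);
}

/*
 * Replace: allocate a new group and repoint every route to it.
 * The old group may be freed inside the loop, so it must not be
 * dereferenced again once its last reference is dropped.
 */
struct demo_nhg *demo_nhg_replace(struct demo_route *routes, size_t n,
				  int resolved_via)
{
	struct demo_nhg *new_nhg;

	if (n == 0)
		return NULL;
	new_nhg = calloc(1, sizeof(*new_nhg));
	if (new_nhg == NULL)
		return NULL;
	new_nhg->resolved_via = resolved_via;

	for (size_t i = 0; i < n; i++) {
		struct demo_nhg *old = routes[i].nhg;

		routes[i].nhg = new_nhg;
		new_nhg->refcnt++;
		demo_nhg_unref(old);
	}
	return new_nhg;
}

/* Update in place: routes keep their pointer, refcounts are unchanged. */
void demo_nhg_update_in_place(struct demo_nhg *nhg, int resolved_via)
{
	assert(nhg != NULL);
	nhg->resolved_via = resolved_via;
}
```

The in-place variant avoids both the allocation and the per-route pointer update, which is the motivation stated above. It only applies when nothing but the resolved nexthop differs.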

This replaces the old method, which allocated a new nhe. The
change triggers an ASAN error in the bgp_nhg_zapi_scalability
test, in the function
test_bgp_ipv4_simulate_r5_machine_going_down().

> r1: zebra triggered an exception by AddressSanitizer
> AddressSanitizer error in topotest `test_bgp_nhg_zapi_scalability.py`, test `teardown_module`, router `r1`
>
> ERROR: AddressSanitizer: heap-use-after-free on address 0x60e00230afa0 at pc 0x55bfebc9681e bp 0x7ffd657ceb40 sp 0x7ffd657ceb30
> READ of size 4 at 0x60e00230afa0 thread T0
>     #0 0x55bfebc9681d in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1855
>     #1 0x55bfebc967f7 in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1868
>     #2 0x55bfebcb32f6 in route_entry_update_nhe zebra/zebra_rib.c:460
>     #3 0x55bfebcb352f in rib_handle_nhg_replace zebra/zebra_rib.c:486
>     #4 0x55bfebc99c14 in zebra_nhg_proto_add zebra/zebra_nhg.c:3836
>     #5 0x55bfebcc4035 in process_subq_nhg zebra/zebra_rib.c:2763
>     #6 0x55bfebcc4035 in process_subq zebra/zebra_rib.c:3369
>     #7 0x55bfebcc4035 in meta_queue_process zebra/zebra_rib.c:3422
>     #8 0x7f458a518bff in work_queue_run lib/workqueue.c:282
>     #9 0x7f458a4fa24b in event_call lib/event.c:2019
>     #10 0x7f458a41f717 in frr_run lib/libfrr.c:1238
>     #11 0x55bfebb82cb4 in main zebra/main.c:528
>     #12 0x7f4589e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #13 0x7f4589e29e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #14 0x55bfebb85c34 in _start (/usr/lib/frr/zebra+0x1abc34)
>
> 0x60e00230afa0 is located 96 bytes inside of 160-byte region [0x60e00230af40,0x60e00230afe0)
> freed by thread T0 here:
>     #0 0x7f458a8b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
>     #1 0x55bfebc967f7 in zebra_nhg_decrement_ref zebra/zebra_nhg.c:1868
>     #2 0x55bfebcb32f6 in route_entry_update_nhe zebra/zebra_rib.c:460
>     #3 0x55bfebcb352f in rib_handle_nhg_replace zebra/zebra_rib.c:486
>     #4 0x55bfebc99c14 in zebra_nhg_proto_add zebra/zebra_nhg.c:3836
>     #5 0x55bfebcc4035 in process_subq_nhg zebra/zebra_rib.c:2763
>     #6 0x55bfebcc4035 in process_subq zebra/zebra_rib.c:3369
>     #7 0x55bfebcc4035 in meta_queue_process zebra/zebra_rib.c:3422
>     #8 0x7f458a518bff in work_queue_run lib/workqueue.c:282
>     #9 0x7f458a4fa24b in event_call lib/event.c:2019
>     #10 0x7f458a41f717 in frr_run lib/libfrr.c:1238
>     #11 0x55bfebb82cb4 in main zebra/main.c:528
>     #12 0x7f4589e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
>     #0 0x7f458a8b4a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
>     #1 0x7f458a43fb7e in qcalloc lib/memory.c:106
>     #2 0x55bfebc91f2e in zebra_nhg_alloc zebra/zebra_nhg.c:392
>     #3 0x55bfebc91f2e in zebra_nhe_copy zebra/zebra_nhg.c:499
>     #4 0x55bfebc922af in zebra_nhg_hash_alloc zebra/zebra_nhg.c:538
>     #5 0x7f458a3fd0bd in hash_get lib/hash.c:147
>     #6 0x55bfebc94d7a in zebra_nhe_find zebra/zebra_nhg.c:831
>     #7 0x55bfebc953ef in zebra_nhg_find zebra/zebra_nhg.c:1013
>     #8 0x55bfebc9585d in zebra_nhg_find_nexthop zebra/zebra_nhg.c:1030
>     #9 0x55bfebc94078 in depends_find_recursive zebra/zebra_nhg.c:1511
>     #10 0x55bfebc94078 in depends_find zebra/zebra_nhg.c:1560
>     #11 0x55bfebc94078 in depends_find_add zebra/zebra_nhg.c:1599
>     #12 0x55bfebc99e40 in zebra_nhg_update_nhe zebra/zebra_nhg.c:3732
>     #13 0x55bfebc99e40 in zebra_nhg_proto_add zebra/zebra_nhg.c:3819
>     #14 0x55bfebcc4035 in process_subq_nhg zebra/zebra_rib.c:2763
>     #15 0x55bfebcc4035 in process_subq zebra/zebra_rib.c:3369
>     #16 0x55bfebcc4035 in meta_queue_process zebra/zebra_rib.c:3422
>     #17 0x7f458a518bff in work_queue_run lib/workqueue.c:282
>     #18 0x7f458a4fa24b in event_call lib/event.c:2019
>     #19 0x7f458a41f717 in frr_run lib/libfrr.c:1238
>     #20 0x55bfebb82cb4 in main zebra/main.c:528
>     #21 0x7f4589e29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> SUMMARY: AddressSanitizer: heap-use-after-free zebra/zebra_nhg.c:1855 in zebra_nhg_decrement_ref
> Shadow bytes around the buggy address:
>   0x0c1c804595a0: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
>   0x0c1c804595b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>   0x0c1c804595c0: fd fd fd fa fa fa fa fa fa fa fa fa fd fd fd fd
>   0x0c1c804595d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
>   0x0c1c804595e0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
> =>0x0c1c804595f0: fd fd fd fd[fd]fd fd fd fd fd fd fd fa fa fa fa
>   0x0c1c80459600: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fd
>   0x0c1c80459610: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
>   0x0c1c80459620: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>   0x0c1c80459630: fd fd fd fa fa fa fa fa fa fa fa fa 00 00 00 00
>   0x0c1c80459640: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
> Shadow byte legend (one shadow byte represents 8 application bytes):
>   Addressable:           00
>   Partially addressable: 01 02 03 04 05 06 07
>   Heap left redzone:       fa
>   Freed heap region:       fd
>   Stack left redzone:      f1
>   Stack mid redzone:       f2
>   Stack right redzone:     f3
>   Stack after return:      f5
>   Stack use after scope:   f8
>   Global redzone:          f9
>   Global init order:       f6
>   Poisoned by user:        f7
>   Container overflow:      fc
>   Array cookie:            ac
>   Intra object redzone:    bb
>   ASan internal:           fe
>   Left alloca redzone:     ca
>   Right alloca redzone:    cb
>   Shadow gap:              cc
>

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
pguibert6WIND added a commit to pguibert6WIND/frr that referenced this issue Oct 30, 2024
The following ASAN error can be seen.

> ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned: 0x608000036c20
>     #0 0x7f3d7a4b5425 in __interceptor_malloc_usable_size ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:198
>     #1 0x7f3d7a426a16 in __sanitizer::BufferedStackTrace::Unwind(unsigned long, unsigned long, void*, bool, unsigned int) ../../../../src/libsanitizer/sanitizer_common/sanitizer_stacktrace.h:122
>     #2 0x7f3d7a426a16 in __asan::asan_malloc_usable_size(void const*, unsigned long, unsigned long) ../../../../src/libsanitizer/asan/asan_allocator.cpp:1074
>     #3 0x7f3d7a03f330 in mt_count_free lib/memory.c:78
>     #4 0x7f3d7a03f330 in qfree lib/memory.c:130
>     #5 0x7f3d76ccf89b in bmp_peer_status_changed bgpd/bgp_bmp.c:982
>     #6 0x560ae2aa6a94 in hook_call_peer_status_changed bgpd/bgp_fsm.c:47
>     #7 0x560ae2aa6a94 in bgp_fsm_change_status bgpd/bgp_fsm.c:1287
>     #8 0x560ae2c4f2e5 in peer_delete bgpd/bgpd.c:2777
>     #9 0x560ae2c58d24 in bgp_delete bgpd/bgpd.c:4140
>     #10 0x560ae2bbb47e in no_router_bgp bgpd/bgp_vty.c:1764
>     #11 0x7f3d79fb74ed in cmd_execute_command_real lib/command.c:1003
>     #12 0x7f3d79fb78a3 in cmd_execute_command lib/command.c:1062
>     #13 0x7f3d79fb7e03 in cmd_execute lib/command.c:1228
>     #14 0x7f3d7a107b53 in vty_command lib/vty.c:625
>     #15 0x7f3d7a109902 in vty_execute lib/vty.c:1388
>     #16 0x7f3d7a10cc32 in vtysh_read lib/vty.c:2400
>     #17 0x7f3d7a0f848b in event_call lib/event.c:2019
>     #18 0x7f3d7a01e627 in frr_run lib/libfrr.c:1232
>     #19 0x560ae29e0037 in main bgpd/bgp_main.c:555
>     #20 0x7f3d79a29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #21 0x7f3d79a29e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #22 0x560ae29e4ef4 in _start (/usr/lib/frr/bgpd+0x2eeef4)
>
> 0x608000036c20 is located 0 bytes inside of 81-byte region [0x608000036c20,0x608000036c71)
> freed by thread T0 here:
>     #0 0x7f3d7a4b4537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
>     #1 0x7f3d76ccf85f in bmp_peer_status_changed bgpd/bgp_bmp.c:981
>     #2 0x560ae2aa6a94 in hook_call_peer_status_changed bgpd/bgp_fsm.c:47
>     #3 0x560ae2aa6a94 in bgp_fsm_change_status bgpd/bgp_fsm.c:1287
>     #4 0x560ae2c4f2e5 in peer_delete bgpd/bgpd.c:2777
>     #5 0x560ae2c58d24 in bgp_delete bgpd/bgpd.c:4140
>     #6 0x560ae2bbb47e in no_router_bgp bgpd/bgp_vty.c:1764
>     #7 0x7f3d79fb74ed in cmd_execute_command_real lib/command.c:1003
>     #8 0x7f3d79fb78a3 in cmd_execute_command lib/command.c:1062
>     #9 0x7f3d79fb7e03 in cmd_execute lib/command.c:1228
>     #10 0x7f3d7a107b53 in vty_command lib/vty.c:625
>     #11 0x7f3d7a109902 in vty_execute lib/vty.c:1388
>     #12 0x7f3d7a10cc32 in vtysh_read lib/vty.c:2400
>     #13 0x7f3d7a0f848b in event_call lib/event.c:2019
>     #14 0x7f3d7a01e627 in frr_run lib/libfrr.c:1232
>     #15 0x560ae29e0037 in main bgpd/bgp_main.c:555
>     #16 0x7f3d79a29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>
> previously allocated by thread T0 here:
>     #0 0x7f3d7a4b4887 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
>     #1 0x7f3d7a03f0e9 in qmalloc lib/memory.c:101
>     #2 0x7f3d76cd0166 in bmp_bgp_peer_vrf bgpd/bgp_bmp.c:2194
>     #3 0x7f3d76cd0166 in bmp_bgp_update_vrf_status bgpd/bgp_bmp.c:2236
>     #4 0x7f3d76cd29b8 in bmp_vrf_state_changed bgpd/bgp_bmp.c:3479
>     #5 0x560ae2c45b34 in hook_call_bgp_instance_state bgpd/bgpd.c:88
>     #6 0x560ae2c4d158 in bgp_instance_up bgpd/bgpd.c:3936
>     #7 0x560ae29e5ed1 in bgp_vrf_enable bgpd/bgp_main.c:299
>     #8 0x7f3d7a0ff8b1 in vrf_enable lib/vrf.c:286
>     #9 0x7f3d7a0ff8b1 in vrf_enable lib/vrf.c:275
>     #10 0x7f3d7a12ab66 in zclient_vrf_add lib/zclient.c:2561
>     #11 0x7f3d7a12eb43 in zclient_read lib/zclient.c:4624
>     #12 0x7f3d7a0f848b in event_call lib/event.c:2019
>     #13 0x7f3d7a01e627 in frr_run lib/libfrr.c:1232
>     #14 0x560ae29e0037 in main bgpd/bgp_main.c:555
>     #15 0x7f3d79a29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>