BGP VNC is crashing in our Address Sanitizer tests #5025

Closed
donaldsharp opened this issue Sep 20, 2019 · 6 comments
Labels
triage Needs further investigation

Comments

@donaldsharp
Member

error 19-Sep-2019 14:14:22 r4: Daemon bgpd not running
error 19-Sep-2019 14:14:23
error 19-Sep-2019 14:14:23 From frr r4 bgpd log file:
error 19-Sep-2019 14:14:23 2019/09/19 14:13:54 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:13:54 BGP: vty[??]@# show bgp ipv4 vpn
error 19-Sep-2019 14:14:23 2019/09/19 14:13:54 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:13:54 BGP: vty[??]@# show bgp ipv4 vpn json
error 19-Sep-2019 14:14:23 2019/09/19 14:13:58 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:13:58 BGP: vty[??]@# show bgp vrf r4-cust1 ipv4 unicast
error 19-Sep-2019 14:14:23 2019/09/19 14:13:59 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:13:59 BGP: vty[??]@# show bgp vrf r4-cust1 ipv4 unicast json
error 19-Sep-2019 14:14:23 2019/09/19 14:14:00 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:14:00 BGP: vty[??]@# show bgp vrf r4-cust2 ipv4 unicast
error 19-Sep-2019 14:14:23 2019/09/19 14:14:01 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:14:01 BGP: vty[??]@# show bgp vrf r4-cust2 ipv4 unicast json
error 19-Sep-2019 14:14:23 2019/09/19 14:14:08 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:14:08 BGP: vty[??]@# show bgp vrf r4-cust1 ipv4 uni
error 19-Sep-2019 14:14:23 2019/09/19 14:14:09 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:14:09 BGP: vty[??]@# show bgp vrf r4-cust2 ipv4 uni
error 19-Sep-2019 14:14:23 2019/09/19 14:14:10 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:14:10 BGP: vty[??]@# show bgp ipv4 vpn
error 19-Sep-2019 14:14:23 2019/09/19 14:14:11 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23 2019/09/19 14:14:11 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:23
error 19-Sep-2019 14:14:23 r4: bgpd triggered an exception by AddressSanitizer
error 19-Sep-2019 14:14:23 ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdd0f26310 at pc 0x0000006844b1 bp 0x7ffdd0f23800 sp 0x7ffdd0f237f0
error 19-Sep-2019 14:14:23 READ of size 1 at 0x7ffdd0f26310 thread T0
error 19-Sep-2019 14:14:23 #0 0x6844b0 in prefix_cmp lib/prefix.c:776
error 19-Sep-2019 14:14:23 #1 0x5879a9 in rfapiItBiIndexSearch bgpd/rfapi/rfapi_import.c:2230
error 19-Sep-2019 14:14:23 #2 0x5879a9 in rfapiBgpInfoFilteredImportVPN bgpd/rfapi/rfapi_import.c:3520
error 19-Sep-2019 14:14:23 #3 0x58a894 in rfapiProcessWithdraw bgpd/rfapi/rfapi_import.c:4071
error 19-Sep-2019 14:14:23 #4 0x4c38ff in bgp_withdraw bgpd/bgp_route.c:3735
error 19-Sep-2019 14:14:23 #5 0x483662 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:237
error 19-Sep-2019 14:14:23 #6 0x497492 in bgp_nlri_parse bgpd/bgp_packet.c:315
error 19-Sep-2019 14:14:23 #7 0x49c5ad in bgp_update_receive bgpd/bgp_packet.c:1598
error 19-Sep-2019 14:14:23 #8 0x49c5ad in bgp_process_packet bgpd/bgp_packet.c:2274
error 19-Sep-2019 14:14:23 #9 0x6b8ba2 in thread_call lib/thread.c:1531
error 19-Sep-2019 14:14:23 #10 0x655d89 in frr_run lib/libfrr.c:1052
error 19-Sep-2019 14:14:23 #11 0x42ce88 in main bgpd/bgp_main.c:486
error 19-Sep-2019 14:14:23 #12 0x7f21a6ecb82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
error 19-Sep-2019 14:14:23 #13 0x42b8e8 in _start (/usr/lib/frr/bgpd+0x42b8e8)
error 19-Sep-2019 14:14:23
error 19-Sep-2019 14:14:23 Address 0x7ffdd0f26310 is located in stack of thread T0 at offset 240 in frame
error 19-Sep-2019 14:14:23 #0 0x482e85 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:103
error 19-Sep-2019 14:14:23
error 19-Sep-2019 14:14:23 This frame has 5 object(s):
error 19-Sep-2019 14:14:23 [32, 36) 'label'
error 19-Sep-2019 14:14:23 [96, 108) 'rd_as'
error 19-Sep-2019 14:14:23 [160, 172) 'rd_ip'
error 19-Sep-2019 14:14:23 [224, 240) 'prd' <== Memory access at offset 240 overflows this variable
error 19-Sep-2019 14:14:23 [288, 336) 'p'
error 19-Sep-2019 14:14:23 HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
error 19-Sep-2019 14:14:23 (longjmp and C++ exceptions are supported)
error 19-Sep-2019 14:14:23 SUMMARY: AddressSanitizer: stack-buffer-overflow lib/prefix.c:776 prefix_cmp
error 19-Sep-2019 14:14:23 Shadow bytes around the buggy address:
error 19-Sep-2019 14:14:23 0x10003a1dcc10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:23 0x10003a1dcc20: f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:23 0x10003a1dcc30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:23 0x10003a1dcc40: 00 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f2 f2 f2 f2
error 19-Sep-2019 14:14:23 0x10003a1dcc50: 00 04 f4 f4 f2 f2 f2 f2 00 04 f4 f4 f2 f2 f2 f2
error 19-Sep-2019 14:14:23 =>0x10003a1dcc60: 00 00[f4]f4 f2 f2 f2 f2 00 00 00 00 00 00 f4 f4
error 19-Sep-2019 14:14:23 0x10003a1dcc70: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:23 0x10003a1dcc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:23 0x10003a1dcc90: f1 f1 f1 f1 02 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4
error 19-Sep-2019 14:14:23 0x10003a1dcca0: f2 f2 f2 f2 00 00 f4 f4 f2 f2 f2 f2 00 00 00 00
error 19-Sep-2019 14:14:23 0x10003a1dccb0: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:23 Shadow byte legend (one shadow byte represents 8 application bytes):
error 19-Sep-2019 14:14:23 Addressable: 00
error 19-Sep-2019 14:14:23 Partially addressable: 01 02 03 04 05 06 07
error 19-Sep-2019 14:14:23 Heap left redzone: fa
error 19-Sep-2019 14:14:23 Heap right redzone: fb
error 19-Sep-2019 14:14:23 Freed heap region: fd
error 19-Sep-2019 14:14:23 Stack left redzone: f1
error 19-Sep-2019 14:14:23 Stack mid redzone: f2
error 19-Sep-2019 14:14:23 Stack right redzone: f3
error 19-Sep-2019 14:14:23 Stack partial redzone: f4
error 19-Sep-2019 14:14:23 Stack after return: f5
error 19-Sep-2019 14:14:23 Stack use after scope: f8
error 19-Sep-2019 14:14:23 Global redzone: f9
error 19-Sep-2019 14:14:23 Global init order: f6
error 19-Sep-2019 14:14:23 Poisoned by user: f7
error 19-Sep-2019 14:14:23 Container overflow: fc
error 19-Sep-2019 14:14:23 Array cookie: ac
error 19-Sep-2019 14:14:23 Intra object redzone: bb
error 19-Sep-2019 14:14:23 ASan internal: fe
error 19-Sep-2019 14:14:25 r3: Daemon bgpd not running
error 19-Sep-2019 14:14:25
error 19-Sep-2019 14:14:25 From frr r3 bgpd log file:
error 19-Sep-2019 14:14:25 2019/09/19 14:13:21 BGP: vty[??]@# show bgp summary
error 19-Sep-2019 14:14:25 2019/09/19 14:13:25 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:25 BGP: vty[??]@# show bgp vrf all summary
error 19-Sep-2019 14:14:25 2019/09/19 14:13:34 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:34 BGP: vty[??]@# show bgp vrf r3-cust1 ipv4 unicast
error 19-Sep-2019 14:14:25 2019/09/19 14:13:35 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:35 BGP: vty[??]@# show bgp vrf r3-cust1 ipv4 unicast json
error 19-Sep-2019 14:14:25 2019/09/19 14:13:41 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:41 BGP: vty[??]@# show bgp ipv4 uni
error 19-Sep-2019 14:14:25 2019/09/19 14:13:44 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:44 BGP: vty[??]@# show bgp ipv4 vpn
error 19-Sep-2019 14:14:25 2019/09/19 14:13:47 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:52 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:52 BGP: vty[??]@# show bgp ipv4 vpn
error 19-Sep-2019 14:14:25 2019/09/19 14:13:53 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:53 BGP: vty[??]@# show bgp ipv4 vpn json
error 19-Sep-2019 14:14:25 2019/09/19 14:13:57 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:57 BGP: vty[??]@# show bgp vrf r3-cust1 ipv4 unicast
error 19-Sep-2019 14:14:25 2019/09/19 14:13:58 BGP: vty[??]@> enable
error 19-Sep-2019 14:14:25 2019/09/19 14:13:58 BGP: vty[??]@# show bgp vrf r3-cust1 ipv4 unicast json
error 19-Sep-2019 14:14:25
error 19-Sep-2019 14:14:25 r3: bgpd triggered an exception by AddressSanitizer
error 19-Sep-2019 14:14:25 ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffe0e546610 at pc 0x0000006844b1 bp 0x7ffe0e543b00 sp 0x7ffe0e543af0
error 19-Sep-2019 14:14:25 READ of size 1 at 0x7ffe0e546610 thread T0
error 19-Sep-2019 14:14:25 #0 0x6844b0 in prefix_cmp lib/prefix.c:776
error 19-Sep-2019 14:14:25 #1 0x5879a9 in rfapiItBiIndexSearch bgpd/rfapi/rfapi_import.c:2230
error 19-Sep-2019 14:14:25 #2 0x5879a9 in rfapiBgpInfoFilteredImportVPN bgpd/rfapi/rfapi_import.c:3520
error 19-Sep-2019 14:14:25 #3 0x58a894 in rfapiProcessWithdraw bgpd/rfapi/rfapi_import.c:4071
error 19-Sep-2019 14:14:25 #4 0x4c38ff in bgp_withdraw bgpd/bgp_route.c:3735
error 19-Sep-2019 14:14:25 #5 0x483662 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:237
error 19-Sep-2019 14:14:25 #6 0x497492 in bgp_nlri_parse bgpd/bgp_packet.c:315
error 19-Sep-2019 14:14:25 #7 0x49c5ad in bgp_update_receive bgpd/bgp_packet.c:1598
error 19-Sep-2019 14:14:25 #8 0x49c5ad in bgp_process_packet bgpd/bgp_packet.c:2274
error 19-Sep-2019 14:14:25 #9 0x6b8ba2 in thread_call lib/thread.c:1531
error 19-Sep-2019 14:14:25 #10 0x655d89 in frr_run lib/libfrr.c:1052
error 19-Sep-2019 14:14:25 #11 0x42ce88 in main bgpd/bgp_main.c:486
error 19-Sep-2019 14:14:25 #12 0x7f305273f82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
error 19-Sep-2019 14:14:25 #13 0x42b8e8 in _start (/usr/lib/frr/bgpd+0x42b8e8)
error 19-Sep-2019 14:14:25
error 19-Sep-2019 14:14:25 Address 0x7ffe0e546610 is located in stack of thread T0 at offset 240 in frame
error 19-Sep-2019 14:14:25 #0 0x482e85 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:103
error 19-Sep-2019 14:14:25
error 19-Sep-2019 14:14:25 This frame has 5 object(s):
error 19-Sep-2019 14:14:25 [32, 36) 'label'
error 19-Sep-2019 14:14:25 [96, 108) 'rd_as'
error 19-Sep-2019 14:14:25 [160, 172) 'rd_ip'
error 19-Sep-2019 14:14:25 [224, 240) 'prd' <== Memory access at offset 240 overflows this variable
error 19-Sep-2019 14:14:25 [288, 336) 'p'
error 19-Sep-2019 14:14:25 HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
error 19-Sep-2019 14:14:25 (longjmp and C++ exceptions are supported)
error 19-Sep-2019 14:14:25 SUMMARY: AddressSanitizer: stack-buffer-overflow lib/prefix.c:776 prefix_cmp
error 19-Sep-2019 14:14:25 Shadow bytes around the buggy address:
error 19-Sep-2019 14:14:25 0x100041ca0c70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:25 0x100041ca0c80: f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:25 0x100041ca0c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:25 0x100041ca0ca0: 00 00 00 00 f1 f1 f1 f1 04 f4 f4 f4 f2 f2 f2 f2
error 19-Sep-2019 14:14:25 0x100041ca0cb0: 00 04 f4 f4 f2 f2 f2 f2 00 04 f4 f4 f2 f2 f2 f2
error 19-Sep-2019 14:14:25 =>0x100041ca0cc0: 00 00[f4]f4 f2 f2 f2 f2 00 00 00 00 00 00 f4 f4
error 19-Sep-2019 14:14:25 0x100041ca0cd0: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:25 0x100041ca0ce0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:25 0x100041ca0cf0: f1 f1 f1 f1 02 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4
error 19-Sep-2019 14:14:25 0x100041ca0d00: f2 f2 f2 f2 00 00 f4 f4 f2 f2 f2 f2 00 00 00 00
error 19-Sep-2019 14:14:25 0x100041ca0d10: f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00
error 19-Sep-2019 14:14:25 Shadow byte legend (one shadow byte represents 8 application bytes):
error 19-Sep-2019 14:14:25 Addressable: 00
error 19-Sep-2019 14:14:25 Partially addressable: 01 02 03 04 05 06 07
error 19-Sep-2019 14:14:25 Heap left redzone: fa
error 19-Sep-2019 14:14:25 Heap right redzone: fb
error 19-Sep-2019 14:14:25 Freed heap region: fd
error 19-Sep-2019 14:14:25 Stack left redzone: f1
error 19-Sep-2019 14:14:25 Stack mid redzone: f2
error 19-Sep-2019 14:14:25 Stack right redzone: f3
error 19-Sep-2019 14:14:25 Stack partial redzone: f4
error 19-Sep-2019 14:14:25 Stack after return: f5
error 19-Sep-2019 14:14:25 Stack use after scope: f8
error 19-Sep-2019 14:14:25 Global redzone: f9
error 19-Sep-2019 14:14:25 Global init order: f6
error 19-Sep-2019 14:14:25 Poisoned by user: f7
error 19-Sep-2019 14:14:25 Container overflow: fc
error 19-Sep-2019 14:14:25 Array cookie: ac
error 19-Sep-2019 14:14:25 Intra object redzone: bb
error 19-Sep-2019 14:14:25 ASan internal: fe
build 19-Sep-2019 14:14:28 bgp_instance_del_test/test_bgp_instance_del_test.py::test_memory_leak <- lib/ltemplate.py 2019-09-19 14:14:28,882 INFO: r4: Daemon bgpd not running - killed by AddressSanitizer
build 19-Sep-2019 14:14:28 r3: Daemon bgpd not running - killed by AddressSanitizer
build 19-Sep-2019 14:14:28
error 19-Sep-2019 14:14:28 2019-09-19 14:14:28,886 ERROR: assert failed at "bgp_instance_del_test.test_bgp_instance_del_test/test_memory_leak": r4: Daemon bgpd not running - killed by AddressSanitizer
error 19-Sep-2019 14:14:28 r3: Daemon bgpd not running - killed by AddressSanitizer
error 19-Sep-2019 14:14:28

Actual log file:
https://ci1.netdef.org/download/FRR-FRRPULLREQ-ASANTOPO/build_logs/FRR-FRRPULLREQ-ASANTOPO-8971.log

I've gone through and spot-checked a bunch of successful test runs, and I am consistently seeing this bgp crash. I will be opening an issue to address this in the topotests as well.
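One way to read the frame report above: ASan places 'prd' at frame offsets [224, 240), a 16-byte object, and flags a 1-byte read at offset 240, the first byte past its end. The sketch below reproduces that arithmetic under stated assumptions: `struct demo_rd` is a hypothetical 16-byte stand-in (a small header plus an 8-byte value), and the 64-bit prefix length is the value given in the commit message later in this thread. None of these names come from the FRR tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical 16-byte stand-in for the 'prd' stack object: a small
 * header followed by an 8-byte (64-bit) value. The field names and the
 * layout are assumptions made for illustration only.
 */
struct demo_rd {
	uint8_t family;
	uint16_t prefixlen;
	_Alignas(8) uint8_t val[8];
};

/* Offset inside the object of the byte indexed by prefixlen / 8. */
static size_t demo_trailing_byte_offset(unsigned int prefixlen)
{
	return offsetof(struct demo_rd, val) + prefixlen / 8;
}

/* Translate that object offset into a stack frame offset. */
static size_t demo_frame_offset(size_t object_start, unsigned int prefixlen)
{
	return object_start + demo_trailing_byte_offset(prefixlen);
}

/* True when the indexed byte lies outside the object. */
static int demo_reads_past_end(unsigned int prefixlen)
{
	return demo_trailing_byte_offset(prefixlen) >= sizeof(struct demo_rd);
}

/* Checks the numbers quoted from the ASan report. */
static int demo_layout_self_check(void)
{
	assert(sizeof(struct demo_rd) == 16);      /* [224, 240) is 16 bytes */
	assert(demo_frame_offset(224, 64) == 240); /* the flagged offset */
	assert(demo_reads_past_end(64));           /* one byte past 'prd' */
	assert(!demo_reads_past_end(56));          /* last byte still inside */
	return 1;
}
```

Under these assumptions, a 64-bit length indexes byte 8 of an 8-byte value, which is frame offset 240, exactly where the report says the read overflowed.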

@donaldsharp donaldsharp added the triage Needs further investigation label Sep 20, 2019
@louberger
Member

seen by running valgrind:

==7313== Conditional jump or move depends on uninitialised value(s)
==7313== at 0x181F9F: subgroup_announce_check (bgp_route.c:1555)
==7313== by 0x1A112B: subgroup_announce_table (bgp_updgrp_adv.c:641)
==7313== by 0x1A1340: subgroup_announce_route (bgp_updgrp_adv.c:704)
==7313== by 0x1A13E3: subgroup_coalesce_timer (bgp_updgrp_adv.c:331)
==7313== by 0x4EBA615: thread_call (thread.c:1531)
==7313== by 0x4E8AC37: frr_run (libfrr.c:1052)
==7313== by 0x1429E0: main (bgp_main.c:486)
==7313==
==7313== Conditional jump or move depends on uninitialised value(s)
==7313== at 0x201C0E: rfapi_vty_out_vncinfo (rfapi_vty.c:429)
==7313== by 0x18D0D6: route_vty_out (bgp_route.c:7481)
==7313== by 0x18DD76: bgp_show_table (bgp_route.c:9365)
==7313== by 0x1930C4: bgp_show_table_rd (bgp_route.c:9471)
==7313== by 0x1932A3: bgp_show (bgp_route.c:9510)
==7313== by 0x193E68: show_ip_bgp_json (bgp_route.c:10284)
==7313== by 0x4E6D024: cmd_execute_command_real.isra.2 (command.c:1072)
==7313== by 0x4E6F51E: cmd_execute_command (command.c:1131)
==7313== by 0x4E6F686: cmd_execute (command.c:1285)
==7313== by 0x4EBF9C4: vty_command (vty.c:516)
==7313== by 0x4EBFB9F: vty_execute (vty.c:1285)
==7313== by 0x4EC250F: vtysh_read (vty.c:2119)
==7313==

to run valgrind:
diff --git a/tests/topotests/lib/topotest.py b/tests/topotests/lib/topotest.py
index 9e1d34468..ca2b7e607 100644
--- a/tests/topotests/lib/topotest.py
+++ b/tests/topotests/lib/topotest.py
@@ -952,7 +952,7 @@ class Router(Node):
             if self.daemons[daemon] == 0 or daemon == 'zebra' or daemon == 'staticd':
                 continue
             daemon_path = os.path.join(self.daemondir, daemon)
-            self.cmd('{0} {1} > {2}.out 2> {2}.err &'.format(
+            self.cmd('valgrind '+ '{0} {1} > {2}.out 2> {2}.err &'.format(
                 daemon_path, self.daemons_options.get(daemon, ''), daemon
             ))
             self.waitOutput()

@louberger
Member

Looks like extra->label is not initialized; it is the same issue in both valgrind reports.

@donaldsharp
Member Author

Adding valgrind to topotests as suggested immediately above just makes all topotests unhappy. The simple patch above needs to be more complicated. :(

@donaldsharp
Member Author

https://ci1.netdef.org/browse/FRR-FRR-2414 -> This build does not appear to have the bgp crash

https://ci1.netdef.org/browse/FRR-FRR-2415 -> This build does appear to have the bgp crash

@donaldsharp
Member Author

sharpd@eva ~/frr> git log --oneline 506fc1a..a6ffcbd
a6ffcbd Merge pull request #4740 from opensourcerouting/omgwtfbbq
4937287 (origin/pr/4740) lib: fix prefix_copy() for clang-SA
4d5cf6b lib: fix misplaced brace in typesafe lists
9c3a217 lib: use some more transparent unions for prefixes
1315d74 lib: fix prefix_cmp() return values

@donaldsharp
Member Author

@eqvinox -> can you take a look at this?

donaldsharp added a commit to donaldsharp/frr that referenced this issue Oct 10, 2019
BGP code assumes that the extra data is zero'ed out.  Ensure that we
are not leaving any situation where the data on the stack is not
actually all 0's when we pass it around as a pointer later.

Please note in issue FRRouting#5025, Lou reported a different valgrind
issue, which is not the same issue:

==7313== Conditional jump or move depends on uninitialised value(s)
==7313== at 0x181F9F: subgroup_announce_check (bgp_route.c:1555)
==7313== by 0x1A112B: subgroup_announce_table (bgp_updgrp_adv.c:641)
==7313== by 0x1A1340: subgroup_announce_route (bgp_updgrp_adv.c:704)
==7313== by 0x1A13E3: subgroup_coalesce_timer (bgp_updgrp_adv.c:331)
==7313== by 0x4EBA615: thread_call (thread.c:1531)
==7313== by 0x4E8AC37: frr_run (libfrr.c:1052)
==7313== by 0x1429E0: main (bgp_main.c:486)
==7313==
==7313== Conditional jump or move depends on uninitialised value(s)
==7313== at 0x201C0E: rfapi_vty_out_vncinfo (rfapi_vty.c:429)
==7313== by 0x18D0D6: route_vty_out (bgp_route.c:7481)
==7313== by 0x18DD76: bgp_show_table (bgp_route.c:9365)
==7313== by 0x1930C4: bgp_show_table_rd (bgp_route.c:9471)
==7313== by 0x1932A3: bgp_show (bgp_route.c:9510)
==7313== by 0x193E68: show_ip_bgp_json (bgp_route.c:10284)
==7313== by 0x4E6D024: cmd_execute_command_real.isra.2 (command.c:1072)
==7313== by 0x4E6F51E: cmd_execute_command (command.c:1131)
==7313== by 0x4E6F686: cmd_execute (command.c:1285)
==7313== by 0x4EBF9C4: vty_command (vty.c:516)
==7313== by 0x4EBFB9F: vty_execute (vty.c:1285)
==7313== by 0x4EC250F: vtysh_read (vty.c:2119)
==7313==

that is causing the actual crash.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
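
The commit message above describes making sure stack data is zeroed before a pointer to it is handed to other code. A minimal sketch of that general pattern follows. It is not the actual change: `struct demo_path_extra` and the helper functions are hypothetical names invented for illustration, and the real FRR structures have a different layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for per-path "extra" data; not FRR's real layout. */
struct demo_path_extra {
	uint32_t label[2];
	uint32_t num_labels;
};

/* A consumer that branches on these fields relies on them being zeroed. */
static int demo_has_label(const struct demo_path_extra *extra)
{
	return extra->num_labels > 0 && extra->label[0] != 0;
}

/* Zero the whole stack object first, then fill in only what is known. */
static int demo_build_and_check(uint32_t label)
{
	struct demo_path_extra extra;

	memset(&extra, 0, sizeof(extra));
	if (label) {
		extra.label[0] = label;
		extra.num_labels = 1;
	}
	return demo_has_label(&extra);
}

/* With the memset in place the result depends only on the argument. */
static int demo_zero_init_self_check(void)
{
	assert(demo_build_and_check(0) == 0);
	assert(demo_build_and_check(16) == 1);
	return 1;
}
```

Without the `memset`, the fields that are never assigned would hold whatever was on the stack, which is the kind of read valgrind reports as "depends on uninitialised value(s)".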
@riw777 riw777 closed this as completed in dd5bab0 Oct 11, 2019
donaldsharp added a commit to donaldsharp/frr that referenced this issue Oct 15, 2019
donaldsharp added a commit to donaldsharp/frr that referenced this issue Oct 15, 2019
Our Address Sanitizer CI is finding this issue:
error	09-Oct-2019 19:28:33	r4: bgpd triggered an exception by AddressSanitizer
error	09-Oct-2019 19:28:33	ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdd425b060 at pc 0x00000068575f bp 0x7ffdd4258550 sp 0x7ffdd4258540
error	09-Oct-2019 19:28:33	READ of size 1 at 0x7ffdd425b060 thread T0
error	09-Oct-2019 19:28:33	    #0 0x68575e in prefix_cmp lib/prefix.c:776
error	09-Oct-2019 19:28:33	    #1 0x5889f5 in rfapiItBiIndexSearch bgpd/rfapi/rfapi_import.c:2230
error	09-Oct-2019 19:28:33	    #2 0x5889f5 in rfapiBgpInfoFilteredImportVPN bgpd/rfapi/rfapi_import.c:3520
error	09-Oct-2019 19:28:33	    #3 0x58b909 in rfapiProcessWithdraw bgpd/rfapi/rfapi_import.c:4071
error	09-Oct-2019 19:28:33	    #4 0x4c459b in bgp_withdraw bgpd/bgp_route.c:3736
error	09-Oct-2019 19:28:33	    #5 0x484122 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:237
error	09-Oct-2019 19:28:33	    #6 0x497f52 in bgp_nlri_parse bgpd/bgp_packet.c:315
error	09-Oct-2019 19:28:33	    #7 0x49d06d in bgp_update_receive bgpd/bgp_packet.c:1598
error	09-Oct-2019 19:28:33	    #8 0x49d06d in bgp_process_packet bgpd/bgp_packet.c:2274
error	09-Oct-2019 19:28:33	    #9 0x6b9f54 in thread_call lib/thread.c:1531
error	09-Oct-2019 19:28:33	    #10 0x657037 in frr_run lib/libfrr.c:1052
error	09-Oct-2019 19:28:33	    #11 0x42d268 in main bgpd/bgp_main.c:486
error	09-Oct-2019 19:28:33	    #12 0x7f806032482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
error	09-Oct-2019 19:28:33	    #13 0x42bcc8 in _start (/usr/lib/frr/bgpd+0x42bcc8)
error	09-Oct-2019 19:28:33
error	09-Oct-2019 19:28:33	Address 0x7ffdd425b060 is located in stack of thread T0 at offset 240 in frame
error	09-Oct-2019 19:28:33	    #0 0x483945 in bgp_nlri_parse_vpn bgpd/bgp_mplsvpn.c:103
error	09-Oct-2019 19:28:33
error	09-Oct-2019 19:28:33	  This frame has 5 object(s):
error	09-Oct-2019 19:28:33	    [32, 36) 'label'
error	09-Oct-2019 19:28:33	    [96, 108) 'rd_as'
error	09-Oct-2019 19:28:33	    [160, 172) 'rd_ip'
error	09-Oct-2019 19:28:33	    [224, 240) 'prd' <== Memory access at offset 240 overflows this variable
error	09-Oct-2019 19:28:33	    [288, 336) 'p'
error	09-Oct-2019 19:28:33	HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
error	09-Oct-2019 19:28:33	      (longjmp and C++ exceptions *are* supported)
error	09-Oct-2019 19:28:33	SUMMARY: AddressSanitizer: stack-buffer-overflow lib/prefix.c:776 prefix_cmp
error	09-Oct-2019 19:28:33	Shadow bytes around the buggy address:
error	09-Oct-2019 19:28:33	  0x10003a8435b0: 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00
error	09-Oct-2019 19:28:33	  0x10003a8435c0: 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3 f3
error	09-Oct-2019 19:28:33	  0x10003a8435d0: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error	09-Oct-2019 19:28:33	  0x10003a8435e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
error	09-Oct-2019 19:28:33	  0x10003a8435f0: f1 f1 04 f4 f4 f4 f2 f2 f2 f2 00 04 f4 f4 f2 f2
error	09-Oct-2019 19:28:33	=>0x10003a843600: f2 f2 00 04 f4 f4 f2 f2 f2 f2 00 00[f4]f4 f2 f2
error	09-Oct-2019 19:28:33	  0x10003a843610: f2 f2 00 00 00 00 00 00 f4 f4 f3 f3 f3 f3 00 00
error	09-Oct-2019 19:28:33	  0x10003a843620: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
error	09-Oct-2019 19:28:33	  0x10003a843630: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 02 f4
error	09-Oct-2019 19:28:33	  0x10003a843640: f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2 00 00
error	09-Oct-2019 19:28:33	  0x10003a843650: f4 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2 00 00
error	09-Oct-2019 19:28:33	Shadow byte legend (one shadow byte represents 8 application bytes):
error	09-Oct-2019 19:28:33	  Addressable:           00
error	09-Oct-2019 19:28:33	  Partially addressable: 01 02 03 04 05 06 07
error	09-Oct-2019 19:28:33	  Heap left redzone:       fa
error	09-Oct-2019 19:28:33	  Heap right redzone:      fb
error	09-Oct-2019 19:28:33	  Freed heap region:       fd
error	09-Oct-2019 19:28:33	  Stack left redzone:      f1
error	09-Oct-2019 19:28:33	  Stack mid redzone:       f2
error	09-Oct-2019 19:28:33	  Stack right redzone:     f3
error	09-Oct-2019 19:28:33	  Stack partial redzone:   f4
error	09-Oct-2019 19:28:33	  Stack after return:      f5
error	09-Oct-2019 19:28:33	  Stack use after scope:   f8
error	09-Oct-2019 19:28:33	  Global redzone:          f9
error	09-Oct-2019 19:28:33	  Global init order:       f6
error	09-Oct-2019 19:28:33	  Poisoned by user:        f7
error	09-Oct-2019 19:28:33	  Container overflow:      fc
error	09-Oct-2019 19:28:33	  Array cookie:            ac
error	09-Oct-2019 19:28:33	  Intra object redzone:    bb
error	09-Oct-2019 19:28:33	  ASan internal:           fe
error	09-Oct-2019 19:28:36	r3: Daemon bgpd not running

This is the result of this code pattern in rfapi/rfapi_import.c:

prefix_cmp((struct prefix *)&bpi_result->extra->vnc.import.rd,
	   (struct prefix *)prd))

Effectively, prd and vnc.import.rd are `struct prefix_rd` values that
are being typecast to a `struct prefix`.  Not a big deal, except that
commit 1315d74 modified the prefix_cmp
function to allow for a sorted prefix_cmp.  In prefix_cmp
we were looking at the offset and shift.  In the case
of vnc we were passing a prefix length of 64, which is the exact length
of the remaining data structure for struct prefix_rd.  So we calculated
an offset of 8 and a shift of 0.  The data structures for the prefix
portion happened to be equal to 64 bits of data.  So we checked that
with the memcmp, got a 0, and promptly read off the end of the data
structure for the numcmp.  The fix is that if shift is 0, the memcmp
has already checked everything and there is nothing left to do.

Please note: We will still crash if we set the prefixlen greater than
~312 bits currently (i.e. if the prefixlen specifies a bit length
longer than the prefix length).  I do not think there is
anything to do here (nor am I sure how to correct this either),
as we are going to have some severe problems when we muck
up the prefixlen.

Fixes: FRRouting#5025
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
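
The offset/shift reasoning in the commit message above can be sketched as a small standalone comparison routine. This is an illustration under stated assumptions, not the code from lib/prefix.c: `demo_prefix_bits_cmp`, `demo_numcmp` and `demo_maskbit` are hypothetical names, and the function works on raw byte buffers rather than on `struct prefix`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BITS_PER_BYTE 8

/* Mask selecting the top n bits of a byte, indexed by n = 0..8. */
static const uint8_t demo_maskbit[] = {0x00, 0x80, 0xc0, 0xe0, 0xf0,
				       0xf8, 0xfc, 0xfe, 0xff};

/* Three-way compare that returns -1, 0 or 1. */
static int demo_numcmp(uint8_t a, uint8_t b)
{
	return (a > b) - (a < b);
}

/*
 * Compare the first `prefixlen` bits of two byte buffers. Whole bytes
 * go through memcmp(); a trailing partial byte is masked and compared
 * separately, but only when one exists.
 */
static int demo_prefix_bits_cmp(const uint8_t *pp1, const uint8_t *pp2,
				unsigned int prefixlen)
{
	unsigned int offset = prefixlen / DEMO_BITS_PER_BYTE;
	unsigned int shift = prefixlen % DEMO_BITS_PER_BYTE;
	int i = memcmp(pp1, pp2, offset);

	if (i)
		return (i > 0) - (i < 0);

	/*
	 * shift == 0 means memcmp() already covered every bit. Reading
	 * pp1[offset] here would touch one byte past an 8-byte buffer
	 * when prefixlen is 64.
	 */
	if (shift)
		return demo_numcmp(pp1[offset] & demo_maskbit[shift],
				   pp2[offset] & demo_maskbit[shift]);

	return 0;
}

/* 64-bit values held in exactly 8 bytes of storage. */
static int demo_cmp_self_check(void)
{
	const uint8_t rd_a[8] = {0, 1, 0, 0, 0, 100, 0, 1};
	const uint8_t rd_b[8] = {0, 1, 0, 0, 0, 100, 0, 1};
	const uint8_t rd_c[8] = {0, 1, 0, 0, 0, 100, 0, 2};

	assert(demo_prefix_bits_cmp(rd_a, rd_b, 64) == 0); /* equal, no overread */
	assert(demo_prefix_bits_cmp(rd_a, rd_c, 64) < 0);  /* last byte differs */
	assert(demo_prefix_bits_cmp(rd_a, rd_c, 56) == 0); /* last byte ignored */
	assert(demo_prefix_bits_cmp(rd_a, rd_c, 60) == 0); /* top 4 bits match */
	return 1;
}
```

The guard on `shift` is the whole point of the sketch: with a length that is a multiple of 8, every bit has been compared by `memcmp()`, so indexing the next byte is both unnecessary and, for a buffer that ends there, out of bounds. As the commit message notes, a length larger than the underlying storage is still not defended against.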
SumitAgarwal123 pushed a commit to SumitAgarwal123/frr that referenced this issue Nov 19, 2019
SumitAgarwal123 pushed a commit to SumitAgarwal123/frr that referenced this issue Nov 19, 2019