native: segfault on heavy network usage in transceiver #499

Closed
mehlis opened this issue Jan 14, 2014 · 12 comments
Labels
Platform: native (this PR/issue affects the native platform)
Type: bug (the issue reports a bug / the PR fixes a bug, including spelling errors)

Comments

@mehlis
Contributor

mehlis commented Jan 14, 2014

Sending more than 100 packets per second causes a segfault in RIOT.

In 9 of 10 cases it is:

Core was generated by `../projects/ccn-lite-client/bin/native/ccn-lite-client.elf grid5x5_b4 -t 4711 -'.
Program terminated with signal 11, Segmentation fault.
#0  0x08051cb1 in run () at transceiver.c:269
269                 response = send_packet(cmd->transceivers, cmd->data);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-20.fc19.i686
(gdb) bt
#0  0x08051cb1 in run () at transceiver.c:269
#1  0x4a121ceb in makecontext () from /lib/libc.so.6
#2  0x08095444 in __end_stack ()
#3  0x080c33b4 in transceiver_stack ()
#4  0x080c33b8 in transceiver_stack ()
#5  0x080c33bc in transceiver_stack ()
#6  0x080c33c0 in transceiver_stack ()
#7  0x08095444 in __end_stack ()
#8  0x080bf560 in msg_buffer ()
#9  0x00000000 in ?? ()

In 1 of 10 cases it is:

Core was generated by `../projects/ccn-lite-client/bin/native/ccn-lite-client.elf grid5x5_c4 -t 4724 -'.
Program terminated with signal 11, Segmentation fault.
#0  receive_packet (type=0, pos=0 '\000') at transceiver.c:395
395             trans_p = NULL;
(gdb) bt
#0  receive_packet (type=0, pos=0 '\000') at transceiver.c:395
#1  0x4a121ceb in ?? ()
#2  0x0809645c in relay_stack ()
#3  0x080c43d4 in theRelay ()
#4  0x080c43d8 in theRelay ()
#5  0x0809645c in relay_stack ()
#6  0x080c0578 in transceiver_stack ()
#7  0x00000000 in ?? ()

This looks strange, because receive_packet called with type=0 indicates a packet from the CC1100 transceiver. So this might be a corrupted stack.
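
One way to check the corrupted-stack suspicion is to place a guard pattern at the low end of a thread's stack and verify it after the thread has run. The sketch below does this for a single context created with makecontext(), which also shows up in the backtraces above. All demo_* names and the sizes are hypothetical and not taken from RIOT; it assumes a Linux/glibc host where the stack grows downward.

```c
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

#define DEMO_STACK_SIZE (64 * 1024) /* hypothetical stack size */
#define DEMO_GUARD_LEN  64          /* bytes of guard pattern at the stack bottom */
#define DEMO_CANARY     0xA5        /* arbitrary fill byte */

static char demo_stack[DEMO_STACK_SIZE];
static ucontext_t demo_main_ctx, demo_thread_ctx;
static volatile int demo_ran;

/* Stand-in for a thread body; returning resumes uc_link (the main context). */
static void demo_thread(void)
{
    demo_ran = 1;
}

/* Fill the lowest bytes of a stack buffer with the guard pattern. */
void demo_guard_init(char *stack)
{
    memset(stack, DEMO_CANARY, DEMO_GUARD_LEN);
}

/* Return 1 if the guard pattern is untouched, 0 if any byte was overwritten. */
int demo_guard_intact(const char *stack)
{
    for (size_t i = 0; i < DEMO_GUARD_LEN; i++) {
        if ((unsigned char)stack[i] != DEMO_CANARY) {
            return 0;
        }
    }
    return 1;
}

/* Run demo_thread once on demo_stack; return guard status, or -1 on error. */
int demo_run_once(void)
{
    demo_guard_init(demo_stack);

    if (getcontext(&demo_thread_ctx) == -1) {
        return -1;
    }
    demo_thread_ctx.uc_stack.ss_sp = demo_stack;
    demo_thread_ctx.uc_stack.ss_size = sizeof(demo_stack);
    demo_thread_ctx.uc_link = &demo_main_ctx;
    makecontext(&demo_thread_ctx, demo_thread, 0);

    /* Switch to the thread; control comes back here when it returns. */
    if (swapcontext(&demo_main_ctx, &demo_thread_ctx) == -1) {
        return -1;
    }
    return demo_guard_intact(demo_stack);
}
```

If demo_run_once() returns 0, something wrote past the bottom of the stack; trying a larger stack for the affected thread would be a reasonable next step.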

@LudwigKnuepfer
Member

Another possible effect, seen under valgrind: the process becomes unresponsive.

@LudwigKnuepfer
Member

Reportedly, freezes are possible without a debugging environment as well. (That was unrelated.)

@mehlis
Contributor Author

mehlis commented Feb 2, 2014

../RIOT/examples/ccn-lite-client/bin/native/ccn-lite-client.elf: _native_lpm_sleep: select(): Resource temporarily unavailable
../RIOT/examples/ccn-lite-client/bin/native/ccn-lite-client.elf: XXX: this should not have happened!

Core was generated by `../RIOT/examples/ccn-lite-client/bin/native/ccn-lite-client.elf grid5x5-ccn_e4'.
Program terminated with signal 11, Segmentation fault.
#0  0x080c1a6e in transceiver_stack ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-20.fc19.i686
(gdb) bt
#0  0x080c1a6e in transceiver_stack ()
#1  0x4a121c4b in setcontext () from /lib/libc.so.6
#2  0x0804bd43 in isr_cpu_switch_context_exit () at native_cpu.c:127
#3  0x0804be0a in cpu_switch_context_exit () at native_cpu.c:148
#4  0x0804b47f in native_irq_handler () at irq_cpu.c:281
#5  0x4a121ceb in makecontext () from /lib/libc.so.6
#6  0x00000000 in ?? ()

@mehlis
Contributor Author

mehlis commented Feb 2, 2014

Core was generated by `../RIOT/examples/ccn-lite-client/bin/native/ccn-lite-client.elf grid5x5-ccn_c3'.
Program terminated with signal 5, Trace/breakpoint trap.
#0  _native_sig_leave_tramp () at tramp.S:48
48      jmp *-4(%esp)
Missing separate debuginfos, use: debuginfo-install glibc-2.17-20.fc19.i686
(gdb) bt
#0  _native_sig_leave_tramp () at tramp.S:48
#1  0x080677cc in idle_stack ()
#2  0x4a121c4b in setcontext () from /lib/libc.so.6
#3  0x0804be52 in isr_thread_yield () at native_cpu.c:163
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

@mehlis
Contributor Author

mehlis commented Feb 2, 2014

../RIOT/examples/ccn-lite-client/bin/native/ccn-lite-client.elf: _native_lpm_sleep: select(): Resource temporarily unavailable
../RIOT/examples/ccn-lite-client/bin/native/ccn-lite-client.elf: XXX: this should not have happened!

@LudwigKnuepfer
Member

Does this only happen when you use the LTC?

@LudwigKnuepfer
Member

@mehlis If so, it is probably related to #495.

@mehlis
Contributor Author

mehlis commented Apr 24, 2014

I haven't used the LTC in months.

LudwigKnuepfer removed this from the Release 2014.05 milestone Apr 28, 2014
OlegHahm added this to the FIX ME FIRST milestone Jun 3, 2014

@OlegHahm
Member

OlegHahm commented Aug 1, 2014

Is it possible to create a test case?

@LudwigKnuepfer
Member

It's nondeterministic, but somewhat reproducible with https://github.com/LudwigOrtmann/riot-tools/tree/master/l2perf by sending packets really fast.

@cgundogan
Member

This issue does not appear with the new network stack: pings can be sent at high throughput without crashing native (in the normal case). @OlegHahm, what's your opinion on this? Should we close this?

@OlegHahm
Member

OlegHahm commented Jan 6, 2016

Yeah, let's close this. Even if there are problems with native and heavy network usage, this issue is about the transceiver module, which has vanished.

OlegHahm closed this as completed Jan 6, 2016