ND: Lost of Global IPV6 on node after sending lot of UDP frame from BR #5790
Comments
I tried again to send a lot of data in a row (20 packets/sec) |
Directly after? |
The node starts and gets its global IPv6 address: great. In the middle, I've got NDP receiving SL2A:
And you can see that the node misses some lines when NDP is talking |
what are those "lines"? |
Some UDP testing lines ("hello world") |
Ah okay, thanks. I will consider this for my tests when testing the new ND! |
This mainly shows that if many messages arrive at the same time on a node, it can lose its global IPv6 address. Do you know why or how packets 116 to 121 are not received? |
If I do the same test with the local IPv6 address, it will lose some lines when ND runs, but it will continue to the end because the local IPv6 address is not lost (unlike the global IPv6 address) |
I have multiple suspicions.
Does any of this sound like something you observed? |
I can test, but I need to know where to debug. |
Try the following:
Check with a sniffer
Add
Call
Call |
With IPV6 debug, I added the number of lost lines since the start of sending packets:
|
I added the following:
gnrc_pktbuf_stats() is called for every received packet
|
I'm a little bit confused whether this is the sender or the receiver now. Also, come to think of it
Seems like none of my suspicions are true (if this is the sender) |
This is the receiver, I only gave receiver output. |
I have 2 questions:
I think I still don't understand your setup :(, but maybe the issue is a duplicate of #5122?
This only helps on the sender side. |
Ok I've got a BR + one node.
Because each sent line has a number in the payload :-)
Good question. The only thing I see is that when ND runs (SL2A solicitation), the receiver misses some lines. And if the packets are sent to the global IPv6 address of the node, it stops receiving at some point, as shown earlier. And no, it is not related to #5122 IMO, because I don't need to wait 15-20 min to lose the global IPv6 address; I only need to send a lot of UDP packets. Is it clearer? |
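Since every sent line carries its number in the payload, the lines the receiver missed can be listed from a capture of the receiver output. A minimal sketch, assuming each received line contains its sequence number as the first integer in the text (the function name and this line format are hypothetical, not taken from the test application):

```python
import re


def missing_sequence_numbers(received_lines, first, last):
    """Return the sequence numbers in [first, last] that never arrived.

    Assumption: the first integer found in a line is its sequence number.
    Lines without any integer (e.g. unrelated debug output) are ignored.
    """
    seen = set()
    for line in received_lines:
        match = re.search(r"\d+", line)
        if match:
            seen.add(int(match.group()))
    return [n for n in range(first, last + 1) if n not in seen]
```

Called on the lines around an ND exchange, this would report a gap such as 116 to 121 directly instead of reading it off the log by eye.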
Yes. Can you do the same debugging on the BR? Or isn't that a RIOT node? Another theory I have is that the missing MAC might make you lose packets (both UDP and NDP) and that the issue stems from that. |
Ok, I got debug output for the BR and the node. The BR tries to send 1000 lines (with payload = "[Num Line] hello"). I kept only the interesting part of the debug output. node:
BR:
|
What do you mean? When does the node miss the MAC? After three solicitations, the BR removes the node's global IPv6 address from the ncache? |
@biboc Do you have the same issue with debugging disabled? On the samr21, output over serial is very expensive. |
There is no MAC protocol implemented in RIOT yet. That's what I meant with missing MAC, but maybe the answer lies in @kaspar030's remark ;-). |
Yes, same problem if ND debug is disabled |
@miri64 Does anything seem weird in the debug log I gave? |
@kaspar030, see, I removed most of the debug (I show only if line%100 == 0)
I don't receive anything afterwards |
@biboc can you check if #5122 (comment) fixes this? This would at least be a quick fix for the release. |
RIOT-2016.10 - Release Notes ============================ RIOT is a real-time multi-threading operating system that supports a range of devices that are typically found in the Internet of Things: 8-bit microcontrollers, 16-bit microcontrollers and light-weight 32-bit processors. RIOT is based on the following design principles: energy-efficiency, real-time capabilities, small memory footprint, modularity, and uniform API access, independent of the underlying hardware (this API offers partial POSIX compliance). RIOT is developed by an international open source community which is independent of specific vendors (e.g. similarly to the Linux community) and is licensed with a non-viral copyleft license (LGPLv2.1), which allows indirect business models around the free open-source software platform provided by RIOT. About this release: =================== This release provides a lot of new features as well as it fixes several major bugs. Among these new features are the new simplified network socket API called sock, the GNRC specific CoAP implementation gcoap and several new packages: TinyDTLS, the Aversive++ microcontroller library for robotics, the u8g2 graphic library, and nanocoap. Using the new sock API an implementation of the Simple Time Network Protocol (SNTP) was also introduced, allowing for time synchronization between nodes. New platforms include the Arduino Uno, the Arduino Duemilanove, the Arduino Zero, SODAQ Autonomo, and the Zolertia remote (rev. B). The most significant bug fix was done in native which led to a significantly more robust handling of ISRs and now allows for at least 1,000 native instances running stably on one machine. About 263 pull requests with about 398 commits have been merged since the last release and about 42 issues have been solved. 37 people contributed with code in 100 days. 1006 files have been touched with 166500 insertions and 26926 deletions. 
Notations used below: ===================== + means new feature/item * means modified feature/item - means removed feature/item New features and changes ======================== General ------- * Verbose behavior for assert() macro Core ---- + MPU support for Cortex-M API changes ----------- + Socket-like sock API (replacing conn) * netdev2: Add Testmodes and CCA modes * IEEE 802.15.4: clean-up Intra-PAN behavior * IEEE 802.15.4: centralize default values * gnrc_pktbuf: allow for 0-sized snips + gnrc_netapi: mbox and arbitrary callback support System libraries ---------------- No new features or changes Networking ---------- + Provide sock-port for GNRC + gcoap: a GNRC-based CoAP implementation + Simple Network Time Protocol (RFC 5905, section 14) + Priority Queue for packet snips + IPv4 header definitions Packages -------- + nanocoap: CoAP header parser/builder + TinyDTLS: DTLS library + tiny-asn1: asn.1/der decoder + Aversive++ microcontroller programming library + u8g2 graphic library Platforms --------- + Support for stm32f2xx MCU family + Low power modes for samd21 CPUs + More Arduino-based platforms: + Arduino Uno + Arduino Duemilanove + Arduino Zero + More boards of ST's Nucleo platforms: + ST Nucleo F030 board support + ST Nucleo F070 board support + ST Nucleo F446 board support + SODAQ Automono + Zolertia remote rev. 
B Drivers ------- + W5100 Ethernet device + Atmel IO1 Xplained extension + LPD8808 LED strips * at86rf2xx: provide capability to access the RND_VALUE random value register Build System ------------ + static-tests build target for easy local execution of CI's static tests Other ----- + Provide Arduino API to Nucleo boards + Packer configuration file to build vagrant boxes + CC2650STK Debugger Support + ethos: add Ethos over TCP support Fixed Issues from the last release ================================== RIOT-OS#534: native debugging on osx fails RIOT-OS#2071: native: *long* overdue fixes RIOT-OS#3341: netdev2_tap crashes when hammered RIOT-OS#5007: gnrc icmpv6: Ping reply goes out the wrong interface RIOT-OS#5432: native: valgrind fails Known Issues ============ Networking related issues ------------------------- RIOT-OS#3075: nhdp: unnecessary microsecond precision: NHDP works with timer values of microsecond precision which is not required. Changing to lower precision would save some memory. RIOT-OS#4048: potential racey memory leak: According to the packet buffer stats, flood-pinging a multicast destination may lead to a memory leak due to a race condition. However, it seems to be a rare case and a completely filled up packet buffer was not observed. RIOT-OS#4388: POSIX sockets: open socket is bound to a specific thread: This was an inherit problem of the conn API under GNRC. Since the POSIX sockets are still based on conn for this release, this issue persists RIOT-OS#4527: gnrc_ipv6: Multicast is not forwarded if routing node listens to the address (might still be fixable for release, see RIOT-OS#5729, RIOT-OS#5230: gnrc ipv6: multicast packets are not dispatched to the upper layers) RIOT-OS#5016: gnrc_rpl: Rejoining RPL instance as root after reboot messes up routing RIOT-OS#5055: cpuid: multiple radios will get same EUI-64 Nodes with multiple interfaces might get the same EUI-64 for them since they are generated from the same CPU ID. 
RIOT-OS#5656: Possible Weakness with locking in the GNRC network stack: For some operations mutexes to the network interfaces need to get unlocked in the current implementation to not get deadlocked. Recursive mutexes as provided in RIOT-OS#5731 might help to solve this problem. RIOT-OS#5748: gnrc: nodes crashing with too small packet buffer: A packet buffer of size ~512 B might lead to crashes. The issue describes this for several hundret nodes, but agressive flooding with just two nodes was also shown to lead to this problem. RIOT-OS#5858: gnrc: 6lo: potential problem with reassembly of fragments: If one frame gets lost the reassembly state machine might get out of sync ### NDP is not working properly RIOT-OS#4499: handle of l2src_len in gnrc_ndp_rtr_sol_handle: Reception of a router solicitation might lead to invalid zero-length link-layer addresses in neighbor cache. RIOT-OS#5005: ndp: router advertisement sent with global address: Under some circumstances a router might send RAs with GUAs. While they are ignored on receive (as RFC 4861 specifies), RAs should have link-local addresses and not even be send out this way. RIOT-OS#5122: NDP: global unicast address on non-6LBR nodes disappears after a while: Several issues (also see RIOT-OS#5760) lead to a global unicast address effectively being banned from the network (disappears from neighbor cache, is not added again) RIOT-OS#5467: ipv6 address vanishes when ARO (wrongly) indicates DUP caused by outdated ncache at router RIOT-OS#5539: Border Router: packet not forwarded from ethos to interface 6 RIOT-OS#5790: ND: Lost of Global IPV6 on node after sending lot of UDP frame from BR Timer related issues -------------------- RIOT-OS#4841: xtimer: timer already in the list: Under some conditions an xtimer can end up twice in the internal list of the xtimer module RIOT-OS#4902: xtimer: xtimer_set: xtimer_set does not handle integer overflows well RIOT-OS#5338: xtimer: xtimer_now() not ISR safe for non-32-bit platforms. 
RIOT-OS#5928: xtimer: usage in board_init() crashes: some boards use the xtimer in there board_init() function. The xtimer is however first initialized in the auto_init module which is executed after board_init() RIOT-OS#6052: tests: xtimer_drift gets stuck: xtimer_drift application freezes after ~30-200 seconds native related issues --------------------- RIOT-OS#495: native not float safe: When the FPU is used when an asynchronous context switch occurs, either the stack gets corrupted or a floating point exception occurs. RIOT-OS#2175: ubjson: valgind registers "Invalid write of size 4" in unittests RIOT-OS#4590: pkg: building relic with clang fails. RIOT-OS#5796: native: tlsf: early malloc will lead to a crash: TLSF needs pools to be initialized (which is currently expected to be done in an application). If a malloc is needed before an application's main started (e.g. driver initialization) the node can crash, since no pool is allocated yet. other platform related issues ----------------------------- RIOT-OS#1891: newlib-nano: Printf formatting does not work properly for some numberic types: PRI[uxdi]64, PRI[uxdi]8 and float are not parsed in newlib-nano RIOT-OS#2006: cpu/nrf51822: timer callback may be fired too early RIOT-OS#2143: unittests: tests-core doesn't compile for all platforms: GCC build-ins were used in the unittests which are not available with msp430-gcc RIOT-OS#2300: qemu unittest fails because of a page fault RIOT-OS#4512: pkg: tests: RELIC unittests fail on iotlab-m3 RIOT-OS#4522: avsextrem: linker sometimes doesn't find `bl_init_clks()` RIOT-OS#4560: make: clang is more pedantic than gcc oonf_api is not building with clang. 
(Partly solved by RIOT-OS#4593) RIOT-OS#4694: drivers/lm75a: does not build RIOT-OS#4737: cortex-m: Hard fault after a thread exits (under some circumstances) RIOT-OS#4822: kw2xrf: packet loss when packets get fragmented RIOT-OS#4876: at86rf2xx: Simultaneous use of different transceiver types is not supported RIOT-OS#4954: chronos: compiling with -O0 breaks RIOT-OS#4866: not all GPIO driver implementations are thread safe: Due to non-atomic operations in the drivers some pin configurations might get lost. RIOT-OS#5009: RIOT is saw-toothing in energy consumption (even when idling) RIOT-OS#5103: xtimer: weird behavior of tests/xtimer_drift: xtimer_drift randomly jumps a few seconds on nrf52 RIOT-OS#5361: cpu/cc26x0: timer broken RIOT-OS#5405: Eratic timings on iotlab-m3 with compression context activated RIOT-OS#5460: cpu/samd21: i2c timing with compiler optimization RIOT-OS#5486: at86rf2xx: lost interrupts RIOT-OS#5489: cpu/lpc11u34: ADC broken RIOT-OS#5603: atmega boards second UART issue RIOT-OS#5678: at86rf2xx: failed assertion in _isr RIOT-OS#5719: cc2538: rf driver doesn't handle large packets RIOT-OS#5799: kw2x: 15.4 duplicate transmits RIOT-OS#5944: msp430: ipv6_hdr unittests fail RIOT-OS#5848: arduino: Race condition in sys/arduino/Makefile.include RIOT-OS#5954: nRF52 uart_write get stuck RIOT-OS#6018: nRF52 gnrc 6lowpan ble memory leak other issues ------------ RIOT-OS#1263: TLSF implementation contains (a) read-before-write error(s). RIOT-OS#3256: make: Setting constants on compile time doesn't really set them everywhere RIOT-OS#3366: periph/i2c: handle NACK RIOT-OS#4488: Making the newlib thread-safe: When calling puts/printf after thread_create(), the CPU hangs for DMA enabled uart drivers. 
RIOT-OS#4866: periph: GPIO drivers are not thread safe RIOT-OS#5128: make: buildtest breaks when exporting FEATURES_PROVIDED var RIOT-OS#5207: make: buildest fails with board dependent application Makefiles RIOT-OS#5390: pkg: OpenWSN does not compile: This package still uses deprecated modules and was not tested for a long time. RIOT-OS#5520: tests/periph_uart not working RIOT-OS#5561: C++11 extensions in header files RIOT-OS#5776: make: Predefining CFLAGS are parsed weirdly RIOT-OS#5863: OSX + SAMR21-xpro: shell cannot handle command inputs larger than 64 chars RIOT-OS#5962: Makefile: UNDEF variable is not working as documented RIOT-OS#6022: pkg: build order issue Special Thanks ============== We like to give our special thanks to all the companies that provided us with their hardware for porting and testing, namely the people from (in alphabeticalorder): Atmel, Freescale, Imagination Technologies, Limifrog, Nordic, OpenMote, Phytec, SiLabs, UDOO,and Zolertia; and also companies that directly sponsored development time: Cisco Systems, Eistec, Ell-i, Enigeering Spirit, Nordic, FreshTemp LLC, OTAkeys and Phytec. More information ================ http://www.riot-os.org Mailing lists ------------- * RIOT OS kernel developers list devel@riot-os.org (http://lists.riot-os.org/mailman/listinfo/devel) * RIOT OS users list users@riot-os.org (http://lists.riot-os.org/mailman/listinfo/users) * RIOT commits commits@riot-os.org (http://lists.riot-os.org/mailman/listinfo/commits) * Github notifications notifications@riot-os.org (http://lists.riot-os.org/mailman/listinfo/notifications) IRC --- * Join the RIOT IRC channel at: irc.freenode.net, #riot-os License ======= * Most of the code developed by the RIOT community is licensed under the GNU Lesser General Public License (LGPL) version 2.1 as published by the Free Software Foundation. * Some external sources are published under a separate, LGPL compatible license (e.g. some files developed by SICS). 
All code files contain licensing information.
@biboc ping. |
I'm sorry @kYc0o, I won't be able to test this for now. |
Fixed by #7925 |
Great, I believe you! |
After pondering this a little bit, I'm not sure if this is actually a congestion issue (which can't be fixed with normal IPv6). Can you retry? |
Testing now with
while true; do
    sleep 0.1
    echo "Hello World!" | nc -u 2001:db8::7b62:3323:ec77:c86 1337 -w0
done
from a Linux host. |
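The loop above sends the same payload every time, so a dropped datagram cannot be identified on the receiving node. A minimal sketch of a numbered variant, assuming the same destination address and port as in the loop; the helper names and the "<number> Hello World!" payload layout are hypothetical:

```python
import socket
import time


def build_payload(seq):
    """Build one datagram: the sequence number, then the test text."""
    return f"{seq} Hello World!\n".encode("ascii")


def send_numbered(addr, port, count, interval=0.1):
    """Send `count` numbered UDP datagrams, one every `interval` seconds."""
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
        for seq in range(1, count + 1):
            sock.sendto(build_payload(seq), (addr, port))
            time.sleep(interval)


# Example call (address and port taken from the nc loop):
# send_numbered("2001:db8::7b62:3323:ec77:c86", 1337, count=1000)
```

With a 0.1 s interval this keeps the same pace as the `sleep 0.1` loop, roughly 10 datagrams per second.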
(I'll run it a bit longer, but sniffer already tells me, that the ND messages come through clear as day, both for link-local addresses and GUAs) |
Great if you solve it! |
Ran for over an hour now, and multiple updates to the neighbor cache and default router list happened on both the border router and the downstream router. I think we can deem this issue fixed. |
I'm using two SAMR21 boards, one running the BR (A) and the other (B) running
the gnrc_networking example.
I'm on the April release.
I switch on A (border router), then switch on B, and I'm able to send UDP
messages from Linux to the 2001:db8 address of B.
If I try to send a lot of data (every 100 ms), it works for a while and
then B does not receive messages anymore. After I saw it fail, I
stopped the loop sending UDP messages and checked on the BR: the 2001:db8
address of B was not there anymore. I can ping B with its fe80:: address on
iface 6 but can't with 2001:db8.
Why or how can a node be removed from the ncache? Why does the node not
automatically reconnect with the BR?
Answer from Martine:
Hi,
can you check if this is #5467?
As the implementer of the neighbor discovery I can say this: because
the neighbor discovery is sh** ;-). It was written in haste and has
a lot of design flaws. I'm working on a replacement (see
#5704, happy to hear your
thoughts ;-)), but sadly I have not found any time to put heavy
implementation efforts into this yet (hopefully in autumn though, so
we have something for the October release, fingers crossed).
Thanks for reporting and kind regards,
Martine