
Multi-Hop ping6 broken #3597

Closed
cgundogan opened this issue Aug 10, 2015 · 36 comments
@cgundogan
Member

Given the following scenario on the IoT-Lab testbed:

Node 1 is configured with the IPv6 addresses abcd::1 and fe80::1.
Node 2 is configured with the IPv6 addresses abcd::2 and fe80::2.
Node 3 is configured with the IPv6 addresses abcd::3 and fe80::3.

Furthermore, the following routes are configured manually in the fib:
Node 1:

fibroute
abcd::2 via fe80::2
fe80::2 via fe80::2
abcd::3 via fe80::3
fe80::3 via fe80::3

Node 2:

fibroute
:: via fe80::1

Node 3:

fibroute
:: via fe80::1

Then:
ping6 abcd::1 from Node 2 does work as expected.
ping6 abcd::2 from Node 1 does work as expected.
ping6 abcd::1 from Node 3 does work as expected.
ping6 abcd::3 from Node 1 does work as expected.
ping6 abcd::3 from Node 2 does not work (100% packet loss).
ping6 abcd::2 from Node 3 does not work (100% packet loss).

@cgundogan cgundogan added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) NSTF labels Aug 10, 2015
@OlegHahm
Member

Bisecting revealed: 4e5fa61 broke it.

@OlegHahm
Member

And btw, a much simpler scenario doesn't work currently:
Two nodes, one with abcd::1/64, the other with abcd::2/64. Default route is set to abcd::2 and abcd::1 respectively. Pinging worked before 4e5fa61, but not with current master.

@cgundogan
Member Author

I can confirm what @OlegHahm wrote. Pinging link-local addresses works however.

@OlegHahm
Member

Only single-hop is fixed by #3617.

@OlegHahm
Member

Debugging revealed: if a default route is set, neighbor solicitations are sent to the link-local multicast address for the default gateway in response to an ICMPv6 echo request instead of using the actual link-local source address.

@miri64
Member

miri64 commented Aug 12, 2015

What's "the link-local multicast address for the default gateway"? Do you mean all-routers (ff02::2)?

@OlegHahm
Member

No, if the default gateway is, for instance, affe::2, the address used is ff02::1:ff02:2. I think the problem is caused by the double usage of the FIB as both plain FIB and destination cache.

@OlegHahm
Member

https://github.com/RIOT-OS/RIOT/blob/master/sys/net/network_layer/ng_ndp/node/ng_ndp_node.c#L73 determines the wrong destination address for the neighbor solicitation.

@miri64
Member

miri64 commented Aug 12, 2015

Ah, that's the solicited-node multicast address of the default gateway ;)
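
For reference: the solicited-node multicast address is derived from a unicast address by appending its low 24 bits to the prefix ff02::1:ff00:0/104 (RFC 4291, section 2.7.1). A minimal sketch in plain C (not the RIOT ng_ipv6 API) of how such an address is formed:

    #include <stdint.h>
    #include <string.h>

    /* Build the solicited-node multicast address ff02::1:ffXX:XXXX from a
     * unicast IPv6 address (RFC 4291, section 2.7.1). Plain C sketch. */
    static void solicited_node_mcast(const uint8_t unicast[16], uint8_t out[16])
    {
        static const uint8_t prefix[13] = {
            0xff, 0x02, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01, 0xff
        };
        memcpy(out, prefix, sizeof(prefix));   /* ff02::1:ff00:0/104 */
        memcpy(out + 13, unicast + 13, 3);     /* append low 24 bits */
    }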

@OlegHahm
Member

@BytesGalore or anyone else, can someone enlighten me as to the purpose of global_flags and next_hop_flags in the FIB? As far as I can see they are not used yet and I wonder if one can use them for distinguishing between FIB and destination cache entries.

@OlegHahm
Member

Or should we make it possible to instantiate multiple FIB instances and then create one for the FIB itself and one for the destination cache?

@BytesGalore
Member

@OlegHahm basically they provide a way to store extra information for the address.
For instance, whether it is a network or host address (once #3157 is merged).

@OlegHahm
Member

thx

@BytesGalore
Member

I wonder if one can use them for distinguishing between FIB and destination cache entries.

I guess we can :) since there are enough bits (uint32_t) available to store the information.

Or should we make it possible to instantiate multiple fib instances

I'm not in favour of having multiple FIB instances. I would vote to go with flags.
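
A rough sketch of what the flag-based distinction could look like; the constant name and bit position below are purely illustrative assumptions, not part of the actual FIB API:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical flag value; NOT part of the actual fib.h API. */
    #define FIB_FLAG_DST_CACHE (1UL << 31)

    /* Treat an entry as a destination cache entry when the reserved
     * bit is set in its (uint32_t) flags word. */
    static bool is_dst_cache_entry(uint32_t flags)
    {
        return (flags & FIB_FLAG_DST_CACHE) != 0;
    }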

@OlegHahm
Member

Actually, I prefer the latter solution, since it seems to be a cleaner separation. Basically, we're currently kind of misusing the FIB for a different purpose. This makes sense insofar as it contains the same tuples, but it would IMO be cleaner if we had two different buffers. An additional problem I see with the flags is that we would actually need a flag for the entry, not one per column.

(As a side question: what was the rationale for giving 8 bytes to the flags?)

@BytesGalore
Member

So you would vote for a destination cache information base or the like, which provides exactly the same functionality as the FIB but stores the data in its own pool of identically structured tuples?

(As a side question: what was the rationale for giving 8 bytes to the flags?)

It was just a gut feeling that we probably need some bits (more than 8) for extra information.

@OlegHahm
Member

So you would vote for a destination cache information base or the like, which provides exactly the same functionality as the FIB but stores the data in its own pool of identically structured tuples?

No, I would use the same data structure, but pass the buffer as a parameter instead of using a static one.

@BytesGalore
Member

Um, I don't get what you mean 😊

@OlegHahm
Member

I'm preparing a PR, but it's more work than I thought. ;) The idea is to add a parameter to (almost) all FIB functions that passes an array of the type fib_entry_t to be used instead of the global fib_table[]. This way multiple instances of a FIB table could exist.
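
A rough sketch of the shape such an API could take; the type and function names below are illustrative stand-ins (assumptions), not the actual fib.h API or the code in the PR being prepared:

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative stand-in for fib_entry_t; the field layout is hypothetical. */
    typedef struct {
        uint8_t dst[16];
        uint8_t next_hop[16];
        uint32_t flags;
        int used;
    } entry_t;

    /* Before: the functions operate on a file-scope fib_table[].
     * After: the caller passes the entry pool explicitly, so several
     * independent tables (FIB proper, destination cache, ...) can coexist. */
    int fib_add_entry(entry_t *table, size_t table_size,
                      const uint8_t *dst, size_t dst_len,
                      const uint8_t *next_hop, size_t next_hop_len);

    /* Two separate pools, handled by the same utility functions: */
    static entry_t fib_pool[16];
    static entry_t dst_cache_pool[8];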

@BytesGalore
Member

Ah ok, you want to separate out the data from the utility functions.

@BytesGalore
Member

This way every FIB user has to "remember" the data pool it uses, but it sounds reasonable :)

@OlegHahm
Member

See OlegHahm@725420f

I will probably do a cleanup tomorrow.

@BytesGalore
Member

nice, it's a bit like the source routing extension for the FIB I'm working on: BytesGalore@488a824

@cgundogan
Member Author

OT: @BytesGalore, finally, a glance at the source routing extension that I long for (:

@cgundogan
Member Author

RFC2461#section-5.1

 Destination Cache
               - A set of entries about destinations to which
                 traffic has been sent recently.  The Destination
                 Cache includes both on-link and off-link
                 destinations and provides a level of indirection
                 into the Neighbor Cache; the Destination Cache maps
                 a destination IP address to the IP address of the
                 next-hop neighbor.  This cache is updated with
                 information learned from Redirect messages.
                 Implementations may find it convenient to store
                 additional information not directly related to
                 Neighbor Discovery in Destination Cache entries,
                 such as the Path MTU (PMTU) and round trip timers
                 maintained by transport protocols.

RFC2461#section-5.2

For efficiency reasons, next-hop determination is not performed on
every packet that is sent. Instead, the results of next-hop
determination computations are saved in the Destination Cache (which
also contains updates learned from Redirect messages). When the
sending node has a packet to send, it first examines the Destination
Cache. If no entry exists for the destination, next-hop
determination is invoked to create a Destination Cache entry.

AFAIU, the text above describes a data structure, which is, in my opinion, almost equivalent to a Forwarding Information Base data structure. And I even believe that my use of the word almost is misplaced in the previous sentence. I think @authmillenon's approach to use the FIB is the right way to go, even if the FIB is modifiable by processes other than the IPv6-ND next-hop determination, like it would be the case if external routing protocols were in use.

Assuming we had a separate destination cache and a FIB table: which table should the sending algorithm described in RFC2461#section-5.2 consult? And in which order?

EDIT: Sorry for using the obsolete RFC2461 but the cited parts did not change in the more recent version described in RFC4861

@OlegHahm
Member

AFAIU, the text above describes a data structure, which is, in my opinion, almost equivalent to a Forwarding Information Base data structure. And I even believe that my use of the word almost is misplaced in the previous sentence.

Yes, that's why we re-use the data structure.

I think @authmillenon's approach to use the FIB is the right way to go, even if the FIB is modifiable by processes other than the IPv6-ND next-hop determination, like it would be the case if external routing protocols were in use.

No, FIB and destination cache are two different things.

Assuming we had a separate destination cache and a FIB table: which table should the sending algorithm described in RFC2461#section-5.2 consult? And in which order?

IMO that's described pretty clearly there.

@OlegHahm
Member

Ok, looking at the RFCs and implementation again, I think there's actually no need for splitting up destination cache and FIB, but I think ng_ndp_node_next_hop_l2addr() is "wrong". Looking up the L2 address should not involve FIB/destination cache.

@miri64
Member

miri64 commented Aug 13, 2015

Why? ng_ndp_node_next_hop_l2addr() is not just address resolution (looking up the L2 address), but also next hop determination. RFC 4861, section 5.2 states:

When the sending node has a packet to send, it first examines the Destination Cache. If no entry exists for the destination, next-hop determination is invoked to create a Destination Cache entry.

True, there is no word about the FIB, but that's true for the whole RFC. But my argument (and the old stack implementor's too, I guess) is: where else to put the FIB lookup for the next hop than in next-hop determination? And given that it has precedence over all other lookups (since the FIB is supposed to override any "default" next hop), it has to be placed first in there.
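
The lookup order argued for here could be sketched like this (conceptual only, with hypothetical stand-in types and helpers, not the actual ng_ndp_node_next_hop_l2addr() code):

    #include <stdbool.h>
    #include <stdint.h>

    /* Minimal stand-in type and helpers; all hypothetical, for illustration. */
    typedef struct { uint8_t u8[16]; } addr_t;

    static bool fib_lookup(const addr_t *dst, addr_t *next_hop); /* FIB / destination cache */
    static bool is_on_link(const addr_t *dst);                   /* covered by an on-link prefix? */
    static addr_t default_router(void);                          /* pick from default router list */

    /* Conceptual order argued above: the FIB comes first (it overrides any
     * default route), then the on-link check, then the default router list. */
    static addr_t determine_next_hop(const addr_t *dst)
    {
        addr_t next_hop;

        if (fib_lookup(dst, &next_hop)) {
            return next_hop;
        }
        if (is_on_link(dst)) {
            return *dst;          /* destination itself is the next hop */
        }
        return default_router();
    }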

@miri64
Member

miri64 commented Aug 13, 2015

On the other hand: if caching the nhd (next-hop determination) results causes this kind of problem in multi-hop scenarios, we should maybe remove it for now, focus on getting the bugs fixed, and find a solution for the destination cache when there is a real need for it: whenever we implement NDP Redirect handling.

@OlegHahm
Member

see #3622

@miri64
Member

miri64 commented Aug 13, 2015

Off-topic usage question about fibroute: how does one specify the prefix length? I would have expected something like

fibroute add abcd::/4 via ...

Why I ask: when I put in the following on native, without the FIB as destination cache (that's what I gathered from @cgundogan's initial description)

Node 1

ifconfig 7 add abcd::1/16
fibroute add abcd::2 via fe80::2
fibroute add fe80::2 via abcd::2
fibroute add abcd::3 via fe80::3
fibroute add fe80::3 via abcd::3

Node 2:

ifconfig 7 add abcd::2/16
fibroute add :: via fe80::1

Node 3:

ifconfig 7 add abcd::3/16
fibroute add :: via fe80::1
  • ping6 abcd::2 from Node 1 does work as expected.
  • ping6 abcd::3 from Node 1 does work as expected.
  • ping6 abcd::1 from Node 2 does work as expected.
  • ping6 abcd::3 from Node 2 does not work (100% packet loss).
  • ping6 abcd::1 from Node 3 does work as expected.
  • ping6 abcd::2 from Node 3 does not work (100% packet loss).

So multi-hop ping still does not work. But I'm wondering if I configured the FIB wrong.

@miri64
Member

miri64 commented Aug 13, 2015

Lastly, I've read the statement "FIB != Destination Cache" in this issue of course, but I fail to understand it. The FIB is a data structure to look up an address' (or prefix') next hop and the destination cache is a data structure to look up an address' next hop. So where is the differentiating point that I've missed? Is it the prefix part of the FIB? But then it is still true that the Destination Cache is a subset of the FIB. As such it should be possible to use the FIB as the Destination Cache.

@OlegHahm
Member

Yes, destination cache is a subset of FIB with full addresses rather than prefixes.

@BytesGalore
Member

@authmillenon the FIB automatically determines the prefix from the significant bits in the given address: the position of the last 1 bit sets the prefix length.
It is done that way so that the FIB stays independent of the address type used, for instance when using non-IP addresses.

The :: (unspecified) address should work as default gateway address, since unittest 19 for the FIB [1] tests exactly the case where all bits of the destination address are 0 (thus zero significant prefix bits).

[1] https://github.com/RIOT-OS/RIOT/blob/master/tests/unittests/tests-fib/tests-fib.c#L745
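
A sketch of the rule described above (illustrative only, not the actual universal_address/FIB code): the prefix length is taken as the bit position of the last 1 bit in the given address, so an all-zero address yields a prefix length of 0 and matches everything, i.e. a default route.

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative only: number of significant prefix bits = position of the
     * last 1 bit in the address. An all-zero address (::) yields 0, i.e. a
     * default route that matches every destination. */
    static size_t significant_prefix_bits(const uint8_t *addr, size_t addr_len)
    {
        for (size_t i = addr_len; i > 0; i--) {
            uint8_t byte = addr[i - 1];
            if (byte != 0) {
                size_t bit = 8;
                while ((byte & 0x01) == 0) {   /* locate the last set bit */
                    byte >>= 1;
                    bit--;
                }
                return (i - 1) * 8 + bit;
            }
        }
        return 0;
    }

For abcd::, for instance, the last 1 bit is bit 16, so the entry behaves like abcd::/16; for :: it behaves like a default route.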

@miri64
Member

miri64 commented Aug 13, 2015

When link-local addresses are not supposed to be searched in the FIB, why is @cgundogan adding them then? When I compile Node 1 (the router) with USEMODULE += ng_ipv6_default_router, multi-hop pinging works with #3624 (which causes NDP not to look up link-local addresses in the FIB anymore).

@cgundogan
Member Author

why is @cgundogan adding them then?

The (former) implementation of ND was adding them when pinging, not me specifically. I will test all fixes tomorrow and report my results.

OlegHahm added a commit that referenced this issue Aug 14, 2015
@OlegHahm OlegHahm modified the milestone: Release 2015.09 Sep 2, 2015
@miri64 miri64 added the Area: network Area: Networking label Sep 30, 2018