Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forwarding a packet back to its link layer source should not be allowed #5051

Open
jnohlgard opened this issue Mar 12, 2016 · 20 comments
Open
Assignees
Labels
Area: network Area: Networking Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Comments

@jnohlgard
Copy link
Member

Network scenario:
BR -> router -> leaf

BR knows that leaf is behind router, however for some reason the leaf node goes missing and router drops it from its fib (e.g. by router reboot, or clock drift between BR and router so that router drops the route before BR does).

Now BR will forward packets to leaf via router, and since router does not have leaf in its FIB it will pass the received packet along its default route, which is BR. Now the packet starts bouncing until TTL is reached. This will lead to excessive power drain and occupying wireless bandwidth for no good reason.

IMO there should be a check somewhere in the routing decisions that drops a packet instead of forwarding if the routing decision is to forward it to the same link layer address from where it was received.

@jnohlgard jnohlgard added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Area: network Area: Networking GNRC labels Mar 12, 2016
@jnohlgard jnohlgard added this to the Release 2016.04 milestone Mar 12, 2016
@jnohlgard
Copy link
Member Author

I don't know where the proper place for such a check would be in the source, any pointers?

@miri64
Copy link
Member

miri64 commented Mar 12, 2016

Mhhh… I'm not sure that is correct behavior. Isn't that what the hop limit is supposed to be for (though yes, it is power draining)? @emmanuelsearch @OlegHahm @tcschmidt what do you say?

@BytesGalore
Copy link
Member

Now BR will forward packets to leaf via router, and since router does not have leaf in its FIB it will pass the received packet along its default route, which is BR.

What you describe is a classic routing-loop which a routing-protocoll takes care of and prevents/resolves it. If this situation happens, the routing-protocol will eventually resolve it, if not then the protocol would be somehow broken.
I'm with @authmillenon, we should not manually tweak something to prevent routing-loops, this should happen by design.

@tcschmidt
Copy link
Member

Some preliminary remarks:

@gebart : you want to prevent routing loops by excluding the previous last hop (link-local) from forwarding: this is a bit dangerous (e.g., during routing convergence) and won't really help, as routing loops can occur via several hops and addresses may be global unicast.

For the purpose of a lost leaf, we have ICMP "destination unreachable/no route to host". This would fire in case of a lost route.

If there is a valid default route, though, topology needs maintenance. A routing protocol (that operates with default routes) needs to update its information state about connectivity. In case a link is lost, this should trigger an update. In the case of RPL, there is the additional information about upward/downward, which should not mix.

However, there are routing protocols (like AODV) that do not procure for default routes. If default routes are implemented by other means, there may be conflicts ... some need for additional thinking here.

@tcschmidt tcschmidt reopened this Mar 12, 2016
@jnohlgard
Copy link
Member Author

@tcschmidt I observed this happening between two RIOT nodes on the wireless 6lowpan network (using a sniffer) and there were never any ICMPv6 no route to host or any other errors reported, only that the packet bounced 64 times before the hoplimit killed it. Is this a bug in the RPL implementation then?

@miri64
Copy link
Member

miri64 commented Mar 12, 2016

Error reporting isn't realized in GNRC yet. However, with #3544 this should be a breeze by just replacing every gnrc_pktbuf_release() [edit: in IPv6] with the appropriate error send function.

@tcschmidt
Copy link
Member

It sure is an inconsistent routing state ... and RPL knows about default
routes. I would put RPL under suspect, but details under further
investigations. There is of course always a convergence time required
and inconsistencies may be "correct" during convergence.

On 12.03.2016 14:17, Joakim Nohlgård wrote:

@tcschmidt https://github.com/tcschmidt I observed this happening
between two RIOT nodes on the wireless 6lowpan network (using a sniffer)
and there were never any ICMPv6 no route to host or any other errors
reported, only that the packet bounced 64 times before the hoplimit
killed it. Is this a bug in the RPL implementation then?


Reply to this email directly or view it on GitHub
#5051 (comment).

@tcschmidt
Copy link
Member

@gebart I checked back with the RPL spec. In general, RPL is not loop-free and does not provide rapid convergence. However, there are simple loop avoidance mechanisms (accounting for tree traversal directions), see https://tools.ietf.org/html/rfc6550#section-11.2
I'm not sure whether loop avoidance is implemented - @cgundogan ?

@kaspar030
Copy link
Contributor

The problem here is that the 6lowpan RFC is very clear on what is supposed to happen if there's no neighbor cache entry: forward the packet to any router available.
For a 6lowpan node, that makes sense.
For a multihomed node (e.g., the 6lbr) it doesn't.
The RFC says "send the packet to any available router". Sending it back to router the packet came from is clearly wrong. IMHO this is a case where the RFC is unclear. For multihomed nodes, a 6lo node should send it to any router if it has only the 6lowpan network, a 6lbr should either try to send it directly to the target's l2 address or drop the packet with "no route to host".

I've implemented that in [1].

[1] https://github.com/kaspar030/RIOT/tree/introduce_new_ipv6_forwarding

@jnohlgard
Copy link
Member Author

There is actually a mention of the exact behaviour in RFC6550 (RPL) page 100

Note that the chosen successor MUST NOT be the neighbor that was the
predecessor of the packet (split horizon), except in the case where
it is intended for the packet to change from an Upward to a Downward
direction, as determined by the routing table of the node making the
change, such as switching from DIO routes to DAO routes as the
destination is neared in order to continue traveling toward the
destination.

Split horizon: https://en.wikipedia.org/wiki/Split_horizon_route_advertisement

@tcschmidt
Copy link
Member

That is actually a bit strange, because split horizon is typically a
method for route propagation (and the paragraph refers to DIO/DAO
changes). Normally, this behaviour is part of the routing daemon and not
part of the regular packet forwarding (in particular, as regular packets
don't see the last hop on the IP-level). Also, the Section 11.1 is
called "Suggestions for Packet Forwarding" - I guess, this requires more
digging into the details.

On 12.03.2016 16:01, Joakim Nohlgård wrote:

There is actually a mention of the exact behaviour in RFC6550 (RPL) page
100 <tools.ietf.org/html/rfc6550#page-100>

Note that the chosen successor MUST NOT be the neighbor that was the
predecessor of the packet (split horizon), except in the case where
it is intended for the packet to change from an Upward to a Downward
direction, as determined by the routing table of the node making the
change, such as switching from DIO routes to DAO routes as the
destination is neared in order to continue traveling toward the
destination.

Split horizon:
https://en.wikipedia.org/wiki/Split_horizon_route_advertisement


Reply to this email directly or view it on GitHub
#5051 (comment).

@tcschmidt
Copy link
Member

I don't see the relation here: the problem of @gebart seems neither
related to 6lbr (only to multi-hop routing), nor to neighbor caches. It
is a L3 forwarding decision (use default in the absence of a more
specific) - at inconsistent routing tables.

On 12.03.2016 15:41, Kaspar Schleiser wrote:

The problem here is that the 6lowpan RFC is very clear on what is
supposed to happen if there's no neighbor cache entry: forward the
packet to /any/ router available.
For a 6lowpan node, that makes sense.
For a multihomed node (e.g., the 6lbr) it doesn't.
The RFC says "send the packet to any available router. Sending it back
to router the packet came from is clearly wrong. IMHO this is a case
where the RFC is unclear. For multihomed nodes, a 6lo node should send
it to /any/ router if it has only the 6lowpan network, a 6lbr should
either try to send it directly to the target's l2 address or drop the
packet with "no route to host".

I've implemented that in [1].

[1] https://github.com/kaspar030/RIOT/tree/introduce_new_ipv6_forwarding


Reply to this email directly or view it on GitHub
#5051 (comment).

@haukepetersen
Copy link
Contributor

won't be fixed for this release -> postponed

@haukepetersen haukepetersen modified the milestones: Release 2016.07, Release 2016.04 Apr 20, 2016
@kaspar030
Copy link
Contributor

won't be fixed for this release -> postponed

@kaspar030 kaspar030 modified the milestones: Release 2016.10, Release 2016.07 Jul 11, 2016
@miri64 miri64 modified the milestones: Release 2017.01, Release 2016.10 Oct 31, 2016
@PeterKietzmann PeterKietzmann modified the milestones: Release 2017.01, Release 2017.04 Jan 26, 2017
@kYc0o
Copy link
Contributor

kYc0o commented Jul 17, 2018

This is not part of a milestone anymore but I'd ask @miri64 if this issue was somehow addressed by recent networking PRs.

@miri64
Copy link
Member

miri64 commented Jul 17, 2018

This might have been fixed with the recent changes to RPL. @cgundogan is it ?

@jia200x
Copy link
Member

jia200x commented Nov 6, 2018

does someone know if this is still an issue?

@miri64
Copy link
Member

miri64 commented Nov 6, 2018

@cgundogan ?

@stale
Copy link

stale bot commented Aug 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

@stale stale bot added the State: stale State: The issue / PR has no activity for >185 days label Aug 10, 2019
@stale stale bot closed this as completed Sep 10, 2019
@aabadie aabadie reopened this Sep 21, 2019
@stale stale bot removed the State: stale State: The issue / PR has no activity for >185 days label Sep 21, 2019
@MrKevinWeiss MrKevinWeiss added this to the Release 2021.07 milestone Jun 22, 2021
@MrKevinWeiss MrKevinWeiss removed this from the Release 2021.07 milestone Jul 15, 2021
@maribu
Copy link
Member

maribu commented Sep 16, 2022

Ping?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: network Area: Networking Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

No branches or pull requests