
gnrc_ipv6: don't recurse into receive for encapsulated IPv6 #10246

Merged
merged 2 commits into from
Nov 8, 2018

Conversation

miri64
Member

@miri64 miri64 commented Oct 24, 2018

Contribution description

_decapsulate() is called from within _receive()'s call chain, so calling
_receive() again from _decapsulate() creates a recursion we don't want.
Using gnrc_netapi instead removes the recursion and provides the added benefit
that other subscribers to IPv6 are also informed.

Testing procedure

Send an encapsulated IPv6 packet to a node. It should still be parsed properly.
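For reference, here is a minimal sketch of what such a test packet looks like on the wire, built with plain Python (hypothetical addresses; the ICMPv6 checksum is left at zero, which a real tool such as scapy would fill in). The receiver recognizes the encapsulation by the outer header's next-header value 41 (IPv6-in-IPv6):

```python
import struct

def ipv6_header(payload_len, next_header, src, dst, hop_limit=64):
    """Build a fixed 40-byte IPv6 header (RFC 8200)."""
    vtc_fl = 6 << 28  # version 6, traffic class 0, flow label 0
    return struct.pack("!IHBB16s16s", vtc_fl, payload_len, next_header,
                       hop_limit, src, dst)

# Hypothetical link-local addresses, packed to 16 bytes each.
src = bytes.fromhex("fe80" + "00" * 12 + "0001")
dst = bytes.fromhex("fe80" + "00" * 12 + "0002")

ICMPV6 = 58  # next-header value for ICMPv6
IPV6 = 41    # next-header value for IPv6-in-IPv6 encapsulation

# Innermost payload: ICMPv6 echo request (type 128, code 0), checksum zeroed.
icmp = struct.pack("!BBHHH", 128, 0, 0, 0x1234, 1)

inner = ipv6_header(len(icmp), ICMPV6, src, dst) + icmp  # 48 bytes
outer = ipv6_header(len(inner), IPV6, src, dst) + inner  # 88 bytes
```

Note that the outer header's payload length (48) is necessarily larger than the inner one's (8), since it covers the whole inner packet.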

Issues/PRs references

None, but related to the IPv6 refactoring project

PR dependencies

@miri64 miri64 added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Area: network Area: Networking CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Oct 24, 2018
@miri64 miri64 requested a review from kaspar030 October 24, 2018 16:47
@miri64 miri64 requested a review from bergzand October 24, 2018 16:47
@miri64 miri64 added the Impact: minor The PR is small in size and might only require a quick look of a knowledgeable reviewer label Oct 24, 2018
@miri64
Member Author

miri64 commented Nov 7, 2018

@kaspar030 @bergzand this is really low-hanging fruit.

@bergzand
Member

bergzand commented Nov 7, 2018

And here I was thinking that I had Acked all the low-hanging fruit in this project already :)

Give me a minute to wrap up something else; I'll give it some attention next.

pkt);
/* something went horribly wrong if that's not the case because we
* are currently in a thread that is subscribing to the parameters above */
assert(res > 0);
Member

To ask the million-dollar question: is there any way that this assert could cause a ping-of-death-like scenario? :)

Member

Say, something like a message box that is full.

Member Author

If everything is in order, no. If for some weird reason (e.g. a stack overflow) the registry entry for the gnrc_ipv6 thread isn't found in the dispatch above, it will. But then we are in a critical state already, so it is safer to abort the program ;-).

Member Author

If we want to be on the very safe side, I can revert 66ea05b.

Member Author

@miri64 miri64 Nov 7, 2018

> Say, something like a message box that is full.

Nope, that has no effect on gnrc_netapi_dispatch_receive(). Only the number of (potentially able to receive) subscribers is returned. Whether they are actually able to receive is handled separately:

if (release) {
    gnrc_pktbuf_release(pkt);
}
#else
if (_snd_rcv(sendto->target.pid, cmd, pkt) < 1) {
    /* unable to dispatch packet */
    gnrc_pktbuf_release(pkt);
}

Member

Yeah, I was already diving into that code. I agree that the assert should only trigger if the whole network stack is already broken. Never mind this then, thanks for clarifying.

Member Author

I decided to revert 66ea05b after all. In case we want to make the stack more dynamic in the future (e.g. switch out modules) this might lead to problems.

@bergzand
Member

bergzand commented Nov 7, 2018

I can cause a native instance to cease network operations with a bit of scapy and a relatively large number of IPv6-in-IPv6-in-IPv6... headers. You get the idea.

Attached is a pcap as a demonstration.
fe80::9435:fdff:fe5a:e9cb is my desktop, fe80::9435:fdff:fe5a:e9cc is a RIOT native instance running examples/gnrc_networking on this branch. The initial traffic is ping requests/replies between my desktop and the interface. Frame 22 is the first interesting bit, where I build a frame with 7 IPv6 headers. No response, since the source of those frames doesn't match the bridge's link-local address, but otherwise no harm so far; ping echoes/replies continue as if nothing happened.

At frame 34, I transmit a frame with 8 IPv6 headers. After this arrived at the node, it stops responding to the periodic stream of ping requests.

zipped tcpdump capture: pod.zip

Scapy:

lladdr = "fe80::9435:fdff:fe5a:e9cc"  # probably different  :)
ip6p = IPv6(dst=lladdr)
sendp(Ether() / ip6p / ip6p / ip6p / ip6p / ip6p / ip6p / ip6p / ICMPv6EchoRequest(), iface="tapbr0") # node keeps working.
sendp(Ether() / ip6p / ip6p / ip6p / ip6p / ip6p / ip6p / ip6p / ip6p / ICMPv6EchoRequest(), iface="tapbr0") # node stops working.
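A toy Python model (made-up names, not the GNRC code) of the recursion concern from the PR description: with a recursive decapsulate/receive scheme, each additional IPv6 header adds a stack frame, while queue-based dispatch hands the decapsulated packet back through the message queue and handles every header at a constant depth.

```python
def receive_recursive(nested_headers, depth=1):
    """Old scheme: decapsulation calls receive again, so the maximum
    stack depth equals the number of nested IPv6 headers."""
    if nested_headers <= 1:
        return depth
    return receive_recursive(nested_headers - 1, depth + 1)

def receive_dispatched(nested_headers):
    """New scheme: each decapsulated packet is re-dispatched through a
    queue, so every header is handled one stack frame deep."""
    queue = [nested_headers]
    max_depth = 0
    while queue:
        remaining = queue.pop()
        max_depth = max(max_depth, 1)    # always one frame deep
        if remaining > 1:
            queue.append(remaining - 1)  # re-dispatch the inner packet
    return max_depth
```

For 8 nested headers the recursive model reaches depth 8 while the dispatched model stays at depth 1. (This only models the stack-depth aspect; the actual hang observed here turned out to have a different root cause, discussed below.)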

@miri64
Member Author

miri64 commented Nov 8, 2018

Is this only happening with this PR or also in master?

@miri64
Member Author

miri64 commented Nov 8, 2018

I'll have a look regardless, but I think this is already the case (or, due to the recursion, even worse) in master.

@miri64
Member Author

miri64 commented Nov 8, 2018

I can reproduce the same thing in master.

@miri64
Member Author

miri64 commented Nov 8, 2018

For some reason gnrc_icmpv6_echo_req_handle() is called multiple times per packet. Will investigate.

@miri64
Member Author

miri64 commented Nov 8, 2018

Ok, I now found out what happens. The encapsulated IPv6 header actually gets handled twice: once (actually the second handling) in the method I rework in this PR, and once here, since current->type was set here. The easiest way to fix this would be to just remove the whole _decapsulate stuff after #10233 is merged (if we remove it before, the packet parser gets confused: it takes the outermost IPv6 header when picking an IPv6 header from a preparsed packet and then drops the packet here, since the payload length in the outermost IPv6 header is of course larger than the one in the innermost).

Now, how to proceed? Crashing a node by sending a ping of death is obviously a serious flaw that should be fixed ASAP, but requiring #10233 to be merged into the release is unrealistic. So as a "fix" I would propose to remove the whole _decapsulate thing for now, breaking encapsulated IPv6 support but removing the ping-of-death scenario. Not nice, but safer. Or we just mark it as a known bug. @jia200x what do you think?

@miri64
Member Author

miri64 commented Nov 8, 2018

Ah, I think I found an alternative "bugfix". That would fix both cases!

@miri64
Member Author

miri64 commented Nov 8, 2018

It's an ugly one, but for now this should be enough.

@miri64 miri64 added this to the Release 2018.10 milestone Nov 8, 2018
@miri64
Member Author

miri64 commented Nov 8, 2018

It's an ugly one, but for now this should be enough.

(The ugliness can later be reverted in #10233.)

@miri64
Member Author

miri64 commented Nov 8, 2018

@bergzand for testing: use your scapy commands, but also set the tap bridge's link-local address as source (this way you will see that without the latest fix you get n (= number of IPv6 headers) replies; with the fix you only get one).

@jia200x
Member

jia200x commented Nov 8, 2018

> Ah, I think I found an alternative "bugfix". That would fix both cases!

@miri64 you mean it fixes the PoD and we would still have IPv6 encapsulation?

I think such a bugfix (regardless of what happens to IPv6 encapsulation) is worth including in the release. It's a serious bug IMO.

So, I'm +1 on backporting this.

@miri64
Member Author

miri64 commented Nov 8, 2018

> @miri64 you mean it fixes the PoD and we would still have IPv6 encapsulation?

Yes. There is still some issue if extension headers appear between the encapsulated headers, but a) they don't cause a ping of death (they are just dropped) and b) this is also a problem in the current master. With the simplifications in #10233 this is fixed, but that change is too complicated to be used as a simple release fix.

@bergzand
Member

bergzand commented Nov 8, 2018

Tested; the issues seem to be resolved.

ACK! Please squash!

miri64 and others added 2 commits November 8, 2018 11:54
`_decapsulate()` is called by callees of `_receive()` so the call to
the latter function within the first creates a recursion we don't want.
Using `gnrc_netapi` instead removes that and provides the added benefit
that other subscribers to IPv6 are also informed.
Otherwise, an encapsulated IPv6 packet is handled twice: once in the
central function, and once in the specialized decapsulation.
@miri64
Member Author

miri64 commented Nov 8, 2018

Squashed.

@miri64 miri64 force-pushed the gnrc_ipv6/fix/encaps-ipv6-recursion branch from 0d450d6 to d54ac38 Compare November 8, 2018 10:54
@miri64
Member Author

miri64 commented Nov 8, 2018

The Murdock error is an unrelated false positive.

@miri64 miri64 added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Nov 8, 2018
Member

@bergzand bergzand left a comment

All green!

@bergzand
Member

bergzand commented Nov 8, 2018

Do we want to wait for Codacy before merging?

@miri64
Member Author

miri64 commented Nov 8, 2018

Last time I checked it was green and it shouldn't change. But I leave it to you to hit the merge button ;-)

@bergzand bergzand merged commit 60a6e66 into RIOT-OS:master Nov 8, 2018
@bergzand
Member

bergzand commented Nov 8, 2018

Merged!

@miri64 miri64 deleted the gnrc_ipv6/fix/encaps-ipv6-recursion branch November 8, 2018 11:35
@miri64
Member Author

miri64 commented Nov 8, 2018

Thanks! :-)

@miri64
Member Author

miri64 commented Nov 8, 2018

Backport provided in #10348

@miri64 miri64 added the Process: needs backport Integration Process: The PR is required to be backported to a release or feature branch label Nov 8, 2018