Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TCP socket emitted 'end' event after 'error' event #6083

Closed
davepacheco opened this issue Apr 6, 2016 · 23 comments
Closed

TCP socket emitted 'end' event after 'error' event #6083

davepacheco opened this issue Apr 6, 2016 · 23 comments
Labels
confirmed-bug Issues with confirmed bugs. stream Issues and PRs related to the stream subsystem.

Comments

@davepacheco
Copy link
Contributor

  • Version: observed on v0.12, v4, and v5. Did not observe on v0.10. (details below)
  • Platform: observed on both illumos and OS X Mountain Lion, both 32-bit and 64-bit (details below)
  • Subsystem: net

Although it's not technically documented, a lot of code assumes that once a stream emits an 'error' event, it will not subsequently emit 'data' or 'end' events. This is pretty much necessary, because otherwise it's impossible for a stream consumer to know when a stream has come to rest and will emit no more events. I've reproduced a case where Node reliably emits an 'end' event after an 'error' event, which violates the expectations of code that assumes a stream has come to rest after 'error'.

To be really precise about where I tested it:

Works as expected on v0.10 (perhaps only because the 'end' event is emitted first):

  • illumos: v0.10.43 (both 32-bit and 64-bit)
  • OS X Mountain Lion: v0.10.28 (64-bit)

Does not work as expected on v0.12 and later:

  • illumos: v0.12.12, v4.4.0, v5.9.0 (both 32-bit and 64-bit)
  • OS X Mountain Lion: v0.12.2, v4.4.2, v5.10.0 (64-bit)

Here's a test case that's commented with what it's doing. I've tried to simplify it as much as possible, but since it's a race condition, it's tricky to get the timing just right.

/*
 * test-stream.js: demonstrates a case where a Node stream can see an 'end'
 * event after an 'error' event.
 *
 * This test case works as follows:
 *
 *      (1) Set up a TCP server socket and connect to it using a TCP client.
 *          Server: set up listeners for 'end' and 'error'.
 *          Client: set up listener for 'error'.
 *
 *      (2) Client: write 65536 bytes of data.
 *
 *      (3) Pause one second.  Behind the scenes, Node will detect that the
 *          server's socket has become readable and read all 65536 bytes.  These
 *          will be buffered in JavaScript.
 *
 *      (4) Server: read 65535 bytes of data from the Socket.  There will be
 *          one byte left buffered on the Socket in JavaScript.
 *          Client: destroy the socket.  This will send a FIN to the server.
 *
 *      (5) Asynchronously (via setImmediate):
 *          Server: read 1 byte of data from the Socket.  This will trigger Node
 *          to read from the underlying socket again, where it will read 0
 *          bytes, signifying the end of the stream.
 *
 *          Server: write data to the socket.  Since the socket is now
 *          disconnected, eventually these writes will report EPIPE/SIGPIPE.
 *          This generally happens synchronously with respect to the write()
 *          call, but the error will be emitted asynchronously.
 *
 *      (6) Asynchronously (via setImmediate):
 *          Server: read another byte from the socket.  At this point, we're
 *          reading past end-of-stream, and Node will schedule an 'end' event to
 *          be emitted, but an 'error' event has already been scheduled as well,
 *          so we'll see 'error' and then 'end', which should be invalid.
 */

var mod_net = require('net');
var mod_os = require('os');

/* IP address and port used for this test case. */
var ip = '127.0.0.1';
var port = 16404;

/* We'll use this buffer as a chunk of data. */
var bufsz = 64 * 1024;
var buf;

/* State for this test */
var server;     /* server's listening socket */
var ssock;      /* server's connection socket */
var csock;      /* client socket */
var end = false;    /* server has seen "end" on its connection socket */
var error = false;  /* server has seen "error" on its connection socket */

function main()
{
    console.log('versions:',
        process.version, process.arch, mod_os.platform());

    buf = new Buffer(bufsz);
    buf.fill(0);

    /*
     * (1) Set up client and server.
     */
    server = mod_net.createServer({ 'allowHalfOpen': true });
    server.on('connection', function (s) {
        console.log('server: client connected');
        ssock = s;

        ssock.on('end', function () {
            console.log('server: saw "end" on client socket');
            if (error) {
                console.log('reproduced issue!');
                process.abort();
            }

            end = true;
        });

        ssock.on('error', function (err) {
            console.log('server: saw "error" on client socket', err);
            if (error || end) {
                console.log('bailing out after server error');
                process.exit(0);
            }

            // ssock.read(1);
            error = true;
        });

        /*
         * (2) Client writes data.
         */
        csock.write(buf);

        /*
         * (3) Pause until the server sees that data.
         */
        ssock.once('readable', triggerIssue);
    });

    server.listen(port, function () {
        console.log('server: listening');

        csock = mod_net.createConnection(port, ip);
        csock.on('connect', function () {
            console.log('client: connected');
        });

        csock.on('end', function () {
            console.log('client: saw "end" on server socket');
        });
    });
}

function triggerIssue()
{
    console.log('triggering issue by destroying client socket');

    /*
     * (4) Read _most_ of the data from the socket and have the client
     * destroy the socket.
     */
    ssock.read(bufsz - 1);
    csock.destroy();
    setImmediate(function () {
        /*
         * (5) Read 1 byte of data from the socket and write data to it.
         */
        ssock.read(1);
        ssock.write(buf);
        ssock.write(buf);
        setImmediate(function () {
            /*
             * (6) Read one more byte.
             */
            ssock.read(1);
        });
    });
}

main();

The detailed output for each test I ran is here:
https://gist.github.com/davepacheco/84d450d2c25f6212a99a984a8f089b4c

@mscdex mscdex added the stream Issues and PRs related to the stream subsystem. label Apr 6, 2016
@Trott
Copy link
Member

Trott commented Jul 7, 2017

@nodejs/streams Is this a bug? Documentation issue? Neither? Both?

@mcollina
Copy link
Member

mcollina commented Jul 7, 2017

I would need to look into it.

@michaelnisi
Copy link

Isn’t this unspecified and depends on the individual stream?

@mcollina
Copy link
Member

As @michaelnisi noted, this is unspecified and depends on the individual stream. To the best of my knowledge, there is no mechanism/enforcement in net to prevent this. For Writable  and Duplex  streams (including sockets), there is the guarantee that 'error' will only be emitted once.

Also, 'error' will be emitted before 'end', 'close' and 'finish'. The correct approach is to remove all the listeners when 'error' happens.

That being said, I cannot reproduce the issue on node v4.4.2 (the one which output was linked in the issue), 4.8, 6.11, 8, on Mac OS X 10.12. @davepacheco is this still an issue? If that's the case, can you update the script to reproduce it, or maybe send a PR with a failing test?

The script currently hang with:

versions: v4.0.0 x64 darwin
server: listening
server: client connected
client: connected
triggering issue by destroying client socket
server: saw "end" on client socket

@davepacheco
Copy link
Contributor Author

Is there documentation that stream consumers should remove their 'close', 'end', and 'finish' listeners when 'error' happens? It seems like that makes using streams robustly a lot more complex.

I'm not seeing this on OS X El Capitan (v10.11.6), but I think that just means something about the timing has changed, not that the underlying issue was OS-specific or went away.

@mcollina
Copy link
Member

@davepacheco No, I do not think there is. However, most streams respect this generic pattern. Given the code we have in 8, I do not see how that can happen, as we destroy the handle synchronously in case of an internal error. I would be happy to take a look if you can reproduce anyhow.

The implementations in core should not have the problem you are describing, so it is a bug in that case. It might even be OSS specific, and maybe in libuv.

@ghost
Copy link

ghost commented Sep 26, 2017

Hello
We know anyone can connect to this server
While we need connect special user
How do we can prevent unwanted connection with known user?

@apapirovski
Copy link
Member

There has been no movement on this in nearly a year and no reports of it happening again. I'm going to close it out but if any new info has surfaced or you believe this should remain open, please feel free to do so.

@davepacheco
Copy link
Contributor Author

Has anybody re-run the test program to see if it's still happening?

@mcollina mcollina reopened this Apr 17, 2018
@mcollina
Copy link
Member

@davepacheco The situation has changed. The script hangs:

$ node s
versions: v4.8.0 x64 darwin
server: listening
server: client connected
client: connected
triggering issue by destroying client socket
server: saw "end" on client socket

Or I'm getting a nice abort:

$ node s
versions: v8.11.1 x64 darwin
server: listening
server: client connected
client: connected
triggering issue by destroying client socket
server: saw "error" on client socket { Error: write EPIPE
    at _errnoException (util.js:1022:11)
    at WriteWrap.afterWrite [as oncomplete] (net.js:880:14) code: 'EPIPE', errno: 'EPIPE', syscall: 'write' }
server: saw "end" on client socket
reproduced issue!
 1: node::Abort() [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
 2: node::Chdir(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
 3: v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
 4: v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
 5: v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments, v8::internal::Isolate*) [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
 6: 0x7a6215042fd
Abort trap: 6

This is also confirmed on master.

@mcollina
Copy link
Member

Here is some output from llnode:

* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff73a70b6e libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff73c3b080 libsystem_pthread.dylib`pthread_kill + 333
    frame #2: 0x00007fff739cc1ae libsystem_c.dylib`abort + 127
    frame #3: 0x0000000100028b77 node`node::Abort() at node.cc:1550 [opt]
    frame #4: 0x000000010002c7e2 node`node::Abort(args=<unavailable>) at node.cc:1572 [opt]
    frame #5: 0x00000001001a3033 node`v8::internal::FunctionCallbackArguments::Call(this=0x00007ffeefbfe210, handler=<unavailable>) at api-arguments.cc:29 [opt]
    frame #6: 0x00000001001fdc0f node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(isolate=0x0000000102801e00, function=<unavailable>, new_target=<unavailable>, fun_data=<unavailable>, receiver=<unavailable>, args=BuiltinArguments @ 0x00007ffeefbfe2a0) at builtins-api.cc:107 [opt]
    frame #7: 0x00000001001fd2e8 node`v8::internal::Builtin_Impl_HandleApiCall(args=<unavailable>, isolate=0x0000000102801e00) at builtins-api.cc:137 [opt]
    frame #8: 0x00003a0a4fc0427d
    frame #9: 0x00003a0a4fc144f7
    frame #10: 0x00003a0a4fc144f7
    frame #11: 0x00003a0a4fc144f7
    frame #12: 0x00003a0a4fc144f7
    frame #13: 0x00003a0a4fc118d5
    frame #14: 0x00003a0a4fc096a1
    frame #15: 0x00000001004faa86 node`v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, v8::internal::Handle<v8::internal::Object>, v8::internal::Execution::MessageHandling, v8::internal::Execution::Target) [inlined] v8::internal::GeneratedCode<v8::internal::Object*, v8::internal::Object*, v8::internal::Object*, v8::internal::Object*, int, v8::internal::Object***>::Call(args=<unavailable>, args=<unavailable>, args=<unavailable>, args=0) at simulator.h:110 [opt]
    frame #16: 0x00000001004faa74 node`v8::internal::(anonymous namespace)::Invoke(isolate=0x0000000102801ec8, is_construct=<unavailable>, target=<unavailable>, receiver=<unavailable>, argc=0, args=0x0000000000000000, new_target=<unavailable>, message_handling=kReport, execution_target=kCallable) at execution.cc:153 [opt]
    frame #17: 0x00000001004fa768 node`v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) [inlined] v8::internal::(anonymous namespace)::CallInternal(isolate=0x0000000102801e00, callable=Handle<v8::internal::Object> @ r12, argc=0, argv=0x0000000000000000, message_handling=kReport, target=kCallable) at execution.cc:189 [opt]
    frame #18: 0x00000001004fa6ed node`v8::internal::Execution::Call(isolate=0x0000000102801e00, callable=Handle<v8::internal::Object> @ r12, receiver=<unavailable>, argc=0, argv=0x0000000000000000) at execution.cc:200 [opt]
    frame #19: 0x0000000100187c16 node`v8::Function::Call(this=0x00000001028564a0, context=<unavailable>, recv=<unavailable>, argc=0, argv=0x0000000000000000) at api.cc:5089 [opt]
    frame #20: 0x00000001000278f1 node`node::InternalCallbackScope::Close(this=0x00007ffeefbfe850) at node.cc:989 [opt]
    frame #21: 0x0000000100027a84 node`node::InternalMakeCallback(env=0x00007ffeefbfea38, recv=<unavailable>, callback=<unavailable>, argc=<unavailable>, argv=0x0000000000000000, asyncContext=<unavailable>) at node.cc:1025 [opt]
    frame #22: 0x0000000100027c12 node`node::MakeCallback(isolate=<unavailable>, recv=<unavailable>, callback=<unavailable>, argc=<unavailable>, argv=<unavailable>, asyncContext=<unavailable>) at node.cc:1079 [opt]
    frame #23: 0x000000010001a8dd node`node::Environment::CheckImmediate(handle=<unavailable>) at env.cc:419 [opt]
    frame #24: 0x000000010094de0a node`uv__run_check(loop=0x0000000101646750) at loop-watcher.c:67 [opt]
    frame #25: 0x00000001009494c1 node`uv_run(loop=0x0000000101646750, mode=UV_RUN_DEFAULT) at core.c:369 [opt]
    frame #26: 0x0000000100032061 node`node::Start(isolate=0x0000000102801e00, isolate_data=<unavailable>, argc=2, argv=<unavailable>, exec_argc=<unavailable>, exec_argv=<unavailable>) at node.cc:4504 [opt]
    frame #27: 0x0000000100031860 node`node::Start(event_loop=0x0000000101646750, argc=2, argv=0x00000001024067d0, exec_argc=1, exec_argv=0x00000001024068a0) at node.cc:4583 [opt]
    frame #28: 0x00000001000314cb node`node::Start(argc=<unavailable>, argv=0x00000001024067d0) at node.cc:4640 [opt]
    frame #29: 0x0000000100001034 node`start + 52

Do you know why it's missing the JS stacks? It's taken from master.

@bnoordhuis
Copy link
Member

New V8. llnode hasn't been updated yet.

Also, the Ignition interpreter complicates frame->function mapping. In the naive solution you only see the interpreter frames, like you do with perf(1) now.

@mmarchini has been working on that however, and with respectable stamina too, I may add.

@mcollina
Copy link
Member

Fix in #20104.

mcollina added a commit to mcollina/node that referenced this issue Apr 17, 2018
This PR adds _readableState.errorEmitted and add the tracking of it.

Fixes: nodejs#6083
@mmarchini
Copy link
Contributor

@bnoordhuis I thought llnode was handling interpreted functions 🤔 (at least on Node.js 8). Unfortunately what I've been working on will only fix perf(1), but I'll take a look into llnode to see if it's handling interpreted frames correctly.

Looking at the output from @mcollina though, looks like bt was used instead of v8 bt, or the postmortem metadata is missing from the binary. At least some of those missing frames should appear as "<built-in>" or "<stub>".

@mcollina
Copy link
Member

My bad, I was using the wrong command. Anyway, I tracked down the problem and a PR is on the way.

jasnell pushed a commit that referenced this issue Apr 20, 2018
This PR adds _readableState.errorEmitted and add the tracking of it.

Fixes: #6083

PR-URL: #20104
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
@addaleax addaleax reopened this May 7, 2018
mcollina added a commit to mcollina/node that referenced this issue May 7, 2018
This reverts commit 8f6ab9f.

This PR adds _readableState.errorEmitted and add the tracking of it.

Fixes: nodejs#6083
See: nodejs#20334
See: nodejs#20449
@ORESoftware
Copy link
Contributor

ORESoftware commented Mar 30, 2019

Sorry to bomb this thread but I am looking for info regarding whether a tcp connection can recover after an 'error' event is emitted. I am guessing not, but I am not really sure. Currently, if 'error' is emitted I call .end() on the current connection and try to open a entirely new one.

@mcollina
Copy link
Member

Unless you are emitting error yourself, you can safely assume that the connection is gone. You don’t need to call .end().

If you want to be safe, call .destroy()

@ORESoftware
Copy link
Contributor

@mcollina thanks appreciated

@BridgeAR
Copy link
Member

BridgeAR commented Jan 2, 2020

@ronag do you know if this is still an issue or if this is already fixed?

@ronag
Copy link
Member

ronag commented Jan 2, 2020

@BridgeAR: It still seems to be an issue. As far as I can tell, the fix was reverted and the follow up was closed... I haven't followed the entire history and rationale of this.

@mcollina mentioned the following #20571 (comment)

This can be closed for the moment, it will be fixed with the destroy work @mafintosh has been preparing for some time.

But as far as I can see from master, this is still a problem, i.e. Readable can emit 'end' after 'error'.

I wouldn't mind to pick up on this but would be useful if those involved so far could shime in? @mafintosh @mcollina

ronag added a commit to nxtedition/node that referenced this issue Jan 3, 2020
Trott pushed a commit that referenced this issue Jan 6, 2020
Refs: #6083

PR-URL: #31182
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Rich Trott <rtrott@gmail.com>
@ronag
Copy link
Member

ronag commented Jan 6, 2020

@BridgeAR: Now I think this is fixed :)

@celesteking

This comment has been minimized.

@benjamingr
Copy link
Member

@celesteking this sort of comment is entirely unacceptable in the Node.js org. Please self-moderate and avoid these sort of comments in the future. Please consider what exactly you're trying to accomplish here and how these comments reflect on you and others in the org.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

No branches or pull requests