Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node aborts when logging to stdout in JS called from C++ #3422

Closed
mscdex opened this issue Oct 18, 2015 · 28 comments
Closed

Node aborts when logging to stdout in JS called from C++ #3422

mscdex opened this issue Oct 18, 2015 · 28 comments
Labels
c++ Issues and PRs that require attention from people who are familiar with C++.

Comments

@mscdex
Copy link
Contributor

mscdex commented Oct 18, 2015

This is a weird one and I haven't yet found a simpler way to reproduce it, but it does reproduce every time. This does require a mysql/mariadb server though.

This is happening on Linux.

Steps:

  1. npm install https://github.com/mscdex/node-mariasql/tarball/0cff1a10fc
  2. Place the following code in a file (e.g. bug.js) and execute it (node bug.js):
var Client = require('mariasql');
(new Client()).on('error',function(e) {
  // Change `console.log(e);` here to `console.error(e);` or emit it entirely and no abort happens ...
  console.log(e);
}).connect({
  // Values here do not matter much, as long as a connection is made but auth is bad/wrong
  host:'127.0.0.1',
  user:'root',
  password:'ffffffffff'
});

Output:

{ [Error: Access denied for user 'root'@'localhost' (using password: YES)] code: 1045 }
node: ../deps/uv/src/unix/core.c:856: uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.
Aborted (core dumped)

Backtrace:

#0  0x00007ffff6bf6e37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff6bf8528 in __GI_abort () at abort.c:89
#2  0x00007ffff6befce6 in __assert_fail_base (fmt=0x7ffff6d3fc08 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0x19adbbe "loop->watchers[w->fd] == w", file=file@entry=0x19ad960 "../deps/uv/src/unix/core.c", line=line@entry=856,
    function=function@entry=0x19adc5f <__PRETTY_FUNCTION__.8772> "uv__io_stop") at assert.c:92
#3  0x00007ffff6befd92 in __GI___assert_fail (assertion=0x19adbbe "loop->watchers[w->fd] == w", file=0x19ad960 "../deps/uv/src/unix/core.c", line=856,
    function=0x19adc5f <__PRETTY_FUNCTION__.8772> "uv__io_stop") at assert.c:101
#4  0x0000000001654c26 in uv__io_stop (loop=0x1edee80 <default_loop_struct>, w=0x1f32f58, events=4) at ../deps/uv/src/unix/core.c:856
#5  0x000000000166222c in uv_try_write (stream=0x1f32ed0, bufs=0x7fffffff5bf0, nbufs=1) at ../deps/uv/src/unix/stream.c:1470
#6  0x00000000015f366f in node::StreamWrap::DoTryWrite (this=0x1f32e40, bufs=0x7fffffff5b90, count=0x7fffffff5ba0) at ../src/stream_wrap.cc:307
#7  0x00000000015f09ef in node::StreamBase::WriteString<(node::encoding)1> (this=0x1f32e80, args=...) at ../src/stream_base.cc:289
#8  0x00000000015f4972 in node::StreamBase::JSMethod<node::StreamWrap, &node::StreamBase::WriteString<(node::encoding)1> > (args=...)
    at ../src/stream_base-inl.h:100
#9  0x0000000000db29b2 in v8::internal::FunctionCallbackArguments::Call (this=0x7fffffff9de0,
    f=0x15f48da <node::StreamBase::JSMethod<node::StreamWrap, &(int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&))>(v8::FunctionCallbackInfo<v8::Value> const&)>) at ../deps/v8/src/arguments.cc:33
#10 0x0000000000e06408 in v8::internal::HandleApiCallHelper<false> (isolate=0x1ee26b0, args=...) at ../deps/v8/src/builtins.cc:1092
#11 0x0000000000dffdd5 in v8::internal::Builtin_Impl_HandleApiCall (args=..., isolate=0x1ee26b0) at ../deps/v8/src/builtins.cc:1115
#12 0x0000000000dffd26 in v8::internal::Builtin_HandleApiCall (args_length=4, args_object=0x7fffffff9f80, isolate=0x1ee26b0) at ../deps/v8/src/builtins.cc:1111

I should also note that the abort doesn't happen when the problematic code is ran either in the REPL or via node -pe "...".

@mscdex mscdex added the c++ Issues and PRs that require attention from people who are familiar with C++. label Oct 18, 2015
@bnoordhuis
Copy link
Member

/cc @indutny

@indutny
Copy link
Member

indutny commented Oct 18, 2015

Can't reproduce it on OS X. Going to build it on linux.

@indutny
Copy link
Member

indutny commented Oct 18, 2015

@mscdex I get different assertion in binding when running it in Debug mode:

#0  0x00007ffff6bf6e37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff6bf8528 in __GI_abort () at abort.c:89
#2  0x00007ffff6befce6 in __assert_fail_base (fmt=0x7ffff6d3fc08 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7ffff46afe02 "status == 0",
    file=file@entry=0x7ffff46afdf0 "../src/binding.cc", line=line@entry=924,
    function=function@entry=0x7ffff46b0600 <Client::cb_poll(uv_poll_s*, int, int)::__PRETTY_FUNCTION__> "static void Client::cb_poll(uv_poll_t*, int, int)") at assert.c:92
#3  0x00007ffff6befd92 in __GI___assert_fail (assertion=0x7ffff46afe02 "status == 0", file=0x7ffff46afdf0 "../src/binding.cc", line=924,
    function=0x7ffff46b0600 <Client::cb_poll(uv_poll_s*, int, int)::__PRETTY_FUNCTION__> "static void Client::cb_poll(uv_poll_t*, int, int)") at assert.c:101
#4  0x00007ffff461d201 in Client::cb_poll (handle=<optimized out>, status=<optimized out>, events=<optimized out>) at ../src/binding.cc:924
#5  0x000000000164972c in uv__poll_io (loop=0x1ec9d40 <default_loop_struct>, w=0x1f1e158, events=28) at ../deps/uv/src/unix/poll.c:39
#6  0x00000000016552af in uv__io_poll (loop=0x1ec9d40 <default_loop_struct>, timeout=-1) at ../deps/uv/src/unix/linux-core.c:345
#7  0x0000000001641763 in uv_run (loop=0x1ec9d40 <default_loop_struct>, mode=UV_RUN_ONCE) at ../deps/uv/src/unix/core.c:341
#8  0x000000000159b7ad in node::StartNodeInstance (arg=0x7fffffffe430) at ../src/node.cc:3987
#9  0x000000000159bae8 in node::Start (argc=2, argv=0x1ecb9a0) at ../src/node.cc:4060
#10 0x00000000015cb3dc in main (argc=2, argv=0x7fffffffe598) at ../src/node_main.cc:44

Could it be that the binding is faulty by itself?

@indutny
Copy link
Member

indutny commented Oct 18, 2015

Yeah, it seems to be passing bad fd for me:

handle->poll_cb(handle, -EBADF, 0);

This is where cb_poll is invoked from.

@mscdex
Copy link
Contributor Author

mscdex commented Oct 18, 2015

@indutny Right, that may be the original bug I was trying to find, but I am still getting the other backtrace. I cannot for the life of me get the backtrace you are seeing.. :-(

I tried both v4.2.1 and master. FWIW it's on Ubuntu 14.10.

@indutny
Copy link
Member

indutny commented Oct 18, 2015

@mscdex may I ask you to try bisecting node to find the commit that may have caused it?

@mscdex
Copy link
Contributor Author

mscdex commented Oct 18, 2015

Ok, git bisect is pointing to 1d0e5210a, which is when libuv was updated from 1.6.1 to 1.7.3.

@indutny
Copy link
Member

indutny commented Oct 18, 2015

cc @saghul

@mscdex
Copy link
Contributor Author

mscdex commented Oct 18, 2015

Bisected the libuv upgrade and it looks like libuv/libuv@60fca01705 is the breaking commit there.

@bnoordhuis
Copy link
Member

I'd start with the assumption that this is a node.js bug that gets flushed out by a libuv change. src/tty_wrap.cc doesn't check the return value of uv_tty_init().

@indutny
Copy link
Member

indutny commented Oct 19, 2015

Most likely this is the case, @bnoordhuis . @mscdex what happens if you will run it like this valgrind node bug.js? I expect there to be an access to uninitialized variables.

@mscdex
Copy link
Contributor Author

mscdex commented Oct 19, 2015

@indutny I'm not seeing that in the valgrind output.

@bnoordhuis
Copy link
Member

Am I right this patch hits the CHECK?

diff --git a/src/tty_wrap.cc b/src/tty_wrap.cc
index a78d231..84d3608 100644
--- a/src/tty_wrap.cc
+++ b/src/tty_wrap.cc
@@ -134,7 +134,7 @@ TTYWrap::TTYWrap(Environment* env, Local<Object> object, int fd, bool readable)
                  object,
                  reinterpret_cast<uv_stream_t*>(&handle_),
                  AsyncWrap::PROVIDER_TTYWRAP) {
-  uv_tty_init(env->event_loop(), &handle_, fd, readable);
+  CHECK_EQ(0, uv_tty_init(env->event_loop(), &handle_, fd, readable));
 }

 }  // namespace node

@mscdex
Copy link
Contributor Author

mscdex commented Oct 19, 2015

@bnoordhuis It's passing the check. I added a debug statement and uv_tty_init() is indeed returning 0 (twice).

@indutny
Copy link
Member

indutny commented Oct 19, 2015

@mscdex are you sure about libuv/libuv@60fca01705 ?

@mscdex
Copy link
Contributor Author

mscdex commented Oct 19, 2015

@indutny Pretty sure, yes. If I check out at the parent commit no abort happens.

@indutny
Copy link
Member

indutny commented Oct 19, 2015

@mscdex just to make sure I understand it correctly, you are reverting just that change in deps/uv and it is starting to work.

@mscdex
Copy link
Contributor Author

mscdex commented Oct 19, 2015

@indutny Yes. With libuv/libuv@60fca01 it aborts. With the parent commit libuv/libuv@c1fe25f654 it doesn't abort.

@indutny
Copy link
Member

indutny commented Oct 19, 2015

@mscdex may I ask you to reproduce it on debug build of node? If it will still crash, may I ask you to print:

p *w in the frame of uv__io_stop (you can switch frames by typing f <FRAME-NUMBER> in gdb)

@mscdex
Copy link
Contributor Author

mscdex commented Oct 19, 2015

@indutny

(gdb) f 4
#4  0x0000000000ded6ce in uv__io_stop (loop=<optimized out>, w=w@entry=0x1433fc8, events=events@entry=4) at ../deps/uv/src/unix/core.c:843
843           assert(loop->watchers[w->fd] == w);
(gdb) p *w
$1 = {cb = 0xdf5a70 <uv__stream_io>, pending_queue = {0x13e37c8 <default_loop_struct+72>, 0x13e37c8 <default_loop_struct+72>}, watcher_queue = {0x1433fe0,
    0x1433fe0}, pevents = 0, events = 0, fd = 9}

I believe fd 9 is the socket fd for the database connection.

@indutny
Copy link
Member

indutny commented Oct 19, 2015

@mscdex ok, now it is time for a serious action :) Let's add:

if (this !== process.stdout && this !== process.stderr) console.trace();

Before https://github.com/nodejs/node/blob/master/lib/net.js#L644

@mscdex
Copy link
Contributor Author

mscdex commented Oct 19, 2015

Nothing new is displayed with that change.

@indutny
Copy link
Member

indutny commented Oct 19, 2015

Ok, this is bad :(

@indutny
Copy link
Member

indutny commented Oct 19, 2015

Let's add v8::Message::PrintCurrentStackTrace(v8::Isolate::GetCurrent(), stderr) before https://github.com/nodejs/node/blob/master/src/stream_wrap.cc#L307

@mscdex
Copy link
Contributor Author

mscdex commented Oct 19, 2015

With that added I get:

createWriteReq (net.js:709:21)
WriteStream.Socket._writeGeneric (net.js:673:11)
WriteStream.Socket._write (net.js:696:8)
doWrite (_stream_writable.js:292:12)
writeOrBuffer (_stream_writable.js:278:5)
WriteStream.Writable.write (_stream_writable.js:207:11)
WriteStream.Socket.write (net.js:618:40)
Console.log (console.js:36:16)
Client.<anonymous> (/home/mscdex/bf/test.js:17:11)
emitOne (events.js:77:13)
Client.emit (events.js:169:7)
Client._onerror (/home/mscdex/bf/node_modules/mariasql/lib/Client.js:404:12)
{ [Error: Access denied for user 'root'@'localhost' (using password: YES)] code: 1045 }
iojs: ../deps/uv/src/unix/core.c:843: uv__io_stop: Assertion `loop->watchers[w->fd] == w' failed.
Aborted (core dumped)

Line 404 in Client.js is just where I emit('error', err).

@indutny
Copy link
Member

indutny commented Oct 19, 2015

Hm... this is very very strange. It looks like the uv_tty_t is overwritten by someone.

@indutny
Copy link
Member

indutny commented Oct 19, 2015

I see no other choice, but to you memory breakpoints in gdb. Hope you don't mind helping me (sorry for lots of instructions):

  1. b uv_tty_init
  2. r
  3. Each time it will hit the breakpoint watch &tty->fd, and also please record somewhere the pointer value of tty and the number fd
  4. If the write breakpoint will be triggered - please log the stack trace
  5. If the crash will happen first - please log the pointer value of stream in uv_try_write.

Thanks!

@indutny
Copy link
Member

indutny commented Oct 19, 2015

As discussed on IRC, it does seem to be a mariadb binding bug, not a node.js/libuv bug . @mscdex feel free to reopen the issue if it will turn out to be something else.

@indutny indutny closed this as completed Oct 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++.
Projects
None yet
Development

No branches or pull requests

3 participants