Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] node REPL crashes creating ObjectWrap in getter #1000

Closed
trxcllnt opened this issue May 14, 2021 · 28 comments
Closed

[BUG] node REPL crashes creating ObjectWrap in getter #1000

trxcllnt opened this issue May 14, 2021 · 28 comments

Comments

@trxcllnt
Copy link

trxcllnt commented May 14, 2021

The node REPL seems to crash if a C++ getter creates an instance of a class that extends ObjectWrap.

This works in a script, but it crashes as soon as the REPL's auto-complete attempts to "preview" bar.get_new_foo:

#include <napi.h>

struct Foo : public Napi::ObjectWrap<Foo> {
  static Napi::Function Init(Napi::Env env, Napi::Object exports) {
    return DefineClass(env, "Foo", {});
  }
  Foo(Napi::CallbackInfo const& info) : Napi::ObjectWrap<Foo>(info){};
};

struct Bar : public Napi::ObjectWrap<Bar> {
  static Napi::Function Init(Napi::Env env, Napi::Object exports) {
    FooConstructor = Napi::Persistent(Foo::Init(env, exports));
    FooConstructor.SuppressDestruct();
    return DefineClass(env, "Bar", {InstanceAccessor<&Bar::get_new_foo>("get_new_foo")});
  }

  Bar(Napi::CallbackInfo const& info) : Napi::ObjectWrap<Bar>(info){};

 private:
  static Napi::FunctionReference FooConstructor;
  Napi::Value get_new_foo(Napi::CallbackInfo const& info) {
    return FooConstructor.New({});
  }
};

Napi::FunctionReference Bar::FooConstructor;

struct NodeREPLCrashRepro : public Napi::Addon<NodeREPLCrashRepro> {
  NodeREPLCrashRepro(Napi::Env env, Napi::Object exports) {
    DefineAddon(exports, {InstanceValue("Bar", Bar::Init(env, exports))});
  }
};

NODE_API_ADDON(NodeREPLCrashRepro);
$ node
Welcome to Node.js v15.14.0.
Type ".help" for more information.
> var { Bar } = require(`./build/Debug/crash_repro.node`)
undefined
> var bar = new Bar()
undefined
> bar.gFATAL ERROR: Error::Error napi_create_reference
 1: 0xa89e60 node::Abort() [node]
 2: 0x9ade29 node::FatalError(char const*, char const*) [node]
 3: 0x9ade32  [node]
 4: 0xa55b3b napi_fatal_error [node]
 5: 0x7f3bf031ea4e Napi::ObjectReference::~ObjectReference() [build/Debug/crash_repro.node]
 6: 0x7f3bf031eb14 Napi::Error::Error(napi_env__*, napi_value__*) [build/Debug/crash_repro.node]
 7: 0x7f3bf031e97f Napi::Error::New(napi_env__*) [build/Debug/crash_repro.node]
 8: 0x7f3bf031e758 Napi::Function::New(unsigned long, napi_value__* const*) const [build/Debug/crash_repro.node]
 9: 0x7f3bf031e6d9 Napi::Function::New(std::initializer_list<napi_value__*> const&) const [build/Debug/crash_repro.node]
10: 0x7f3bf031f3a8 Napi::FunctionReference::New(std::initializer_list<napi_value__*> const&) const [build/Debug/crash_repro.node]
11: 0x7f3bf032050d Bar::get_new_foo(Napi::CallbackInfo const&) [build/Debug/crash_repro.node]
12: 0x7f3bf031db65  [build/Debug/crash_repro.node]
13: 0x7f3bf031dc39  [build/Debug/crash_repro.node]
14: 0x7f3bf031dbfd  [build/Debug/crash_repro.node]
15: 0xa390bf  [node]
16: 0x149e62d  [node]
Aborted
@trxcllnt trxcllnt changed the title node REPL crashes creating ObjectWrap in getter [BUG] node REPL crashes creating ObjectWrap in getter May 14, 2021
@trxcllnt
Copy link
Author

There seem to be two issues here:

  1. The napi_new_instance() call here returns a napi_generic_failure status.
  2. NAPI_THROW_IF_FAILED calls this Error constructor.
    a. That constructor passes the env/error pair to this Error constructor.
    b. That constructor's napi_create_reference call returns an error code.

This is reminiscent of Python's "while handling the original error, another error occurred" message.

@legendecas
Copy link
Member

legendecas commented May 24, 2021

There are several problems in the case:

  1. napi_new_instance returned napi_generic_failure instead of napi_pending_exception on JavaScript exceptions (v8 throws on calling functions with side effects when previewing).
  2. The value v8 thrown is a plain string, which would cause Napi::Env::GetAndClearPendingException assumes exceptions are Error objects #912 to abort.

Please let me know if there are any other conditions that may apply. I'll submit patches shortly.

@bl-ue
Copy link

bl-ue commented Jun 4, 2021

Should this be closed now that nodejs/node#38798 is landed?

@mhdawson
Copy link
Member

@trxcllnt could you verify if the fix landed has addressed your issue?

@trxcllnt
Copy link
Author

@mhdawson is there a build of v16.4.0 I can try? I'm not set up to build node at the moment.

@mhdawson
Copy link
Member

@trxcllnt these are the nightly builds for master: https://nodejs.org/download/nightly/v17.0.0-nightly20210616767996047c/

I think that would like be the best place to test as 16.4 is not out yet. I do see it in the release proposal though: nodejs/node#39031 and the target for v16.4.0 shows as 15 June (nodejs/Release#658) so it may be coming soon.

@danielleadams
Copy link
Member

danielleadams commented Jun 17, 2021

@trxcllnt @mhdawson I am hoping to get 16.4 out tomorrow after the next npm release goes out.

@mhdawson
Copy link
Member

@danielleadams thanks for the update :)

@trxcllnt
Copy link
Author

Not resolved in node v16.4.0:

$ node
Welcome to Node.js v16.4.0.
Type ".help" for more information.
> var b = new (require('./build/Debug/crash_repro.node')).Bar()
undefined
> b.gFATAL ERROR: Error::Error napi_create_reference
 1: 0xb200e0 node::Abort() [node]
 2: 0xa3c157 node::FatalError(char const*, char const*) [node]
 3: 0xa3c160  [node]
 4: 0xaeb7ab napi_fatal_error [node]
 5: 0x7f19189d0a4e Napi::ObjectReference::~ObjectReference() [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
 6: 0x7f19189d0b14 Napi::Error::Error(napi_env__*, napi_value__*) [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
 7: 0x7f19189d097f Napi::Error::New(napi_env__*) [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
 8: 0x7f19189d0758 Napi::Function::New(unsigned long, napi_value__* const*) const [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
 9: 0x7f19189d06d9 Napi::Function::New(std::initializer_list<napi_value__*> const&) const [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
10: 0x7f19189d13a8 Napi::FunctionReference::New(std::initializer_list<napi_value__*> const&) const [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
11: 0x7f19189d250d Bar::get_new_foo(Napi::CallbackInfo const&) [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
12: 0x7f19189cfb65  [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
13: 0x7f19189cfc39  [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
14: 0x7f19189cfbfd  [/home/ptaylor/dev/node-addon-cmake-js-sandbox/build/Debug/crash_repro.node]
15: 0xaceebf  [node]
16: 0x15dbdcc  [node]
Aborted

@mhdawson
Copy link
Member

@trxcllnt thanks for the update. Seems like we've only fixed one of the several issues that @legendecas mentioned. I think we discussed options that are related to this issue in the Node-api team meeting today.

@github-actions
Copy link
Contributor

This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.

@mhdawson
Copy link
Member

This PR should help us get a better error: #1075

@JckXia is going to run with this applied and comment on what the updated output is.

@JckXia
Copy link
Member

JckXia commented Sep 24, 2021

I tried running the addon with changes in PR #1075. I now have a new fatal error inside napi_set_property. Though when I tried to run the same script inside a .js file I got an error saying get_new_foo does not exist.
checkStatus
newSc

Also napi_generic_error is only returned by napi_new_instance in REPL mode for some reason.

Edit:
I was running with a older version of node. But using 16.4.0, I got this error message instead:
pendingException

And the error is napi_pending_exception

@JckXia
Copy link
Member

JckXia commented Oct 1, 2021

After doing some testing, it looks like the value that we are trying to set on the error object is a type napi_null. And napi_set_property promptly fails with a napi_generic_failure. But the issue is still present when napi_new_instance returns a pending exception in REPL mode.

@JckXia
Copy link
Member

JckXia commented Oct 1, 2021

Also something I found: If we paste the entire script as a one liner into the REPL, the code would work properly:
success

Looking at napi_new_instance , the only thing that pops up to me is that we get a context object from the env and pass it to ctor->NewInstance(). Since it also looks like that the REPL will spawn threads to execute javascript commands, is it possible that we have sort of threading issue with not properly entering/exiting the v8 engine?

@JckXia
Copy link
Member

JckXia commented Oct 8, 2021

Sorry for spamming this issue, but I did more experiments with this particular addon and found that if we execute const bar = new addon.Bar() and console.log(bar.get_new_foo) as one line inside REPL, it will also work(with v16.4.0).

duoLine

I am not very familiar with how REPL or V8 work under the hood, but this is what I've found so far.

@legendecas
Copy link
Member

legendecas commented Oct 8, 2021

@JckXia hey, since you say that copy-pasting the single line code to the REPL didn't cause the crash, I have to mention that the error v8 thrown internally is that v8 is detecting if the native method is side-effect-free and if Node.js REPL is safe to invoke the method for previews.

So let's say when you typed bar.g in the REPL and don't type key enter, REPL is trying to evaluate the already typed inputs and show a preview of it. Like say:

> const foo = {bar: 'bar'}
undefined
> foo.bar // <= here you don't have to type ENTER to get the result shown below
'bar'

The mechanism v8 use to detect if a native method is side-effect-free is to disallowing javascript executions when invoking the native method. So back to the issue, when you type bar.g, Node.js REPL and v8 is trying to tell if bar.get_new_foo is side-effect-free, so they set up with disallowing javascript executions and invoked the method Bar.get_new_foo, in which the program invoked Foo's constructor, which eventually gets rejected by v8 with a raw string exception.

So the problem only occurs when you type in the REPL. Script mode and copy-paste a full script source text into the REPL won't trigger the crash.

@legendecas legendecas removed their assignment Oct 8, 2021
@KevinEady
Copy link
Contributor

We will relook at this issue when #1075 lands

@mhdawson
Copy link
Member

mhdawson commented Dec 3, 2021

#1075 has landed, but we still see an error in the define properties as JavaScript execution is not allowed.

@legendecas clarified that it is because the Repl tries to preview the result of a function but sets the environment so JavaScript Execution is not allowed. He already checked and we seem to set the flag telling V8 that the method has side effects which should prevent V8 from trying to call it in preview mode.

Next step is to dig a bit deeper into why the method is still being called. @legendecas will try to do that.

@mhdawson
Copy link
Member

@JckXia mentioned that it seems not to happen with the main branch from ~ 1 month ago. It might make sense to bisect to see where it potentially got "fixed" to see what changed between the good/back versions.

@JckXia
Copy link
Member

JckXia commented Dec 10, 2021

Did a quick version bisection. It looks like the node version upgrade from v16.5.0 to v16.6.0 "fixes" this bug. https://github.com/nodejs/node/releases/tag/v16.6.0

@mhdawson
Copy link
Member

mhdawson commented Jan 7, 2022

Discussed in the Node-api team meeting today.

Next step would seem to be to figure out which of the commits in https://github.com/nodejs/node/releases/tag/v16.6.0 fixes the issue and if it makes any sense to backport that to the 14.x release line (12.x goes EOL in April and so it may not be worth backporting to there)

@mhdawson
Copy link
Member

@legendecas mentioned he might find some time to look at the commits between those two and figure out which one seems to fix the issue.

@JckXia
Copy link
Member

JckXia commented Jan 14, 2022

Hey guys. It looks like the commit nodejs/node@6114198717 fixes the issue. It looks like a V8 ver upgrade as expected.

@legendecas
Copy link
Member

I can confirm https://chromium-review.googlesource.com/c/v8/v8/+/2857634 fixed calling side-effect accessors on side-effect-free debug-evaluate call. (manually picked the patch on v14.x locally to test out)

I'm not sure what the policy is of backporting v8 patches. Since this patch does not break ABI, I'd believe we can safely land the patch on v12.x and v14.x lines (with conflicts resolved, apparently there are lots of conflicts).

@JckXia
Copy link
Member

JckXia commented Feb 3, 2022

Hey everyone. I did some work for backporting this commit(https://chromium-review.googlesource.com/c/v8/v8/+/2857634) and have some questions. After patching it onto v14.x staging and trying to build node with make, I got a few compilation errors. But one of the more notable ones is "LoadPropertyFromDictionary is not defined within scope".

v8MakeIssue

After doing more digging it looks like this change was introduced by v8/v8@8d62d4b#diff-b2526928b4a5ecbf4274ccb91d3a2114e9ae3638f4ca0b66a57f1a0adcda6fb7R21-R27, and the SwissNameDictionary class seems to have a series of implementations (https://github.com/v8/v8/commits/main/src/objects/swiss-name-dictionary.h) that isn't present in V14.staging.

If we were to backport the changes in https://chromium-review.googlesource.com/c/v8/v8/+/2857634, would we have to backport the aforementioned changes with swiss-name-dictionary as well? Thanks.

@mhdawson
Copy link
Member

mhdawson commented Feb 4, 2022

Discussed in the Node-api team meeting today. Gabriel confirmed based on past experience that we need to pull in dependent changes. Likelihood being accepted for 14.x will depend on how bid the change ends up being.

@mhdawson
Copy link
Member

Based on investigation/discussion it looks like the change will be too risky for 14x. It's fixed in 16.x and current plan is to leave it at that.

Going to close this issue, please let us know if that was not the right thing to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants