Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What if a thread-safe function execution throws an exception,FATAL ERROR: Error::Error napi_define_properties, I catch exceptions and exception handling calls Env interface GetAndClearPendingException, then submitted to the following error #1244

Closed
Candylong opened this issue Dec 3, 2022 · 26 comments
Labels

Comments

@Candylong
Copy link

image

@Candylong Candylong changed the title What if a thread-safe function execution throws an exception What if a thread-safe function execution throws an exception,FATAL ERROR: Error::Error napi_define_properties Dec 3, 2022
@Candylong Candylong changed the title What if a thread-safe function execution throws an exception,FATAL ERROR: Error::Error napi_define_properties What if a thread-safe function execution throws an exception,FATAL ERROR: Error::Error napi_define_properties, I catch exceptions and exception handling calls Env interface GetAndClearPendingException, then submitted to the following error Dec 3, 2022
@JckXia
Copy link
Member

JckXia commented Dec 3, 2022

Hi there. Any chance you can provide a minimal reproducible example? (including the node version) I think this might be related to how we wrap primitive type errors thrown related to the changes in PR #1075

@Candylong
Copy link
Author

Hi there. Any chance you can provide a minimal reproducible example? (including the node version) I think this might be related to how we wrap primitive type errors thrown related to the changes in PR #1075

@Candylong Candylong reopened this Dec 4, 2022
@Candylong
Copy link
Author

Candylong commented Dec 4, 2022

@JckXia ,Thank you for your reply, i use the node-addon-api version 5.0.0 write the addon plugin, and run in electron 18(Indicates the corresponding nodejs version 16.13.2), this isn't a problem that always comes up, but I've encountered it several times using the reload feature on my electron DevTools, what should i do? Here are the crash stack and thread-safe functions when an exception occurs
image
image
Execute in a thread-safe function
auto jscallback = [=](Napi::Env env, Napi::Function jsCallback)
{
Napi::Object videoFrame = Napi::Object::New(env);
videoFrame .Set("width", Napi::Number::New(env, 1920)); //crash happens
}

@JckXia
Copy link
Member

JckXia commented Dec 4, 2022

Could you provide the source code of an addon that would best replicate this issue? Or would that not possible in this case?

But just a shot in the dark...it looks like the real issue here is napi_set_named_property failed to define the property width on the videoFrame object. The macro in the following line simply wraps the error in an Napi::Object, where we do the error wrapping logic but napi_define_properties shouldn't fail.

As you have mentioned it is an intermittent failure. Another option is to compile node 16.33.2 from the source and put a debugger starting at this line: https://github.com/nodejs/node/blob/main/src/js_native_api_v8.cc#L1104 should the error persists.

@Candylong
Copy link
Author

Candylong commented Dec 5, 2022 via email

@JckXia
Copy link
Member

JckXia commented Dec 5, 2022

Sorry, the format looked a bit funky to me. Is this a new example are you talking about, or is it the one in your original post?

@Candylong
Copy link
Author

Candylong commented Dec 6, 2022

Sorry, that message was sent in the wrong format, @JckXia , Here is the addon implementation code I can reproduce for crash:

#include <napi.h>
#include
#include

class MyObjWrap : public Napi::ObjectWrap {

public:
static Napi::FunctionReference Initialize(Napi::Env env) {
printf("init MyObjWrap\n");
Napi::Function fn = DefineClass(env, "MyObjWrap", {
InstanceMethod("print", &MyObjWrap::Print),
});
return Napi::Persistent(fn);
}
MyObjWrap(const Napi::CallbackInfo& info) : Napi::ObjectWrap(info) {
// ...
}
~MyObjWrap() {
// ...
}

private:
void Print(const Napi::CallbackInfo& info){
std::cout << "Print ..." << std::endl;
}

};

class MyObjWrap1 : public Napi::ObjectWrap {

public:
static Napi::FunctionReference Initialize(Napi::Env env) {
Napi::Function fn = DefineClass(env, "MyObjWrap1", {
InstanceMethod("print1", &MyObjWrap1::Print1),
});
return Napi::Persistent(fn);
}
MyObjWrap1(const Napi::CallbackInfo& info) : Napi::ObjectWrap(info) {
// ...
}
~MyObjWrap1() {
// ...
}

private:
void Print1(const Napi::CallbackInfo& info){
std::cout << "Print1 ..." << std::endl;
}

};

Napi::ThreadSafeFunction myThreadFunction;
using callJsfuntion = std::function<void(Napi::Env, Napi::Function)>;
bool jsCallBack(callJsfuntion fucntion)
{
if (napi_status::napi_ok == myThreadFunction.Acquire())
{
bool ok = napi_ok == myThreadFunction.BlockingCall(fucntion);
myThreadFunction.Release();
return ok;
}
else
{
return true;
}
}

bool isStop(true);
void threadLoop()
{
while(!isStop)
{
auto callback = [=](Napi::Env env, Napi::Function function)
{
Napi::Object videoFrame = Napi::Object::New(env);
videoFrame.Set("width", Napi::Number::New(env, 1920)); // exceptions occur
videoFrame.Set("mode2", Napi::Number::New(env, 1920));
videoFrame.Set("mode3", Napi::Number::New(env, 1080));
videoFrame.Set("mode4", Napi::Number::New(env, 1920));
videoFrame.Set("mode5", Napi::Number::New(env, 1080));
for(int i = 0; i<5000; i++) // Compare to internal frequent tasks
{
videoFrame.Set(std::to_string(i).c_str(), Napi::String::New(env, "123456789"));
}

  for(int i = 0; i<7000; i++)
  {
    videoFrame.Set(std::to_string(i).c_str(), Napi::Number::New(env, 1080));
  }
  function.Call({Napi::String::New(env, "hello world!"), videoFrame});
};
jsCallBack(callback);
// sleep

std::this_thread::sleep_for(std::chrono::microseconds(1));
}
}

void threadLoop2()
{
while(!isStop)
{
auto callback = [=](Napi::Env env, Napi::Function function)
{
Napi::Object videoFrame = Napi::Object::New(env);
videoFrame.Set("width", Napi::Number::New(env, 1920)); // exceptions occur
videoFrame.Set("mode2", Napi::Number::New(env, 1920));
videoFrame.Set("mode3", Napi::Number::New(env, 1080));
videoFrame.Set("mode4", Napi::Number::New(env, 1920));
videoFrame.Set("mode5", Napi::Number::New(env, 1080));

  for(int i = 0; i<5000; i++)
  {
    videoFrame.Set(std::to_string(i).c_str(), Napi::Number::New(env, 1080));
  }

  for(int i = 0; i<5000; i++)   // Compare to internal frequent tasks
  {
    videoFrame.Set(std::to_string(i).c_str(), Napi::String::New(env, "123456789"));
  }
  function.Call({Napi::String::New(env, "hello world!"), videoFrame});
};
jsCallBack(callback);
// sleep

std::this_thread::sleep_for(std::chrono::microseconds(1));
}
}

class MyObjWrap2 : public Napi::ObjectWrap {

public:
static Napi::FunctionReference Initialize(Napi::Env env) {
printf("init MyObjWrap2\n");
Napi::Function fn = DefineClass(env, "MyObjWrap2", {
InstanceMethod("print2", &MyObjWrap2::Print2),
});
return Napi::Persistent(fn);
}
MyObjWrap2(const Napi::CallbackInfo& info) : Napi::ObjectWrap(info) {
// ...
isStop = false;
// thread 1
mythread1 = std::thread(std::bind(threadLoop));
// thread 2
mythread2 = std::thread(std::bind(threadLoop2));
}
~MyObjWrap2()
{
isStop = true;
mythread1.join();
mythread2.join();
}

private:
void Print2(const Napi::CallbackInfo& info){
std::cout << "Print2 ..." << std::endl;
}
std::thread mythread1;
std::thread mythread2;

};

class Addon : public Napi::Addon {
public:

Napi::Value CreateFunction(const Napi::CallbackInfo& info)
{
printf("Addon Function \n");
Napi::Env env = info.Env();
Napi::Object param = info[0].AsNapi::Object();
if (!param.Has("tsfn"))
{
Napi::TypeError::New(env, "Wrong arguments").ThrowAsJavaScriptException();
return env.Undefined();
}

Napi::Function tsfn = param.Get("tsfn").As<Napi::Function>();
myThreadFunction = Napi::ThreadSafeFunction::New(
                env,
                tsfn,            // JavaScript function called asynchronously
                "Resource Name", // Name
                0,               // Unlimited queue
                1,               // Only one thread will use this initially
                [](Napi::Env) {  // Finalizer used to clean threads up
                    printf("Addon tsfn finalizer \n");
                });

env.AddCleanupHook([=](){
  printf("execute clean hook\n");
  myThreadFunction.Release();
});
return Napi::Number::New(env, 0);

}

Napi::Value CreateObject(const Napi::CallbackInfo& info)
{
printf("Addon CreateObject \n");
Napi::Env env = info.Env();

Napi::Object jsObject = myObjWrapCtr2.New({});
return jsObject;

}

Addon(Napi::Env env, Napi::Object exports) {
myObjWrapCtr = MyObjWrap::Initialize(env);
myObjWrapCtr1 = MyObjWrap1::Initialize(env);
myObjWrapCtr2 = MyObjWrap2::Initialize(env);
DefineAddon(exports, {
InstanceMethod("CreateFunction", &Addon::CreateFunction),
InstanceMethod("CreateObject", &Addon::CreateObject)}
);
}
private:
Napi::FunctionReference myObjWrapCtr;
Napi::FunctionReference myObjWrapCtr1;
Napi::FunctionReference myObjWrapCtr2;
};

// The macro announces that instances of the class CallasAddon will be
// created for each instance of the add-on that must be loaded into Node.js.
NODE_API_NAMED_ADDON(NODE_GYP_MODULE_NAME, Addon)

Here is the code I can reproduce for electron using addon:(render.js)

//'use strict'
const addon = require('./addon')
let myObjWrap2 = null
const testAPI = document.getElementById('TEST')
function callEmit()
{
console.log(arguments[0]);
}
testAPI.onclick = () =>
{
addon.CreateFunction({tsfn: callEmit})
myObjWrap2 = addon.CreateObject()
}

To repeat the step, click the button on the electron interface and then click reload on Devtools and wait for a while to crash
image
image

Judging by the results of my tests:I think when I click on the button in the electron (start threads and run thread-safe function), because the thread-safe function need to perform more tasks (may queue accumulation is too many tasks, not to be able to perform), and then led to a thread-safe function performs the exception, but what I don't really understand the underlying implementation of it @JckXia

@JckXia
Copy link
Member

JckXia commented Dec 6, 2022

Ok thank you @Candylong. Will try to reproduce this on my end and get back to you asap.

@JckXia
Copy link
Member

JckXia commented Dec 8, 2022

Hey @Candylong, there were some issues with the code you provided. I made some fixes and stored it under https://github.com/JckXia/napi-bug-repro. Just want to double check we are indeed able to reproduce the bug with this addon. I haven't used electron before, so will do more read-up on that tommorow but just running it with node shows no problems.

Out of curiosity, were you able to replicate this issue outside of the electron environment? Or does the error only show up when run in electron? (For example, just run node driver.js after building the addon from my repo)

@Candylong
Copy link
Author

Candylong commented Dec 8, 2022

@JckXia ,like you said, normal operation will be fine, the main issue is to highlight the context-aware addon feature, while electron's reload will reload addon(.node) repeatedly,which is what context-aware addon supports(https://nodejs.org/dist/latest-v18.x/docs/api/addons.html#context-aware-addons), so it shouldn't be a problem
image,

I have arranged my demo and put it here(https://github.com/Candylong/napi-electron-demo). I mainly run it on windows, and it can be reproduced after about 5 operations. Can you have a try

@JckXia
Copy link
Member

JckXia commented Dec 9, 2022

Hey @Candylong. I tried your code and can indeed replicate this bug (clicking the button in 5 rapid successions and then attempting to reload). Will take a deeper dive tomorrow.

@Candylong
Copy link
Author

@JckXia ,Thank you very much and look forward to your reply.

@JckXia
Copy link
Member

JckXia commented Dec 13, 2022

I did more testing with this addon, and here are some things I've noticed:

  1. This crash can be replicated in two ways. We could click the button twice, then attempt to reload the chromium-browser. Or, we could click the button once, let it run for a bit then attempt to reload.

  2. The crashes seem to reliably occur on either line https://github.com/Candylong/napi-electron-demo/blob/main/addon.cc#L77 or line https://github.com/Candylong/napi-electron-demo/blob/main/addon.cc#L106. In addition, all other object accessor methods ('Has', 'Get','Delete') will fail at this step. Oddly, we have no problem creating new objects, it's just that we couldn't perform any CRUD actions on them. Looking at the node core implementation it appears that object accessor methods utilize the v8 context attached to the env variable. Not too sure if that is correlated with this crash, however.

Hmm, any ideas? @nodejs/node-api

@JckXia
Copy link
Member

JckXia commented Dec 13, 2022

On a side note, it appears the electron version listed in repo https://github.com/Candylong/napi-electron-demo binds to node.js version v14.7.0. And what I've found is the locally installed node.js must correspond to this version, or electron will not be able to load the locally compiled addon properly. I opted to bump the electron up to latest and this bug persists still.

@Candylong
Copy link
Author

Hey@JckXia ,is there any progress on this problem? It seems that this problem is blocked at present. What should I do?

@JckXia
Copy link
Member

JckXia commented Dec 21, 2022

Hey @Candylong, sorry haven't got much time to look at this. Still not too sure about the underlying mechanism that causes this but I suspect it has to do with how we are spawning the threads.

But just some observations regarding the code you've provided. From what I understand, you want to tie the life cycle of MyObjWrap2 with threads mythread1 and mythread2, where the condition is if a global variable isStop is set to true in the dtor of MyObjWrap. If that were to be the base, wouldn't it be a better idea to have the isStop variable as a class variable instead?

Another thing that I've noticed is that perhaps the two threads are competing for resources. When I simply run one of the two threads, the crash still happens but doesn't get triggered as easily. Is there any way you could make use of condition variables to coordinate the execution of code in threadLoop and threadLoop2? For example, set a condition variable in the while condition such that at any given time there is only one thread running. Of course, it might not fit your use case but just something to think about.

@JckXia
Copy link
Member

JckXia commented Dec 21, 2022

I did a quick proof of concept here: https://github.com/JckXia/napi-bug-repro/blob/main/example.cc for reference.

@Candylong
Copy link
Author

@JckXia ,Thank you very much. But he reason why I used two threads to run was to simulate reproducing the crash problem(frequent callbacks using thread-safe functions). In the actual scenario, these threads were third-party library threads, I could not do this as you said. For example, reload should be executed into Hook (AddCleanupHook) first, and then I call the third-party library interface in the hook function to synchronously stop the threads.

@Candylong
Copy link
Author

The problem I'm having is blocked. Do you have any good ideas? Or can you look at this problem together? Thank you very much. Hey, @JckXia @NickNaso @mhdawson or other Nodejs(or napi-addon-api) members.

@github-actions
Copy link
Contributor

This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.

@github-actions github-actions bot added the stale label Mar 24, 2023
@Candylong
Copy link
Author

Does anyone know about this problem?

@github-actions github-actions bot removed the stale label Mar 25, 2023
@gabrielschulhof
Copy link
Contributor

@Candylong I just tried your demo. It says "We are using Node.js 18.12.1, Chromium 110.0.5481.208, and Electron 23.2.2." I pressed the button 5 times in quick succession then reloaded. It did not crash. Are you still seeing this problem>

@Candylong
Copy link
Author

@gabrielschulhof ,Hi, Instead of clicking the button in quick succession, I clicked the button each time and then clicked reload on electron DevTools. My test probably crashed after clicking reload on electron DevTools 5 times(Click the button on the interface and click reload on the electron DevTools)

@gabrielschulhof
Copy link
Contributor

gabrielschulhof commented Apr 10, 2023

@Candylong I'm not seeing a reload button on my devtools, except on the performance tab where I can reload and start recording. I tried to click the button and then press Ctrl+R, and that worked lots of times in succession. I may not be able to reproduce the problem because there's an error in the console during reload:

node:internal/modules/cjs/loader:1063 Uncaught TypeError: The argument 'id' must be a non-empty string. Received ''
    at __node_internal_captureLargerStackTrace (node:internal/errors:484:5)
    at new NodeError (node:internal/errors:393:5)
    at Module.require (node:internal/modules/cjs/loader:1063:11)
    at require (node:internal/modules/cjs/helpers:103:18)
    at renderer.js:13:15
__node_internal_captureLargerStackTrace @ node:internal/errors:484
NodeError @ node:internal/errors:393
Module.require @ node:internal/modules/cjs/loader:1063
require @ node:internal/modules/cjs/helpers:103
(anonymous) @ renderer.js:13
node:electron/js2c/renderer_init:2 Electron Security Warning (Insecure Content-Security-Policy) This renderer process has either no Content Security
  Policy set or a policy with "unsafe-eval" enabled. This exposes users of
  this app to unnecessary security risks.

For more information and help, consult
https://electronjs.org/docs/tutorial/security.
This warning will not show up
once the app is packaged.```

@Candylong
Copy link
Author

@gabrielschulhof
This reload in the image below,every time you click 'testAPI' on the interface and then click reload in the image below, about the fifth time you repeat this operation, a crash will occur
image
crash-1

@github-actions
Copy link
Contributor

This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made.

@github-actions github-actions bot added the stale label Jul 11, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants