node does not abort at the right time when using --abort-on-uncaught-exception #3035
This PR fixes 0af4c9e so that node aborts at the right time when throwing an error and using --abort-on-uncaught-exception.

Basically, it wraps most node internal callbacks with:

    if (!domain || domain.emittingTopLevelError)
      runCallback();
    else {
      try {
        runCallback();
      } catch (err) {
        process._fatalException(err);
      }
    }

so that V8 can abort properly in Isolate::Throw if --abort-on-uncaught-exception was passed on the command line, and domain can handle the error if one is active and not already in the top level domain's error handler.

It also reverts 921f2de partially: node::FatalException does not abort anymore because at that time, it's already too late.

It adds process._forceTickDone, which is really a hack to allow test-next-tick-error-spin.js to pass. It's here to basically avoid an infinite recursion when throwing in a domain from a nextTick callback, and queuing the same callback on the next tick from the domain's error handler.

This change is an alternative approach to nodejs#3036 for fixing nodejs#3035.

Fixes nodejs#3035.
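To make the wrapping described in this PR easier to discuss, here is a minimal standalone sketch of the same branching logic. `invokeCallback`, `fatalException`, and the plain-object `activeDomain` are hypothetical stand-ins for illustration only; they are not node internals, and the real change lives in node's own callback plumbing.

```javascript
'use strict';

// Hypothetical helper, not a node internal: mirrors the branching described above.
function invokeCallback(domain, runCallback, fatalException) {
  if (!domain || domain.emittingTopLevelError) {
    // No try/catch here, so a thrown error stays uncaught at the point
    // where it is thrown, with all relevant frames still on the stack.
    runCallback();
    return;
  }
  try {
    runCallback();
  } catch (err) {
    // An active domain gets a chance to handle the error.
    fatalException(err);
  }
}

// Case 1: an active domain (stand-in object) routes the error to the handler.
const handled = [];
const activeDomain = { emittingTopLevelError: false };
invokeCallback(
  activeDomain,
  () => { throw new Error('boom'); },
  (err) => handled.push(err.message)
);

// Case 2: no domain, so the error escapes invokeCallback untouched.
// The outer try/catch exists only so this sketch can observe that.
let escaped = null;
try {
  invokeCallback(
    null,
    () => { throw new Error('unguarded'); },
    (err) => handled.push(err.message)
  );
} catch (err) {
  escaped = err.message;
}

console.log(handled, escaped);
```

In the second case nothing between the throw and the top of the stack catches the error, which is the condition under which V8 can abort in Isolate::Throw when --abort-on-uncaught-exception is set.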
I implemented two different approaches with #3036 and #3038 to solve this problem. They're both drafts of PRs, and I didn't pay too much attention to details. I'm mainly looking for feedback on what would be the preferred approach. I personally have a preference for #3036, because I find it less intrusive and it's been tested with node v0.10. However I would welcome any other opinion/feedback, and there may be other valid approaches.
Just to weigh in here: this is critical for us at Netflix and our production stack, and will potentially prevent us from moving to 4.x until this is fixed. Thanks for all the hard work @misterdjules, and we'd love to see this integrated soon!
Thanks @misterdjules! I feel like #3036 is a much cleaner way of doing it and it doesn't require messing with timers and repl.
👍 It's important that we have the right stack. We need the stack in the core file to contain the entire call stack at the point where we blew up.
If the two approaches described in #3036 and #3038 are not viable, a third option is to revert 0af4c9e and 921f2de, and to display a warning when domains and This would bring us back to the point where domains and |
If a domain catches an exception and then determines there is no active domain and rethrows it into the global uncaught you now have a new stack trace that is not the exception you want to debug with a corefile. Is it possible to use From what I understand whenever you have domains enabled the stacktrace will always be a rethrow instead of the stack trace you actually want. That being said; if all you care about is heap analysis instead of stack analysis then |
Just to make sure we're on the same page, and because wording can be tricky when discussing these topics, if you mean that given the following code:
running it with
Not with the above-mentioned use case as far as I know.
Yes, if an error is thrown from the top-level domain's error handler, the stack trace won't contain the frame where the original error was thrown.
Exactly. In general, using This issue is about two things:
Sounds great. We are on the same page.
#3036 was updated now that its V8-related changes landed upstream (see https://codereview.chromium.org/1375933003/).
Revert 0af4c9e, parts of 921f2de and port nodejs/node-v0.x-archive#25835 from v0.12 to master so that node aborts at the right time when an error is thrown and --abort-on-uncaught-exception is used. Fixes nodejs#3035.
nodejs/master's head does not abort at the right time when using `--abort-on-uncaught-exception`. Here's a reproduction of the problem on SmartOS:

Note that the call stack doesn't contain any JavaScript frame, and indicates that node aborted in `node::FatalException`, not when the error was thrown.

This is a problem because users of post-mortem debuggers and `--abort-on-uncaught-exception` need to have core dumps that are generated when the exception is thrown, and the core dumps need to have the frame that throws the error in the call stack. Otherwise, it becomes much more difficult, if not impossible, to determine the root cause of the problem.

In the example above, there's no way to know that the error was thrown by the function named `foo`; we just know that one timer's callback threw.

This regression was introduced by #922, which made V8 ignore `--abort-on-uncaught-exception`. An attempt at fixing that regression was made with #2776, but instead of letting V8 abort in `Isolate::Throw` (when it actually throws the error and when all the relevant frames are active on the stack), it throws in `node::FatalException` as shown above.

I will submit two different PRs that fix this issue in two different ways so that we can discuss the pros and cons of the two different approaches I came up with.
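As a rough illustration of why the timing matters (this is not the original reproduction, and it uses JavaScript stack strings only as an analogy for the native call stack found in a core dump), the sketch below compares the stack captured where the error is thrown with the stack that is live once a handler runs. The function names `foo` and `timerCallback` are illustrative assumptions.

```javascript
'use strict';

function foo() {
  throw new Error('boom');
}

function timerCallback() {
  foo();
}

let stackAtThrow = null;
let stackAtHandler = null;
try {
  timerCallback();
} catch (err) {
  // Captured when the error was created and thrown: includes the foo frame.
  stackAtThrow = err.stack;
  // Captured in the handler: foo and timerCallback have already unwound.
  stackAtHandler = new Error('handler').stack;
}

const thrownHasFoo = stackAtThrow.includes('at foo');
const handlerHasFoo = stackAtHandler.includes('at foo');
console.log({ thrownHasFoo, handlerHasFoo });
```

By the time any handler runs, the frames that threw are gone from the live stack. A process that aborts at that later point produces a core dump with the same limitation.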
/cc @nodejs/post-mortem