Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: document --abort-on-uncaught-exception #13931

Merged

Conversation

sam-github
Copy link
Contributor

It used to be implemented solely by V8, but has been a node option since
#12892. Its important for post-mortem
diagnostics and should be more prominently documented.

Checklist
Affected core subsystem(s)

src,doc

@nodejs-github-bot nodejs-github-bot added the c++ Issues and PRs that require attention from people who are familiar with C++. label Jun 26, 2017
@sam-github sam-github requested a review from misterdjules June 26, 2017 18:16
@sam-github
Copy link
Contributor Author

@misterdjules Is my doc text OK? If you can think of any improvements, I'm happy to make them.

@sam-github
Copy link
Contributor Author

I think this should be backported down to and including 6.x @nodejs/lts (once it merges, of course)

-->

Aborting instead of exiting causes a core file to be generated for post-mortem
analysis using a debugger (such as `lldb`, `gdb`, and `mdb`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this dependent on system settings, e.g. ulimit -c on Linux?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By 'this' I mean the generation of a core file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, should I give some advice here on ulimit, or just say that it may generate a core file?

added: v0.10
-->

Aborting instead of exiting causes a core file to be generated for post-mortem
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion (feel free to take it or ignore it):

Abort the process and generate a core file if an uncaught exception occurs. The core file can be used for post-mortem analysis using a debugger such as lldb, gdb, and mdb.

@misterdjules
Copy link

misterdjules commented Jun 26, 2017

Just thinking out loud: I wasn't aware of #12892 I wasn't aware of the changes to how the --abort-on-uncaught-exception command line option is handled and used in #12892 and I haven't had the time to take a close look at the changes, but at first sight I don't understand why the --abort-on-uncaught-exception command line option was chosen to determine when to exit or abort when an internal error occurs in the Async Hooks implementation.

My understanding of --abort-on-uncaught-exception is that it makes node abort when a JS error is thrown and is not caught, which seems to not cover the use case of the Async Hooks implementation.

If it is still determined that it's worth it to handle and use the --abort-on-uncaught-exception option in node and not just to forward it to V8, I would recommend to use a similar (if not identical) description in the documentation than the one that is already in V8's source:

$ node --v8-options | grep abort
  --hard_abort (abort by crashing)
  --abort_on_uncaught_exception (abort program (dump core) when an uncaught exception is thrown)
$

@sam-github
Copy link
Contributor Author

sam-github commented Jun 26, 2017

@misterdjules You bring up good an interesting point, I had not noticed that, I brought it up in the async hooks PR.

Personally, I think --abort-on-uncaught-exception should be documented in cli.md and node --help no matter where in the code the option is handled. I realized it wasn't documented when I was searching the node docs for a link to the option's documentation so I could include it in a publication on the importance of core dumps to post-mortem debugging, and was quite puzzled not to find it. At first I assumed it had just been a historical oversight. Its absence from the node usage message was also why I forgot to add it in the NODE_OPTIONS whitelist, even though it is a perfect fit for NODE_OPTIONS's use-cases.

@AndreasMadsen
Copy link
Member

Some of the error (actually most) that happens in async_hooks we can't delegate to the uncaughtException handler. Thus, we remove the uncaughtException event handlers and let the event kill the process. I'm not sure why we don't do that in Environment::AsyncHooks::pop_ids but the effect should be the same. /cc @trevnorris

@sam-github please just cc @nodejs/async_hooks if you have async_hooks questions. Commenting in the old thread is a little awkward.

@misterdjules
Copy link

misterdjules commented Jun 26, 2017

@sam-github

You bring up good an interesting point, I had not noticed that, I brought it up in the async hooks PR.

Thanks!

Personally, I think --abort-on-uncaught-exception should be documented in cli.md and node --help no matter where in the code the option is handled

From a user experience perspective, I agree.

However, if --abort-on-uncaught-exception was removed from the set of command line options that node handles itself, but still documented in cli.md, would that mean that other options that are currently not documented (like the options in the set of V8 options) should also be documented? Or would the decision between documenting and not documenting a command line option be based on the perceived impact on user experience?

@sam-github
Copy link
Contributor Author

I think it would have to be based on perceived user-experience. Clearly putting 100s (literally) of ever-changing V8 options (they vary v8 release to release, and even vary by platform within a V8 release) would not be helpful. --abort-on-uncaught-exception and the somewhat obscurely named --max-old-space-size are the only options I hear regularly recommended for production use.

doc/api/cli.md Outdated
added: v0.10
-->

Abort (dump core) when an uncaught exception is thrown. On most systems this will

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it seems the rest of this file uses just one space after a dot at the start of a sentence.

@Trott
Copy link
Member

Trott commented Jun 27, 2017

Probably a question for LTS WG:

Would documenting --abort-on-uncaught-exception mean that if V8 significantly alters its behavior, that's now a breaking change in Node.js and we can't ship the V8 update until the next major release?

If not--if we might ship the V8 update anyway--then should a one-sentence disclaimer be added letting users know that its behavior could change and that won't be considered a breaking change the way it would for other flags?

@sam-github
Copy link
Contributor Author

@Trott note that we document at least one other V8 option already, and that if V8 broke this behaviour in a way that Node couldn't fix itself, I would oppose shipping such a V8 update as a minor. The ability to core dump on uncaught exception is critical to post-mortem diagnostics, we can't rip that tool out of user's hands in a minor.

And I don't think @ofrobots and team have any intention of changing this long-standing CLI option.

@sam-github
Copy link
Contributor Author

@nodejs/post-mortem I am attempting to document a critical post-mortem debug tool, can I have some thumbs up, please?

@refack
Copy link
Contributor

refack commented Jul 5, 2017

If not--if we might ship the V8 update anyway--then should a one-sentence disclaimer be added letting users know that its behavior could change and that won't be considered a breaking change the way it would for other flags?

👍

@Trott note that we document at least one other V8 option already, and that if V8 broke this behaviour in a way that Node couldn't fix itself, I would oppose shipping such a V8 update as a minor. The ability to core dump on uncaught exception is critical to post-mortem diagnostics, we can't rip that tool out of user's hands in a minor.

👍

I think both these points should be documented "--abort-on-uncaught-exception is a V8 option, but node considers it an important tool and will..."

@misterdjules
Copy link

@Trott

FWIW, I agree with what @sam-github said with the following:

if V8 broke this behaviour in a way that Node couldn't fix itself, I would oppose shipping such a V8 update as a minor. The ability to core dump on uncaught exception is critical to post-mortem diagnostics, we can't rip that tool out of user's hands in a minor.

@refack
Copy link
Contributor

refack commented Jul 5, 2017

Do we want to start implementing a contingency our own implementation? (someone said debugger protocol) From a very quick overview it seems to me like something we can implement with relative ease...

@gibfahn
Copy link
Member

gibfahn commented Jul 6, 2017

Would documenting --abort-on-uncaught-exception mean that if V8 significantly alters its behavior, that's now a breaking change in Node.js and we can't ship the V8 update until the next major release?

Yes, but that's true of any V8 API, and that's why we don't update V8 unless it's compatible.

I would oppose shipping such a V8 update as a minor.

Also agreed.

should a one-sentence disclaimer be added letting users know that its behavior could change and that won't be considered a breaking change the way it would for other flags?

Given the above, I think we should specifically not say this, it should be considered a breaking change.

doc/api/cli.md Outdated
-->

Abort (dump core) when an uncaught exception is thrown. On most systems this will
generate a core file, though `ulimit -c unlimited` may be necessary if resource
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[just asking] Are the clauses about ulimit and on how to use core-dumps needed? Can't we assume (or link to) prior knowledge?

Copy link
Member

@gibfahn gibfahn Jul 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is short enough that it's fine as-is here, linking to something that said these 10 words would be a bit overkill IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this from an earlier review comment. Many node users aren't familiar with ulimit and core dumps, and I'm pretty sure we've seen issues raised claiming the feature is broken because there is no core dump, this attempts to preempt that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its also hard to find good links, particularly to info that is generic across unixen. This has been a problem with man page links, too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@refack
Copy link
Contributor

refack commented Jul 6, 2017

should a one-sentence disclaimer be added letting users know that its behavior could change and that won't be considered a breaking change the way it would for other flags?

Given the above, I think we should specifically not say this, it should be considered a breaking change.

It was suggested to add the opposite claim — even though it's V8 we will consider it a breaking change

@gibfahn
Copy link
Member

gibfahn commented Jul 6, 2017

It was suggested to add the opposite claim — even though it's V8 we will consider it a breaking change

Not sure that's needed, that's the semver default for any documented option.

Copy link
Member

@mhdawson mhdawson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mhdawson
Copy link
Member

mhdawson commented Jul 6, 2017

@refack
Copy link
Contributor

refack commented Jul 6, 2017

Not sure that's needed, that's the semver default for any documented option.

Generally when in doubt, I'd rather be explicit than implicit but I'm +0 on that.

What's more important IMHO is making this an explicit feature support decision and documenting this internally, so it doesn't rot or regress.
We have tests that check the exit code and signal from --abort-on-uncaught-exception (I believe that's what triggered this PR), but we do not assert anything about ulimit and about the usability of the core-dumps. If it's claimed in the public docs we need to assert it's true.
Ref: #14013

@sam-github
Copy link
Contributor Author

What triggered this PR is me trying to link to docs on --abort-on-uncaught-exception, and realizing that literally the only docs are to run node --v8-options from the shell, which isn't linkable or particularly discoverable, buried as it is in 440 other options.

but we do not assert anything about ulimit and about the usability of the core-dumps. If it's claimed in the public docs we need to assert it's true.

These are base O/S mechanisms, and we don't test them. It doesn't make sense to test that a core dump (written by the operating system!) can actually be read by a debugger, or test that an O/S really does implement POSIX process limits.

This was referenced Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++.
Projects
None yet
Development

Successfully merging this pull request may close these issues.