Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cluster: remove deprecated problematic API #13684

Closed
wants to merge 1 commit into from

Conversation

isaacs
Copy link
Contributor

@isaacs isaacs commented Jun 14, 2017

Content Warning: mental health and self-harm triggers.


The word "suicide" was used in the cluster API 4 years ago. This was a
serious mistake for several reasons. It was deprecated in version 6,
and long past time for full removal now.

First, and most important, casual use of words referring to self-harm
serve to stigmatize mental illness and make it harder for people in our
community to literally stay alive. It is impossible to claim that we
intend to be inclusive and welcoming, and also casually show such
disrespect for the lives of those in our community. Some of us have
lost friends or struggled with mental illness. Leaving this lying
around tells us that we are not welcome.

Second, in addition to being just wildly inappropriate on ethical and
community grounds, the term "suicide" was never even appropriate
technically! In context, it refers to a case where a child process
exits without receiving a fatal signal from the parent process. The
"cute" unix joke here is that the process "killed itself". However, the
"suicide" flag is set regardless of how the process exited, so in
addition to being offensive and harmful, it is also misleading and
incorrect, and has always been so. (It could've thrown an error, called
process.exit() or received a fatal signal from some other process.)

An open source project is not the place for emotional landmines. It's
distracting and misleading at best, and actively harmful at worst.
There is no justification for it. The API has already been renamed, and
it's an obscure surface that is unlikely to ever be used outside of
core.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)
  • cluster

Content Warning: mental health and self-harm triggers.

---

The word "suicide" was used in the cluster API 4 years ago.  This was a
serious mistake for several reasons.  It was deprecated in version 6,
and long past time for full removal now.

First, and most important, casual use of words referring to self-harm
serve to stigmatize mental illness and make it harder for people in our
community to literally stay alive.  It is impossible to claim that we
intend to be inclusive and welcoming, and also casually show such
disrespect for the lives of those in our community.  Some of us have
lost friends or struggled with mental illness.  Leaving this lying
around tells us that we are not welcome.

Second, in addition to being just wildly inappropriate on ethical and
community grounds, the term "suicide" was never even appropriate
technically!  In context, it refers to a case where a child process
exits without receiving a fatal signal from the parent process.  The
"cute" unix joke here is that the process "killed itself".  However, the
"suicide" flag is set regardless of how the process exited, so in
addition to being offensive and harmful, it is also misleading and
incorrect, and has always been so.  (It could've thrown an error, called
`process.exit()` or received a fatal signal from some other process.)

An open source project is not the place for emotional landmines.  It's
distracting and misleading at best, and actively harmful at worst.
There is no justification for it.  The API has already been renamed, and
it's an obscure surface that is unlikely to ever be used outside of
core.
@vsemozhetbyt vsemozhetbyt added the cluster Issues and PRs related to the cluster subsystem. label Jun 14, 2017
@vsemozhetbyt
Copy link
Contributor

@tniessen tniessen added the semver-major PRs that contain breaking changes and should be released in the next major version. label Jun 14, 2017
@tniessen
Copy link
Member

This is semver-major, right?

@cjihrig
Copy link
Contributor

cjihrig commented Jun 14, 2017

@tniessen yes

@michaelmcmillan
Copy link

I applaud the concern for others, but I doubt that this is really helping anyone. We only have so many words to play with and blacklisting one because it could potentially trigger a negative emotional response seems counterproductive.

This is a striking example of the Voldemort Effect in the Harry Potter books. Suicide is equivalent with "he who must not be named". Instead of talking about the problem openly you shut down conversation in hopes of it going away.

@cjihrig
Copy link
Contributor

cjihrig commented Jun 14, 2017

@michaelmcmillan the argument about getting rid of suicide happened a long time ago. Now it's nothing but legacy dead weight in the codebase (see exitedAfterDisconnect instead). We should probably do some sort of usage analysis though.

@michaelmcmillan
Copy link

@cjihrig I was not aware of that, sorry for resurrecting that whole debate.

@gibfahn
Copy link
Member

gibfahn commented Jun 14, 2017

So this was doc deprecated in Node 6 and emitted a warning in Node 7? Finally dropping it in Node 9 seems reasonable.

CitGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/877/

@sam-github
Copy link
Contributor

Perhaps this (and all other deprecated APIs) could be routinely removed once an alternative is available to them across all LTS branches. This particular property has a replacement in 6.x and later, but not in 4.x. We can backport the replacement property to 4.x, which should be quite low risk, or we can wait until 4.x goes out of maintenance and do a house-cleaning then.

@sam-github
Copy link
Contributor

To be clear, I'm not opposed to doing this now, but I hope its deprecation cycle is the same as every other deprecated property. We should find all the candidates for deprecation, and make sure they all go away on the same schedule.

@Trott
Copy link
Member

Trott commented Jun 14, 2017

Perhaps this (and all other deprecated APIs) could be routinely removed

[UPDATED: The -1 below is a comment on the quoted half-sentence above, not on this PR. The change in this PR is fine by me. In fact, my comments are so tangential that I'm going to wrap them in a Details block.]

Definitely -1 on removing stuff without a reason.
  • deprecated does not mean "going away". It means "discouraged" or "not recommended". It absolutely makes sense for us to deprecate things that might never be removed.

  • A very typical consideration is a soft estimate of ecosystem breakage. Consider Buffer constructor. There's a viable alternative so maybe Buffer constructor will be removed...or maybe the breakage to the ecosystem would be too great such that removing it any time in the next several years is unthinkable. Even if you disagree with that assessment on Buffer constructor in particular, you get the idea in general.

  • Not a fan of removing stuff just to remove stuff. If it costs little or nothing to keep it, then keep it. We've had sys forever and will probably never remove it for exactly this reason.

@nodejs nodejs locked and limited conversation to collaborators Jun 14, 2017
@nodejs nodejs deleted a comment from wrq Jun 14, 2017
@gibfahn
Copy link
Member

gibfahn commented Jun 14, 2017

Definitely -1 on removing stuff without a reason.

@Trott If I understand you correctly you're replying to @sam-github's general point about removing APIs, not expressing a particular issue with this PR. If so can we move that to a new Issue (I think/hope @sam-github was going to raise one)?

@@ -52,7 +52,6 @@ if (cluster.isWorker) {
worker_emitDisconnect: [1, "the worker did not emit 'disconnect'"],
worker_emitExit: [1, "the worker did not emit 'exit'"],
worker_state: ['disconnected', 'the worker state is incorrect'],
worker_suicideMode: [false, 'the worker.suicide flag is incorrect'],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to confirm, this mode was only entered when using the deprecated API correct?

Copy link
Contributor

@MylesBorins MylesBorins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with small question

Definitely should run this through citgm to check breakages.

Also, it might be worth doing some introspection into GitHub project to see how often this api is used.... I could do some BigQuery analysis

@@ -7,12 +7,12 @@ const cluster = require('cluster');
const worker = new cluster.Worker();

assert.strictEqual(worker.exitedAfterDisconnect, undefined);
assert.strictEqual(worker.suicide, undefined);
assert.strictEqual(worker.exitedAfterDisconnect, undefined);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this test just be removed? AIUI from 3f61521 this is just to test that the deprecated properties work properly, and we're removing them.

cc/ @Trott

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gibfahn Yes, I agree, it seems like this test file can be removed rather than modified.

@Trott
Copy link
Member

Trott commented Jun 14, 2017

@Trott If I understand you correctly you're replying to @sam-github's general point about removing APIs, not expressing a particular issue with this PR.

@gibfahn Correct, sorry about the confusing wording. I'm A-OK with this change pending a CITGM run and maybe a few other checks to estimate ecosystem impact.

@sam-github
Copy link
Contributor

I think/hope @sam-github was going to raise one

No, I thought it was the intention that deprecation was a step on the road to deletion, but apparently that's not the case.

@MylesBorins
Copy link
Contributor

@sam-github @Trott can you please move conversation about deprecation to a meta issue.

Thanks!

@gibfahn
Copy link
Member

gibfahn commented Jun 14, 2017

Just noticed that exitedAfterDisconnect doesn't seem to have been backported to v4.x. Is there a reason for that? It'd be nice to be able to say "just do s/worker.suicide/worker.exitedAfterDisconnect/ and it'll work everywhere".

@nodejs/lts is there a chance that we could backport that? We've got a couple of fixes that we're planning to backport to 4.x anyway, and it's in maintenance till March 2018 .

@MylesBorins
Copy link
Contributor

we could backport to v4.x if it was done as a patch

@sam-github
Copy link
Contributor

I'm having trouble following the history.

@evanlucas said he would backport, #3743 (comment), and he did, #7998, and it got closed, but I can't find any evidence of it in v4.x-staging or v4.x. Maybe its the remnants of my cold? What's up?

@gibfahn
Copy link
Member

gibfahn commented Jun 14, 2017

we could backport to v4.x if it was done as a patch

#7998 was a semver-minor, so I don't think that's going to work.

@MylesBorins
Copy link
Contributor

@gibfahn want to open an issue over on Lts and we can discuss?

@gibfahn
Copy link
Member

gibfahn commented Jun 14, 2017

@jasnell
Copy link
Member

jasnell commented Jun 15, 2017

FYI... I am currently in the process of preparing an alternate semver-major PR targeted for 9.x, and a separate PR that should be less disruptive for 8.x that we ought to be able to backport to 6.x. There is a separate conversation on going about whether to backport the exitedAfterDisconnect API to 4.x LTS.

@mikeal
Copy link
Contributor

mikeal commented Jun 15, 2017

Maybe I'm missing something, so let me know if this was already covered or suggested.

  • This API was soft deprecated in version 6.
  • At some point the goal should be to remove it entirely, being that it is deprecated, if we can do so without breaking much. If we don't at least try to remove deprecated APIs we will continue to grow cruft.
  • Tools like CITGM won't give us a great picture on how much this breaks because this API is used far more by applications than by modules. We won't have a very good idea of the scope of the break until we put it in people's hands.
  • We have an odd release coming up in October, that's a good time to test out a harder deprecation, either by printing a large warning for a cycle or just remove it entirely and see what kind of feedback we get.

@jasnell
Copy link
Member

jasnell commented Jun 15, 2017

@mikeal .. A runtime deprecation was added in 6.0.0, not a docs-only deprecation

@mikeal
Copy link
Contributor

mikeal commented Jun 15, 2017

@jasnell ah, thanks for the clarification.

@refack
Copy link
Contributor

refack commented Jun 15, 2017

Since this thread is locked GitHub doesn't cross reference PRs
An alternative removal implementation #13702
A complementary tweak to the deprecation #13703 #13704

@isaacs
Copy link
Contributor Author

isaacs commented Jun 15, 2017

Any investigation into the scope of breakage that would be caused by this should limit to programs that currently pass their tests on node 8 without any code changes. If they're broken already, it doesn't matter if they also have to change this, as long as the change is made as soon as possible. If we delay, then there's a chance that they'll upgrade, and not know to upgrade this (though that is unlikely, given the preexisting deprecation.)

@ChALkeR
Copy link
Member

ChALkeR commented Jun 15, 2017

@isaacs Just a note (as I am feeling that you might have misunderstood me): I understand your concern and I am not blindly opposed to making changes there or even bypassing the deprecation procedure if that would be the least bad thing.

I am opposed to not having a discussion here and not doing the estimations, and to rushing this.
I believe that that done previosly is exactly what caused the logs situation which you were talking about.
I will support this PR if the estimations will show that that is the best possible choice of action.

If you truly wish the best for everyone out there, you shouldn't be opposed to factorizing the concern and investigating which way would be the least harmful.

@isaacs
Copy link
Contributor Author

isaacs commented Jun 15, 2017

I am opposed to not having a discussion here and not doing the estimations, and to rushing this.

I agree in principle, but consider what you're suggesting. What is the chance that you will find a scope of program breakage anywhere near the harm of triggering surprise PTSD flashbacks in our users?

I believe that that done previosly is exactly what caused the logs situation which you were talking about.

I don't understand how that's possible. They won't get "is not a function" errors, because it never was. Unless a userland module is doing something very peculiar, it seems unlikely that this will result in a triggering log spam. Can you demonstrate how that would work? "Making the existing problem worse" is the only reasonable argument against this imo, but I am very skeptical.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 15, 2017

What is the chance that you will find a scope of program breakage anywhere near the harm of triggering surprise PTSD flashbacks in our users?

Again — that's not what I am suggesting. We should take all the consequences into the account that we could, of course, but that's not even the main point.

You are implying that this change will reduce the chance of triggering those flashbacks, but I am afraid that the opposite could be true. I am not even sure that landing this PR would have prevented those logs that you are talking about. See below, though.

They won't get "is not a function" errors, because it never was. Unless a userland module is doing something very peculiar, it seems unlikely that this will result in a triggering log spam.

Would it be possible to get those logs to better understand what exactly caused them?

@bmeck
Copy link
Member

bmeck commented Jun 15, 2017

Even though I am normally +1 for backwards compat/safety. This has encountered a deprecation cycle/warnings and has harm by staying. I would want a serious argument for why keeping deprecated things with better replacement APIs should stay supported if support could cause harm to users. With any usage data I would like to see ones that would continue to function silently in this scenario.

@jasnell
Copy link
Member

jasnell commented Jun 15, 2017

Just as a heads up, #13702 accomplishes the same goal as this, but across separate commits and with the documentation nits addressed. Please take a look at that as an alternative. #13704 proposes a backwards compat backport for 8.x.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 15, 2017

@isaacs To elaborate: the runtime deprecation caused this: Unitech/pm2#2780, they have that as an open issue at this moment — users are seeing that deprecation warning in their logs and have no idea why (the issue is not resolved yet).

And the fact that that is being actively hit means that by removing that we will not only break things, but force some percentage of people to dig into the code that calls that. Which would expose them more.

@isaacs
Copy link
Contributor Author

isaacs commented Jun 15, 2017

@ChALkeR If anything, it shows that noisy deprecation here is almost certainly worse than outright removal. Since PM2 doesn't use that field anywhere, it's really weird that it's showing up at all. I've asked some users in that thread to turn on trace-deprecation. Probably we can fix this at the root.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 15, 2017

@isaacs PM2 1.x, 2.0.x, 2.1.0 and 2.1.1 use it at this line: https://github.com/Unitech/pm2/blob/v2.0.18/lib/God/ActionMethods.js#L303

@jasnell
Copy link
Member

jasnell commented Jun 15, 2017

@ChALkeR ... do you know if a PR has been opened in PM2 to remove/replace usage of the deprecated API?

@ChALkeR
Copy link
Member

ChALkeR commented Jun 15, 2017

@jasnell It is fixed in the current version for several releases (2.1.2 and later), but it is strange, as people are still getting the warnings from somewhere. I can't locate that in the code. Could be some PM2-related module or local configuration, or incomplete update.

@MylesBorins
Copy link
Contributor

@ChALkeR I believe it is ending up in logs when doing introspection on the worker object. Making the property unenumerable would stop that from happening.

@mcollina
Copy link
Member

cc @Unitech - Alexandre can you have a look?

@jasnell
Copy link
Member

jasnell commented Jun 15, 2017

#13704 makes the property non-enumerable and can be backported to the 6.x, and 7.x release lines.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 15, 2017

@MylesBorins introspection could surely make it appear in the logs, but don't think that could have caused any of the specific mentioned log issues (is not a function and the deprecation messages with pm2 ones). Making it inenumerable should be fine, I can't imagine how that could break things.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 15, 2017

@isaacs Thanks for taking a look at that pm2 issue.

Re: removal — the problem that I see here is that deprecation messages being fired (and people are still reporting it being fired) means that the code path is actually being used, meaning that once that is removed completely, that code path is going to be silently broken.

And one of the implications of silently breaking it would be that users/developer would probably have to spend some visible amount of people-hours in total, trying to nail it down, getting more emotionally tied to that than to regular logs, and during that they will be closer dealing with (or suddenly discovering) the exact same api/wording the intention was to make less visible. Because of that, I am not convinced that removing will make the flashbacks less harmful ( = frequent * severe).

@jasnell
Copy link
Member

jasnell commented Jun 15, 2017

At this point, removal in 9.x is the right thing to do. For users who may be impacted by that, a trivial mitigation is possible by monkey-patching the cluster API. A small module could easily be published to npm to accomplish that goal. The API was runtime deprecated in 6.x and, per the deprecation policy, is ready to reach the end-of-life stage.

My concern is only that this specific PR is not the way to accomplish it and I would recommend closing this PR in favor of #13702 and #13704.

@Trott
Copy link
Member

Trott commented Jun 15, 2017

how many people do see this string in the code per day, and how much likely are they to harm themselves because of that.

@ChALkeR As far as you can tell, is there anyone in this conversation with insights that would make their estimates of these numbers significantly better than your own estimates?

If you believe these should be measured/estimated to the extent possible and there's no reason to think that your estimates are going to be significantly inferior to anyone else's, might you be able to create your own estimates and explanations? If others do have insights into ways premises can be improved, etc., they might more readily respond to a concrete example.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 20, 2017

@Trott Based on preliminary estimations, I believe that this is an order of magnitude of 10-100 users getting those flashbacks per month (where flashback is being defined as «noticing and reminded of their previous attempt», not something more severe). I have not completed the estimations, but those are the numbers that I expect to get based on what I have now. I don't have the knowledge to estimate how bad those are, what portion of those are severe, and how likely are they to harm people.

That said, I do not know how would that be changed once we remove that property.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 20, 2017

Preliminary data and assumptions used, if anyone wants to check my result:

  • ~6% attempted rate, ~1% fatal. Let's use 5%.
  • pm2 currently has about 620 000 downloads per month
  • number of total downloads per month and the number of users per month relate as about 1:2900
  • assuming that pm2 is an average package (which it looks like to be), one pm2 download resulting in 146 actual package downloads, gives us about 31200 humans downloading pm2 per month (npm staff should have better numbers, but I expect something near that).
  • assuming that about 50% of those warnings come from pm2 using that flag in earlier versions, as that's the only very popular library that was doing that. The remaining 50% I reserve to be scattered against other libraries and proprietary solutions.
  • about 20-30% of the users using Node.js versions where that warning is enabled (note: the warning is not the only possible source, as @isaacs pointed out). Btw, I think that @isaacs could have better information on this one, too. Note: this is correlated with the number of people using new versions that don't print warnings.
  • not everyone inspecting the logs, noticing that, or being reminded of their previous attempt once noticing.

@isaacs
Copy link
Contributor Author

isaacs commented Jun 20, 2017

If immediate removal is not on the table, then I'd recommend removing the deprecation warning to stderr and making the property non-enumerable, removing from documentation (or hiding by default with a warning), and backporting these changes as far back as possible.

The deprecation in 6.x made the problem worse, as @ChALkeR has described above.

@seishun
Copy link
Contributor

seishun commented Jun 20, 2017

Sure, removing the runtime deprecation will prevent people from seeing the word in their console outputs or logs. But it won't help people seeing it in their codebase, or when debugging dependencies, or when looking at Node.js's code. I think eventual removal is a better solution in the long term.

@jasnell
Copy link
Member

jasnell commented Jun 20, 2017

#13702 and #13704 have landed. Closing this.

@jasnell jasnell closed this Jun 20, 2017
@isaacs
Copy link
Contributor Author

isaacs commented Jun 21, 2017

A bit late, since this is already landed, but just to close the loop on the unanswered question.

The log spam was a result of a module used by PM2 that walks an object's members mutating functions. Since it has to check for each member's functionness, it was calling typeof obj[key] and triggering this deprecation message.

So, removing this flag will not break PM2. They don't use it for anything.

I still maintain that the best course of action is full immediate removal, but #13704 and #13702 do make the situation a lot better. Thank you for taking this seriously.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cluster Issues and PRs related to the cluster subsystem. semver-major PRs that contain breaking changes and should be released in the next major version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.