Handle worker crashes #1073

IanVS · 2018-01-12T14:48:42Z

This is an attempt to mitigate some of the problems being experienced in the referenced issues, whereby linter-eslint crashes/hangs with a high memory usage.

I was unable to determine the exact cause of the workers dying, but this PR does a few things to try to minimize the crashes and memory usage, and allows linter-eslint to recover from a crashed worker. In one of my projects, I was able to reliably reproduce the crashes, usually within a few seconds of opening a file. Using this branch, linter-eslint has become usable for me again, although after a crash it still takes a few seconds to recover and start working again, so I don't consider this a complete solution.

I think the main challenge we are facing is that the Task API from Atom is not intended for long-running tasks, which is what we need because starting up a lint job can take a second or two. Maybe we can find another, more reliable approach in the future, but hopefully this PR will at least make linter-eslint usable again for those folks currently stuck on the old version.

Arcanemagus

Overall I like this change, just a few cleanups I noticed.

Arcanemagus · 2018-01-16T00:16:31Z

src/main.js

+        resolve()
+      }
+      // Initialize the worker during an idle time
+      window.requestIdleCallback(initializeESLintWorker)


This should be added to the idleCallbacks Set (the call up on L185 should also be added). Otherwise we risk initializing a worker after the package has been deactivated.
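
A minimal sketch of the tracking suggested here, assuming a module-level `idleCallbacks` Set like the one in `src/main.js`. The `scheduleIdle` and `cancelAllIdle` helper names and the injectable `scheduler` parameter are hypothetical, added only so the snippet runs outside Atom; in the package the scheduler would be `window`.

```javascript
// Set of pending idle-callback IDs, so deactivate() can cancel them all.
const idleCallbacks = new Set()

// Hypothetical helper: schedule `fn` and remember its ID until it runs.
// Assumes the scheduler invokes the callback asynchronously, as
// window.requestIdleCallback does.
function scheduleIdle(scheduler, fn) {
  const id = scheduler.requestIdleCallback(() => {
    idleCallbacks.delete(id)
    fn()
  })
  idleCallbacks.add(id)
  return id
}

// Hypothetical helper: cancel everything still pending (e.g. on deactivate).
function cancelAllIdle(scheduler) {
  idleCallbacks.forEach(id => scheduler.cancelIdleCallback(id))
  idleCallbacks.clear()
}
```

With this shape, a call such as `scheduleIdle(window, initializeESLintWorker)` would be cancelled on deactivation along with every other pending callback.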

Arcanemagus · 2018-01-16T00:17:32Z

src/main.js

+        this.worker.terminate()
+        this.worker = null
+      }
+      const initializeESLintWorker = () => {


Since this code is in multiple places it should be moved to a separate function of its own.

Arcanemagus · 2018-01-16T00:23:44Z

src/main.js

@@ -323,6 +344,13 @@ module.exports = {
      await waitOnIdle()
    }

+    // Sometimes the worker dies and becomes disconnected


Since this is duplicated maybe this comment as well as the check should be moved into the called function, and it renamed to something like checkESLintWorker?

I went back and forth on it, and in the end decided against passing yet another argument to sendJob, but can do so if you'd prefer that.

I actually rather like the idea of just sticking this logic in sendJob, that way the logic for maintaining the worker stays there leaving the callers to just worry about how the job is used.

I'm not sure why moving the logic there would require an additional argument though?

Is there a way to kill and restart a worker simply from a reference to that worker? You can see that my restartESLintWorker() method sets this.worker, which is then passed to sendJob. So I assume we would need to send a reference to this.restartESLintWorker to sendJob as well. Or am I missing another approach that you have in mind?

I was thinking this bit:

if (this.worker && !this.worker.childProcess.connected) { await this.initializeWorker() }

As well as the code in initializeWorker could be moved into sendJob (or called from there).

Hm, I guess I'm not seeing a clear vision of what you have in mind. Do you want to take a crack at a commit on this PR?

I'm not sure if this is at all an improvement. A lot of side effects and misdirection, but I'll share the idea anyways. Maybe it will spark something.

    const advancedWorker = () => ({
      initialize: () => this.initializeEslintWorker().then(() => this.worker),
      restart: () => this.restartEslintWorker().then(() => this.worker),
      task: this.worker
    })

    async function sendJob(worker, config) {
      if (!worker.task || !worker.task.childProcess.connected) {
        worker.task = await worker.initialize()
      }
      // ....
    }
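
One possible shape for keeping the liveness check next to the job-sending code, as discussed in this thread. This is only a sketch: `ensureWorker` and the `createWorker` factory are hypothetical names, and it assumes the worker exposes `childProcess.connected` and `terminate()` as the snippets in this PR do.

```javascript
// Holds the current worker; null until the first job needs one.
let worker = null

// Hypothetical helper: return a connected worker, replacing a dead one.
// `createWorker` is an assumed factory (in linter-eslint it would wrap the
// creation of the Atom Task); it may return a worker or a promise of one.
async function ensureWorker(createWorker) {
  if (worker && !worker.childProcess.connected) {
    // Sometimes the worker dies and becomes disconnected.
    worker.terminate()
    worker = null
  }
  if (!worker) {
    worker = await createWorker()
  }
  return worker
}
```

A `sendJob` built on this could call `await ensureWorker(...)` first, so callers never need to pass a restart function around.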

Arcanemagus · 2018-01-16T00:26:26Z

src/worker.js

+    // We catch all worker errors so that we can create a separate error emitter
+    // for each emitKey, rather than adding multiple listeners for `task:error`
+    try {
+      const {


Minor nit, but I'd leave this out of the try/catch and just use emitKey in the catch block.
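
A sketch of what this nit amounts to, under stated assumptions: `handleJob`, `lint`, and `emit` are hypothetical stand-ins for the worker's real handler, linting call, and event emitter, and the `workerError:` prefix is an assumed naming scheme rather than the confirmed one.

```javascript
// Hypothetical job handler for illustration only.
async function handleJob(jobConfig, lint, emit) {
  // Destructure before the try block so emitKey is in scope in catch.
  const { emitKey, ...options } = jobConfig
  try {
    const response = await lint(options)
    emit(emitKey, response)
  } catch (error) {
    // Report on a per-job error channel instead of a shared `task:error`.
    emit(`workerError:${emitKey}`, { msg: error.message, stack: error.stack })
  }
}
```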

skylize · 2018-01-16T01:04:02Z

src/helpers.js

@@ -43,12 +43,17 @@ export async function sendJob(worker, config) {
  config.emitKey = cryptoRandomString(10)

  return new Promise((resolve, reject) => {


Why do we have a promise that can never resolve?

Nevermind, resolve was hidden in folded code. 🙈

That's what the reject on L57, and the resolve on L62 are for?

Arcanemagus · 2018-01-16T07:41:22Z

src/main.js

    idleCallbacks.forEach(callbackID => window.cancelIdleCallback(callbackID))
    idleCallbacks.clear()
+    helpers.killWorker()


Note that this comes after the cancellation of the idle callbacks, so that the idle callback that requests a worker start cannot fire after the worker has already been killed.

skylize · 2018-01-16T11:34:22Z

src/helpers.js

@@ -83,7 +103,7 @@ function validatePoint(textBuffer, line, col) {
  }
 }

-export async function getDebugInfo(worker) {
+export async function getDebugInfo() {


I like this a lot. Gets a ton of the worker logic and state management of the worker out of the main object, greatly simplifying things. Definitely going the right direction IMHO.

Arcanemagus

Approving this as is, the refactor work I was thinking of turned out to be large enough that it should be a separate PR.

Shitty commit title, but this does a few things:
- Breaks out the worker initialization into a separate method
- Adds the idle callback for the worker initialization to `idleCallbacks`
- Does away with `restartEslintWorker` and just adds the safety check of terminating an existing `this.worker` within `initializeWorker`
- Destructures `jobConfig` outside of the try/catch, to use within the error handler.

IanVS requested a review from Arcanemagus January 12, 2018 14:48

Arcanemagus suggested changes Jan 16, 2018

Arcanemagus added the bug label Jan 16, 2018

skylize reviewed Jan 16, 2018

Arcanemagus reviewed Jan 16, 2018

skylize reviewed Jan 16, 2018

Arcanemagus force-pushed the handle-worker-crashes branch from 9491546 to 6850987 Compare January 16, 2018 19:29

Arcanemagus approved these changes Jan 16, 2018

IanVS added 4 commits January 16, 2018 14:45

Dispose subscriptions when errors occur

f53ebe3

This helps to avoid an explosion in the number of listeners once errors start to happen.
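
A rough sketch of the disposal pattern this commit describes, not the exact code from the PR. It assumes `worker.on(name, cb)` returns an object with `dispose()` (as Atom's `Task#on` does) and that `worker.send(config)` delivers the job; `sendJobSketch` and the `workerError:` event prefix are names chosen for illustration.

```javascript
// One response listener and one error listener per job, both disposed as
// soon as either fires, so listeners do not pile up after failures.
function sendJobSketch(worker, config) {
  return new Promise((resolve, reject) => {
    const subs = []
    const disposeAll = () => subs.forEach(sub => sub.dispose())

    subs.push(worker.on(config.emitKey, (data) => {
      disposeAll()
      resolve(data)
    }))

    // Assumed per-job error channel, keyed by the same emitKey.
    subs.push(worker.on(`workerError:${config.emitKey}`, ({ msg, stack }) => {
      disposeAll()
      const error = new Error(msg)
      error.stack = stack
      reject(error)
    }))

    worker.send(config)
  })
}
```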

Avoid multiple task:error listeners

c52c7a6

Restart ESLint listener if necessary

8ecf95a

There are some times when the worker dies completely and becomes disconnected. When that happens, we cannot send another job until we terminate the failed worker and start up a new one.

IanVS force-pushed the handle-worker-crashes branch from 4941417 to b4b26a1 Compare January 16, 2018 19:45

IanVS merged commit 64f4c80 into master Jan 17, 2018

IanVS deleted the handle-worker-crashes branch January 17, 2018 11:12

OscarBarrett mentioned this pull request Jan 24, 2018

Linter crashes after adding too many event listeners, channel closed. (8.2.1) #927

Closed
