core: refactor gather-runner #8964

brendankenny · 2019-05-15T18:13:52Z

This is a refactor of gather-runner.js to

more clearly say what it's doing. The big outline at the top shouldn't be necessary; you should be able to just read run() (and now runPass()) to get an overview. GatherRunner.beforePass, pass, and afterPass now only call those methods on gatherers and it's a lot easier to see where things like recording a trace starts and stops.
make it easier to handle error cases. The code should be able to express what we're trying to do: at certain points (as the page is loading, after the page is done, between passes, etc) we need to check for errors, and then depending on the type of error take one of several actions (continue (allowing what gatherers can run to run), return error artifacts, don't run any more passes, cancel the entire LH run, etc). I've tried to add natural spots for these checks and more explicit error handling rather than trying to cram more things into the array-of-resolved-or-rejected-beforepass-pass-afterpass-promises model :) I believe core(gather-runner): treat NO_FCP as a pageLoadError #8340 and core: bail on gathering if we have a failure in the first pass #8866 (and future error handling beyond those) should integrate well with these changes.

Driver commands and input/output should be exactly the same except:

we go back to loading about:blank twice on the first pass
some log.time groups have different Driver functions in them so the timing might change slightly
The first is not a big deal since after core(driver): wait for Page.frameNavigated for about:blank #6446 the second about:blank load is like 10ms, and the timing changes seem unavoidable since it's an explicit externalizing of internal implementation details.

This is a big PR touching a lot of stuff, so if reviewers prefer this PR can just show the motivating goal and be closed and I can split into a couple of PRs.

brendankenny · 2019-05-15T18:24:32Z

lighthouse-core/gather/gather-runner.js

+    if (loadData.trace) baseArtifacts.traces[passConfig.passName] = loadData.trace;
+
+    // If there were any load errors, treat all gatherers as if they errored.
+    const pageLoadError = GatherRunner.getPageLoadError(passContext, loadData);


there are a few ways #8340 could be incorporated here. One would be have GatherRunner.loadPage pass back non-fatal errors as it does in that PR and pass that error into GatherRunner.getPageLoadError here, and have the code in getPageLoadError decide if the error is severe enough, which error wins if there are multiple ones, etc

(where exactly this call should go is another question that we need to figure out in terms of what's reasonable to pass back in an errored run...e.g. it doesn't make much sense to save a trace of a security interstitial)

seems reasonable to me

brendankenny · 2019-05-15T18:26:13Z

lighthouse-core/gather/gather-runner.js

-        await GatherRunner.beforePass(passContext, gathererResults);
-        await GatherRunner.pass(passContext, gathererResults);
+        const passResults = await GatherRunner.runPass(passContext);
+        Object.assign(artifacts, passResults.artifacts);


#8866 could be done here. GatherRunner.runPass returns the artifacts from the pass but also any errors encountered loading the page in that pass. If we want run() to break at this point (or after the isFirstPass check below), it's easy to do

patrickhulce

oh whoops I thought I flushed these already my b

patrickhulce · 2019-05-15T18:55:58Z

lighthouse-core/gather/gather-runner.js

 /**
 * Class that drives browser to load the page and runs gatherer lifecycle hooks.
- * Execution sequence when GatherRunner.run() is called:


patrickhulce · 2019-05-15T18:56:40Z

lighthouse-core/gather/gather-runner.js

   * @param {Driver} driver
+   * @param {{requestedUrl: string, settings: LH.Config.Settings}} options


do we actually need requestedUrl?

do we actually need requestedUrl?

I was going to say no but apparently yes, for the driver.clearDataForOrigin call :)

patrickhulce · 2019-05-15T18:56:56Z

lighthouse-core/gather/gather-runner.js

+   * @param {LH.Gatherer.PassContext} passContext
+   * @param {LH.Gatherer.LoadData} loadData
+   */
+  static getPageLoadError(passContext, loadData) {


👍 I like this distinction

👍 I like this distinction

Yeah, same.

pageLoadError is also a natural name for a future error passed back from GatherRunner.loadPage and this pageLoadError could be something like passError, but I feel like we've all gotten kind of used to calling the pass-level error pageLoadError, so I'm not sure what we should do there (if anything)

patrickhulce · 2019-05-15T18:58:59Z

lighthouse-core/gather/gather-runner.js

-    // TODO(bckenny): correct Partial<LH.GathererArtifacts> at this point to drop cast.
-    return /** @type {LH.Artifacts} */ ({...baseArtifacts, ...gathererArtifacts});
+    return {
+      artifacts: gathererArtifacts,


feels a bit weird that this isnt just returing the partial artifacts then...

oh I see it's so later on it matches the signature of the other method :)

feels a bit weird that this isnt just returing the partial artifacts then...
oh I see it's so later on it matches the signature of the other method :)

(I know you're already well aware of all this, @patrickhulce, but for anyone else reading :)

it's not my favorite thing ever, but we really have three finished-pass states

regular artifacts returned

fatal exception thrown (e.g. javascript syntax error, LH logic error)

pageLoadError, when the page load failed but LH is doing fine (bad URL, bad cert, etc)

The first one is handled by the existing returned artifacts interface, the second is handled by normal exceptions, but the third we don't want to handle with exceptions because it's not really an exceptional state for Lighthouse.

That's been fine before because we can just handle the pageLoadError within a pass by erroring out all the artifacts from that pass, but now we want the error information higher up, so we need to pass back artifacts and the pageLoadError. There are other ways of doing this, but really we're just talking about a multi-value return, so let's just do a multi-value return :)

patrickhulce · 2019-05-15T19:00:19Z

lighthouse-core/gather/gather-runner.js

          url: options.requestedUrl,
          settings: options.settings,
          passConfig,
          baseArtifacts,
-          // *pass() functions and gatherers can push to this warnings array.


can they not anymore?

can they not anymore?

no change, they can, I just put the comments on the PassContext type rather than inline here since it's a little distracting from the run() logic and it's more helpful to gatherer authors than folks in this file (it's kind of helpful in this file, so I also added the /** @type {LH.Gatherer.PassContext} */ to the object literal so anyone using intellisense will get the comment on hover as well)

patrickhulce · 2019-05-15T19:01:33Z

lighthouse-core/gather/gather-runner.js

          LighthouseRunWarnings: baseArtifacts.LighthouseRunWarnings,
        };
-
-        await driver.setThrottling(options.settings, passConfig);
-        if (!isFirstPass) {


can we just keep this on pass context so we don't lose this? it's such a small concession to keep our 300ms savings :)

can we just keep this on pass context so we don't lose this? it's such a small concession to keep our 300ms savings :)

so we caaaaan, and I admit I deleted it only because it was ugly :) but because we only wait for Page.frameNavigated after #6446, I consistently get ~10ms for this second load of about:blank in the Timing artifact.

Oh I forgot I fixed that already lol, yeah that's fine then carryon :)

patrickhulce · 2019-05-15T19:03:52Z

lighthouse-core/gather/gather-runner.js

+    if (loadData.trace) baseArtifacts.traces[passConfig.passName] = loadData.trace;
+
+    // If there were any load errors, treat all gatherers as if they errored.
+    const pageLoadError = GatherRunner.getPageLoadError(passContext, loadData);


seems reasonable to me

patrickhulce

seems like an improvement to me!

I've got 2 gather-runner PRs open, so let's make like a lane closure and merge this fool 🚥 🚗

LGTM :)

paulirish · 2019-05-20T23:43:45Z

lighthouse-core/gather/gather-runner.js

      baseArtifacts.BenchmarkIndex = await options.driver.getBenchmarkIndex();
+
      await GatherRunner.setupDriver(driver, options);


I'm comparing the timing trace of before and after this PR and there's now a good amount of time that isn't instrumented:

it's after loadBlank and before gather꞉loadPage-defaultPass (though gather꞉beforePass is in between)

the rest of the timings look good tho

there's now a good amount of time that isn't instrumented

Yes, though to be fair, if a timing is dropped and no one is observing it anywhere, does it really matter? :P

I'll try to get it covered and avoid that in this PR, though

paulirish · 2019-05-20T23:51:02Z

lighthouse-core/gather/gather-runner.js

      // In the devtools/extension case, we can't still be on the site while trying to clear state
      // So we first navigate to about:blank, then apply our emulation & setup
      await GatherRunner.loadBlank(driver);
+
+      const baseArtifacts = await GatherRunner.getBaseArtifacts(options);


rename method now that we have populateBaseArtifacts?

initializeBaseArtifacts?

brendankenny requested review from patrickhulce and paulirish as code owners May 15, 2019 18:13

brendankenny commented May 15, 2019

View reviewed changes

patrickhulce reviewed May 15, 2019

View reviewed changes

patrickhulce approved these changes May 20, 2019

View reviewed changes

patrickhulce mentioned this pull request May 20, 2019

core(gather-runner): treat NO_FCP as a pageLoadError #8340

Merged

paulirish reviewed May 21, 2019

View reviewed changes

vercel bot deployed to staging May 21, 2019 00:58 View deployment

brendankenny added 9 commits May 20, 2019 17:58

split run() and runPass()

eefff12

split beforePass and pass

91eb996

split afterPass

e661dfc

more

a518aa5

revert some stuff

0f6aa29

more

3a4a686

fix tests

255ea57

more

2a588d1

back by popular demand

99f1e15

brendankenny force-pushed the gr branch from e96eda6 to 99f1e15 Compare May 21, 2019 00:59

vercel bot deployed to staging May 21, 2019 00:59 View deployment

paulirish approved these changes May 21, 2019

View reviewed changes

brendankenny merged commit 709d2b5 into master May 21, 2019

brendankenny deleted the gr branch May 21, 2019 17:35

brendankenny mentioned this pull request May 25, 2019

misc(fix master): tmp revert of scroll position in #8625 #9059

Merged

paulirish pushed a commit that referenced this pull request May 28, 2019

core: refactor gather-runner control flow (#8964)

4ec7ea9

snyk-bot mentioned this pull request Mar 21, 2020

[Snyk] Upgrade lighthouse from 5.1.0 to 5.6.0 godaddy/lighthouse4u#13

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

core: refactor gather-runner #8964

core: refactor gather-runner #8964

brendankenny commented May 15, 2019

brendankenny May 15, 2019

patrickhulce May 15, 2019

brendankenny May 15, 2019 •

edited

Loading

patrickhulce left a comment

patrickhulce May 15, 2019

patrickhulce May 15, 2019

brendankenny May 16, 2019

patrickhulce May 15, 2019

brendankenny May 16, 2019

patrickhulce May 15, 2019

patrickhulce May 15, 2019

brendankenny May 16, 2019

patrickhulce May 15, 2019

brendankenny May 16, 2019

patrickhulce May 15, 2019

brendankenny May 16, 2019

patrickhulce May 16, 2019

patrickhulce May 15, 2019

patrickhulce left a comment

paulirish May 20, 2019

brendankenny May 21, 2019

paulirish May 20, 2019

		* @param {Driver} driver
		* @param {{requestedUrl: string, settings: LH.Config.Settings}} options

		baseArtifacts.BenchmarkIndex = await options.driver.getBenchmarkIndex();

		await GatherRunner.setupDriver(driver, options);

core: refactor gather-runner #8964

core: refactor gather-runner #8964

Conversation

brendankenny commented May 15, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brendankenny May 15, 2019 • edited Loading

Choose a reason for hiding this comment

patrickhulce left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

patrickhulce left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brendankenny May 15, 2019 •

edited

Loading