Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(lifecycle): allow gathering & auditing to run separately #3743

Merged
merged 20 commits into from
Jan 5, 2018
Merged

Conversation

paulirish
Copy link
Member

@paulirish paulirish commented Nov 3, 2017

fixes #1806

The mystical -GAR feature. Actually its just -GA for now.

Here's how this works:

lighthouse -G http://example.com
# launches browser, collects data, saves to disk (in `./latest-run/`) and quits

lighthouse -A http://example.com
# skips browser interaction, loads artifacts from disk (in `./latest-run/`), runs audits on them, generates report

lighthouse -GA http://example.com
# Normal gather + audit run, but also saves artifacts to disk
  • --gather-mode and --audit-mode are the long versions of -G and -A.
  • config.auditResults is removed. it was weird.
  • assetSaver got a new impl of saveArtifacts, and added loadArtifacts
  • --save-artifacts CLI flag was removed, as it was never really used and makes less sense in -G world.

Todo:
  • Decide if we want to rename these, as "gather" hasn't been really user-visible before.
Future work:
  • Add -G=./artifacts-save-path/
  • Add -A=./artifacts-load-path/

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice this is awesome!! been soooooo excited for this 🎉 🕺 🕺 🕺 🎉

@@ -318,10 +316,6 @@ class Config {
this._configDir = configPath ? path.dirname(configPath) : undefined;

this._passes = configJSON.passes || null;
this._auditResults = configJSON.auditResults || null;
if (this._auditResults && !Array.isArray(this._auditResults)) {
throw new Error('config.auditResults must be an array');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is anyone using this should we wait for a major version? I'm fine nuking, but seems like it could still be supported

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally it would stick around, at least until 3.0. auditResults can skip everything and go right to the scoring (basically -R)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(it's also helpful for tests where you just want to test the -R part)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to just kill auditResults. Turns out we dont really have tests using it (except one).

auditResults is this weird inbetween value that's only different from the LHR because of these few lines: https://github.com/GoogleChrome/lighthouse/blob/gar/lighthouse-core/runner.js#L128-L155

if we REALLY want to support -R then we'd definitely not put auditResults on the config instance because it's bizarre. (obv that would break backcompat (for those mystery users)). if we want to do this i'd prefer to do it in a followup

artifacts.traces = {};

const filenames = fs.readdirSync(basePath);
const promises = filenames.filter(filename => filename.endsWith('-trace.json')).map(filename => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we do both traces and devtoolsLogs this way? It'd be nice to be consistent with how the assets are saved to disk in non-GAR mode but also less useful since there's nothing you can do with just the devtoolslog

nit: we also liked .trace.json/.devtoolslog.json if we want to stick with it here too :)

log.log('artifacts file saved to disk', fullPath);
function saveArtifacts(artifacts, basePath) {
assert.notEqual(basePath, '/');
rimraf.sync(basePath);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we actually know about all the files we create, can we just delete those instead of the whole directory?

const shouldGatherAndQuit = opts.flags.gatherMode && !opts.flags.auditMode;
const shouldOnlyAudit = opts.flags.auditMode && !opts.flags.gatherMode;
const shouldDefaultRunButSaveArtifacts = opts.flags.auditMode && opts.flags.gatherMode;
const shouldDoTypicalRun = !opts.flags.gatherMode && !opts.flags.auditMode;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we pick either Typical or Default :)


return artifacts;
if (shouldLoadArtifactsFromDisk) {
config.removePasses();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like this should be unnecessary given we've exploded the various should* branches into variables above?

const fs = require('fs');
const path = require('path');
const URL = require('./lib/url-shim');
const Sentry = require('./lib/sentry');

const basePath = path.join(process.cwd(), 'latest-run');

class Runner {
static run(connection, opts) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method is a bit big for its 👖 these days, it seems like it can roughly be broken down into

  • loadOrGatherArtifacts
  • potentiallySaveArtifacts
  • potentiallyComputeAudits
  • combine results

do you think it'd be easier to handle the config.gatherMode type flags individually in each of these 4 hypothetical methods rather than exploding the matrix of choices? the plain english variable names of "this is what is happening" is kinda nice right now, but its quite a long function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on no real need to make a variable for each possible state. Rather than just four, can't we just have

  • loadArtifacts
  • gatherArtifacts
  • saveArtifacts
  • runAudits
  • score
  • createLHR
    and just have run() run the ones that are needed for each flag?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sg. say hello to

_loadArtifactsFromDisk
_gatherArtifactsFromBrowser
_saveArtifacts
_runAudits
_scoreAndCategorize

@@ -87,7 +87,7 @@ describe('Module Tests', function() {
});
});

it('should return formatted audit results when given no categories', function() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh wow you don't even get any category results? yeah let's nuke this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original test was added to ensure that you'd still get json output even if there were no categories in the config (e.g. you don't care about categories). Is that still possible?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a new test to assert things finish even if no categories in the config

});
// Entering: Gather phase
if (!config.passes && !config.artifacts && !shouldLoadArtifactsFromDisk) {
const err = new Error('You must require either gather passes or provide saved artifacts.');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"You must require" seems a bit odd, "You must provide either ..." or "The config requires ... " perhaps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about

No browser artifacts are either provided or requested.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sg, even further?

No browser artifacts provided or requested.

assert.equal(artifacts.networkRecords['firstPass'], undefined);
assert.equal(artifacts.networkRecords['secondPass'], undefined);
assert.equal(artifacts.networkRecords, undefined);
assert.equal(artifacts.networkRecords, undefined);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove one of these

const computedArtifacts = Runner.instantiateComputedArtifacts();

/* eslint-env mocha */

describe('Runner', () => {
const saveArtifactsSpy = sinon.spy(assetSaver, 'saveArtifacts');
const loadArtifactsSpy = sinon.spy(assetSaver, 'loadArtifacts');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh these are straight spies, I thought you were stubbing

I'd have a slight preference for stubbing loadArtifacts and asserting the LHR looks like we'd expect it to than 100% relying on 'was called' assertions

Copy link
Member Author

@paulirish paulirish Nov 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we'll have to discuss this in person, as all runner tests do actually run gatherrunner.run, runAudits, generateReport, etc. So only going with stubs for these seems inconsistent.
Also, we'll have to mimic what these methods do on the test side if we want to see that everything works (for example: how do we test that lighthouse -G completes successfully unless it can read the files off disk?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm all for having at least some integration testing of this 👍

Maybe we'll chat in person but seems like there's room for both

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GAAAAAAAAA(r)!

@@ -87,7 +87,7 @@ describe('Module Tests', function() {
});
});

it('should return formatted audit results when given no categories', function() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original test was added to ensure that you'd still get json output even if there were no categories in the config (e.g. you don't care about categories). Is that still possible?

@@ -66,6 +69,9 @@ export function getFlags(manualArgv?: string) {
'disable-device-emulation': 'Disable Nexus 5X emulation',
'disable-cpu-throttling': 'Disable CPU throttling',
'disable-network-throttling': 'Disable network throttling',
'gather-mode':
'Collect artifacts from a connected browser, save, & quit. However, if audit-mode is also enabled, then the run will complete after saving artifacts to disk.',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure how to parse the second part of this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rewrote it


await saveResults(results, artifacts!, flags);
await launchedChrome.kill();
if (shouldSaveResults){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be combined with the logic of whether or not to save already in saveResults?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


return results;
} catch (err) {
return handleError(err);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this work? handleError calls process.exit()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair. changed it.

@@ -318,10 +316,6 @@ class Config {
this._configDir = configPath ? path.dirname(configPath) : undefined;

this._passes = configJSON.passes || null;
this._auditResults = configJSON.auditResults || null;
if (this._auditResults && !Array.isArray(this._auditResults)) {
throw new Error('config.auditResults must be an array');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally it would stick around, at least until 3.0. auditResults can skip everything and go right to the scoring (basically -R)

// do everything else
delete artifacts.traces;
// The networkRecords artifacts have circular references
fs.writeFileSync(`${basePath}/artifacts.json`, stringifySafe(artifacts), 'utf8');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stringifySafe will be broken on re-import if we actually have circular references, so should just use JSON.stringify (and hopefully we've done things correctly :)


const p = Promise.all(savePromies).then(_ => {
// do everything else
delete artifacts.traces;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't delete traces on artifacts, maybe make a copy of the object?

const fs = require('fs');
const path = require('path');
const URL = require('./lib/url-shim');
const Sentry = require('./lib/sentry');

const basePath = path.join(process.cwd(), 'latest-run');

class Runner {
static run(connection, opts) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed on no real need to make a variable for each possible state. Rather than just four, can't we just have

  • loadArtifacts
  • gatherArtifacts
  • saveArtifacts
  • runAudits
  • score
  • createLHR
    and just have run() run the ones that are needed for each flag?

}],
audits: [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this test audit output without running a gatherer or audit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does run a gatherer and an audit. :o

@@ -318,10 +316,6 @@ class Config {
this._configDir = configPath ? path.dirname(configPath) : undefined;

this._passes = configJSON.passes || null;
this._auditResults = configJSON.auditResults || null;
if (this._auditResults && !Array.isArray(this._auditResults)) {
throw new Error('config.auditResults must be an array');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(it's also helpful for tests where you just want to test the -R part)

@wardpeet
Copy link
Collaborator

wardpeet commented Nov 4, 2017

will there be docs about this? This feature is really great if you run it in a cloud environment!

log.log('artifacts file saved to disk', fullPath);
function saveArtifacts(artifacts, basePath) {
if (!fs.existsSync(basePath)) {
fs.mkdirSync(basePath);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should use mkdirp.sync here to make sure we can create dirs recursively (we should already have it as a dependency)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sg. done

if (!fs.existsSync(basePath)) {
fs.mkdirSync(basePath);
}
rimraf.sync(`${basePath}/*${traceSuffix}`);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe chrome launcher had a lot of issues on windows when using sync rimraf. Should we move to async?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rimraf's bumps since then were around this issue so i am hoping we're good now. https://github.com/isaacs/rimraf/commits/master

@paulirish
Copy link
Member Author

paulirish commented Nov 18, 2017

Okay folks.. A fresh update here.

The big changes to runner are in this commit: a3da663
There's no way to make that diff super straightforward, but I hope it's a bit easier to read now.

Otherwise, the remaining feedback has been addressed as well.

@@ -66,6 +69,9 @@ export function getFlags(manualArgv?: string) {
'disable-device-emulation': 'Disable Nexus 5X emulation',
'disable-cpu-throttling': 'Disable CPU throttling',
'disable-network-throttling': 'Disable network throttling',
'gather-mode':
'Collect artifacts from a connected browser and save to disk. If audit-mode is not also enabled, the run quit early.',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the run will quit early?

@@ -92,8 +92,12 @@ function handleError(err: LighthouseError) {
}

export function saveResults(results: Results, artifacts: Object, flags: Flags) {
const shouldSaveResults = flags.auditMode || (flags.gatherMode == flags.auditMode);
if (shouldSaveResults) return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait this seems backwards, shouldn't it be if (!shouldSaveResults)?

if not, maybe just method of variable name needs some tweaking :)

}

return handleError(err);
await potentiallyKillChrome(launchedChrome) return handleError(err);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing a ;?

@patrickhulce
Copy link
Collaborator

hey @devtools-bot why'd you change this :P

@paulirish
Copy link
Member Author

Ready for another look.

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢 it!


// save everything else
promise = promise.then(_ => {
fs.writeFileSync(`${basePath}/${artifactsFilename}`, JSON.stringify(artifacts), 'utf8');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

care about prettifying?

a consequence of this I just ran into is that some artifacts are objects like Map/Set that aren't reconstituted. we'll have to move away from that. it might just be me with unminified-javascript right now :)

const fs = require('fs');
const path = require('path');
const URL = require('./lib/url-shim');
const Sentry = require('./lib/sentry');

const basePath = path.join(process.cwd(), 'latest-run');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add this to .gitignore :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Split Gather phase from Audit phase
5 participants