core: bail if encounter insecure ssl cert, to avoid hanging forever #6300
lighthouse-core/gather/driver.js
Outdated
getSecurityState(timeout = 1000) {
  return new Promise((resolve, reject) => {
    const err = new LHError(LHError.errors.SECURITY_STATE_TIMEOUT);
    const asyncTimeout = setTimeout((_ => reject(err)), timeout);
this will eventually be replaced by #6296
b8eff57 to 104e399
@@ -165,6 +166,20 @@ class GatherRunner {
} else if (mainRecord.hasErrorStatusCode()) {
  errorDef = {...LHError.errors.ERRORED_DOCUMENT_REQUEST};
  errorDef.message += ` Status code: ${mainRecord.statusCode}.`;
} else if (!mainRecord.finished) {
AFAIK, an insecure security state will always have .finished set to false on the network records. I wanted to avoid calling these Security commands if not needed.
@@ -313,7 +333,7 @@ class GatherRunner {
}

// Resolve on tracing data using passName from config.
- return passData;
+ return [passData, pageLoadError];
thoughts?
WDYT about attaching it to passData instead? It's a relatively grab-bag object of what happened during a pass, so it makes some sense :)
Sure. I'll add it as an optional property. I avoided it in the first place b/c I didn't want to modify all usages of this type, but I completely forgot about optional properties :P
lighthouse-core/gather/driver.js
Outdated
this.sendCommand('Security.enable');
this.once('Security.securityStateChanged', (e) => {
  clearTimeout(asyncTimeout);
  resolve(e);
e => state ?
lighthouse-core/gather/driver.js
Outdated
this.once('Security.securityStateChanged', (e) => {
  clearTimeout(asyncTimeout);
  resolve(e);
  this.sendCommand('Security.disable').catch(reject);
I don't think you need this catch(). The timeout one is fine.
@@ -165,6 +166,20 @@ class GatherRunner {
} else if (mainRecord.hasErrorStatusCode()) {
  errorDef = {...LHError.errors.ERRORED_DOCUMENT_REQUEST};
  errorDef.message += ` Status code: ${mainRecord.statusCode}.`;
} else if (!mainRecord.finished) {
  // could be security error
I think this may be just certificate errors. Seems like most of the connection errors are mainRecord.failed === true.
@@ -165,6 +166,20 @@ class GatherRunner {
} else if (mainRecord.hasErrorStatusCode()) {
  errorDef = {...LHError.errors.ERRORED_DOCUMENT_REQUEST};
  errorDef.message += ` Status code: ${mainRecord.statusCode}.`;
} else if (!mainRecord.finished) {
  // could be security error
  const securityState = await driver.getSecurityState();
Small nit but I think it'd be slightly better to collect this on L288 and pass that into getPageLoadError() rather than all of driver.
  .filter(exp => exp.securityState === 'insecure')
  .map(exp => exp.description)
  .join(' ');
errorDef.message += ` Insecure: ${insecureDescriptions}`;
Agree the explanation gathering is the best we can do.
For this line I think it could just be
- errorDef.message += ` Insecure: ${insecureDescriptions}`;
+ errorDef.message += ` ${insecureDescriptions.join(' ')}`;
@@ -410,7 +430,8 @@ class GatherRunner {
await driver.setThrottling(options.settings, passConfig);
await GatherRunner.beforePass(passContext, gathererResults);
await GatherRunner.pass(passContext, gathererResults);
- const passData = await GatherRunner.afterPass(passContext, gathererResults);
+ const [passData, pageLoadError] =
@brendankenny @patrickhulce either of you have an idea on how else to deliver this error?
see also the L462 here
pageLoadError.fatal = true; throw pageLoadError; :P
lighthouse-core/lib/strings.js
Outdated
@@ -11,7 +11,9 @@ module.exports = {
badTraceRecording: `Something went wrong with recording the trace over your page load. Please run Lighthouse again.`,
pageLoadTookTooLong: `Your page took too long to load. Please follow the opportunities in the report to reduce your page load time, and then try re-running Lighthouse.`,
pageLoadFailed: `Lighthouse was unable to reliably load the page you requested. Make sure you are testing the correct URL and that the server is properly responding to all requests.`,
pageLoadFailedInsecure: `The URL you have provided does not have a valid security certificate.`,
hmm technically not just certificate, but admittedly most other security errors fall through the ERRORED_DOC_REQ path right now. how about
... does not have valid security credentials.
nice meaty change, awesome!
lighthouse-core/gather/driver.js
Outdated
const asyncTimeout = setTimeout((_ => reject(err)), timeout);

this.sendCommand('Security.enable');
this.once('Security.securityStateChanged', (e) => {
somewhat tangential...
@paulirish are we just paranoid for doing all our listeners before enabling or do we actually need to? I seem to remember dgozman scolding me for not trusting in JS microtasks when we had listeners before .enable :)
hmm good call.
This code here would be a problem if there was an await on the .enable call above. which seems fairly reasonable.
So yeah +1 to defining the once handler before we flip enable() on
good point, fixed.
I don't quite understand the execution model here. It's clear to me why awaiting the enable would schedule registering the listener much too late, but I don't understand how this original code worked (enable without awaiting, then register a listener via .once). Why does enabling the Security domain still occur after the listener is registered?
(note, this is literally all I know about micro/macrotasks: https://stackoverflow.com/a/25933985 )
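A minimal sketch of the ordering question above, using Node's EventEmitter as a stand-in for the driver. FakeConnection and both helper functions are hypothetical, and the sketch assumes the protocol event can only arrive in a later task, never synchronously inside sendCommand. Under that assumption, a listener registered right after an un-awaited enable still sees the event, while one registered after an awaited enable can miss it.

```javascript
const {EventEmitter} = require('events');

// Hypothetical stand-in for the protocol connection. The event is emitted in
// a later task (setImmediate), just before the command's promise resolves.
class FakeConnection extends EventEmitter {
  sendCommand(method) {
    return new Promise(resolve => {
      setImmediate(() => {
        if (method === 'Security.enable') {
          this.emit('Security.securityStateChanged', {securityState: 'insecure'});
        }
        resolve();
      });
    });
  }
}

// Original ordering: enable without awaiting, then register the listener.
async function listenAfterUnawaitedEnable() {
  const conn = new FakeConnection();
  const seen = [];
  conn.sendCommand('Security.enable'); // only schedules work, nothing fires yet
  conn.once('Security.securityStateChanged', e => seen.push(e.securityState));
  await new Promise(r => setTimeout(r, 20));
  return seen;
}

// Problematic ordering: awaiting enable lets the event fire before .once runs.
async function listenAfterAwaitedEnable() {
  const conn = new FakeConnection();
  const seen = [];
  await conn.sendCommand('Security.enable');
  conn.once('Security.securityStateChanged', e => seen.push(e.securityState));
  await new Promise(r => setTimeout(r, 20));
  return seen;
}
```

Registering the listener before calling enable sidesteps the question entirely, which is why it is the safer ordering.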
lighthouse-core/gather/driver.js
Outdated
const err = new LHError(LHError.errors.SECURITY_STATE_TIMEOUT);
const asyncTimeout = setTimeout((_ => reject(err)), timeout);

this.sendCommand('Security.enable');
also, @paulirish is Security domain still broken on Android, is this going to break us on real devices again?
well someone's behind the times 😆
@@ -165,6 +166,20 @@ class GatherRunner {
} else if (mainRecord.hasErrorStatusCode()) {
  errorDef = {...LHError.errors.ERRORED_DOCUMENT_REQUEST};
  errorDef.message += ` Status code: ${mainRecord.statusCode}.`;
} else if (!mainRecord.finished) {
  // could be security error
  const securityState = await driver.getSecurityState();
super nit: WDYT about sprinkling some nice destructuring here, const {securityState, explanations} = ? somethin' about securityState.securityState just spoke to me :)
I didn't even notice that :O
lighthouse-core/gather/driver.js
Outdated
const err = new LHError(LHError.errors.SECURITY_STATE_TIMEOUT);
const asyncTimeout = setTimeout((_ => reject(err)), timeout);

this.once('Security.securityStateChanged', state => {
aside: does this check break if someone else has been listening to the Security domain? It would be nice to get a non-event based version of this in the protocol (AFAIK not much we can do to guard against that case for now).
I enabled just before calling this function, and yes, it does break. It times out + rejects with SECURITY_STATE_TIMEOUT (which actually kills Lighthouse. oops)
Besides other users enabling the Security domain via code, could this fail if a developer opens the Security tab (this enables the domain) before running an audit?
lighthouse-core/gather/driver.js
Outdated
  resolve(state);
  this.sendCommand('Security.disable');
});
this.sendCommand('Security.enable');
If we want this to be a general utility function, it will need to disable the domain and remove the event listener in the rejection case. Otherwise we'll need to think how to make rejections always lead to a program exit, as the state of things might get weird (we'll have to deal with the same thing in #6296)
If we want this to be a general utility function, it will need to disable the domain and remove the event listener in the rejection case.
Maybe this is overkill. If it's timing out (unexpectedly), something is pretty wrong and LH probably won't recover. Disabling the domain also might not even work. OTOH, not cleaning up feels wrong :)
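For reference, a rough sketch of what cleaning up in the rejection case could look like. This is not the code in the PR: the free-standing getSecurityState(driver, timeout) signature is hypothetical, `driver` is assumed to be any object with on/removeListener and a promise-returning sendCommand, and a plain Error stands in for LHError.

```javascript
const {EventEmitter} = require('events');

// Hypothetical helper, not the real Lighthouse Driver method.
function getSecurityState(driver, timeout = 1000) {
  return new Promise((resolve, reject) => {
    const eventName = 'Security.securityStateChanged';

    // Shared by the success and failure paths so neither leaves state behind.
    const cleanup = () => {
      clearTimeout(timer);
      driver.removeListener(eventName, onState);
      // Best effort: if the protocol is wedged, disabling may fail as well.
      driver.sendCommand('Security.disable').catch(() => {});
    };
    const onState = state => {
      cleanup();
      resolve(state);
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error('SECURITY_STATE_TIMEOUT'));
    }, timeout);

    // Register the listener before enabling the domain.
    driver.on(eventName, onState);
    driver.sendCommand('Security.enable').catch(err => {
      cleanup();
      reject(err);
    });
  });
}
```

Swallowing the failure of Security.disable reflects the concern above that disabling might not even work once something has gone wrong.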
@@ -271,7 +278,9 @@ class GatherRunner {
const networkRecords = NetworkRecorder.recordsFromLogs(devtoolsLog);
log.verbose('statusEnd', status);

let pageLoadError = GatherRunner.getPageLoadError(passContext.url, networkRecords);
const securityState = await driver.getSecurityState();
Maybe we're making this harder than it needs to be by trying to unify with the other page load errors.
getPageLoadError is trying really hard to not interrupt normal control flow, inserting the promise rejection into the gather results instead of into the executing chain. Meanwhile, this security check is trying to abandon the gathering altogether.
Maybe this should just be a separate function, checkPageSecurityState() or whatever, called here-ish. If the page is insecure, throw in there. It'll bail back to here, which will bail up to the catch in run(). If we really want to still return something in that case, the check against LHError.errors.INSECURE_DOCUMENT_REQUEST.code can happen there (or rethrow if it's something different).
This also gets around extending passData, which makes sense, because really there isn't any other useful pass data when there's this kind of error.
WDYT?
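A small sketch of the control flow being proposed, under heavy simplification. CodedError is a stand-in for LHError, and this run() is a hypothetical synchronous reduction of the real one; only the throw, catch, check-the-code, rethrow shape is the point.

```javascript
// Stand-in for LHError: a plain Error carrying a `code` property.
class CodedError extends Error {
  constructor(code, message) {
    super(message);
    this.code = code;
  }
}

// Hypothetical check, called from the pass once the security state is known.
function checkPageSecurityState({securityState}) {
  if (securityState === 'insecure') {
    throw new CodedError('INSECURE_DOCUMENT_REQUEST', 'The page is insecure.');
  }
}

// Hypothetical top-level run(): gathering is abandoned on an insecure page,
// but a result carrying the error is still returned.
function run(gatherFn) {
  try {
    return {artifacts: gatherFn(), runtimeError: null};
  } catch (err) {
    // Anything other than the insecure-page error keeps propagating.
    if (err.code !== 'INSECURE_DOCUMENT_REQUEST') throw err;
    return {artifacts: null, runtimeError: {code: err.code, message: err.message}};
  }
}
```

With this shape nothing has to be threaded through passData; the thrown error is the only channel.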
Would be great to utilize error throwing to simplify this. But, I can't quite figure out how to still display a LHR in run's catch. I marked the security LHError fatal (so it would throw), moved baseArtifacts up a bit (so it can be accessed in catch), but get this silly error:
Error: CSSUsage failed to provide an artifact.
at Function.collectArtifacts (/Users/cjamcl/src/lighthouse/lighthouse-core/gather/gather-runner.js:366:15)
at <anonymous>
at process._tickCallback (internal/process/next_tick.js:189:7)
OK, I believe I resolved all comments. Opened #6330 to discuss how to handle errors being thrown during gather-runner + move this issue forward.
nice, I like this approach
lighthouse-core/gather/driver.js
Outdated
 */
getSecurityState() {
  // @ts-ignore
  return this.lastSecurityState;
should set this in the Driver constructor (I guess to null since SecurityStateChangedEvent is a relatively large object so it's probably not worth constructing an unknown starting value). Won't need to ts-ignore then
It's ts-ignored to coerce a non-nullable value, since I believe this should always be set. I could remove it and allow this getter to be nullable, but would also need to handle that in the calling code (a case that I think will never happen). wdyt?
If we believe it's always set but don't know for sure, WDYT about throwing an explicit error if it's not?
This way it'll be clear to us when something unexpected happens and ts is happy by default :)
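Something along these lines, as a sketch only. SketchDriver is a simplified stand-in for the real Driver, onSecurityStateChanged is a hypothetical hook in place of the protocol subscription, and the error text is made up.

```javascript
class SketchDriver {
  constructor() {
    // Declared up front so the shape of the object is visible in one place.
    /** @type {{securityState: string}|null} */
    this.lastSecurityState = null;
  }

  /**
   * Hypothetical hook; the real driver would update this from a
   * Security.securityStateChanged listener.
   * @param {{securityState: string}} state
   */
  onSecurityStateChanged(state) {
    this.lastSecurityState = state;
  }

  /** @return {{securityState: string}} */
  getSecurityState() {
    // Fail loudly instead of coercing a possibly-null value with @ts-ignore.
    if (!this.lastSecurityState) {
      throw new Error('Expected a security state.');
    }
    return this.lastSecurityState;
  }
}
```

The null check narrows the type inside the getter, so callers get a non-nullable return value without any ignore comment.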
@@ -112,6 +112,7 @@ class GatherRunner {
await driver.cacheNatives();
await driver.registerPerformanceObserver();
await driver.dismissJavaScriptDialogs();
await driver.listenForSecurityStateChanges();
doing it this way, I guess we'll find out if there are issues with listening through the whole page load :)
But it does nicely pave the way for bailing earlier if we want to do that.
  .filter(exp => exp.securityState === 'insecure')
  .map(exp => exp.description);
errorDef.message += ` ${insecureDescriptions.join(' ')}`;
return new LHError(errorDef, {fatal: true});
rather than returning the error and throwing on the return value, I'd say just throw in here
{fatal: true} shouldn't be necessary anymore
OK, I can do that. I was thinking it's nicer to make tests that check return values rather than throwing conditions, but just my preference, I could have it either way.
I'll remove fatal. yay, that means the resolve or throw thing really can still be removed. It wasn't being used anyway, doh.
@@ -172,6 +173,22 @@ class GatherRunner {
  }
}

/**
 * Returns an error if the original network request failed or wasn't found.
need to update string
 * @param {LH.Crdp.Security.SecurityStateChangedEvent} securityState
 * @return {LHError|undefined}
 */
static checkForSecurityIssue({securityState, explanations}) {
bikeshed: if it does throw in here, maybe rename assertCertificateError or something? Or does this catch a larger class of issues
It catches anytime the latest security state is marked insecure. a certificate error is one example. I don't know if it's just one of, most of them, or all possible scenarios.
Just triple checking because the protocol name is getting me concerned a bit, but securityState === 'insecure' only when the page is HTTPS but actually isn't secure, right? Not just every time the page is served over HTTP without encryption? :)
+1 to the idea of throwing in here with an assert* rename too, btw. Seems much easier and removes the cognitive load from the caller
Good idea to check HTTP. Tested with node lighthouse-cli http://www.httpvshttps.com/ --verbose --view - it still generated a full LHR. I presume "insecure" is only used for HTTPS connections that don't comply with security standards (and I guess they always go hand-in-hand with interstitial security warnings).
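Putting the thread together, a throwing version might look roughly like the sketch below. It is illustrative only: a plain Error with a `code` property stands in for LHError, and the message text is borrowed from the wording suggested for the strings file rather than taken from the merged code.

```javascript
// Sketch only, not the merged implementation.
// Does nothing unless the latest security state is 'insecure'.
function assertNoSecurityIssues({securityState, explanations}) {
  if (securityState !== 'insecure') return;

  // Only explanations that are themselves insecure go into the message.
  const insecureDescriptions = explanations
    .filter(exp => exp.securityState === 'insecure')
    .map(exp => exp.description);

  const err = new Error(
    'The URL you have provided does not have valid security credentials. ' +
    insecureDescriptions.join(' '));
  err.code = 'INSECURE_DOCUMENT_REQUEST';
  throw err;
}
```

A 'neutral' or 'secure' state passes straight through, which matches the plain-HTTP observation above: the page still produces a full report.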
@@ -288,7 +310,6 @@ class GatherRunner {
  trace,
};

- // Disable throttling so the afterPass analysis isn't throttled
restore?
lighthouse-core/lib/lh-error.js
Outdated
@@ -169,6 +175,12 @@ const ERRORS = {
  message: strings.requestContentTimeout,
},

// Protocol timeout failures
SECURITY_STATE_TIMEOUT: {
in the usual case for the security state check, it should just reject on an insecure security state, right? If so, we probably want to make this a more general protocol communication timeout error (anticipating #6296)
My plan was to modify that bit in the referenced issue. but now that we have this concrete proto definition it makes sense to make it good sooner rather than later
Oh, I actually removed the timeout stuff for security checking. I'll remove this too.
@@ -543,9 +545,9 @@ describe('GatherRunner', function() {
  ],
};

- return GatherRunner.afterPass({url, driver, passConfig}, {TestGatherer: []}).then(vals => {
+ return GatherRunner.afterPass({url, driver, passConfig}, {TestGatherer: []}).then(passData => {
ok to revert these lines now?
I think passData is more clear than vals.
Looking through the chromium source, the values to search for are
There are two ways it's set. If someone wants to track down which one is actually in our execution path, feel free :)
if (net::IsCertStatusError(cert_status) &&
    !net::IsCertStatusMinorError(cert_status)) {
  return blink::kWebSecurityStyleInsecure;
}
This is per-resource, not per-page, so probably not it.
switch (security_level) {
  // ...
  case security_state::DANGEROUS:
    return blink::kWebSecurityStyleInsecure;
  // ...
}
where DANGEROUS is
Either way, this assumption seems to be ok. The main extra thing seems to be malware/phishing. Presumably that information will be in the explanation put in the friendly message.
oh, mixed content might be an issue as well? We should check. I'm not sure we want to throw in that case
yeah, it's flagged as "neutral". Mixed content pages with passwords might trigger this (cf. https://crbug.com/647754), but it seems it will only require an error message tweak in the future if folks run into it
this is looking good. Just a few last things
@@ -679,6 +681,43 @@ describe('GatherRunner', function() {
  });
});

describe('#checkForSecurityIssue', () => {
#assertNoSecurityIssues
lighthouse-core/gather/driver.js
Outdated
@@ -849,6 +849,26 @@ class Driver {
  });
}

async listenForSecurityStateChanges() {
  this.on('Security.securityStateChanged', state => {
    this.lastSecurityState = state;
the type checker now allows this (in JS files as of 3.0 or 3.1), but should still declare it in the constructor so there's a clear place to check for what's on Driver, we don't dynamically mutate the object's shape, etc
yeah this really surprised me, how it was able to infer the type. it was also able to infer it could be undefined, since the ctor didn't set it.
Makes sense. This is not good:
I guess TSC just infers a permissive union type for this.property, based on all setters? What new feature of tsc 3.0/3.1 is this?
@@ -172,6 +173,22 @@ class GatherRunner {
  }
}

/**
 * Returns an error if the security state is insecure.
Throws
  GatherRunner.assertNoSecurityIssues(insecureSecurityState);
  assert.fail('expected INSECURE_DOCUMENT_REQUEST LHError');
} catch (err) {
  assert.equal(err.message, 'INSECURE_DOCUMENT_REQUEST');
maybe check err.code as well/instead, since that's what we use for the top level runtimeError
Done. Some other tests were just testing .message, so I added checks there too. lmk if I should just remove the .message assertions.
LGTM!
After encountering an SSL error, some commands to the Network/Page domains either took a while to time out, or hung forever (only seemed to happen for ERR_CERT_SYMANTEC_LEGACY). You can see this at https://ilkayuyarkaba.av.tr/tag/istifaya-zorlanan-iscinin-kidem-tazminati (use canary).
If one of the passes generates an SSL error, just bail. A report is generated with a useful error message at the top.
Fixes #6287.