Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep track of the document URL post-redirects #582

Merged
merged 14 commits into from
Aug 20, 2016
Merged

Conversation

paulirish
Copy link
Member

Often from a non-extension the input URL may then be redirected. For example, there are 3 redirects if you attempt http://aliexpress.com:

image

There's a few places where we work with the current URL, including @jeffposnick's new #578 and the unregsw branch. And in these cases, we want the post-redirect URL.

This PR will update options.url as redirects change the document location.

Additionally, we used to have a driver.url, but only for the chrome extension. We now drop that completely, which simplifies our URL gatherer.

redirects.
`options.url` will be kept up-to-date, and `options.initialUrl` will
retain the initial URL provided to Lighthouse
The drivers no longer keep track of the current URL.
obj.
This is currently untested. Will need an end to end test to try
redirects.
.then(_ => {
// Update the outer options object of any post-redirect url
// Subsequent passes will use the new URL, and it'll be reported outwards
options.url = runOptions.url;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like this could bite us, though. Someone adding a new first pass, moving the performance gatherers to the second pass, would miss the redirects in the load time since the second pass would skip to the new URL. Is there a reason each pass shouldn't start with the same URL each time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did it mostly to save time in our later passes, but I agree it changes behavior. I'm fine with reusing the same initialUrl for each pass

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting challenge here. Thee http-redirect gather changes the options.url in beforePass and resets in afterPass.

They are separated into separate passes, so we are safe for now, but we need to make sure they don't share a pass. (well, until we fix the http redirect gather)

@paulirish
Copy link
Member Author

Rebased to master.

To test, run lighthouse http://aliexpress.com and verify output includes:
Lighthouse results: https://m.aliexpress.com/?tracelog=wwwhome2mobilesitehome

* This gatherer changes the options.url so that its pass loads the http page.
* After load it detects if its on a crypographic scheme.
* TODO: Instead of abusing a loadPage pass for this test, it could likely just do an XHR instead
*/
class HTTPRedirect extends Gatherer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually wondering if we should be surfacing this. For example, redirecting to an m. incurs a perf penalty, which should be captured in TTI etc, but as a meta piece of data highlighting the number of redirects might be helpful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redirects do show up (but aren't especially highlighted) in the critical request chains though timing info isn't included

@paullewis
Copy link
Contributor

LGTM. Think we might want to surface the number of redirects as a piece of data somewhere, but that can be a separate PR.

@paullewis paullewis added the lgtm label Aug 17, 2016
* As such, options.url will always represent the post-redirected URL.
* options.initialUrl is the pre-redirect URL that things started with
*
* Caveat: only works when network recording enabled for a pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we file a bug for this? I'm thinking either network recording should always happen, getting rid of config.network, or network recording should always happen but the results are discarded if config.network is false

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems important for gatherers/audits to be able to assume they don't have to do any work to get the actual url of the current page

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filed #604

@brendankenny
Copy link
Member

It still feels a bit weird to be updating options.url as we go. One other option: it could be collected as part of the tracing data (along with the trace and network records), passed into afterPass() on the loadData object and merged into the artifacts along with them as well

@brendankenny
Copy link
Member

(we could explore that in a follow up with a general options cleanup)

@paulirish
Copy link
Member Author

Yup I like the idea of doing it alongside the trace and network data.

@paulirish
Copy link
Member Author

A few items came up from this. Filed in #604 and #605.
Otherwise we should be good to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants