This repository has been archived by the owner on May 5, 2022. It is now read-only.

Preload headers on sub resources #92

Closed
Krinkle opened this issue May 2, 2017 · 6 comments · Fixed by #108

@Krinkle
Member

Krinkle commented May 2, 2017

Preload is useful because it helps the browser discover a resource before it would (or could) naturally discover it. This matters especially with indirection, e.g. when the resource is not specified directly anywhere in the HTML document, or is far enough down in the source that it makes sense to specify it higher up or in a header as well.

However, sometimes this indirection is inevitable or even by design. Three use cases are laid out below:

  1. Background image. Wikipedia's sidebar is toward the bottom of the HTML payload, after the main content. In addition, the main logo is specified by a CSS background-image, not an IMG element. As such, browsers don't fetch the logo until the CSS is downloaded and parsed, and the entire HTML payload is downloaded, parsed and has its styles applied.

  2. Immutable scripts from a "startup" module. ResourceLoader's design document explains how all JS modules are served from immutable urls. That is to say, the module urls are versioned using a hash of their content. However, to ensure a consistent user experience from one page to another, and to allow quick deployments, the urls are not embedded in the (CDN-cached) HTML. Instead, we make use of an (async) "startup" script, which is the only mutable ("not versioned") script. As such, the urls to the main script payloads are in this script response and are not fetched until the browser has downloaded, parsed and executed this first JavaScript resource.

  3. Third-party sub resources. This use case doesn't apply to me personally, but I've seen it elsewhere. You include a third-party script (e.g. Google JS libraries) or stylesheet (e.g. something like jQuery UI or Bootstrap CSS). This third-party resource has sub resources (images, or other scripts) that are not discovered until run time.

These cases have three things in common:

  • They fetch sub resources that are not present in the HTML and that logically cannot or should not be there.
  • The sub resources (leaf nodes) are triggered indirectly (e.g. there is at least one sub resource in between the HTML and the leaf node).
  • Hardcoding them in preload on the HTML response would inevitably lead to a situation where cached content will result in preloading resources that are ultimately not used because the resources have changed url.

In these cases I think a significant improvement can be made without violating the requirements, namely if we could specify preload of resources on sub resources.

  1. The HTML cannot specify the logo url, but the stylesheet response could have a Link header that says to preload the logo. This way, regardless of downloading/parsing/applying the HTML and CSS, the browser can discover the fetch instruction as soon as it receives the headers of the stylesheet. Not as good as directly on the HTML, but certainly a significant improvement.
  2. The JavaScript contains a url, and the response can also set this url in a preload header. Again, this means that the fetch will not be delayed until the browser has downloaded and parsed the JavaScript and has arrived at its async or deferred execution slot, but instead it will start fetching as soon as it has the start of the JS response.
  3. The third-party library can send preload headers on their scripts or stylesheets. This could greatly benefit the way people use libraries from CDNs. (Personally I prefer self-hosting over use of CDNs, but as shown by cases 1 and 2, this is not limited to CDNs.)
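
To make the proposal concrete, here is a minimal sketch of the response headers a stylesheet could carry in case 1. The helper names and the `/static/logo.png` path are hypothetical and only serve the illustration; the `Link` value follows the usual `<url>; rel=preload; as=type` shape.

```python
def preload_link(url: str, as_type: str) -> str:
    """Format one preload entry for a Link response header."""
    return f"<{url}>; rel=preload; as={as_type}"


def link_header(entries) -> str:
    """Join several (url, as_type) pairs into a single Link header value."""
    return ", ".join(preload_link(url, as_type) for url, as_type in entries)


# Headers for the stylesheet response (not the HTML response).
# The logo path is an assumption made up for this sketch.
stylesheet_headers = {
    "Content-Type": "text/css",
    "Link": link_header([("/static/logo.png", "image")]),
}

print(stylesheet_headers["Link"])
```

Because the header lives on the stylesheet response, a cached HTML page never carries the logo url itself; only the stylesheet response needs to change when the logo changes.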
@Krinkle
Member Author

Krinkle commented May 2, 2017

This use case comes from Wikipedia (as indicated by the examples and #31). Despite the logo changing once or twice per year on some wikis (e.g. for events, or because of minor design adjustments or updated translations for one of the many language editions), we've chosen to go ahead with the preload header regardless, using the url that is known to the application at the time the page is saved by the CDN on a cache-miss.

This has the caveat that if the logo were to change (temporarily or not), users will needlessly preload the old logo for up to 7 days (our current TTL for most page views) when viewing pages that have not been purged/modified since the logo change.

@igrigorik
Member

@Krinkle all of the above makes sense... and should already work! You can specify preload headers on a subresource and those should be processed. Are you seeing otherwise? /cc @yoavweiss

@Krinkle
Member Author

Krinkle commented May 4, 2017

@igrigorik I couldn't find anything about it in the spec (also no platform tests, but I guess that's tricky with static files and non-HTML sub resources).

Also, there is Link rel=stylesheet as a header (Firefox-only if I recall correctly) which enables applying stylesheets to non-HTML responses (e.g. when viewing an image or plain text file).

Given that, I would assume the default would be for specs like these to only apply when processing the response for something that can be the subject of a browsing context / HTML Document – assuming of course that the Link rel=stylesheet is not needlessly downloaded in Firefox when the resource in question is a sub resource.

When creating a little test of my own, I do indeed find that preload works fine in Chrome for sub resources. See https://gist.github.com/Krinkle/d39cd3d4bd30a10fad5e1ae7f60b4c11 (PHP)
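
For readers without PHP at hand, a comparable test can be sketched with Python's standard `http.server`. This is not the gist's code; the routes, payloads and the `make_server` helper are assumptions made up for this sketch. Only the stylesheet response carries a `Link` preload header.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical routes: path -> (content type, body, optional Link header value).
ROUTES = {
    "/": (
        "text/html",
        b'<!doctype html><link rel="stylesheet" href="/style.css">'
        b'<div id="logo"></div>',
        None,
    ),
    "/style.css": (
        "text/css",
        b"#logo { width: 10px; height: 10px; background-image: url(/logo.svg); }",
        # The sub resource announces its own sub resource.
        "</logo.svg>; rel=preload; as=image",
    ),
    "/logo.svg": (
        "image/svg+xml",
        b'<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"/>',
        None,
    ),
}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        route = ROUTES.get(self.path)
        if route is None:
            self.send_error(404)
            return
        content_type, body, link = route
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        if link:
            self.send_header("Link", link)
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 0) -> HTTPServer:
    """Create (but do not start) a local test server; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), Handler)
```

Calling `make_server(8000).serve_forever()` and opening `http://127.0.0.1:8000/` with the network panel open should show whether the browser requests `/logo.svg` as soon as the stylesheet headers arrive.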

Thanks for that! Could you clarify where this is specified in the preload spec? (Or another spec.)

That way we can be somewhat confident that when other browsers implement preload, they will do so in the same manner.

@igrigorik
Member

Section 2.1: https://w3c.github.io/preload/#processing

The appropriate times to obtain the preload resource are:

  • When the user agent that supports [RFC5988] processes Link header that contains a preload link.

The above doesn't distinguish between navigation or subresource responses, and intentionally so. Personally, I don't think we need to explicitly spell this out? /cc @yoavweiss
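
As an illustration of that point, the sketch below extracts preload links from a `Link` header value without looking at what kind of response carried it. It is a deliberately simplified parser (it does not handle commas or semicolons inside quoted parameter values), and the function name is hypothetical; it is not how any browser implements this.

```python
import re

# Matches "<url>" followed by zero or more ";name=value" parameters.
_LINK_RE = re.compile(r"<([^>]*)>((?:\s*;\s*[^;,]+)*)")


def preload_targets(link_header: str):
    """Return (url, as) pairs for every rel=preload entry in a Link header value."""
    targets = []
    for url, raw_params in _LINK_RE.findall(link_header):
        params = {}
        for part in raw_params.split(";"):
            part = part.strip()
            if not part:
                continue
            name, _, value = part.partition("=")
            params[name.strip().lower()] = value.strip().strip('"')
        # rel may hold several space-separated relation types.
        if "preload" in params.get("rel", "").lower().split():
            targets.append((url, params.get("as", "")))
    return targets


# The same call applies to the headers of an HTML, CSS or JS response alike.
print(preload_targets('</logo.svg>; rel=preload; as=image, </next>; rel="next"'))
```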

@yoavweiss
Contributor

This should Just Work™, and I agree with @igrigorik that we don't need to spell it out in normative text. An example as well as a WPT might be in order though.

@yoavweiss
Contributor

Test is at web-platform-tests/wpt#6933
