Last updated: Apr 2022
We propose a new approach to load a large number of resources efficiently using a format that allows multiple resources to be bundled, e.g. Web Bundles.
- Backgrounds
- Requirements
- <script>-based API
- Example
- Request's mode and credentials mode
- Request's destination
- CORS and CORP for subresource requests
- Content Security Policy (CSP)
- Defining the scopes
- Serving constraints
- Extensions
- Subsequent loading and Caching
- Compressed list of resources
- Alternate designs
- Loading many unbundled resources is still slower in 2020. We concluded that bundling was necessary in 2018, and our latest local measurement still suggests that.
- The output of JS bundlers (e.g. webpack) doesn't interact well with the HTTP cache. They are pretty good tools, but configuring them to work optimally is tough, and sometimes they're also incompatible with new requirements like dynamic bundling (e.g. a small edit with tree shaking could invalidate everything).
- With JS bundlers, execution needs to wait for all of the bytes to arrive. Ideally, loading multiple subresources should be able to utilize full streaming and parallelization, but that's not possible if all resources are bundled as one JavaScript file. (For JS modules, execution still needs to wait for the entire tree due to the current deterministic execution model.)
Web pages will declare that some of their subresources are provided by the Web Bundle at a particular URL.
It's likely that the HTML parser will encounter some of the bundle's subresources before it receives the bundle's index. The declaration needs to somehow prevent the parser from double-fetching those bytes, which it can accomplish in a couple of ways.
We don't see an initial need for an associated JavaScript API to pull information out of the bundle.
We also don't address a way for Service Workers to use bundles to fill a Cache.
Service Workers can technically unpack a bundle into cache.put() calls themselves, and, while the result may take an inefficient amount of browser-internal communication, letting some sites experiment with this will give us a better chance of designing the right API.
This is a powerful feature that can replace any subresource in the page, so we limit its use to secure contexts.
This feature is NOT related to Signed Exchanges; that is a common misunderstanding. The bundle doesn't have to be signed.
Developers will write
<script type="webbundle">
{
"source": "https://example.com/dir/subresources.wbn",
"resources": ["https://example.com/dir/a.js", "https://example.com/dir/b.js", "https://example.com/dir/c.png"]
}
</script>
to tell the browser that subresources specified in resources can be found within the https://example.com/dir/subresources.wbn bundle.
When the browser parses such a script element, it:
- Fetches the specified Web Bundle, https://example.com/dir/subresources.wbn.
- Records the resources and delays fetching a subresource specified there if the subresource's origin is the same as the bundle's origin and its path contains the bundle's shortened path as a prefix.
- As the bundle arrives, fulfills those pending subresource fetches from the bundle's contents.
- If a requested subresource isn't actually contained inside the bundle, it's probably better to fail that fetch than to go to the network, since it's easier for developers to fix a deterministic network error than a performance problem.
The primary requirement to avoid fetching the same bytes twice is that "If a specified subresource is needed later in the document, that later fetch should block until at least the index of the bundle has downloaded to see if it's there."
It seems secondary to then say that if a specified subresource isn't in the bundle, its fetch should fail or otherwise notify the developer: that just prevents delays in starting the subresource fetch.
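The decision described above can be sketched roughly as follows. This is an illustration only: planFetch, settleAfterIndex, and shortenedPath are hypothetical names rather than browser APIs, and the sketch assumes that the bundle's "shortened path" is its path with the last segment removed.

```typescript
// Hypothetical sketch; none of these names are part of a browser API.
type FetchPlan = "wait-for-bundle" | "fetch-from-network";
type Outcome = "serve-from-bundle" | "network-error";

// Assumption: the "shortened path" is the bundle's path minus its last segment.
function shortenedPath(bundle: URL): string {
  return bundle.pathname.slice(0, bundle.pathname.lastIndexOf("/") + 1);
}

// Decide, before the bundle has arrived, whether a fetch should be delayed.
function planFetch(requestUrl: string, bundleUrl: string, resources: string[]): FetchPlan {
  const bundle = new URL(bundleUrl);
  const request = new URL(requestUrl);
  const listed = resources.some((r) => new URL(r, bundle).href === request.href);
  const sameOrigin = request.origin === bundle.origin;
  const underBundlePath = request.pathname.startsWith(shortenedPath(bundle));
  return listed && sameOrigin && underBundlePath ? "wait-for-bundle" : "fetch-from-network";
}

// Once the bundle's index is known, settle a delayed fetch.
// Failing instead of falling back to the network follows the preference stated above.
function settleAfterIndex(requestUrl: string, index: Set<string>): Outcome {
  return index.has(new URL(requestUrl).href) ? "serve-from-bundle" : "network-error";
}
```

A fetch that planFetch marks as "wait-for-bundle" blocks only until the index is available, which is the point at which settleAfterIndex can answer.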
Suppose that the bundle, subresources.wbn, includes the following resources:
- https://example.com/dir/a.js (which depends on ./b.js)
- https://example.com/dir/b.js
- https://example.com/dir/c.png
- … (omitted)
The URL of a resource in the bundle can be relative to the bundle's URL. A browser must parse such a URL using the bundle's URL as the base.
<script type="webbundle">
{
"source": "https://example.com/dir/subresources.wbn",
"resources": ["https://example.com/dir/a.js", "https://example.com/dir/b.js", "https://example.com/dir/c.png"]
}
</script>
<script type="module" src="https://example.com/dir/a.js"></script>
<img src="https://example.com/dir/c.png">
Then, a browser must fetch the bundle, subresources.wbn, and load the subresources a.js, b.js, and c.png from the bundle.
A URL in source can be a relative URL and must be resolved against the document's base URL.
A URL in resources and scopes can be a relative URL and must be resolved against the bundle's URL.
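As a minimal sketch of the two resolution rules above, the following uses the standard URL constructor. The WebBundleRule and ResolvedRule types and the resolveRule function are hypothetical names introduced for illustration.

```typescript
// Hypothetical types mirroring the JSON inside <script type="webbundle">.
interface WebBundleRule {
  source: string;
  resources?: string[];
  scopes?: string[];
}

interface ResolvedRule {
  source: string;
  resources: string[];
  scopes: string[];
}

function resolveRule(rule: WebBundleRule, documentBaseUrl: string): ResolvedRule {
  // "source" is resolved against the document's base URL.
  const source = new URL(rule.source, documentBaseUrl);
  // "resources" and "scopes" are resolved against the bundle's URL.
  const againstBundle = (u: string): string => new URL(u, source).href;
  return {
    source: source.href,
    resources: (rule.resources ?? []).map(againstBundle),
    scopes: (rule.scopes ?? []).map(againstBundle),
  };
}
```

For instance, with a document at https://example.com/index.html, a source of "dir/subresources.wbn" and a resource of "a.js" resolve to https://example.com/dir/subresources.wbn and https://example.com/dir/a.js respectively.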
<script type="webbundle"> doesn't support the src= attribute. The rule must be inline.
A request for a bundle will have its mode set to "cors" and its credentials mode set to "same-origin" unless credentials is specified in its JSON as follows:
<script type="webbundle">
{
"source": "https://example.com/dir/subresources.wbn",
"credentials": "omit",
"resources": ["https://example.com/dir/a.js", "https://example.com/dir/b.js", "https://example.com/dir/c.png"]
}
</script>
Possible values are "omit", "same-origin", and "include". See the fetch spec for details.
If any other value is specified, the credentials mode is set to "same-origin".
Note: The <script> element's crossorigin attribute is not used.
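The fallback rule for the credentials value can be sketched as a small normalization function. The name bundleCredentialsMode is hypothetical; the three accepted values and the "same-origin" default come from the text above.

```typescript
type CredentialsMode = "omit" | "same-origin" | "include";

// Maps whatever appears under "credentials" in the JSON to a credentials mode.
// Missing or unrecognized values fall back to "same-origin".
function bundleCredentialsMode(value: unknown): CredentialsMode {
  switch (value) {
    case "omit":
      return "omit";
    case "include":
      return "include";
    default:
      return "same-origin";
  }
}
```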
With the <script>-based API, a request for a bundle will have its destination set to "webbundle" (whatwg/fetch#1120).
CORS and CORP checks on subresources in bundles are based on the URL and response headers of the requested subresource.
For example, if a cors request is made to a cross-origin subresource in a bundle, and the subresource does not have an Access-Control-Allow-Origin: header, the request will fail.
Similarly, if a no-cors request is made to a cross-origin subresource in a bundle, and the subresource has a Cross-Origin-Resource-Policy: same-origin header, the request will fail.
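The two examples above can be modeled with a deliberately simplified check. This is a sketch, not the Fetch algorithm: it ignores credentials, preflights, and Cross-Origin-Resource-Policy values other than same-origin. BundledResponse and passesCrossOriginChecks are hypothetical names, and header names are assumed to be lowercased.

```typescript
// Hypothetical, simplified model of the checks; not the Fetch algorithm itself.
interface BundledResponse {
  url: string; // URL of the subresource, not of the bundle
  headers: Record<string, string>; // assumption: header names are lowercased
}

function passesCrossOriginChecks(
  pageOrigin: string,
  mode: "cors" | "no-cors",
  response: BundledResponse,
): boolean {
  // Same-origin subresources are not subject to either check in this sketch.
  if (new URL(response.url).origin === pageOrigin) return true;
  if (mode === "cors") {
    const allowOrigin = response.headers["access-control-allow-origin"];
    return allowOrigin === "*" || allowOrigin === pageOrigin;
  }
  // no-cors: only the "same-origin" policy value is modeled here.
  return response.headers["cross-origin-resource-policy"] !== "same-origin";
}
```

Note that the bundle's own URL and headers are not inputs: only the subresource's URL and headers inside the bundle are consulted.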
For resources loaded from bundles, URL matching of CSP is done based on the URL of the resource, not the URL of the bundle. For example, given this CSP header:
Content-Security-Policy: script-src https://example.com/script/
In the following, a.js will be loaded, but b.js will be blocked:
<script type="webbundle">
{
"source": "https://example.com/subresources.wbn",
"resources": ["https://example.com/script/a.js",
"https://example.com/b.js"]
}
</script>
<script src="https://example.com/script/a.js"></script>
<script src="https://example.com/b.js"></script>
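To illustrate why a.js passes and b.js does not, here is a rough sketch of matching a resource URL against a single CSP source expression that has a scheme, host, and path. It does not handle wildcards, keywords, or ports in the way a full CSP implementation would, and the function names are hypothetical.

```typescript
// Hypothetical sketch of matching one CSP source expression with scheme, host and path.
function matchesSourceExpression(resourceUrl: string, sourceExpression: string): boolean {
  const resource = new URL(resourceUrl);
  const source = new URL(sourceExpression);
  if (resource.origin !== source.origin) return false;
  // A path ending in "/" matches by prefix; any other path must match exactly.
  return source.pathname.endsWith("/")
    ? resource.pathname.startsWith(source.pathname)
    : resource.pathname === source.pathname;
}

// The URL of the bundle is intentionally not a parameter: only the resource URL matters.
function allowedByScriptSrc(resourceUrl: string, sources: string[]): boolean {
  return sources.some((s) => matchesSourceExpression(resourceUrl, s));
}
```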
Instead of including a list of resources, the <script> can define scopes.
<script type="webbundle">
{
"source": "https://example.com/dir/subresources.wbn",
"scopes": ["https://example.com/dir/js/",
"https://example.com/dir/img/",
"https://example.com/dir/css/"]
}
</script>
Any subresource under the scopes will be fetched from the bundle.
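A minimal sketch of scope matching follows. It assumes that "under the scopes" means the request URL starts with one of the scope URLs after each scope is resolved against the bundle's URL; isUnderScopes is a hypothetical name.

```typescript
// Assumption: a subresource is "under" a scope when its URL has the scope URL as a prefix.
function isUnderScopes(requestUrl: string, bundleUrl: string, scopes: string[]): boolean {
  const request = new URL(requestUrl).href;
  return scopes.some((scope) => request.startsWith(new URL(scope, bundleUrl).href));
}
```

With the scopes from the example above, https://example.com/dir/js/app.js would be served from the bundle, while https://example.com/dir/other/x.js would not be affected.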
See the Serving constraints for response headers which MUST be included when serving Web Bundles over HTTP.
There are several extensions to this explainer, aiming to support various use cases which this explainer doesn't support:
See issue #641 for the motivation of splitting the explainer into the core part, this explainer, and the extension parts.
Dynamic bundle serving with WebBundles is a detailed exploration of how to efficiently retrieve only updated resources on the second load. The key property is that the client's request for a bundle embeds something like a cache digest of the resources it already has, and the server sends down the subset of the bundle that the client doesn't already have.
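The server-side selection can be sketched as below. This is a simplification under stated assumptions: the client's cache digest is modeled as an exact set of "url version" keys rather than a compact digest, and the BundledResource type, its version field, and resourcesToSend are hypothetical names not taken from the referenced document.

```typescript
// Hypothetical sketch: the "digest" is modeled as an exact set of "url version" keys.
interface BundledResource {
  url: string;
  version: string; // assumption: some identifier that changes when the content changes
}

// Returns the subset of the bundle that the client does not already have.
function resourcesToSend(bundle: BundledResource[], clientHas: Set<string>): BundledResource[] {
  return bundle.filter((r) => !clientHas.has(`${r.url} ${r.version}`));
}
```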
As discussed in Dynamic bundle serving with WebBundles, simply including a list of resources in the HTML may cost as little as 5 bytes per URL on average after the HTML is compressed.
This explainer used a <link>-based API before adopting the <script>-based API:
<link
rel="webbundle"
href="https://example.com/dir/subresources.wbn"
resources="https://example.com/dir/a.js https://example.com/dir/b.js https://example.com/dir/c.png"
/>
However, we abandoned the <link>-based API in favor of the <script>-based API. See issue #580 for the motivation. Note that some of the following alternate designs were proposed in the era of the <link>-based API. This explainer doesn't rewrite them with the <script>-based API yet.
A resource bundle is the same effort, with a particular scope. A resource bundle has a good FAQ which explains how this proposal and a resource bundle are related.
We have been collaborating closely to gather more feedback to draw a shared conclusion.
Several other mechanisms are available to give the bundler more flexibility or to compress the resource list.
A page still executes correctly, albeit slower than optimal, if a resource that's in a bundle is fetched an extra time, or a resource that's not in a bundle waits for the bundle to arrive before its fetch starts. That raises the possibility of putting a Bloom filter or other approximate membership query data structure, like a cuckoo filter or quotient filter, in the scoping attribute.
In this case, it must not be an error if a resource matches the filter but turns out not to be in the bundle, since false positives are an expected property of this kind of data structure.
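The following is a small sketch of that idea using a plain Bloom filter rather than the cuckoo filter used for the sample attribute value below. Everything here is hypothetical: the class, the hash (a seeded variant of FNV-1a chosen for brevity), and the resolveWithFilter helper are illustrations, not a proposed encoding.

```typescript
// Hypothetical sketch; one byte per bit is used for readability, not compactness.
interface MembershipFilter {
  mightContain(url: string): boolean;
}

class BloomFilter implements MembershipFilter {
  private readonly bits: Uint8Array;
  private readonly hashCount: number;

  constructor(size: number, hashCount: number) {
    this.bits = new Uint8Array(size);
    this.hashCount = hashCount;
  }

  // Seeded FNV-1a style hash, reduced to an index into the bit array.
  private indexFor(value: string, seed: number): number {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < value.length; i++) {
      h ^= value.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.bits.length;
  }

  add(url: string): void {
    for (let seed = 0; seed < this.hashCount; seed++) {
      this.bits[this.indexFor(url, seed)] = 1;
    }
  }

  // May return true for URLs that were never added, but never false for added ones.
  mightContain(url: string): boolean {
    for (let seed = 0; seed < this.hashCount; seed++) {
      if (this.bits[this.indexFor(url, seed)] !== 1) return false;
    }
    return true;
  }
}

type Resolution = "fetch-from-network" | "serve-from-bundle";

// A filter hit that the bundle's index does not confirm falls back to the network; it is not an error.
function resolveWithFilter(url: string, filter: MembershipFilter, index: Set<string>): Resolution {
  if (!filter.mightContain(url)) return "fetch-from-network"; // no need to wait for the bundle
  return index.has(url) ? "serve-from-bundle" : "fetch-from-network";
}
```

The second branch of resolveWithFilter is the case described above: the resource waited for the bundle's index unnecessarily, which costs time but does not break the page.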
<link
rel="webbundle"
href="https://example.com/dir/subresources.wbn"
digest="cuckoo-CwAAAAOztbwAAAM2AAAAAFeafVZwIPgAAAAA"
/>
In some cases, the page might be able to control when it issues fetches for all of the resources contained in a bundle. In that case, it doesn't need to describe the bundle's scope in the <link> element but can instead listen for its load event:
<link
rel="webbundle"
href="https://example.com/dir/subresources.wbn"
onload="startUsingTheSubresources()"
/>
Since the web bundles format includes an index before the content, we can optimize this by firing an event after the index is received (which expresses the bundle's exact scope) but before the content arrives:
<link
rel="webbundle"
href="https://example.com/dir/subresources.wbn"
onscopereceived="startUsingTheSubresources()"
/>
We might be able to use a link type as general as "bundle", especially if it also uses the MIME type of the bundle resource to determine how to process it. We'll need to disambiguate between a bundle meant for preloading subresources and a bundle meant as an alternative form of the current page. The second can use <link rel="alternate" type="application/web-bundle">.
Thanks to https://github.com/yoavweiss/cache-digests-cuckoo and https://github.com/google/brotli for the software used to generate sample attribute values.