Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update loading spec to support subresource substitution #591

Merged
merged 21 commits into from
Sep 4, 2020

Conversation

horo-t
Copy link
Collaborator

@horo-t horo-t commented Aug 12, 2020

I'd like to update the "Loading Signed Exchanges" spec to support Signed Exchange subresource substitution.
The logic is copied from "Algorithm sketch" section of the explainer of Signed Exchange subresource substitution.


Preview | Diff

@horo-t horo-t requested a review from jyasskin August 12, 2020 05:47
Copy link
Member

@jyasskin jyasskin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be more complicated than just copying and pasting the algorithm sketch from the explainer. You need to go through the adjacent specifications and describe what's happening in their terms. Where the adjacent specs don't exist yet, like prenavigate, you can be a bit more handwavey, but try to be as precise as you can.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated
The user agent SHOULD prefetch the alternate subresources signed exchanges if
declared in link headers in the outer and the inner response header of the
signed exchanges by running the following steps:
1. When the UA detects a "preload" link HTTP header in a signed exchange
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only defined resource hint that tells the UA to process the inner response at all is prerender, which is fairly deprecated. We need to specify that this happens inside of prenavigate, which doesn't have a spec yet... @yoavweiss, what's the status there?

I think this block needs to be a monkeypatch to either the prenavigate definition, or the fetch and process the linked resource algorithm that preload uses. Probably both: prenavigate to set some flag about which origins are ok to connect to, and "fetch and process the linked resource" to notice that and use the rel=alternate link instead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prenavigate is indeed not currently defined. All we have is a few past discussions :/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(As we also chatted elsewhere) I don't think this part specifically relies on prenavigate (while this one itself needs to have spec too), but rather depending on the unclarified behavior processing link headers won't be considered preprocessing the content (w3c/resource-hints#77 (comment)). But yes this part also needs to be reflected in the spec text.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this monkeypatch does describes the behavior of processing the preload header in the inner response of the prefetched signed exchange.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated
So in [processing](https://www.w3.org/TR/preload/#processing) of [Link type
"preload"](https://www.w3.org/TR/preload/#link-type-preload) add the following
step:
1. For each preload, use the imagesrcset and imagesizes attributes to pick a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@domfarolino, I'm having trouble finding where <link rel=preload> applies the imagesrcset and imagesizes attributes to control what it fetches. https://www.w3.org/TR/preload/#processing appears to use the default https://html.spec.whatwg.org/multipage/semantics.html#fetch-and-process-the-linked-resource algorithm, which doesn't consult those attributes. Since you wrote whatwg/html#4048, can you point us in the right direction?

We should find or define a way for this algorithm to just ask the new document what it preloaded. We'll also have to pick the moment at which we check this set, since documents can add and remove <link rel="allowed-alt-sxg"> and <link rel="preload"> elements at any time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While runnning default fetch and process the linked resource, the linked resource fetch setup steps for preload is called from step 10.

  1. Run the linked resource fetch setup steps, given el and request. If the result is false, then return.

The linked resource fetch setup steps for preload calls select an image source which calls update the source set to update source set using the imagesrcset and imagesizes attributes and returns selected source from the source set. And

  1. Set request's url to url.

So I think we need the monkypatch after 4. If as is "image", then: of the linked resource fetch setup steps for preload.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
@horo-t
Copy link
Collaborator Author

horo-t commented Aug 17, 2020

@jyasskin @kinu
Cold you please review the latest PR?
99de0b3

Copy link
Member

@jyasskin jyasskin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't get through the whole thing. I'll do more tomorrow.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated
parameter contains `allowed-alt-sxg` and a `header-integrity` parameter set
to the value of the [=header integrity value=] of the [=alternate signed
exchange=]
* A <dfn dfn-type=dfn>alternate signed exchange link</dfn> is a relationship
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not actually a link relationship since it's not a keyword. Imitating https://www.w3.org/TR/resource-hints/#dfn-resource-hint-link, how about:

Suggested change
* A <dfn dfn-type=dfn>alternate signed exchange link</dfn> is a relationship
* An <dfn>alternate signed exchange link</dfn> is a `` `Link` `` header sent with a signed exchange |S|, with the [=alternate=] link type, the `type` parameter set to `application/signed-exchange`, and a [=Link Context=]. The [=Link Context=] MUST be the [=Link Target=] of a [=preload=] `` `Link` `` header inside the signed content of |S|, and the [=alternate signed exchange link=] means that this resource can also be found inside the signed exchange at the [=Link Target=] of the [=alternate signed exchange link=].
<div class="example" id="example-alternate-signed-exchange-link">
```http
Link: <...>; ...
```
</div>

Add the Link Target (https://tools.ietf.org/html/rfc8288#section-3.1) and Link Context (https://tools.ietf.org/html/rfc8288#section-3.2) definitions to the <pre class='anchors'> so they appear correctly in the References section.

You may have to fiddle with the bikeshed markdown to get it to link to https://html.spec.whatwg.org/multipage/links.html#rel-alternate and look right. See https://tabatkins.github.io/bikeshed/#autolinking.

This would probably make more sense with a diagram showing where in the inner and outer signed exchanges the headers appear. Try to make one with https://plantuml.com/component-diagram#58bb821cf05c48a2?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may have to fiddle with the bikeshed markdown to get it to link to https://html.spec.whatwg.org/multipage/links.html#rel-alternate and look right. See https://tabatkins.github.io/bikeshed/#autolinking.

Sorry I may not understand well how bikeshed works.
I created a PR to the HTML spec to export rel and alternate. whatwg/html#5843
Do I need to wait for the PR to be merged?

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
@horo-t
Copy link
Collaborator Author

horo-t commented Aug 24, 2020

I think 1304768 is ready for review.
@jyasskin Could you please review this again?
Thank you.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated

* The <{Link}> element.

The <dfn element-attr for="link">variants</dfn> parameter may be present.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to define its format. I think you mean that it's an instance of https://httpwg.org/http-extensions/draft-ietf-httpbis-variants.html#variants ... which is tricky because that's a full sh-dict, which doesn't really fit as a field inside another header. What's the implementation doing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the explanation.
Sorry, I don't understand why sh-dict doesn't fit as a field inside another header.

Currently Chrome is supporting the format defined in draft-ietf-httpbis-variants-04.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated
1. Let |preload links| be the set of <{link/rel/allowed-alt-sxg}> <a
http-header>Link</a> HTTP headers on |R|.
1. For each |preload link| in |preload links|:
1. Let |el| be a semantically equivalent link element to |preload link|.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to define "semantically-equivalent link element" in order to use it here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated

add the following steps:

5. If |el| is for the main resource's HTTP <a http-header>Link</a> header,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've run out of time to review today, so this comment is just a general direction rather than actually what needs to be fixed, but:

We're going to have to reconcile that the HTML spec pretends that it's dealing with <link> elements, but we're doing Link headers, and that HTML is trying to handle things incrementally, but we're putting a restriction on the overall set of Link headers. It might work to define a predicate on the response that runs over the full set of response headers to say whether to do the resource substitution at all, and then maybe you can avoid checking whether each element came from a header...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added note about the Issue(whatwg/html#4224).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also "is for the main resource" is not really precise, and it's be better to have some boolean that indicates what you mean here.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated
This parameter is using the same format as <a
http-header>Variant-Key</a> Header Field.

The [=appropriate time=] to [=fetch and process the linked resource=] for an
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not an "appropriate times" algorithm, (e.g. like this one), but more of a "should we preload this resource?" one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As such, it should be called from somewhere.

loading.bs Outdated

</div>

<h3 algorithm id="mp-document">Monkeypatch Document</h3>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This simply extends the Document object. I don't think I'd categorize it as "mokeypatch".

loading.bs Outdated

</blockquote>

<h3 algorithm id="mp-navigation-params">Monkeypatch navigation params struct</h3>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to above, extending an object is not necessarily "monkeypatching"

loading.bs Outdated
link](https://w3c.github.io/resource-hints/#fetching-the-resource-hint-link)
section in [[resource-hints]], add the following lines:

The user agent SHOULD run the following steps if the response of a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should mokeypatch Fetch or HTML (if this is only meant for prenavigates) directly.
So when either processes a response, they need to call an algorithm defined here that would:

  • Check if the response is a prefetch one (e.g. by checking its request's initiator) and otherwise bail
  • Check if the response came from an SXG. Otherwise bail
  • Create an exchange...

loading.bs Outdated
9. Let |allowed alternate subresource signed exchange links| be null.

10. If |sourceBrowsingContext|'s [=active document=]'s
[=Document/prefetched signed exchanges for navigation=] has a matching
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"has a matching exchange" is a bit handwavy. Might be better to define "prefetched signed exchanged for navigation" as a map

loading.bs Outdated

add the following steps:

5. If |el| is for the main resource's HTTP <a http-header>Link</a> header,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also "is for the main resource" is not really precise, and it's be better to have some boolean that indicates what you mean here.

@horo-t
Copy link
Collaborator Author

horo-t commented Aug 26, 2020

I changed this PR to monkeypatch Link type "prefetch" for recursive prefetching of alternate signed exchange subresources, and monkeypatch Page load processing model for HTML files for Link rel=preload HTTP headers processing.

@jyasskin @yoavweiss
Could you please review the latest patch?

@horo-t horo-t requested review from jyasskin and yoavweiss August 26, 2020 22:11
Copy link
Member

@jyasskin jyasskin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read every line here, but it's looking good. Go through the comments here (and try to generalize them), and then I expect to be able to fix up anything else tomorrow and merge the PR.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated
Comment on lines 598 to 580
1. Let |storedExchange| be the result of creating an [=exchange=] with
the URL of |linkTarget| and a new [=response=].
1. If |innerLinkVariants| is not the empty string (""), [=header
list/set=] `` `Variants` ``/|innerLinkVariants| in
|storedExchange|'s [=exchange/response=]'s [=response/header list=].
1. If |innerLinkVariantKey| is not the empty string (""), [=header
list/set=] `` `Variant-Key` ``/|innerLinkVariantKey| in
|storedExchange|'s [=exchange/response=]'s [=response/header list=].
1. Let |requestForMatch| be the result of [=creating a potential-CORS
request=] given |linkTarget|, the empty string, and [=No CORS=].
1. If |requestForMatch| doesn't [=matches the stored exchange=]
|storedExchange|, then continue.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably refactor [=matches the stored exchange=] so this algorithm doesn't need to create a placeholder exchange, but no need to do that in this change.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated
[=exchange/response=]'s [=response/header integrity value=]
encoded as a [[CSP]] <a grammar>hash-source</a>, then set
|alternateSXG| to |sxg|.
1. If |headerIntegrity| is not an empty string and |alternateSXG| is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't obvious why it was ok for |headerIntegrity| to be an empty string here. I think it's that when it's empty, there was no allowed signed exchange, and so we don't need it to have been prefetched ... in which case maybe it'll be clearer to put the "If |headerIntegrity| is not an empty string" condition around both the for loop and what's left of this step?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced getting allowed signed exchange link info algorithm which doesn't allow allowed signed exchange link info's header integrity to be null. And I added a Note about the intension. Does this answer your question?

@horo-t horo-t requested a review from jyasskin August 28, 2020 06:47
loading.bs Outdated Show resolved Hide resolved
1. If |selected source| is not null, then set |linkTarget| to the result
of [=URL parser|parsing=] |selected source|, with a base URL of
|response|'s [=response/URL=].
1. For each |allowedSxgLink| of |allowedSxgLinks|:
Copy link
Collaborator

@yoavweiss yoavweiss Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an O(n^2) algorithm. Is this how we're implementing this?
Not a blocker, but might be better to redefine this using a map, as a followup.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blink is using a map AlternateSignedExchangeResourceInfo.EntryMap.

I will do so as a followup.

loading.bs Outdated
a base URL of |navigationParams|'s [=navigation
params/response=]'s [=response/URL=].
1. Let |headerIntegrity| be null.
1. For each |allowedSxgLink| of |allowedSxgLinks|:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same O(n^2) comment as above

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will use a map as a followup.

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Show resolved Hide resolved
@horo-t horo-t requested a review from yoavweiss August 28, 2020 09:43
Copy link
Collaborator

@yoavweiss yoavweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % not-really-blocking comments

I still want @jyasskin to take a look before landing

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated
[=request/stashed exchange=] to |preloadLinkItem|'s [=alternate
signed exchange peload info/prefetched alternate exchange=].

Note: When |canLoadAlternateSxg| if false or there is no
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! That makes things clearer! :)

loading.bs Outdated Show resolved Hide resolved
@horo-t horo-t force-pushed the subresource-sxg-loading branch from 5fd8b75 to 107f39f Compare August 31, 2020 04:01
Copy link
Collaborator Author

@horo-t horo-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated Show resolved Hide resolved
loading.bs Outdated
1. Let |alternateSxgUrl| be [=alternate signed exchange link
info/target=] of the first [=alternate signed exchange link info=]
of |alternateLinks| whose [=alternate signed exchange link
info/variants=] matches |allowedSxgLink|'s [=allowed signed exchange
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "matches" here means "is identical to"? Or did you implement something more flexible?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The implementation is checking the equality of strings in alternate_signed_exchange_resource_info.cc

loading.bs Outdated
link info/variants=] and whose [=alternate signed exchange link
info/variant key=] matches |allowedSxgLink|'s [=allowed signed
exchange link info/variant key=] and whose [=alternate signed
exchange link info/context=] matches |linkTarget| or continue if it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this string equality or URL equivalence? Because the Link Context is a URL, I'm going to assume it's parsed and then checked for URL equivalence, but we need to be sure to be precise about this to avoid security problems when one of the links is unnecessarily more percent-encoded than the other.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The implementation is checking the equality of URL in alternate_signed_exchange_resource_info.cc.
Sorry I'm not sure what kind of security problems exist here.

Copy link
Member

@jyasskin jyasskin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally got through the whole thing. Thanks for bearing with me. I'll merge it now, and we can fix up any mistakes I made in another PR.

loading.bs Outdated

or continue if no such link is present.
1. Let |sxgRequest| be the result of [=creating a potential-CORS
request=] given |alternateSxgUrl|, the empty string, and [=No
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about this destination.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Correct.

loading.bs Outdated
key=] in |storedExchange|'s [=exchange/response=]'s
[=response/header list=].
1. Let |requestForMatch| be the result of [=creating a potential-CORS
request=] given |linkTarget|, the empty string, and [=No CORS=].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sets the destination to the empty string, but it looks like we have an as value above. We should use that, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Correct, thank you for fixing.

loading.bs Outdated
[=No CORS=].
1. If |requestForMatch| [=doesn't match the stored exchange=]
|storedExchange|, then continue.
1. Set |headerIntegrity| to |allowedSxgLink|'s [=allowed signed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it'll use the header-integrity of the last matching allowed-alt-sxg link if multiple such links match. Does that match your implementation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Chrome is using the first allowed-alt-sxg link entry.h.

I will update the spec to make it closer to the Chrome's implementation.
(But it may take a few months because of my paternity leave.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants