When to fallback redirect? #397

irori · 2019-02-13T04:57:55Z

In the current loading spec, redirect to the fallback URL happens only when the signed exchange version is not supported.

However, Chromium implementation performs fallback redirect in the following cases (corresponding to the error codes here):

SXG was served from non-secure origin.
Unsupported version of SXG (could extract fallback URL).
SXG parse error (could extract fallback URL).
Network error occurred while loading SXG header.
Failed to fetch certificate chain.
Failed to parse certificate chain.
Signature verification failed.
Cert verification failed.
CT verification failed.
OCSP check failed.
Certificate Requirements aren't met.
SXG was served without X-Content-Type-Options: nosniff header.

Chromium doesn't fallbeck on the following errors:

SXG parse error (couldn't extract fallback URL).
Merkle integrity error.

We should clarify that which error should cause a fallback redirect, and spec them.

The text was updated successfully, but these errors were encountered:

kinu · 2019-02-13T05:01:16Z

/cc @twifkak for consumer side feedback

twifkak · 2019-02-20T01:01:27Z

Thanks for opening the issue. Sorry for the delay; I've been sick and this slipped my radar.

As a consumer & distributor, my product preference is to make as many things as possible a fallback. That said, I understand there may be reasons not to prefer that, and I would like to know more about them:

Debugging -- as a developer, I won't even notice that my SXG is invalid unless I have DevTools Network tab open and I scroll all the way to the top. That said, it seems like the UA should prioritize the typical user's experience over developer experience.
What others?

From your list above, I think there are some that I would feel safe removing fallback for today:

SXG was served from non-secure origin.
Network error occurred while loading SXG header. (I assume this is a network error on the outer response?)
SXG was served without X-Content-Type-Options: nosniff header.

Essentially these are all the "outer exchange" things that are well-trodden and I feel confident we'll do correctly; also, they require that we generate the correct response rather than that we validate (but don't modify) a given response.

I'm also okay with not falling back for Merkle integrity (this one was simple enough to check) or parse error (fallback seems dangerous).

For a distributor with the privacy-preserving-prefetch use case, the downsides of having a network error page are:

The user sees an error page instead of simply reduced page speed & extra network bytes.
The error page is unbranded. (In the early days of AMP, there used to be more branded error pages, but my understanding is that these have significantly reduced in frequency due to improvements in search infra.).

The potential reasons that our cache may serve an invalid SXG include:

A known limitation in our implementation (i.e. we haven't implemented validation X yet).
An unknown difference (i.e. a bug in either our cache's or the UA's implementation of SXG validation).
An intrinsically unsolvable difference (e.g. it's impossible to know the UA's trusted roots, or UA clock jitter [skew changing significantly from one request to another]).

At this point, I'm mostly worried about the first two. But I'm mostly guessing. After #374 (and after we implement server-side monitoring of those reports), I'll be able to respond more usefully.

twifkak · 2019-02-21T02:00:07Z

(I'm presuming that the fallback for SXG version mismatch happens before the network error for MICE decode failure. Otherwise, future Chromium versions will need to continue supporting old MICE versions.)

jyasskin · 2019-05-15T00:25:03Z

Thank you for collating all the error cases so we can just go through them. My suggestion is:

Situation	Behavior	Notes
SXG was served from non-secure origin.	🛑Error
Unsupported version of SXG (could extract fallback URL).	↪️Fallback
SXG parse error (could extract fallback URL).	🛑Error
Network error while streaming SXG header or body.	🛑Error
Failed to fetch certificate chain.	🛑Error
Failed to parse certificate chain.	🛑Error
Signature parsing failed (This includes fetching and parsing the certificate)	🛑Error
Signature isn't valid	↪️Fallback	This could error instead on the last step, which actually runs the validation algorithm?
Certificate doesn't have the CanSignHttpExchanges OID	🛑Error	Maybe we shouldn't make this exception from the next step.
Certificate chain does not have a trusted leaf for the origin for another reason	↪️Fallback	This includes cert path building, OCSP checking, and CT checking.
Certificate chain does not establish cross-origin trust for another reason	🛑Error	The non-certificate reasons can be checked on the distributor.
SXG was served without X-Content-Type-Options: nosniff header.	🛑Error
SXG parse error (couldn't extract fallback URL).	🛑Error
Merkle integrity error.	🛑Error

I basically want anything the distributor could catch to be an Error, while anything that clients might vary on (mostly time & crypto validity) should fall back. @sleevi, how do you feel about this list?

sleevi · 2019-05-15T00:56:21Z

I basically want anything the distributor could catch to be an Error, while anything that clients might vary on (mostly time & crypto validity) should fall back.

At first glance, that sounds like a great principle and I think your analysis is reasonable and spot-on. Signature isn’t valid seems a reasonable error?

jyasskin · 2019-05-15T03:34:04Z

@sleevi https://wicg.github.io/webpackage/loading.html#validating-signature includes the check against the client's clock (which I think should definitely fall back) and the check that it's a secp256r1 key (where I want to have clients fall back when they remove broken crypto, but maybe we should do that with a version increment instead).

We should also make sure that the reporting API distinguishes between the cases that error vs fall back. Right now there are distinctions in the table above that aren't captured in the reporting API results.

sleevi · 2019-05-15T15:42:19Z

Yeah, the approach for changing crypto primitives was definitely seen as a version-bumpable thing. Perhaps it's worth splitting out "validate a signature" and "validate the timestamp" as two separate things, the former being an error, the latter being a fallback?

twifkak · 2019-05-21T02:36:34Z

Initial NEL analysis from AMP Cache is available to Googlers in http://b/126787670. Overall, error rate is relatively low, but I'd say not low enough to serve network errors in all cases.

The two most prevalent errors are cert_fetch_error & mi_error. My understanding is that these can be due to a bad connection, which is beyond the control of the distributor (e.g. may be due to ISP issues or spotty cell coverage). I'm thinking that these cases should continue to fall back. Otherwise, spotty connnections would disproportionately affect SXG, because browsers are willing to render unsigned HTML that was truncated or had failed subresources. Are there downsides to falling back in these cases?

sleevi · 2019-05-21T02:41:58Z

Cert Fetch Errors are more similar to failures to fetch AIA (which causes hardfail) or OCSP (for which behaviour varies between browsers), right? As to whether it’s safe, the analysis would be: - Identifying all the information an adversary could learn by observing the fallback occurred - Identifying whether and how that information might already be available - Identifying how that information could be abused - Identifying what the process would be if we “got it wrong” in the analysis (in this case, what the implications would be if a browser had to suddenly switch to block)

irori · 2019-05-21T03:30:49Z

Reg. mi_error:

From the implementation's POV, browser sends SXG's inner response header to the renderer as soon as the SXG header is validated. Merkle integrity error is detected after that, so it's too late to chagne the response to 303 redirect at that pont.

jyasskin · 2019-05-23T19:17:00Z

cert_fetch_error: My instinct is that a failure to fetch the certificate from the same origin that's already transferring the signed exchange is similarly likely to a failure to fetch the signed exchange itself or the content from the publisher. Since we don't auto-reload on a network error, we shouldn't auto-redirect on one either.

mi_error:

It seems confusing to have a network error that truncates the SXG be expressed as mi_error. I think I'd want to see one of https://w3c.github.io/network-error-logging/#transmission-of-request-and-response-errors instead? @irori/@horo-t Is that plausible?
I hope the browser can render the parts of the content that have already been verified, even if the connection is broken which truncates and invalidates the last chunk.

irori · 2019-05-24T01:42:31Z

mi_error:

It seems confusing to have a network error that truncates the SXG be expressed as mi_error. I think I'd want to see one of https://w3c.github.io/network-error-logging/#transmission-of-request-and-response-errors instead? @irori/@horo-t Is that plausible?

IIUC, currently both a mi_error and a transmission error are reported. I think it's possible to stop sending mi_error for network error cases, but I'll defer to @horo-t.

I hope the browser can render the parts of the content that have already been verified, even if the connection is broken which truncates and invalidates the last chunk.

Chrome works in this way, although it's not guaranteed that response is loaded to the last validated chunk (internal buffer is discarded upon network error).

horo-t · 2019-05-24T10:32:46Z

mi_error:

It seems confusing to have a network error that truncates the SXG be expressed as mi_error. I think I'd want to see one of https://w3c.github.io/network-error-logging/#transmission-of-request-and-response-errors instead? @irori/@horo-t Is that plausible?

IIUC, currently both a mi_error and a transmission error are reported. I think it's possible to stop sending mi_error for network error cases, but I'll defer to @horo-t.

I agree.
We should not send mi_error when caused by a transmission error.
We need to update both the loading spec and Chromium implementation.

twifkak · 2019-05-24T18:30:01Z

Thanks for the clarifications; looks like I misunderstood what mi_error meant for the user. Re: cert_fetch_error; I'll defer to you all, unless I learn that these errors are disproportionately large and can't be mitigated at implementation side, and I can come up with a good reason they're different from AIA fetch. (The only difference I can think of is that a secure fallback seems possible in this case.)

Add: * An invariant fallback URL, like signed exchanges have. * A version number, so we can easily know to fall back to a redirect. * Some infrastructure to identify what kind of error broke the parse, which can feed into both Network Error Logging and #397’s discussion of when to fall back. * The index maps URLs to a Variants value + a list of the responses for each possible Variant-Key, instead of using a set of request headers. * A new signatures section allows authorities to vouch for particular subsets of the bundle. This document doesn't describe how a user agent would decide to trust cross-origin resources based on which authority vouches for them.

irori mentioned this issue Apr 5, 2019

Define -impl-03 #406

Merged

jyasskin mentioned this issue May 8, 2019

Prevent fallback loop when publisher publishes invalid SXG #432

Open

igrigorik mentioned this issue May 14, 2019

Navigation Timing behavior for HTTP Exchange (SxG) loading w3c/navigation-timing#107

Open

sleevi mentioned this issue May 15, 2019

Spec makes it difficult to distinguish signed and unsigned parameters #434

Open

jyasskin mentioned this issue Jul 5, 2019

Update the bundle format #450

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

When to fallback redirect? #397

When to fallback redirect? #397

irori commented Feb 13, 2019

kinu commented Feb 13, 2019

twifkak commented Feb 20, 2019 •

edited

Loading

twifkak commented Feb 21, 2019

jyasskin commented May 15, 2019

sleevi commented May 15, 2019

jyasskin commented May 15, 2019

sleevi commented May 15, 2019

twifkak commented May 21, 2019

sleevi commented May 21, 2019 via email

irori commented May 21, 2019

jyasskin commented May 23, 2019

irori commented May 24, 2019

horo-t commented May 24, 2019

twifkak commented May 24, 2019

When to fallback redirect? #397

When to fallback redirect? #397

Comments

irori commented Feb 13, 2019

kinu commented Feb 13, 2019

twifkak commented Feb 20, 2019 • edited Loading

twifkak commented Feb 21, 2019

jyasskin commented May 15, 2019

sleevi commented May 15, 2019

jyasskin commented May 15, 2019

sleevi commented May 15, 2019

twifkak commented May 21, 2019

sleevi commented May 21, 2019 via email

irori commented May 21, 2019

jyasskin commented May 23, 2019

irori commented May 24, 2019

horo-t commented May 24, 2019

twifkak commented May 24, 2019

twifkak commented Feb 20, 2019 •

edited

Loading