Find the best terminology to restrict the usage of data urls #635

Closed
iherman opened this issue May 13, 2021 · 7 comments

@iherman

iherman commented May 13, 2021

Ya ya yawm TAG!

The category ("dispute escalation") is a misnomer; this is more a help/clarification request.

I'm requesting the TAG express an opinion on a problem related to:

We recommend the explainer to be in Markdown.

Explanation of the issue that we'd like the TAG's opinion on:

"There is no final agreement in the WG on how to precisely formulate the restrictions on the usage of data URLs. The current formulation relies on the term 'top-level browsing context', but that may not be adequate (e.g., if the top-level document is an SVG file)."

Cc @ylafon

@iherman added the labels "Progress: untriaged" and "Review type: conflict escalation" (The TAG is being asked to settle a debate) on May 13, 2021
@annevk
Member

annevk commented May 14, 2021

(See also whatwg/html#5279.)

@hadleybeeman
Member

Hi @iherman. We're looking at this in our W3CTAG breakout, and we'd love a little more context.

What are you trying to accomplish with data URLs? Why is it helpful to restrict them? It seems like you've got a use case in mind, but it's hard to work it out from the spec you've linked to.

We think it may be to do with security, but we don't see it documented. Can you tell us a bit more about your thinking?

We can hopefully help more with that information. Thanks!

@hadleybeeman self-assigned this on May 25, 2021
@iherman
Author

iherman commented May 25, 2021

Hey @hadleybeeman!

I'll try to summarize, but I'm also cc'ing @mattgarrish, @dauwhe, and @bduga, who have a deeper knowledge of what is happening. The relevant part of the specification is https://w3c.github.io/epub-specs/epub33/rs/#confreq-rs-data-urls.

In EPUB jargon, a Reading System is, from the point of view of what we are discussing, like a browser, insofar as one of its main tasks is to render either HTML or (standalone) SVG documents; these documents provide the reader with the pages of the book. These documents, referred to as "Top Level Content Documents", can be thought of as, say, the chapters of a large book (and the metadata provided in the EPUB instance tells the Reading System in which order these files should be displayed). Of course, these pages, which are HTML pages, can link to other resources, some in the EPUB instance and some elsewhere on the Web.

The security-related issue is how to handle data URLs. One approach is to disallow them universally; however, this would rule out some genuine use cases (e.g., SVG content embedded in an HTML or CSS file as a data URL). Hence the approach taken in the spec: disallow them as, say, the href value of an <a> element, but allow them in, e.g., a CSS file. The question was how to turn this into spec text.
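To make the distinction concrete, here is a minimal TypeScript sketch of the idea, not text or an API from the EPUB spec. The names `UsageContext`, `isDataUrl`, and `isDataUrlAllowed` are hypothetical, and the two contexts are a deliberate simplification of the cases mentioned above.

```typescript
// Hypothetical names throughout: UsageContext, isDataUrl and
// isDataUrlAllowed are illustrative only and do not come from the EPUB spec.
type UsageContext =
  | "link-navigation"    // e.g. the href value of an <a> element
  | "embedded-resource"; // e.g. an image referenced from an HTML or CSS file

// True when the string uses the "data:" scheme (RFC 2397).
// URL schemes are case-insensitive, so compare in lower case.
function isDataUrl(url: string): boolean {
  return url.trim().toLowerCase().startsWith("data:");
}

// Allow data URLs for embedded resources, but not as link targets.
function isDataUrlAllowed(url: string, context: UsageContext): boolean {
  if (!isDataUrl(url)) {
    return true; // this check only concerns data URLs
  }
  return context === "embedded-resource";
}

const svg = "data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg'/%3E";
console.log(isDataUrlAllowed(svg, "embedded-resource")); // true
console.log(isDataUrlAllowed(svg, "link-navigation"));   // false
```

A real Reading System would presumably make this decision at navigation time rather than by inspecting strings, which is what the wording question is about.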

We realized that browsers have similar restrictions, and the EPUB spec is keen to avoid reinventing not just the wheel but even the terminology, where possible. However, we did not find any normative statement in other specs that applies to this situation. We did put text into the current draft, but we are not sure whether it uses the proper reference and terminology. Hence the request for TAG help...

Some further references:

I hope this helps...

@mattgarrish

In addition to what Ivan has already mentioned: using data URLs to embed resources doesn't appear problematic, but allowing data URLs to be referenced from <a> elements carries the same security risks in EPUB that have been raised for browsers (i.e., phishing).

In other words, we want to disallow data URLs from opening a "top-level browsing context", except when explicitly requested by a user (e.g., to open an image in a new window), but we aren't completely sure how best to say this, since it seems to be handled only in bug trackers right now. For reference, see:

So, in the absence of more formal guidance (which we'd prefer to reference), does the following make sense:

Reading Systems MUST prevent data URLs [RFC2397] from opening in top-level browsing contexts [HTML], except when initiated through a Reading System affordance such as a context menu. If a Reading System does not use a top-level browsing context for Top-level Content Documents, it MUST also prevent data URLs from opening as though they are Top-level Content Documents.

Or do you have any suggestions on how we can improve this wording?
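For illustration, the proposed requirement can be read as a single navigation check. The TypeScript sketch below is one possible reading, not an implementation from any Reading System: `NavigationRequest` and `mustBlockNavigation` are hypothetical names, and it assumes that the affordance exception also applies to the second sentence (content opened "as though" it were a Top-level Content Document), which the wording does not state explicitly.

```typescript
// Hypothetical shape of a navigation request; none of these names come
// from the EPUB or HTML specifications.
interface NavigationRequest {
  url: string;
  // Would the target open in a top-level browsing context, or be shown
  // as though it were a Top-level Content Document?
  opensAsTopLevel: boolean;
  // Was the navigation triggered through a Reading System affordance,
  // such as a context menu entry?
  viaReadingSystemAffordance: boolean;
}

// Returns true when, under this reading of the proposed wording,
// the Reading System must prevent the navigation.
function mustBlockNavigation(req: NavigationRequest): boolean {
  // URL schemes are case-insensitive, so compare in lower case.
  const isData = req.url.trim().toLowerCase().startsWith("data:");
  // Assumption: the affordance exception covers both sentences of the rule.
  return isData && req.opensAsTopLevel && !req.viaReadingSystemAffordance;
}

// A link in the content that targets a data URL: blocked.
console.log(mustBlockNavigation({
  url: "data:text/html,<h1>Sign in</h1>",
  opensAsTopLevel: true,
  viaReadingSystemAffordance: false,
})); // true

// The user picks "open image in new window" from a context menu: allowed.
console.log(mustBlockNavigation({
  url: "data:image/png;base64,AAAA",
  opensAsTopLevel: true,
  viaReadingSystemAffordance: true,
})); // false
```

Folding both sentences into the single `opensAsTopLevel` flag keeps the check independent of whether the Reading System uses a top-level browsing context at all, which is the case the second sentence is meant to cover.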

@rhiaro
Contributor

rhiaro commented Aug 4, 2021

Hi @iherman and @mattgarrish. Sorry about the delay in responding to this.

A question that came up in our TAG meeting last week was: does the EPUB spec require secure contexts? I couldn't tell from a quick Ctrl+F of the spec. If it does, then it was resolved that data URLs at the top level do not create a secure context, in which case your wording could include something like:

Reading Systems MUST prevent data URLs [RFC2397] from opening in insecure contexts [https://html.spec.whatwg.org/multipage/webappapis.html#secure-context]

Otherwise, I think the wording you have is sufficient, and it is consistent with widely implemented browser behaviour. If the main concern is that an SVG at the top level doesn't count as a "browsing context", and SVG is the only exception, you could be explicit about this, e.g.:

Reading Systems MUST prevent data URLs [RFC2397] from opening in top-level browsing contexts [HTML], except when initiated through a Reading System affordance such as a context menu. If a Reading System does not use a top-level browsing context for Top-level Content Documents, for example if the Top-level Content Document is an SVG, it MUST also prevent data URLs from opening as though they are Top-level Content Documents.

My only other suggestion would be to consider the phrase "via a user-initiated navigation" instead of (or as well as) "through a Reading System affordance such as a context menu" if that adds any clarity to what you mean.

@rhiaro
Contributor

rhiaro commented Aug 10, 2021

I see the PR to update the wording was merged, so closing this. Thanks!

@rhiaro closed this as completed on Aug 10, 2021
@iherman
Author

iherman commented Aug 10, 2021

I see the PR to update the wording was merged, so closing this. Thanks!

Indeed, @rhiaro. Thank you!
