Putting guards at primitives instead of sinks #176

annevk · 2019-05-28T07:19:06Z

I discussed this briefly with @koto and thought I'd file it to not lose track of the idea. What if instead of sinks we add the guards at the various primitives. E.g., not <a>, <form>, etc. but "navigate". Not fetch(), <img>, etc. but "fetch". Not appendChild() et al but "prepare a script" (or some such). Not innerHTML and friends but "HTML fragment parsing algorithm".

This would put the actual protections right at the dangerous points. We'd still have to change sinks to allow for typed objects to reach the dangerous points, but there's no longer the issue of overlooking a sink or overlooking trusted types when adding a new sink. Or the issue of it not being clear how to update all the various sinks as with Location as we could opt not to add trusted types for all of them.

The text was updated successfully, but these errors were encountered:

koto · 2019-05-28T14:11:12Z

That's a really good idea! It's mostly just an exercise in finding the right primitives matching the current sinks. For example, for now fetch() is not guarded. I assume the default policy (or a callback function) would run not at a sink setter level, but at the lower-lever primitive (like a spec algorithm).

Two potential downsides I see is that a) it might make the integration with other spec more verbose, so more challenging and b) maybe not everything is polyfillable.

I'll take a look at where we could hook the current TT sink setters, and whether that causes some change of behavior (it will for individual Location properties, so for <a href> as well, but that's actually what we want - #64).

koto · 2019-06-01T12:07:03Z

One problem with the Fetch integration is that this would happen too late. To avoid having to taint track nodes, TT are guarded at node attribute setters, even if the node is not connected to the document.

For example, for scripts fetch would only be triggered when the script is prepared, but we'd rather require the type earlier, in this case when setting the src.

We could definitely benefit from Fetch integration e.g. by referring to (script-like Request destination)[https://fetch.spec.whatwg.org/#request-destination-script-like] and requiring the URL attributes being TrustedURL or TrustedScriptURL based on whether the Fetch destination later on would be script-like or not. However this seems hard to specify normatively ("require this type if it's possible for a different algorithm to set this value to Request" is quite vague). Perhaps it's better off adding a non-normative note instead? This already seems like what is happening elsewhere, e.g. "audioworklet" Request destination is not specified in Web Audio, Worklets nor HTML, and its value is only linked to audioWorklet.addModule() in a table in non-normative node in Fetch.

annevk · 2019-06-04T13:39:41Z

You can certainly put requirements on entry points. The main thing about having the actual guard at the primitive is to avoid missing entry points.

Not sure about the note thing, "audioworklet" not being defined in a normative matter is a bug.

koto · 2019-06-04T14:49:39Z

That's fine, I was just trying to figure out where the Request destinations are defined and audioworklet was the first thing I tried. The same bug exists for paintworklet that I can't find a reference of outside of Fetch, but I assume other values are better in that regard.

I still see some difficulty for speccing this, but it's likely I just misunderstood what you mean. Do you mean to add additional guards in e.g. fetch (the most likely algorithm that would relate to URL TTs), such that it rejects non-compliant values?

E.g.

If request initiator is "non-parser-inserted" , and request window's documents' TrustedTypeConfiguration's string at sink disposition is "reject" and a request URL is not a getTypeForDestination(request destination), reject with some error.

That would require preserving the type of the URL in Fetch as well. So far we're not doing this yet in the JS fetch() call, because we looked into non-script URLs mostly to address a problem with javascript: or other powerful schemes in URLs, and not to prevent requests from happening.

If we'd want to keep JS fetch(string) to work regardless of TT enforcement, we need a way to track the requests initiated by DOM / other sinks from ones using XHR / Fetch - should we use initiator for that?

annevk · 2019-06-12T07:22:46Z

Yeah, worklets are not defined well and have quite a few open issues against them... Perhaps you can ping some colleagues?

Initiator might be useful if there's nothing else that gives it away. Note also that at a specification level URL is not a string, e.g., to make blob URLs work. So there's infrastructure already to attach additional data to URLs.

koto · 2019-08-07T06:07:34Z

I was able to tackle the javascript: URLs in a more developer-friendly matter.

Previously, all navigational sinks required a TrustedURL. The only issue with those sinks was that they could potentially navigate to javascript:, but most usages of them were benign. We found that it's unneccessarily difficult for the authors to cover these sinks with TrustedURLs (this is by far the most common violation source), so instead I propose a simple hook that will make sure that a navigation to a javascript: URL will go through a default policy:

https://github.com/WICG/trusted-types/pull/204/files

This places the burden on the authors who still use javascript: URLs, and makes sure that, in "TT mode" the javascript: URLs don't work by default.

existing algorithms.

koto · 2019-09-03T15:11:11Z

@annevk I've found a couple of algorithms that form the set of primitives that are a good target for guards. The issue is that as these primitives accept strings, or byte streams and are called a few levels deeper than the TT logic, and I don't know how to express in the spec that these should take TT into account, without clumsily passing an additional argumentall the way through the stack.

I see something similar when HTML fragment parsing algorithms is called out as a special case e.g. here

Would something like the following work for TrustedHTML for example? If this is acceptable, I can extend that approach to other types and primitive algos.

[Pull request] [Spec preview]

…ascript: URLs. (#204) This removes the burden from all authors to create types when interacting with common sinks that usually don't cause DOM XSS (unless for javascript: URLs). This PR prevents javascript: URLs from working by default, and allows programmatic opt-in to enable them one-by-one for the few applications that need them. Related to #176. Partially addresses #169. Fixes #64.

koto · 2019-09-30T15:31:01Z

There's three generic approaches we could take to implement checks at the primitive algorithms:

Preserve the typed object in a given sink (e.g. attribute node, text node), and perform checks (+ stringify) in a primitive algorithm. We would still prefer that the default policy and the rejection / notification happened early, as this is the developer experience that we want (e.g. throw on script.setAttribute('src', ...), not on appendChild(script)). The primitive algorithms would perform the checks defined in TT spec via relevant global object.

This requires modifying some of the core IDL interfaces, e.g. Attr or Text.
Add a flag (e.g. was_trusted_html) to interfaces used in a given sink and check that it's present in the lower-level algorithms.

This also requires modifying some core IDL interfaces, to store a flag.
Pass the flag through the algorithms, without modifying the IDL interfaces. This can likely only be done formally iff a given primitive algorithm is called by the sink setter algo, and not if the former is caller at a later point in time (e.g. for prepare script). That said, maybe this can be explained in the algorithm prose.

I've identified the following algorithms that would be affected. It seems that, after fixing #47, the TT behavior for the currently exposed APIs would be the same with and without the hardening fixes outlined below.

HTML parsing

Create an element for the token is called after parsing; since the input data was trusted implicitly, this algorithm needs to add trust info to created elements.
HTML fragment parsing. It would have to callout to a relevant global object to make a check.
HTML parser (used by document.write). Seems like this needs to be guarded at document.write level only, as it accepts a stream of code points, and not a string.

Attributes

Attributes / Text nodes should be modified to also store trust information (typed values or a flag). https://dom.spec.whatwg.org/#interface-attr
Change signature of https://dom.spec.whatwg.org/#dom-element-setattribute and related.
In case approach 1 is taken, change Attr value getter to stringify the output; also in https://dom.spec.whatwg.org/#concept-element-attributes-get-value (or perhaps introduce a don't stringify argument?)
Do TT checks via relevant global object in https://dom.spec.whatwg.org/#set-an-existing-attribute-value
Add checks to https://dom.spec.whatwg.org/#concept-element-attributes-append, https://dom.spec.whatwg.org/#concept-element-attributes-change and
https://dom.spec.whatwg.org/#concept-element-attributes-replace (called by NamedNodeMap setNamedItem(NS)).

Scripts

Add checks at https://html.spec.whatwg.org/#execute-the-script-block or https://html.spec.whatwg.org/#prepare-a-script. In the latter step 13 has CSP integration for inline scripts and is passed an element; this will later call https://w3c.github.io/webappsec-csp/#match-element-to-source-list
For non-inline scripts CSP only integrates at fetch level; this might be too late for TT, though if this is only an additional guard, perhaps it's fine to hook into https://fetch.spec.whatwg.org/#main-fetch 2.4. However, the checks are based on a request destination - a string. We need to either:
- add TT metadata to Fetch request, and verify those in CSP checks, or:
- Add TT checks in https://html.spec.whatwg.org/#prepare-a-script, similar to step 5.
Perhaps change https://w3c.github.io/webappsec-trusted-types/dist/spec/index.html#dom-text-node-validation. The checks might be problematic in step 7 of https://dom.spec.whatwg.org/#concept-node-insert as they can partially fail, and then step 8 doesn't run. Perhaps hook after insertion but pre execution.

koto · 2019-10-11T16:50:47Z

Moved some of this upstream to whatwg/html#3052 and whatwg/dom#789.

domenic · 2019-10-15T09:45:11Z

I just want to write up a concrete analysis of what I think the spec text would look like for the approaches @koto mentions.

There's three generic approaches we could take to implement checks at the primitive algorithms:

This requires updating all sinks' IDL, since they need to accept new types anyway, and then updating all the intermediate algorithms that process the accepted values, all the way down to the primitives, where eventually the primitive does conversion to a string, which causes enforcement to trigger.
If I understand it, this seems like (3) but with extra public API, which is unnecessary. Let's skip it.
This requires updating all sinks' IDL, as well as updating the sink algorithms to turn a string in to a (string, trusted = false) tuple, while turning a trusted type into a (string, trusted = true) tuple. Then we thread those tuples through all intermediate algorithms, all the way down to primitives, where the primitive does enforcement. So, it is basically the same as (1), but feels a bit cleaner since we are not passing IDL types as far down into the stack (e.g. potentially across threads).

Other shared subtleties:

Where to throw the enforcement error, once the value reaches the primitives and fails to pass, is not obvious. You'd probably need to store the global object as part of the tuple. What about in async cases---is it too late to throw? I'm not sure if there are currently any situations where the sink is separated from the primitive by an async boundary, but it's easy to imagine some, especially with URLs.
If intermediate algorithms need to treat the value as a string (or URL, or...) then they will need to pierce the trusted type or tuple. This can be a bit tricky, and will likely mean many of them need to be updated too.

(DOM|USV)String. This is to hook up the Trusted Types validation during the ES->IDL type conversion to avoid funky issues with its default policy. See w3c/trusted-types#248, w3c/trusted-types#176

koto · 2020-03-02T22:39:29Z

The 'putting guards at primitives' approach has surfaced issues with the default policy (#248). It looks like it's preferable to address the TT checks, with the default policy application at IDL conversion time (in one of the few variants outlined in whatwg/webidl#841) - so even "before" sinks. The advantages we get for avoiding the default policy problems seem to outweigh the complexity of guarding at all affected primitives.

There are exceptions to that, more in line with the "at primitives" approach:

javascript: navigation (https://w3c.github.io/webappsec-trusted-types/dist/spec/#require-trusted-types-for-pre-navigation-check)
script processing (https://w3c.github.io/webappsec-trusted-types/dist/spec/#slot-value-verification). This one will likely remain for script text only, as script URL could be guarded like other attributes, via DOM's attribute validate steps.

Please let us know if that sounds like a bad call, @annevk.

(DOM|USV)String. This is to hook up the Trusted Types validation during the ES->IDL type conversion to avoid funky issues with its default policy. See w3c/trusted-types#248, w3c/trusted-types#176

koto mentioned this issue Jun 3, 2019

Allow for limiting the sinks that a type can be used for #152

Closed

koto added this to the v1 milestone Jun 24, 2019

koto added the spec label Jun 24, 2019

Andrew-Cottrell mentioned this issue Jul 12, 2019

Perhaps typedef RequestInfo should include TrustedURL google/closure-compiler#3429

Closed

koto mentioned this issue Aug 6, 2019

Replace TrustedURL with calling a default policy on navigation to javascript: URLs. #204

Merged

jonathanKingston mentioned this issue Aug 20, 2019

Trusted Types mozilla/standards-positions#20

Closed

koto added a commit to koto/trusted-types that referenced this issue Sep 3, 2019

First attempt at fixing w3c#176. Add an argument to pass through to

54a5218

existing algorithms.

This was referenced Oct 11, 2019

Consider something like a "Trusted Type" system for DOM manipulation. whatwg/html#3052

Closed

Trusted Types integration whatwg/dom#789

Open

koto mentioned this issue Feb 10, 2020

Added a definition of {{StringContext}} extended attribute. whatwg/webidl#841

Open

koto closed this as completed Mar 6, 2020

lukewarlow mentioned this issue Mar 12, 2024

Add {{StringContext}} extended attribute whatwg/webidl#1392

Closed

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Putting guards at primitives instead of sinks #176

Putting guards at primitives instead of sinks #176

annevk commented May 28, 2019

koto commented May 28, 2019

koto commented Jun 1, 2019

annevk commented Jun 4, 2019

koto commented Jun 4, 2019

annevk commented Jun 12, 2019

koto commented Aug 7, 2019

koto commented Sep 3, 2019

koto commented Sep 30, 2019

koto commented Oct 11, 2019

domenic commented Oct 15, 2019

koto commented Mar 2, 2020

Putting guards at primitives instead of sinks #176

Putting guards at primitives instead of sinks #176

Comments

annevk commented May 28, 2019

koto commented May 28, 2019

koto commented Jun 1, 2019

annevk commented Jun 4, 2019

koto commented Jun 4, 2019

annevk commented Jun 12, 2019

koto commented Aug 7, 2019

koto commented Sep 3, 2019

koto commented Sep 30, 2019

HTML parsing

Attributes

Scripts

koto commented Oct 11, 2019

domenic commented Oct 15, 2019

koto commented Mar 2, 2020