First draft: explainer-new. #193

otherdaniel · 2023-05-04T16:41:03Z

Explainer for new Sanitizer API design, based on recent sync meeting (#192).

Notes:

- Not ready yet; would like to add examples.
- I added several open questions not discussed in the meeting.

otherdaniel · 2023-05-04T16:50:06Z

This tries to summarize the result from the recent sync meeting. I hope it's a starting point to replace the current explainer for this project, and to then base future work on it. Please let me know if this doesn't reflect the meeting results accurately.

I've added several open questions (e.g., what about defaults?) that were not explicitly discussed at the meeting.

I'd like to add examples, too, to make this more useful for people not involved in the meetings.

annevk

Thanks for writing this up @otherdaniel! Left my thoughts and nits inline.

annevk · 2023-05-05T06:05:33Z

explainer-new.md

+## Open questions:
+
+- Defaults: If no filter is supplied, do the safe methods have any filtering
+  other than the baseline? Do the unsafe methods have default filtering?


It seems good to conclude #188 (Clarify threat model, "XSS first and foremost") so that "baseline" is clear.

I think that by default "unsafe" shouldn't do any filtering. If you want, you can essentially supply your own blocklists and safelists, and the browser won't perform any normalization on them.

This is about more than "what the baseline is"; it is also about whether the default should be something "more usable" than the baseline (see also issue #183, "baseline should be default and default should be baseline"). I think it should suffice to enforce only the baseline (unless we actually do the full research on the defaults of other sanitizers and on, e.g., popular XSS gadget attributes that the "default" should also strip).
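To make the two positions in this thread concrete, here is a minimal sketch of the defaults being discussed: the safe methods always enforce a baseline, while the unsafe methods filter nothing unless a filter is supplied. Everything in it is an assumption for illustration. `makeElementFilter`, the `safe` flag, and the one-entry `HYPOTHETICAL_BASELINE_BLOCKED` set are made up here, and the real baseline is still to be settled in #188.

```javascript
// Placeholder baseline, assumed for this sketch only. The actual baseline
// depends on the threat-model discussion in #188.
const HYPOTHETICAL_BASELINE_BLOCKED = new Set(['script']);

// Returns a predicate that says whether an element name survives filtering.
// `allowElements` is the caller's optional safelist; `safe` selects between
// the safe and unsafe method families.
function makeElementFilter({ allowElements } = {}, { safe }) {
  const allow = allowElements ? new Set(allowElements) : null;
  return (name) => {
    // Safe methods: the baseline always wins, even over the caller's safelist.
    if (safe && HYPOTHETICAL_BASELINE_BLOCKED.has(name)) return false;
    // No filter supplied: nothing beyond the baseline is removed.
    if (allow === null) return true;
    return allow.has(name);
  };
}

const safeDefault = makeElementFilter(undefined, { safe: true });
const unsafeDefault = makeElementFilter(undefined, { safe: false });
console.log(safeDefault('script'), unsafeDefault('script'));
```

Under these assumptions, a safelist that names `script` still cannot bring it back through a safe method, which is the "baseline only" default argued for above.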

annevk · 2023-05-05T06:07:14Z

explainer-new.md

+- Defaults: All of these are new methods, without legacy usage. Would DSD
+  parsing default to `true`? Do we even decide that, or would we instead ask
+  the HTML WG and adopt whatever their choice?


I think ultimately that decision is largely up to the WHATWG. And we decided there previously that new parsing APIs won't need any kind of opt-in for declarative shadow roots. So we shouldn't add any here. I don't think we need opt-out either (that's handled through custom elements not being safelisted).

As long as a declarative shadow root always requires a custom element as a parent and as long as we require custom elements to be named in the allowElements list, I am very OK with this not being a toggle at all.

Reworded this. Agree on WHATWG getting the last word.

I'm not aware of any requirements that shadow roots - declarative or not - must be hosted by custom elements. I thought a boring old <div> could also host a shadow DOM. Am I overlooking something?

Trying to find evidence for restrictions on shadow root hosts. What I find is:

https://dom.spec.whatwg.org/#concept-element-shadow-root and

https://dom.spec.whatwg.org/#dom-element-attachshadow

The second one lists quite a few restrictions, but it does allow a number of specific HTML elements, e.g. div.

Good point. I raised this in whatwg/dom#831 (comment).

Having said that, I'd appreciate it if @mozfreddyb could elaborate on this concern, as I'm not yet seeing the problem with shadow roots potentially being present in the output.

If there is a problem though I don't think we should address it at the parser level, but instead the sanitizer should block elements with an associated shadow root.

Well, glad we got some clarity here. I was totally not aware that this may live outside of custom elements. That being said, I think this is all fine as long as the sanitizer traverses the shadow root.

We discussed this on the call and agree that this is fine as long as the sanitizer is traversing the shadow root.
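As a rough illustration of what "traversing the shadow root" means for the filter, the sketch below walks a toy tree and applies the same safelist to shadow root contents as to regular children. The node shape (`name`, `children`, optional `shadowRoot`) and the function names are assumptions made up for this example; this is not the DOM and not the sanitizer's actual algorithm.

```javascript
// Toy node model, assumed for illustration only:
//   { name: string, children: node[], shadowRoot?: { children: node[] } }

// Keeps only safelisted children and sanitizes each survivor recursively.
function filterChildren(children, allowed) {
  return children
    .filter((child) => allowed.has(child.name))
    .map((child) => sanitizeNode(child, allowed));
}

// Returns a sanitized copy of `node`. The shadow root, if any, is traversed
// with the same safelist, so nothing can hide inside it.
function sanitizeNode(node, allowed) {
  const result = {
    name: node.name,
    children: filterChildren(node.children, allowed),
  };
  if (node.shadowRoot) {
    result.shadowRoot = {
      children: filterChildren(node.shadowRoot.children, allowed),
    };
  }
  return result;
}

// A plain div hosting a shadow root that contains a script element.
const host = {
  name: 'div',
  children: [{ name: 'b', children: [] }],
  shadowRoot: {
    children: [
      { name: 'script', children: [] },
      { name: 'span', children: [] },
    ],
  },
};
const clean = sanitizeNode(host, new Set(['div', 'b', 'span']));
console.log(clean.shadowRoot.children.map((c) => c.name));
```

With this shape, whether the host is a custom element or a plain `div` makes no difference: the shadow tree is filtered either way.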

annevk · 2023-05-05T06:08:17Z

explainer-new.md

+- Should the filter config be a separate object, or should it be a plain dictionary?
+  - Reasons pro dictionary:
+     - Simpler.
+  - Reasons pro object:
+     - Allows to pre-process of the config and to amortize the cost over many calls.
+     - Allows adding other useful config operations, like introspection.


If actual normalization is performed on the input and that is exposed (rather than the input being reflected back) I could see folks supporting a class. The current Sanitizer class is not it though.

My main hope with this was that a framework would supply a Sanitizer instance for the web developer to use in framework-specific code (e.g., to prevent script gadget attacks). That being said, a dictionary is likely as good.

We also discussed this on the call. We agree that there's little value in having an object that just contains a dictionary. Nobody is against an object, as long as there is a useful feature behind it (e.g., normalization, performance).
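The trade-off in this thread can be sketched as follows. With a plain dictionary, any normalization has to be repeated on each call; with an object, it is done once in the constructor, and the normalized form can be exposed for introspection instead of reflecting the input back. All names here (`normalizeNames`, `allowsWithDictionary`, `PreprocessedConfig`) are hypothetical, and lower-casing plus de-duplication is only a stand-in for whatever normalization a real design would perform.

```javascript
// Stand-in normalization (an assumption): lower-case, de-duplicate, sort.
function normalizeNames(names) {
  return [...new Set((names ?? []).map((n) => n.toLowerCase()))].sort();
}

// Option 1: plain dictionary. Every call repeats the normalization work.
function allowsWithDictionary(dict, elementName) {
  return normalizeNames(dict.allowElements).includes(elementName.toLowerCase());
}

// Option 2: object. The normalization cost is paid once, then amortized
// over many calls.
class PreprocessedConfig {
  #allowed;

  constructor(dict) {
    this.#allowed = new Set(normalizeNames(dict.allowElements));
  }

  allows(elementName) {
    return this.#allowed.has(elementName.toLowerCase());
  }

  // Introspection: returns the normalized list, not the caller's input.
  get allowElements() {
    return [...this.#allowed];
  }
}

const config = new PreprocessedConfig({ allowElements: ['DIV', 'b', 'div'] });
console.log(config.allowElements, config.allows('Div'));
```

An object that only stored the dictionary unchanged would add nothing over option 1, which matches the conclusion from the call.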

annevk · 2023-05-05T06:09:52Z

explainer-new.md

+- Baseline for safe methods: There are some "special" behaviours (currently in
+  "handle funky elements"), mainly around dropping javascript:-URLs in contexts
+  that navigate. Should this be available as an option (which would then be
+  force-true for safe usage, and default-false but available for non-safe
+  usage), or would that be custom behaviour only for the safe methods?


I'd be okay leaving it out until we see demand.

Done. (Removed this from open questions, and added a sentence after the API proposal.)

mozfreddyb

This is great. Thanks so much, Daniel!
(And my deepest apologies for the review delay here!)

I think this is almost ready to merge once we have resolved the minor sidebars.

mozfreddyb · 2023-05-12T10:33:18Z

explainer-new.md

+- Defaults: All of these are new methods, without legacy usage. Would DSD
+  parsing default to `true`? Do we even decide that, or would we instead ask
+  the HTML WG and adopt whatever their choice?


As long as a declarative shadow root always requires a custom element as a parent and as long as we require custom elements to be named in the allowElements list, I am very OK with this not being a toggle at all.

mozfreddyb · 2023-05-12T10:42:45Z

explainer-new.md

+## Open questions:
+
+- Defaults: If no filter is supplied, do the safe methods have any filtering
+  other than the baseline? Do the unsafe methods have default filtering?


This is about more than "what the baseline is"; it is also about whether the default should be something "more usable" than the baseline (see also issue #183, "baseline should be default and default should be baseline"). I think it should suffice to enforce only the baseline (unless we actually do the full research on the defaults of other sanitizers and on, e.g., popular XSS gadget attributes that the "default" should also strip).

mozfreddyb · 2023-05-12T10:47:49Z

explainer-new.md

+- Should the filter config be a separate object, or should it be a plain dictionary?
+  - Reasons pro dictionary:
+     - Simpler.
+  - Reasons pro object:
+     - Allows to pre-process of the config and to amortize the cost over many calls.
+     - Allows adding other useful config operations, like introspection.


My main hope with this was that a framework would supply a Sanitizer instance for the web developer to use in framework-specific code (e.g., to prevent script gadget attacks). That being said, a dictionary is likely as good.

annevk

Thanks, this looks great!

annevk · 2023-05-16T15:25:11Z

explainer.md

+- `Element.setHTML(string, {options})` - Parses `string` using `this` as
+  context element, like assigning to `innerHTML` would; applies a filter,
+  while enforcing an XSS-focused baseline; and finally replaces the children
+  of `this` with the results.


I'm not sure we want to directly compare parsing to innerHTML, as there will be at least two major differences:

- Support for declarative shadow roots.
- No support for XML.

While technically correct, I think it helps to draw a comparison. Even if it's not exactly 1:1, I am expecting many developers to switch from innerHTML= to setHTML().

This is summarizing the design for implementers, no? I don't mind it saying innerHTML, but it should be more accurate.

We agree on a softer wording: "almost like", "similar to innerHTML", etc. :) Obviously, there should be a section that explains the differences more closely.

Done. Also added a note at the bottom to be more explicit.

One question on "No support for XML": I think we agree that Document.parseHTML should not create XML documents (unlike DOMParser). Is this also meant to apply to setHTML when called on an element in an existing XML document?

(I don't mind either way; but would like to be clear.)

I meant it for both.

annevk · 2023-05-16T15:28:08Z

explainer.md

+- Defaults: If no filter is supplied, do the safe methods have any filtering
+  other than the baseline?


Can we reference #188 here?

annevk · 2023-05-16T15:28:44Z

explainer.md

+- Defaults: All of these are new methods without legacy usage. Would DSD
+  parsing default to `true`? (Probably. Decision lies with WHATWG.)


I think we can consider this decided.

Removed. (I added this as one example to the note about differences between the new APIs and their existing counterparts.)

annevk · 2023-05-16T15:31:11Z

explainer.md

+     - Simpler.
+  - Reasons pro object:
+     - Allows to pre-process the config and to amortize the cost over many calls.
+     - Allows adding other useful config operations, like introspection.


This seems correct. An object here would require one of:

- Compelling performance numbers.
- A compelling operation that would only work with a pre-processed dictionary.

Done. Adopted this wording.

mozfreddyb · 2023-05-17T07:46:23Z

explainer.md

+- `Element.setHTML(string, {options})` - Parses `string` using `this` as
+  context element, like assigning to `innerHTML` would; applies a filter,
+  while enforcing an XSS-focused baseline; and finally replaces the children
+  of `this` with the results.


While technically correct, I think it helps to draw a comparison. Even if it's not exactly 1:1, I am expecting many developers to switch from innerHTML= to setHTML().

annevk

\o/

Thanks!

First draft: explainer-new.

25d9c3a

otherdaniel requested a review from mozfreddyb May 4, 2023 16:43

annevk reviewed May 5, 2023

annevk mentioned this pull request May 5, 2023

Sanitizer API WebKit/standards-positions#86

Closed

mozfreddyb reviewed May 12, 2023

Review feedback.

12adee8

annevk reviewed May 16, 2023

annevk mentioned this pull request May 16, 2023

Declarative Shadow DOM whatwg/dom#831

Closed

mozfreddyb approved these changes May 17, 2023

Review feedback, round #2.

1726415

mozfreddyb mentioned this pull request May 24, 2023

Rationale for requring a secure context #122

Closed

mozfreddyb approved these changes May 24, 2023

annevk approved these changes May 24, 2023

otherdaniel merged commit 6dd0623 into WICG:main May 31, 2023

otherdaniel deleted the explainer-new branch April 22, 2024 12:23
