-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First draft: explainer-new. #193
Conversation
This tries to summarize the result from the recent sync meeting. I hope it's a starting point to replace the current explainer for this project, and to then base future work on it. Please let me know if this doesn't reflect the meeting results accurately. I've added several open questions (i.e., what about defaults?) that were not explicitly discussed at the meeting. I'd like to add examples, too, to make this more useful for people not involved in the meetings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing this up @otherdaniel! Left my thoughts and nits inline.
explainer-new.md
Outdated
## Open questions: | ||
|
||
- Defaults: If no filter is supplied, do the safe methods have any filtering | ||
other than the baseline? Do the unsafe methods have default filtering? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It seems good to conclude Clarify threat model, "XSS first and foremost" #188 so "baseline" is clear.
- I think that by default "unsafe" shouldn't do any filtering. And if you want you get to essentially supply your own block and safelists and the browser won't perform any normalization on them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This is more than "what the baseline is", but also whether the default should be something "more usable" than the baseline (e.g., also issue
baseline
should bedefault
anddefault
should bebaseline
#183). I think it should suffice to enforce only the baseline. (Unless we actually do the full research on the defaults of other sanitizers and e.g., popular XSS gadget attributes to also strip from the "default".)
explainer-new.md
Outdated
- Defaults: All of these are new methods, without legacy usage. Would DSD | ||
parsing default to `true`? Do we even decide that, or would we instead ask | ||
the HTML WG and adopt whatever their choice? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ultimately that decision is largely up to the WHATWG. And we decided there previously that new parsing APIs won't need any kind of opt-in for declarative shadow roots. So we shouldn't add any here. I don't think we need opt-out either (that's handled through custom elements not being safelisted).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as a declarative shadow root always requires a custom element as a parent and as long as we require custom elements to be named in the allowElements
list, I am very OK with this not being a toggle at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworded this. Agree on WHATWG getting the last word.
I'm not aware of any requirements that shadow roots - declarative or not - must be hosted by custom elements. I thought a boring old <div>
could also host a shadow DOM. Am I overlooking something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to find evidence for restrictions on shadow root hosts. What I find is:
- https://dom.spec.whatwg.org/#concept-element-shadow-root and
- https://dom.spec.whatwg.org/#dom-element-attachshadow
The second one lists quite a few restrictions, but it does allow a number of specific HTML elements, e.g. div
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I raised this in whatwg/dom#831 (comment).
Having said that, I'd appreciate it if @mozfreddyb could elaborate on this concern, as I'm not yet seeing the problem with shadow roots potentially being present in the output.
If there is a problem though I don't think we should address it at the parser level, but instead the sanitizer should block elements with an associated shadow root.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, glad we got some clarity here. I was totally not aware that this may live outside of custom elements. That being said, I think this is all fine as long as the sanitizer traverses the shadow root.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this on the call and agree that this is fine as long as the sanitizer is traversing
explainer-new.md
Outdated
- Should the filter config be a separate object, or should it be a plain dictionary? | ||
- Reasons pro dictionary: | ||
- Simpler. | ||
- Reasons pro object: | ||
- Allows to pre-process of the config and to amortize the cost over many calls. | ||
- Allows adding other useful config operations, like introspection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If actual normalization is performed on the input and that is exposed (rather than the input being reflected back) I could see folks supporting a class. The current Sanitizer
class is not it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main hope with this was that a framework would supply a Sanitizer instance for the web developer to use in framework-specific code (e.g., to prevent script gadgets attacks). That being said, a dictionary is likely as good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also discussed this on the call. We agree that there's little value in having an object that just contains a dictionary. Nobody is against an object, as long as there is a useful feature behind it (e.g, normalization, performance).
explainer-new.md
Outdated
- Baseline for safe methods: There are some "special" behaviours (currently in | ||
"handle funky elements"), mainly around dropping javascript:-URLs in contexts | ||
that navigate. Should this be available as an option (which would then be | ||
force-true for safe usage, and default-false but available for non-safe | ||
usage), or would that be custom behaviour only for the safe methods? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be okay leaving it out until we see demand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. (Removed this from open questions, and added a sentence after the API proposal.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. Thanks so much, Daniel!
(And my deepest apology for the the review delay here!)
I think this is almost ready to merge once we have resolved the minor sidebars
explainer-new.md
Outdated
- Defaults: All of these are new methods, without legacy usage. Would DSD | ||
parsing default to `true`? Do we even decide that, or would we instead ask | ||
the HTML WG and adopt whatever their choice? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as a declarative shadow root always requires a custom element as a parent and as long as we require custom elements to be named in the allowElements
list, I am very OK with this not being a toggle at all.
explainer-new.md
Outdated
## Open questions: | ||
|
||
- Defaults: If no filter is supplied, do the safe methods have any filtering | ||
other than the baseline? Do the unsafe methods have default filtering? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This is more than "what the baseline is", but also whether the default should be something "more usable" than the baseline (e.g., also issue
baseline
should bedefault
anddefault
should bebaseline
#183). I think it should suffice to enforce only the baseline. (Unless we actually do the full research on the defaults of other sanitizers and e.g., popular XSS gadget attributes to also strip from the "default".)
explainer-new.md
Outdated
- Should the filter config be a separate object, or should it be a plain dictionary? | ||
- Reasons pro dictionary: | ||
- Simpler. | ||
- Reasons pro object: | ||
- Allows to pre-process of the config and to amortize the cost over many calls. | ||
- Allows adding other useful config operations, like introspection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main hope with this was that a framework would supply a Sanitizer instance for the web developer to use in framework-specific code (e.g., to prevent script gadgets attacks). That being said, a dictionary is likely as good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks great!
explainer.md
Outdated
- `Element.setHTML(string, {options})` - Parses `string` using `this` as | ||
context element, like assigning to `innerHTML` would; applies a filter, | ||
while enforcing an XSS-focused baseline; and finally replaces the children | ||
of `this` with the results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we want to directly compare parsing to innerHTML
as there will at least be two major differences:
- Support for declarative shadow roots.
- No support for XML.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While technically correct, I think it helps to draw a comparison. Even if it's not exactly 1:1, I am expecting many developers to switch from innerHTML=
to setHTML()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is summarizing the design for implementers, no? I don't mind it saying innerHTML
, but it should be more accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agree on a softer wording. "almost like", "similar to innerHTML". etc. :). Obviously, there should be a section that explains the differences more closely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Also added a note at the bottom to be more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question on, "No support for XML": I think we agree that Document.parseHTML should not create XML documents (unlike DOMParser). Is this also meant to apply to setHTML when called on an element in an existing XML document?
(I don't mind either way; but would like to be clear.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant it for both.
explainer.md
Outdated
- Defaults: If no filter is supplied, do the safe methods have any filtering | ||
other than the baseline? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reference #188 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
explainer.md
Outdated
- Defaults: All of these are new methods without legacy usage. Would DSD | ||
parsing default to `true`? (Probably. Decision lies with WHATWG.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can consider this decided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. (I added this as one example to the note about differences between the new APIs and their existing counterparts.)
explainer.md
Outdated
- Simpler. | ||
- Reasons pro object: | ||
- Allows to pre-process the config and to amortize the cost over many calls. | ||
- Allows adding other useful config operations, like introspection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems correct. An object here would require one of:
- Compelling performance numbers.
- A compelling operation that would only work with a pre-processed dictionary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Adopted this wording.
explainer.md
Outdated
- `Element.setHTML(string, {options})` - Parses `string` using `this` as | ||
context element, like assigning to `innerHTML` would; applies a filter, | ||
while enforcing an XSS-focused baseline; and finally replaces the children | ||
of `this` with the results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While technically correct, I think it helps to draw a comparison. Even if it's not exactly 1:1, I am expecting many developers to switch from innerHTML=
to setHTML()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\o/
Thanks!
Explainer for new Sanitizer API design, based on recent sync meeting (#192).
Notes: