-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set dfn-type on definitions #5694
Conversation
If this is needed, can be done automatically, and can't be dealt with on the Shepherd side, perhaps it would make more sense to add this to https://github.com/whatwg/wattsi? That would also address the issue of it depending on how lines are delimited as you can operate on objects. This would also force us to keep using data-x in a consistent fashion. (If we do that, we should only output it for the main standard, not for RD, dev, snapshots, etc.) |
For clarity sake, @tidoust and I are working on providing a replacement to Shepherd for definition extractions via reffy, and it is already set up to deal with the HTML spec as an exception. But given the numerous exceptions and given that they're likely to need a lot of updates overtime, it would seem a lot more robust to make the HTML spec align with other specs.
I believe I could address that issue separately from using wattsi - this pull request was mostly focused on showing a preliminary approach to this. (I've already made a small improvement to the underlying script which creates an additional ~700 lines changes)
From whatwg/wattsi#97 I infer that wattsi doesn't have an IDL parser, which means integrating it there would be a challenge (I also have no experience coding in Pascal, but this could be an opportunity to learn :) |
Ah yeah. Hmm. @domenic will be back next week and likely has some thoughts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I see it, we have a few options:
-
Continue to rely on external tools figuring out our definitions.
-
Something like this PR, using
data-dfn-type
/data-dfn-for
, e.g.<dfn data-dfn-type="attribute" data-dfn-for="HTMLLinkElement"><code data-x="dom-link-disabled">disabled</code></dfn>
-
Update Wattsi to allow use of shorter syntax like Bikeshed does, e.g.
<dfn attribute for="HTMLLinkElement"><code data-x="dom-link-disabled">disabled</code></dfn>
- Perhaps we could create some sort of bridge between the Bikeshed-syntax and the
data-x=""
syntax? E.g.<dfn attribute for="HTMLLinkElement" id="dom-link-disabled"><code>disabled</code></dfn>
and then the rest of the HTML spec could use<code data-x="HTMLLinkElement-disabled">disabled</code>
. We could do that later though.
- Perhaps we could create some sort of bridge between the Bikeshed-syntax and the
I think (1) is not very friendly to the wider ecosystem. Everyone else has to mark up their specs; we should too.
I'd encourage us to try (3). I suspect it would be pretty easy; you'd find Wattsi's main tree-walking phase and just replace attribute names in a straightforward manner. And as @annevk mentions, (3) would allow us to just remove the attributes for dev edition / review drafts / commit snapshots. @sideshowbarker also has a good amount of expertise in the Wattsi codebase and might be able to chip in or give pointers.
If (3) turns out to be quite hard for whatever reason, then we could fall back to (2).
@@ -2325,7 +2325,7 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute | |||
|
|||
<hr> | |||
|
|||
<p>The attribute with the tag name <dfn><code data-x="attr-xml-space">xml:space</code></dfn> in | |||
<p>The attribute with the tag name <dfn data-dfn-type="element-attr"><code data-x="attr-xml-space">xml:space</code></dfn> in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing in the dependencies section should be marked up in this way, I think, since other specs should be referring to the original definitions, not the ones in HTML.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my algorithm already skips the <dfn>
that are links, but it'd be easy to skip all the definitions of that section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, so I guess it found cases where HTML should be doing a better job linking. https://www.w3.org/TR/xml/#sec-white-space seems reasonable for xml:space
, but for ARIA role, it seems ARIA would prefer that the HTML spec define it, per https://w3c.github.io/aria/#host_general_role. Well, this is all off-topic; perhaps I will open a separate issue...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping @sideshowbarker; this comment is still relevant. We should just remove all this markup from the dependencies section (i.e. through line 3782), with the possible exception of role and xml:space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping @sideshowbarker; this comment is still relevant. We should just remove all this markup from the dependencies section (i.e. through line 3782), with the possible exception of role and xml:space.
OK, will get that done here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing in the dependencies section should be marked up in this way, I think, since other specs should be referring to the original definitions, not the ones in HTML.
We should just remove all this markup from the dependencies section (i.e. through line 3782), with the possible exception of role and xml:space.
OK, as far as I can tell, other than the markup added to the role
and xml:space
defnitions, the patch in this PR isn’t actually adding any new markup to the any other definitions anywhere in the dependencies section.
That said, there are three definition in the dependencies section that already had additional markup (in the source on the main branch) prior to this change:
<li><dfn data-dfn-for="url" data-x="concept-url-origin" data-x-href="https://url.spec.whatwg.org/#concept-url-origin">Origin</dfn> of URLs</li>
...
<li>`<dfn data-dfn-type="http-header" data-x="http-origin" data-x-href="https://fetch.spec.whatwg.org/#http-origin"><code>Origin</code></dfn>` header</li>
...
<li><dfn data-dfn-for="request" data-x="concept-request-origin" data-x-href="https://fetch.spec.whatwg.org/#concept-request-origin">origin</dfn></li>
So for those, this patch is just changing the existing data-dfn-for
instances to for
, and the one existing data-dfn-type="http-header"
there to just http-header
.
@@ -9089,7 +9089,7 @@ partial interface <dfn id="document" data-lt="">Document</dfn> { | |||
|
|||
<div w-nodev> | |||
|
|||
<p>The <dfn><code data-x="dom-document-referrer">referrer</code></dfn> attribute must return | |||
<p>The <dfn data-dfn-type="attribute" data-dfn-for="Document"><code data-x="dom-document-referrer">referrer</code></dfn> attribute must return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be ideal to rewrap all paragraphs touched in this way, using the algorithm in https://github.com/domenic/rewrapper (or perhaps a less buggy version of that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right - once there is agreement in the way forward, that's the kind of polishing I'm happy to look into
@@ -9089,7 +9089,7 @@ partial interface <dfn id="document" data-lt="">Document</dfn> { | |||
|
|||
<div w-nodev> | |||
|
|||
<p>The <dfn><code data-x="dom-document-referrer">referrer</code></dfn> attribute must return | |||
<p>The <dfn data-dfn-type="attribute" data-dfn-for="Document"><code data-x="dom-document-referrer">referrer</code></dfn> attribute must return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be exported (data-export=""
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
anything that isn't dfn-type=dfn is assumed exported per the bikeshed documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sideshowbarker, maybe we should make Wattsi complain about redundant export
on things which have one of the definition types. And then remove such cases from this PR (e.g. I see one such instance for nonce
, which is element-attr
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should make Wattsi complain about redundant
export
on things which have one of the definition types.
OK, that sounds doable
And then remove such cases from this PR (e.g. I see one such instance for
nonce
, which iselement-attr
).
OK, will make that change. (After I make the Wattsi change and test it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should make Wattsi complain about redundant
export
on things which have one of the definition types.
OK, added that to Wattsi as a fatal error — which caused it to catch about 130 cases of that redundancy
And then remove such cases from this PR (e.g. I see one such instance for
nonce
, which iselement-attr
).
Removed all 130 of them
Yeah it sounds pretty straightforward and I’d be willing to help |
Yeah, but as mentioned in the whatwg/wattsi#97 issue description, we could do it “similar to how we did syntax highlighting, by shelling out to an external program”. In fact, as part of doing the syntax highlighting, the IDL is all getting run through https://github.com/plinss/widlparser already anyway — so it’s imaginable that to get the things we’d want to achieve with access to the parsed IDL, we could hack the syntax-highlighter code (though admittedly that’d be an unusual way of going about it). |
I don't think IDL parsing is necessary at all for the feature as described here. We just want to transform attributes from If IDL parsing is necessary for some reason, then we should reconsider. My enthusiasm for modifying Wattsi was based on it being easy. |
yeah, no IDL parsing necessary for the attribute shorthands. |
3fd96b8
to
079c592
Compare
This change adds support for doing the following with attributes of dfn elements: * recognize attributes with names that are known dfn types (currently, 'element', 'element-attr', 'event', 'interface', 'extended-attribute', 'method', 'attribute', 'enum-value', and 'http-header'), and replace them with data-dfn-type=element, etc., attributes in output * for any dfn element that has an attribute named "for", replace it in output with an attribute named "data-dfn-for" * for any dfn element with an attribute named "export" or "noexport", replace it with one named "data-export" or "data-noexport" Relates to whatwg/html#5694
079c592
to
e8d14c5
Compare
OK, I rebased this and then on top of the existing change in this branch, I pushed the commit at e8d14c5 to convert everything to the Bikeshed short syntax — including the following changes:
Because there were merge conflicts due to intervening changes on the main branch since the time when this pull request was first raised, some manual editing was needed to resolve the merge conflicts. I think I got it right, but still that makes it even more important to review the resulting diff carefully. |
whatwg/wattsi#134 has the corresponding Wattsi change for handling the short syntax instances introduced in this PR. |
source
Outdated
@@ -7920,7 +7920,7 @@ interface <dfn>DOMStringList</dfn> { | |||
However, to better specify the behavior of certain more complex situations, the model was updated | |||
to make the serialization and deserialization explicit.</p> | |||
|
|||
<h4 data-export="" data-lt="transferable object"><dfn>Transferable objects</dfn></h4> | |||
<h4 export data-lt="transferable object"><dfn>Transferable objects</dfn></h4> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we translate data-lt into lt=""? Not a blocker for this PR since it's related, but not dependent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, lt
is just a Bikeshed-Flavored Markdown nicety. It gets translated into data-lt
in the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I wrote that backward. I meant to say that maybe Wattsi should also grow that nicety.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, then yes, I recommend doing so. ^_^ Probably useful to grow all the same niceties for omitting the data-
prefix generally, and when used directly on a dfn
or heading, the dfn-
prefix as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we translate data-lt into lt=""? Not a blocker for this PR since it's related, but not dependent.
Yeah, I think we should — and we might as well add the markup change to this PR. And the additional code needed to support in Wattsi is trivial, on top of the existing Wattsi patch. So I’ll get it added there by tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About data-lt
and lt
in the HTML markup source, I’m not clear now on what’s being proposed.
Is the proposal that as part of this PR, we replace all data-lt
with lt
in the HTML markup source?
If so, then that’s going to run into an existing check we have in Wattsi:
if (Element.HasAttribute(kLTAttribute)) then
Fail('<dfn> with lt="" found, use data-x="" instead; dfn is ' + Describe(Element));
I don’t know why we’re doing that check — but if we were to allow lt
for dfn
in the HTML markup source (and translate it to data-lt
in the output) then we’d need to also remove that check. And that would seem to defeat whatever purpose we had for putting that check there to begin with…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following the blame, the commit message makes it clear:
This helps catch mistakes now and makes it obvious you need to update wattsi when we switch to
<a lt>
for cross-references.
So (5 years ago) the plan was apparently to more closely match Bikeshed's syntax at some point; this check was to make sure mistakes didn't creep in beforehand and ensured that you'd know you had to fix Wattsi when you finally did make the switchover.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. So we have two choices:
-
Do not convert
lt=""
todata-lt=""
. This will potentially make people more aware thatdata-lt=""
is just a Bikeshed bridge, and does not work for intra-HTML-spec linkage. -
Convert
lt=""
todata-lt=""
. Since Wattsi still usesdata-x
, people might get confused and wonder why<dfn lt="foo">bar</dfn>
+<a>foo</a>
doesn't work (like it does in Bikeshed).
I think the risk of confusion is minimal, so I am mildly pro-conversion. But I can see both sides of the discussion. And also this is all quite separable from the current PR, so I am fine leaving this as-is.
Previously we mentioned rewrapping the lines that were modified. I'd only want to do that if it's easily automated, but it's worth keeping in mind. Maybe I'll try to whip something up. |
I’m actually willing to do it manually. Stepping through each change in the vimdiff view in vim makes it relatively painless. (It’ll only take me a few seconds to write a one-off macro for it, and bind a keystroke to it — and with that in hand, it’ll basically just amount to pressing one key some several-hundred times to jump to each change, and select it and re-wrap it. So in all, I reckon it’ll take a handful of minutes in total, at the very most.) |
Based on a specialized definitions parser for the HTML spec built to improve on the existing scraping done in shepherd. This particular patch doesn't cover all the identified dfn-type that could be set, only the ones that were easy to treat (i.e. where the data-x and the dfn occur on the same line). The setting of dfn-type and dfn-for is based on the WebIDL defined in the spec
This change: * replaces any data-dfn-type=element, etc., attributes with ones literally named "element", etc. * replaces any data-dfn-for attribute with one literally named "for" * replaces data-export="" and data-noexport="" attribues with value-less attributes literally named "export" and "noexport" * drops "export" from any dfn that has a dfn type attribute
e8d14c5
to
7501ceb
Compare
OK, did it that way and pushed a change with all the changed lines re-wrapped (and can re-do it again if necessary) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM. Thanks so much for all the rebasing and rewrapping and tweaks. Let's try to get this merged soon before it gets too far behind master :). That'll require the Wattsi change merging first.
Since every line has been re-written by @sideshowbarker I'm comfortable with the IPR issues here. The easiest way to satisfy the bots would be to open a new PR with @sideshowbarker as the author, probably?
This change adds support for doing the following with attributes of dfn elements: * recognize attributes with names that are known dfn types (currently, 'element', 'element-attr', 'event', 'interface', 'extended-attribute', 'method', 'attribute', 'enum-value', and 'http-header'), and replace them with data-dfn-type=element, etc., attributes in output * for any dfn element that has an attribute named "for", replace it in output with an attribute named "data-dfn-for" * for any dfn element with an attribute named "export" or "noexport", replace it with one named "data-export" or "data-noexport" Relates to whatwg/html#5694
Superseded by #5916 |
OK, I’ve gone ahead and merged the Wattsi change
OK, see #5916 |
Follow up to whatwg#5694 whatwg#5916 setting dfn types on more items identified by reffy
Follow up to whatwg#5694 whatwg#5916 whatwg#5956 setting dfn types on items defined in IDL fragments
Follow up to whatwg#5694 whatwg#5916 setting dfn types on more items identified by reffy
* Editorial: remove redundant "the" * Meta: default branch rename Also correct a broken link. Not even w3.org URLs are that cool. Helps with whatwg/meta#174. * Editorial: clean up calls to "parse a URL" It actually takes a string, so calls should be clear about that. * Review Draft Publication: January 2021 * Simplify <link>s In particular, remove their activation behavior, stop them from matching :link and :visited, and stop suggesting that they be focusable areas. This also includes a slight expansion and rearrangement of the link element's section to make it clearer what hyperlinks created by <link> are meant for, contrasting them to <a> and <area> hyperlinks. Closes whatwg#4831. Closes whatwg#2617. Helps with whatwg#5490. * Meta: remove demos/offline/* (whatwg#6307) These are no longer needed as of e4330d5. * Meta: minor references cleanup Use more HTTPS and drop obsolete HTML Differences reference. * Editorial: anticlockwise → counterclockwise We use en-US these days. Spotted in https://twitter.com/iso2022jp/status/1352601086519955456. * Use :focus-visible in the UA stylesheet See w3c/csswg-drafts#4278. * Editorial: align with WebIDL and Infra * Fix "update a style block" early return The new version matches implementation reality and CSSWG resolution. The algorithm was also inconsistent, as it looked at whether the element was in a shadow tree or in the document tree, but it was only specified to be re-run if the element becomes connected or disconnected. The CSSWG discussed this in w3c/csswg-drafts#3096 (comment) and http://wpt.live/shadow-dom/ShadowRoot-interface.html tests this. This also matches closer the definition of <link rel="stylesheet">, which does use connectedness (though it uses "browsing-context connected", which is a bit different): https://html.spec.whatwg.org/#link-type-stylesheet * Modernize and refactor simple dialogs This contains a small bug fix, in that confirm() and prompt() said "return" in some cases instead of "return false" or "return null" as appropriate. Other notable changes, all editorial, are: * Factoring out repeated "cannot show simple dialogs" steps, which will likely expand over time (see e.g. whatwg#6297). * Separating out and explaining the no-argument overload of alert(). * Passing the document through to the "printing steps", instead of just having them talk about "this Window object". * Meta: add definition markup for MessageEvent * Remove <marquee> events They are only supported by one engine (Gecko). Closes whatwg#2957. * Clarify when microtasks happen * Ignore COEP on non-secure contexts Fixes whatwg#6328. * Editorial: update URL Standard integration * Editorial: only invoke response's location URL once Complements whatwg/fetch#1149. * Track the incumbent settings and active script in Promise callbacks Closes whatwg#5213. * createImageBitmap(): stop clipping sourceRect to source's dimensions It has been found in whatwg#6306 that this was an oversight at the time of its introduction. Current behavior goes against author expectations and no implementer has opposed the change to "no-clip". Tests: web-platform-tests/wpt#27040. Closes whatwg#6306. * Remove CSP plugin-types blocking With Flash not being supported anymore, the CSP directive plugin-types has lost its main reason for being and is being removed from the Content Security Policy specification: w3c/webappsec-csp#456. This change removes references to the relevant algorithm from the Content Security Policy spec. * Meta: set more dfn types A follow-up to: * whatwg#5694 * whatwg#5916 * Editorial: occuring → occurring * Make all plugin-related APIs no-ops Part of whatwg#6003. * Disallow simple dialogs from different-origin domain iframes Closes whatwg#5407. * Revive @@iterator for PluginArray/MimeTypeArray/Plugin @@iterator is implicitly installed by defining an indexed property getter. Since there is no other way to define it exclusively, this restores some methods back to being indexed getters. This fixes an inadvertent observable behavior change in d4f07b8. * Adjust web+ scheme security considerations to account for FTP removal Also, network scheme is now reduced to HTTP(S) scheme. Helps with whatwg#5375, but form submission issue remains. See whatwg/fetch#1166 for context. * Meta: export pause Nobody but XMLHttpRequest take a dependency on this please. You have been warned. Context: whatwg/xhr#311. * Fix typo: ancestor → accessor Fixes whatwg#6374. Co-authored-by: Dominic Farolino <domfarolino@gmail.com> Co-authored-by: Anne van Kesteren <annevk@annevk.nl> Co-authored-by: Domenic Denicola <d@domenic.me> Co-authored-by: Emilio Cobos Álvarez <emilio@crisal.io> Co-authored-by: Momdo Nakamura <xmomdo@gmail.com> Co-authored-by: Jake Archibald <jaffathecake@gmail.com> Co-authored-by: Yutaka Hirano <yhirano@chromium.org> Co-authored-by: Shu-yu Guo <syg@chromium.org> Co-authored-by: Kaiido <tristan.fraipont@gmail.com> Co-authored-by: Antonio Sartori <anton.sartori@gmail.com> Co-authored-by: Michael[tm] Smith <mike@w3.org> Co-authored-by: Ikko Ashimine <eltociear@gmail.com> Co-authored-by: Carlos IL <carlosjoan91@gmail.com> Co-authored-by: Kagami Sascha Rosylight <saschanaz@outlook.com> Co-authored-by: Simon Pieters <zcorpan@gmail.com>
Follow up to whatwg#5694 whatwg#5916 whatwg#5956 setting dfn types on items defined in IDL fragments
Based on a specialized definitions parser for the HTML spec built to improve on the existing scraping done in shepherd (which produces a lot of inaccurate data since the naming conventions on their own aren't enough to map definitions to their type or their scope). My specialized parser sets dfn-type and dfn-for based on the WebIDL defined in the spec.
This particular patch doesn't cover all the identified dfn-type that my parser identifies, only the ones that were easy to treat in the source file (i.e. where the data-x and the dfn occur on the same line).
I realize this 700+ lines diff may be painful to review, assuming there is even interest in accepting this kind of change. I'm submitting this as a basis for discussion, but since this is all automatically generated, I'm happy to resubmit it as smaller batch of changes if this is preferred, or even abandon it completely if there is no interest in this kind of change (for now or for good).
/browsers.html ( diff )
/browsing-the-web.html ( diff )
/canvas.html ( diff )
/common-microsyntaxes.html ( diff )
/custom-elements.html ( diff )
/dnd.html ( diff )
/dom.html ( diff )
/dynamic-markup-insertion.html ( diff )
/edits.html ( diff )
/embedded-content-other.html ( diff )
/embedded-content.html ( diff )
/form-control-infrastructure.html ( diff )
/form-elements.html ( diff )
/forms.html ( diff )
/grouping-content.html ( diff )
/history.html ( diff )
/iana.html ( diff )
/iframe-embed-object.html ( diff )
/image-maps.html ( diff )
/index.html ( diff )
/indices.html ( diff )
/infrastructure.html ( diff )
/input.html ( diff )
/interaction.html ( diff )
/interactive-elements.html ( diff )
/links.html ( diff )
/media.html ( diff )
/microdata.html ( diff )
/obsolete.html ( diff )
/offline.html ( diff )
/origin.html ( diff )
/parsing.html ( diff )
/rendering.html ( diff )
/scripting.html ( diff )
/semantics-other.html ( diff )
/semantics.html ( diff )
/server-sent-events.html ( diff )
/structured-data.html ( diff )
/system-state.html ( diff )
/tables.html ( diff )
/text-level-semantics.html ( diff )
/timers-and-user-prompts.html ( diff )
/urls-and-fetching.html ( diff )
/webappapis.html ( diff )
/window-object.html ( diff )
/workers.html ( diff )
/xhtml.html ( diff )