Consider specifying document.evaluate and document.createNSResolver #67

Open
domenic opened this issue Sep 3, 2015 · 12 comments

@domenic
Member

domenic commented Sep 3, 2015

These are currently left to DOM 3 XPath. However, that spec is (a) very old, and thus wrong in a lot of ways; (b) not very large. It could maybe be subsumed into the DOM Standard, which would give implementations an actual non-crazy reference.

XPathEvaluator.prototype.evaluate has ~0.7% usage and isn't going anywhere. XPathEvaluator.prototype.createNSResolver has ~0.04% and so is also likely here to stay. However XPathEvaluator.prototype.createExpression is at 0.001% and could probably be left out. Which is great because that means we can very likely kill XPathExpression.

Other features of the spec that don't seem to be implemented are XPathException and XPathNamespace.

According to a comment in Blink's source code, XPathEvaluator has a constructor in reality, even if not in the spec.
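
For readers unfamiliar with these entry points, here is a minimal sketch of calling evaluate with a namespace resolver. It assumes a browser-like environment in which XPathEvaluator is constructible and in which a plain object with a lookupNamespaceURI method is accepted as a resolver (a hand-written alternative to createNSResolver). The helper names makeResolver and countMatches and the svg prefix mapping are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: a plain-object namespace resolver.
// Prefixes that are not in the map resolve to null.
function makeResolver(prefixes) {
  return {
    lookupNamespaceURI(prefix) {
      return Object.prototype.hasOwnProperty.call(prefixes, prefix)
        ? prefixes[prefix]
        : null;
    },
  };
}

// Hypothetical helper: wrap `query` in count() and return the numeric result.
// `evaluator` is anything exposing evaluate(), e.g. `document` or
// `new XPathEvaluator()`.
function countMatches(evaluator, query, contextNode, resolver) {
  const NUMBER_TYPE = 1; // same value as XPathResult.NUMBER_TYPE
  const result = evaluator.evaluate(
    `count(${query})`, contextNode, resolver, NUMBER_TYPE, null
  );
  return result.numberValue;
}

// Browser-only usage; skipped where there is no DOM.
if (typeof document !== 'undefined' && typeof XPathEvaluator !== 'undefined') {
  const resolver = makeResolver({svg: 'http://www.w3.org/2000/svg'});
  console.log(countMatches(new XPathEvaluator(), '//svg:path', document, resolver));
}
```

Passing the evaluator in as an argument keeps the sketch neutral about whether document.evaluate or a constructed XPathEvaluator is used.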

Credit to @sideshowbarker for bringing this up.

@annevk
Member

annevk commented Sep 3, 2015

@zcorpan
Member

zcorpan commented Feb 24, 2016

@gsnedders
Member

> XPathEvaluator.prototype.evaluate has ~0.7% usage and isn't going anywhere. XPathEvaluator.prototype.createNSResolver has ~0.04% and so is also likely here to stay. However XPathEvaluator.prototype.createExpression is at 0.001% and could probably be left out. Which is great because that means we can very likely kill XPathExpression.

More modern numbers seem to be ~2.3%, ~0.3%, and ~0.1% respectively, all vastly higher than four years ago. (Interestingly, createExpression seems to have jumped very suddenly from ~0.01% to ~0.1% over the past few months; is some major site now using it?)

That said, here is an attempt at a to-do list:

  • Try and get consensus on what the WebIDL should look like (WebKit has contextNode optional, defaulting to document; Blink changed this to match Gecko a while back because the contextNode default was totally non-obvious and undocumented)
  • Define the DOM -> XPath data model (note this includes several intentional violations of the XPath data model, as the wiki page notes)
  • Define what each WebIDL operation/attribute does
  • Integrate the HTML DOM special case into the DOM spec from HTML
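
As a small illustration of the first item, callers can sidestep the cross-engine difference by always passing the context node explicitly. This is a sketch only: evaluateSnapshot is a hypothetical helper, not part of any spec or engine, and its default of the document itself is a choice made in user code rather than a claim about what the Web IDL should say.

```javascript
// Hypothetical helper: always hand evaluate() an explicit contextNode
// instead of relying on an engine-specific default.
function evaluateSnapshot(doc, query, contextNode = doc) {
  const ORDERED_NODE_SNAPSHOT_TYPE = 7; // same value as XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  const result = doc.evaluate(
    query, contextNode, null, ORDERED_NODE_SNAPSHOT_TYPE, null
  );
  // Copy the snapshot into a plain array for easier iteration.
  const nodes = [];
  for (let i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}

// Browser-only usage; skipped where there is no DOM.
if (typeof document !== 'undefined') {
  console.log(evaluateSnapshot(document, '//a').length);
}
```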

@foolip
Member

foolip commented May 29, 2019

I made #763 today. Didn't know about this issue, but I'll link it.

@foolip
Member

foolip commented Aug 28, 2019

@gsnedders in #763 (comment):

  • XPathException is gone (just use DOMException)
  • The query is matched against the DOM, and therefore contrary to the XPath 1.0 data model the root element has a parent (the Document) and text nodes can be adjacent to one another. [wilful violation]

annevk pushed a commit that referenced this issue Aug 30, 2019
See also https://wiki.whatwg.org/wiki/DOM_XPath. Prose still needs to be added, tracked by #67. This is a one-time exception from the normal WHATWG Working Mode as there's a lot of benefit in hosting these interfaces in a standard as they're implemented in all engines, despite them not being fully defined and tested.

Co-Authored-By: Sam Sneddon <me@gsnedders.com>
@foolip
Member

foolip commented Aug 30, 2019

I've updated https://wiki.whatwg.org/wiki/DOM_XPath to point at https://dom.spec.whatwg.org/#xpath for the Web IDL definitions.

@WebReflection

WebReflection commented Oct 13, 2020

FWIW, I've just refactored some code from this:

function notifyIfMatchesXPath(query, notify) {
  const flag = XPathResult.ORDERED_NODE_SNAPSHOT_TYPE;
  const callback = () => {
    const result = document.evaluate(query, document, null, flag, null);
    for (let i = 0, {snapshotLength} = result; i < snapshotLength; i++)
      notify(result.snapshotItem(i));
  };
  new MutationObserver(callback).observe(
    document,
    {characterData: true, childList: true, subtree: true}
  );
  callback();
}

to this:

function notifyIfMatchesXPath(query, notify) {
  const evaluator = new XPathEvaluator();
  const expression = evaluator.createExpression(query, null);
  const flag = XPathResult.ORDERED_NODE_SNAPSHOT_TYPE;
  const callback = () => {
    const result = expression.evaluate(document, flag, null);
    for (let i = 0, {snapshotLength} = result; i < snapshotLength; i++)
      notify(result.snapshotItem(i));
  };
  new MutationObserver(callback).observe(
    document,
    {characterData: true, childList: true, subtree: true}
  );
  callback();
}

I did this on the assumption that the cost of parsing and validating the XPath query would be removed from the mutation callback. I only recently discovered XPathEvaluator and createExpression, and I'm sure that if other developers knew about them, their usage would be closer to that of document.evaluate, which is now at 2.x%.

If the direction is to nuke XPathEvaluator, though, I would rather know before landing such a refactoring, thanks.

@SamB

SamB commented Feb 27, 2022

Hmm, createExpression looks to have hit 0.3% recently: V8XPathEvaluator_CreateExpression_Method.

DocumentXPathCreateExpression is also up to about 0.025%.

I'm not quite sure what the deal is with these two feature values; it's best I post this before I look too far down that rabbit hole or I might forget entirely ...

@rhdunn

rhdunn commented Jul 17, 2023

An attempt at a DOM to XPath XDM has been made at https://qt4cg.org/specifications/xpath-functions-40/Overview.html#html. Feedback is welcome.

Note that this needs to add a conversion mode for not processing namespaces so that when namespaces are enabled it will correctly map XHTML documents and when disabled it will correctly map HTML documents. -- The exact mechanism for this (likely an additional processing option) has not been agreed yet.

It attempts to deal with the various willful violations of the XPath/XML data model in a way that it can handle the different flavours of (X)HTML in the corresponding fn:parse-html function. This includes things like the non-conforming handling of template elements.

@gsnedders
Member

gsnedders commented Oct 24, 2023

> An attempt at a DOM to XPath XDM has been made at https://qt4cg.org/specifications/xpath-functions-40/Overview.html#html. Feedback is welcome.

Note existing behaviour in browsers cannot be described by the DOM -> XDM conversion alone.

For example, matching //FooBar will match a (http://www.w3.org/1999/xhtml, foobar) element, but it won't match a (http://www.w3.org/1999/xhtml, FooBar) element.
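
The example above can be modelled in a few lines. This is only an illustration of the behaviour described in this comment for an unprefixed name test against an element in the HTML namespace of an HTML document; it is not a spec algorithm, the function name htmlNameTestMatches is hypothetical, and the use of ASCII-only lowercasing is an assumption.

```javascript
// Illustrative model only: how an unprefixed element name test appears to be
// compared against the local name of an HTML-namespace element.
// Assumption: only ASCII letters A-Z are lowercased.
function htmlNameTestMatches(nameTest, elementLocalName) {
  const lowered = nameTest.replace(/[A-Z]/g, (c) => c.toLowerCase());
  // The element's own local name is compared as-is, so an element whose
  // local name contains uppercase letters can never be matched this way.
  return elementLocalName === lowered;
}

console.log(htmlNameTestMatches('FooBar', 'foobar'));
console.log(htmlNameTestMatches('FooBar', 'FooBar'));
```

Because only the name test is lowercased, a conversion of the DOM into the XPath data model cannot by itself reproduce this: the asymmetry lives on the matching side.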

@rhdunn

rhdunn commented Oct 24, 2023

See qt4cg/qtspecs#296 for making the XPath matching side of this work, given an HTML document mapped to the XDM.

@gsnedders
Member

It might be worth moving at least the XPath section of https://html.spec.whatwg.org/multipage/infrastructure.html#interactions-with-xpath-and-xslt into DOM, because it's really about matching XPath against HTML Documents, which we nowadays define in DOM.
