-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xpath: support for indexed-repeat
function
#195
Conversation
…e usage As was done in the first pass on this functionality in #150
… and include notes on the error condition test which still fails.
… and eliminate glob imports for the bulk migrated ORXE tests (these were convenient during rapid dev on the xpath functionality, but now they just make ongoing maintenance a little extra annoying)
This is copied verbatim from 56df0ea#diff-ca14e9532b1b772a6f0453cf9ff885d0fdd7c088df02bd17bfc911f03f4e9cd4 where they were originally introduced. This demonstrates the previous, wrong, understanding of how arguments are expected to be contextualized. The tests will be updated in subsequent commit(s) to show the difference in behavior from #150.
🦋 Changeset detectedLatest commit: bbbd2a2 The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
I want to call out the series of commits around unit-ish tests. These were carried over from #150 and adapted to match the corrected behavior. I don't think the tests themselves have as much value in this PR as they did in #150, but I included them anyway because:
|
I'm not sure who to tag to review this one! I think @lognaturel has a good idea what to expect from a behavior perspective. I expect @sadiqkhoja would benefit from getting a bit more insight into |
* be** the error thrown by `@getodk/xpath` for missing function support, | ||
* so we start with an expected failure throughout the suite. | ||
* | ||
* - Bind for `calc` is updated to reference `/data/calc`, as is clearly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weird, I just get in a relative mindset, I guess. 😬 Will fix in JR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some opportunities to clean up some comments made stale by getodk/javarosa@9748bea and a couple questions. Generally this looks ready to go to me. 🎉
/** | ||
* **PORTING NOTES** | ||
* | ||
* - Updated body reference to `/data/repeat2/inside2`, as seems the likely |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to fix.
for (const pair of pairs) { | ||
const { position } = pair; | ||
const repeats = pair.repeats.filter((repeat) => { | ||
return repeatContextNode?.contains(repeat) !== false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means invalid parameter pairs will be silently ignored, right? It feels like it would be helpful for the issue to be surfaced in case it's a typo but that could come later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but I can see why it would seem that way! The reasoning for this was also less clear to me reading it today than when I wrote it.
It's a special case for the first iteration of the loop. I hope 7742cb9 is clarifying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Helpful for the first iteration, yes! What happens if I do indexed-repeat(/some/repeat, /some/other_repeat, 3)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm about to wrap up for the day so I can test tomorrow to say for sure, but I believe it will produce an empty node-set/blank string.
Ah, and now I understand the question better!
The contains
filters aren't meant (only) as a guard against invalid structural references. They are also a result filter to roughly achieve the semantics of the the non-shorthand's LocationPath Step equivalent—i.e. /data/subtree/leaf
produces a context of the leaf
nodes contained by (well, in this case children of) /data/subtree
.
You can think of it as conceptually similar to how some custom CSS selector engines work: they'll often work right to left (well, hierarchically bottom up), matching the deepest part of the selector first and then working hierarchically up the selector to refine the result set from there. (As described here, it's usually a performance optimization technique; and it's one we might also want to think about someday)
But yes, it will also filter out invalid hierarchies as in your example, that's just not all it's doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thanks! How about tracking the unrelated reference case as part of future error work? It would be nice to give explicit feedback as some sort of error if possible.
return 0; | ||
}; | ||
|
||
export const indexedRepeat = new NodeSetFunction( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be StringFunction
, in the spec return type of indexedRepeat
is string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, thanks for flagging this, I wondered about it briefly when reviewing but didn't come back to it. I've never seen attempted usage that results in a nodeset but it certainly could be useful. I'm not sure what Collect will do in that case. I think it would be fine as long as you use it in a context where a nodeset is allowed like count(indexed-repeat(/data/repeat1/repeat2/foo, /data/repeat1, 3))
so it might be the spec that needs updating. We should try something like that in Collect (I can later if no one beats me to it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I made a case for this in #150. I am open to changing it, but I do want to make the case for it again here.
My interpretation of the spec is that the function...
- is called in a node-set context (implicit, true of all XPath expressions)
- produces a new node-set context (intermediate result) for each
repeatN
/indexN
pair [in containment depth order] - resolves to a node-set for the
arg
target (this is the result produced by the spec example's non-shortcut equivalent) - ultimately casts that resolved node-set to a string (suggested by spec language: "single node value")
Importantly: XPath semantics only has a concept of node-set value/result type, not a single node. Any single node result—whether returned by a function, or produced by a LocationPath Step—is a node-set with a context size of 1.
In this case, the production of a "single node value" occurs in step 4, where XPath casting semantics produces the string value of the first node in a node-set (regardless of its context size).
The tests clearly demonstrate that behavior: anywhere a string value is consumed, the expected string value is produced from the single (first) node value in the resulting node-set.
In @getodk/xpath
this casting from node-set to string occurs further up the call stack, applied by intentionally isolated casting logic, and applied consistently:
- to any XPath result
- of any type
- whether an intermediate result of a sub-expression, or the ultimate result produced by an
evaluate
call - whether the result is produced by a
FunctionImplementation
, any part of a LocationPath, or any other result from any other XPath syntax
Our isolation of casting logic is important. XPath casting semantics have a fair bit of nuance which can easily be lost or become inconsistent otherwise (and did earlier in implementation!). That isolation is also crucial to our ability to support some ODK XForms functionality, i.e. date / time / datetime data types.
In terms of spec conformance: the only way to deviate from the spec behavior is to intentionally consume the return value as a node-set, as in @lognaturel's count
example, or by calling evaluate
with any of the XPathResult
types producing one or more nodes **and then explicitly accessing the evaluated result's node APIs, or by calling any of our explicitly node-returning evaluate*
convenience methods; any of which are explicit and intentional. Doing so does go beyond the spec, but it does not conflict with the spec. This is the same reasoning I applied to implementing the function with a variadic signature (rather than restricting calls to a maximum of three repeatN
/indexN
pairs as in the spec).
Since the spec signature returns a string, there's no precedent for different behavior when an indexed-repeat
return value is deliberately used as a node-set: you can't cast a string to a node-set.
The only conflict this interpretation could introduce with the spec is if we think it's more valuable to users to actively prevent the beyond-spec usage than to (quietly, apart from this discussion) permit it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, it feels fine to leave as-is to me. 👍 @sadiqkhoja?
Our isolation of casting logic is important.
Strong agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just saw the test below and the edited comment. Ok, seems like there's more to consider!
const targetNodes = evaluateArgumentToFilterableNodes(context, target); | ||
|
||
// Filter only the target nodes contained by the deepest repeat context node. | ||
return targetNodes.filter((targetNode) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
find
wouldn't be a better function here? We need to pick " a single node value "
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll refer to the casting details in my comment the other thread, with emphasis on how node-sets are cast to strings. This is also how XPath semantics addresses node-set values.
Anyway, as I said in that thread: I'm open to making this less capable if that's what we decide we really want.
But to be clear, a NodeSetFunction
implementation can't return a single node, it must return Iterable<Node>
. So if we decide that the less capable thing we're after is guaranteeing single node-ness (rather than the value-ness, which is already handled by casting up the stack), we'd still need to either rewrap the single node result in an array (empty if null) or convert this to a generator and yield
the non-null result.
Following adhoc test fails:
|
While this was more concise, it was less clear when the optional case applies. This is an odd case where TypeScript doesn’t have a great way to model a let binding which **will ultimately** be assigned by a loop, but is undefined for the loop’s first iteration. I would normally prefer to express this with either: - an initial non-null value assigned to the let binding (but there isn’t an appropriate one in this case; we could lie and assign the root node, but that would be a lot less clear IMO) - reduce (but I suspect that would make this implementation a little bit harder for most readers who don’t also have an affinity for reduce)
Unless I'm mistaken, this looks like the same feedback as the other two comments: whether or not to go beyond the spec and permit a node-set result. If I am mistaken and there's a different issue here that I'm not seeing, please feel free to clarify so I can address it separately. Edit: after giving this some more thought, I think this might be a compelling argument for enforcing the “single node” aspect at the function implementation level, rather than just relying on casting up the stack as currently implemented. At least as a cautious first step, leaving it open to spec expansion in the future. But it definitely is an example of a potential case where we might unintentionally introduce unexpected behavior. I think (as @lognaturel suggested) it’ll be interesting to see what JavaRosa produces for this case. I’ll check that in the morning! |
I tried the attached Form in Collect and Enketo, both are producing multiple nodes so this implementation matches the JR's behaviour. |
Not what I expected! That said, it makes sense given that they both essentially construct XPath path expressions and then evaluate them. Let’s update the spec. 👍 |
Thanks for putting a form together @sadiqkhoja! I've added two I'm really glad we took the time to make sure this is the right call, and to compare the JavaRosa behavior. In hindsight, this is a good reminder (to me!) not to take spec language too literally, and to think really carefully about likely intent. It's also really encouraging that it's now so quick and easy to compare Web Forms and JavaRosa behavior, so we can go to JR as a source of truth when there's doubt about spec minutiae. I agree we should update the spec language, and I'll also re-emphasize that it would be a great opportunity to refine the example, and maybe add more examples as well. |
Replaces #150. The previous approach incorrectly contextualized relative arguments to their preceding result node-sets. This approach is corrected based on:
indexed-repeat
function #150 and clarifying discussions based on thatThanks @lognaturel for helping me understand the expected behavior! This one was a doozy, relative to what I thought I understood going in.
A nice side effect of all this clarification is that I think the implementation itself is much simpler.
I did not decide to go the alternative route discussed in #150 comments (providing a mechanism for a function to act as a syntax alias or sugar). That would certainly be interesting, but I decided to go with this approach for a couple of reasons:
@getodk/xpath
concepts and internals for as many use cases as they address. This keeps the base of functionality contained, especially as compelling motivations for refactoring mount1.Footnotes
Such as the oft-cited Decouple @odk/xpath API from XML/browser DOM #39 ↩