xpath: [incorrect] support for `indexed-repeat` function #150

eyelidlessness · 2024-06-27T18:02:00Z

This implementation covers:

All cases directly tested in scenario tests ported from JavaRosa
Unblocking the current known failure in the child-vaccination smoke test

Also adds a slew of other unit-level tests in the xpath package.

There are a handful of judgement calls made in the implementation, described in detail on the JSDoc for the function implementation itself. Most notably, there is mixed support for contextualization of absolute expressions (repeatN expressions > repeat1 are filtered to descendants of result from the previous repeatN/indexN argument pair; indexN are not contextualized, they are evaluated as if they were [position] predicates).

There are a couple of precursor commits to prepare for this work:

c25cae8 sets up a more conventional test inclusion filter in the xpath package (eliminating the weird glob import indirection that existed prior, which was really only valuable during early development of the xpath evaluator)
04225e5 updates some Scenario methods to implicitly create repeat instances where they're missing. I believe this is consistent with the same methods' behavior in JavaRosa. I would prefer to move away from this sort of implicit behavior in the future, but it feels pragmatic for now to keep tests closer to their JavaRosa counterparts. The change was necessary to pass the ported tests directly exercising indexed-repeat. It's worth noting that there are other failing tests which are affected by this change. They don't yet pass, but their PORTING NOTES may be less predictive of their respective failure modes. Some of those will be addressed in another coming PR, also involving in flight improvements that didn't make the cut for Several repeat-related fixes #135.

changeset-bot · 2024-06-27T18:02:04Z

🦋 Changeset detected

Latest commit: a781d92

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages

Name	Type
@getodk/scenario	Patch
@getodk/xforms-engine	Patch
@getodk/xpath	Patch
@getodk/ui-solid	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

… and eliminate glob imports for the bulk migrated ORXE tests (these were convenient during rapid dev on the xpath functionality, but now they just make ongoing maintenance a little extra annoying)

… for consistency with JavaRosa. This change will become pertinent when adding support for `indexed-repeat`, as the pertinent `scenario` tests expect this behavior as well.

- Direct test of function implementation in form context has minor modification to omit superfluous answer expression position predicates (and PORTING NOTES updated accordingly) - Child Vaccination smoke test updates to reflect progress based on ability to evaluate `indexed-repeat` function call. It is now blocked on lack of support for repeats with `jr:count`. It seems likely that functionality will allow significantly more progress through this smoke test.

lognaturel

I have a couple questions/thoughts ready to share! My next steps are reviewing the tests more carefully and adding some JR tests to verify my understanding around contextualization of the indexN params.

lognaturel · 2024-07-03T19:37:01Z

packages/scenario/src/jr/Scenario.ts

+
+		for (let i = 0; i < delta; i += 1) {
+			this.createNewRepeat(repeatRangeReference);
+		}


The algorithm here is something like find the last positional predicate, strip it off, use that reference to find the repeat range, add instances up to the position accessed, right? So it won't work for nested repeats that all have missing instances (e.g. /data/repeat1[2]/repeat2[5]/foo)? I think maybe there are tests that use that in JR but I'm not entirely sure. Fine to leave as-is for now but wanted to check my understanding.

I don't think this requires immediate actions either but I wanted to mention that I expected some slightly higher-level building blocks here! Specifically, I'm surprised to see some regex for getting references without predicates and for getting sub references. I imagine at some point it will be worth having some reusable support for those concepts.

The comment about building blocks is partially addressed at 66aa8c9#diff-f3ac6c9869393022cd23375d383ded81f1d35c2c962125846cb288792ee28d52R15

I've been mulling this since I saw these comments yesterday, and my instinct is that we're probably better off not porting this aspect of Scenario.

The tests I've encountered which assume this behavior have frequently surprised and confused me. And not in a trivial way: almost every time, I've misinterpreted the nature of a failure, or misunderstood the intent of the test, sending me off on unproductive tangents.

Tests which expect this level of logical complexity should probably have their own logic tested as well. For instance, I was already deeply uncomfortable with the level of complexity in child-vaccination. There is significant application-like work happening there, and I find it vastly harder to understand the test flow than any of the tests which more explicitly express what they are doing.

I should say that I regret the linked comment. In hindsight, the kind of expression analysis and manipulation that I discuss there, and that you say you were expecting here, is definitely appropriate for engine implementation details. I am quite cautious about using logic of that kind to implement test behavior.

On Tuesday we discussed the fact that such logic is in fact incoming, supporting a variety of other engine improvements. And as we discussed then, that logic is downstream from this change, as other affected tests would currently depend on this specific implicit repeat instance creation. But to be clear, as that work took form, it felt pretty clear that the expression analysis/manipulation logic will be entirely encapsulated in the engine. If we do anticipate exposing arbitrarily/increasingly complex expression analysis and manipulation as a public API, I think we should really think hard about what the appropriate design for that would be. And the appropriate package and/or abstraction layer from which we should expose it.

I already wanted to make repeat instance creation more explicit before this discussion, but I hesitated in this PR on the basis that it would increase divergence from JavaRosa. But that divergence is much more appealing to me now, in light of some of the implications this has raised.

I thought your linked comment meant in the engine. Agreed it’s the engine that eventually should have that functionality so good to see it’s coming. 👍

As for not porting the implicit repeat addition, then I think we should probably change the JR tests. I don’t know off the top of my head how many are affected and will look.

lognaturel · 2024-07-03T19:58:54Z

packages/xpath/src/functions/xforms/node-set.ts

+ * 5. **!!!** None of the `indexN` arguments are contextualized in any special
+ *    way, they're evaluated just as if they were written as a position
+ *    predicate. At present, it's not clear whether there's a reliable way to
+ *    infer intent otherwise:


This is not feeling quite right. A common case is parallel repeats in which position(..) is used as the indexN. The intent is to use the position of the current repeat. Example XLSForm (not the first weird one with the relevance directly on the repeat but the second one with relevance on an inner group)

Haven’t had a chance to write the tests I wanted, I’ll come back to it Friday! I wanted to clarify that my understanding is that the intended context for the index expressions is the single node binding, NOT the repeat reference. Unfortunately all the JR tests have index paths that end up with the same contextualized ref regardless of which is used.

So I played a bit with the linked form. I agree it doesn't seem to be doing what I would intuitively expect from the definition. That said...

I wanted to clarify that my understanding is that the intended context for the index expressions is the single node binding, NOT the repeat reference.

This strongly suggests to me that the spec example could use clarification. Here's what the spec says:

Returns a single node value from a node-set by selecting the 1-based index of a repeat node-set that this node is a child of. It does this up to 3 repeat levels deep. E.g. indexed-repeat(//node, /path/to/repeat, //index1, /path/to/repeat/nested-repeat, //index2) is meant to be a shortcut for //repeat[position()=//index1]/nested-repeat[position()=index2]/node in native XPath syntax.

Note that as I mentioned when we discussed this Tuesday, the way that the example maps //¹ to sub-expression parts is already pretty loosey goosey. But to be really clear...

//repeat[position()=//index1]/nested-repeat[position()=index2]/node ^ (1) ^ (2)

XPath specifies this predicate as contextualized to //repeat.

The position() call is evaluated in that context (which we can hopefully take as read).

The //index1 expression is an abbreviated absolute path, so it isn't really contextualized to anything at all. I really don't know how to interpret the intent of this part of the example. But taken literally, the XPath spec answer would be "the first index1 in the document, in document order".

XPath specifies this predicate as contextualized to //repeat[position()=//index1]/nested-repeat.

Here again, the position() call is evaluated in that context.

The index2 expression is a relative path expression, with a leading abbreviated child axis step. Again taking the example literally, the XPath spec would say that this should resolve to the first index2 child of the current nested-repeat context node.

Footnotes

Shorthand for /descendant-or-self::node()/, which the spec doesn't support, and as I recall the spec was updated in recent memory to remove other similar usage. ↩

The spec was backfilled from the JR implementation so I think we should use the implementation as the intended behavior and update the spec as needed.

I agree it’s loosey goosey! I’ve interpreted it as “evaluate the indexN expression using the single node binding context and use that as the positional predicate.” It’s a bit of a stretch given what’s written but I think that’s the implementation.

I started backfilling a couple tests in JR to show the intended evaluation contexts and describe the general intent behind the function: getodk/javarosa#776 @eyelidlessness and I have looked at it together and he'll take over from there!

packages/xpath/src/functions/xforms/node-set.ts

lognaturel · 2024-07-03T20:08:19Z

.changeset/four-bats-kick.md

+---
+"@getodk/scenario": patch
+"@getodk/xforms-engine": patch
+"@getodk/xpath": patch


Nit: new functionality feels like minor? Not a big deal either way!

Agree. I was partly basing this on precedent in 1ac6b10, which surprised me but seemed to be oriented around targeting some set of functionality for certain 0.x milestones. Not sure if that was the right inference, or if it's still pertinent.

Interesting, I see! I think I was thinking of the API addition as internal but it’s not, really. So that one should have been minor for the engine and dependents?

This one feels unambiguous to me because it’s introducing user-facing functionality.

eyelidlessness · 2024-07-04T17:44:58Z

@lognaturel I think given the ambiguity of spec intent, and the questions about expression logic in Scenario, it might be better to close this for now (and unblock the would-be-downstream PR improving path resolution/etc) and revisit with more clarity. What do you think?

lognaturel · 2024-07-04T19:20:12Z

I’m not opposed to closing and coming back if you think that’s easiest to work with.

I could also see merging and filing a couple issues for follow-up work. I think it’s totally fine to have behavior that we know to be incorrect or incomplete at this stage. Merging would mean we have this context (pun?) easily accessible in history. If you like this option, I think we can just change the title to indicate it’s incomplete and then merge.

Followup from [this discussion](#150 (comment)) The idea here is: 1. Explicit repeat creation in tests will improve test clarity 2. Introduce a clear way to make similar changes in JavaRosa as they come up 3. Detect missing repeats (with a still naive approach[^1], albeit now recursive) and **log with a stack trace** so explicit calls can be introduced (conditionally, with parameterization like many other cases where we make adjustments to the JavaRosa direct port) 4. Add a new proposed `Scenario` method which… - Makes clear where explicit repeat creation calls are added, in a way that can be traced directly in test source, whenever convenient - Assumes the call occurs in such a sub-suite parameterizing whether to explicitly add repeats as detected; adds repeats as explicitly specified in the true condition, suppresses logging in the false condition This approach already detected one test which would have passed if adding repeats had been explicit. The test is updated here to demonstrate that. Notice that the test’s **PORTING NOTES** have also been removed. This is because the notes were wrong! This is an excellent example of how misleading it is that tests fail for lack of this implicit behavior! The actual test logic is not substantially noisier or more complex as a result. This feels like a clear win to me. [^1]: Keeping this naive seems fine for the limited scope of usage. The reference expressions which reach this point are limited to `Scenario.answer` calls with an explicit reference. If we’re using references of arbitrary complexity in those calls, I think we’ve got much bigger problems than this functionality being so narrowly scoped.

eyelidlessness · 2024-07-15T17:57:14Z

I've switched this PR to a draft for now. It'll likely be closed, with another PR replacing it on a separate branch. I want to keep it here for now to keep the feedback context. Changes in the followup PR will include:

Branching off the changes in Improve dependency resolution, XPath static analysis; support current() in computations, repeat-based itemsets, relative body references #166
Expand on approach in that branch/PR to making scenario implicit repeat creation explicit
Bring in tests from Add indexed-repeat tests javarosa#776
Expand on those with some additional checks exercising my better understanding of the intended behavior (I will likely create these tests in JavaRosa first, to continue validating my assumptions)
Finally, revisiting the implementation to address contextualization of the "index" (position) arguments

I expect the implementation will be more limited, to just the FunctionImplementation itself. Given what I now understand about JavaRosa's behavior, I don't think there's as much opportunity to generalize position predicates as I had here.

There's also a slight chance that that approach is challenging because of some of the expectations around argument order (i.e. I also now understand that there is no requirement that the arguments are ordered in terms of depth of hierarchy). In which case, it's possible the implementation might be better modeled after JavaRosa's approach, which involves transforming the FunctionCall expression itself into its LocationPath equivalent. I have reservations about taking that approach. It's perfectly cromulent! But I'm cautious about what it would imply for the responsibility boundaries between our packages (i.e. we currently do XForms-specific XPath syntax analysis in xforms-engine, and currently have function implementations in xpath).

One potential way to bridge that gap would be to have xpath expose something like a SyntaxSugarFunctionImplementation—i.e. a function implemented by converting the source syntax to some other arbitrary expression for further evaluation—that xforms-engine and other potential downstream packages could use to implement specialized functions. I could see this being useful for implementing pulldata as well. It could also theoretically be used to replace the jr:itext implementation with a much cleaner isolation of its implicit state.

lognaturel · 2024-07-15T19:30:25Z

a function implemented by converting the source syntax to some other arbitrary expression for further evaluation

That makes a lot of sense and I agree it likely would be useful beyond this function 👍

Followup from [this discussion](#150 (comment)) The idea here is: 1. Explicit repeat creation in tests will improve test clarity 2. Introduce a clear way to make similar changes in JavaRosa as they come up 3. Detect missing repeats (with a still naive approach[^1], albeit now recursive) and **log with a stack trace** so explicit calls can be introduced (conditionally, with parameterization like many other cases where we make adjustments to the JavaRosa direct port) 4. Add a new proposed `Scenario` method which… - Makes clear where explicit repeat creation calls are added, in a way that can be traced directly in test source, whenever convenient - Assumes the call occurs in such a sub-suite parameterizing whether to explicitly add repeats as detected; adds repeats as explicitly specified in the true condition, suppresses logging in the false condition This approach already detected one test which would have passed if adding repeats had been explicit. The test is updated here to demonstrate that. Notice that the test’s **PORTING NOTES** have also been removed. This is because the notes were wrong! This is an excellent example of how misleading it is that tests fail for lack of this implicit behavior! The actual test logic is not substantially noisier or more complex as a result. This feels like a clear win to me. [^1]: Keeping this naive seems fine for the limited scope of usage. The reference expressions which reach this point are limited to `Scenario.answer` calls with an explicit reference. If we’re using references of arbitrary complexity in those calls, I think we’ve got much bigger problems than this functionality being so narrowly scoped.

…e usage As was done in the first pass on this functionality in #150

This is copied verbatim from 56df0ea#diff-ca14e9532b1b772a6f0453cf9ff885d0fdd7c088df02bd17bfc911f03f4e9cd4 where they were originally introduced. This demonstrates the previous, wrong, understanding of how arguments are expected to be contextualized. The tests will be updated in subsequent commit(s) to show the difference in behavior from #150.

eyelidlessness and others added 5 commits June 27, 2024 13:49

chore: xpath package consistent test suffixes…

9aad48b

… and eliminate glob imports for the bulk migrated ORXE tests (these were convenient during rapid dev on the xpath functionality, but now they just make ongoing maintenance a little extra annoying)

Fix (scenario): create missing repeat instances on answer call…

9fdf3e2

… for consistency with JavaRosa. This change will become pertinent when adding support for `indexed-repeat`, as the pertinent `scenario` tests expect this behavior as well.

xpath: add support for indexed-repeat function

56df0ea

Changeset: indexed-repeat

a781d92

eyelidlessness force-pushed the features/indexed-repeat-xpath-fn branch from ff9fe16 to a781d92 Compare June 27, 2024 20:50

lognaturel reviewed Jul 3, 2024

View reviewed changes

eyelidlessness mentioned this pull request Jul 12, 2024

Improve dependency resolution, XPath static analysis; support current() in computations, repeat-based itemsets, relative body references #166

Merged

eyelidlessness marked this pull request as draft July 15, 2024 17:37

eyelidlessness added a commit that referenced this pull request Aug 22, 2024

Update ported JR indexed-repeat tests for consistent XPath referenc…

aad995b

…e usage As was done in the first pass on this functionality in #150

eyelidlessness changed the title ~~xpath: support for indexed-repeat function~~ xpath: [incorrect] support for indexed-repeat function Aug 22, 2024

eyelidlessness closed this Aug 22, 2024

eyelidlessness mentioned this pull request Aug 22, 2024

xpath: support for indexed-repeat function #195

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

xpath: [incorrect] support for `indexed-repeat` function #150

xpath: [incorrect] support for `indexed-repeat` function #150

eyelidlessness commented Jun 27, 2024 •

edited

Loading

changeset-bot bot commented Jun 27, 2024 •

edited

Loading

lognaturel left a comment

lognaturel Jul 3, 2024

lognaturel Jul 3, 2024

eyelidlessness Jul 4, 2024

lognaturel Jul 4, 2024 •

edited

Loading

lognaturel Jul 3, 2024

lognaturel Jul 4, 2024

eyelidlessness Jul 4, 2024 •

edited

Loading

lognaturel Jul 4, 2024

lognaturel Jul 8, 2024 •

edited

Loading

lognaturel Jul 3, 2024

eyelidlessness Jul 4, 2024

lognaturel Jul 4, 2024

eyelidlessness commented Jul 4, 2024

lognaturel commented Jul 4, 2024

eyelidlessness commented Jul 15, 2024

lognaturel commented Jul 15, 2024

xpath: [incorrect] support for indexed-repeat function #150

xpath: [incorrect] support for indexed-repeat function #150

Conversation

eyelidlessness commented Jun 27, 2024 • edited Loading

changeset-bot bot commented Jun 27, 2024 • edited Loading

🦋 Changeset detected

lognaturel left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lognaturel Jul 4, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

eyelidlessness Jul 4, 2024 • edited Loading

Choose a reason for hiding this comment

Footnotes

Choose a reason for hiding this comment

lognaturel Jul 8, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

eyelidlessness commented Jul 4, 2024

lognaturel commented Jul 4, 2024

eyelidlessness commented Jul 15, 2024

lognaturel commented Jul 15, 2024

xpath: [incorrect] support for `indexed-repeat` function #150

xpath: [incorrect] support for `indexed-repeat` function #150

eyelidlessness commented Jun 27, 2024 •

edited

Loading

changeset-bot bot commented Jun 27, 2024 •

edited

Loading

lognaturel Jul 4, 2024 •

edited

Loading

eyelidlessness Jul 4, 2024 •

edited

Loading

lognaturel Jul 8, 2024 •

edited

Loading