Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xpath: [incorrect] support for indexed-repeat function #150

Closed
wants to merge 5 commits into from

Conversation

eyelidlessness
Copy link
Member

@eyelidlessness eyelidlessness commented Jun 27, 2024

This implementation covers:

  • All cases directly tested in scenario tests ported from JavaRosa
  • Unblocking the current known failure in the child-vaccination smoke test

Also adds a slew of other unit-level tests in the xpath package.

There are a handful of judgement calls made in the implementation, described in detail on the JSDoc for the function implementation itself. Most notably, there is mixed support for contextualization of absolute expressions (repeatN expressions > repeat1 are filtered to descendants of result from the previous repeatN/indexN argument pair; indexN are not contextualized, they are evaluated as if they were [position] predicates).


There are a couple of precursor commits to prepare for this work:

  • c25cae8 sets up a more conventional test inclusion filter in the xpath package (eliminating the weird glob import indirection that existed prior, which was really only valuable during early development of the xpath evaluator)

  • 04225e5 updates some Scenario methods to implicitly create repeat instances where they're missing. I believe this is consistent with the same methods' behavior in JavaRosa. I would prefer to move away from this sort of implicit behavior in the future, but it feels pragmatic for now to keep tests closer to their JavaRosa counterparts. The change was necessary to pass the ported tests directly exercising indexed-repeat. It's worth noting that there are other failing tests which are affected by this change. They don't yet pass, but their PORTING NOTES may be less predictive of their respective failure modes. Some of those will be addressed in another coming PR, also involving in flight improvements that didn't make the cut for Several repeat-related fixes #135.

Copy link

changeset-bot bot commented Jun 27, 2024

🦋 Changeset detected

Latest commit: a781d92

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages
Name Type
@getodk/scenario Patch
@getodk/xforms-engine Patch
@getodk/xpath Patch
@getodk/ui-solid Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

eyelidlessness and others added 5 commits June 27, 2024 13:49
… and eliminate glob imports for the bulk migrated ORXE tests (these were convenient during rapid dev on the xpath functionality, but now they just make ongoing maintenance a little extra annoying)
… for consistency with JavaRosa.

This change will become pertinent when adding support for `indexed-repeat`, as the pertinent `scenario` tests expect this behavior as well.
- Direct test of function implementation in form context has minor modification to omit superfluous answer expression position predicates (and PORTING NOTES updated accordingly)

- Child Vaccination smoke test updates to reflect progress based on ability to evaluate `indexed-repeat` function call. It is now blocked on lack of support for repeats with `jr:count`. It seems likely that functionality will allow significantly more progress through this smoke test.
@eyelidlessness eyelidlessness force-pushed the features/indexed-repeat-xpath-fn branch from ff9fe16 to a781d92 Compare June 27, 2024 20:50
Copy link
Member

@lognaturel lognaturel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple questions/thoughts ready to share! My next steps are reviewing the tests more carefully and adding some JR tests to verify my understanding around contextualization of the indexN params.


for (let i = 0; i < delta; i += 1) {
this.createNewRepeat(repeatRangeReference);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm here is something like find the last positional predicate, strip it off, use that reference to find the repeat range, add instances up to the position accessed, right? So it won't work for nested repeats that all have missing instances (e.g. /data/repeat1[2]/repeat2[5]/foo)? I think maybe there are tests that use that in JR but I'm not entirely sure. Fine to leave as-is for now but wanted to check my understanding.

I don't think this requires immediate actions either but I wanted to mention that I expected some slightly higher-level building blocks here! Specifically, I'm surprised to see some regex for getting references without predicates and for getting sub references. I imagine at some point it will be worth having some reusable support for those concepts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment about building blocks is partially addressed at 66aa8c9#diff-f3ac6c9869393022cd23375d383ded81f1d35c2c962125846cb288792ee28d52R15

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been mulling this since I saw these comments yesterday, and my instinct is that we're probably better off not porting this aspect of Scenario.

  • The tests I've encountered which assume this behavior have frequently surprised and confused me. And not in a trivial way: almost every time, I've misinterpreted the nature of a failure, or misunderstood the intent of the test, sending me off on unproductive tangents.

  • Tests which expect this level of logical complexity should probably have their own logic tested as well. For instance, I was already deeply uncomfortable with the level of complexity in child-vaccination. There is significant application-like work happening there, and I find it vastly harder to understand the test flow than any of the tests which more explicitly express what they are doing.

  • I should say that I regret the linked comment. In hindsight, the kind of expression analysis and manipulation that I discuss there, and that you say you were expecting here, is definitely appropriate for engine implementation details. I am quite cautious about using logic of that kind to implement test behavior.

  • On Tuesday we discussed the fact that such logic is in fact incoming, supporting a variety of other engine improvements. And as we discussed then, that logic is downstream from this change, as other affected tests would currently depend on this specific implicit repeat instance creation. But to be clear, as that work took form, it felt pretty clear that the expression analysis/manipulation logic will be entirely encapsulated in the engine. If we do anticipate exposing arbitrarily/increasingly complex expression analysis and manipulation as a public API, I think we should really think hard about what the appropriate design for that would be. And the appropriate package and/or abstraction layer from which we should expose it.

I already wanted to make repeat instance creation more explicit before this discussion, but I hesitated in this PR on the basis that it would increase divergence from JavaRosa. But that divergence is much more appealing to me now, in light of some of the implications this has raised.

Copy link
Member

@lognaturel lognaturel Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought your linked comment meant in the engine. Agreed it’s the engine that eventually should have that functionality so good to see it’s coming. 👍

As for not porting the implicit repeat addition, then I think we should probably change the JR tests. I don’t know off the top of my head how many are affected and will look.

* 5. **!!!** None of the `indexN` arguments are contextualized in any special
* way, they're evaluated just as if they were written as a position
* predicate. At present, it's not clear whether there's a reliable way to
* infer intent otherwise:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not feeling quite right. A common case is parallel repeats in which position(..) is used as the indexN. The intent is to use the position of the current repeat. Example XLSForm (not the first weird one with the relevance directly on the repeat but the second one with relevance on an inner group)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven’t had a chance to write the tests I wanted, I’ll come back to it Friday! I wanted to clarify that my understanding is that the intended context for the index expressions is the single node binding, NOT the repeat reference. Unfortunately all the JR tests have index paths that end up with the same contextualized ref regardless of which is used.

Copy link
Member Author

@eyelidlessness eyelidlessness Jul 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I played a bit with the linked form. I agree it doesn't seem to be doing what I would intuitively expect from the definition. That said...

I wanted to clarify that my understanding is that the intended context for the index expressions is the single node binding, NOT the repeat reference.

This strongly suggests to me that the spec example could use clarification. Here's what the spec says:

Returns a single node value from a node-set by selecting the 1-based index of a repeat node-set that this node is a child of. It does this up to 3 repeat levels deep. E.g. indexed-repeat(//node, /path/to/repeat, //index1, /path/to/repeat/nested-repeat, //index2) is meant to be a shortcut for //repeat[position()=//index1]/nested-repeat[position()=index2]/node in native XPath syntax.

Note that as I mentioned when we discussed this Tuesday, the way that the example maps //1 to sub-expression parts is already pretty loosey goosey. But to be really clear...

//repeat[position()=//index1]/nested-repeat[position()=index2]/node
         ^ (1)                              ^ (2)
  1. XPath specifies this predicate as contextualized to //repeat.

    • The position() call is evaluated in that context (which we can hopefully take as read).

    • The //index1 expression is an abbreviated absolute path, so it isn't really contextualized to anything at all. I really don't know how to interpret the intent of this part of the example. But taken literally, the XPath spec answer would be "the first index1 in the document, in document order".

  2. XPath specifies this predicate as contextualized to //repeat[position()=//index1]/nested-repeat.

    • Here again, the position() call is evaluated in that context.

    • The index2 expression is a relative path expression, with a leading abbreviated child axis step. Again taking the example literally, the XPath spec would say that this should resolve to the first index2 child of the current nested-repeat context node.

Footnotes

  1. Shorthand for /descendant-or-self::node()/, which the spec doesn't support, and as I recall the spec was updated in recent memory to remove other similar usage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec was backfilled from the JR implementation so I think we should use the implementation as the intended behavior and update the spec as needed.

I agree it’s loosey goosey! I’ve interpreted it as “evaluate the indexN expression using the single node binding context and use that as the positional predicate.” It’s a bit of a stretch given what’s written but I think that’s the implementation.

Copy link
Member

@lognaturel lognaturel Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started backfilling a couple tests in JR to show the intended evaluation contexts and describe the general intent behind the function: getodk/javarosa#776 @eyelidlessness and I have looked at it together and he'll take over from there!

---
"@getodk/scenario": patch
"@getodk/xforms-engine": patch
"@getodk/xpath": patch
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: new functionality feels like minor? Not a big deal either way!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. I was partly basing this on precedent in 1ac6b10, which surprised me but seemed to be oriented around targeting some set of functionality for certain 0.x milestones. Not sure if that was the right inference, or if it's still pertinent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I see! I think I was thinking of the API addition as internal but it’s not, really. So that one should have been minor for the engine and dependents?

This one feels unambiguous to me because it’s introducing user-facing functionality.

@eyelidlessness
Copy link
Member Author

@lognaturel I think given the ambiguity of spec intent, and the questions about expression logic in Scenario, it might be better to close this for now (and unblock the would-be-downstream PR improving path resolution/etc) and revisit with more clarity. What do you think?

@lognaturel
Copy link
Member

I’m not opposed to closing and coming back if you think that’s easiest to work with.

I could also see merging and filing a couple issues for follow-up work. I think it’s totally fine to have behavior that we know to be incorrect or incomplete at this stage. Merging would mean we have this context (pun?) easily accessible in history. If you like this option, I think we can just change the title to indicate it’s incomplete and then merge.

eyelidlessness added a commit that referenced this pull request Jul 4, 2024
Followup from [this discussion](#150 (comment))

The idea here is:

1. Explicit repeat creation in tests will improve test clarity

2. Introduce a clear way to make similar changes in JavaRosa as they come up

3. Detect missing repeats (with a still naive approach[^1], albeit now recursive) and **log with a stack trace** so explicit calls can be introduced (conditionally, with parameterization like many other cases where we make adjustments to the JavaRosa direct port)

4. Add a new proposed `Scenario` method which…

  - Makes clear where explicit repeat creation calls are added, in a way that can be traced directly in test source, whenever convenient

  - Assumes the call occurs in such a sub-suite parameterizing whether to explicitly add repeats as detected; adds repeats as explicitly specified in the true condition, suppresses logging in the false condition

This approach already detected one test which would have passed if adding repeats had been explicit. The test is updated here to demonstrate that.

Notice that the test’s **PORTING NOTES** have also been removed. This is because the notes were wrong! This is an excellent example of how misleading it is that tests fail for lack of this implicit behavior! The actual test logic is not substantially noisier or more complex as a result. This feels like a clear win to me.

[^1]: Keeping this naive seems fine for the limited scope of usage. The reference expressions which reach this point are limited to `Scenario.answer` calls with an explicit reference. If we’re using references of arbitrary complexity in those calls, I think we’ve got much bigger problems than this functionality being so narrowly scoped.
eyelidlessness added a commit that referenced this pull request Jul 8, 2024
Followup from [this discussion](#150 (comment))

The idea here is:

1. Explicit repeat creation in tests will improve test clarity

2. Introduce a clear way to make similar changes in JavaRosa as they come up

3. Detect missing repeats (with a still naive approach[^1], albeit now recursive) and **log with a stack trace** so explicit calls can be introduced (conditionally, with parameterization like many other cases where we make adjustments to the JavaRosa direct port)

4. Add a new proposed `Scenario` method which…

  - Makes clear where explicit repeat creation calls are added, in a way that can be traced directly in test source, whenever convenient

  - Assumes the call occurs in such a sub-suite parameterizing whether to explicitly add repeats as detected; adds repeats as explicitly specified in the true condition, suppresses logging in the false condition

This approach already detected one test which would have passed if adding repeats had been explicit. The test is updated here to demonstrate that.

Notice that the test’s **PORTING NOTES** have also been removed. This is because the notes were wrong! This is an excellent example of how misleading it is that tests fail for lack of this implicit behavior! The actual test logic is not substantially noisier or more complex as a result. This feels like a clear win to me.

[^1]: Keeping this naive seems fine for the limited scope of usage. The reference expressions which reach this point are limited to `Scenario.answer` calls with an explicit reference. If we’re using references of arbitrary complexity in those calls, I think we’ve got much bigger problems than this functionality being so narrowly scoped.
eyelidlessness added a commit that referenced this pull request Jul 9, 2024
Followup from [this discussion](#150 (comment))

The idea here is:

1. Explicit repeat creation in tests will improve test clarity

2. Introduce a clear way to make similar changes in JavaRosa as they come up

3. Detect missing repeats (with a still naive approach[^1], albeit now recursive) and **log with a stack trace** so explicit calls can be introduced (conditionally, with parameterization like many other cases where we make adjustments to the JavaRosa direct port)

4. Add a new proposed `Scenario` method which…

  - Makes clear where explicit repeat creation calls are added, in a way that can be traced directly in test source, whenever convenient

  - Assumes the call occurs in such a sub-suite parameterizing whether to explicitly add repeats as detected; adds repeats as explicitly specified in the true condition, suppresses logging in the false condition

This approach already detected one test which would have passed if adding repeats had been explicit. The test is updated here to demonstrate that.

Notice that the test’s **PORTING NOTES** have also been removed. This is because the notes were wrong! This is an excellent example of how misleading it is that tests fail for lack of this implicit behavior! The actual test logic is not substantially noisier or more complex as a result. This feels like a clear win to me.

[^1]: Keeping this naive seems fine for the limited scope of usage. The reference expressions which reach this point are limited to `Scenario.answer` calls with an explicit reference. If we’re using references of arbitrary complexity in those calls, I think we’ve got much bigger problems than this functionality being so narrowly scoped.
eyelidlessness added a commit that referenced this pull request Jul 12, 2024
Followup from [this discussion](#150 (comment))

The idea here is:

1. Explicit repeat creation in tests will improve test clarity

2. Introduce a clear way to make similar changes in JavaRosa as they come up

3. Detect missing repeats (with a still naive approach[^1], albeit now recursive) and **log with a stack trace** so explicit calls can be introduced (conditionally, with parameterization like many other cases where we make adjustments to the JavaRosa direct port)

4. Add a new proposed `Scenario` method which…

  - Makes clear where explicit repeat creation calls are added, in a way that can be traced directly in test source, whenever convenient

  - Assumes the call occurs in such a sub-suite parameterizing whether to explicitly add repeats as detected; adds repeats as explicitly specified in the true condition, suppresses logging in the false condition

This approach already detected one test which would have passed if adding repeats had been explicit. The test is updated here to demonstrate that.

Notice that the test’s **PORTING NOTES** have also been removed. This is because the notes were wrong! This is an excellent example of how misleading it is that tests fail for lack of this implicit behavior! The actual test logic is not substantially noisier or more complex as a result. This feels like a clear win to me.

[^1]: Keeping this naive seems fine for the limited scope of usage. The reference expressions which reach this point are limited to `Scenario.answer` calls with an explicit reference. If we’re using references of arbitrary complexity in those calls, I think we’ve got much bigger problems than this functionality being so narrowly scoped.
@eyelidlessness eyelidlessness marked this pull request as draft July 15, 2024 17:37
@eyelidlessness
Copy link
Member Author

I've switched this PR to a draft for now. It'll likely be closed, with another PR replacing it on a separate branch. I want to keep it here for now to keep the feedback context. Changes in the followup PR will include:

I expect the implementation will be more limited, to just the FunctionImplementation itself. Given what I now understand about JavaRosa's behavior, I don't think there's as much opportunity to generalize position predicates as I had here.

There's also a slight chance that that approach is challenging because of some of the expectations around argument order (i.e. I also now understand that there is no requirement that the arguments are ordered in terms of depth of hierarchy). In which case, it's possible the implementation might be better modeled after JavaRosa's approach, which involves transforming the FunctionCall expression itself into its LocationPath equivalent. I have reservations about taking that approach. It's perfectly cromulent! But I'm cautious about what it would imply for the responsibility boundaries between our packages (i.e. we currently do XForms-specific XPath syntax analysis in xforms-engine, and currently have function implementations in xpath).

One potential way to bridge that gap would be to have xpath expose something like a SyntaxSugarFunctionImplementation—i.e. a function implemented by converting the source syntax to some other arbitrary expression for further evaluation—that xforms-engine and other potential downstream packages could use to implement specialized functions. I could see this being useful for implementing pulldata as well. It could also theoretically be used to replace the jr:itext implementation with a much cleaner isolation of its implicit state.

@lognaturel
Copy link
Member

a function implemented by converting the source syntax to some other arbitrary expression for further evaluation

That makes a lot of sense and I agree it likely would be useful beyond this function 👍

eyelidlessness added a commit that referenced this pull request Jul 16, 2024
Followup from [this discussion](#150 (comment))

The idea here is:

1. Explicit repeat creation in tests will improve test clarity

2. Introduce a clear way to make similar changes in JavaRosa as they come up

3. Detect missing repeats (with a still naive approach[^1], albeit now recursive) and **log with a stack trace** so explicit calls can be introduced (conditionally, with parameterization like many other cases where we make adjustments to the JavaRosa direct port)

4. Add a new proposed `Scenario` method which…

  - Makes clear where explicit repeat creation calls are added, in a way that can be traced directly in test source, whenever convenient

  - Assumes the call occurs in such a sub-suite parameterizing whether to explicitly add repeats as detected; adds repeats as explicitly specified in the true condition, suppresses logging in the false condition

This approach already detected one test which would have passed if adding repeats had been explicit. The test is updated here to demonstrate that.

Notice that the test’s **PORTING NOTES** have also been removed. This is because the notes were wrong! This is an excellent example of how misleading it is that tests fail for lack of this implicit behavior! The actual test logic is not substantially noisier or more complex as a result. This feels like a clear win to me.

[^1]: Keeping this naive seems fine for the limited scope of usage. The reference expressions which reach this point are limited to `Scenario.answer` calls with an explicit reference. If we’re using references of arbitrary complexity in those calls, I think we’ve got much bigger problems than this functionality being so narrowly scoped.
eyelidlessness added a commit that referenced this pull request Jul 22, 2024
Followup from [this discussion](#150 (comment))

The idea here is:

1. Explicit repeat creation in tests will improve test clarity

2. Introduce a clear way to make similar changes in JavaRosa as they come up

3. Detect missing repeats (with a still naive approach[^1], albeit now recursive) and **log with a stack trace** so explicit calls can be introduced (conditionally, with parameterization like many other cases where we make adjustments to the JavaRosa direct port)

4. Add a new proposed `Scenario` method which…

  - Makes clear where explicit repeat creation calls are added, in a way that can be traced directly in test source, whenever convenient

  - Assumes the call occurs in such a sub-suite parameterizing whether to explicitly add repeats as detected; adds repeats as explicitly specified in the true condition, suppresses logging in the false condition

This approach already detected one test which would have passed if adding repeats had been explicit. The test is updated here to demonstrate that.

Notice that the test’s **PORTING NOTES** have also been removed. This is because the notes were wrong! This is an excellent example of how misleading it is that tests fail for lack of this implicit behavior! The actual test logic is not substantially noisier or more complex as a result. This feels like a clear win to me.

[^1]: Keeping this naive seems fine for the limited scope of usage. The reference expressions which reach this point are limited to `Scenario.answer` calls with an explicit reference. If we’re using references of arbitrary complexity in those calls, I think we’ve got much bigger problems than this functionality being so narrowly scoped.
eyelidlessness added a commit that referenced this pull request Aug 22, 2024
…e usage

As was done in the first pass on this functionality in #150
eyelidlessness added a commit that referenced this pull request Aug 22, 2024
This is copied verbatim from 56df0ea#diff-ca14e9532b1b772a6f0453cf9ff885d0fdd7c088df02bd17bfc911f03f4e9cd4 where they were originally introduced. This demonstrates the previous, wrong, understanding of how arguments are expected to be contextualized. The tests will be updated in subsequent commit(s) to show the difference in behavior from #150.
@eyelidlessness eyelidlessness changed the title xpath: support for indexed-repeat function xpath: [incorrect] support for indexed-repeat function Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants