Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix all XPathMatcherTest TODO's + handle nested elements with the same name #4532

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

DidierLoiseau
Copy link
Contributor

@DidierLoiseau DidierLoiseau commented Sep 28, 2024

What's changed?

  • Fixed all existing TODO’s in `XPathMatcherTest
  • Handle XML documents that contain nested elements with the same name for XPath using //
  • Code cleanup

What's your motivation?

Needed to use such XPaths.

Anything in particular you'd like reviewers to focus on?

I will put comments on the bits of code I cleaned up.

There are still limitations with XPath that do not start with / or that start with //, in particular, they don’t support // in the middle and there were not tests for that.

Have you considered any alternatives or workarounds?

I think the implementation would really still benefit from a lot of refactoring, or even a full rewrite:

  • There is no validation of the provided XPath
  • The XPath parsing is mixed with the XPath maching (e.g. extracting conditions, determining whether we are facing an attribute or a function etc.)
  • The implementation of XPaths starting with / is completely dissociated from those starting with // or no /, which causes different limitations for the two.

I think it would make sense to represent the XPath as a structured object instead of a String[]

Any additional context

As a user, it’s not really clear what is supported and what is not without looking at the unit test…

Checklist

  • I've added unit tests to cover both positive and negative cases
  • I've read and applied the recipe conventions and best practices
  • I've used the IntelliJ IDEA auto-formatter on affected files

@@ -92,18 +92,8 @@ public boolean matches(Cursor cursor) {
if (index < 0) {
return false;
}
if (part.startsWith("@")) { // is attribute selector
partWithCondition = part;
tagForCondition = i > 0 ? path.get(i - 1) : path.get(i);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using i straight up in path.get() was completely wrong since path is reverted while parts is not – pathIndex should always be used.

Note that if a part starts with @, it MUST be the last part – I wanted to check it here, but then I realized all ifs actually result in the same code (and conditions matching sub-elements were not handled…).

}
}
partWithCondition = part;
tagForCondition = path.get(pathIndex);
} else if (i < path.size() && i > 0 && parts[i - 1].endsWith("]")) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t understand what’s the purpose of this if. All tests still pass if I remove it but I’m not sure whether I’m missing an edge case or it’s really unneeded.

- paths with multiple occurrences of //
- // preceding @Attribute
} else {
Collections.reverse(path);

// Deal with the two forward slashes in the expression; works, but I'm not proud of it.
if (expression.contains("//") && !expression.contains("://") && Arrays.stream(parts).anyMatch(StringUtils::isBlank)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing :// does not seem to break anything, but fixes XPaths with both // and a namespace URL condition.

I don’t think the first condition is still relevant either, however the isBlank check is inconsistent with the indexOf("") on the next line. I think it should use isEmpty() like I did to compute tagMatchingParts.

Note that the first condition is insufficient on its own because of URLs again – or actually any value that could contain double slashes in a condition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

1 participant