Should XPath selecting down from root node / be allowed in assert test XPath expression? #386

Closed
martin-honnen opened this issue Jan 22, 2024 · 7 comments


@martin-honnen

As far as I understand it, the XSD 1.1 specification restricts the allowed XPath expressions: you can only access the subtree of the element/type you are putting an assert(ion) on.

However, it seems that XmlSchema doesn't implement such restrictions. For instance, a schema like the one shown below, with an XPath like count(/foods/food[@type='fruit']) eq /foods/recon/@fruits that selects down from the root node /, is not rejected, whereas it seems to be rejected by other XSD 1.1 validators (Xerces and Saxon EE).

Is it an intentional feature of XmlSchema that it doesn't impose restrictions on XPath expressions?

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning" elementFormDefault="qualified" attributeFormDefault="unqualified" vc:minVersion="1.1">
<xs:element name="food" type="foodType"/>
<xs:complexType name="foodType">
    <xs:sequence>
        <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="type">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:enumeration value="meat"/>
                <xs:enumeration value="vegetable"/>
                <xs:enumeration value="fruit"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:attribute>
</xs:complexType>
<xs:element name="foods">
    <xs:annotation>
        <xs:documentation>Comment describing your root element</xs:documentation>
    </xs:annotation>
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="food" maxOccurs="unbounded"/>
            <xs:element ref="recon"/>
        </xs:sequence>
        <xs:assert test="count(/foods/food[@type='fruit']) eq /foods/recon/@fruits"/>
    </xs:complexType>
</xs:element>
<xs:element name="recon" type="reconType"/>
<xs:complexType name="reconType">
    <xs:attribute name="fruits" type="xs:integer"/>
    <xs:attribute name="vegetables" type="xs:integer"/>
    <xs:attribute name="meats" type="xs:integer"/>
</xs:complexType>
</xs:schema>
@brunato
Member

brunato commented Jan 22, 2024

Hi,
in the paragraph "G.1.4 Assertions and XPath", the use of XPath in assertions seems to have no restrictions on syntax.

More specifically, subsection 3.b.i says:

i. When assertions on a complex type are evaluated, only the subtree rooted in an element of that type is mapped into the data model instance. References to ancestor elements or other nodes outside the subtree are not illegal but will not be effective.

So the expression is not invalid; it just selects nothing outside the scope.

@martin-honnen
Author

But it appears that XmlSchema does apply a rule such as <xs:assert test="count(/foods/food[@type='fruit']) eq /foods/recon/@fruits"/>, because a sample like

<foods>
<food type="meat">
    <name>Chicken</name>
</food>
<food type="meat">
    <name>Beef</name>
</food>
<food type="meat">
    <name>Pork</name>
</food>
<food type="fruit">
    <name>Banana</name>
</food>
<food type="fruit">
    <name>Apple</name>
</food>
<food type="vegetable">
    <name>Carrot</name>
</food>
<recon vegetables="1" fruits="2" meats="3"/>
</foods>

is assessed as valid while a sample like

<foods>
<food type="meat">
    <name>Chicken</name>
</food>
<food type="meat">
    <name>Beef</name>
</food>
<food type="meat">
    <name>Pork</name>
</food>
<food type="fruit">
    <name>Banana</name>
</food>
<food type="fruit">
    <name>Apple</name>
</food>
<food type="vegetable">
    <name>Carrot</name>
</food>
<recon vegetables="1" fruits="3" meats="3"/>
</foods>

is assessed as invalid.
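
To make explicit what the assertion is checking, here is a minimal sketch that reproduces the same reconciliation using only the standard library and paths relative to the foods element (roughly count(food[@type='fruit']) eq recon/@fruits). It is not how XmlSchema evaluates assertions; fruits_reconcile and TEMPLATE are hypothetical names, and the sample is a shortened version of the documents above.

```python
import xml.etree.ElementTree as ET


def fruits_reconcile(xml_text):
    """Return True if the number of fruit items matches recon/@fruits."""
    # The <foods> element is the context item; every path below is relative to it.
    foods = ET.fromstring(xml_text)
    fruit_count = len(foods.findall("food[@type='fruit']"))
    declared = int(foods.find("recon").get("fruits"))
    return fruit_count == declared


# Shortened, hypothetical sample with two fruit entries.
TEMPLATE = """<foods>
  <food type="meat"><name>Chicken</name></food>
  <food type="fruit"><name>Banana</name></food>
  <food type="fruit"><name>Apple</name></food>
  <recon vegetables="0" fruits="{fruits}" meats="1"/>
</foods>"""

print(fruits_reconcile(TEMPLATE.format(fruits=2)))
print(fruits_reconcile(TEMPLATE.format(fruits=3)))
```

Because every path starts from the foods element, this formulation stays inside the subtree the assertion is attached to.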

@brunato
Member

brunato commented Jan 23, 2024

Ok, so the subtree is a fragment, not a document, according to their definition in the XDM.

The concept of a document is sometimes a bit confusing in ElementTree, e.g.:

>>> import lxml.etree as et
>>> root = et.XML('<root><elem1/><elem2/></root>')
>>> root.xpath('.')
[<Element root at 0x7f9c24dedb80>]
>>> root.xpath('/')
[]
>>> root.xpath('/root')
[<Element root at 0x7f9c24dedb80>]
>>> root.xpath('root')
[]

The same holds if you use an ElementTree instance:

>>> maybe_a_doc = et.ElementTree(root)
>>> maybe_a_doc.getroot()
<Element root at 0x7f9c24dedb80>
>>> maybe_a_doc.xpath('.')
[<Element root at 0x7f9c24dedb80>]
>>> maybe_a_doc.xpath('/')
[]
>>> maybe_a_doc.xpath('/root')
[<Element root at 0x7f9c24dedb80>]

Anyway, I could change the behavior either to reject absolute expressions or to evaluate them as relative (it might be better to reject them, as Xerces and Saxon EE do, to be explicit and avoid confusion).
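
For comparison only, the standard library's xml.etree.ElementTree happens to show both options side by side: an Element rejects an absolute path, while an ElementTree wrapper evaluates it as relative and warns. This sketch uses only stdlib calls and says nothing about how xmlschema or elementpath behave; the variable names are illustrative.

```python
import warnings
import xml.etree.ElementTree as ET

root = ET.XML('<root><elem1/><elem2/></root>')

# Option 1, reject: an Element refuses an absolute path with a SyntaxError.
try:
    root.findall('/root')
except SyntaxError as err:
    rejected = str(err)
else:
    rejected = None

# Option 2, evaluate as relative: an ElementTree rewrites '/elem1' to './elem1'
# before searching, and reports the rewrite through the warnings machinery.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    as_relative = ET.ElementTree(root).findall('/elem1')

print(rejected)
print([e.tag for e in as_relative])
print([w.category.__name__ for w in caught])
```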

@brunato
Member

brunato commented Feb 11, 2024

Xerces reports the root '/' as incorrect, but as a warning, not an error. I still have to try Saxon HE on this (note added: I cannot test it, because SaxonC-EE is needed for XSD validation).

Also, the two tests "d4_3_15ii31" and "d4_3_15ii32" of the W3C XML Schema 1.1 test suite report the schema as valid.

The annotation of these tests says:

<ts:annotation>
    <ts:documentation>"//" returns empty sequence</ts:documentation>
</ts:annotation>

So my preferred option is to generate a warning when the schema instance is parsed. The difference is that the XML data will be treated as a fragment (using elementpath>=4.2.1), so '/' and '//' will select nothing.
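
A rough sketch of what such a parse-time warning could look like, using only the standard warnings module. RootedPathWarning and warn_if_rooted are hypothetical names, not the class or function added to xmlschema, and the regular expression is a heuristic assumption; a real implementation would more likely inspect the parsed XPath token tree.

```python
import re
import warnings


class RootedPathWarning(UserWarning):
    """Hypothetical category for assertion tests that use '/' or '//' from the root."""


# String literals are blanked first so that a '/' inside quotes is ignored.
_STRING_LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"")
# Heuristic: a '/' is rooted when it opens the expression or follows
# whitespace, an opening bracket, a comma or an operator character.
_ROOTED = re.compile(r"(?:^|[\s(\[,=<>!|+])/")


def warn_if_rooted(test):
    """Warn and return True if the XPath test seems to contain a rooted path."""
    stripped = _STRING_LITERAL.sub("''", test)
    if _ROOTED.search(stripped):
        warnings.warn(
            f"assertion test {test!r} uses a rooted path that selects nothing "
            "in a subtree fragment",
            RootedPathWarning,
            stacklevel=2,
        )
        return True
    return False


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_if_rooted("count(/foods/food[@type='fruit']) eq /foods/recon/@fruits")
    warn_if_rooted("count(food[@type='fruit']) eq recon/@fruits")

print([w.category.__name__ for w in caught])
```

Emitting a warning instead of raising keeps schemas like the two W3C test cases loadable while still flagging the expression to the schema author.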

@martin-honnen
Copy link
Author

@brunato , thanks, yes, a warning and the change to ensure / and // don't select anything seems fine.

brunato added a commit that referenced this issue Feb 18, 2024
  - Group custom XPath parser in the new package
  - Add warning class for assertions
  - Resolution for issue #386
@brunato
Member

brunato commented Feb 19, 2024

@martin-honnen, a resolution with a warning message and an empty selection for rooted '/' and '//' is available in release v3.0.2.
Thanks

@martin-honnen
Author

@brunato , thanks.

2 participants