Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validation succeeds despite undefined entity #539

Closed
ghost opened this issue Jan 27, 2020 · 15 comments
Closed

Validation succeeds despite undefined entity #539

ghost opened this issue Jan 27, 2020 · 15 comments

Comments

@ghost
Copy link

ghost commented Jan 27, 2020

When running validation on DC-SLES-all in SUSE/doc-sle@305012007, the output is this:

> daps -d DC-SLES-all validate

[...]/doc-sle/xml/vnc.xml:296: parser error : Entity 'slesa' not defined
     <title>Remmina Viewing &slesa; &productnumber; Remote Session</title>
                                   ^
/home/sknorr-l/data/gits/doc-sle/xml/vnc.xml:296: parser error : Entity 'slesa' not defined
     <title>Remmina Viewing &slesa; &productnumber; Remote Session</title>
                                   ^
All files are valid.

-> There is a validation error but DAPS declares the document valid anyway and exits with code 0.

Frank remarks that in [daps-repo]/make/setfiles.mk:42, the CHECK_WELLFORMED variable appears to be empty in our error case.

--

Notably, this error somehow is specific to this revision of the doc-sle repo, I could not reproduce it by creating a fresh example document with doc-kit and then adding an undefined entity to it. It errored out correctly:

[outdated example removed]

EDIT: removed outdated information.

@ghost
Copy link
Author

ghost commented Jan 27, 2020

Update: The error always appears when the missing entity is used in an XML file that is not the MAIN file.

So, with my example document (doc-kit -> docbook5-book, then apply patch via git am), I can now reproduce:

> daps -d DC-example  validate

[...]/blubsi/xml/cha_example_one.xml:22: parser error : Entity 'bogus' not defined
    What is this chapter about? &bogus;
                                       ^
[...]/blubsi/xml/cha_example_one.xml:22: parser error : Entity 'bogus' not defined
    What is this chapter about? &bogus;
                                       ^
All files are valid.

Notes:

  • It does not matter whether the document is profiled.

  • When the undefined entity is used in the MAIN file, DAPS errors out (= correct).

@ghost
Copy link
Author

ghost commented Feb 12, 2020

@tomschr I can still reproduce the issue with the develop branch of DAPS, unfortunately.

@ghost ghost reopened this Feb 12, 2020
@tomschr
Copy link
Collaborator

tomschr commented Feb 12, 2020

@sknorr Thanks for the feedback. That's a bit strange. It works for Frank and me. Are you sure you used the right environment?

I'm curious, what do you get with the following call:

$YOUR_DAPS_REPO/libexec/daps-xmlwellformed  -W --xinclude YOUR_XML_FILE

@ghost
Copy link
Author

ghost commented Feb 12, 2020

I used [repopath]/bin/daps --dapsroot [repopath] -d DC-example validate

@ghost
Copy link
Author

ghost commented Feb 12, 2020

Also:

> [repopath]/libexec/daps-xmlwellformed -W --xinclude xml/MAIN.book-example.xml 
> echo $?
0

@tomschr
Copy link
Collaborator

tomschr commented Feb 13, 2020

Thanks. I suspect, that is caused by DocBook 4 DTD somehow...

@tomschr
Copy link
Collaborator

tomschr commented Feb 13, 2020

Ok, this is weird. I've used your doc-kit, applied the mentioned patch. So everything looks ok, including the bogus entity.

I used ddaps, a function to my DAPS GitHub repo:

function ddaps () {
 echo "****** DAPS Developer Version ******"
 local D=/local/doc/daps
 $D/bin/daps --dapsroot $D $@
}

With ddaps, it makes sure that it picks the latest daps-xmlwellformed script. When I run ddaps on my test repo, I should get an error which it does:

$ ddaps -v -d DC-example validate
****** DAPS Developer Version ******
/local/doc/daps/make/setfiles.mk:45: *** Fatal error:
/tmp/testy-db-xml/xml/cha_example_one.xml:24: Warning: Entity 'bogus' not defined   &bogus;.  Schluss.
$ echo $?
1

So the error code is 1.

I've tried it to uncomment &bogus; and to move it to the MAIN file. Different error messages, of course, but the result is almost the same (error code 2 instead of 1).

Hmn... 🤔

@ghost
Copy link
Author

ghost commented Feb 13, 2020

Toms and I sat together for a bit and it appears the issue is:

  • LXML 4.0 is broken for our use case (on Leap 15.1)
  • LXML 4.33 is fixed for our use case (on Tw and devel:languages:python)

It seems reasonable working around the limitations though [to me, not speaking for toms].

@tomschr
Copy link
Collaborator

tomschr commented Feb 13, 2020

I've went through the lxml Changelog. One item looks promising (4.2.0, 2018-03-13):

LP#1551797: Some XSLT messages were not captured by the transform error log.

@ghost ghost changed the title Validation succeeds despite non-defined entity Validation succeeds despite undefined entity Mar 5, 2020
@fsundermeyer
Copy link
Member

Closing this. Current distributions should no longer have the faulty lxml lib.

@ghost
Copy link
Author

ghost commented Jul 8, 2021

This is still an issue with our new Leap 15.2 runners on GitHub Actions:
https://github.com/SUSE/doc-sle/runs/2975954334#step:4:78 (SUSE/doc-sle@74b81dd)

> Validating ./DC-SLES-all
  ---------------
  
          DAPS VERSION: 3.2.0
[...]
    ---------------
  /github/workspace/xml/smt_product_overview.xml:24: parser error : Entity 'ismt' not defined
     &ismt; allows you to provision updates for all of your devices
           ^
  /github/workspace/xml/smt_product_overview.xml:24: parser error : Entity 'ismt' not defined
     &ismt; allows you to provision updates for all of your devices
           ^
[...]
  Validating...
  All files are valid.

We're using LXML 4.4.2 there. (And the mention of LXML 4.33 looks to be a typo'd version of 4.3.3, because the newest version is 4.6.3 only.)

@fsundermeyer
Copy link
Member

Did a little debugging (python-lxml 4.6.3):

  • daps-xmlwellformed catches the error but only issues a warning. However we are calling this script with PYTHONWARNINGS="ignore" because of DAPS 3.1 regression: If profiled-away XIncluded document does not exist, DAPS fails  #617 This results in no error and no output.
  • creating the list of setfiles with daps-xslt/common/get-all-used-files.xsl reports the faulty entity (as a non fatal warning, I assume)--this is the first "error" message you see when running daps validate
  • the second (non-fatal) error message (a duplicate of the first one) is very likely issued by the profiling stylesheets

Since the error message looks the same in all three cases (run the first one without
PYTHONWARNINGS="ignore") I assume it comes from the libxml. We need to find a way to configure this specific case as error rather than warning. I'd rather prefer not to do this in make, daps-xmlwellformed should do this.

However, I can see why this is not considered an error upstream--it does not seem to affect parsing and building, so a warning seems appropriate.

@ghost
Copy link
Author

ghost commented Jul 13, 2021

@tomschr Can we disambiguate between entity issues and profiling issues in daps-xmlwellformed?

@tomschr
Copy link
Collaborator

tomschr commented Jul 14, 2021

I did a little debugging too.

When I run daps-xmlwellformed on a XML file with an unknown entity, I get this:

$ daps-xmlwellformed tests/bad/test-undef-entity.xml
ERROR: Entity 'unknown' not defined, line 3, column 16 (test-undef-entity.xml, line 3)
       tests/bad/test-undef-entity.xml:3:16:FATAL:PARSER:ERR_UNDECLARED_ENTITY: Entity 'unknown' not defined
$ echo $?
10

The return code clearly shows there was something wrong. I'm not sure why this is isn't caught by make. This is the snippet where the script is used:

# file make/setfiles.mk
CHECK_WELLFORMED := $(shell PYTHONWARNINGS="ignore" $(LIBEXEC_DIR)/daps-xmlwellformed --xinclude $(MAIN) 2>&1 )

ifdef CHECK_WELLFORMED
  $(error Fatal error:$(\n)$(CHECK_WELLFORMED))
endif

Can we disambiguate between entity issues and profiling issues in daps-xmlwellformed?

The script does resolve XIncludes only (if requested). It doesn't do any profiling step.

ghost pushed a commit that referenced this issue Aug 6, 2021
We don't rely on the XSLT stylesheet anymore. It turned out
that it wasn't reporting entity errors etc. reliably. The script
now uses tree.xinclude() (built-in) and parses log messages to
categorize them as warnings or errors. This also differentiating
between entity and XInclude errors.

Requires lxml=>4.4.2 now.

Co-authored-by: Frank Sundermeyer <fs@suse.de>
Co-authored-by: Stefan Knorr <sknorr@suse.de>
ghost pushed a commit that referenced this issue Aug 6, 2021
This test is better suited to a bash script and doing the check early
does not hurt.
@ghost
Copy link
Author

ghost commented Aug 6, 2021

Fixed! Thanks a lot, Toms & Frank!
Commits: ff1267e & e3d97ff

@ghost ghost closed this as completed Aug 6, 2021
quatran added a commit to quatran/daps that referenced this issue Jan 26, 2022
* Remove file permission change to a no longer existant file in makefile.am
* Remove dbcentx entities in entity-decl.ent.in

Fixup for openSUSE#539

Co-authored-by: Stefan Knorr <sknorr@suse.de>
ghost pushed a commit that referenced this issue Feb 3, 2022
We don't rely on the XSLT stylesheet anymore. It turned out
that it wasn't reporting entity errors etc. reliably. The script
now uses tree.xinclude() (built-in) and parses log messages to
categorize them as warnings or errors. This also differentiating
between entity and XInclude errors.

Requires lxml=>4.4.2 now.

Co-authored-by: Frank Sundermeyer <fs@suse.de>
Co-authored-by: Stefan Knorr <sknorr@suse.de>
ghost pushed a commit that referenced this issue Feb 3, 2022
This test is better suited to a bash script and doing the check early
does not hurt.
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants