Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libxml2 version check #299

Merged
merged 3 commits into from
Mar 27, 2024
Merged

Conversation

jonathangreen
Copy link
Contributor

@jonathangreen jonathangreen commented Mar 22, 2024

I believe this should resolve #283.

I've encountered ongoing issues with xmlsec builds on certain platforms using the latest code. While troubleshooting, I utilized #289, but it didn't consistently provide the necessary information.

Previously, get_libxml_version returned the version of libxml2 that xmlsec was compiled against, which lxml refers to as LIBXML_COMPILED_VERSION. I've introduced a new function called get_libxml_compiled_version to supply this information. Additionally, I've updated get_libxml_version to return the currently loaded version of libxml2, similar to LIBXML_VERSION from lxml.

To address #283, I added a new check that occurs when xmlsec is imported:

  • Verification of matching major and minor versions from get_libxml_compiled_version and get_libxml_version. Removed see libxml2 version check #299 (comment)
  • Verification of matching major and minor versions from lxml.etree.LIBXML_VERSION and get_libxml_version.

If these checks fail, an xmlsec.Error is raised, providing relevant details. This should make it easier to troubleshoot when library versions are mismatched, rather then trying to trace down where a segfault came from.

- Expose both the compiled version and the linked version of libxml2

- Do a check that the versions match when initializing the module.
  Otherwise raise an exception. This seems like a better result then
  a difficult to diagnose segfault.
@nosnilmot
Copy link
Contributor

The check for lxml-loaded-libxml2 vs. xmlsec-loaded-libxml2 may prevent version mismatch problems, but I'm not sure it counts as a solution to #283 - my interpretation of that ticket is to actually make mixed versions work reliably.

I think the PyXmlSec_CheckLibXmlLibraryVersion() check for compile vs. runtime version of libxml2 is overly restrictive and unnecessary, regular ABI compatibility for compile vs. runtime is already handled by libxml2 soname. Are there really any reported segfaults from this mismatch rather than the above?

@jonathangreen
Copy link
Contributor Author

Your criticism is valid; indeed, what we have here is more of a mitigation measure rather than a definitive solution. For #283, I thought having some checks in place was safer then no checks at all, so this PR met the criteria for resolving that issue.

I think the PyXmlSec_CheckLibXmlLibraryVersion() check for compile vs. runtime version of libxml2 is overly restrictive and unnecessary, regular ABI compatibility for compile vs. runtime is already handled by libxml2 soname.

I don't believe this is the case. In CI right now, you can see examples of both of these checks failing. The broken CI workflows are inadvertently providing a test case for this PR 😄 Without these checks in place the tests instead fail with Fatal Python Errors or segfaults. I think this is proof that the check does not duplicate a check already being done by libxml2.

Example of PyXmlSec_CheckLibXmlLibraryVersion failing is in the macosx workflow:

An example of PyXmlSec_CheckLxmlLibraryVersionfailing is the sdist workflow:

In both cases, I think the results with this PR in place are preferable to the Python Errors that occur otherwise. They are more informative, and point more directly to the cause of the issue. I do agree with you though, it would be a better solution if the library mismatch were just seamlessly handled. I, unfortunately, don't have the expertise to implement a solution like that and I thought the checks in this PR provided a helpful mitigation.

I am totally open to feedback on refining these checks or closing this PR in favor of a better solution to #283.

@nosnilmot
Copy link
Contributor

Example of PyXmlSec_CheckLibXmlLibraryVersion failing is in the macosx workflow:

But that test is installing lxml from a wheel:

Collecting lxml!=4.7.0,>=3.8.0 (from -r requirements.txt (line 1))
  Using cached lxml-5.1.0-cp37-cp37m-macosx_10_9_x86_64.whl.metadata (3.5 kB)

So isn't the failure likely caused by a mismatch between libxml2 version used by lxml vs xmlsec rather than the mismatch between compile-time and run-time version for xmlsec? i.e. it would be caught by the PyXmlSec_CheckLxmlLibraryVersion check too.

@jonathangreen
Copy link
Contributor Author

@nosnilmot that is a good point. I pushed a new commit that removes the PyXmlSec_CheckLibXmlLibraryVersion check you mentioned and you are right, the second check catches the CI issue. Thanks for the great feedback on this one.

@jimjag jimjag merged commit 1525fe0 into xmlsec:master Mar 27, 2024
0 of 31 checks passed
Copy link
Contributor

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks. This should make it much less likely that users run into crashing surprises.

if (version == NULL) {
goto FINALIZE;
}
if (!PyTuple_Check(version) || PyTuple_Size(version) != 3) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably enough to check "size < 2" here, if all you look at is the first two entries.

Copy link
Contributor

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't looked at the exception handling part.

Comment on lines +74 to +83
PyObject* major = PyTuple_GetItem(version, 0);
PyObject* minor = PyTuple_GetItem(version, 1);

if (!PyLong_Check(major) || !PyLong_Check(minor)) {
goto FINALIZE;
}

if (PyLong_AsLong(major) != PyXmlSec_GetLibXmlVersionMajor() || PyLong_AsLong(minor) != PyXmlSec_GetLibXmlVersionMinor()) {
goto FINALIZE;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this should usually work, users can easily assign tuples with large integers to etree.LIBXML_VERSION, which would then make PyLong_AsLong() fail here and the code continues with an exception set.

I suggest adding an exception check also to the PyTuple_GetItem() calls above, because why not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that PyLong_Check handled this case, but re-reading the docs now I see that it doesn't. Thanks for the feedback here.

Comment on lines +87 to +92
FINALIZE:
// Cleanup our references, and return the result
Py_XDECREF(lxml);
Py_XDECREF(version);
return result;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should call PyErr_Clear(), in case we got here due to an exception.

@jonathangreen
Copy link
Contributor Author

@scoder Thank you for the review on this. I created a small follow up PR that incorporates your feedback (#300), since this one was already merged. If you have time, can you do a quick review of that PR as well and make sure I correctly captured your feedback?

@jimjag once @scoder has a change to take a look at #300, it might be good to get that one merged as well before a release goes out.

@jonathangreen jonathangreen deleted the feature/version-check branch March 27, 2024 12:50
@scoder
Copy link
Contributor

scoder commented Mar 27, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Find a safer way to integrate with lxml
4 participants