Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VM-Packages: Adding support for detecting musl-based Linux #7624

Merged

Conversation

snoopcheri
Copy link
Contributor

This PR adds support for detecting musl-based Linux operating systems, when the NativeLibLoader tries to determine the path of the XGBoost4j shared library to load.

The solution for detecting a musl-based Linux system is based on how the the sqlite-jdbc project does it here: it lists the memory-mapped files of a Linux system and checks whether any name contains the string "musl". If that's the case, the OS is considered a musl-based Linux system.

The path of the shared library for a regular Linux-based system for x86_64 architecture is still linux/x86_64/libxgboost4j.so. For a musl-based system this would now be linux-musl/x86_64/libxgboost4j.so.

This even works for Arm-based versions, where the path would be linux-musl/aarch64/libxgboost4j.so.

Note: This feature was initially discussed in the public forums here.

@snoopcheri
Copy link
Contributor Author

@Craigacp I've integrated your feedback. Please have another look. Thanks! :)

@Craigacp
Copy link
Contributor

Looks good. Could you update the error messages around here too - https://github.com/dmlc/xgboost/blob/master/jvm-packages/xgboost4j/src/main/java/ml/dmlc/xgboost4j/java/NativeLibLoader.java#L132? I'm not sure what the situation is wrt musl and OpenMP, but if you get an UnsatisfiedLinkError at that line then it's not likely to be an OpenMP issue so the error message is probably wrong.

Also it might be nice to log the detected platform at level fine or lower as now it's computed rather than being a direct function of the JVM properties (as if I've mmapped a library where the filename has "musl" in then I'll get an extremely impenetrable error message out of XGBoost if I'm actually using glibc, and logging the detected platform/OS will make it more obvious whats going on).

The debug log also shows the filename which causes us to consider
the Linux OS as musl-based.
In case of Linux, we now additionally mention that the MUSL-detection
might have gone wrong.
@snoopcheri
Copy link
Contributor Author

@Craigacp

Sorry for the delay! But real life and stuff.. :)

I've added the following things:

  • 6afc19f adds a debug log which prints the detected filename that lead the system to believe it's musl-based
  • 2e13788 is my attempt to give better hints to the end user, in case the shared library cannot be loaded. For Linux-based systems, it now also mentions that the musl-detection might have gone wrong. I'm not really sure what OpenMP AND musl really means. So I didn't go into details here.

Lemme know what you think!

Copy link
Contributor

@Craigacp Craigacp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @trivialfis @CodingCat could one of you please review & merge this? It'll make it easier to deploy XGBoost4J in docker containers which use musl. We've hit this issue before in Tribuo and have a note in our docs that XGBoost4J doesn't work on top of musl libc, so it'll be nice to have a real solution available (even if users have to compile it themselves).

Copy link
Member

@trivialfis trivialfis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix! LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants