Better XGBoost installation on Mac OSX? #4477

Closed
adithyabsk opened this issue May 17, 2019 · 28 comments · Fixed by #5146

@adithyabsk

adithyabsk commented May 17, 2019

Issue

Currently, on macOS, the process to install the Python package is as follows:

$ brew install gcc@5
$ export CC=/path/to/gcc-5; export CXX=/path/to/g++-5; pip install xgboost

Question

I would like to learn from a more experienced contributor whether there are any plans to simplify this install process. The above is not tenable for an automated install system for any package that depends on xgboost. What would be required to make xgboost compatible with Apple's clang?

@hcho3
Collaborator

hcho3 commented May 17, 2019

Apple’s clang doesn’t support OpenMP out of the box, hence Homebrew GCC is needed. So, no, XGBoost will not be compatible with Apple’s clang.

I think we can simplify the process by distributing binary wheels for Mac OSX. The binary wheels will contain pre-built libxgboost.dylib so that the user won’t need to have any compiler. (This is how Windows users don’t need to have Visual Studio installed to use XGBoost.)

However, I’m afraid that the maintainers (including myself) currently are not familiar with binary packaging with Mac OSX, i.e. how to make binaries that would be broadly compatible across multiple versions of OSX. Do you have any suggestions here?

For now, you should consider using conda-forge to automate XGBoost installation on Mac OSX.

@adithyabsk
Author

@hcho3 thanks for your quick response! Conda certainly is an option but it would be much simpler to use pip. I'll look into what binary packaging on macos would look like. I am also not familiar with binary packaging, so anyone else's input who has experience in that area would be much appreciated.

@Craigacp
Contributor

Craigacp commented Jun 13, 2019

I've had some difficulty with this issue as the dylib produced by the standard compilation process has a hard dependency on the homebrew gcc's libraries. If anyone has a way of changing that dependency after compilation (or making it generic across gcc versions) that would be great, but I don't think macOS ships with libgomp (which provides the OpenMP support) so we might need to package that as well, which makes life difficult.

@adithyabsk
Author

adithyabsk commented Jun 29, 2019

@Craigacp @hcho3 Is this something we could consider until a CMakeLists workaround is found: netket/netket#225 (comment)? I'm not super familiar with the internals of xgboost. How critical is OpenMP to the performance of the library?

This also seemed promising but I wasn't able to get it to work: https://stackoverflow.com/questions/46414660/macos-cmake-and-openmp.

@hcho3
Collaborator

hcho3 commented Jul 3, 2019

@adithyabsk @Craigacp OpenMP is very much critical for the performance of XGBoost, since we want to use all available cores of multi-core CPUs commonly available on users' systems. Without OpenMP, you'd be able to use only one CPU core.

IMHO, pip is not designed for handling such external dependencies as libomp. On the other hand, conda is able to handle non-Python dependencies just as easily. See this post: https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/

@hcho3
Collaborator

hcho3 commented Jul 3, 2019

How Microsoft/LightGBM solves the problem: they ask users to run brew install libomp. I'm not sure if this is any easier than installing GCC or Conda, since you will need to first install Homebrew.

@adithyabsk
Author

@hcho3 the brew install libomp solution might be better, as it can be placed in a pre-install setup script, whereas currently one has to separate out xgboost in CI pipelines to specify the appropriate gcc and g++ versions. I definitely agree with you as far as conda goes, and that might end up being the only solution, but I wanted to explore the other options to see if anything else was possible.

Sorry for the silly question, but is OpenMP required at runtime? For example, could we compile dmlc-core and xgboost with OpenMP installed and then bundle the resulting library into a wheel, using a tool like auditwheel, so that compilation wouldn't be necessary at install time?

https://stackoverflow.com/a/42106034

@hcho3
Collaborator

hcho3 commented Jul 3, 2019

@adithyabsk I just tried using brew install libomp and now I'm able to compile XGBoost with the default compiler, Apple Clang:

brew install libomp
mkdir build
cd build
cmake ..
make -j10

What's more, the resulting binary libxgboost.dylib depends only on /usr/local/opt/libomp/lib/libomp.dylib and OSX system libs. (No more dependency on specific version of GCC! Hooray!) So I suppose brew install libomp is the least painful way to install XGBoost on Mac OSX without Conda.

Distributing pre-compiled binaries is still tricky, however. Even if we were to include libomp.dylib inside the wheel, Mac OSX will not use the file, since shared library dependency is specified with full path:

hcho3@localhost: xgboost$ otool -l libxgboost.dylib    # show list of library dependencies

libxgboost.dylib:
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
 0xfeedfacf 16777223          3  0x00           6    15       2112 0x00918085
....
Load command 10
          cmd LC_LOAD_DYLIB
      cmdsize 64
         name /usr/local/opt/libomp/lib/libomp.dylib (offset 24)
   time stamp 2 Wed Dec 31 16:00:02 1969
      current version 5.0.0
compatibility version 5.0.0
Load command 11
          cmd LC_LOAD_DYLIB
      cmdsize 48
         name /usr/lib/libc++.1.dylib (offset 24)
   time stamp 2 Wed Dec 31 16:00:02 1969
      current version 400.9.0
compatibility version 1.0.0
Load command 12
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libSystem.B.dylib (offset 24)
   time stamp 2 Wed Dec 31 16:00:02 1969
      current version 1252.50.4
compatibility version 1.0.0

On the other hand, Windows is more flexible when it comes to locating shared libraries. I found it sufficient to simply include vcomp140.dll (OpenMP runtime) inside the wheel.
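
The load-command listing above can also be checked programmatically. Below is a minimal Python sketch, not part of XGBoost, that parses text in the otool -l format shown and reports which LC_LOAD_DYLIB entries point outside the locations that ship with the OS. The function names and the SYSTEM_PREFIXES list are assumptions made for illustration, and paths containing spaces are not handled.

```python
from typing import List

# Directory prefixes treated as "provided by macOS" (assumption for this sketch).
SYSTEM_PREFIXES = ("/usr/lib/", "/System/")


def load_dylib_names(otool_output: str) -> List[str]:
    """Return the `name` field of every LC_LOAD_DYLIB load command."""
    names = []
    in_load_dylib = False
    for line in otool_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "cmd":
            # A new load command starts; remember whether it is a dylib dependency.
            in_load_dylib = len(fields) > 1 and fields[1] == "LC_LOAD_DYLIB"
        elif in_load_dylib and fields[0] == "name" and len(fields) > 1:
            names.append(fields[1])
            in_load_dylib = False
    return names


def non_system_dependencies(otool_output: str) -> List[str]:
    """Return dependencies that would have to be installed or bundled separately."""
    return [
        name
        for name in load_dylib_names(otool_output)
        if not name.startswith(SYSTEM_PREFIXES)
    ]


if __name__ == "__main__":
    sample = """\
Load command 10
          cmd LC_LOAD_DYLIB
      cmdsize 64
         name /usr/local/opt/libomp/lib/libomp.dylib (offset 24)
Load command 11
          cmd LC_LOAD_DYLIB
      cmdsize 48
         name /usr/lib/libc++.1.dylib (offset 24)
"""
    print(non_system_dependencies(sample))
```

For the listing in this comment, the only entry such a check would flag is the Homebrew libomp path, which is the dependency a wheel would need to deal with.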

@hcho3
Collaborator

hcho3 commented Jul 3, 2019

@hetong007 Related note: brew install libomp should also enable multi-threading for CRAN XGBoost on Mac OSX

@hetong007
Member

@hcho3 I think so. The XGBoost R package calls the same backend API and thus should behave the same.

@adithyabsk
Author

@hcho3 That's an awesome development! This is already moving in the right direction; I can attest that in a lot of R&D labs, installing xgboost is a pain point for those not intimately familiar with its internal requirements.

Following up on this note:

> Mac OSX will not use the file, since shared library dependency is specified with full path

Maybe we could look more into this particular issue to see if there are any workarounds to get the libomp.dylib into the binary wheel.

@adithyabsk
Author

@hcho3 Could it also be because of the extension itself? Should we be using .so on macOS as well? This Stack Overflow post and issue thread seem to indicate so:
https://stackoverflow.com/questions/2488016/how-to-make-python-load-dylib-on-osx
MoDeNa-EUProject/MoDeNa#1

hcho3 pinned this issue Jul 4, 2019
hcho3 changed the title from "Fixing python install on macos?" to "Better XGBoost installation on Mac OSX?" Jul 4, 2019
@hcho3
Collaborator

hcho3 commented Jul 4, 2019

@adithyabsk Given the complexity of shipping the runtime library in the wheel (and getting it to load), let's settle on brew install libomp.

  • Homebrew is quite widely used already among power users (I think).
  • With libomp, we can use Apple Clang to compile XGBoost, thus eliminating hard dependency on a specific version of Homebrew GCC.
  • This method has been verified by other projects, such as LightGBM.

Ps. I'm looking at https://iscinumpy.gitlab.io/post/omp-on-high-sierra/ to understand the use of OpenMP in Apple Clang.

@StrikerRUS
Contributor

@hcho3

> Ps. I'm looking at https://iscinumpy.gitlab.io/post/omp-on-high-sierra/ to understand the use of OpenMP in Apple Clang.

These PRs may help you:
microsoft/LightGBM#1501, microsoft/LightGBM#1923.

@hcho3
Collaborator

hcho3 commented Jul 18, 2019

@adithyabsk This is one of my priorities. I'd like to make a fix before the 1.0.0 release.

@adithyabsk
Author

adithyabsk commented Jul 18, 2019

@hcho3 Glad to hear it! I'll see if I can tinker around with this issue as well.

@hcho3
Collaborator

hcho3 commented Jul 18, 2019

@adithyabsk One subtle problem I've run into with brew install libomp is that XGBoost would be compiled without OpenMP, because CMakeLists.txt was not configured correctly. (I could tell by running a moderately heavy job on my Macbook; without OpenMP, jobs take 2-3x as long.) I'm trying to revise CMakeLists.txt to properly enable OpenMP.

@StrikerRUS Thanks for the link. Getting the build system working is quite hard, and it helps me a lot to have a point of reference (LightGBM).
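
Timing a job is one way to notice a build without OpenMP. A complementary check, sketched below in Python, is to see whether any library the build links against looks like an OpenMP runtime. The sketch assumes the dependency list has already been obtained (for example with otool -L libxgboost.dylib). The helper name and the list of runtime name prefixes (LLVM's libomp, GNU's libgomp, Intel's libiomp) are illustrative assumptions. A match only shows that a runtime is linked, not that the library actually runs in parallel.

```python
import posixpath
from typing import Iterable

# File-name prefixes of common OpenMP runtimes (assumption for this sketch).
OPENMP_RUNTIME_PREFIXES = ("libomp", "libgomp", "libiomp")


def links_openmp_runtime(dependencies: Iterable[str]) -> bool:
    """Return True if any dependency path names a known OpenMP runtime."""
    return any(
        posixpath.basename(dep).startswith(OPENMP_RUNTIME_PREFIXES)
        for dep in dependencies
    )


if __name__ == "__main__":
    deps = [
        "/usr/local/opt/libomp/lib/libomp.dylib",
        "/usr/lib/libc++.1.dylib",
        "/usr/lib/libSystem.B.dylib",
    ]
    print(links_openmp_runtime(deps))
```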

@wel51x

wel51x commented Jul 28, 2019

> @adithyabsk One subtle problem I've run into with brew install libomp is that XGBoost would be compiled without OpenMP, because CMakeLists.txt was not configured correctly. (I could tell by running a moderately heavy job on my Macbook; without OpenMP, jobs take 2-3x as long.) I'm trying to revise CMakeLists.txt to properly enable OpenMP.

Any luck? The reason I ask is that "pip install xgboost -U" fails even after installing libomp via "brew install libomp".

@hcho3
Collaborator

hcho3 commented Sep 12, 2019

@wel51x We have not yet modified CMakeLists.txt to get the new solution working. For now, you should follow the instructions in https://xgboost.readthedocs.io/en/latest/build.html.

@adithyabsk @Craigacp I found https://github.com/matthew-brett/delocate. This may be a useful solution to remove hard-coded library dependencies.
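
The general idea behind removing a hard-coded dependency is to ship a copy of the library alongside libxgboost.dylib and rewrite the recorded path so that it is resolved relative to the library that loads it. The Python sketch below only builds the install_name_tool -change command for that rewrite; it does not run it. The helper name relocate_command and the bundled_dir parameter are hypothetical, and whether delocate performs exactly these steps is an assumption here rather than something confirmed in this thread.

```python
import posixpath
from typing import List


def relocate_command(library: str, old_dep: str, bundled_dir: str = ".") -> List[str]:
    """Build an install_name_tool call that repoints `library` at a bundled copy.

    `old_dep` is the absolute path currently recorded in `library`.
    `bundled_dir` is where the copy lives, relative to `library` itself.
    """
    # @loader_path is expanded by the macOS dynamic loader to the directory
    # containing the binary that holds the load command.
    new_dep = posixpath.normpath(
        posixpath.join("@loader_path", bundled_dir, posixpath.basename(old_dep))
    )
    return ["install_name_tool", "-change", old_dep, new_dep, library]


if __name__ == "__main__":
    cmd = relocate_command(
        "libxgboost.dylib", "/usr/local/opt/libomp/lib/libomp.dylib"
    )
    print(" ".join(cmd))
```

On macOS, the returned list could be passed to subprocess.run(cmd, check=True) once a copy of libomp.dylib has been placed in the chosen directory.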

@alexvorobiev

In case somebody finds it helpful... I understand that it is by no means a mainstream approach, but the latest xgboost with OpenMP support can be installed on MacOS using Nix (https://nixos.org/nix/) as trivially as

$ nix-shell -p python3Packages.xgboost

@ankane
Contributor

ankane commented Sep 14, 2019

Hey @hcho3, I created a Homebrew formula for XGBoost to help simplify installation on Mac, so users can run brew install xgboost in the future. It works great, but unfortunately it won't be accepted while it uses an older version of GCC.

Discussion: Homebrew/homebrew-core#43246

One option is to disable OpenMP, but as you mentioned, it's not great for performance. If you're able to commit the changes to make it work with libomp, I can update the formula and we can push this forward.

@wel51x

wel51x commented Sep 17, 2019

Thanks for the updates.

@ankane
Contributor

ankane commented Sep 28, 2019

FWIW, I updated the formula so that it no longer depends on GCC, but it now lacks OpenMP support. We can update it once support for libomp is released.

@ankane
Contributor

ankane commented Oct 7, 2019

The formula was accepted by Homebrew, so Mac users can now do:

brew install xgboost

@bnicholl

I used brew install xgboost but I still can't import XGBoost. There's no __init__.py file or anything inside the actual newly installed XGBoost directory, so I can't use any of the XGBoost functions. Is there another step after using brew to install XGBoost?

@hcho3
Collaborator

hcho3 commented Oct 23, 2019

@bnicholl See #4949 (comment) for a temporary solution.

@StrikerRUS
Contributor

@hcho3

> Thanks for the link. Getting the build system working is quite hard, and it helps me a lot to have a point of reference (LightGBM).

With the upcoming CMake 3.16 release (in RC phase now) it should be even easier: users on Mojave or later will no longer need to pass extra args. Refer to https://gitlab.kitware.com/cmake/cmake/merge_requests/3916.

@hcho3
Collaborator

hcho3 commented Dec 23, 2019

@adithyabsk @Craigacp #5146 should now let you use OpenMP without installing Homebrew GCC. Now XGBoost will only depend on the libomp Homebrew package.

@ankane Consequently, we should be able to submit the next release of XGBoost (1.0) with OpenMP enabled.
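
With libomp as the only external requirement, an install or CI script can check for it up front and print a clear hint instead of failing later at import time. This is a minimal Python sketch, assuming the Homebrew location seen in the otool output earlier in this thread. The helper name find_libomp, the candidate list, and the injectable exists predicate are illustrative choices, and other Homebrew prefixes would need to be added to the list.

```python
import os
from typing import Callable, Iterable, Optional

# Location reported earlier in the thread for a default Homebrew install.
DEFAULT_CANDIDATES = ("/usr/local/opt/libomp/lib/libomp.dylib",)


def find_libomp(
    candidates: Iterable[str] = DEFAULT_CANDIDATES,
    exists: Callable[[str], bool] = os.path.isfile,
) -> Optional[str]:
    """Return the first candidate path for which `exists` is true, else None."""
    for path in candidates:
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    found = find_libomp()
    if found is None:
        print("libomp not found; on macOS try: brew install libomp")
    else:
        print(f"using OpenMP runtime at {found}")
```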

lock bot locked as resolved and limited conversation to collaborators Mar 27, 2020
trivialfis removed the 1.0.0 label Dec 17, 2020