Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to scikit-build-core #4352

Closed
wants to merge 84 commits into from

Conversation

cringeyburger
Copy link
Contributor

@cringeyburger cringeyburger commented Aug 16, 2024

Description

This PR is about migrating to a new build system, scikit-build-core from setuptools and wheel, while incorporating necessary changes in GitHub Workflows, Benchmarks and Documentation.

CHANGES IN DISCUSSION:

  • Task 27: Use SUNDIALS installation script to set the environmental variables for windows --- We currently don't support Windows users to run nox -s pybamm-requires or install_KLU_Sundials.py script. Adding this feature would change a lot of things

CHANGES REQUESTED:

  • Task 22: Remove images from .gitignore
  • Task 23: Include the whole sundials_KLU_libs in .gitignore
  • Task 24: Remove import sys from bottom call in test_idaklu_solver
  • Task 25: Skip the parameter_sets test
  • Task 26: Change the version requirements given the scikit-build-core docs
  • Task 28: Update docs for DAE solver (remove extra text from the tabs)

CHANGES IN PROGRESS:

COMPLETED, need reviews:

  • Task 4: Remove pins on other build-time dependencies like CasADi --- unpinned all instances of build-time dependencies, including scikit-build-core in the direction to remove editable-rebuilds, or at least give a warning to the users before using it.
  • Task 5: Remove pip update commands from nox.
  • Task 13: Remove set CMAKE_GENERATOR="Visual Studio 17 2022" and set CMAKE_GENERATOR_PLATFORM=x64 and let CMake set it by itself.
  • Task 14: Set defaults for Windows for the remaining 4 environment variables in CMakeLists.txt. Use "if set, use that otherwise use default" logic for VCPKG_ROOT (rename VCPKG_ROOT_DIR to VCPKG_ROOT) --- Can not fix using CMake variables, have to set environment variables, hence wrote a batch script to set the env vars permanently for the Windows user. The CI still use pre-defined env vars because this avoids issues related to interactive scripts.
  • Task 19: Clean the find BLAS script
  • Task 3: Run tests, including IDAKLU ones, on all platforms (except doctests). --- random windows IDAKLU tests fail because of workers crashes. IREE tests had to be skipped because of attribute failure. Rest linux and macos tests are working.
  • Task 1: Check if the error in Doctests (pybamm.parameter_sets) can be resolved using an example instead of explicitly setting all the names. --- need to set all names
  • I unpinned scikit-build-core but will keep it as an option in the docs. I will write a warning saying that it is experimental.
    • Task 2: Set the scikit-build-core pin to 0.10.5 from 0.10.3
    • Task 6: Write a clearer message about editable-rebuild command, which is supposed to help developers using partial rebuilds (it was written to use that command to contribute to PyBaMM).
    • Task 7: Remove the verbose flag from the editable-rebuilds command
  • Task 12: Set python/python3 instead of python3.X in TOML kit installation in the guide.
  • Task 15: Remove ambiguity from # in the PyBaMM/ directory
  • Task 16: Migrate all Windows instructions to the existing Install from source doc.
  • Task 20: Add pytest documentation from Contributors guide in Develop branch.
  • Task 10: Include new "build" and "download_KLU_Sundials" directories in .gitignore
  • Task 18: Include *.egg-info pattern in .gitignore
  • Task 9: Update docs asking users to install VCPKG themselves.
    • Task 8: Remove images from the installation guide.
    • Task 11: Install VCPKG in a local directory
    • Task 17: We do not need the MSVC image
  • Task 18: Unpin graphviz from docs and tests --- had to pin on tests because of permission errors, though it would work fine locally.
  • Task 21: Add IREE Support

Copy link

codecov bot commented Aug 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.48%. Comparing base (3bf3ea8) to head (a51b095).
Report is 7 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4352      +/-   ##
===========================================
- Coverage    99.42%   95.48%   -3.94%     
===========================================
  Files          299      299              
  Lines        22691    22715      +24     
===========================================
- Hits         22560    21690     -870     
- Misses         131     1025     +894     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@cringeyburger
Copy link
Contributor Author

With this commit, editable + partial rebuilds are available through nox -s dev.
Note that we have to export BUILD_IDAKLU=ON to enable CMake and thus IDAKLU build

@cringeyburger cringeyburger force-pushed the migrate-to-sbc branch 3 times, most recently from a8f773f to e046696 Compare August 17, 2024 10:24
@cringeyburger
Copy link
Contributor Author

cringeyburger commented Aug 17, 2024

Updates till now:

  1. Wheels are built properly, and the package can be installed simply with pip install .[...].
  2. CI Builds pass except for macOS, which can't find Fortran (which is actually installed before the build step) to find SuiteSparse libraries and BLAS. This shouldn't be a problem on local if gfortran is installed through brew.
  3. IDAKLU is compiled properly and is installed during wheel builds and installation, given we have exported BUILD_IDAKLU=ON before running the build/install commands. Otherwise, a pure python package is installed.
  4. Editable installs with rebuilds are working, and can be run using pip install --no-build-isolation --config-settings=editable.rebuild=true -Cbuild-dir=build -ve.[...], provided we have the build-time dependencies, such as cmake, pybind11 etc., are installed beforehand.
  5. This is automated by running nox -s dev which installs the build-time dependencies. Note that this feature requires tomlkit to be installed before running any nox command. Hence it is better to install nox and tomlkit hand in hand.
  6. It is to be noted that with editable installs and rebuilds, the IDAKLU solver does not compile with changes in SuiteSparse and SUNDIALS files. This is because SuiteSparse and SUNDIALS aren't interfaced with pybind11 directly, and thus are not linked with the compiled artifact. Apart from this, the editable installs work properly with working IDAKLU solver.
  7. Docker builds are working, both locally and on CI.
  8. Doctests and other tests should work now since nox and tomlkit are installed together on CI.

@cringeyburger
Copy link
Contributor Author

cringeyburger commented Aug 17, 2024

Problem with Benchmarks:

  1. The installation script installs the libs to PyBaMM/sundials_KLU_libs.
  2. CMake links them using ${CMAKE_CURRENT_SOURCE_DIR}/sundials_KLU_libs.
  3. The thing with ASV on CI is that the script responsible for installing the packages installs them in /home/runner/work/PyBaMM/PyBaMM/sundials_KLU_libs/.
  4. But due to the build command: python -m build --wheel -o {build_cache_dir} {build_dir}, it makes a separate venv+pip environment, and the path to the libs is changed to /home/runner/work/PyBaMM/PyBaMM/env/ec88222d46a18b5e9d12ba7194d09478/project/sundials_KLU_libs

Because of this, CMake isn't finding the libs and is failing to build them.

@cringeyburger cringeyburger force-pushed the migrate-to-sbc branch 4 times, most recently from b25357f to 2f8e252 Compare August 17, 2024 16:25
@agriyakhetarpal
Copy link
Member

Now that we are done with the v24.9 release, let's plan to merge this in the coming week.

@agriyakhetarpal agriyakhetarpal mentioned this pull request Sep 18, 2024
6 tasks
@agriyakhetarpal
Copy link
Member

I've been trying to get to this PR but it has taken me some time to do so. @kratman is attempting the unvendoring of the solver in pybamm-team/pybammsolvers#4, it might be good to migrate this PR there once that is done, @cringeyburger? It would make things a bit easier that were difficult to do here, especially with the troubles that were noticed with the benchmarks and the WSL-specific Dockerfile errors.

@cringeyburger
Copy link
Contributor Author

Yes, it would be great if we unvendor the solver and then work on migration.

I just wanted to point out that the benchmarks and dockerfile are fixed, along with everything else, atleast before I started the streak of merging commits from main.
With this unvendoring, it would make the migration neat and clean with the new changes from main.

@agriyakhetarpal
Copy link
Member

Yes, it would be great if we unvendor the solver and then work on migration.

I just wanted to point out that the benchmarks and dockerfile are fixed, along with everything else, atleast before I started the streak of merging commits from main. With this unvendoring, it would make the migration neat and clean with the new changes from main.

Thanks! Yes, sorry, I, too, meant that the migration would be cleaner with it, not that they were not working right now. Let's wait until that gets merged and we can alter your final report if you'd like to. Do you know why the CIBW_BEFORE_ALL step is failing? I'm unable to find out from the logs, but I feel it's getting stuck with downloading something. The Windows wheel failure looks fixable, though.

@cringeyburger
Copy link
Contributor Author

cringeyburger commented Sep 30, 2024

I tried fixing things, here is what I found:

  1. After we incorporated uv in the workflow, I noticed that the logs for MacOS builds are too huge, and no matter how or what I do (even tried downloading them), I can't get to the bottom part.
  2. Even though wheel.cmake is set to true by default and we are explicitly setting it too, scikit-build-core refuses to run CMake before making the wheel, thus producing a purelib wheel. I tried using the override method - set READTHEDOCS env to true, and set wheel.cmake to true in the overrides, and it builds a proper wheel after running CMake.
    1. I also tried setting wheel.platlib = true in the config (the documentation says, with wheel.cmake = true, wheel.platlib will be set to true). It did change the name of the wheel built, but it still did not run CMake.

EDIT: The second change is recent one, and it used to work fine before..

@agriyakhetarpal
Copy link
Member

Thanks for taking a look – I think we should pin scikit-build-core to the last working version (maybe 0.9.10). Breakages like these will be troublesome for us to fix, and it might be better to always keep it pinned until it becomes a stable enough backend.

As for the macOS wheel builds, I got to notice them in live-action mode just now and I think there is an endless loop somewhere, since the before-all step kicks off another before-all step, and the next one, and the next, and so on until we run out of memory designated by the GHA runner. It's unlikely this is a bug on cibuildwheel's side, though.

@agriyakhetarpal
Copy link
Member

It's because of this line: python -m cibuildwheel --output-dir wheelhouse, which triggers another cibuildwheel instance inside cibuildwheel itself. I guess you brought it back inadvertently at some place when merging changes from develop.

@kratman
Copy link
Contributor

kratman commented Oct 15, 2024

@cringeyburger I think due to the separation of IDAKLU in #4487, we should probably move this work to https://github.com/pybamm-team/pybammsolvers

This will make the changes much simpler and easier to manage

@kratman
Copy link
Contributor

kratman commented Oct 24, 2024

Changes migrated to pybammsolvers, closing for now

@kratman kratman closed this Oct 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants