Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmake harness, 2023 edition #259

Open
wants to merge 64 commits into
base: master
Choose a base branch
from

Conversation

loriab
Copy link
Collaborator

@loriab loriab commented Jan 7, 2023

This is a squash of #233 (2022) based on #205 (2021) that includes #148 (2019) followed by rebase to master (2023).

Ignore for a bit. It's passed some local sanity checks, and this'll start the GHA check, but I haven't checked moved files very carefully or even looked into the new python build or arbitrary order derivative changes that were in master.

@loriab
Copy link
Collaborator Author

loriab commented Feb 11, 2023

I think I'm done again with this for the year. Below are the more important changes (that I remember) beyond #233. There's simply so many ways of building this project that I surely haven't tried them all, but tests and psi4 usage are in good shape.

  • Integrate past year's master changes into CMake build. mainly Py bindings and arb-order derivatives
  • GHA testing: build Py interface a few ways, update actions syntax, test Intel compilers. from previous: test Linux/MacOS/Windows builds from Linux export
  • Switch solid harmonics to runtime selectable. move SHGAUSS_ORDERING to generator option since still matters for sph_emultipole. provide setter and accessor functions. modify tests to run both types.
  • update INSTALL.md
  • set some variables on target for FetchContent compatibility. prefix any non-standard-cmake library options with LIBINT2_
  • Check 2023.0 Intel compilers with L2. icpc and icpx work. confirmed that both can build boys.h header with c++17, which was not true of icpc 2021.4
  • Sorted out why pybind11 was stuck at an old version
  • Master source had been updated to something like 2.8.0-beta1 (which installs to 2.8.0). I backed that off to 2.7.2-post1 so that my builds aren't so deceptive.

@timostrunk
Copy link

@evaleev Are there blockers for a review here? I think this PR is quite important for the future of libint. It would be a shame if this work goes to waste.

@loriab
Copy link
Collaborator Author

loriab commented Sep 9, 2023

fwiw, @timostrunk and @hadim, my current hope is to get the #269, #270, #271 PRs (cummulative) that address the run-time parts of this one merged and released. Doing that can release the psi4 and downstream blocks. If that goes through, I can rebase/refactor this PR to deal with the remaining build-/detection-time changes.

@evaleev
Copy link
Owner

evaleev commented Nov 24, 2023

@loriab let me know what the status of this is, would be great to get this saga finished before 2024

@loriab
Copy link
Collaborator Author

loriab commented Nov 25, 2023

@loriab let me know what the status of this is, would be great to get this saga finished before 2024

Ah, from what I recall, the basic status is:

  • all the actual runtime changes from this PR are already present in master as of the test solid harmonic ordering runtime switchable #271 merge (thanks!)
  • this PR has
    • the big rearrangement where bin tests and lib tests moved around and export/ moved into lib and libtool went away.
    • cmake processing of user args to build the config.h that guides the generator build and seeds the lib build
    • organizational improvements (I think) to the library targets and export. attendantly, the libint2-config.cmake has improvements for selecting particular targets
  • from discussions at test solid harmonic ordering runtime switchable #271 on the 3-center AM patterns, I realize that the cmake logic in this PR is insufficient to enable the mixed-AM (e.g., --with-max-am=3 --with-eri3-max-am=4 to get eri_ffG_d0). By the current logic, the generic WITH_MAX_AM needs to be >= the largest integral class MAX_AM; the variable is only used to default the integrals classes. So the logic in this PR needs reworking.
  • you mentioned in Sept wanting to embed the git hash into the configuration.cc. That isn't in this PR
  • there's an extensive upgrade guide in INSTALL.md that will need revising based on changes from 269, 270, 271.

I'd dearly like to get this finished before 2024, too. But I think it'll take enough revising that I can't work on it until mid-Dec.

This PR isn't a big change for L2 users, but I suspect it'd be convenient for the devs in the new flurry of work. I'd propose a release w/o this PR soon for user continuity, then I'll try to patch this up this year, and perhaps your group can field test it a bit before a 2024 release.

@evaleev
Copy link
Owner

evaleev commented Nov 28, 2023

I'd dearly like to get this finished before 2024, too. But I think it'll take enough revising that I can't work on it until mid-Dec.

No problem at all.

This PR isn't a big change for L2 users, but I suspect it'd be convenient for the devs in the new flurry of work. I'd propose a release w/o this PR soon for user continuity, then I'll try to patch this up this year, and perhaps your group can field test it a bit before a 2024 release.

Yes, @JonathonMisiewicz could use this now. Not a show-stopper, but would be good to finally get this checked off.

@LecrisUT
Copy link

Hi @loriab, I have some comments about the design, but I think they will be better addressed after this one is merged. Basically some of the ideas are in LecrisUT/CMake-Template#1, but we should discuss them there since it is a smaller more flexible template project.

For this PR, I only have one non-blocking comment, can you explain a bit about LIBINT_BUILD_LIBRARY_AS_SUBPROJECT and ExternalProject_Add design? I think it would be better to have a fallback of find_package(CONFIG) -> FetchContent with the interface from cmake 3.24 (with some backports to earlier versions for now)

@loriab
Copy link
Collaborator Author

loriab commented Dec 13, 2023

FYI, @LecrisUT, this won't be merged as-is. The user-facing code parts have already been merged as PRs 269, 270, 271. What's left is the builder-facing CMake parts that I'm reexamining and hopefully PRing in pieces rather than en masse, starting with 299.

Yeah, LIBINT_BUILD_LIBRARY_AS_SUBPROJECT vs. ExternalProject_Add :-). Basically, I'm all for the latter for this role, especially as it keeps the dev library build separate and configured most like the more common independent library build. If a variable doesn't pass through, I know to fix it in the cmake and add it to the INSTALL.md. I think the main developer prefers FetchContent for not having to define every variable passed through, so he added the former option. It sometimes takes two steps, and iirc I couldn't get the GHA to always work for it.

@LecrisUT
Copy link

I think the decision would depend on how much the subproject is designed to be a subproject vs a standalone project. If it's the latter, I would agree that the latter option is a good intermediate. But whenever possible I would suggest going for FetchContent compatible design since that is a major focus of CMake upstream, and it is more compatible when chaining projects.

It sometimes takes two steps, and iirc I couldn't get the GHA to always work for it.

That is odd. The only thing that comes to mind is if the sub-project is not designed to be a subproject. At the same time, this is quite an odd design with the packing and unpacking and would need to look more deeply. But if you need another pair of eyes on such issues, let me know.

@evaleev
Copy link
Owner

evaleev commented Dec 13, 2023

Yeah, LIBINT_BUILD_LIBRARY_AS_SUBPROJECT vs. ExternalProject_Add :-). Basically, I'm all for the latter for this role, especially as it keeps the dev library build separate and configured most like the more common independent library build. If a variable doesn't pass through, I know to fix it in the cmake and add it to the INSTALL.md.

LIBINT_BUILD_LIBRARY_AS_SUBPROJECT is merely a fragile convenience for experts. I would not advertise it because it is extra fragile. When tinkering with the generated library it is convenient though, e.g. for IDE to comprehend the generated library (targets, indexing) from within the compiler project.

I think the main developer prefers FetchContent for not having to define every variable passed through, so he added the former option. It sometimes takes two steps, and iirc I couldn't get the GHA to always work for it.

I can't imagine how to make Libint properly subprojectable without a lot of logic duplication. The key benefit of FetchContent, making subproject's targets available within the main project at configure time, is not possible here since the user code wants to subproject the generated library, not the compiler. One can create targets in the compiler project that will forward to the library targets, but it would require duplicating most of the library harness logic in the compiler harness.

We could give user the option of building/running the compiler at configure time, but that seems fragile as well.

@LecrisUT
Copy link

LecrisUT commented Dec 13, 2023

We could give user the option of building/running the compiler at configure time, but that seems fragile as well.

That seems similar to things that I've done on fypp and something that I want to do for swig as well. In those cases I make it explicitly compatible with FetchContent and find_package. I am not familiar with the project to know what it does or if it is an appropriate solution, but if you have the time to explain the overall structure, I can offer some additional feedback of potential design approaches. Basically if it needs a precompiler or to inject libraries like in Python_add_library I think there is a clean design there.

@evaleev
Copy link
Owner

evaleev commented Dec 13, 2023

We could give user the option of building/running the compiler at configure time, but that seems fragile as well.

That seems similar to things that I've done on fypp and something that I want to do for swig as well. In those cases I make it explicitly compatible with FetchContent and find_package. I am not familiar with the project to know what it does or if it is an appropriate solution, but if you have the time to explain the overall structure, I can offer some additional feedback of potential design approaches. Basically if it needs a precompiler or to inject libraries like in Python_add_library I think there is a clean design there.

The libint project contains a compiler, which generates a libint library. The library code is packaged into a cmake project, and consumed by others as a tarball (I use FetchContent for that). There is an infinite number of configurations, so we can only ship a few we (I) care about.

The main complexity stems from a large number of use cases. Libint devs want to be able to build the whole thing starting from the compiler and don't care about subprojectability. Most libint users want to consume the library, not the compiler. Since library generation is expensive, we don't generate at configure time. Thus the need for a separate library generation step. In superbuild scenario this is fine, but in the common (FetchContent) scenario we must have the library already generated. This is why I never cared about making the compiler build harness to be cmake ... but still there are good reasons to (thanks, @loriab!)

@LecrisUT
Copy link

The library code is packaged into a cmake project, and consumed by others as a tarball (I use FetchContent for that). There is an infinite number of configurations, so we can only ship a few we (I) care about.

Ok, then this sounds like find_package(COMPONENTS). Using FetchContent, the consumer knows which libint library they want to generate, while in the packaged version, you want to pick up specific COMPONENTS that have been packaged. My suggestion is not even to use FetchContent, but plain old add_subdirectory where the inner project is templated with variables defined in the main project that calls the add_subdirectory. Then you can simply have a for loop to include of the requested templetized projects. Maybe llvm would also be a good reference for the design

The libint project contains a compiler, which generates a libint library.

Then this is the same design as for swig. Here we could have two different approaches. One is the add_subdirectory approach mentioned previously. The other one is to have a libint_add_library function that will generate the appropriate library target on-the-fly, where inside the function you mimic the templetized project. Which approach to choose depends on if you want the libint library to be pre-built and packaged (add_subdirectory and the current approach I believe), or a minimal compilation and the consuming project would generate and manage the corresponding library (`libint_add_library).

Since library generation is expensive, we don't generate at configure time.

Note that with FetchContent, only the configure stage is done, so if the tarball generation can be avoided, this step would be rather quick.

but in the common (FetchContent) scenario we must have the library already generated

That should not be the case. As long as the appropriate CMake targets are made available, CMake would be able to compile the generated libint library before it is consumed by a linker.

@loriab
Copy link
Collaborator Author

loriab commented Dec 13, 2023

EFV can correct me if I'm wrong, but probably we're not looking to restructure the build from this PR. The number of people running the actual compiler/generator step of libint are about 2 * number_of_qc_pkgs_using_libint + number_of_misc_packagers + everyone_in_the_Valeev_group = ~few_dozen . So as long as a scheme improves the lives of developers (with end-to-end cmake generate/build/test) and improves the lives of consumers (ease of embedding configuration info into library with components, as you say), I think it's doing its duty.

The PR scheme has been steady in Psi4 for the past 2y, so apart from exploring configuration space I haven't touched but mpqc/nwchemex might, I'm confident in its utility.

@evaleev
Copy link
Owner

evaleev commented Dec 13, 2023

That should not be the case. As long as the appropriate CMake targets are made available, CMake would be able to compile the generated libint library before it is consumed by a linker.

I agree, but the impact of doing this on maintainability (due to the need to also support building the targets in standalone library harness ... so much logic is duplicated) coupled with roughly 0.01 developer/year allocated to libint makes this a nonstarter. We confine all the library target logic to the library harness which cannot be configured until the compiler had a chance to run.

EFV can correct me if I'm wrong, but probably we're not looking to restructure the build from this PR. The number of people running the actual compiler/generator step of libint are about 2 * number_of_qc_pkgs_using_libint + number_of_misc_packagers + everyone_in_the_Valeev_group = ~few_dozen . So as long as a scheme improves the lives of developers (with end-to-end cmake generate/build/test) and improves the lives of consumers (ease of embedding configuration info into library with components, as you say), I think it's doing its duty.

Agreed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants