diff --git a/CHANGELOG.md b/CHANGELOG.md
index b4f82c504..2b6d536d8 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -27,7 +27,10 @@ The format of this changelog is based on
   - Added support for operator partial assembly for high-order finite element spaces based
     on libCEED for non-tensor product element meshes. This option is disabled by default,
     but can be activated using `config["Solver"]["PartialAssemblyOrder"]` set to some number
-    less than `"Order"` and `config["Solver"]["Device"]: "ceed-cpu"`.
+    less than or equal to `"Order"`.
+  - Added `config["Solver"]["Device"]` and `config["Solver"]["Backend"]` options for runtime
+    configuration of the MFEM device (CPU or GPU) and corresponding libCEED backend, with
+    suitable defaults for users.
   - Added support for non axis aligned lumped ports and current sources. Key words `"X"`,
     `"Y"`, `"Z"` and `"R"`, with optional prefix `"+"` or `"-"` still work, but now
     directions can be specified as vectors with 3 components. Users will be warned, and
@@ -40,8 +43,13 @@ The format of this changelog is based on
   - Added build dependencies on [libCEED](https://github.com/CEED/libCEED) and
     [LIBXSMM](https://github.com/libxsmm/libxsmm) to support operator partial assembly (CPU-
     based for now).
+  - Added unit test framework for all integrators based on
+    [Catch2](https://github.com/catchorg/Catch2), which also includes some automated
+    benchmarking capabilities for operator assembly and application.
   - Added improved OpenMP support in `palace` wrapper script and CI tests.
   - Added Apptainer/Singularity container build definition for Palace.
+  - Fixed bugs related to thread-safety for OpenMP builds and parallel tetrahedral meshes in
+    the upstream MFEM library.
 
 ## [0.11.2] - 2023-07-14
 
diff --git a/docs/src/config/solver.md b/docs/src/config/solver.md
index 2f0d05de6..82399d140 100644
--- a/docs/src/config/solver.md
+++ b/docs/src/config/solver.md
@@ -11,6 +11,7 @@
     "Order": <int>,
     "PartialAssemblyOrder": <int>,
     "Device": <string>,
+    "Backend": <string>,
     "Eigenmode":
     {
         ...
@@ -46,11 +47,24 @@ with
 element operators to [partial assembly](https://mfem.org/howto/assembly_levels/). Setting
 this parameter equal to 1 will fully activate operator partial assembly on all levels.
 
-`"Device" ["cpu"]` : The device configuration passed to [MFEM]
-(https://mfem.org/howto/assembly_levels/) in order to activate different backends at
-runtime. CPU-based partial assembly is supported by the `"cpu"` backend for tensor-product
-meshes using the native MFEM kernels and `"ceed-cpu"` backend for all mesh types using
-libCEED.
+`"Device" ["CPU"]` : The runtime device configuration passed to [MFEM]
+(https://mfem.org/howto/assembly_levels/) in order to activate different options specified
+during configuration. The available options are:
+
+  - `"CPU"`
+  - `"GPU"`
+  - `"Debug"`
+
+The `"GPU"` option will automatically activate the `cuda` or `hip` device based on whether
+MFEM is built with CUDA (`MFEM_USE_CUDA=ON`) or HIP (`MFEM_USE_HIP=ON`) support. When
+*Palace* is built with OpenMP support (`PALACE_WITH_OPENMP=ON`), `omp` is automatically
+added to the list of activated MFEM devices. The `"Debug"` option for MFEM's `debug` device
+is useful for debugging issues associated with GPU-based runs of *Palace*.
+
+`"Backend" [""]` : Specifies the [libCEED backend]
+(https://libceed.org/en/latest/gettingstarted/#backends) to use for the simulation. If no
+backend is specified, a suitable default backend is selected based on the given
+`config["Solver"]["Device"]`.
 
 `"Eigenmode"` : Top-level object for configuring the eigenvalue solver for the eigenmode
 simulation type.
Thus, this object is only relevant for
@@ -75,6 +89,10 @@ Thus, this object is only relevant for [`config["Problem"]["Type"]: "Magnetostat
 `"Linear"` : Top-level object for configuring the linear solver employed by all simulation
 types.
 
+### Advanced solver options
+
+  - `"PartialAssemblyInterpolators" [true]`
+
 ## `solver["Eigenmode"]`
 
 ```json
@@ -435,7 +453,6 @@ vectors in Krylov subspace methods or other parts of the code.
   - `"MGSmoothEigScaleMax" [1.0]`
   - `"MGSmoothEigScaleMin" [0.0]`
   - `"MGSmoothChebyshev4th" [true]`
-  - `"PCLowOrderRefined" [false]`
   - `"ColumnOrdering" ["Default"]` : `"METIS"`, `"ParMETIS"`,`"Scotch"`, `"PTScotch"`,
     `"Default"`
   - `"STRUMPACKCompressionType" ["None"]` : `"None"`, `"BLR"`, `"HSS"`, `"HODLR"`, `"ZFP"`,
diff --git a/docs/src/developer.md b/docs/src/developer.md
index 8122790d6..0b26aec0d 100644
--- a/docs/src/developer.md
+++ b/docs/src/developer.md
@@ -71,6 +71,35 @@ which behaves as a stopwatch with some memory functions. It is the responsibilit
 objects may be created for local timing purposes, but these will not count toward time
 reported at the end of a log file or in the metadata JSON.
 
+## Testing
+
+We use [Catch2](https://github.com/catchorg/Catch2) to perform unit testing of the [libCEED]
+(https://libceed.org/en/latest/) integration in Palace against the legacy MFEM assembly
+routines. The unit test source code is located in the [`test/unit/`]
+(https://github.com/awslabs/palace/blob/main/test/unit/) directory, and can be built from
+within the *Palace* build directory using `make unit-tests`, or from the superbuild as
+`make palace-tests`. The unit tests can be accelerated using MPI and/or OpenMP parallelism
+(when configured with `PALACE_WITH_OPENMP=ON`), but in all cases they are only testing the
+local operator assembly on each process.
The 2D and 3D sample meshes in [`test/unit/mesh/`]
+(https://github.com/awslabs/palace/blob/main/test/unit/mesh/) come from the
+[MFEM repository](https://github.com/mfem/mfem/tree/master/data).
+
+The unit test application also includes a small number of benchmarks to compare performance
+between MFEM's legacy assembly backend, MFEM's partial assembly backend, and the selected
+libCEED backend (specified with the `--backend` option; use `-h`/`--help` to list all
+command line options for the `unit-tests` executable). These can be run using, for
+example:
+
+```bash
+./unit-tests "[Benchmark]" --benchmark-samples 10
+```
+
+The unit tests are run automatically as part of the project's continuous integration (CI)
+workflows. Also run as part of the CI are regression tests based on the provided example
+applications in the [`examples/`](https://github.com/awslabs/palace/blob/main/examples/)
+directory. These are executed based on the code in [`test/examples/`]
+(https://github.com/awslabs/palace/blob/main/test/examples/).
+
 ## Changelog
 
 Code contributions should generally be accompanied by an entry in the [changelog]
diff --git a/docs/src/install.md b/docs/src/install.md
index f6da95e84..ee5752bbb 100644
--- a/docs/src/install.md
+++ b/docs/src/install.md
@@ -113,15 +113,13 @@ Additional build options are (with default values in brackets):
 
   - `PALACE_WITH_64BIT_INT [OFF]` : Build with 64-bit integer support
   - `PALACE_WITH_OPENMP [OFF]` : Use OpenMP
-  - `PALACE_WITH_LIBCEED [ON]` : Build with libCEED library for high-order partial assembly
-    support
   - `PALACE_WITH_GSLIB [ON]` : Build with GSLIB library for high-order field interpolation
   - `PALACE_WITH_SUPERLU [ON]` : Build with SuperLU_DIST sparse direct solver
   - `PALACE_WITH_STRUMPACK [OFF]` : Build with STRUMPACK sparse direct solver
   - `PALACE_WITH_MUMPS [OFF]` : Build with MUMPS sparse direct solver
   - `PALACE_WITH_SLEPC [ON]` : Build with SLEPc eigenvalue solver
   - `PALACE_WITH_ARPACK [OFF]` : Build with ARPACK eigenvalue solver
-  - `PALACE_WITH_LIBXSMM [ON]` : Build with LIBXSMM backend when libCEED is enabled
+  - `PALACE_WITH_LIBXSMM [ON]` : Build with LIBXSMM backend for libCEED
 
 The build step is invoked by running (for example with 4 `make` threads)
 
@@ -174,8 +172,8 @@ as the standard parallelization in approach in *Palace* is to use pure MPI paral
 *Palace* leverages the [MFEM finite element discretization library](http://mfem.org). It
 always configures and builds its own installation of MFEM internally in order to support
 the most up to date features and patches. Likewise, Palace will always build its own
-installation of [libCEED](https://github.com/CEED/libCEED), when `PALACE_WITH_LIBCEED=ON`,
-and [GSLIB](https://github.com/Nek5000/gslib), when `PALACE_WITH_GSLIB=ON`.
+installation of [libCEED](https://github.com/CEED/libCEED), and [GSLIB]
+(https://github.com/Nek5000/gslib), when `PALACE_WITH_GSLIB=ON`.
 
 As part of the [Build from source](#Build-from-source), the CMake build will automatically
 build and install a small number of third-party dependencies before building *Palace*. The
@@ -196,8 +194,7 @@ source code for these dependencies is downloaded using using [Git submodules]
     [PETSc](https://petsc.org/release/)
   - [ARPACK-NG](https://github.com/opencollab/arpack-ng) (optional, when
     `PALACE_WITH_ARPACK=ON`)
-  - [LIBXSMM](https://github.com/libxsmm/libxsmm) (optional, when `PALACE_WITH_LIBXSMM=ON`
-    and `PALACE_WITH_LIBCEED=ON`)
+  - [LIBXSMM](https://github.com/libxsmm/libxsmm) (optional, when `PALACE_WITH_LIBXSMM=ON`)
   - [nlohmann/json](https://github.com/nlohmann/json)
   - [fmt](https://fmt.dev/latest)
   - [Eigen](https://eigen.tuxfamily.org)
@@ -205,3 +202,7 @@ source code for these dependencies is downloaded using using [Git submodules]
 For solving eigenvalue problems, at least one of SLEPc or ARPACK-NG must be specified.
 Typically only one of the SuperLU_DIST, STRUMPACK, and MUMPS dependencies is required but
 all can be built so the user can decide at runtime which solver to use.
+
+For unit testing, Palace relies on the [Catch2 library](https://github.com/catchorg/Catch2),
+which is automatically downloaded and built when building the `unit-tests` target. See the
+[Developer Notes](developer.md#Testing) for more information.
diff --git a/scripts/schema/config/solver.json b/scripts/schema/config/solver.json
index bae4e1ae1..09bf22f99 100644
--- a/scripts/schema/config/solver.json
+++ b/scripts/schema/config/solver.json
@@ -8,7 +8,9 @@
   {
     "Order": { "type": "integer", "minimum": 1 },
     "PartialAssemblyOrder": { "type": "integer", "minimum": 1 },
-    "Device": { "type": "string" },
+    "PartialAssemblyInterpolators": { "type": "boolean" },
+    "Device": { "type": "string", "enum": ["CPU", "GPU", "Debug"] },
+    "Backend": { "type": "string" },
     "Eigenmode":
     {
       "type": "object",
@@ -115,7 +117,6 @@
         "MGSmoothChebyshev4th": { "type": "boolean" },
         "PCMatReal": { "type": "boolean" },
         "PCMatShifted": { "type": "boolean" },
-        "PCLowOrderRefined": { "type": "boolean" },
         "PCSide": { "type": "string" },
         "ColumnOrdering": { "type": "string" },
         "STRUMPACKCompressionType": { "type": "string" },
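
Note on the schema change: `"Device"` is now restricted to `"CPU"`, `"GPU"`, or `"Debug"`, `"Backend"` is a free-form string, and `"PCLowOrderRefined"` is no longer listed. The Python sketch below mirrors those checks by hand, as a rough illustration of what existing configuration files may need to change. The helper name `check_solver_options`, the message wording, and the sample backend string `/cpu/self` are assumptions made for this example and are not part of the patch. Whether a leftover `"PCLowOrderRefined"` key is actually rejected depends on schema settings not shown in this diff.

```python
# Hypothetical helper, not part of Palace: mirrors the updated "Solver" schema entries.
ALLOWED_DEVICES = ("CPU", "GPU", "Debug")  # enum taken from the schema diff


def check_solver_options(solver):
    """Return a list of human-readable problems for a config["Solver"] dict."""
    problems = []

    # "Device" defaults to "CPU" according to the updated documentation.
    device = solver.get("Device", "CPU")
    if device not in ALLOWED_DEVICES:
        problems.append(
            f'"Device" must be one of {list(ALLOWED_DEVICES)}, got {device!r}'
        )

    # "Backend" is an optional string; empty means a default is chosen for the device.
    backend = solver.get("Backend", "")
    if not isinstance(backend, str):
        problems.append('"Backend" must be a string')

    # New advanced option, boolean in the schema, documented default true.
    interpolators = solver.get("PartialAssemblyInterpolators", True)
    if not isinstance(interpolators, bool):
        problems.append('"PartialAssemblyInterpolators" must be a boolean')

    # Option removed from both the documentation and the schema by this patch.
    if "PCLowOrderRefined" in solver.get("Linear", {}):
        problems.append('"PCLowOrderRefined" is no longer listed in the schema')

    return problems


if __name__ == "__main__":
    # A pre-patch style configuration using the old "ceed-cpu" device string.
    old_style = {"Order": 2, "PartialAssemblyOrder": 1, "Device": "ceed-cpu"}
    # A post-patch style configuration; "/cpu/self" is only an example backend string.
    new_style = {"Order": 2, "PartialAssemblyOrder": 1, "Device": "CPU", "Backend": "/cpu/self"}
    print(check_solver_options(old_style))
    print(check_solver_options(new_style))
```

Configurations that previously set `"Device": "ceed-cpu"` would be flagged by such a check, which matches the changelog note that the device and the libCEED backend are now configured separately.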