{numlib,chem,toolchain}[NVHPC/23.7-CUDA-12.1.1] nvompi-2023a + QuantumESPRESSO-7.3.1 (GPU enabled) #20364

Crivella · 2024-04-15T16:32:52Z

Added easyconfig files for nvofbf toolchain + QE 7.3.1

local compilers:

GCC/12.3.0
CUDA/12.1.1

Added toolchain/numlib

nvofbf-2023a
- nvompi-2023a
  - NVHPC-23.7-CUDA-12.1.1
  - OpenMPI-4.1.5
- FlexiBLAS-3.3.1
  - OpenBLAS-0.3.24
- FFTW-3.3.10
- FFTW.MPI-3.3.10
- ScaLAPACK-2.2.0-fb

Added easyconfigs

HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb

NOTES:

The QuantumESPRESSO easyconfig also requires changes from this commit. I will incorporate them once "Archive autotools-based QuantumESPRESSO easyblock and switch default to CMakeMake-based easyblock" (easybuild-easyblocks#3306) is merged.
ELPA is compiled with the CUDA compilers, which require the compute capability (CC) to be specified, while QE uses the HPC SDK compilers, which compile for all supported CCs when none is specified.

Solved issues:

High number of failures in the OpenBLAS LAPACK test suite: "LAPACK test failures with NVHPC 23.7" (OpenMathLib/OpenBLAS#4625)
- Use less optimization for v0.3.24
- Use patch for v0.3.27

Open issue:

Occasional segfault in the QE test suite, attributed to FlexiBLAS, when calling the ZHEEV LAPACK routine
- The bug does not manifest when running the code under cuda-gdb
- When starting from nvompi and linking directly to OpenBLAS, the error was not present
Segfault in 3 test cases with RMM-DIS diagonalization with k points other than GAMMA, most likely a QE bug (https://gitlab.com/QEF/q-e/-/issues/675)
Full CUDA libxc: https://gitlab.com/libxc/libxc/-/issues/135
- Tested patch from commit e648f37b
  - Compile time goes from ~5min to ~3.5h
  - Tests are unable to run
  - Since this is not officially supported with CMake, is only experimental with autotools, and is not a widely used feature of QE, I would argue that for now it is acceptable not to run the libxc routines on the GPU

This reverts commit 706e9d1.

Crivella · 2024-04-18T08:58:21Z

A comparison of code efficiency when linked to the EB numlibs (no suffix) versus linked to the NVHPC math_libs (-test suffix) shows no significant difference when running on one node with an A100 GPU

[ RUN      ] MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-test %threads=1 /bf4db141 @vega-gpu:default+default
[ RUN      ] MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-CUDA-12.1.1 %threads=1 /e4ce2bb2 @vega-gpu:default+default
[       OK ] (1/2) MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-test %threads=1 /bf4db141 @vega-gpu:default+default
P: extract_report_time: 0 s (r:0, l:None, u:None)
P: PWSCF_cpu: 231.98 s (r:0, l:None, u:None)
P: PWSCF_wall: 241.82 s (r:0, l:None, u:None)
P: electrons_cpu: 213.8 s (r:0, l:None, u:None)
P: electrons_wall: 216.04 s (r:0, l:None, u:None)
P: c_bands_cpu: 181.97 s (r:0, l:None, u:None)
P: c_bands_wall: 183.78 s (r:0, l:None, u:None)
P: cegterg_cpu: 142.72 s (r:0, l:None, u:None)
P: cegterg_wall: 144.01 s (r:0, l:None, u:None)
P: calbec_cpu: 0.12 s (r:0, l:None, u:None)
P: calbec_wall: 0.55 s (r:0, l:None, u:None)
P: fft_cpu: 0.12 s (r:0, l:None, u:None)
P: fft_wall: 0.14 s (r:0, l:None, u:None)
P: ffts_cpu: 0.0 s (r:0, l:None, u:None)
P: ffts_wall: 0.0 s (r:0, l:None, u:None)
P: fftw_cpu: 1.26 s (r:0, l:None, u:None)
P: fftw_wall: 77.36 s (r:0, l:None, u:None)
[       OK ] (2/2) MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-CUDA-12.1.1 %threads=1 /e4ce2bb2 @vega-gpu:default+default
P: extract_report_time: 0 s (r:0, l:None, u:None)
P: PWSCF_cpu: 232.44 s (r:0, l:None, u:None)
P: PWSCF_wall: 241.74 s (r:0, l:None, u:None)
P: electrons_cpu: 214.16 s (r:0, l:None, u:None)
P: electrons_wall: 216.18 s (r:0, l:None, u:None)
P: c_bands_cpu: 182.3 s (r:0, l:None, u:None)
P: c_bands_wall: 183.9 s (r:0, l:None, u:None)
P: cegterg_cpu: 143.11 s (r:0, l:None, u:None)
P: cegterg_wall: 144.18 s (r:0, l:None, u:None)
P: calbec_cpu: 0.12 s (r:0, l:None, u:None)
P: calbec_wall: 0.56 s (r:0, l:None, u:None)
P: fft_cpu: 0.0 s (r:0, l:None, u:None)
P: fft_wall: 0.01 s (r:0, l:None, u:None)
P: ffts_cpu: 0.0 s (r:0, l:None, u:None)
P: ffts_wall: 0.0 s (r:0, l:None, u:None)
P: fftw_cpu: 1.24 s (r:0, l:None, u:None)
P: fftw_wall: 77.23 s (r:0, l:None, u:None)
[----------] all spawned checks have finished
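
To put a number on "no significant difference", the wall times reported above can be compared directly. The following is only a small sketch using values copied from the two ReFrame reports; the dictionary and function names are made up for the illustration.

```python
# Wall times in seconds, copied from the two ReFrame reports above.
# The "-test" module is linked to the NVHPC math_libs, the other one to the EB numlibs.
nvhpc_math_libs = {
    "PWSCF_wall": 241.82,
    "electrons_wall": 216.04,
    "c_bands_wall": 183.78,
    "cegterg_wall": 144.01,
    "fftw_wall": 77.36,
}
eb_numlibs = {
    "PWSCF_wall": 241.74,
    "electrons_wall": 216.18,
    "c_bands_wall": 183.9,
    "cegterg_wall": 144.18,
    "fftw_wall": 77.23,
}


def relative_difference(reference, other):
    """Return the percentage difference of `other` with respect to `reference` per timer."""
    return {
        key: 100.0 * (other[key] - reference[key]) / reference[key]
        for key in reference
    }


diffs = relative_difference(nvhpc_math_libs, eb_numlibs)
for name, pct in diffs.items():
    print(f"{name:15s} {pct:+.2f} %")
```

For these five timers every difference stays below 0.2 % in absolute value. Only one run per build is reported here, so differences of this size are plausibly run-to-run noise.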

cgross95 · 2024-05-21T22:06:57Z

Thanks for putting all of this together! Our site is interested in a GPU enabled QuantumESPRESSO build, so we've been testing this.

Were you able to get around the "other error"s that occur in LAPACK testing when building OpenBLAS? Using the OpenBLAS_0.3.24-NVHPC-23.7-CUDA-12.1.1.eb EasyConfig as provided gives us 55 other errors:

                        -->   LAPACK TESTING SUMMARY  <--
SUMMARY                 nb test run     numerical error         other error  
================        ===========     =================       ================  
REAL                    1328283         0       (0.000%)        0       (0.000%)        
DOUBLE PRECISION        1328013         10      (0.001%)        0       (0.000%)        
COMPLEX                 769507          159     (0.021%)        55      (0.007%)        
COMPLEX16               780654          116     (0.015%)        0       (0.000%)        

--> ALL PRECISIONS      4206457         285     (0.007%)        55      (0.001%)

I saw that you had done some work on OpenBLAS issue #4625 to get some of the numerical failures down, but was wondering if you were ever able to get rid of the other errors that stop EasyBuild from finishing.

Crivella · 2024-05-22T08:33:17Z

@cgross95
In my case, compiling on a machine with an A100 GPU, I ended up with only 148 numerical errors.

			-->   LAPACK TESTING SUMMARY  <--
SUMMARY             	nb test run 	numerical error   	other error  
================   	===========	=================	================  
REAL             	1326099		12	(0.001%)	0	(0.000%)	
DOUBLE PRECISION	1326921		36	(0.003%)	0	(0.000%)	
COMPLEX          	762663		42	(0.006%)	0	(0.000%)	
COMPLEX16         	771518		58	(0.008%)	0	(0.000%)	

--> ALL PRECISIONS	4187201		148	(0.004%)	0	(0.000%)

What hardware are you trying this on?
It would be interesting to find out whether this is strictly OpenBLAS/LAPACK related or whether it is a matter of setting more compiler flags for different architectures.

I think I was still getting some other errors as well with 0.3.27, but I didn't investigate much further, as I was aiming at 0.3.24 for this release (in that case I was getting 14 errors related to the ZHSEQR and ZGEEV routines failing to find all eigenvalues).

The logs should give you further details on which LAPACK routine failed and with what error code (each routine should document the meaning of its error codes in comments in the source or in the documentation).
If you think those errors are not a problem, you could also raise the threshold of allowed LAPACK test errors by changing the value assigned to max_failing_lapack_tests_num_errors and also adding max_failing_lapack_tests_other_errors to allow the other errors.
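
As a rough illustration of that last suggestion, the relevant lines in the OpenBLAS easyconfig could look like the following. The two numbers are placeholders chosen only to cover the counts reported above (285 numerical and 55 other errors). They are not recommended values, and whether masking the errors is acceptable should be decided after reading the logs.

```python
# Fragment of an OpenBLAS easyconfig (easyconfigs use Python syntax).
run_lapack_tests = True

# Placeholder values for illustration only, not recommendations.
# The easyconfig in this PR sets max_failing_lapack_tests_num_errors = 150.
max_failing_lapack_tests_num_errors = 300
max_failing_lapack_tests_other_errors = 60
```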

cgross95 · 2024-05-22T12:15:34Z

I'm compiling on a v100s with an Intel Xeon Skylake on Ubuntu 22.04. We also have some a100 cards, but we're in the midst of transferring everything in our cluster to Ubuntu, so they're not easily accessible at the moment. I'll dig into the LAPACK testing logs and see if I can produce some more useful debugging information.

cgross95 · 2024-06-07T19:22:32Z

I finally got access to our A100 cards, and can report that there were no "other error"s in the LAPACK tests. I ended up with 152 numerical errors, so increased the max_failing_lapack_tests_num_errors EasyConfig parameter, and was able to successfully install OpenBLAS. I'm continuing on with the rest of the build now.
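
For anyone comparing their own numbers against the limit, the totals can be pulled out of the "ALL PRECISIONS" line of the summary. This is only a sketch of the idea, not the code the OpenBLAS easyblock uses. The function names are hypothetical, and the default of 0 allowed other errors is an assumption.

```python
import re

# Matches e.g. "--> ALL PRECISIONS  4206457  285  (0.007%)  55  (0.001%)"
SUMMARY_RE = re.compile(
    r"ALL PRECISIONS\s+(\d+)\s+(\d+)\s+\([^)]*\)\s+(\d+)\s+\([^)]*\)"
)


def parse_lapack_summary(text):
    """Return (tests_run, numerical_errors, other_errors) from a LAPACK testing summary."""
    match = SUMMARY_RE.search(text)
    if match is None:
        raise ValueError("no 'ALL PRECISIONS' line found")
    return tuple(int(group) for group in match.groups())


def within_limits(numerical_errors, other_errors,
                  max_num_errors=150, max_other_errors=0):
    """Check error counts against the allowed maxima.

    150 is the value set in this PR's OpenBLAS easyconfig; 0 is an assumed default.
    """
    return numerical_errors <= max_num_errors and other_errors <= max_other_errors


line = "--> ALL PRECISIONS      4206457         285     (0.007%)        55      (0.001%)"
_, num_errors, other_errors = parse_lapack_summary(line)
print(within_limits(num_errors, other_errors))      # False: 285 > 150 and 55 > 0
print(within_limits(152, 0))                        # False: 152 > 150
print(within_limits(152, 0, max_num_errors=160))    # True once the limit is raised
```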

beeebiii · 2024-09-05T12:46:02Z

Hi, I'm compiling on a V100S with an Intel Xeon Skylake on Ubuntu 22.04. What other changes do you think I need to make to be able to use QuantumESPRESSO (GPU enabled)? When I use this PR with eb --from-pr 20364 -r, I get a checksum error in libxc, which I fixed; after that I get an error in OpenBLAS/0.3.24-NVHPC-23.7-CUDA-12.1.1. The error I get is:

Error limit reached. Use -fmax-errors=N to change the limit, N=0 for unlimited.
100 errors detected in the compilation of "../kernel/x86_64/sgemv_t_4.c".
Compilation terminated.
make[1]: *** [Makefile.L2:268: sgemv_t.o] Error 2
make[1]: Leaving directory '/test/parta/EASYBUILD/build/OpenBLAS/0.3.24/NVHPC-23.7-CUDA-12.1.1/OpenBLAS-0.3.24/kernel'
make: *** [Makefile:184: libs] Error 1 (at easybuild/tools/run.py:682 in parse_cmd_output)

Should I increase -fmax-errors, given that a total of 241 errors are detected during the compilation of ../kernel/x86_64/...? Or how do you think I can solve this problem?

Crivella · 2024-09-05T13:29:46Z

@beeebiii
Regarding the checksum for libxc, it is related to this issue and also this PR.
Since the checksum for the code is not stable, it would be best to use --ignore-checksums (for security, do so only after you've checked that the libxc checksum is the only one failing and that the downloaded files are what you expect).

The other error you are reporting seems related to OpenBLAS. In my tests on an A100 with an AMD zen2 CPU I did not encounter failures in the compilation (only some failures in the test suite).
It seems @cgross95 was on a system similar to yours but also did not encounter it.

One odd thing: I am not sure -fmax-errors is supported by NVCC; I believe it is a GCC-only flag. I would probably need the full debug log to understand what is happening here (e.g. is gcc being used instead of nvcc?).

beeebiii · 2024-09-05T13:36:59Z

Yeah, you are right, I think gcc is being used instead of nvcc.

Remark: individual warnings can be suppressed with "--diag_suppress <warning-name>"
"../kernel/x86_64/sgemv_t_4.c", line 31: warning: unrecognized GCC pragma [unrecognized_gcc_pragma]
#pragma GCC optimize("no-tree-vectorize")

Very strange. I did compile libxc/6.2.2-NVHPC-23.7-CUDA-12.1.1 myself, as I had to change the checksums. Other than that, I run eb --from-pr 20364 -r --accept-eula-for=CUDA --cuda-compute-capabilities=9.0 --accept-eula-for=NVHPC

Crivella · 2024-09-05T13:44:13Z

If you look at the OpenBLAS easyconfig, only NVHPC should be used, and EasyBuild should not be aware of other compiler toolchains in that instance.
I would suggest running the build with -d and checking the logs to see whether something odd is happening with environment variables or with some other module already loaded in your environment.

Crivella · 2024-09-05T14:04:36Z

That is basically EasyBuild reporting the easyconfig file being used; the debug logging adds much more.
Look for "DEBUG Not skipping configure step" in the logs and start scrolling:

- up to see what environment variables were set by EasyBuild and what modules were found loaded
- down to see the output of the configure command

I would venture to guess the problem is in either of them.
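
If the log is long, a few lines of Python can locate that marker and show the surrounding lines. This is just a convenience sketch: the function name and the log path in the note below are hypothetical, and only the marker text comes from the description above.

```python
MARKER = "Not skipping configure step"


def split_around_marker(lines, marker=MARKER, before=50, after=50):
    """Return (lines before the marker, marker line, lines after it), or None if absent."""
    for index, line in enumerate(lines):
        if marker in line:
            start = max(0, index - before)
            return lines[start:index], line, lines[index + 1:index + 1 + after]
    return None


# Made-up stand-in for a debug log, only to show the behaviour.
example_log = [
    "placeholder: environment variables and loaded modules",
    "DEBUG Not skipping configure step",
    "placeholder: output of the configure command",
]
above, hit, below = split_around_marker(example_log)
print(above)  # the part to read for environment variables and loaded modules
print(below)  # the part to read for the output of the configure command
```

With a real log, something like `split_around_marker(open('/path/to/easybuild.log').read().splitlines())` would do the same (the path is a placeholder).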
Also, is OpenBLAS the first thing that gets installed with -r (e.g. did anything else built with NVHPC alone complete successfully)?
BTW, this conversation might be better suited for Slack. If you are on the EB Slack, you can @ me as Davide Grassano.

Crivella · 2024-09-09T08:21:43Z

To summarize the problem @beeebiii was having: the -tp=px flag was causing compilation problems, which were resolved by commenting out the 'optarch': 'GENERIC', line in the OpenBLAS easyconfig (causing the flag to become -tp=host).
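
For reference, the workaround amounts to something like the following in the OpenBLAS easyconfig from this PR. This is a sketch of the local change described above, not something that has been committed to the PR.

```python
# toolchainopts from the OpenBLAS easyconfig in this PR, with the workaround applied.
toolchainopts = {
    # https://github.com/OpenMathLib/OpenBLAS/issues/4625
    'precise': True,
    # Commented out as a workaround: with GENERIC the build used -tp=px,
    # without it the flag becomes -tp=host.
    # 'optarch': 'GENERIC',
}
```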


yqshao · 2024-10-25T16:41:41Z

I was trying this, and I think the 55 "other error"s that @cgross95 observed were discussed and resolved in #19021

github-actions · 2024-12-05T10:32:22Z

Updated software `FFTW.MPI-3.3.10-nvompi-2023a.eb`

Diff against FFTW.MPI-3.3.10-gompi-2024a.eb

easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gompi-2024a.eb

diff --git a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gompi-2024a.eb b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
index 2ed420c412..6a87b1494a 100644
--- a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gompi-2024a.eb
+++ b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
@@ -5,14 +5,17 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'gompi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = ['fftw-%(version)s.tar.gz']
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
-dependencies = [('FFTW', '3.3.10')]
+local_cuda = '12.1.1'
+local_compiler = ('NVHPC', '23.7-CUDA-%s' % local_cuda)
+
+dependencies = [('FFTW', '3.3.10', '', local_compiler)]
 
 runtest = 'check'

Diff against FFTW.MPI-3.3.10-gmpich-2024.06.eb

easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gmpich-2024.06.eb

diff --git a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gmpich-2024.06.eb b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
index 4b79e9e7c2..6a87b1494a 100644
--- a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gmpich-2024.06.eb
+++ b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
@@ -5,14 +5,17 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'gmpich', 'version': '2024.06'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = ['fftw-%(version)s.tar.gz']
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
-dependencies = [('FFTW', '3.3.10')]
+local_cuda = '12.1.1'
+local_compiler = ('NVHPC', '23.7-CUDA-%s' % local_cuda)
+
+dependencies = [('FFTW', '3.3.10', '', local_compiler)]
 
 runtest = 'check'

Updated software `FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb`

Diff against FFTW-3.3.10-GCC-13.3.0.eb

easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.3.0.eb

diff --git a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.3.0.eb b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
index bf18544e70..670602aadc 100644
--- a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.3.0.eb
+++ b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
@@ -5,13 +5,16 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = [SOURCELOWER_TAR_GZ]
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
+# Does not work with nvc
+with_quad_prec = False
+
 runtest = 'check'
 
 moduleclass = 'numlib'

Diff against FFTW-3.3.10-GCC-13.2.0.eb

easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.2.0.eb

diff --git a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.2.0.eb b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
index 32652387f8..670602aadc 100644
--- a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.2.0.eb
+++ b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
@@ -5,13 +5,16 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'GCC', 'version': '13.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = [SOURCELOWER_TAR_GZ]
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
+# Does not work with nvc
+with_quad_prec = False
+
 runtest = 'check'
 
 moduleclass = 'numlib'

Updated software `HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb`

Diff against HDF5-1.14.5-gompi-2024a.eb

easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-gompi-2024a.eb

diff --git a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-gompi-2024a.eb b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
index 5917196292..74256e2cb4 100644
--- a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-gompi-2024a.eb
+++ b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,27 +1,27 @@
 name = 'HDF5'
 # Note: Odd minor releases are only RCs and should not be used.
-version = '1.14.5'
+version = '1.14.0'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://portal.hdfgroup.org/display/support'
 description = """HDF5 is a data model, library, and file format for storing and managing data.
  It supports an unlimited variety of datatypes, and is designed for flexible
  and efficient I/O and for high volume and complex data."""
 
-toolchain = {'name': 'gompi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True, 'usempi': True}
 
-source_urls = ['https://github.com/HDFGroup/hdf5/archive']
-sources = ['hdf5_%(version)s.tar.gz']
-checksums = ['c83996dc79080a34e7b5244a1d5ea076abfd642ec12d7c25388e2fdd81d26350']
+source_urls = ['https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-%(version_major_minor)s/hdf5-%(version)s/src']
+sources = [SOURCELOWER_TAR_GZ]
+checksums = ['a571cc83efda62e1a51a0a912dd916d01895801c5025af91669484a1575a6ef4']
 
-dependencies = [
-    ('zlib', '1.3.1'),
-    ('Szip', '2.1.1'),
-]
+local_gcc_compiler = ('GCCcore', '12.3.0')
+# local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
 
-postinstallcmds = [
-    'sed -i -r "s, -I[^[:space:]]+H5FDsubfiling , -I%(installdir)s/include ,g" %(installdir)s/bin/h5c++',
-    'sed -i -r "s, -I[^[:space:]]+H5FDsubfiling , -I%(installdir)s/include ,g" %(installdir)s/bin/h5pcc',
+dependencies = [
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('zlib', '1.2.13', '', local_gcc_compiler),
+    ('Szip', '2.1.1', '', local_gcc_compiler),
 ]
 
 moduleclass = 'data'

Diff against HDF5-1.14.5-iimpi-2024a.eb

easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-iimpi-2024a.eb

diff --git a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-iimpi-2024a.eb b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
index 153463a223..74256e2cb4 100644
--- a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-iimpi-2024a.eb
+++ b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,26 +1,27 @@
 name = 'HDF5'
 # Note: Odd minor releases are only RCs and should not be used.
-version = '1.14.5'
+version = '1.14.0'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://portal.hdfgroup.org/display/support'
 description = """HDF5 is a data model, library, and file format for storing and managing data.
  It supports an unlimited variety of datatypes, and is designed for flexible
  and efficient I/O and for high volume and complex data."""
 
-toolchain = {'name': 'iimpi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True, 'usempi': True}
 
-source_urls = ['https://github.com/HDFGroup/hdf5/archive']
-sources = ['hdf5_%(version)s.tar.gz']
-checksums = ['c83996dc79080a34e7b5244a1d5ea076abfd642ec12d7c25388e2fdd81d26350']
+source_urls = ['https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-%(version_major_minor)s/hdf5-%(version)s/src']
+sources = [SOURCELOWER_TAR_GZ]
+checksums = ['a571cc83efda62e1a51a0a912dd916d01895801c5025af91669484a1575a6ef4']
 
-# replace src include path with installation dir for $H5BLD_CPPFLAGS
-_regex = 's, -I[^[:space:]]+H5FDsubfiling , -I%(installdir)s/include ,g'
-postinstallcmds = ['sed -i -r "%s" %%(installdir)s/bin/%s' % (_regex, x) for x in ['h5c++', 'h5pcc']]
+local_gcc_compiler = ('GCCcore', '12.3.0')
+# local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
 
 dependencies = [
-    ('zlib', '1.3.1'),
-    ('Szip', '2.1.1'),
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('zlib', '1.2.13', '', local_gcc_compiler),
+    ('Szip', '2.1.1', '', local_gcc_compiler),
 ]
 
 moduleclass = 'data'

Updated software `libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb`

Diff against libxc-6.2.2-GCC-13.2.0-nofhc.eb

easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.2.0-nofhc.eb

diff --git a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.2.0-nofhc.eb b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
index 709703f294..2cacc1fb9d 100644
--- a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.2.0-nofhc.eb
+++ b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
@@ -2,22 +2,22 @@ easyblock = 'CMakeMake'
 
 name = 'libxc'
 version = '6.2.2'
-versionsuffix = '-nofhc'
 
 homepage = 'https://libxc.gitlab.io'
 description = """Libxc is a library of exchange-correlation functionals for density-functional theory.
  The aim is to provide a portable, well tested and reliable set of exchange and correlation functionals."""
 
-toolchain = {'name': 'GCC', 'version': '13.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 
 source_urls = ['https://gitlab.com/libxc/libxc/-/archive/%(version)s/']
 sources = [SOURCE_TAR_GZ]
 checksums = [('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
-              '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc')]
+              '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc'),
+             ]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('Perl', '5.38.0'),
+    ('CMake', '3.26.3'),
+    ('Perl', '5.36.1'),
 ]
 
 local_common_configopts = "-DENABLE_FORTRAN=ON -DENABLE_XHOST=OFF "
@@ -27,9 +27,6 @@ local_common_configopts = "-DENABLE_FORTRAN=ON -DENABLE_XHOST=OFF "
 # see also https://github.com/pyscf/pyscf/issues/1103
 local_common_configopts += "-DDISABLE_KXC=OFF -DDISABLE_LXC=OFF"
 
-# Disable fhc, this needs to support codes (like VASP) relying on the projector augmented wave (PAW) approach
-local_common_configopts += ' -DDISABLE_FHC=ON'
-
 # perform iterative build to get both static and shared libraries
 configopts = [
     local_common_configopts + ' -DBUILD_SHARED_LIBS=OFF',

Diff against libxc-6.2.2-GCC-13.3.0.eb

easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.3.0.eb

diff --git a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.3.0.eb b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
index 9e4de5e808..2cacc1fb9d 100644
--- a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.3.0.eb
+++ b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
@@ -7,19 +7,17 @@ homepage = 'https://libxc.gitlab.io'
 description = """Libxc is a library of exchange-correlation functionals for density-functional theory.
  The aim is to provide a portable, well tested and reliable set of exchange and correlation functionals."""
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 
-source_urls = ['https://gitlab.com/%(name)s/%(name)s/-/archive/%(version)s/']
+source_urls = ['https://gitlab.com/libxc/libxc/-/archive/%(version)s/']
 sources = [SOURCE_TAR_GZ]
-checksums = [
-    ('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
-     '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc',
-     'd1b65ef74615a1e539d87a0e6662f04baf3a2316706b4e2e686da3193b26b20f'),
-]
+checksums = [('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
+              '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc'),
+             ]
 
 builddependencies = [
-    ('CMake', '3.29.3'),
-    ('Perl', '5.38.2'),
+    ('CMake', '3.26.3'),
+    ('Perl', '5.36.1'),
 ]
 
 local_common_configopts = "-DENABLE_FORTRAN=ON -DENABLE_XHOST=OFF "

Updated software `nvompi-2023a.eb`

Diff against nvompi-2022.07.eb

easybuild/easyconfigs/n/nvompi/nvompi-2022.07.eb

diff --git a/easybuild/easyconfigs/n/nvompi/nvompi-2022.07.eb b/easybuild/easyconfigs/n/nvompi/nvompi-2023a.eb
index 1a1647cbfa..31a17edb3f 100644
--- a/easybuild/easyconfigs/n/nvompi/nvompi-2022.07.eb
+++ b/easybuild/easyconfigs/n/nvompi/nvompi-2023a.eb
@@ -1,19 +1,20 @@
 easyblock = 'Toolchain'
 
 name = 'nvompi'
-version = '2022.07'
+version = '2023a'
 
 homepage = '(none)'
 description = 'NVHPC based compiler toolchain, including OpenMPI for MPI support.'
 
 toolchain = SYSTEM
 
-local_compiler = ('NVHPC', '22.7-CUDA-11.7.0')
+local_cuda = '12.1.1'
+local_compiler = ('NVHPC', '23.7-CUDA-%s' % local_cuda)
 
 dependencies = [
     local_compiler,
-    ('OpenMPI', '4.1.4', '', local_compiler),
-    ('CUDA', '11.7.0', '', SYSTEM),
+    ('CUDA', local_cuda, '', SYSTEM),
+    ('OpenMPI', '4.1.5', '', local_compiler),
 ]
 
 moduleclass = 'toolchain'

Updated software `OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb`

Diff against OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb

easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb

diff --git a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
index 5527c667f6..db86d55f15 100644
--- a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb
+++ b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,11 +1,15 @@
 name = 'OpenBLAS'
-version = '0.3.27'
-versionsuffix = '-seq-iface64'
+version = '0.3.24'
 
-homepage = 'https://www.openblas.net/'
+homepage = 'http://www.openblas.net/'
 description = "OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version."
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    # https://github.com/OpenMathLib/OpenBLAS/issues/4625
+    'precise': True,
+    'optarch': 'GENERIC',
+}
 
 source_urls = [
     # order matters, trying to download the large.tgz/timing.tgz LAPACK tarballs from GitHub causes trouble
@@ -17,40 +21,32 @@ patches = [
     ('large.tgz', '.'),
     ('timing.tgz', '.'),
     'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch',
+    'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch',
     'OpenBLAS-0.3.21_fix-order-vectorization.patch',
-    'OpenBLAS-0.3.26_lapack_qr_noninittest.patch',
-    'OpenBLAS-0.3.27_fix_zscal.patch',
-    'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch',
+    'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch',
+    'OpenBLAS-0.3.23_lapack_test_nomain.patch'  # https://github.com/OpenMathLib/OpenBLAS/issues/4625
 ]
 checksums = [
-    {'v0.3.27.tar.gz': 'aa2d68b1564fe2b13bc292672608e9cdeeeb6dc34995512e65c3b10f4599e897'},
+    {'v0.3.24.tar.gz': 'ceadc5065da97bd92404cac7254da66cc6eb192679cf1002098688978d4d5132'},
     {'large.tgz': 'f328d88b7fa97722f271d7d0cfea1c220e0f8e5ed5ff01d8ef1eb51d6f4243a1'},
     {'timing.tgz': '999c65f8ea8bd4eac7f1c7f3463d4946917afd20a997807300fe35d70122f3af'},
     {'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch':
      'e6b326fb8c4a8a6fd07741d9983c37a72c55c9ff9a4f74a80e1352ce5f975971'},
+    {'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch':
+     '1dbd0f9473963dbdd9131611b455d8a801f1e995eae82896186d3d3ffe6d5f03'},
     {'OpenBLAS-0.3.21_fix-order-vectorization.patch':
      '08af834e5d60441fd35c128758ed9c092ba6887c829e0471ecd489079539047d'},
-    {'OpenBLAS-0.3.26_lapack_qr_noninittest.patch': '4781bf1d7b239374fd8069e15b4e2c0ef0e8efaa1a7d4c33557bd5b27e5de77c'},
-    {'OpenBLAS-0.3.27_fix_zscal.patch': '9210d7b66538dabaddbe1bfceb16f8225708856f60876ca5561b19d3599f9fd1'},
-    {'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch':
-     'f374e41efffd592ab1c9034df9e7abf1045ed151f4fc0fd0da618ce9826f2d4b'},
+    {'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch':
+     'ab7e0af05f9b2a2ced32f3875e1e3767d9c3531a455421a38f7324350178a0ff'},
+    {'OpenBLAS-0.3.23_lapack_test_nomain.patch': '63e14e2cb67dd81ecc13fa0e07685f853abe0669d4cadd520247b88789346948'},
 ]
 
 builddependencies = [
     ('make', '4.4.1'),
     # required by LAPACK test suite
-    ('Python', '3.12.3'),
+    ('Python', '3.11.3'),
 ]
 
-# INTERFACE64=1 needs if you link OpenBLAS for fortran code compied with 64 bit integers (-i8)
-# This would be in intel library naming convention ilp64
-# The USE_OPENMP=0 and USE_THREAD=0 needs for the single threaded version
-# The USE_LOCKING=1 needs for thread safe version (if threaded software calls OpenBLAS, without it
-# OpenBLAS is not thread safe (so only single threaded software would be able to use it)
-buildopts = "INTERFACE64=1 USE_OPENMP=0 USE_THREAD=0 USE_LOCKING=1 "
-testopts = buildopts
-installopts = buildopts
-
 run_lapack_tests = True
 max_failing_lapack_tests_num_errors = 150

Diff against OpenBLAS-0.3.27-GCC-13.2.0-seq-iface64.eb

easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.2.0-seq-iface64.eb

diff --git a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.2.0-seq-iface64.eb b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
index f205e063da..db86d55f15 100644
--- a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.2.0-seq-iface64.eb
+++ b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,11 +1,15 @@
 name = 'OpenBLAS'
-version = '0.3.27'
-versionsuffix = '-seq-iface64'
+version = '0.3.24'
 
-homepage = 'https://www.openblas.net/'
+homepage = 'http://www.openblas.net/'
 description = "OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version."
 
-toolchain = {'name': 'GCC', 'version': '13.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    # https://github.com/OpenMathLib/OpenBLAS/issues/4625
+    'precise': True,
+    'optarch': 'GENERIC',
+}
 
 source_urls = [
     # order matters, trying to download the large.tgz/timing.tgz LAPACK tarballs from GitHub causes trouble
@@ -17,40 +21,32 @@ patches = [
     ('large.tgz', '.'),
     ('timing.tgz', '.'),
     'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch',
+    'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch',
     'OpenBLAS-0.3.21_fix-order-vectorization.patch',
-    'OpenBLAS-0.3.26_lapack_qr_noninittest.patch',
-    'OpenBLAS-0.3.27_fix_zscal.patch',
-    'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch',
+    'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch',
+    'OpenBLAS-0.3.23_lapack_test_nomain.patch'  # https://github.com/OpenMathLib/OpenBLAS/issues/4625
 ]
 checksums = [
-    {'v0.3.27.tar.gz': 'aa2d68b1564fe2b13bc292672608e9cdeeeb6dc34995512e65c3b10f4599e897'},
+    {'v0.3.24.tar.gz': 'ceadc5065da97bd92404cac7254da66cc6eb192679cf1002098688978d4d5132'},
     {'large.tgz': 'f328d88b7fa97722f271d7d0cfea1c220e0f8e5ed5ff01d8ef1eb51d6f4243a1'},
     {'timing.tgz': '999c65f8ea8bd4eac7f1c7f3463d4946917afd20a997807300fe35d70122f3af'},
     {'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch':
      'e6b326fb8c4a8a6fd07741d9983c37a72c55c9ff9a4f74a80e1352ce5f975971'},
+    {'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch':
+     '1dbd0f9473963dbdd9131611b455d8a801f1e995eae82896186d3d3ffe6d5f03'},
     {'OpenBLAS-0.3.21_fix-order-vectorization.patch':
      '08af834e5d60441fd35c128758ed9c092ba6887c829e0471ecd489079539047d'},
-    {'OpenBLAS-0.3.26_lapack_qr_noninittest.patch': '4781bf1d7b239374fd8069e15b4e2c0ef0e8efaa1a7d4c33557bd5b27e5de77c'},
-    {'OpenBLAS-0.3.27_fix_zscal.patch': '9210d7b66538dabaddbe1bfceb16f8225708856f60876ca5561b19d3599f9fd1'},
-    {'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch':
-     'f374e41efffd592ab1c9034df9e7abf1045ed151f4fc0fd0da618ce9826f2d4b'},
+    {'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch':
+     'ab7e0af05f9b2a2ced32f3875e1e3767d9c3531a455421a38f7324350178a0ff'},
+    {'OpenBLAS-0.3.23_lapack_test_nomain.patch': '63e14e2cb67dd81ecc13fa0e07685f853abe0669d4cadd520247b88789346948'},
 ]
 
 builddependencies = [
     ('make', '4.4.1'),
     # required by LAPACK test suite
-    ('Python', '3.11.5'),
+    ('Python', '3.11.3'),
 ]
 
-# INTERFACE64=1 needs if you link OpenBLAS for fortran code compied with 64 bit integers (-i8)
-# This would be in intel library naming convention ilp64
-# The USE_OPENMP=0 and USE_THREAD=0 needs for the single threaded version
-# The USE_LOCKING=1 needs for thread safe version (if threaded software calls OpenBLAS, without it
-# OpenBLAS is not thread safe (so only single threaded software would be able to use it)
-buildopts = "INTERFACE64=1 USE_OPENMP=0 USE_THREAD=0 USE_LOCKING=1 "
-testopts = buildopts
-installopts = buildopts
-
 run_lapack_tests = True
 max_failing_lapack_tests_num_errors = 150
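
The `checksums` list in the diff above pairs each source or patch filename with a SHA-256 digest, one single-key dict per file. A minimal sketch of what such a check amounts to, using only `hashlib`; `sha256_of` and `verify_checksums` are hypothetical helper names for illustration and are not EasyBuild API, and the demo file is a throwaway rather than one of the real patches:

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(directory, checksums):
    """Return the filenames whose digest differs from the expected one.

    `checksums` uses the same layout as the easyconfig: a list of
    {filename: expected_sha256} dicts. (Hypothetical helper.)
    """
    mismatched = []
    for entry in checksums:
        for filename, expected in entry.items():
            if sha256_of(os.path.join(directory, filename)) != expected:
                mismatched.append(filename)
    return mismatched


# Demo on a throwaway file, not on the real OpenBLAS patches.
with tempfile.TemporaryDirectory() as tmpdir:
    payload = b"example patch body\n"
    with open(os.path.join(tmpdir, "demo.patch"), "wb") as handle:
        handle.write(payload)
    good = [{"demo.patch": hashlib.sha256(payload).hexdigest()}]
    bad = [{"demo.patch": "0" * 64}]
    ok_result = verify_checksums(tmpdir, good)
    bad_result = verify_checksums(tmpdir, bad)
```

Keying each digest by filename, as the easyconfig does, keeps the pairing independent of the order of the `patches` list.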

Updated software `OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb`

Diff against OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb

easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb

diff --git a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
index e6c772bf64..f858d99eb9 100644
--- a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb
+++ b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,63 +1,72 @@
 name = 'OpenMPI'
-version = '5.0.3'
+version = '4.1.5'
 
 homepage = 'https://www.open-mpi.org/'
 description = """The Open MPI Project is an open source MPI-3 implementation."""
 
-toolchain = {'name': 'NVHPC', 'version': '24.9-CUDA-12.6.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    'extra_fcflags': '-Mstandard',  # https://forums.developer.nvidia.com/t/howto-build-openmpi-with-nvhpc-24-1/283219
+}
 
 source_urls = ['https://www.open-mpi.org/software/ompi/v%(version_major_minor)s/downloads']
 sources = [SOURCELOWER_TAR_BZ2]
 patches = [
-    'OpenMPI-5.0.3_fix_hle_make_errors.patch',
-    'OpenMPI-5.0.3_disable_opal_path_nfs_test.patch',
-    ('OpenMPI-5.0.2_build-with-internal-cuda-header.patch', 1)
+    'OpenMPI-4.1.1_build-with-internal-cuda-header.patch',
+    'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch',
+    'OpenMPI-4.1.5_fix-pmix3x.patch',
+    'OpenMPI-4.1.x_add_atomic_wmb.patch',
 ]
 checksums = [
-    {'openmpi-5.0.3.tar.bz2':
-     '990582f206b3ab32e938aa31bbf07c639368e4405dca196fabe7f0f76eeda90b'},
-    {'OpenMPI-5.0.3_fix_hle_make_errors.patch':
-     '881c907a9f5901d5d6af41cd33dffdcecba4a67a9e5123e602542aea57a80895'},
-    {'OpenMPI-5.0.3_disable_opal_path_nfs_test.patch':
-     '75d4417e35252ea3a19b2792f1b06e9aeb408c253aa4921d77226d57b71dee45'},
-    {'OpenMPI-5.0.2_build-with-internal-cuda-header.patch':
-     'f52dc470543f35efef10d651dd159c771ae25f8f76a420d20d87abf4dc769ed7'},
+    {'openmpi-4.1.5.tar.bz2': 'a640986bc257389dd379886fdae6264c8cfa56bc98b71ce3ae3dfbd8ce61dbe3'},
+    {'OpenMPI-4.1.1_build-with-internal-cuda-header.patch':
+     '63eac52736bdf7644c480362440a7f1f0ae7c7cae47b7565f5635c41793f8c83'},
+    {'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch':
+     'b767c7166cf0b32906132d58de5439c735193c9fd09ec3c5c11db8d5fa68750e'},
+    {'OpenMPI-4.1.5_fix-pmix3x.patch': '46edac3dbf32f2a611d45e8a3c8edd3ae2f430eec16a1373b510315272115c40'},
+    {'OpenMPI-4.1.x_add_atomic_wmb.patch': '9494bbc546d661ba5189e44b4c84a7f8df30a87cdb9d96ce2e73a7c8fecba172'},
 ]
 
 builddependencies = [
-    ('pkgconf', '2.2.0'),
-    ('Perl', '5.38.2'),
-    ('Autotools', '20231222'),
+    ('pkgconf', '1.9.5'),
+    ('Perl', '5.36.1'),
+    ('Autotools', '20220317'),
 ]
 
 dependencies = [
-    ('zlib', '1.3.1'),
-    ('hwloc', '2.10.0'),
+    ('zlib', '1.2.13'),
+    ('hwloc', '2.9.1'),
     ('libevent', '2.1.12'),
-    ('UCX', '1.16.0'),
-    ('UCX-CUDA', '1.16.0', '-CUDA-%(cudaver)s'),
-    ('libfabric', '1.21.0'),
-    ('PMIx', '5.0.2'),
-    ('PRRTE', '3.0.5'),
-    ('UCC', '1.3.0'),
-    ('UCC-CUDA', '1.3.0', '-CUDA-%(cudaver)s'),
+    ('UCX', '1.14.1'),
+    ('UCX-CUDA', '1.14.1', '-CUDA-%(cudaver)s'),
+    ('libfabric', '1.18.0'),
+    ('PMIx', '4.2.4'),
+    ('UCC', '1.2.0'),
+    ('UCC-CUDA', '1.2.0', '-CUDA-%(cudaver)s'),
 ]
 
+# Update configure to include changes from the "internal-cuda" patch
+# by running a subset of autogen.pl sufficient to achieve this
+# without doing the full, long-running regeneration.
+preconfigopts = ' && '.join([
+    'cd config',
+    'autom4te --language=m4sh opal_get_version.m4sh -o opal_get_version.sh',
+    'cd ..',
+    'autoconf',
+    'autoheader',
+    'aclocal',
+    'automake',
+    ''
+])
+
 # CUDA related patches and custom configure option can be removed if CUDA support isn't wanted.
-preconfigopts = 'nvc -Iopal/mca/cuda/include -shared opal/mca/cuda/lib/cuda.c -o opal/mca/cuda/lib/libcuda.so && '
-# Update configure to include changes from the "disable_opal_path_nfs_test" patch
-preconfigopts += './autogen.pl --force && '
-
-configopts = '--with-cuda=%(start_dir)s/opal/mca/cuda '
-# Required to prevent internal compiler error in opal.
-configopts += '--enable-alt-short-float=no '
-# Set PGI compilers manually, as NVHPC compilers are not correctly detected
-configopts += 'CC=pgcc CXX=pgc++ FC=pgfortran '
-
-# site specific options
-# configopts += '--without-psm2 '
-# configopts += '--disable-oshmem '
-# configopts += '--with-gpfs '
-configopts += '--with-slurm '
+configopts = '--with-cuda=internal '
+configopts += ' CC=pgcc CXX=pgc++ FC=pgfortran'
+
+# disable MPI1 compatibility for now, see what breaks...
+# configopts += '--enable-mpi1-compatibility '
+
+# to enable SLURM integration (site-specific)
+# configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr'
 
 moduleclass = 'mpi'
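
The `preconfigopts` block in the diff above joins its steps with `' && '` and ends the list with an empty string. A short sketch of why that trailing `''` matters, assuming (as is usual for configure-based builds) that the value is placed directly in front of the configure command; the `./configure` line below is a simplified stand-in, not the exact command EasyBuild generates:

```python
steps = [
    'cd config',
    'autom4te --language=m4sh opal_get_version.m4sh -o opal_get_version.sh',
    'cd ..',
    'autoconf',
    'autoheader',
    'aclocal',
    'automake',
    '',  # trailing empty item makes the joined string end in ' && '
]
preconfigopts = ' && '.join(steps)

# Simplified stand-in for the command that follows preconfigopts.
full_command = preconfigopts + './configure --with-cuda=internal'

# Without the trailing '' the last step would run into the next command.
without_trailer = ' && '.join(steps[:-1]) + './configure --with-cuda=internal'
```

With the empty last element, every regeneration step has to succeed before `./configure` runs; without it, `automake` and `./configure` would be fused into a single malformed token.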

Diff against OpenMPI-5.0.3-GCC-13.3.0.eb

easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-GCC-13.3.0.eb

diff --git a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-GCC-13.3.0.eb b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
index 6864e213a9..f858d99eb9 100644
--- a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-GCC-13.3.0.eb
+++ b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,38 +1,72 @@
 name = 'OpenMPI'
-version = '5.0.3'
+version = '4.1.5'
 
 homepage = 'https://www.open-mpi.org/'
 description = """The Open MPI Project is an open source MPI-3 implementation."""
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    'extra_fcflags': '-Mstandard',  # https://forums.developer.nvidia.com/t/howto-build-openmpi-with-nvhpc-24-1/283219
+}
 
 source_urls = ['https://www.open-mpi.org/software/ompi/v%(version_major_minor)s/downloads']
 sources = [SOURCELOWER_TAR_BZ2]
-patches = [('OpenMPI-5.0.2_build-with-internal-cuda-header.patch', 1)]
+patches = [
+    'OpenMPI-4.1.1_build-with-internal-cuda-header.patch',
+    'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch',
+    'OpenMPI-4.1.5_fix-pmix3x.patch',
+    'OpenMPI-4.1.x_add_atomic_wmb.patch',
+]
 checksums = [
-    {'openmpi-5.0.3.tar.bz2': '990582f206b3ab32e938aa31bbf07c639368e4405dca196fabe7f0f76eeda90b'},
-    {'OpenMPI-5.0.2_build-with-internal-cuda-header.patch':
-     'f52dc470543f35efef10d651dd159c771ae25f8f76a420d20d87abf4dc769ed7'},
+    {'openmpi-4.1.5.tar.bz2': 'a640986bc257389dd379886fdae6264c8cfa56bc98b71ce3ae3dfbd8ce61dbe3'},
+    {'OpenMPI-4.1.1_build-with-internal-cuda-header.patch':
+     '63eac52736bdf7644c480362440a7f1f0ae7c7cae47b7565f5635c41793f8c83'},
+    {'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch':
+     'b767c7166cf0b32906132d58de5439c735193c9fd09ec3c5c11db8d5fa68750e'},
+    {'OpenMPI-4.1.5_fix-pmix3x.patch': '46edac3dbf32f2a611d45e8a3c8edd3ae2f430eec16a1373b510315272115c40'},
+    {'OpenMPI-4.1.x_add_atomic_wmb.patch': '9494bbc546d661ba5189e44b4c84a7f8df30a87cdb9d96ce2e73a7c8fecba172'},
 ]
 
 builddependencies = [
-    ('pkgconf', '2.2.0'),
-    ('Autotools', '20231222'),
+    ('pkgconf', '1.9.5'),
+    ('Perl', '5.36.1'),
+    ('Autotools', '20220317'),
 ]
 
 dependencies = [
-    ('zlib', '1.3.1'),
-    ('hwloc', '2.10.0'),
+    ('zlib', '1.2.13'),
+    ('hwloc', '2.9.1'),
     ('libevent', '2.1.12'),
-    ('UCX', '1.16.0'),
-    ('libfabric', '1.21.0'),
-    ('PMIx', '5.0.2'),
-    ('PRRTE', '3.0.5'),
-    ('UCC', '1.3.0'),
+    ('UCX', '1.14.1'),
+    ('UCX-CUDA', '1.14.1', '-CUDA-%(cudaver)s'),
+    ('libfabric', '1.18.0'),
+    ('PMIx', '4.2.4'),
+    ('UCC', '1.2.0'),
+    ('UCC-CUDA', '1.2.0', '-CUDA-%(cudaver)s'),
 ]
 
+# Update configure to include changes from the "internal-cuda" patch
+# by running a subset of autogen.pl sufficient to achieve this
+# without doing the full, long-running regeneration.
+preconfigopts = ' && '.join([
+    'cd config',
+    'autom4te --language=m4sh opal_get_version.m4sh -o opal_get_version.sh',
+    'cd ..',
+    'autoconf',
+    'autoheader',
+    'aclocal',
+    'automake',
+    ''
+])
+
 # CUDA related patches and custom configure option can be removed if CUDA support isn't wanted.
-preconfigopts = 'gcc -Iopal/mca/cuda/include -shared opal/mca/cuda/lib/cuda.c -o opal/mca/cuda/lib/libcuda.so && '
-configopts = '--with-cuda=%(start_dir)s/opal/mca/cuda --with-show-load-errors=no '
+configopts = '--with-cuda=internal '
+configopts += ' CC=pgcc CXX=pgc++ FC=pgfortran'
+
+# disable MPI1 compatibility for now, see what breaks...
+# configopts += '--enable-mpi1-compatibility '
+
+# to enable SLURM integration (site-specific)
+# configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr'
 
 moduleclass = 'mpi'

Updated software `QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb`

Diff against QuantumESPRESSO-7.4-foss-2024a.eb

easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a.eb

diff --git a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a.eb b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
index f017367700..609a3b47f0 100644
--- a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a.eb
+++ b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,15 +1,15 @@
-name = 'QuantumESPRESSO'
-version = '7.4'
+name = "QuantumESPRESSO"
+version = "7.3.1"
+versionsuffix = '-CUDA-%(cudaver)s'
 
-homepage = 'https://www.quantum-espresso.org'
+homepage = "https://www.quantum-espresso.org"
 description = """Quantum ESPRESSO  is an integrated suite of computer codes
 for electronic-structure calculations and materials modeling at the nanoscale.
 It is based on density-functional theory, plane waves, and pseudopotentials
 (both norm-conserving and ultrasoft).
 """
 
-toolchain = {'name': 'foss', 'version': '2024a'}
-
+toolchain = {"name": "nvompi", "version": "2023a"}
 toolchainopts = {
     "usempi": True,
     "openmp": True,
@@ -17,13 +17,13 @@ toolchainopts = {
 
 # Check hashes inside external/submodule_commit_hash_records when making file for new version
 local_lapack_hash = "12d825396fcef1e0a1b27be9f119f9e554621e55"
-local_mbd_hash = "89a3cc199c0a200c9f0f688c3229ef6b9a8d63bd"
+local_mbd_hash = "82005cbb65bdf5d32ca021848eec8f19da956a77"
 local_devxlib_hash = "a6b89ef77b1ceda48e967921f1f5488d2df9226d"
 local_fox_hash = "3453648e6837658b747b895bb7bef4b1ed2eac40"
-# Different from the one at tag qe-7.4, see https://github.com/anharmonic/d3q/issues/22
-local_d3q_hash = "808acbaf012468f42147d8d6af452ec64b9e5ab0"
-# Different from the one at tag qe-7.4
-local_qe_gipaw_hash = "9b2ae1a46cae045cc04ef02c1072f2e1e74873b2"
+# Different from the one at tag qe-7.3.1 because of:
+# https://gitlab.com/QEF/q-e/-/issues/666
+local_d3q_hash = "de4718351e7bbb9d1d12aad2b7ca232d06775b83"
+local_qe_gipaw_hash = "75b01b694c9ba4df55d294cacc27cf28591b2161"
 local_qmcpack_hash = "f72ab25fa4ea755c1b4b230ae8074b47d5509c70"
 local_w90_hash = "1d6b187374a2d50b509e5e79e2cab01a79ff7ce1"
 
@@ -101,47 +101,76 @@ sources = [
         },
     },
 ]
-patches = [
-    {
-        'name': 'QuantumESPRESSO-7.4-d3q.patch',
-        'sourcepath': '../'  # Needed as patches are normally applied to the first `finalpath` directory
-    },
-]
-# Holding off git clone checksum checks untill 5.0.x
-# See https://github.com/easybuilders/easybuild-framework/pull/4248
+# Holding off checksum checks untill 5.0.x
+# https://github.com/easybuilders/easybuild-framework/pull/4248
+# checksums = [
+#     {'q-e-qe-7.3.1.tar.gz': '2c58b8fadfe4177de5a8b69eba447db5e623420b070dea6fd26c1533b081d844'},
+#     {'lapack-%s.tar.gz' % local_lapack_hash: 'c05532ae0e5fe35f473206dda12970da5f2e2214620487d71837ddcf0ea6b21d'},
+#     {'mbd-%s.tar.gz' % local_mbd_hash: 'a180682c00bb890c9b1e26a98addbd68e32f970c06439acf7582415f4c589800'},
+#     {'devxlib-%s.tar.gz' % local_devxlib_hash: '76da8fe5a2050f58efdc92fa8831efec25c19190df7f4e5e39c173a5fbae83b4'},
+#     {'d3q-%s.tar.gz' % local_d3q_hash: '43e50753a56af05d181b859d3e29d842fb3fc4352f00cb7fe229a435a1f20c31'},
+#     {'fox-%s.tar.gz' % local_fox_hash: '99b6a899a3f947d7763aa318e86f9f08db684568bfdcd293f3318bee9d7f1948'},
+#     {'qe-gipaw-%s.tar.gz' % local_qe_gipaw_hash: '9ac8314363d29cc2f1ce85abd8f26c1a3ae311d54f6e6034d656442dd101c928'},
+#     {'pw2qmcpack-%s.tar.gz' % local_qmcpack_hash: 'a8136da8429fc49ab560ef7356cd6f0a2714dfbb137baff7961f46dfe32061eb'},
+#     {'wannier90-%s.tar.gz' % local_w90_hash: 'f989497790ec9777bdc159945bbf42156edb7268011f972874dec67dd4f58658'},
+# ]
 checksums = [
-    'b15dcfe25f4fbf15ccd34c1194021e90996393478226e601d876f7dea481d104',
-    None, None, None, None, None, None, None, None,
-    '1f1686365fbf0cc56f634e072a92b3d336fe454348e514d0b4136d447f0d4923'
+    '2c58b8fadfe4177de5a8b69eba447db5e623420b070dea6fd26c1533b081d844',
+    None, None, None, None, None, None, None, None
 ]
 
+local_gcc_compiler = ('GCCcore', '12.3.0')
+local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
+
 builddependencies = [
-    ('M4', '1.4.19'),
-    ('CMake', '3.29.3'),
-    ('pkgconf', '2.2.0'),
+    ("M4", "1.4.19", '', local_gcc_compiler),
+    ("CMake", "3.26.3", '', local_gcc_compiler),
+    ("cURL", "8.0.1", '', local_gcc_compiler),
 ]
 dependencies = [
-    ('HDF5', '1.14.5'),
-    ('ELPA', '2024.05.001'),
-    ('libxc', '6.2.2'),
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('OpenBLAS', '0.3.24', '', local_compiler),
+    ('FFTW', '3.3.10', '', local_compiler),
+    ('FFTW.MPI', '3.3.10'),
+    ('ScaLAPACK', '2.2.0'),
+    ("HDF5", "1.14.0", '-CUDA-%(cudaver)s'),
+    ("libxc", "6.2.2", '', local_compiler),
+    # ("ELPA", "2023.05.001"),
 ]
 
 # Disabled because of
 # https://gitlab.com/QEF/q-e/-/issues/667
 # https://github.com/anharmonic/d3q/issues/15
 build_shared_libs = False
+with_cuda = True
+
 with_scalapack = True
-with_fox = True
-with_gipaw = True
-with_d3q = True
+with_gipaw = False  # https://github.com/dceresoli/qe-gipaw/issues/19
+with_d3q = False  # Gives `conflict with access` errors
 with_qmcpack = True
 
 moduleclass = "chem"
 
+test_suite_nprocs = 1  # Unless multiple GPUs are available or GPU should be oversubscribed
 test_suite_threshold = (
-    0.98
+    0.4  # Low threshold because of https://gitlab.com/QEF/q-e/-/issues/665
 )
 test_suite_max_failed = (
     5  # Allow for some flaky tests (failed due to strict thresholds)
 )
-test_suite_allow_failures = []
+test_suite_allow_failures = [
+    "test_qe_xclib_",  # 7.3.1:  https://gitlab.com/QEF/q-e/-/issues/640
+
+    # 7.3.1: Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
+    "--hp_",
+    "--ph_",
+    "--epw_",
+    "--tddfpt_",
+
+    # 7.3.1: https://gitlab.com/QEF/q-e/-/issues/675
+    "system--pw_scf--scf-rmm-k",
+    "system--pw_scf--scf-rmm-paro-k",
+    "system--pw_scf-correctness",
+    "system--pw_noncolin--noncolin-rmm",
+    "system--pw_noncolin-correctness",
+]
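
The `test_suite_allow_failures` entries in the diff above look like name fragments rather than full test names. A minimal sketch of how such fragments could be applied, assuming they are matched as plain substrings of the failed test names; the exact logic lives in the QuantumESPRESSO easyblock and may differ, and both `split_failures` and the sample failed-test names are hypothetical:

```python
allow_failures = [
    "test_qe_xclib_",
    "--hp_",
    "--ph_",
    "--epw_",
    "--tddfpt_",
    "system--pw_scf--scf-rmm-k",
    "system--pw_scf--scf-rmm-paro-k",
    "system--pw_scf-correctness",
    "system--pw_noncolin--noncolin-rmm",
    "system--pw_noncolin-correctness",
]


def split_failures(failed_tests, patterns):
    """Partition failed tests into tolerated and unexpected ones.

    A test is tolerated if any pattern occurs in its name as a substring
    (assumed matching rule, see the lead-in).
    """
    tolerated, unexpected = [], []
    for name in failed_tests:
        if any(pattern in name for pattern in patterns):
            tolerated.append(name)
        else:
            unexpected.append(name)
    return tolerated, unexpected


# Hypothetical failed-test names, for illustration only.
failed = [
    "system--ph_base",
    "system--pw_scf--scf-rmm-k",
    "system--pw_dft--dft1",
    "test_qe_xclib_dft",
]
tolerated, unexpected = split_failures(failed, allow_failures)
```

Under this reading, short fragments such as `"--ph_"` cover a whole family of tests, while the longer entries tied to issue 675 pin individual cases.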

Diff against QuantumESPRESSO-7.3.1-foss-2024a.eb

easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-foss-2024a.eb

diff --git a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-foss-2024a.eb b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
index 8cbd539a51..609a3b47f0 100644
--- a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-foss-2024a.eb
+++ b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,15 +1,15 @@
-name = 'QuantumESPRESSO'
-version = '7.3.1'
+name = "QuantumESPRESSO"
+version = "7.3.1"
+versionsuffix = '-CUDA-%(cudaver)s'
 
-homepage = 'https://www.quantum-espresso.org'
+homepage = "https://www.quantum-espresso.org"
 description = """Quantum ESPRESSO  is an integrated suite of computer codes
 for electronic-structure calculations and materials modeling at the nanoscale.
 It is based on density-functional theory, plane waves, and pseudopotentials
 (both norm-conserving and ultrasoft).
 """
 
-toolchain = {'name': 'foss', 'version': '2024a'}
-
+toolchain = {"name": "nvompi", "version": "2023a"}
 toolchainopts = {
     "usempi": True,
     "openmp": True,
@@ -23,9 +23,7 @@ local_fox_hash = "3453648e6837658b747b895bb7bef4b1ed2eac40"
 # Different from the one at tag qe-7.3.1 because of:
 # https://gitlab.com/QEF/q-e/-/issues/666
 local_d3q_hash = "de4718351e7bbb9d1d12aad2b7ca232d06775b83"
-# Different from the one at tag qe-7.3.1 because of:
-# https://github.com/dceresoli/qe-gipaw/issues/19
-local_qe_gipaw_hash = "79d3a03b7bdc4325e66f3fad02a24c6e6e3e5806"
+local_qe_gipaw_hash = "75b01b694c9ba4df55d294cacc27cf28591b2161"
 local_qmcpack_hash = "f72ab25fa4ea755c1b4b230ae8074b47d5509c70"
 local_w90_hash = "1d6b187374a2d50b509e5e79e2cab01a79ff7ce1"
 
@@ -121,29 +119,39 @@ checksums = [
     None, None, None, None, None, None, None, None
 ]
 
+local_gcc_compiler = ('GCCcore', '12.3.0')
+local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
+
 builddependencies = [
-    ('M4', '1.4.19'),
-    ('CMake', '3.29.3'),
-    ('pkgconf', '2.2.0'),
+    ("M4", "1.4.19", '', local_gcc_compiler),
+    ("CMake", "3.26.3", '', local_gcc_compiler),
+    ("cURL", "8.0.1", '', local_gcc_compiler),
 ]
 dependencies = [
-    ('HDF5', '1.14.5'),
-    ('ELPA', '2024.05.001'),
-    ('libxc', '6.2.2'),
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('OpenBLAS', '0.3.24', '', local_compiler),
+    ('FFTW', '3.3.10', '', local_compiler),
+    ('FFTW.MPI', '3.3.10'),
+    ('ScaLAPACK', '2.2.0'),
+    ("HDF5", "1.14.0", '-CUDA-%(cudaver)s'),
+    ("libxc", "6.2.2", '', local_compiler),
+    # ("ELPA", "2023.05.001"),
 ]
 
 # Disabled because of
 # https://gitlab.com/QEF/q-e/-/issues/667
 # https://github.com/anharmonic/d3q/issues/15
 build_shared_libs = False
+with_cuda = True
+
 with_scalapack = True
-with_fox = True
-with_gipaw = True
-with_d3q = True
+with_gipaw = False  # https://github.com/dceresoli/qe-gipaw/issues/19
+with_d3q = False  # Gives `conflict with access` errors
 with_qmcpack = True
 
 moduleclass = "chem"
 
+test_suite_nprocs = 1  # Unless multiple GPUs are available or GPU should be oversubscribed
 test_suite_threshold = (
     0.4  # Low threshold because of https://gitlab.com/QEF/q-e/-/issues/665
 )
@@ -152,8 +160,17 @@ test_suite_max_failed = (
 )
 test_suite_allow_failures = [
     "test_qe_xclib_",  # 7.3.1:  https://gitlab.com/QEF/q-e/-/issues/640
-    "--hp_",  # 7.3.1:  Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
-    "--ph_",  # 7.3.1:  Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
-    "--epw_",  # 7.3.1:  Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
-    "--tddfpt_",  # 7.3.1:  Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
+
+    # 7.3.1: Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
+    "--hp_",
+    "--ph_",
+    "--epw_",
+    "--tddfpt_",
+
+    # 7.3.1: https://gitlab.com/QEF/q-e/-/issues/675
+    "system--pw_scf--scf-rmm-k",
+    "system--pw_scf--scf-rmm-paro-k",
+    "system--pw_scf-correctness",
+    "system--pw_noncolin--noncolin-rmm",
+    "system--pw_noncolin-correctness",
 ]
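
The dependency tuples in the diff above come in two-, three-, and four-element forms: `(name, version)`, plus an optional version suffix, plus an optional explicit toolchain. A rough sketch of how those forms map to module names, assuming the default `name/version-toolchain-suffix` naming scheme and that entries without an explicit toolchain fall back to the parent `nvompi-2023a` toolchain; `module_name`, the tuple stand-in for `SYSTEM`, and the `TEMPLATES` dict are hypothetical simplifications of what EasyBuild does internally:

```python
SYSTEM = ("system", "system")  # stand-in for EasyBuild's SYSTEM constant
PARENT_TOOLCHAIN = ("nvompi", "2023a")
TEMPLATES = {"cudaver": "12.1.1"}  # value taken from the CUDA dependency

local_compiler = ("NVHPC", "23.7-CUDA-12.1.1")


def module_name(dep, parent=PARENT_TOOLCHAIN):
    """Build a module name from a dependency tuple (hypothetical helper)."""
    name, version = dep[0], dep[1]
    suffix = dep[2] if len(dep) > 2 else ""
    toolchain = dep[3] if len(dep) > 3 else parent
    suffix = suffix % TEMPLATES  # resolves '%(cudaver)s' if present
    if toolchain == SYSTEM:
        return "%s/%s%s" % (name, version, suffix)
    return "%s/%s-%s-%s%s" % (name, version, toolchain[0], toolchain[1], suffix)


cuda = module_name(("CUDA", "12.1.1", "", SYSTEM))
openblas = module_name(("OpenBLAS", "0.3.24", "", local_compiler))
hdf5 = module_name(("HDF5", "1.14.0", "-CUDA-%(cudaver)s"))
scalapack = module_name(("ScaLAPACK", "2.2.0"))
```

The resulting names line up with the easyconfig filenames listed in this PR, for example `HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb` and `OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb`.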

Updated software `ScaLAPACK-2.2.0-nvompi-2023a.eb`

Diff against ScaLAPACK-2.2.0-gompi-2024a-fb.eb

easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gompi-2024a-fb.eb

diff --git a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gompi-2024a-fb.eb b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
index 1bccc16f38..8e4f83e46c 100644
--- a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gompi-2024a-fb.eb
+++ b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
@@ -1,12 +1,11 @@
 name = 'ScaLAPACK'
 version = '2.2.0'
-versionsuffix = '-fb'
 
 homepage = 'https://www.netlib.org/scalapack/'
 description = """The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines
  redesigned for distributed memory MIMD parallel computers."""
 
-toolchain = {'name': 'gompi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'extra_fflags': '-lpthread', 'openmp': True, 'pic': True, 'usempi': True}
 
 source_urls = [homepage]
@@ -18,19 +17,19 @@ checksums = [
 ]
 
 builddependencies = [
-    ('CMake', '3.29.3'),
+    ('CMake', '3.26.3'),
 ]
 
 dependencies = [
-    ('FlexiBLAS', '3.4.4'),
+    ('OpenBLAS', '0.3.24'),
 ]
 
 # Config Opts based on AOCL User Guide:
 # https://developer.amd.com/wp-content/resources/AOCL_User%20Guide_2.2.pdf
 
 configopts = '-DBUILD_SHARED_LIBS=ON '
-configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
-configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
 
 sanity_check_paths = {
     'files': ['lib/libscalapack.%s' % SHLIB_EXT, 'lib64/libscalapack.%s' % SHLIB_EXT],

Diff against ScaLAPACK-2.2.0-gmpich-2024.06-fb.eb

easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gmpich-2024.06-fb.eb

diff --git a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gmpich-2024.06-fb.eb b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
index 28d4f766d5..8e4f83e46c 100644
--- a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gmpich-2024.06-fb.eb
+++ b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
@@ -1,12 +1,11 @@
 name = 'ScaLAPACK'
 version = '2.2.0'
-versionsuffix = '-fb'
 
 homepage = 'https://www.netlib.org/scalapack/'
 description = """The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines
  redesigned for distributed memory MIMD parallel computers."""
 
-toolchain = {'name': 'gmpich', 'version': '2024.06'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'extra_fflags': '-lpthread', 'openmp': True, 'pic': True, 'usempi': True}
 
 source_urls = [homepage]
@@ -22,15 +21,15 @@ builddependencies = [
 ]
 
 dependencies = [
-    ('FlexiBLAS', '3.3.1'),
+    ('OpenBLAS', '0.3.24'),
 ]
 
 # Config Opts based on AOCL User Guide:
 # https://developer.amd.com/wp-content/resources/AOCL_User%20Guide_2.2.pdf
 
 configopts = '-DBUILD_SHARED_LIBS=ON '
-configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
-configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
 
 sanity_check_paths = {
     'files': ['lib/libscalapack.%s' % SHLIB_EXT, 'lib64/libscalapack.%s' % SHLIB_EXT],

Crivella added 2 commits April 8, 2024 12:35

Added cuda recipe for QE

ce4438d

Added EC files for nvofbf toolchain + QE

7a62d7e

migueldiascosta added the update label Apr 16, 2024

migueldiascosta added this to the 4.x milestone Apr 16, 2024

Crivella added 7 commits April 17, 2024 15:12

Added HDF5

32c751f

Added libxc

dac0e06

Changed QE to compile from nvompi

0133074

Removed flexiblas and child packages

706e9d1

Cleanup and better docs

7ebde85

Revert "Removed flexiblas and child packages"

9a45407

This reverts commit 706e9d1.

Removed flexiblas and child packages

ad07ec8

Crivella changed the title ~~{numlib,chem,tollchain}[NVHPC/23.7-CUDA-12.1.1] nvofbf-2023a + QuantumESPRESSO-7.3.1 (GPU enabled)~~ {numlib,chem,tollchain}[NVHPC/23.7-CUDA-12.1.1] nvompi-2023a + QuantumESPRESSO-7.3.1 (GPU enabled) Apr 22, 2024

bedroge mentioned this pull request Jun 5, 2024

add patches for failing LAPACK tests and RISC-V test segfaults to OpenBLAS 0.3.27 #20745

Merged

This comment was marked as off-topic.


Crivella added 2 commits September 6, 2024 15:31

Removed unused file

0a4bd25

Merge branch 'develop' into feature-QE_cuda

b2126f7

Updated with new extras

e85a5ef

Crivella added a commit to Crivella/easybuild-easyconfigs that referenced this pull request Oct 3, 2024

Readded files removed from PR easybuilders#20364

0c2a597

This reverts commit 706e9d1.

Crivella mentioned this pull request Oct 3, 2024

{toolchain}[NVHPC/23.7-CUDA-12.1.1] nvofbf-2023a #21530

Open

Crivella mentioned this pull request Oct 3, 2024

{chem}[nvofbf/2023a-CUDA-12.1.1] MetalWalls 21.06.1 #21533

Open

Fix for new easyblock

5869132


Crivella commented Apr 15, 2024 (edited)

Crivella commented Apr 18, 2024

cgross95 commented May 21, 2024

Crivella commented May 22, 2024 (edited)

cgross95 commented May 22, 2024

cgross95 commented Jun 7, 2024

beeebiii commented Sep 5, 2024 (edited)

Crivella commented Sep 5, 2024

beeebiii commented Sep 5, 2024 (edited)

Crivella commented Sep 5, 2024

This comment was marked as off-topic.

Crivella commented Sep 5, 2024

Crivella commented Sep 9, 2024

yqshao commented Oct 25, 2024 (edited)

github-actions bot commented Dec 5, 2024
