
{numlib,chem,tollchain}[NVHPC/23.7-CUDA-12.1.1] nvompi-2023a + QuantumESPRESSO-7.3.1 (GPU enabled) #20364

Open · wants to merge 13 commits into base: develop
Conversation

Crivella
Contributor

@Crivella Crivella commented Apr 15, 2024

Added easyconfig files for nvofbf toolchain + QE 7.3.1

local compilers:

  • GCC/12.3.0
  • CUDA/12.1.1

Added toolchain/numlib

  • nvofbf-2023a
    • nvompi-2023a
      • NVHPC-23.7-CUDA-12.1.1
      • OpenMPI-4.1.5
    • FlexiBLAS-3.3.1
      • OpenBLAS-0.3.24
    • FFTW-3.3.10
    • FFTW.MPI-3.3.10
    • ScaLAPACK-2.2.0-fb

Added easyconfigs

  • HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
  • libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
  • QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb

NOTES:

Solved issues:

Open issue:

  • Occasional segfault in the QE test suite, caused by FlexiBLAS, when calling the ZHEEV LAPACK routine
    • Bug does not manifest when running the code with cuda-gdb
    • Tested starting from nvompi linking directly to OpenBLAS and the error was not present
  • Segfault in 3 test cases with RMM-DIIS diagonalization with k-points other than GAMMA; most likely a QE bug (https://gitlab.com/QEF/q-e/-/issues/675)
  • Full CUDA libxc: https://gitlab.com/libxc/libxc/-/issues/135
    • Tested patch from commit e648f37b
      • Compile time goes from ~5min to ~3.5h
      • Tests are unable to run
      • I would argue that, for now, since this is not officially supported with CMake and only experimental with autotools, and since it is not a widely used feature of QE, it is acceptable not to have the libxc routines run on GPU
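
As a side note on isolating the FlexiBLAS/ZHEEV issue: besides rebuilding nvompi to link OpenBLAS directly, FlexiBLAS also lets you swap the backend at runtime through an environment variable, which can help rule its dispatch layer in or out without a rebuild. A minimal sketch (the backend name is an assumption, not from this PR; check your FlexiBLAS configuration):

```python
import os

# Route BLAS/LAPACK calls made through FlexiBLAS straight to OpenBLAS for this
# process and any child process (e.g. a pw.x run launched from here), bypassing
# the default backend. The backend name "OPENBLAS" is an assumption; check the
# output of `flexiblas list` for the names registered on your system.
os.environ["FLEXIBLAS"] = "OPENBLAS"
```

If the segfault disappears with the backend forced to OpenBLAS, that points at the FlexiBLAS layer rather than OpenBLAS itself, consistent with the rebuild test above.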

@migueldiascosta migueldiascosta added this to the 4.x milestone Apr 16, 2024
@Crivella
Contributor Author

Comparison of code efficiency when linked to EB numlibs (no prefix) vs. linked to NVHPC math_libs (-test prefix) shows no significant difference when running on one node with an A100 GPU

[ RUN      ] MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-test %threads=1 /bf4db141 @vega-gpu:default+default
[ RUN      ] MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-CUDA-12.1.1 %threads=1 /e4ce2bb2 @vega-gpu:default+default
[       OK ] (1/2) MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-test %threads=1 /bf4db141 @vega-gpu:default+default
P: extract_report_time: 0 s (r:0, l:None, u:None)
P: PWSCF_cpu: 231.98 s (r:0, l:None, u:None)
P: PWSCF_wall: 241.82 s (r:0, l:None, u:None)
P: electrons_cpu: 213.8 s (r:0, l:None, u:None)
P: electrons_wall: 216.04 s (r:0, l:None, u:None)
P: c_bands_cpu: 181.97 s (r:0, l:None, u:None)
P: c_bands_wall: 183.78 s (r:0, l:None, u:None)
P: cegterg_cpu: 142.72 s (r:0, l:None, u:None)
P: cegterg_wall: 144.01 s (r:0, l:None, u:None)
P: calbec_cpu: 0.12 s (r:0, l:None, u:None)
P: calbec_wall: 0.55 s (r:0, l:None, u:None)
P: fft_cpu: 0.12 s (r:0, l:None, u:None)
P: fft_wall: 0.14 s (r:0, l:None, u:None)
P: ffts_cpu: 0.0 s (r:0, l:None, u:None)
P: ffts_wall: 0.0 s (r:0, l:None, u:None)
P: fftw_cpu: 1.26 s (r:0, l:None, u:None)
P: fftw_wall: 77.36 s (r:0, l:None, u:None)
[       OK ] (2/2) MINE_QESPRESSO %ecut=250 %nbnd=400 %module_name=QuantumESPRESSO/7.3.1-nvompi-2023a-CUDA-12.1.1 %threads=1 /e4ce2bb2 @vega-gpu:default+default
P: extract_report_time: 0 s (r:0, l:None, u:None)
P: PWSCF_cpu: 232.44 s (r:0, l:None, u:None)
P: PWSCF_wall: 241.74 s (r:0, l:None, u:None)
P: electrons_cpu: 214.16 s (r:0, l:None, u:None)
P: electrons_wall: 216.18 s (r:0, l:None, u:None)
P: c_bands_cpu: 182.3 s (r:0, l:None, u:None)
P: c_bands_wall: 183.9 s (r:0, l:None, u:None)
P: cegterg_cpu: 143.11 s (r:0, l:None, u:None)
P: cegterg_wall: 144.18 s (r:0, l:None, u:None)
P: calbec_cpu: 0.12 s (r:0, l:None, u:None)
P: calbec_wall: 0.56 s (r:0, l:None, u:None)
P: fft_cpu: 0.0 s (r:0, l:None, u:None)
P: fft_wall: 0.01 s (r:0, l:None, u:None)
P: ffts_cpu: 0.0 s (r:0, l:None, u:None)
P: ffts_wall: 0.0 s (r:0, l:None, u:None)
P: fftw_cpu: 1.24 s (r:0, l:None, u:None)
P: fftw_wall: 77.23 s (r:0, l:None, u:None)
[----------] all spawned checks have finished

@Crivella Crivella changed the title {numlib,chem,tollchain}[NVHPC/23.7-CUDA-12.1.1] nvofbf-2023a + QuantumESPRESSO-7.3.1 (GPU enabled) {numlib,chem,tollchain}[NVHPC/23.7-CUDA-12.1.1] nvompi-2023a + QuantumESPRESSO-7.3.1 (GPU enabled) Apr 22, 2024
@cgross95
Contributor

Thanks for putting all of this together! Our site is interested in a GPU-enabled QuantumESPRESSO build, so we've been testing it.

Were you able to get around the "other error"s that occur in LAPACK testing when building OpenBLAS? Using the OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb easyconfig as provided gives us 55 other errors:

                        -->   LAPACK TESTING SUMMARY  <--
SUMMARY                 nb test run     numerical error         other error  
================        ===========     =================       ================  
REAL                    1328283         0       (0.000%)        0       (0.000%)        
DOUBLE PRECISION        1328013         10      (0.001%)        0       (0.000%)        
COMPLEX                 769507          159     (0.021%)        55      (0.007%)        
COMPLEX16               780654          116     (0.015%)        0       (0.000%)        

--> ALL PRECISIONS      4206457         285     (0.007%)        55      (0.001%)        

I saw that you had done some work on OpenBLAS issue #4652 to get some of the numerical failures down, but I was wondering whether you were ever able to get rid of the other errors that stop EasyBuild from finishing.

@Crivella
Contributor Author

Crivella commented May 22, 2024

@cgross95
In my case, compiling on a machine with an A100 GPU, I ended up with only 148 numerical errors.

                        -->   LAPACK TESTING SUMMARY  <--
SUMMARY                 nb test run     numerical error         other error
================        ===========     =================       ================
REAL                    1326099         12      (0.001%)        0       (0.000%)
DOUBLE PRECISION        1326921         36      (0.003%)        0       (0.000%)
COMPLEX                 762663          42      (0.006%)        0       (0.000%)
COMPLEX16               771518          58      (0.008%)        0       (0.000%)

--> ALL PRECISIONS      4187201         148     (0.004%)        0       (0.000%)

What hardware are you trying this on?
It would be interesting to find out whether this is strictly OpenBLAS/LAPACK related, or whether it is a matter of setting more compiler flags for different architectures.

I think I was still getting some other errors as well with 0.3.27, but I didn't investigate much further since I was aiming at 0.3.24 for this release (in that case I was getting 14 errors related to the ZHSEQR and ZGEEV routines failing to find all eigenvalues).

The logs should give you further details on which LAPACK routine failed and with what error code (each routine documents the meaning of its error codes in its source/documentation comments).
If you think those errors are not a problem, you can also raise the threshold of allowed LAPACK test failures by changing the value assigned to max_failing_lapack_tests_num_errors, and additionally add max_failing_lapack_tests_other_errors to allow the "other" errors.
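
For reference, a minimal sketch of what raising those thresholds could look like in the OpenBLAS easyconfig (the values below are illustrative, not a recommendation; pick numbers matching the failures you actually observe):

```python
# Easyconfig fragment (illustrative values): make the LAPACK test step
# tolerate the observed failures instead of aborting the installation.
run_lapack_tests = True
max_failing_lapack_tests_num_errors = 160   # cap for "numerical error" failures
max_failing_lapack_tests_other_errors = 55  # also tolerate "other error" failures
```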

@cgross95
Contributor

I'm compiling on a V100S with an Intel Xeon Skylake CPU on Ubuntu 22.04. We also have some A100 cards, but we're in the midst of migrating everything in our cluster to Ubuntu, so they're not easily accessible at the moment. I'll dig into the LAPACK testing logs and see if I can produce some more useful debugging information.

@cgross95
Contributor

cgross95 commented Jun 7, 2024

I finally got access to our A100 cards and can report that there were no "other error"s in the LAPACK tests. I ended up with 152 numerical errors, so I increased the max_failing_lapack_tests_num_errors easyconfig parameter and was able to successfully install OpenBLAS. I'm continuing with the rest of the build now.

@beeebiii

beeebiii commented Sep 5, 2024

Hi, I'm compiling on a V100S with an Intel Xeon Skylake CPU on Ubuntu 22.04. What additional changes do you think I should make to be able to use QuantumESPRESSO (GPU enabled)? When I use this PR (eb --from-pr 20364 -r), I get a checksum error in libxc, which I fixed; afterwards I get an error in OpenBLAS/0.3.24-NVHPC-23.7-CUDA-12.1.1. The error I get is:

Error limit reached. Use -fmax-errors=N to change the limit, N=0 for unlimited.
100 errors detected in the compilation of "../kernel/x86_64/sgemv_t_4.c".
Compilation terminated.
make[1]: *** [Makefile.L2:268: sgemv_t.o] Error 2
make[1]: Leaving directory '/test/parta/EASYBUILD/build/OpenBLAS/0.3.24/NVHPC-23.7-CUDA-12.1.1/OpenBLAS-0.3.24/kernel'
make: *** [Makefile:184: libs] Error 1 (at easybuild/tools/run.py:682 in parse_cmd_output)
Should I increase -fmax-errors, since I have a total of 241 errors detected during the compilation of ../kernel/x86_64/...? Or how else do you think I can solve this problem?

@Crivella
Contributor Author

Crivella commented Sep 5, 2024

@beeebiii
Regarding the checksum for libxc: it is related to this issue and also this PR.
Since the checksum for the code is not stable, it is best to use --ignore-checksums (for security, only after you've checked that the libxc checksum is the only one failing and that the downloaded files are what you expect).

The other error you are reporting seems related to OpenBLAS. In my tests on an A100 with an AMD Zen 2 CPU I did not encounter failures in the compilation (only some failures in the test suite).
It seems @cgross95 was on a system similar to yours but also did not encounter it.

One odd thing: I am not sure -fmax-errors is even supported by the NVHPC compilers; it is a GCC-only flag. We would probably need the full debug log to understand what is happening here (e.g. is gcc being used instead of nvcc?).

@beeebiii

beeebiii commented Sep 5, 2024

Yeah, you are right; I think gcc is being used instead of nvcc.

Remark: individual warnings can be suppressed with "--diag_suppress <warning-name>"
"../kernel/x86_64/sgemv_t_4.c", line 31: warning: unrecognized GCC pragma [unrecognized_gcc_pragma]
  #pragma GCC optimize("no-tree-vectorize")

Very strange; I did compile libxc/6.2.2-NVHPC-23.7-CUDA-12.1.1 myself, as I had to change the checksums. Other than that I just run eb --from-pr 20364 -r --accept-eula-for=CUDA --cuda-compute-capabilities=9.0 --accept-eula-for=NVHPC

@Crivella
Contributor Author

Crivella commented Sep 5, 2024

If you look at the OpenBLAS easyconfig, only NVHPC should be used; EasyBuild should not be aware of other compiler toolchains in that instance.
I would suggest running the build with -d and checking the logs to see whether something odd is happening with environment variables, or with some other module already loaded in your environment.

@beeebiii

This comment was marked as off-topic.

@Crivella
Contributor Author

Crivella commented Sep 5, 2024

That is basically EasyBuild reporting the easyconfig file being used; the debug logging adds much more.
Look for DEBUG Not skipping configure step in the logs and start scrolling

  • up, to see what environment variables were set by EasyBuild and what modules were found loaded
  • down, to see the output of the configure command

I would venture to guess the problem is in one of those.
Also, is OpenBLAS the first thing that gets installed with -r (e.g. did something else that installs with NVHPC alone complete successfully)?
BTW, this conversation might be better suited for Slack. If you are on the EB Slack you can @ me as Davide Grassano.

@Crivella
Contributor Author

Crivella commented Sep 9, 2024

To summarize the problem @beeebiii was having: the -tp=px flag was causing compilation trouble, which was resolved by commenting out the 'optarch': 'GENERIC' line in the OpenBLAS easyconfig (so the flag falls back to -tp=host).
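
Concretely, the workaround amounts to something like this in the OpenBLAS easyconfig's toolchainopts (a sketch based on this thread; the 'precise' option and the issue link are already in the PR's easyconfig):

```python
# Commenting out 'optarch': 'GENERIC' stops EasyBuild from passing -tp=px to
# the NVHPC compilers, so they fall back to their default -tp=host (optimize
# for the build host), which avoided the compilation failure reported above.
toolchainopts = {
    # https://github.com/OpenMathLib/OpenBLAS/issues/4625
    'precise': True,
    # 'optarch': 'GENERIC',
}
```

Note that -tp=host ties the resulting binaries to the build host's CPU generation, so this trades portability for a working build.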

Crivella added a commit to Crivella/easybuild-easyconfigs that referenced this pull request Oct 3, 2024
@yqshao
Contributor

yqshao commented Oct 25, 2024

I was trying this, and I think the 55 "other error"s that @cgross95 observed were discussed and resolved in #19021


github-actions bot commented Dec 5, 2024

Updated software FFTW.MPI-3.3.10-nvompi-2023a.eb

Diff against FFTW.MPI-3.3.10-gompi-2024a.eb

easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gompi-2024a.eb

diff --git a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gompi-2024a.eb b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
index 2ed420c412..6a87b1494a 100644
--- a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gompi-2024a.eb
+++ b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
@@ -5,14 +5,17 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'gompi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = ['fftw-%(version)s.tar.gz']
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
-dependencies = [('FFTW', '3.3.10')]
+local_cuda = '12.1.1'
+local_compiler = ('NVHPC', '23.7-CUDA-%s' % local_cuda)
+
+dependencies = [('FFTW', '3.3.10', '', local_compiler)]
 
 runtest = 'check'
 
Diff against FFTW.MPI-3.3.10-gmpich-2024.06.eb

easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gmpich-2024.06.eb

diff --git a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gmpich-2024.06.eb b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
index 4b79e9e7c2..6a87b1494a 100644
--- a/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-gmpich-2024.06.eb
+++ b/easybuild/easyconfigs/f/FFTW.MPI/FFTW.MPI-3.3.10-nvompi-2023a.eb
@@ -5,14 +5,17 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'gmpich', 'version': '2024.06'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = ['fftw-%(version)s.tar.gz']
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
-dependencies = [('FFTW', '3.3.10')]
+local_cuda = '12.1.1'
+local_compiler = ('NVHPC', '23.7-CUDA-%s' % local_cuda)
+
+dependencies = [('FFTW', '3.3.10', '', local_compiler)]
 
 runtest = 'check'
 

Updated software FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb

Diff against FFTW-3.3.10-GCC-13.3.0.eb

easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.3.0.eb

diff --git a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.3.0.eb b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
index bf18544e70..670602aadc 100644
--- a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.3.0.eb
+++ b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
@@ -5,13 +5,16 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = [SOURCELOWER_TAR_GZ]
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
+# Does not work with nvc
+with_quad_prec = False
+
 runtest = 'check'
 
 moduleclass = 'numlib'
Diff against FFTW-3.3.10-GCC-13.2.0.eb

easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.2.0.eb

diff --git a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.2.0.eb b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
index 32652387f8..670602aadc 100644
--- a/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-GCC-13.2.0.eb
+++ b/easybuild/easyconfigs/f/FFTW/FFTW-3.3.10-NVHPC-23.7-CUDA-12.1.1.eb
@@ -5,13 +5,16 @@ homepage = 'https://www.fftw.org'
 description = """FFTW is a C subroutine library for computing the discrete Fourier transform (DFT)
 in one or more dimensions, of arbitrary input size, and of both real and complex data."""
 
-toolchain = {'name': 'GCC', 'version': '13.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 toolchainopts = {'pic': True}
 
 source_urls = [homepage]
 sources = [SOURCELOWER_TAR_GZ]
 checksums = ['56c932549852cddcfafdab3820b0200c7742675be92179e59e6215b340e26467']
 
+# Does not work with nvc
+with_quad_prec = False
+
 runtest = 'check'
 
 moduleclass = 'numlib'

Updated software HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb

Diff against HDF5-1.14.5-gompi-2024a.eb

easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-gompi-2024a.eb

diff --git a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-gompi-2024a.eb b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
index 5917196292..74256e2cb4 100644
--- a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-gompi-2024a.eb
+++ b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,27 +1,27 @@
 name = 'HDF5'
 # Note: Odd minor releases are only RCs and should not be used.
-version = '1.14.5'
+version = '1.14.0'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://portal.hdfgroup.org/display/support'
 description = """HDF5 is a data model, library, and file format for storing and managing data.
  It supports an unlimited variety of datatypes, and is designed for flexible
  and efficient I/O and for high volume and complex data."""
 
-toolchain = {'name': 'gompi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True, 'usempi': True}
 
-source_urls = ['https://github.com/HDFGroup/hdf5/archive']
-sources = ['hdf5_%(version)s.tar.gz']
-checksums = ['c83996dc79080a34e7b5244a1d5ea076abfd642ec12d7c25388e2fdd81d26350']
+source_urls = ['https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-%(version_major_minor)s/hdf5-%(version)s/src']
+sources = [SOURCELOWER_TAR_GZ]
+checksums = ['a571cc83efda62e1a51a0a912dd916d01895801c5025af91669484a1575a6ef4']
 
-dependencies = [
-    ('zlib', '1.3.1'),
-    ('Szip', '2.1.1'),
-]
+local_gcc_compiler = ('GCCcore', '12.3.0')
+# local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
 
-postinstallcmds = [
-    'sed -i -r "s, -I[^[:space:]]+H5FDsubfiling , -I%(installdir)s/include ,g" %(installdir)s/bin/h5c++',
-    'sed -i -r "s, -I[^[:space:]]+H5FDsubfiling , -I%(installdir)s/include ,g" %(installdir)s/bin/h5pcc',
+dependencies = [
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('zlib', '1.2.13', '', local_gcc_compiler),
+    ('Szip', '2.1.1', '', local_gcc_compiler),
 ]
 
 moduleclass = 'data'
Diff against HDF5-1.14.5-iimpi-2024a.eb

easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-iimpi-2024a.eb

diff --git a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-iimpi-2024a.eb b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
index 153463a223..74256e2cb4 100644
--- a/easybuild/easyconfigs/h/HDF5/HDF5-1.14.5-iimpi-2024a.eb
+++ b/easybuild/easyconfigs/h/HDF5/HDF5-1.14.0-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,26 +1,27 @@
 name = 'HDF5'
 # Note: Odd minor releases are only RCs and should not be used.
-version = '1.14.5'
+version = '1.14.0'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://portal.hdfgroup.org/display/support'
 description = """HDF5 is a data model, library, and file format for storing and managing data.
  It supports an unlimited variety of datatypes, and is designed for flexible
  and efficient I/O and for high volume and complex data."""
 
-toolchain = {'name': 'iimpi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'pic': True, 'usempi': True}
 
-source_urls = ['https://github.com/HDFGroup/hdf5/archive']
-sources = ['hdf5_%(version)s.tar.gz']
-checksums = ['c83996dc79080a34e7b5244a1d5ea076abfd642ec12d7c25388e2fdd81d26350']
+source_urls = ['https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-%(version_major_minor)s/hdf5-%(version)s/src']
+sources = [SOURCELOWER_TAR_GZ]
+checksums = ['a571cc83efda62e1a51a0a912dd916d01895801c5025af91669484a1575a6ef4']
 
-# replace src include path with installation dir for $H5BLD_CPPFLAGS
-_regex = 's, -I[^[:space:]]+H5FDsubfiling , -I%(installdir)s/include ,g'
-postinstallcmds = ['sed -i -r "%s" %%(installdir)s/bin/%s' % (_regex, x) for x in ['h5c++', 'h5pcc']]
+local_gcc_compiler = ('GCCcore', '12.3.0')
+# local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
 
 dependencies = [
-    ('zlib', '1.3.1'),
-    ('Szip', '2.1.1'),
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('zlib', '1.2.13', '', local_gcc_compiler),
+    ('Szip', '2.1.1', '', local_gcc_compiler),
 ]
 
 moduleclass = 'data'

Updated software libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb

Diff against libxc-6.2.2-GCC-13.2.0-nofhc.eb

easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.2.0-nofhc.eb

diff --git a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.2.0-nofhc.eb b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
index 709703f294..2cacc1fb9d 100644
--- a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.2.0-nofhc.eb
+++ b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
@@ -2,22 +2,22 @@ easyblock = 'CMakeMake'
 
 name = 'libxc'
 version = '6.2.2'
-versionsuffix = '-nofhc'
 
 homepage = 'https://libxc.gitlab.io'
 description = """Libxc is a library of exchange-correlation functionals for density-functional theory.
  The aim is to provide a portable, well tested and reliable set of exchange and correlation functionals."""
 
-toolchain = {'name': 'GCC', 'version': '13.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 
 source_urls = ['https://gitlab.com/libxc/libxc/-/archive/%(version)s/']
 sources = [SOURCE_TAR_GZ]
 checksums = [('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
-              '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc')]
+              '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc'),
+             ]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('Perl', '5.38.0'),
+    ('CMake', '3.26.3'),
+    ('Perl', '5.36.1'),
 ]
 
 local_common_configopts = "-DENABLE_FORTRAN=ON -DENABLE_XHOST=OFF "
@@ -27,9 +27,6 @@ local_common_configopts = "-DENABLE_FORTRAN=ON -DENABLE_XHOST=OFF "
 # see also https://github.com/pyscf/pyscf/issues/1103
 local_common_configopts += "-DDISABLE_KXC=OFF -DDISABLE_LXC=OFF"
 
-# Disable fhc, this needs to support codes (like VASP) relying on the projector augmented wave (PAW) approach
-local_common_configopts += ' -DDISABLE_FHC=ON'
-
 # perform iterative build to get both static and shared libraries
 configopts = [
     local_common_configopts + ' -DBUILD_SHARED_LIBS=OFF',
Diff against libxc-6.2.2-GCC-13.3.0.eb

easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.3.0.eb

diff --git a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.3.0.eb b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
index 9e4de5e808..2cacc1fb9d 100644
--- a/easybuild/easyconfigs/l/libxc/libxc-6.2.2-GCC-13.3.0.eb
+++ b/easybuild/easyconfigs/l/libxc/libxc-6.2.2-NVHPC-23.7-CUDA-12.1.1.eb
@@ -7,19 +7,17 @@ homepage = 'https://libxc.gitlab.io'
 description = """Libxc is a library of exchange-correlation functionals for density-functional theory.
  The aim is to provide a portable, well tested and reliable set of exchange and correlation functionals."""
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
 
-source_urls = ['https://gitlab.com/%(name)s/%(name)s/-/archive/%(version)s/']
+source_urls = ['https://gitlab.com/libxc/libxc/-/archive/%(version)s/']
 sources = [SOURCE_TAR_GZ]
-checksums = [
-    ('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
-     '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc',
-     'd1b65ef74615a1e539d87a0e6662f04baf3a2316706b4e2e686da3193b26b20f'),
-]
+checksums = [('a0f6f1bba7ba5c0c85b2bfe65aca1591025f509a7f11471b4cd651a79491b045',
+              '3b0523924579cf494cafc6fea92945257f35692b004217d3dfd3ea7ca780e8dc'),
+             ]
 
 builddependencies = [
-    ('CMake', '3.29.3'),
-    ('Perl', '5.38.2'),
+    ('CMake', '3.26.3'),
+    ('Perl', '5.36.1'),
 ]
 
 local_common_configopts = "-DENABLE_FORTRAN=ON -DENABLE_XHOST=OFF "

Updated software nvompi-2023a.eb

Diff against nvompi-2022.07.eb

easybuild/easyconfigs/n/nvompi/nvompi-2022.07.eb

diff --git a/easybuild/easyconfigs/n/nvompi/nvompi-2022.07.eb b/easybuild/easyconfigs/n/nvompi/nvompi-2023a.eb
index 1a1647cbfa..31a17edb3f 100644
--- a/easybuild/easyconfigs/n/nvompi/nvompi-2022.07.eb
+++ b/easybuild/easyconfigs/n/nvompi/nvompi-2023a.eb
@@ -1,19 +1,20 @@
 easyblock = 'Toolchain'
 
 name = 'nvompi'
-version = '2022.07'
+version = '2023a'
 
 homepage = '(none)'
 description = 'NVHPC based compiler toolchain, including OpenMPI for MPI support.'
 
 toolchain = SYSTEM
 
-local_compiler = ('NVHPC', '22.7-CUDA-11.7.0')
+local_cuda = '12.1.1'
+local_compiler = ('NVHPC', '23.7-CUDA-%s' % local_cuda)
 
 dependencies = [
     local_compiler,
-    ('OpenMPI', '4.1.4', '', local_compiler),
-    ('CUDA', '11.7.0', '', SYSTEM),
+    ('CUDA', local_cuda, '', SYSTEM),
+    ('OpenMPI', '4.1.5', '', local_compiler),
 ]
 
 moduleclass = 'toolchain'

Updated software OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb

Diff against OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb

easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb

diff --git a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
index 5527c667f6..db86d55f15 100644
--- a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.3.0-seq-iface64.eb
+++ b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,11 +1,15 @@
 name = 'OpenBLAS'
-version = '0.3.27'
-versionsuffix = '-seq-iface64'
+version = '0.3.24'
 
-homepage = 'https://www.openblas.net/'
+homepage = 'http://www.openblas.net/'
 description = "OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version."
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    # https://github.com/OpenMathLib/OpenBLAS/issues/4625
+    'precise': True,
+    'optarch': 'GENERIC',
+}
 
 source_urls = [
     # order matters, trying to download the large.tgz/timing.tgz LAPACK tarballs from GitHub causes trouble
@@ -17,40 +21,32 @@ patches = [
     ('large.tgz', '.'),
     ('timing.tgz', '.'),
     'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch',
+    'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch',
     'OpenBLAS-0.3.21_fix-order-vectorization.patch',
-    'OpenBLAS-0.3.26_lapack_qr_noninittest.patch',
-    'OpenBLAS-0.3.27_fix_zscal.patch',
-    'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch',
+    'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch',
+    'OpenBLAS-0.3.23_lapack_test_nomain.patch'  # https://github.com/OpenMathLib/OpenBLAS/issues/4625
 ]
 checksums = [
-    {'v0.3.27.tar.gz': 'aa2d68b1564fe2b13bc292672608e9cdeeeb6dc34995512e65c3b10f4599e897'},
+    {'v0.3.24.tar.gz': 'ceadc5065da97bd92404cac7254da66cc6eb192679cf1002098688978d4d5132'},
     {'large.tgz': 'f328d88b7fa97722f271d7d0cfea1c220e0f8e5ed5ff01d8ef1eb51d6f4243a1'},
     {'timing.tgz': '999c65f8ea8bd4eac7f1c7f3463d4946917afd20a997807300fe35d70122f3af'},
     {'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch':
      'e6b326fb8c4a8a6fd07741d9983c37a72c55c9ff9a4f74a80e1352ce5f975971'},
+    {'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch':
+     '1dbd0f9473963dbdd9131611b455d8a801f1e995eae82896186d3d3ffe6d5f03'},
     {'OpenBLAS-0.3.21_fix-order-vectorization.patch':
      '08af834e5d60441fd35c128758ed9c092ba6887c829e0471ecd489079539047d'},
-    {'OpenBLAS-0.3.26_lapack_qr_noninittest.patch': '4781bf1d7b239374fd8069e15b4e2c0ef0e8efaa1a7d4c33557bd5b27e5de77c'},
-    {'OpenBLAS-0.3.27_fix_zscal.patch': '9210d7b66538dabaddbe1bfceb16f8225708856f60876ca5561b19d3599f9fd1'},
-    {'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch':
-     'f374e41efffd592ab1c9034df9e7abf1045ed151f4fc0fd0da618ce9826f2d4b'},
+    {'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch':
+     'ab7e0af05f9b2a2ced32f3875e1e3767d9c3531a455421a38f7324350178a0ff'},
+    {'OpenBLAS-0.3.23_lapack_test_nomain.patch': '63e14e2cb67dd81ecc13fa0e07685f853abe0669d4cadd520247b88789346948'},
 ]
 
 builddependencies = [
     ('make', '4.4.1'),
     # required by LAPACK test suite
-    ('Python', '3.12.3'),
+    ('Python', '3.11.3'),
 ]
 
-# INTERFACE64=1 needs if you link OpenBLAS for fortran code compied with 64 bit integers (-i8)
-# This would be in intel library naming convention ilp64
-# The USE_OPENMP=0 and USE_THREAD=0 needs for the single threaded version
-# The USE_LOCKING=1 needs for thread safe version (if threaded software calls OpenBLAS, without it
-# OpenBLAS is not thread safe (so only single threaded software would be able to use it)
-buildopts = "INTERFACE64=1 USE_OPENMP=0 USE_THREAD=0 USE_LOCKING=1 "
-testopts = buildopts
-installopts = buildopts
-
 run_lapack_tests = True
 max_failing_lapack_tests_num_errors = 150
 
Diff against OpenBLAS-0.3.27-GCC-13.2.0-seq-iface64.eb

easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.2.0-seq-iface64.eb

diff --git a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.2.0-seq-iface64.eb b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
index f205e063da..db86d55f15 100644
--- a/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.27-GCC-13.2.0-seq-iface64.eb
+++ b/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.24-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,11 +1,15 @@
 name = 'OpenBLAS'
-version = '0.3.27'
-versionsuffix = '-seq-iface64'
+version = '0.3.24'
 
-homepage = 'https://www.openblas.net/'
+homepage = 'http://www.openblas.net/'
 description = "OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version."
 
-toolchain = {'name': 'GCC', 'version': '13.2.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    # https://github.com/OpenMathLib/OpenBLAS/issues/4625
+    'precise': True,
+    'optarch': 'GENERIC',
+}
 
 source_urls = [
     # order matters, trying to download the large.tgz/timing.tgz LAPACK tarballs from GitHub causes trouble
@@ -17,40 +21,32 @@ patches = [
     ('large.tgz', '.'),
     ('timing.tgz', '.'),
     'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch',
+    'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch',
     'OpenBLAS-0.3.21_fix-order-vectorization.patch',
-    'OpenBLAS-0.3.26_lapack_qr_noninittest.patch',
-    'OpenBLAS-0.3.27_fix_zscal.patch',
-    'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch',
+    'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch',
+    'OpenBLAS-0.3.23_lapack_test_nomain.patch'  # https://github.com/OpenMathLib/OpenBLAS/issues/4625
 ]
 checksums = [
-    {'v0.3.27.tar.gz': 'aa2d68b1564fe2b13bc292672608e9cdeeeb6dc34995512e65c3b10f4599e897'},
+    {'v0.3.24.tar.gz': 'ceadc5065da97bd92404cac7254da66cc6eb192679cf1002098688978d4d5132'},
     {'large.tgz': 'f328d88b7fa97722f271d7d0cfea1c220e0f8e5ed5ff01d8ef1eb51d6f4243a1'},
     {'timing.tgz': '999c65f8ea8bd4eac7f1c7f3463d4946917afd20a997807300fe35d70122f3af'},
     {'OpenBLAS-0.3.15_workaround-gcc-miscompilation.patch':
      'e6b326fb8c4a8a6fd07741d9983c37a72c55c9ff9a4f74a80e1352ce5f975971'},
+    {'OpenBLAS-0.3.20_use-xASUM-microkernels-on-new-intel-cpus.patch':
+     '1dbd0f9473963dbdd9131611b455d8a801f1e995eae82896186d3d3ffe6d5f03'},
     {'OpenBLAS-0.3.21_fix-order-vectorization.patch':
      '08af834e5d60441fd35c128758ed9c092ba6887c829e0471ecd489079539047d'},
-    {'OpenBLAS-0.3.26_lapack_qr_noninittest.patch': '4781bf1d7b239374fd8069e15b4e2c0ef0e8efaa1a7d4c33557bd5b27e5de77c'},
-    {'OpenBLAS-0.3.27_fix_zscal.patch': '9210d7b66538dabaddbe1bfceb16f8225708856f60876ca5561b19d3599f9fd1'},
-    {'OpenBLAS-0.3.27_riscv-drop-static-fortran-flag.patch':
-     'f374e41efffd592ab1c9034df9e7abf1045ed151f4fc0fd0da618ce9826f2d4b'},
+    {'OpenBLAS-0.3.23_disable-xDRGES-LAPACK-test.patch':
+     'ab7e0af05f9b2a2ced32f3875e1e3767d9c3531a455421a38f7324350178a0ff'},
+    {'OpenBLAS-0.3.23_lapack_test_nomain.patch': '63e14e2cb67dd81ecc13fa0e07685f853abe0669d4cadd520247b88789346948'},
 ]
 
 builddependencies = [
     ('make', '4.4.1'),
     # required by LAPACK test suite
-    ('Python', '3.11.5'),
+    ('Python', '3.11.3'),
 ]
 
-# INTERFACE64=1 needs if you link OpenBLAS for fortran code compied with 64 bit integers (-i8)
-# This would be in intel library naming convention ilp64
-# The USE_OPENMP=0 and USE_THREAD=0 needs for the single threaded version
-# The USE_LOCKING=1 needs for thread safe version (if threaded software calls OpenBLAS, without it
-# OpenBLAS is not thread safe (so only single threaded software would be able to use it)
-buildopts = "INTERFACE64=1 USE_OPENMP=0 USE_THREAD=0 USE_LOCKING=1 "
-testopts = buildopts
-installopts = buildopts
-
 run_lapack_tests = True
 max_failing_lapack_tests_num_errors = 150
 

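The `checksums` entries in the updated easyconfig are SHA-256 digests keyed by source/patch filename. A minimal sketch (plain Python, not using EasyBuild itself) of the kind of sanity check one could run over such entries; the sample entries below are copied from the diff:

```python
import re

# Sample entries copied from the OpenBLAS easyconfig above
checksums = [
    {'v0.3.24.tar.gz': 'ceadc5065da97bd92404cac7254da66cc6eb192679cf1002098688978d4d5132'},
    {'large.tgz': 'f328d88b7fa97722f271d7d0cfea1c220e0f8e5ed5ff01d8ef1eb51d6f4243a1'},
]

SHA256_RE = re.compile(r'^[0-9a-f]{64}$')


def check_entries(entries):
    """Return the filenames whose digest is not a well-formed SHA-256 hex string."""
    bad = []
    for entry in entries:
        for name, digest in entry.items():
            if not SHA256_RE.match(digest):
                bad.append(name)
    return bad


print(check_entries(checksums))  # → []
```

This only validates the digest format; EasyBuild itself verifies the actual hashes against the downloaded files at build time.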
Updated software OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb

Diff against OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb

easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb

diff --git a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
index e6c772bf64..f858d99eb9 100644
--- a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-NVHPC-24.9-CUDA-12.6.0.eb
+++ b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,63 +1,72 @@
 name = 'OpenMPI'
-version = '5.0.3'
+version = '4.1.5'
 
 homepage = 'https://www.open-mpi.org/'
 description = """The Open MPI Project is an open source MPI-3 implementation."""
 
-toolchain = {'name': 'NVHPC', 'version': '24.9-CUDA-12.6.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    'extra_fcflags': '-Mstandard',  # https://forums.developer.nvidia.com/t/howto-build-openmpi-with-nvhpc-24-1/283219
+}
 
 source_urls = ['https://www.open-mpi.org/software/ompi/v%(version_major_minor)s/downloads']
 sources = [SOURCELOWER_TAR_BZ2]
 patches = [
-    'OpenMPI-5.0.3_fix_hle_make_errors.patch',
-    'OpenMPI-5.0.3_disable_opal_path_nfs_test.patch',
-    ('OpenMPI-5.0.2_build-with-internal-cuda-header.patch', 1)
+    'OpenMPI-4.1.1_build-with-internal-cuda-header.patch',
+    'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch',
+    'OpenMPI-4.1.5_fix-pmix3x.patch',
+    'OpenMPI-4.1.x_add_atomic_wmb.patch',
 ]
 checksums = [
-    {'openmpi-5.0.3.tar.bz2':
-     '990582f206b3ab32e938aa31bbf07c639368e4405dca196fabe7f0f76eeda90b'},
-    {'OpenMPI-5.0.3_fix_hle_make_errors.patch':
-     '881c907a9f5901d5d6af41cd33dffdcecba4a67a9e5123e602542aea57a80895'},
-    {'OpenMPI-5.0.3_disable_opal_path_nfs_test.patch':
-     '75d4417e35252ea3a19b2792f1b06e9aeb408c253aa4921d77226d57b71dee45'},
-    {'OpenMPI-5.0.2_build-with-internal-cuda-header.patch':
-     'f52dc470543f35efef10d651dd159c771ae25f8f76a420d20d87abf4dc769ed7'},
+    {'openmpi-4.1.5.tar.bz2': 'a640986bc257389dd379886fdae6264c8cfa56bc98b71ce3ae3dfbd8ce61dbe3'},
+    {'OpenMPI-4.1.1_build-with-internal-cuda-header.patch':
+     '63eac52736bdf7644c480362440a7f1f0ae7c7cae47b7565f5635c41793f8c83'},
+    {'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch':
+     'b767c7166cf0b32906132d58de5439c735193c9fd09ec3c5c11db8d5fa68750e'},
+    {'OpenMPI-4.1.5_fix-pmix3x.patch': '46edac3dbf32f2a611d45e8a3c8edd3ae2f430eec16a1373b510315272115c40'},
+    {'OpenMPI-4.1.x_add_atomic_wmb.patch': '9494bbc546d661ba5189e44b4c84a7f8df30a87cdb9d96ce2e73a7c8fecba172'},
 ]
 
 builddependencies = [
-    ('pkgconf', '2.2.0'),
-    ('Perl', '5.38.2'),
-    ('Autotools', '20231222'),
+    ('pkgconf', '1.9.5'),
+    ('Perl', '5.36.1'),
+    ('Autotools', '20220317'),
 ]
 
 dependencies = [
-    ('zlib', '1.3.1'),
-    ('hwloc', '2.10.0'),
+    ('zlib', '1.2.13'),
+    ('hwloc', '2.9.1'),
     ('libevent', '2.1.12'),
-    ('UCX', '1.16.0'),
-    ('UCX-CUDA', '1.16.0', '-CUDA-%(cudaver)s'),
-    ('libfabric', '1.21.0'),
-    ('PMIx', '5.0.2'),
-    ('PRRTE', '3.0.5'),
-    ('UCC', '1.3.0'),
-    ('UCC-CUDA', '1.3.0', '-CUDA-%(cudaver)s'),
+    ('UCX', '1.14.1'),
+    ('UCX-CUDA', '1.14.1', '-CUDA-%(cudaver)s'),
+    ('libfabric', '1.18.0'),
+    ('PMIx', '4.2.4'),
+    ('UCC', '1.2.0'),
+    ('UCC-CUDA', '1.2.0', '-CUDA-%(cudaver)s'),
 ]
 
+# Update configure to include changes from the "internal-cuda" patch
+# by running a subset of autogen.pl sufficient to achieve this
+# without doing the full, long-running regeneration.
+preconfigopts = ' && '.join([
+    'cd config',
+    'autom4te --language=m4sh opal_get_version.m4sh -o opal_get_version.sh',
+    'cd ..',
+    'autoconf',
+    'autoheader',
+    'aclocal',
+    'automake',
+    ''
+])
+
 # CUDA related patches and custom configure option can be removed if CUDA support isn't wanted.
-preconfigopts = 'nvc -Iopal/mca/cuda/include -shared opal/mca/cuda/lib/cuda.c -o opal/mca/cuda/lib/libcuda.so && '
-# Update configure to include changes from the "disable_opal_path_nfs_test" patch
-preconfigopts += './autogen.pl --force && '
-
-configopts = '--with-cuda=%(start_dir)s/opal/mca/cuda '
-# Required to prevent internal compiler error in opal.
-configopts += '--enable-alt-short-float=no '
-# Set PGI compilers manually, as NVHPC compilers are not correctly detected
-configopts += 'CC=pgcc CXX=pgc++ FC=pgfortran '
-
-# site specific options
-# configopts += '--without-psm2 '
-# configopts += '--disable-oshmem '
-# configopts += '--with-gpfs '
-configopts += '--with-slurm '
+configopts = '--with-cuda=internal '
+configopts += ' CC=pgcc CXX=pgc++ FC=pgfortran'
+
+# disable MPI1 compatibility for now, see what breaks...
+# configopts += '--enable-mpi1-compatibility '
+
+# to enable SLURM integration (site-specific)
+# configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr'
 
 moduleclass = 'mpi'
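The `preconfigopts` value in the diff above is assembled with `' && '.join(...)`; the trailing empty string in the list is what leaves the command chain ending in `' && '`, so that EasyBuild can append the actual `configure` invocation after it. A quick sketch:

```python
# Rebuild the preconfigopts string exactly as in the easyconfig above
preconfigopts = ' && '.join([
    'cd config',
    'autom4te --language=m4sh opal_get_version.m4sh -o opal_get_version.sh',
    'cd ..',
    'autoconf',
    'autoheader',
    'aclocal',
    'automake',
    ''  # trailing element makes the joined string end in ' && '
])

assert preconfigopts.endswith(' && ')
print(preconfigopts.split(' && ')[0])  # → cd config
```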
Diff against OpenMPI-5.0.3-GCC-13.3.0.eb

easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-GCC-13.3.0.eb

diff --git a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-GCC-13.3.0.eb b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
index 6864e213a9..f858d99eb9 100644
--- a/easybuild/easyconfigs/o/OpenMPI/OpenMPI-5.0.3-GCC-13.3.0.eb
+++ b/easybuild/easyconfigs/o/OpenMPI/OpenMPI-4.1.5-NVHPC-23.7-CUDA-12.1.1.eb
@@ -1,38 +1,72 @@
 name = 'OpenMPI'
-version = '5.0.3'
+version = '4.1.5'
 
 homepage = 'https://www.open-mpi.org/'
 description = """The Open MPI Project is an open source MPI-3 implementation."""
 
-toolchain = {'name': 'GCC', 'version': '13.3.0'}
+toolchain = {'name': 'NVHPC', 'version': '23.7-CUDA-12.1.1'}
+toolchainopts = {
+    'extra_fcflags': '-Mstandard',  # https://forums.developer.nvidia.com/t/howto-build-openmpi-with-nvhpc-24-1/283219
+}
 
 source_urls = ['https://www.open-mpi.org/software/ompi/v%(version_major_minor)s/downloads']
 sources = [SOURCELOWER_TAR_BZ2]
-patches = [('OpenMPI-5.0.2_build-with-internal-cuda-header.patch', 1)]
+patches = [
+    'OpenMPI-4.1.1_build-with-internal-cuda-header.patch',
+    'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch',
+    'OpenMPI-4.1.5_fix-pmix3x.patch',
+    'OpenMPI-4.1.x_add_atomic_wmb.patch',
+]
 checksums = [
-    {'openmpi-5.0.3.tar.bz2': '990582f206b3ab32e938aa31bbf07c639368e4405dca196fabe7f0f76eeda90b'},
-    {'OpenMPI-5.0.2_build-with-internal-cuda-header.patch':
-     'f52dc470543f35efef10d651dd159c771ae25f8f76a420d20d87abf4dc769ed7'},
+    {'openmpi-4.1.5.tar.bz2': 'a640986bc257389dd379886fdae6264c8cfa56bc98b71ce3ae3dfbd8ce61dbe3'},
+    {'OpenMPI-4.1.1_build-with-internal-cuda-header.patch':
+     '63eac52736bdf7644c480362440a7f1f0ae7c7cae47b7565f5635c41793f8c83'},
+    {'OpenMPI-4.1.1_opal-datatype-cuda-performance.patch':
+     'b767c7166cf0b32906132d58de5439c735193c9fd09ec3c5c11db8d5fa68750e'},
+    {'OpenMPI-4.1.5_fix-pmix3x.patch': '46edac3dbf32f2a611d45e8a3c8edd3ae2f430eec16a1373b510315272115c40'},
+    {'OpenMPI-4.1.x_add_atomic_wmb.patch': '9494bbc546d661ba5189e44b4c84a7f8df30a87cdb9d96ce2e73a7c8fecba172'},
 ]
 
 builddependencies = [
-    ('pkgconf', '2.2.0'),
-    ('Autotools', '20231222'),
+    ('pkgconf', '1.9.5'),
+    ('Perl', '5.36.1'),
+    ('Autotools', '20220317'),
 ]
 
 dependencies = [
-    ('zlib', '1.3.1'),
-    ('hwloc', '2.10.0'),
+    ('zlib', '1.2.13'),
+    ('hwloc', '2.9.1'),
     ('libevent', '2.1.12'),
-    ('UCX', '1.16.0'),
-    ('libfabric', '1.21.0'),
-    ('PMIx', '5.0.2'),
-    ('PRRTE', '3.0.5'),
-    ('UCC', '1.3.0'),
+    ('UCX', '1.14.1'),
+    ('UCX-CUDA', '1.14.1', '-CUDA-%(cudaver)s'),
+    ('libfabric', '1.18.0'),
+    ('PMIx', '4.2.4'),
+    ('UCC', '1.2.0'),
+    ('UCC-CUDA', '1.2.0', '-CUDA-%(cudaver)s'),
 ]
 
+# Update configure to include changes from the "internal-cuda" patch
+# by running a subset of autogen.pl sufficient to achieve this
+# without doing the full, long-running regeneration.
+preconfigopts = ' && '.join([
+    'cd config',
+    'autom4te --language=m4sh opal_get_version.m4sh -o opal_get_version.sh',
+    'cd ..',
+    'autoconf',
+    'autoheader',
+    'aclocal',
+    'automake',
+    ''
+])
+
 # CUDA related patches and custom configure option can be removed if CUDA support isn't wanted.
-preconfigopts = 'gcc -Iopal/mca/cuda/include -shared opal/mca/cuda/lib/cuda.c -o opal/mca/cuda/lib/libcuda.so && '
-configopts = '--with-cuda=%(start_dir)s/opal/mca/cuda --with-show-load-errors=no '
+configopts = '--with-cuda=internal '
+configopts += ' CC=pgcc CXX=pgc++ FC=pgfortran'
+
+# disable MPI1 compatibility for now, see what breaks...
+# configopts += '--enable-mpi1-compatibility '
+
+# to enable SLURM integration (site-specific)
+# configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr'
 
 moduleclass = 'mpi'

Updated software QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb

Diff against QuantumESPRESSO-7.4-foss-2024a.eb

easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a.eb

diff --git a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a.eb b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
index f017367700..609a3b47f0 100644
--- a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.4-foss-2024a.eb
+++ b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,15 +1,15 @@
-name = 'QuantumESPRESSO'
-version = '7.4'
+name = "QuantumESPRESSO"
+version = "7.3.1"
+versionsuffix = '-CUDA-%(cudaver)s'
 
-homepage = 'https://www.quantum-espresso.org'
+homepage = "https://www.quantum-espresso.org"
 description = """Quantum ESPRESSO  is an integrated suite of computer codes
 for electronic-structure calculations and materials modeling at the nanoscale.
 It is based on density-functional theory, plane waves, and pseudopotentials
 (both norm-conserving and ultrasoft).
 """
 
-toolchain = {'name': 'foss', 'version': '2024a'}
-
+toolchain = {"name": "nvompi", "version": "2023a"}
 toolchainopts = {
     "usempi": True,
     "openmp": True,
@@ -17,13 +17,13 @@ toolchainopts = {
 
 # Check hashes inside external/submodule_commit_hash_records when making file for new version
 local_lapack_hash = "12d825396fcef1e0a1b27be9f119f9e554621e55"
-local_mbd_hash = "89a3cc199c0a200c9f0f688c3229ef6b9a8d63bd"
+local_mbd_hash = "82005cbb65bdf5d32ca021848eec8f19da956a77"
 local_devxlib_hash = "a6b89ef77b1ceda48e967921f1f5488d2df9226d"
 local_fox_hash = "3453648e6837658b747b895bb7bef4b1ed2eac40"
-# Different from the one at tag qe-7.4, see https://github.com/anharmonic/d3q/issues/22
-local_d3q_hash = "808acbaf012468f42147d8d6af452ec64b9e5ab0"
-# Different from the one at tag qe-7.4
-local_qe_gipaw_hash = "9b2ae1a46cae045cc04ef02c1072f2e1e74873b2"
+# Different from the one at tag qe-7.3.1 because of:
+# https://gitlab.com/QEF/q-e/-/issues/666
+local_d3q_hash = "de4718351e7bbb9d1d12aad2b7ca232d06775b83"
+local_qe_gipaw_hash = "75b01b694c9ba4df55d294cacc27cf28591b2161"
 local_qmcpack_hash = "f72ab25fa4ea755c1b4b230ae8074b47d5509c70"
 local_w90_hash = "1d6b187374a2d50b509e5e79e2cab01a79ff7ce1"
 
@@ -101,47 +101,76 @@ sources = [
         },
     },
 ]
-patches = [
-    {
-        'name': 'QuantumESPRESSO-7.4-d3q.patch',
-        'sourcepath': '../'  # Needed as patches are normally applied to the first `finalpath` directory
-    },
-]
-# Holding off git clone checksum checks untill 5.0.x
-# See https://github.com/easybuilders/easybuild-framework/pull/4248
+# Holding off checksum checks until 5.0.x
+# Holding off checksum checks until 5.0.x
+# https://github.com/easybuilders/easybuild-framework/pull/4248
+# checksums = [
+#     {'q-e-qe-7.3.1.tar.gz': '2c58b8fadfe4177de5a8b69eba447db5e623420b070dea6fd26c1533b081d844'},
+#     {'lapack-%s.tar.gz' % local_lapack_hash: 'c05532ae0e5fe35f473206dda12970da5f2e2214620487d71837ddcf0ea6b21d'},
+#     {'mbd-%s.tar.gz' % local_mbd_hash: 'a180682c00bb890c9b1e26a98addbd68e32f970c06439acf7582415f4c589800'},
+#     {'devxlib-%s.tar.gz' % local_devxlib_hash: '76da8fe5a2050f58efdc92fa8831efec25c19190df7f4e5e39c173a5fbae83b4'},
+#     {'d3q-%s.tar.gz' % local_d3q_hash: '43e50753a56af05d181b859d3e29d842fb3fc4352f00cb7fe229a435a1f20c31'},
+#     {'fox-%s.tar.gz' % local_fox_hash: '99b6a899a3f947d7763aa318e86f9f08db684568bfdcd293f3318bee9d7f1948'},
+#     {'qe-gipaw-%s.tar.gz' % local_qe_gipaw_hash: '9ac8314363d29cc2f1ce85abd8f26c1a3ae311d54f6e6034d656442dd101c928'},
+#     {'pw2qmcpack-%s.tar.gz' % local_qmcpack_hash: 'a8136da8429fc49ab560ef7356cd6f0a2714dfbb137baff7961f46dfe32061eb'},
+#     {'wannier90-%s.tar.gz' % local_w90_hash: 'f989497790ec9777bdc159945bbf42156edb7268011f972874dec67dd4f58658'},
+# ]
 checksums = [
-    'b15dcfe25f4fbf15ccd34c1194021e90996393478226e601d876f7dea481d104',
-    None, None, None, None, None, None, None, None,
-    '1f1686365fbf0cc56f634e072a92b3d336fe454348e514d0b4136d447f0d4923'
+    '2c58b8fadfe4177de5a8b69eba447db5e623420b070dea6fd26c1533b081d844',
+    None, None, None, None, None, None, None, None
 ]
 
+local_gcc_compiler = ('GCCcore', '12.3.0')
+local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
+
 builddependencies = [
-    ('M4', '1.4.19'),
-    ('CMake', '3.29.3'),
-    ('pkgconf', '2.2.0'),
+    ("M4", "1.4.19", '', local_gcc_compiler),
+    ("CMake", "3.26.3", '', local_gcc_compiler),
+    ("cURL", "8.0.1", '', local_gcc_compiler),
 ]
 dependencies = [
-    ('HDF5', '1.14.5'),
-    ('ELPA', '2024.05.001'),
-    ('libxc', '6.2.2'),
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('OpenBLAS', '0.3.24', '', local_compiler),
+    ('FFTW', '3.3.10', '', local_compiler),
+    ('FFTW.MPI', '3.3.10'),
+    ('ScaLAPACK', '2.2.0'),
+    ("HDF5", "1.14.0", '-CUDA-%(cudaver)s'),
+    ("libxc", "6.2.2", '', local_compiler),
+    # ("ELPA", "2023.05.001"),
 ]
 
 # Disabled because of
 # https://gitlab.com/QEF/q-e/-/issues/667
 # https://github.com/anharmonic/d3q/issues/15
 build_shared_libs = False
+with_cuda = True
+
 with_scalapack = True
-with_fox = True
-with_gipaw = True
-with_d3q = True
+with_gipaw = False  # https://github.com/dceresoli/qe-gipaw/issues/19
+with_d3q = False  # Gives `conflict with access` errors
 with_qmcpack = True
 
 moduleclass = "chem"
 
+test_suite_nprocs = 1  # Unless multiple GPUs are available or GPU should be oversubscribed
 test_suite_threshold = (
-    0.98
+    0.4  # Low threshold because of https://gitlab.com/QEF/q-e/-/issues/665
 )
 test_suite_max_failed = (
     5  # Allow for some flaky tests (failed due to strict thresholds)
 )
-test_suite_allow_failures = []
+test_suite_allow_failures = [
+    "test_qe_xclib_",  # 7.3.1:  https://gitlab.com/QEF/q-e/-/issues/640
+
+    # 7.3.1: Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
+    "--hp_",
+    "--ph_",
+    "--epw_",
+    "--tddfpt_",
+
+    # 7.3.1: https://gitlab.com/QEF/q-e/-/issues/675
+    "system--pw_scf--scf-rmm-k",
+    "system--pw_scf--scf-rmm-paro-k",
+    "system--pw_scf-correctness",
+    "system--pw_noncolin--noncolin-rmm",
+    "system--pw_noncolin-correctness",
+]
Diff against QuantumESPRESSO-7.3.1-foss-2024a.eb

easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-foss-2024a.eb

diff --git a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-foss-2024a.eb b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
index 8cbd539a51..609a3b47f0 100644
--- a/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-foss-2024a.eb
+++ b/easybuild/easyconfigs/q/QuantumESPRESSO/QuantumESPRESSO-7.3.1-nvompi-2023a-CUDA-12.1.1.eb
@@ -1,15 +1,15 @@
-name = 'QuantumESPRESSO'
-version = '7.3.1'
+name = "QuantumESPRESSO"
+version = "7.3.1"
+versionsuffix = '-CUDA-%(cudaver)s'
 
-homepage = 'https://www.quantum-espresso.org'
+homepage = "https://www.quantum-espresso.org"
 description = """Quantum ESPRESSO  is an integrated suite of computer codes
 for electronic-structure calculations and materials modeling at the nanoscale.
 It is based on density-functional theory, plane waves, and pseudopotentials
 (both norm-conserving and ultrasoft).
 """
 
-toolchain = {'name': 'foss', 'version': '2024a'}
-
+toolchain = {"name": "nvompi", "version": "2023a"}
 toolchainopts = {
     "usempi": True,
     "openmp": True,
@@ -23,9 +23,7 @@ local_fox_hash = "3453648e6837658b747b895bb7bef4b1ed2eac40"
 # Different from the one at tag qe-7.3.1 because of:
 # https://gitlab.com/QEF/q-e/-/issues/666
 local_d3q_hash = "de4718351e7bbb9d1d12aad2b7ca232d06775b83"
-# Different from the one at tag qe-7.3.1 because of:
-# https://github.com/dceresoli/qe-gipaw/issues/19
-local_qe_gipaw_hash = "79d3a03b7bdc4325e66f3fad02a24c6e6e3e5806"
+local_qe_gipaw_hash = "75b01b694c9ba4df55d294cacc27cf28591b2161"
 local_qmcpack_hash = "f72ab25fa4ea755c1b4b230ae8074b47d5509c70"
 local_w90_hash = "1d6b187374a2d50b509e5e79e2cab01a79ff7ce1"
 
@@ -121,29 +119,39 @@ checksums = [
     None, None, None, None, None, None, None, None
 ]
 
+local_gcc_compiler = ('GCCcore', '12.3.0')
+local_compiler = ('NVHPC', '23.7-CUDA-12.1.1')
+
 builddependencies = [
-    ('M4', '1.4.19'),
-    ('CMake', '3.29.3'),
-    ('pkgconf', '2.2.0'),
+    ("M4", "1.4.19", '', local_gcc_compiler),
+    ("CMake", "3.26.3", '', local_gcc_compiler),
+    ("cURL", "8.0.1", '', local_gcc_compiler),
 ]
 dependencies = [
-    ('HDF5', '1.14.5'),
-    ('ELPA', '2024.05.001'),
-    ('libxc', '6.2.2'),
+    ('CUDA', '12.1.1', '', SYSTEM),
+    ('OpenBLAS', '0.3.24', '', local_compiler),
+    ('FFTW', '3.3.10', '', local_compiler),
+    ('FFTW.MPI', '3.3.10'),
+    ('ScaLAPACK', '2.2.0'),
+    ("HDF5", "1.14.0", '-CUDA-%(cudaver)s'),
+    ("libxc", "6.2.2", '', local_compiler),
+    # ("ELPA", "2023.05.001"),
 ]
 
 # Disabled because of
 # https://gitlab.com/QEF/q-e/-/issues/667
 # https://github.com/anharmonic/d3q/issues/15
 build_shared_libs = False
+with_cuda = True
+
 with_scalapack = True
-with_fox = True
-with_gipaw = True
-with_d3q = True
+with_gipaw = False  # https://github.com/dceresoli/qe-gipaw/issues/19
+with_d3q = False  # Gives `conflict with access` errors
 with_qmcpack = True
 
 moduleclass = "chem"
 
+test_suite_nprocs = 1  # Unless multiple GPUs are available or GPU should be oversubscribed
 test_suite_threshold = (
     0.4  # Low threshold because of https://gitlab.com/QEF/q-e/-/issues/665
 )
@@ -152,8 +160,17 @@ test_suite_max_failed = (
 )
 test_suite_allow_failures = [
     "test_qe_xclib_",  # 7.3.1:  https://gitlab.com/QEF/q-e/-/issues/640
-    "--hp_",  # 7.3.1:  Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
-    "--ph_",  # 7.3.1:  Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
-    "--epw_",  # 7.3.1:  Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
-    "--tddfpt_",  # 7.3.1:  Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
+
+    # 7.3.1: Broken testsuite (https://gitlab.com/QEF/q-e/-/issues/665)
+    "--hp_",
+    "--ph_",
+    "--epw_",
+    "--tddfpt_",
+
+    # 7.3.1: https://gitlab.com/QEF/q-e/-/issues/675
+    "system--pw_scf--scf-rmm-k",
+    "system--pw_scf--scf-rmm-paro-k",
+    "system--pw_scf-correctness",
+    "system--pw_noncolin--noncolin-rmm",
+    "system--pw_noncolin-correctness",
 ]

Updated software ScaLAPACK-2.2.0-nvompi-2023a.eb

Diff against ScaLAPACK-2.2.0-gompi-2024a-fb.eb

easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gompi-2024a-fb.eb

diff --git a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gompi-2024a-fb.eb b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
index 1bccc16f38..8e4f83e46c 100644
--- a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gompi-2024a-fb.eb
+++ b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
@@ -1,12 +1,11 @@
 name = 'ScaLAPACK'
 version = '2.2.0'
-versionsuffix = '-fb'
 
 homepage = 'https://www.netlib.org/scalapack/'
 description = """The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines
  redesigned for distributed memory MIMD parallel computers."""
 
-toolchain = {'name': 'gompi', 'version': '2024a'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'extra_fflags': '-lpthread', 'openmp': True, 'pic': True, 'usempi': True}
 
 source_urls = [homepage]
@@ -18,19 +17,19 @@ checksums = [
 ]
 
 builddependencies = [
-    ('CMake', '3.29.3'),
+    ('CMake', '3.26.3'),
 ]
 
 dependencies = [
-    ('FlexiBLAS', '3.4.4'),
+    ('OpenBLAS', '0.3.24'),
 ]
 
 # Config Opts based on AOCL User Guide:
 # https://developer.amd.com/wp-content/resources/AOCL_User%20Guide_2.2.pdf
 
 configopts = '-DBUILD_SHARED_LIBS=ON '
-configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
-configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
 
 sanity_check_paths = {
     'files': ['lib/libscalapack.%s' % SHLIB_EXT, 'lib64/libscalapack.%s' % SHLIB_EXT],
Diff against ScaLAPACK-2.2.0-gmpich-2024.06-fb.eb

easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gmpich-2024.06-fb.eb

diff --git a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gmpich-2024.06-fb.eb b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
index 28d4f766d5..8e4f83e46c 100644
--- a/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-gmpich-2024.06-fb.eb
+++ b/easybuild/easyconfigs/s/ScaLAPACK/ScaLAPACK-2.2.0-nvompi-2023a.eb
@@ -1,12 +1,11 @@
 name = 'ScaLAPACK'
 version = '2.2.0'
-versionsuffix = '-fb'
 
 homepage = 'https://www.netlib.org/scalapack/'
 description = """The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines
  redesigned for distributed memory MIMD parallel computers."""
 
-toolchain = {'name': 'gmpich', 'version': '2024.06'}
+toolchain = {'name': 'nvompi', 'version': '2023a'}
 toolchainopts = {'extra_fflags': '-lpthread', 'openmp': True, 'pic': True, 'usempi': True}
 
 source_urls = [homepage]
@@ -22,15 +21,15 @@ builddependencies = [
 ]
 
 dependencies = [
-    ('FlexiBLAS', '3.3.1'),
+    ('OpenBLAS', '0.3.24'),
 ]
 
 # Config Opts based on AOCL User Guide:
 # https://developer.amd.com/wp-content/resources/AOCL_User%20Guide_2.2.pdf
 
 configopts = '-DBUILD_SHARED_LIBS=ON '
-configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
-configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DBLAS_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
+# configopts += '-DLAPACK_LIBRARIES="$EBROOTFLEXIBLAS/lib/libflexiblas.%s" ' % SHLIB_EXT
 
 sanity_check_paths = {
     'files': ['lib/libscalapack.%s' % SHLIB_EXT, 'lib64/libscalapack.%s' % SHLIB_EXT],
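The `sanity_check_paths` above uses EasyBuild's `SHLIB_EXT` constant (`'so'` on Linux, `'dylib'` on macOS) via `%`-formatting. A quick sketch of the expansion, assuming a Linux build:

```python
# Assumption: SHLIB_EXT is 'so' on Linux (EasyBuild supplies this constant).
SHLIB_EXT = 'so'

sanity_check_paths = {
    'files': ['lib/libscalapack.%s' % SHLIB_EXT, 'lib64/libscalapack.%s' % SHLIB_EXT],
}

print(sanity_check_paths['files'])  # → ['lib/libscalapack.so', 'lib64/libscalapack.so']
```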
