Conference call notes 20181212
Kenneth Hoste edited this page Dec 12, 2018 · 2 revisions
Notes on the 116th EasyBuild conference call, Wednesday Dec 12th 2018 (17:00 - 18:00 CET)
Alphabetical list of attendees (4):
- Damian Alvarez (JSC, Germany)
- Kenneth Hoste (HPC-UGent, Belgium)
- Bart Oldeman (ComputeCanada)
- Davide Vanzo (Vanderbilt University, US)
- updates on upcoming EasyBuild v3.8.0
- an easyblock for OpenMPI?
- Q&A
- ETA: (unknown, but soonish)
- framework
- https://github.com/easybuilders/easybuild-framework/milestone/60
- highlights:
- fix for easyconfig files saved in reprod dir
- deprecation of the ictce, goolf and oldest intel toolchains
- some shuffling around of code to make implementation of other features easier
- TODO
- fix for long-standing issue where robot always required easyconfig files when resolving dependencies via subtoolchains
- postponed
- support for eb --new: postponed until future release (EB v3.9.0?)
- easyblocks
- https://github.com/easybuilders/easybuild-easyblocks/milestone/52
- highlights:
- various minor bug fixes & enhancements to existing easyblocks
- TODO:
- Trilinos (Davide)
- cfr. https://github.com/easybuilders/easybuild-easyblocks/pull/1601
- should maybe be changed to only configure for MKL with recent versions (> 12.12)?
- easyconfigs
- https://github.com/easybuilders/easybuild-easyconfigs/milestone/55
- highlights:
- fix for SCOTCH 6.0.5 easyconfigs: incorrect version & source URL
- cfr. https://github.com/easybuilders/easybuild-easyconfigs/pull/7159
- see also enhanced SCOTCH easyblock to catch these errors: https://github.com/easybuilders/easybuild-easyblocks/pull/1580
- fix for SKESA easyconfigs: incorrect version & source URL
- various other small bug fixes, enhancements, updates & easyconfigs for new software
- TODO:
- OpenBLAS (Bart)
- missing build deps for Qt5: flex/Bison/Python2
- OpenMPI is currently handled via easyconfigs only, with lots of hardcoding
- recent OpenMPI versions have better support for auto-detecting what is there
- or even for fat builds
- recent OpenMPI versions prefer UCX/libfabric
- rely on auto-detection that OpenMPI configure does?
- with dedicated easyconfig parameters for things like ibverbs, ucx, torque/slurm, ...
- reach out to OpenMPI community about this?
- Bart: benchmarking: UCX is overall winner + hcoll for collective communication
- also: SHARP, but that needs a dedicated server (only beneficial with largish jobs, 20 nodes/800 cores)
- Damian: lots of problems with collectives on latest Intel MPI with large jobs (1000 cores)
- Bart: 2019 versions have dropped everything but libfabric (OFI)
- Damian: also similar problems with 2018 versions (w/ default fabric, DAPL?)
- system isn't very special; JUWELS (Skylake + IB), JURECA (Haswell + IB)
- in some collectives it hangs, sometimes segfaults, ...
- Bart: reproducible with OSU microbenchmarks?
- Bart can look into reproducing this on upcoming new system
- relevant for upcoming intel/2019a
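
The idea raised earlier in this discussion (rely on what OpenMPI's configure script auto-detects, with dedicated easyconfig parameters to override it) could look roughly like the sketch below. This is only an illustration, not an existing easyblock: the parameter names (`ibverbs`, `ucx`, `slurm`, `torque`) and the function `openmpi_configure_options` are hypothetical. The `--with-verbs`, `--with-ucx`, `--with-slurm` and `--with-tm` flags are assumed to be available in the OpenMPI versions that were current at the time of the call.

```python
# Hypothetical sketch: these easyconfig parameter names do not exist in EasyBuild.
# Each parameter is tri-state:
#   None (or unset) -> pass nothing, let OpenMPI's configure auto-detect
#   True            -> require support (--with-<flag>)
#   False           -> disable support (--without-<flag>)

# maps (hypothetical) easyconfig parameter name to OpenMPI configure flag name
OPENMPI_FEATURES = {
    'ibverbs': 'verbs',
    'slurm': 'slurm',
    'torque': 'tm',
    'ucx': 'ucx',
}


def openmpi_configure_options(params):
    """Return configure options for the features that were set explicitly."""
    opts = []
    for param in sorted(OPENMPI_FEATURES):
        value = params.get(param)
        if value is None:
            # not specified: rely on auto-detection by configure
            continue
        prefix = '--with' if value else '--without'
        opts.append('%s-%s' % (prefix, OPENMPI_FEATURES[param]))
    return ' '.join(opts)


if __name__ == '__main__':
    # only UCX and ibverbs are specified; Slurm/Torque are left to auto-detection
    print(openmpi_configure_options({'ucx': True, 'ibverbs': False}))
```

Leaving a parameter unset keeps the default behaviour (auto-detection), so existing easyconfigs would only need to specify the features they want to force on or off.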
- Davide: frequent question at SC18
- which module naming schemes should be used?
- how to experiment with multiple module naming schemes?
- currently require multiple 'eb' runs:
- first run using one module naming scheme and using --fixed-installdir-naming-scheme
- subsequent run(s) using other module naming scheme(s) using --module-only
- main issue here is that --module-only is not perfect, and you may run into trouble here
- support for configuring EasyBuild to install module files with multiple different naming schemes in one go should not be too difficult (famous last words...)
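
The current multi-run workaround described above can be sketched as a small helper that only builds the `eb` command lines (it does not run them). The helper name `eb_commands` and the easyconfig file name `example.eb` are placeholders. Passing `--fixed-installdir-naming-scheme` to the `--module-only` runs as well is an assumption made here so that all module files point to the same installation directory.

```python
def eb_commands(easyconfig, first_mns, other_mns):
    """Build 'eb' command lines for installing once and generating modules for several naming schemes."""
    # first run: actual installation, using one module naming scheme
    cmds = [[
        'eb', easyconfig,
        '--module-naming-scheme=%s' % first_mns,
        '--fixed-installdir-naming-scheme',
    ]]
    # subsequent run(s): only generate module files, using the other naming scheme(s)
    for mns in other_mns:
        cmds.append([
            'eb', easyconfig,
            '--module-naming-scheme=%s' % mns,
            '--fixed-installdir-naming-scheme',  # assumption: keep install dir identical
            '--module-only',
        ])
    return cmds


if __name__ == '__main__':
    # 'example.eb' is a placeholder easyconfig file name
    for cmd in eb_commands('example.eb', 'EasyBuildMNS', ['HierarchicalMNS']):
        print(' '.join(cmd))
```

As noted above, `--module-only` is not perfect, so the generated module files should be checked before relying on them.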
- Kenneth plans to make some (good) progress on porting EasyBuild to Python 3 during the holidays
- plan is to ingest whatever parts of vsc-base are needed by EasyBuild into the EasyBuild framework repository
- porting vsc-base to Python 3 is likely going to take too long
- too much impact on system scripts in HPC-UGent, so changes to vsc-base have to be done with great care
- there's also very little development going on in vsc-base, so ingesting it kind of makes sense
- having a single code base to worry about when porting to Python 3 should help a lot with making actual progress