[gpaw-users] mpi assertion fail

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Sat Jun 13 12:49:39 CEST 2020


Michal Krompiec michal.krompiec at gmail.com  wrote:
> This is caused by a bug in Intel MPI 2019 update 6. Updating to version
> 2019 update 7 solves the issue.
...
>> I get this error very often (GPAW development version, built with Intel
>> compiler and Intel MPI 2020):
>> Assertion failed in file ../../src/mpid/ch4/src/intel/ch4_shm_coll.c at
>> line 2147: comm->shm_numa_layout[my_numa_node].base_addr
>> Is it an out-of-memory error or something more serious?

On our Niflheim cluster we use EasyBuild[1,2] software modules.
EasyBuild provides "toolchains", for example intel-2019b and foss-2019b,
which allow you to build GPAW and other software packages very easily.

Many HPC centers around the world participate in the selection of stable
and fairly bug-free versions of compilers and libraries.  I would like
to mention that the current EasyBuild toolchains[3] use these
specifically selected versions of the Intel and open source tools:

Toolchain    Date      binutils  GCC    Intel compilers  Intel MPI   Intel MKL
intel-2019b  Sep 2019  2.32      8.3.0  2019.5.281       2018.5.288  2019.5.281
intel-2020a  May 2020  2.34      9.3.0  2020.1.217       2019.7.217  2020.1.217

You would be well advised to use the Intel versions selected by the
EasyBuild community.  Please note that intel-2020a uses Intel MPI
2019.7.217, i.e. the same 2019 update 7 version that you found to
solve the issue.
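
As a rough illustration of the version comparison above, here is a small
Python sketch.  It assumes that an Intel MPI version string such as
2019.7.217 reads as YEAR.UPDATE.BUILD; the two helper function names are
made up for this example and are not part of GPAW, EasyBuild or Intel MPI.
It flags only 2019 update 6, the release reported as buggy in this thread.

```python
# Sketch only: helper names and the YEAR.UPDATE.BUILD layout are assumptions.

def parse_impi_version(version):
    """Split a string like '2019.7.217' into (year, update, build) integers."""
    year, update, build = (int(part) for part in version.split("."))
    return year, update, build


def has_reported_assertion_bug(version):
    """True only for Intel MPI 2019 update 6, as reported in this thread."""
    year, update, _build = parse_impi_version(version)
    return (year, update) == (2019, 6)


# Intel MPI versions taken from the toolchain table above.
toolchains = {"intel-2019b": "2018.5.288", "intel-2020a": "2019.7.217"}

for name, impi in toolchains.items():
    status = "affected" if has_reported_assertion_bug(impi) else "not reported as affected"
    print(name, impi, status)
```

This says nothing about other bugs in the listed releases; it only encodes
the single observation quoted at the top of this message.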

Best regards,
Ole

[1] https://github.com/hpcugent/easybuild
[2] https://wiki.fysik.dtu.dk/niflheim/EasyBuild_modules
[3] https://easybuild.readthedocs.io/en/master/Common-toolchains.html#component-versions-in-intel-toolchain

-- 
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark

