[gpaw-users] ScaLAPACK and parallelization over bands

Sami Auvinen sami.auvinen at lut.fi
Mon Nov 7 14:45:43 CET 2011


Hi,

Thank you for the advice. I will try using parallel={'sl_default': (6,6,64)} only.

With best regards,
   -Sami Auvinen 


________________________________________
From: Jussi Enkovaara [jussi.enkovaara at csc.fi]
Sent: 7 November 2011 12:44
To: Sami Auvinen
Cc: gpaw-users
Subject: Re: [gpaw-users] ScaLAPACK and parallelization over bands

On 2011-11-07 10:50, Sami Auvinen wrote:
> Hello,
>
> Could you help me with parallelization in GPAW? I am trying to run a calculation on a (TiO2)38 cluster with no periodic boundary conditions, using only a single k-point. Due to the rather large cluster size, I am trying to run it using parallelization over bands and ScaLAPACK, setting up the calculator in the following way:
>
> calculon = GPAW(gpts=(160,148,168),
>                  xc='PBE',
>                  spinpol=False,
>                  txt=name+'.txt',
>                  occupations=FermiDirac(width=0),
>                  eigensolver='cg',
>                  nbands=688,
>                  convergence={'bands':-10},
>                  mixer=Mixer(beta=0.14, nmaxold=14, weight=50.0),
>                  maxiter=200,
>                  parallel={'band': 16, 'sl_default': (6,6,16)})
>
> I wanted to run it with 256 processors, but I end up with an error message:

Hi Sami,
the error message that GPAW gives is not very good, but basically the problem seems
to be that initializing a large enough number of bands cannot be done with
parallelization over bands. I checked the source code of a more recent version
of GPAW, and I think this limitation no longer exists.

However, the 'cg' eigensolver does not support parallelization over bands, not
even in the most recent svn version.

Furthermore, with your grid size (160x148x168) and 256 cores, using just domain
decomposition is probably more efficient than band parallelization, and using 16
cores for 688 bands is definitely too much. A rule of thumb is to have a minimum of
150-250 bands per core.
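As a quick illustration of that rule of thumb, here is a small pure-Python sketch (not part of GPAW; the 150-band minimum is just the guideline from this thread) that checks whether a given band-parallelization setting leaves enough bands per group:

```python
def bands_per_group(nbands, band_groups):
    """Number of bands each band-parallel group handles (integer division)."""
    return nbands // band_groups

# Values from the calculator in the question:
nbands = 688
band_groups = 16   # parallel={'band': 16}

per_group = bands_per_group(nbands, band_groups)
print(per_group)   # 43 bands per group

# Well below the ~150 minimum suggested above, so plain domain
# decomposition is likely the better choice here.
print(per_group >= 150)  # False
```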

However, ScaLAPACK could be useful (especially if you run on CSC's louhi machine),
so you can use just

parallel={'sl_default': (6,6,64)}

The last number in the ScaLAPACK parameters is the block size, where 32 or 64 are
typically good values (in many cases the block size is not very important for
performance).
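To make the meaning of the three numbers concrete, here is a small sketch (my own helper, not a GPAW function) that interprets an 'sl_default' tuple as (grid rows, grid columns, block size) and checks that the ScaLAPACK process grid fits within the total core count:

```python
def check_sl_default(sl, total_cores):
    """Validate an (rows, cols, blocksize) ScaLAPACK setting.

    rows * cols processes participate in the ScaLAPACK grid;
    blocksize controls the block-cyclic matrix distribution.
    Returns the number of cores the grid uses.
    """
    rows, cols, blocksize = sl
    if rows * cols > total_cores:
        raise ValueError('ScaLAPACK grid (%dx%d) needs more cores than '
                         'available (%d)' % (rows, cols, total_cores))
    return rows * cols

sl_default = (6, 6, 64)  # the setting suggested above
used = check_sl_default(sl_default, total_cores=256)
print(used)  # 36 of the 256 cores take part in ScaLAPACK operations

# In the actual GPAW script this tuple is simply passed as
#   GPAW(..., parallel={'sl_default': sl_default})
# (requires a ScaLAPACK-enabled GPAW build).
```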

Best regards,
Jussi


