[gpaw-users] odd SCF convergence behavior

naromero at alcf.anl.gov naromero at alcf.anl.gov
Tue Oct 26 15:51:38 CEST 2010


Jussi,

Now I understand what you are talking about. I have seen similar behavior
with other codes (e.g. GAMESS), also with high-symmetry molecules. This
cannot be fixed and IMO is not a serious problem.
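
To make the effect concrete, here is a minimal sketch in plain NumPy (not
GPAW code; the 3x3 matrix and the two perturbations are made up purely for
illustration). Two perturbations of size 1e-12 applied to a matrix with a
doubly degenerate level give essentially the same eigenvalues and the same
degenerate subspace, but quite different individual eigenvectors:

```python
import numpy as np

def eigh_with_perturbation(perturbation):
    """Diagonalize a toy matrix with a doubly degenerate level plus a tiny perturbation."""
    h = np.diag([1.0, 1.0, 3.0])  # eigenvalue 1.0 is doubly degenerate
    return np.linalg.eigh(h + perturbation)

eps = 1e-12  # size of the numerical "noise" (illustrative assumption)
pert_a = eps * np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]])
pert_b = eps * np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

eig_a, vec_a = eigh_with_perturbation(pert_a)
eig_b, vec_b = eigh_with_perturbation(pert_b)

# Eigenvalues agree to within the size of the perturbation.
max_eig_diff = np.max(np.abs(eig_a - eig_b))

# The lowest eigenvectors of the two runs are far from parallel.
overlap = abs(vec_a[:, 0] @ vec_b[:, 0])

# The projector onto the degenerate subspace is the same in both runs.
proj_a = vec_a[:, :2] @ vec_a[:, :2].T
proj_b = vec_b[:, :2] @ vec_b[:, :2].T
proj_diff = np.max(np.abs(proj_a - proj_b))

print("max eigenvalue difference:", max_eig_diff)
print("overlap of lowest eigenvectors:", overlap)
print("max projector difference:", proj_diff)
```

Quantities that depend only on the subspace are unaffected, which is why the
converged results agree even though the intermediate iterations differ.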

Consider closing the ticket and just documenting the behavior in the
GPAW web documentation.
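
For such a documentation note, the floating-point origin of the differences
could be illustrated with something like the sketch below. It uses only the
Python standard library; the helper chunked_sum and the test data are
hypothetical and only mimic a reduction split over a different number of
tasks, they are not how GPAW or MPI actually perform the sum.

```python
import math

def chunked_sum(values, n_chunks):
    """Sum values as n_chunks partial sums, mimicking a reduction over n_chunks tasks."""
    size = math.ceil(len(values) / n_chunks)
    partial = [sum(values[i:i + size]) for i in range(0, len(values), size)]
    return sum(partial)

# Floating-point addition is not associative: grouping changes the last bits.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print("same result for both groupings:", left == right)

# The same data summed with two different "task counts".
values = [0.1 * (k % 7 + 1) for k in range(4096)]
s32 = chunked_sum(values, 32)
s64 = chunked_sum(values, 64)
print("32 chunks:", repr(s32))
print("64 chunks:", repr(s64))
print("difference:", abs(s32 - s64))
```

Any difference between the two sums is at the level of rounding error, which
is the size of perturbation that is enough to select a different linear
combination of degenerate states.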

----- "Jussi Enkovaara" <jussi.enkovaara at csc.fi> wrote:

> Nichols A. Romero wrote:
> > Jussi,
> > 
> > I had not seen this ticket. Is it still a problem in the current
> > version of GPAW?
> 
> Yes, the problem still exists. It seems that whenever there are
> degenerate electronic states, the actual linear combination obtained
> from LAPACK (or ScaLAPACK) is very sensitive to the numerics, i.e.
> differences smaller than 10^-12 cause "problems". I guess that in
> parallel calculations these differences come just from the different
> order of summation of floating-point numbers and are thus very
> difficult to get rid of.
> 
> I do not know if there is any solution other than manually breaking
> the symmetries, e.g. by rotating the atoms slightly (this way the
> real-space grid will break the symmetries slightly). This is of
> course usable only for systems without k-points (otherwise the
> k-space symmetries are also lost). I just checked, and at least for
> the script attached to the ticket a small rotation seemed to solve
> the problem.
> 
> In any case this might be useful to document, maybe just in the FAQ
> (e.g. why results differ with different CPU counts...).
> 
> Best regards,
> Jussi
> 
> 
> > 
> > ----- "Jussi Enkovaara" <jussi.enkovaara at csc.fi> wrote:
> > 
> >> Nichols A. Romero wrote:
> >>> Hi,
> >>>
> >>> Here is the input file in question.
> >>> http://en.pastebin.ca/1973390
> >>>
> >>> The SCF cycle is well-behaved at 64 MPI tasks,
> >>> http://en.pastebin.ca/1973391
> >>>
> >>> but is poorly behaved at 32 MPI tasks. 
> >>> http://en.pastebin.ca/1973392
> >>>
> >>> Is there a simple way to explain this difference?
> >> Hi,
> >> I assume that this might be related to ticket 51
> >> https://trac.fysik.dtu.dk/projects/gpaw/ticket/51
> >>
> >> The problem is/was that for systems with symmetries, i.e. degenerate
> >> electronic states, very tiny differences (e.g. 10^-12) in the LCAO
> >> Hamiltonian matrix, arising from different numbers of CPUs, result in
> >> completely different eigenvectors for the degenerate states.
> >> Basically, they are just different linear combinations, so
> >> physically there is no problem; however, the different values affect
> >> the numerics, and differences e.g. in the total energy start to
> >> accumulate little by little. For example, in your output one can see
> >> that the total energies for the 32-task and 64-task cases remain the
> >> same for the first couple of iterations.
> >>
> >> You could try to check whether the initialization is the issue by
> >> writing the wave functions to a restart file after the first
> >> iteration, and then restarting from the same file in both the
> >> 64-task and 32-task cases.
> >>
> >> After I found the problem I was not that worried, as the converged
> >> total energies were in good agreement for different CPU counts, and
> >> the number of iterations differed only by 1-4 (after all, it was
> >> just a slightly different starting guess for the wave functions).
> >>
> >> If your problem is also related to initialization, it is of course
> >> quite worrying if the convergence behaviour is affected so much.
> >>
> >> Best regards,
> >> Jussi
> >

-- 
Nichols A. Romero, Ph.D.
Argonne Leadership Computing Facility
Argonne National Laboratory
Building 240 Room 2-127
9700 South Cass Avenue
Argonne, IL 60490
(630) 252-3441


