[gpaw-users] odd SCF convergence behavior
Jussi Enkovaara
jussi.enkovaara at csc.fi
Tue Oct 26 15:38:46 CEST 2010
Nichols A. Romero wrote:
> Jussi,
>
> I had not seen this ticket. Is it still a problem in the current
> version of GPAW?
Yes, the problem still exists. It seems that whenever there are degenerate
electronic states, the actual linear combination returned by LAPACK (or
ScaLAPACK) is very sensitive to the numerics, i.e. differences smaller than 10^-12
are enough to cause "problems". I guess that in parallel calculations such differences
come simply from the different order of summation of floating point numbers and are
thus very difficult to get rid of.
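To illustrate the point above, here is a minimal NumPy sketch (not GPAW code). The 3x3 "Hamiltonian" and the two 10^-12 perturbations are made-up toy values standing in for the rounding differences between two CPU counts. The eigenvalues and the degenerate subspace should agree, while the individual eigenvectors come out as different linear combinations.

```python
import numpy as np

# 1) Floating point addition is not associative, so a different order of
#    summation (e.g. a different number of MPI tasks) changes the last digits.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print("same sum?", left == right, "difference:", abs(left - right))

# 2) Toy matrix (hypothetical values): a doubly degenerate level at 1.0
#    and a non-degenerate level at 2.0.
h0 = np.diag([1.0, 1.0, 2.0])
eps = 1e-12

# Two different tiny perturbations inside the degenerate block.
noise_a = np.zeros((3, 3))
noise_a[0, 0], noise_a[1, 1] = eps, -eps
noise_b = np.zeros((3, 3))
noise_b[0, 1] = noise_b[1, 0] = eps

w_a, v_a = np.linalg.eigh(h0 + noise_a)
w_b, v_b = np.linalg.eigh(h0 + noise_b)

# Eigenvalues agree to about eps ...
print("max eigenvalue difference:", np.max(np.abs(w_a - w_b)))

# ... but the lowest eigenvectors are different linear combinations
# (mathematically the overlap is 1/sqrt(2) here, not 1).
overlap = abs(v_a[:, 0] @ v_b[:, 0])
print("overlap of lowest eigenvectors:", overlap)

# The projector onto the degenerate subspace is the same in both cases,
# which is why there is no physical problem.
p_a = v_a[:, :2] @ v_a[:, :2].T
p_b = v_b[:, :2] @ v_b[:, :2].T
print("max projector difference:", np.max(np.abs(p_a - p_b)))
```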
I do not know of any solution other than manually breaking the symmetries, e.g. by
rotating the atoms slightly (the real-space grid then breaks the symmetries
slightly). This is of course usable only for systems without k-points (otherwise
the k-space symmetries are lost as well). I just checked, and at least for the script
attached to the ticket a small rotation seemed to solve the problem.
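As a rough sketch of what such a small rotation does, the following uses plain NumPy rather than the script from the ticket. The square "molecule", the 0.5 degree angle, the rotation axis and the helper names are all hypothetical; in a real script one would presumably rotate the ASE Atoms object instead. The rotation keeps all interatomic distances, but the coordinates are no longer symmetric with respect to the grid axes.

```python
import numpy as np


def rotate_about_axis(positions, angle_deg, axis):
    """Rotate (N, 3) positions by angle_deg about an axis through their centroid."""
    axis = np.array(axis, dtype=float)
    axis /= np.linalg.norm(axis)
    theta = np.radians(angle_deg)
    # Cross-product matrix of the axis, used in Rodrigues' rotation formula.
    k = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    rot = np.eye(3) + np.sin(theta) * k + (1.0 - np.cos(theta)) * (k @ k)
    center = positions.mean(axis=0)
    return (positions - center) @ rot.T + center


def pairwise_distances(positions):
    """Matrix of all interatomic distances."""
    diff = positions[:, None, :] - positions[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def has_mirror_x(positions, tol=1e-8):
    """True if mirroring x -> -x maps the set of positions onto itself."""
    mirrored = positions * np.array([-1.0, 1.0, 1.0])
    diff = mirrored[:, None, :] - positions[None, :, :]
    return bool(np.all(np.linalg.norm(diff, axis=-1).min(axis=1) < tol))


# Hypothetical square arrangement aligned with the grid axes.
square = np.array([[1.0, 1.0, 0.0],
                   [-1.0, 1.0, 0.0],
                   [-1.0, -1.0, 0.0],
                   [1.0, -1.0, 0.0]])

# Small rotation about an arbitrary axis (assumed values).
rotated = rotate_about_axis(square, angle_deg=0.5, axis=(1.0, 2.0, 3.0))

print("distances preserved:",
      np.allclose(pairwise_distances(square), pairwise_distances(rotated)))
print("mirror symmetry before:", has_mirror_x(square))
print("mirror symmetry after: ", has_mirror_x(rotated))
```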
In any case this might be useful to document, maybe just in the FAQ (e.g. why results
differ with different CPU counts...)
Best regards,
Jussi
>
> ----- "Jussi Enkovaara" <jussi.enkovaara at csc.fi> wrote:
>
>> Nichols A. Romero wrote:
>>> Hi,
>>>
>>> Here is the input file in question.
>>> http://en.pastebin.ca/1973390
>>>
>>> The SCF cycle is well-behaved at 64 MPI tasks,
>>> http://en.pastebin.ca/1973391
>>>
>>> but is poorly behaved at 32 MPI tasks.
>>> http://en.pastebin.ca/1973392
>>>
>>> Is there a simple way to explain this difference?
>> Hi,
>> I assume that this might be related to ticket 51
>> https://trac.fysik.dtu.dk/projects/gpaw/ticket/51
>>
>> The problem is/was that for systems with symmetries, i.e. degenerate
>> electronic states, very tiny differences (e.g. 10^-12) in the LCAO Hamiltonian
>> matrix, arising from different numbers of CPUs, result in completely different
>> eigenvectors for the degenerate states. Basically, they are just different
>> linear combinations, so physically there is no problem. However, the different
>> values affect the numerics, and differences e.g. in the total energy start to
>> accumulate little by little. For example, in your output one can see that the
>> total energies for the 32 task and 64 task cases remain the same for the first
>> couple of iterations.
>>
>> You could try to check whether the initialization is the issue by writing the
>> wave functions to a restart file after the first iteration, and then restarting
>> from the same file in both the 64 task and 32 task cases.
>>
>> After I found the problem I was not that worried, as the converged total
>> energies were in good agreement across different CPU counts, and the number
>> of iterations differed only by 1-4 (after all, it was just a slightly different
>> starting guess for the wave functions).
>>
>> If your problem is also related to initialization, it is of course quite
>> worrying if the convergence behaviour is affected so much.
>>
>> Best regards,
>> Jussi
>