[gpaw-users] Error when relaxing atoms
Marcin Dulak
Marcin.Dulak at fysik.dtu.dk
Wed Feb 4 11:40:45 CET 2015
On 02/04/2015 10:54 AM, Tristan Maxson wrote:
> This same problem is being discussed on gpaw-developers. The problem
> arises because the geometry optimization is replicated on all of the
> cores, and small differences appear due to optimizations that certain
> compilers make. It seems that it should be possible to turn on a debug
> variable to dump the mismatches to a file for debugging. You could
> always manually edit the file to lower the required precision and try
> again, but 8 decimal places already sounds like quite an allowance for
> variance.
>
> Is it out of the question to try a different compiler, and does this
> occur with all systems you try?
The first thing to do when investigating any tricky problem is to mention the
GPAW/ASE versions used,
and whether gpaw-test passed in parallel:
https://wiki.fysik.dtu.dk/gpaw/install/installationguide.html#run-the-tests
It is true that these kinds of problems disappear after changing
compiler/libraries, but then they often
appear for other systems. Let me add that the most reliable combination
for GPAW I have found over the years is gcc/acml,
also on Intel processors.
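
To illustrate the fingerprint idea quoted further down in this thread (compare only the first 8 decimals of the positions), here is a minimal sketch. It is not GPAW's actual code: the helper name position_fingerprint and the sample coordinates are assumptions made up for illustration, and the two "ranks" are simulated with plain NumPy arrays instead of MPI.

```python
import hashlib

import numpy as np


def position_fingerprint(positions, decimals=8):
    """Hash atomic positions after rounding to `decimals` decimal places.

    Hypothetical helper for illustration only; not GPAW's implementation.
    """
    rounded = np.round(np.asarray(positions, dtype=float), decimals)
    rounded = rounded + 0.0  # turn -0.0 into 0.0 so both hash identically
    return hashlib.md5(rounded.tobytes()).hexdigest()


# Positions as two ranks might hold them: rank 1 is off by 1e-12 everywhere.
rank0 = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.1]])
rank1 = rank0 + 1e-12
same = position_fingerprint(rank0) == position_fingerprint(rank1)

# A coordinate sitting on a rounding boundary still splits the fingerprints,
# even though the two values differ by only 2e-12.
edge0 = np.array([[0.123456785 - 1e-12, 0.0, 0.0]])
edge1 = np.array([[0.123456785 + 1e-12, 0.0, 0.0]])
edge_same = position_fingerprint(edge0) == position_fingerprint(edge1)

print(same, edge_same)
```

In this sketch the first pair compares equal and the second does not, which suggests why rounding before hashing absorbs most tiny differences but cannot absorb all of them.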
Best regards,
Marcin
>
> Thank you,
> Tristan Maxson
>
> On Wed, Feb 4, 2015 at 4:21 AM, Torsten Hahn <torstenhahn at fastmail.fm> wrote:
>
> We could probably do this, but my feeling is that it would only
> cure the symptoms, not the real origin of this annoying bug.
>
>
> In fact there is code in
>
> mpi/__init__.py
>
> that says:
>
> # Construct fingerprint:
> # ASE may return slightly different atomic positions (e.g. due
> # to MKL) so compare only first 8 decimals of positions
>
>
> The code says that only 8 decimal places are used for the
> generation of atomic "fingerprints". This code relies on numpy
> and therefore on lapack/blas functions. However, I have no idea what
> that md5_array etc. stuff really does. But there is some
> debug code which should at least tell you which atom(s) cause the
> problems.
>
> However, that error is *very* strange because mpi.broadcast(...)
> should result in *exactly* the same objects on all cores. No idea
> why there should be any difference at all and what was the
> intention behind the fancy fingerprint-generation stuff in the
> compare_atoms(atoms, comm=world) method.
>
> Best,
> Torsten.
>
> > On 04.02.2015 at 10:00, jingzhe <jingzhe.chen at gmail.com> wrote:
> >
> > Hi Torsten,
> >
> > Thanks for the quick reply, but I use gcc and lapack/blas. I mean, if
> > the positions of the atoms are slightly different on different ranks
> > because of compiler/library issues, can we just set a tolerance in
> > check_atoms and skip the error?
> >
> > Best.
> >
> > Jingzhe
> >
> >
> >
> >
> >
> > On 2015-02-04 at 14:32, Torsten Hahn wrote:
> >> Dear Jingzhe,
> >>
> >> We have often seen this error when using GPAW together with
> >> Intel MKL <= 11.x on Intel CPUs. I never tracked down the error
> >> because it was gone after a compiler/library upgrade.
> >>
> >> Best,
> >> Torsten.
> >>
> >>
> >> --
> >> Dr. Torsten Hahn
> >> torstenhahn at fastmail.fm
> >>
> >>> On 04.02.2015 at 07:27, jingzhe Chen <jingzhe.chen at gmail.com> wrote:
> >>>
> >>> Dear GPAW guys,
> >>>
> >>> I used the latest GPAW to run a relaxation job and got the
> >>> error message below.
> >>>
> >>> RuntimeError: Atoms objects on different processors are not identical!
> >>>
> >>> I found the line 'wfs.world.broadcast(self.F_av, 0)' in the force
> >>> calculator, so the forces on all ranks should be the same. This
> >>> confuses me; I cannot think of any other reason that could lead to
> >>> this error.
> >>>
> >>> Could anyone take a look at it?
> >>>
> >>> I attached the structure file and running script here; I used 24 cores.
> >>>
> >>> Thanks in advance.
> >>>
> >>> Jingzhe
> >>>
> >>>
> <main.py><model.traj>_______________________________________________
> >>> gpaw-users mailing list
> >>> gpaw-users at listserv.fysik.dtu.dk
> <mailto:gpaw-users at listserv.fysik.dtu.dk>
> >>> https://listserv.fysik.dtu.dk/mailman/listinfo/gpaw-users
> >>
> >
>
>
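
For the per-atom debugging discussed in this thread (finding which atoms differ between ranks, and by how much, relative to a tolerance), a minimal sketch could look like the following. The helper name mismatched_atoms, the tolerance value and the sample data are assumptions for illustration; in a real run the per-rank positions would first have to be gathered over MPI, which is simulated here with a plain list of arrays.

```python
import numpy as np


def mismatched_atoms(positions_by_rank, tol=1e-8):
    """Return (atom index, max deviation) pairs that exceed `tol`.

    `positions_by_rank` has shape (n_ranks, n_atoms, 3); deviations are
    measured against rank 0. Hypothetical helper, not part of GPAW or ASE.
    """
    stacked = np.asarray(positions_by_rank, dtype=float)
    # Largest absolute deviation from rank 0, per atom, over ranks and x/y/z.
    deviation = np.abs(stacked - stacked[0]).max(axis=(0, 2))
    return [(int(i), float(deviation[i]))
            for i in np.flatnonzero(deviation > tol)]


reference = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.1], [0.0, 1.5, 0.0]])
tiny = reference + 1e-12   # round-off sized noise, below the tolerance
drifted = reference.copy()
drifted[2, 1] += 1e-6      # atom 2 really differs on one simulated rank

report = mismatched_atoms([reference, tiny, drifted])
print(report)
```

Only atom 2 is reported here, with a deviation of about 1e-6; the 1e-12 noise on the second simulated rank stays below the tolerance.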