[ase-users] Parallelization over images with arbitrary calculator
Jens Jørgen Mortensen
jensj at fysik.dtu.dk
Fri Feb 15 08:00:26 CET 2013
> Hi there,
>
> I am looking for a way to start a NEB calculation with image
> parallelization using siesta.
> In my calculations 6 CPUs are distributed over 3 images, therefore each
> image should run with 2 CPUs.
> Since the siesta calculator does not directly support this kind of
> parallelization, I tried a workaround: starting the NEB Python
> script using
>
> mpirun -np 3 neb.py
>
> and in "run_siesta.py" I changed how siesta is started
>
> from "siesta" to "mpirun -np 2 siesta"
>
> the neb.py starts up correctly but when siesta is initialized, it just
> crashed with the error message "OOB: Connection to HNP lost" and no
> further information or error message. The siesta output is empty.
>
> If I don't change "siesta" to "mpirun -np 2 siesta" it runs fine.
Looks like running MPI from inside MPI doesn't work - which means you
are out of luck with our current NEB implementation in ASE. Someone
could make our NEB implementation run in three threads instead of three
MPI processes - I think such a multithreaded NEB would work for your case.
Jens Jørgen
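
The threaded layout suggested above can be sketched roughly as follows. This is only an illustration of the process structure, not ASE's NEB: a single ordinary Python process starts one thread per image, and each thread launches its own external command. The function name run_images_in_threads is made up for this sketch, and the commands in the demo are harmless stand-ins; whether "mpirun -np 2 siesta" behaves well when started this way would still have to be tested on the actual cluster.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_images_in_threads(commands):
    """Run one external command per image, each from its own thread.

    `commands` is a list of argument lists, one per image.  Returns the
    exit codes in the same order as `commands`.
    (Hypothetical helper for illustration; not part of ASE.)
    """
    def launch(argv):
        # Each thread blocks on its own child process; the Python driver
        # itself is a single, non-MPI process.
        return subprocess.run(argv).returncode

    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        return list(pool.map(launch, commands))


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere.  In the real case each
    # entry would be something like ["mpirun", "-np", "2", "siesta"],
    # started in that image's own working directory.
    demo = [[sys.executable, "-c", "print('image %d done')" % i]
            for i in range(3)]
    print(run_images_in_threads(demo))
```

The point of the design is that the driver is no longer started with "mpirun -np 3", so the only MPI jobs are the ones launched for the individual images.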
> Cheers and thanks for any help in advance
> Benedikt Ziebarth
>
>
>
>
> import mpi4py
> from ase import *
> import ase.io as io
> from ase.calculators.siesta import Siesta
> import time
> from ase.optimize import MDMin
> import os
> from ase.neb import NEB
> from ase.parallel import rank, size
> from ase.io.trajectory import PickleTrajectory
> import time
>
>
>
> initial = io.read('init.traj')
> final = io.read('final.traj')
>
> numimages=3
> print size
> print rank
> assert numimages == size
>
>
> images = [initial]
> calc = ['z'] * numimages
> for i in range(numimages):
>     print calc[i]
>     calc[i] = Siesta(label='IMAGE_%d' % i,
>                      xc='PBE',
>                      meshcutoff=200 * 13.6,
>                      basis='dzp',
>                      kpts=[1, 1, 4])
>     calc[i].set_fdf('Diag.ParallelOverK', True)
> for i in range(numimages):
>     image = initial.copy()
>     if i == rank:
>         image.set_calculator(calc[i])
>
>     images.append(image)
> images.append(final)
> time.sleep(rank * 1)  # needed to avoid some copy errors of the pseudopotential files
> neb = NEB(images, parallel=True)
> neb.interpolate()
> qn = MDMin(neb)
>
> time.sleep(rank * 1)  # needed to avoid some copy errors of the pseudopotential files
> traj = PickleTrajectory('neb%d.traj' % rank, 'w', images[1 + rank],
>                         master=True)
> qn.attach(traj)
> qn.run(fmax=0.05)
>>>
>>> myatoms = Atoms(..., ideal positions)
>>>
>>> calc = MyCalculator(arg1, atoms=myatoms, kwargs)
>>> # I greatly prefer this to myatoms.set_calculator(calc)
>>> myatoms.get_potential_energy()
>>
>> I like that way of attaching the calculator to the atoms. I'll put that
>> idea in the proposal.
> >
> >> 1. The first time you run this, a calculation gets run.
>
> As a personal preference :-), I do not like heavy calculations
> being run when one creates an object (even though some GPAW
> functionality behaves this way); I would rather the user had to
> explicitly request a calculation by calling a function.
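
To make the preference above concrete, here is a small toy sketch in which creating and attaching the calculator costs nothing, and the "expensive" step only runs when the energy is explicitly requested (and is reused while the positions are unchanged). ToyCalculator and ToyAtoms are hypothetical classes invented for this illustration; they are not the ASE classes and do not reflect any agreed interface.

```python
class ToyCalculator:
    """Hypothetical calculator: construction is cheap, work is deferred."""

    def __init__(self, atoms=None):
        self.ncalls = 0      # how many times the "expensive" step has run
        self._cache = None   # (positions, energy) of the last calculation
        if atoms is not None:
            atoms.calc = self    # attach to the atoms, but do not calculate

    def get_potential_energy(self, atoms):
        key = tuple(atoms.positions)
        if self._cache is None or self._cache[0] != key:
            # Only recalculate when the positions have changed.
            self.ncalls += 1
            energy = sum(x * x for x in key)  # stand-in for a real calculation
            self._cache = (key, energy)
        return self._cache[1]


class ToyAtoms:
    """Hypothetical atoms object that forwards requests to its calculator."""

    def __init__(self, positions):
        self.positions = list(positions)
        self.calc = None

    def get_potential_energy(self):
        return self.calc.get_potential_energy(self)


atoms = ToyAtoms([1.0, 2.0])
calc = ToyCalculator(atoms=atoms)   # nothing is calculated here
energy = atoms.get_potential_energy()   # the calculation happens here
```

Either calling convention (asking the atoms or asking the calculator) can sit on top of the same deferred calculation, so this choice is independent of the "which comes first" question.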
>
> Also, I think that the above approach raises the question of whether
> one attaches atoms to a calculator and asks the calculator to determine
> physical quantities for this atomic configuration, or attaches a
> calculator to atoms and asks the atoms object for physical quantities,
> i.e. which comes first, atoms or calculator? There is already some
> controversy, as some quantities are requested from atoms (energy,
> forces) and some from the calculator (wavefunctions, densities).
> Personally, I do not have strong feelings on this matter. Asking the
> atoms is maybe a little more physics oriented (the calculator is just
> a black box providing numbers); on the other hand, asking the calculator
> would unify things in the sense that everything is requested from the
> calculator (the quantities available from an MD calculator are of course
> quite different from the ones available from a DFT calculator).
>
> One question that should maybe also be discussed is whether there should
> be a standard way to specify the command/binary to be executed when a
> calculator is run. Some calculators (e.g. Siesta) request a Python
> script containing the actual command to be executed, while other
> calculators (e.g. Castep) ask for a shell command. I prefer the shell
> command, as it makes it easier to specify e.g. the number of CPUs within
> a batch job script (at least for a casual user who is used to doing
> something like 'mpirun -np 8 siesta'). People wanting to script
> everything may prefer the first option...
>
> Best regards,
> Jussi
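
One possible shape for the shell-command option described above, as a rough sketch only: the calculator reads a command string from an environment variable, falls back to a default, and runs it with subprocess. The function names and the idea of passing the variable name as an argument are assumptions made for this example; this is not an existing ASE interface.

```python
import os
import shlex
import subprocess


def resolve_command(env_var, default):
    """Return the launch command as an argument list.

    The command string is taken from the environment variable `env_var`
    if it is set, otherwise from `default`.
    (Hypothetical helper for illustration; not part of ASE.)
    """
    return shlex.split(os.environ.get(env_var, default))


def run_calculator(env_var, default, cwd=None):
    """Run the resolved command in directory `cwd`; return its exit code."""
    return subprocess.run(resolve_command(env_var, default), cwd=cwd).returncode
```

In a batch job script one would then only export the variable, for example with the value 'mpirun -np 8 siesta', and leave the Python script untouched.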