[ase-users] [ase-developers] ASE calculator interface proposal

Fri Feb 15 07:48:40 CET 2013

> Hi there,
>
> I am looking for a way to run a NEB calculation with image
> parallelization using Siesta.
> In my calculations, 6 CPUs are distributed over 3 images, so each
> image should run on 2 CPUs.
> Since the Siesta calculator does not directly support this kind of
> parallelization, I tried a workaround: starting the NEB Python
> script with
>
> mpirun -np 3 neb.py
>
> and, in "run_siesta.py", changing the command that starts Siesta
>
> from "siesta" to "mpirun -np 2 siesta".
>
> neb.py starts up correctly, but when Siesta is initialized it
> crashes with the error message "OOB: Connection to HNP lost" and no
> further information. The Siesta output is empty.
>
> If I don't change "siesta" to "mpirun -np 2 siesta", it runs fine.

Looks like running MPI from inside MPI doesn't work - which means you 
are out of luck with our current NEB implementation in ASE. Someone 
could make our NEB implementation run in three threads instead of three 
MPI processes - I think such a multithreaded NEB would work for your case.
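
A minimal sketch of that idea, not ASE's NEB implementation: a single Python process starts one thread per image, and each thread launches its own external command, so no mpirun is started from inside another MPI job. The helper name `run_image_commands` and the stand-in commands are hypothetical. A real multithreaded NEB would also have to collect energies and forces from each run.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_image_commands(commands):
    """Run one external command per NEB image, each in its own thread.

    Returns the exit codes in the same order as `commands`.
    """
    def run(command):
        # subprocess.call blocks only this thread; other images keep running.
        return subprocess.call(command)

    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        return list(pool.map(run, commands))


# Stand-in commands so the sketch runs anywhere. In practice each entry
# would be something like ['mpirun', '-np', '2', 'siesta'], started in the
# image's own working directory (assumption, not tested with Siesta).
demo_commands = [[sys.executable, '-c', 'import sys; sys.exit(0)']
                 for _ in range(3)]
exit_codes = run_image_commands(demo_commands)
```

Threads are enough here because the Python side only waits for the external programs; the heavy work happens in the child processes.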

Jens Jørgen

> Cheers and thanks for any help in advance
> Benedikt Ziebarth
>
>
>
>
> import mpi4py
> from ase import *
> import ase.io as io
> from ase.calculators.siesta import Siesta
> import time
> from ase.optimize import MDMin
> import os
> from ase.neb import NEB
> from ase.parallel import rank, size
> from ase.io.trajectory import PickleTrajectory
>
>
>
> initial = io.read('init.traj')
> final = io.read('final.traj')
>
> numimages=3
> print size
> print rank
> assert numimages == size
>
>
> images = [initial]
> calc=['z']*numimages
> for i in range(numimages):
>     print calc[i]
>     calc[i] = Siesta(label='IMAGE_%d' % i,
>                      xc='PBE',
>                      meshcutoff=200 * 13.6,
>                      basis='dzp',
>                      kpts=[1, 1, 4])
>     calc[i].set_fdf('Diag.ParallelOverK',True)
> for i in range(numimages):
>       image = initial.copy()
>       if i == rank:
>           image.set_calculator(calc[i])
>
>       images.append(image)
> images.append(final)
> time.sleep(rank * 1)  # needed to avoid some copy errors of the pseudopotential files
> neb = NEB(images, parallel=True)
> neb.interpolate()
> qn = MDMin(neb)
>
> time.sleep(rank * 1)  # needed to avoid some copy errors of the pseudopotential files
> traj = PickleTrajectory('neb%d.traj' % rank, 'w', images[1 + rank],
>                         master=True)
> qn.attach(traj)
> qn.run(fmax=0.05)

>>>
>>> myatoms = Atoms(..., ideal positions)
>>>
>>> calc = MyCalculator(arg1, atoms=myatoms, kwargs)
>>> # I greatly prefer this to myatoms.set_calculator(calc)
>>> myatoms.get_potential_energy()
>>
>> I like that way of attaching the calculator to the atoms.  I'll put that
>> idea in the proposal.
>>
>>> 1. The first time you run this, a calculation gets run.
>
> As a personal preference :-), I do not like heavy calculations
> being run when one creates an object (even though some GPAW
> functionality behaves this way); I would rather the user had to
> request the calculation explicitly by calling a function.
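
To make the preference above concrete, here is a minimal sketch of a calculator whose constructor only stores parameters and whose work is deferred until a result is requested. `LazyCalculator`, its harmonic toy energy, and the `n_calculations` counter are all hypothetical and not part of ASE.

```python
class LazyCalculator:
    """Toy calculator: construction is cheap, work happens on request."""

    def __init__(self, spring_constant=1.0):
        # Only parameters are stored here; nothing is computed yet.
        self.spring_constant = spring_constant
        self.n_calculations = 0
        self._cache = {}

    def get_potential_energy(self, positions):
        key = tuple(positions)
        if key not in self._cache:
            # The "heavy" step runs only when a result is requested and
            # this configuration has not been calculated before.
            self.n_calculations += 1
            self._cache[key] = 0.5 * self.spring_constant * sum(
                x * x for x in positions)
        return self._cache[key]


calc = LazyCalculator(spring_constant=2.0)
calls_after_init = calc.n_calculations                # nothing has run yet
energy = calc.get_potential_energy([1.0, 2.0])        # triggers the calculation
energy_again = calc.get_potential_energy([1.0, 2.0])  # served from the cache
```

With this layout, creating the object never starts a job, and asking twice for the same configuration does not repeat the work.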
>
> Also, I think the above approach raises the question of whether
> one attaches atoms to a calculator and asks the calculator to
> determine physical quantities for that atomic configuration, or
> attaches a calculator to atoms and requests physical quantities from
> the atoms object, i.e. which comes first, atoms or calculator? There
> is already some controversy, as some quantities are requested from
> the atoms (energy, forces) and some from the calculator
> (wavefunctions, densities). Personally, I do not have strong feelings
> on this matter: asking the atoms is perhaps a little more physics
> oriented (the calculator is just a black box providing numbers),
> whereas asking the calculator would unify things in the sense that
> everything is requested from the calculator (the quantities available
> from an MD calculator are of course quite different from the ones
> available from a DFT calculator).
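
The two call styles discussed above can be put side by side in a small sketch. `ToyAtoms` and `ToyCalculator` are hypothetical stand-ins, not the ASE classes; the atoms-first method simply forwards the request to the attached calculator.

```python
class ToyCalculator:
    """Hypothetical calculator: energy is the sum of squared coordinates."""

    def get_potential_energy(self, atoms):
        return sum(x * x for x in atoms.positions)


class ToyAtoms:
    """Hypothetical stand-in for an atoms object holding 1D positions."""

    def __init__(self, positions, calculator=None):
        self.positions = list(positions)
        self.calculator = calculator

    def get_potential_energy(self):
        # Atoms-first style: the atoms object forwards the request.
        if self.calculator is None:
            raise RuntimeError('no calculator attached')
        return self.calculator.get_potential_energy(self)


calc = ToyCalculator()
atoms = ToyAtoms([1.0, 2.0], calculator=calc)

energy_from_atoms = atoms.get_potential_energy()     # ask the atoms
energy_from_calc = calc.get_potential_energy(atoms)  # ask the calculator
```

Both routes return the same number in this sketch. The difference is which object the user talks to, and where calculator-specific quantities such as wavefunctions would naturally live.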
>
> One question that should maybe also be discussed is whether there
> should be a standard way to specify the command/binary to be executed
> when a calculator is run. Some calculators (e.g. Siesta) request a
> Python script containing the actual command to be executed, while
> other calculators (e.g. Castep) ask for a shell command. I prefer the
> shell command, as it makes it easier to specify e.g. the number of
> CPUs within a batch job script (at least for a casual user who is
> used to doing something like 'mpirun -np 8 siesta'). People wanting
> to script everything may prefer the first option...
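
One possible shape for the shell-command option, as a sketch only: the calculator takes a default command string, and an environment variable set in the batch job script can override it. The variable name `TOYCALC_COMMAND` and both helper functions are hypothetical and do not describe any existing ASE calculator.

```python
import os
import shlex
import subprocess
import sys


def resolve_command(default, env_var='TOYCALC_COMMAND'):
    """Return the launch command as an argument list.

    The environment variable (hypothetical name) overrides the default, so
    a batch script can set e.g. 'mpirun -np 8 siesta' without editing Python.
    """
    return shlex.split(os.environ.get(env_var, default))


def run_calculation(default_command):
    """Run the resolved command and return its exit code."""
    return subprocess.call(resolve_command(default_command))


# Default: a stand-in command built from the current interpreter.
default = '%s -c "pass"' % shlex.quote(sys.executable)
exit_code = run_calculation(default)

# A batch job script could export the variable instead.
os.environ['TOYCALC_COMMAND'] = 'mpirun -np 8 siesta'
overridden = resolve_command(default)
del os.environ['TOYCALC_COMMAND']
```

Splitting the string with `shlex.split` keeps the familiar shell-style syntax while avoiding `shell=True`. People who want to script everything can still pass a different default from Python.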
>
> Best regards,
> Jussi