[gpaw-users] GPAW with mpirun on multiple hosts

Jens Jørgen Mortensen jensj at fysik.dtu.dk
Mon Jan 10 09:14:52 CET 2011


On Fri, 2011-01-07 at 05:57 -0800, Chris Willmore wrote:
> Hi Jens,
> 
> Yes, when running w.py on both hosts, the processes spin forever unless
> I intervene. Running on a single host, it finishes in under 5 minutes.
> 
> The two-line script prints exactly the output you predicted.
> 
> I'm running into trouble executing
> 
>   mpirun --hostfile mpihosts gpaw-python /usr/bin/gpaw-test
> 
> The output shows an error on the remote host about a missing file like
> "tmp/gpaw-test-XXXX".

Hmm.  gpaw-test creates a temporary directory in /tmp that needs to be
accessible from all processes, and that is evidently not the case for
your setup.  Can you run the tests individually?

mpirun --hostfile mpihosts gpaw-python /path/to/source/gpaw/test/bulk.py
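
To confirm that it really is a shared-filesystem problem, a rough sketch
along these lines (the script name and path are placeholders) lets each
rank report whether a directory created by rank 0 is visible on its host:

  # shared_tmp_check.py: run with
  #   mpirun --hostfile mpihosts gpaw-python shared_tmp_check.py
  import os
  import socket
  import gpaw.mpi as mpi

  path = '/tmp/gpaw-shared-check'   # placeholder path
  if mpi.rank == 0 and not os.path.isdir(path):
      os.mkdir(path)
  mpi.world.barrier()               # wait until rank 0 has created it
  print mpi.rank, socket.gethostname(), os.path.isdir(path)

If the rank on the other machine prints False, /tmp is local to each
host rather than shared.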

JJ

> Regards,
> Chris
> 
> 
> 
> 
> ______________________________________________________________________
> From: Jens Jørgen Mortensen <jensj at fysik.dtu.dk>
> To: Chris Willmore <chris.willmore at yahoo.com>
> Cc: gpaw-users at listserv.fysik.dtu.dk
> Sent: Fri, January 7, 2011 3:24:13 PM
> Subject: Re: [gpaw-users] GPAW with mpirun on multiple hosts
> 
> On Fri, 2011-01-07 at 04:34 -0800, Chris Willmore wrote:
> > Hi All,
> > 
> > I am a computer science student working with a researcher, attempting
> > to scale his GPAW calculations.  Currently we are trying to run a job
> > across two hosts, each with a single CPU.  The GPAW process runs on
> > both hosts, but no data is written to disk as expected.  Running the
> > script on only one host works as expected.
> > 
> > We would be grateful for any advice or directions on how to fix our
> > code and/or configuration.
> > 
> > Please find details below.
> > 
> > Thank you,
> > Chris Willmore
> > Software Engineer Master's Student, University of Tartu
> > 
> > The command being run is:
> > 
> > $ mpirun --hostfile mpihosts gpaw-python w.py
> 
> Do the two processes continue to run forever?  What happens if you
> replace w.py with a script containing just these two lines:
> 
> import gpaw.mpi as mpi
> print mpi.rank, mpi.size
> 
> Do you get something like this:
> 
> 0 2
> 1 2
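> 
> (Assuming the two lines are saved as, say, rank.py, a placeholder name,
> the launch command is the same as for w.py:
> 
>   mpirun --hostfile mpihosts gpaw-python rank.py
> 
> If each process instead prints "0 1", the two gpaw-python processes are
> not communicating over MPI at all.)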
> 
> You can also try to run the testsuite in parallel:
> 
>   mpirun --hostfile mpihosts gpaw-python /path/to/gpaw-source/tools/gpaw-test
> 
> Jens Jørgen
> 
> > The mpihosts file contains two lines, each with a node IP and slots="1"
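> > 
> > For reference, the usual Open MPI hostfile form is something like this
> > (the addresses are placeholders):
> > 
> >   192.168.0.1 slots=1
> >   192.168.0.2 slots=1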
> > 
> > w.py contents:
> > ------------------------------
> > from ase import *
> > from gpaw import *
> > from math import *
> > from ase.lattice.surface import fcc111
> > 
> > x=0.77
> > y=0.59
> > H2O = Atoms([Atom('O', (0, 0, 0)),
> >              Atom('H', (x, y, 0)),
> >              Atom('H', (-x, y, 0))],pbc=False)
> > H2O.center(vacuum=6)
> > 
> > # Initial state:
> > calc = GPAW(h=0.25, txt='w.txt', parallel={'domain':2}, xc='RPBE')
> > H2O.set_calculator(calc)
> > qn=QuasiNewton(H2O,trajectory='w.traj',restart='w.pckl')
> > qn.attach(calc.write,1,'w.gpw')
> > qn.run(fmax=0.05)
> > calc.write('w.gpw')
> > --------------------------------
> > 
> > 
> 
> 


