[gpaw-users] Restarting an optimization run
Michael Walter
Michael.Walter at fmf.uni-freiburg.de
Fri Jan 18 14:17:13 CET 2013
2013/1/18 Jens Jørgen Mortensen <jensj at fysik.dtu.dk>
> On 18-01-2013 12:46, Ask Hjorth Larsen wrote:
> > Hi
> >
> > Could someone summarize what exactly the problem with trajectory
> > restarts was? Isn't it just a question of fixing the optimizers so
> > they use all the data available?
>
> If you restart like this:
>
> atoms = read('a1.traj')
> atoms.set_calculator(GPAW(...))
> opt = Optimizer(atoms, trajectory='a2.traj')
> opt.run(fmax=0.05)
>
>
> then after the first line, atoms will have a SinglePointCalculator
> object as its calculator and this object knows about the forces from the
> last image in the trajectory.
>
> Second line: the SinglePointCalculator is replaced by a fresh GPAW
> calculator which doesn't know anything.
>
> Line 4: The forces are calculated again :-(
>
Is it that bad? The electronic structure, i.e. the Kohn-Sham states, has
to be calculated again anyway. It should cost little to calculate the
forces again.
In particular in an optimization run, which can take up to hundreds of steps
(my structure guesses are bad sometimes ;), this should be a vanishingly small
computational effort.
Or did I overlook something? I always restart from the trajectory file in
the way of the "bad" example above.
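
For reference, a minimal sketch of that kind of restart script, written so
that the same file handles both the first run and any continuation. The
helper names scan_runs and run_or_continue, the '<prefix>.<n>.traj' naming
scheme and the make_calculator callback are assumptions made up for this
illustration, not something ASE or GPAW prescribes. The forces for the last
stored image are simply recomputed by the fresh calculator.

```python
import os


def scan_runs(prefix):
    """Return (next_index, newest_non_empty_traj_or_None).

    Looks for '<prefix>.0.traj', '<prefix>.1.traj', ... (hypothetical naming).
    Empty files, e.g. from a job killed before its first image was written,
    are counted but never chosen as the restart source.
    """
    n = 0
    newest = None
    while os.path.isfile(f'{prefix}.{n}.traj'):
        path = f'{prefix}.{n}.traj'
        if os.path.getsize(path) > 0:
            newest = path
        n += 1
    return n, newest


def run_or_continue(initial_atoms, make_calculator, prefix='opt', fmax=0.05):
    """Start a new optimization or continue from the newest trajectory."""
    # Imported here so the file-handling helper above works without ASE.
    from ase.io import read
    from ase.optimize import BFGS

    next_index, newest = scan_runs(prefix)
    # read() returns the last image of a trajectory by default.
    atoms = read(newest) if newest else initial_atoms
    # A fresh calculator replaces the stored results, so the forces for
    # this image are calculated once more.
    atoms.calc = make_calculator()
    opt = BFGS(atoms,
               trajectory=f'{prefix}.{next_index}.traj',
               logfile=f'{prefix}.log')
    opt.run(fmax=fmax)
    return atoms
```

Each job writes to a new trajectory file, so nothing from earlier runs is
overwritten. The non-empty check is only a heuristic; a file truncated in
the middle of a write would still need to be handled by hand.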
Best,
Michael
>
> Jens Jørgen
>
> > Regards
> > Ask
> >
> > 2013/1/18 Jussi Enkovaara <jussi.enkovaara at aalto.fi>:
> >> On 2013-01-18 11:44, Jens Jørgen Mortensen wrote:
> >>> On 17-01-2013 17:31, Nichols A. Romero wrote:
> >>>> JJ,
> >>>>
> >>>> I have a different script for restarting.
> >>>>
> >>>> Note that this one doesn't have the shutdown observer
> >>>> http://en.pastebin.ca/2303938
> >>>>
> >>>> But I basically do as Jussi, I restart from HDF5.
> >>>>
> >>>> Which we highly encourage people to use, because it works very well.
> >>>> Even on 100,000 cores :)
> >>>>
> >>>> I agree with what you say, most people just submit structural
> >>>> optimization for the maximum walltime and just let GPAW run.
> >>>> Then it's just interrupted either in the middle of an SCF or between
> >>>> ionic steps.
> >>>>
> >>>> I think all these tricks should get documented somewhere. Because the
> >>>> most common example of a GPAW calculation is a structural optimization.
> >>> So, one needs to write a gpw or hdf file after every step in order to
> >>> make this work! Hmm ... these files are huge and contain a lot of
> >>> stuff that you normally don't need, and they also increase network
> >>> traffic. I wish there was a simple way to restart from a trajectory
> >>> file without losing the last step.
> >> I do not think restart files are that huge if you do not save the
> >> wavefunctions (they are in any case definitely larger than trajectory
> >> files). Also, I think that at least in some cases it is worthwhile
> >> to save restart files during the SCF cycles. However, I agree that
> >> it would be useful to be able to restart also from a trajectory.
> >>
> >>> We could work on making it possible to restart from GPAW's text output.
> >>> We would need to write all the digits for positions and unit cell in
> >>> order not to lose accuracy. Would this be a good idea?
> >> At least I am not very fond of writing all the digits to a text file...
> >> One possibility might be to copy the forces and energy which are read
> >> from the trajectory to the GPAW calculator, and indicate to GPAW that the
> >> calculation is already converged.
> >>
> >> With BFGS, one can do a single step directly after reading the image
> >> without calculator attached (just with the forces read from the image)
> >> e.g.
> >>
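> >> import os
> >> from ase.io import read
> >> from ase.io.trajectory import PickleTrajectory
> >> from ase.optimize import BFGS
> >>
> >> if os.path.isfile('opt.traj'):
> >>     atoms = read('opt.traj')
> >>     traj = PickleTrajectory('opt' + '.traj', 'a', atoms=atoms)
> >>     opt = BFGS(atoms, trajectory=traj, logfile='qn.log')
> >>     opt.run(steps=1)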
> >>
> >> but that does not work with optimizers that perform linesearch.
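
A toy illustration of that last point, in plain Python rather than ASE.
StoredResult is a made-up stand-in for a calculator that only remembers the
energy and force of one image; plain_step and linesearch_step are likewise
hypothetical names. A fixed-length step needs only the stored force, while a
backtracking line search has to ask for the energy at new trial positions,
which the stored result cannot provide.

```python
class StoredResult:
    """Toy stand-in for a calculator that only knows one stored image."""

    def __init__(self, x, energy, force):
        self.x, self.energy, self.force = x, energy, force

    def _check(self, x):
        if x != self.x:
            raise RuntimeError('no stored result for this position')

    def get_energy(self, x):
        self._check(x)
        return self.energy

    def get_force(self, x):
        self._check(x)
        return self.force


def plain_step(x, calc, alpha=0.1):
    """Move along the force with a fixed step: needs only the stored force."""
    return x + alpha * calc.get_force(x)


def linesearch_step(x, calc, alpha=1.0):
    """Backtracking line search: halve alpha until the energy goes down."""
    force = calc.get_force(x)
    e0 = calc.get_energy(x)
    for _ in range(20):
        # Each trial asks for the energy at a NEW position, which is not stored.
        if calc.get_energy(x + alpha * force) < e0:
            return x + alpha * force
        alpha *= 0.5
    return x


# E(x) = x**2 at x = 1.0: energy 1.0, force -dE/dx = -2.0
calc = StoredResult(x=1.0, energy=1.0, force=-2.0)
print(plain_step(1.0, calc))
try:
    linesearch_step(1.0, calc)
except RuntimeError as err:
    print('line search failed:', err)
```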
> >>
> >> Best regards,
> >> Jussi
> >>
> >>> Jens Jørgen
> >>>
> >>>> ----- Original Message -----
> >>>>> From: "Jens Jørgen Mortensen" <jensj at fysik.dtu.dk>
> >>>>> To: "Nichols A. Romero" <naromero at alcf.anl.gov>
> >>>>> Cc: gpaw-users at listserv.fysik.dtu.dk
> >>>>> Sent: Thursday, January 17, 2013 7:00:08 AM
> >>>>> Subject: Re: [gpaw-users] Restarting an optimization run
> >>>>> On 16-01-2013 17:55, Nichols A. Romero wrote:
> >>>>>> I should add that this method is clearly not fault tolerant, is that
> >>>>>> what you are thinking of? For example, some node has an error in
> >>>>>> the middle of a force evaluation which brings down the whole code.
> >>>>> No, I wasn't thinking about such cases.
> >>>>>
> >>>>> I think most people will just let their jobs run until they get killed by
> >>>>> the queuing system. If you do this:
> >>>>>
> >>>>> atoms = ...
> >>>>> atoms.set_calculator(GPAW(...))
> >>>>> opt = Optimizer(atoms, trajectory='a1.traj')
> >>>>> opt.run(fmax=0.05)
> >>>>>
> >>>>> Let's say GPAW is stopped in the middle of calculating the forces for
> >>>>> image 8. Then the last image in a1.traj will be image 7 and
> >>>>> corresponding forces. If you then do:
> >>>>>
> >>>>> atoms = read('a1.traj')
> >>>>> atoms.set_calculator(GPAW(...))
> >>>>> opt = Optimizer(atoms, trajectory='a2.traj')
> >>>>> opt.run(fmax=0.05)
> >>>>>
> >>>>> GPAW will recalculate the forces for image 7 ...
> >>>>>
> >>>>> How does one solve this problem? Read atoms and calculator from a gpw
> >>>>> file?
> >>>>>
> >>>>> Jens Jørgen
> >>>>>
> >>>>>> ----- Original Message -----
> >>>>>>> From: "Nichols A. Romero" <naromero at alcf.anl.gov>
> >>>>>>> To: "Jens Jørgen Mortensen" <jensj at fysik.dtu.dk>
> >>>>>>> Cc: gpaw-users at listserv.fysik.dtu.dk
> >>>>>>> Sent: Wednesday, January 16, 2013 10:49:38 AM
> >>>>>>> Subject: Re: [gpaw-users] Restarting an optimization run
> >>>>>>> JJ,
> >>>>>>>
> >>>>>>> I deal with it by not allowing it to happen.
> >>>>>>> http://en.pastebin.ca/2303447
> >>>>>>>
> >>>>>>> Basically, I use the requested scheduler time (PBS, LSF, or whatever)
> >>>>>>> and use that as a parameter to an observer.
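
The pastebin script is not reproduced here, so the following is only a
sketch of the general idea and not Nichols' actual code. WalltimeGuard and
OutOfTime are hypothetical names. The guard is called after every optimizer
step and raises once the remaining walltime is shorter than a safety factor
times the longest step seen so far, so the job stops between ionic steps
instead of being killed in the middle of a force calculation.

```python
import time


class OutOfTime(Exception):
    """Raised when another optimizer step is unlikely to fit in the walltime."""


class WalltimeGuard:
    """Callable observer; walltime_s is the time requested from the scheduler."""

    def __init__(self, walltime_s, safety=1.5, clock=time.monotonic):
        self.walltime_s = walltime_s
        self.safety = safety          # margin on top of the longest step
        self.clock = clock            # injectable so the guard can be tested
        self.start = clock()
        self.last = self.start
        self.longest_step = 0.0

    def __call__(self):
        now = self.clock()
        self.longest_step = max(self.longest_step, now - self.last)
        self.last = now
        remaining = self.walltime_s - (now - self.start)
        if remaining < self.safety * self.longest_step:
            raise OutOfTime(f'{remaining:.0f} s left, stopping cleanly')


# Intended use with an ASE optimizer (assumes ASE is installed):
#
#     guard = WalltimeGuard(walltime_s=24 * 3600)
#     opt.attach(guard, interval=1)   # called after each step
#     try:
#         opt.run(fmax=0.05)
#     except OutOfTime:
#         pass                        # last finished step is in the trajectory
```

The longest step is used as the estimate because the first step of a run,
which starts without converged wave functions, is usually the slowest one.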
> >>>>>>>
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>>> From: "Jens Jørgen Mortensen" <jensj at fysik.dtu.dk>
> >>>>>>>> To: gpaw-users at listserv.fysik.dtu.dk
> >>>>>>>> Sent: Wednesday, January 16, 2013 10:34:27 AM
> >>>>>>>> Subject: [gpaw-users] Restarting an optimization run
> >>>>>>>> Hi!
> >>>>>>>>
> >>>>>>>> I'd like to know how people continue optimization runs with GPAW that
> >>>>>>>> are killed in the middle of a force calculation.
> >>>>>>>>
> >>>>>>>> Do you have some if-else magic in your script to handle both the first
> >>>>>>>> run and a continuation run, or do you just edit the first script to start
> >>>>>>>> from the last image in the trajectory file from the previous run?
> >>>>>>>>
> >>>>>>>> Do you worry about not repeating the force calculation for the last
> >>>>>>>> image in the trajectory file you continue from? If yes, how?
> >>>>>>>>
> >>>>>>>> Jens Jørgen
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> gpaw-users mailing list
> >>>>>>>> gpaw-users at listserv.fysik.dtu.dk
> >>>>>>>> https://listserv.fysik.dtu.dk/mailman/listinfo/gpaw-users
> >>>>>>> --
> >>>>>>> Nichols A. Romero, Ph.D.
> >>>>>>> Argonne Leadership Computing Facility
> >>>>>>> Argonne National Laboratory
> >>>>>>> Building 240 Room 2-127
> >>>>>>> 9700 South Cass Avenue
> >>>>>>> Argonne, IL 60490
> >>>>>>> (630) 252-3441
> >>>>>>>
> >>>>>>>
--
------------------------------------------
PD Dr Michael Walter
Address: Freiburger Materialforschungszentrum
Stefan-Meier-Straße 21
D-79104 Freiburg i. Br.
Germany
Tel.: +49 761 203 4758 and +49 761 203 7695
Fax: +49 761 203 4701
email: Michael.Walter at fmf.uni-freiburg.de
www: http://omnibus.uni-freiburg.de/~mw767
publications: http://scholar.google.com/citations?user=vlmryKEAAAAJ&hl=en