[gpaw-users] Restarting an optimization run

Jens Jørgen Mortensen jensj at fysik.dtu.dk
Thu Jan 17 14:00:08 CET 2013


On 16-01-2013 17:55, Nichols A. Romero wrote:
> I should add that this method is clearly not fault tolerant; is that what you are thinking of? For example, some node has an error in the middle of a force evaluation, which brings down the whole code.

No, I wasn't thinking about such cases.

I think most people will just let their jobs run until they get killed by 
the queuing system.  If you do this:

from gpaw import GPAW

# "Optimizer" stands for whichever ASE optimizer class is used
atoms = ...
atoms.set_calculator(GPAW(...))
opt = Optimizer(atoms, trajectory='a1.traj')
opt.run(fmax=0.05)

Let's say GPAW is stopped in the middle of calculating the forces for 
image 8.  Then the last image in a1.traj will be image 7 and the 
corresponding forces.  If you then do:

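from ase.io import read
from gpaw import GPAW

# read() returns the last image in the file by default
atoms = read('a1.traj')
atoms.set_calculator(GPAW(...))
opt = Optimizer(atoms, trajectory='a2.traj')
opt.run(fmax=0.05)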

GPAW will recalculate the forces for image 7 ...
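
The first-run/continuation choice in the two scripts above could be folded into one script. Here is a minimal sketch, assuming the trajectories are numbered a1.traj, a2.traj, ... as above; the helper name restart_plan and its arguments are made up for illustration and are not part of ASE or GPAW:

```python
import os
import re


def restart_plan(prefix="a", directory="."):
    """Return (read_from, write_to) trajectory paths for this run.

    read_from is None when no usable trajectory exists yet (first run).
    """
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.traj")
    numbers = []
    for name in os.listdir(directory):
        match = pattern.fullmatch(name)
        if match is None:
            continue
        # Ignore empty files: a job killed before the first image was
        # written leaves nothing to restart from.
        if os.path.getsize(os.path.join(directory, name)) > 0:
            numbers.append(int(match.group(1)))
    if not numbers:
        return None, os.path.join(directory, f"{prefix}1.traj")
    last = max(numbers)
    return (os.path.join(directory, f"{prefix}{last}.traj"),
            os.path.join(directory, f"{prefix}{last + 1}.traj"))


# Usage sketch (atoms setup, GPAW parameters and Optimizer are elided,
# as in the scripts above):
#
# read_from, write_to = restart_plan("a")
# atoms = ... if read_from is None else read(read_from)
# atoms.set_calculator(GPAW(...))
# opt = Optimizer(atoms, trajectory=write_to)
# opt.run(fmax=0.05)
```

This only removes the need to edit the script between runs.  The forces for the last image in the file that is read are still recalculated.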

How does one solve this problem?  Read atoms and calculator from a gpw file?
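
For comparison, the approach in the quoted message below (passing the requested scheduler time to an observer) might look roughly like this.  It is only a sketch and not a reproduction of the linked script: the class names, the safety margin and the "longest step so far" estimate are all assumptions, and it relies on the optimizer calling attached observers once per step:

```python
import time


class WallTimeLimitReached(Exception):
    """Raised to stop the run before the queuing system kills the job."""


class WallTimeGuard:
    """Observer that stops the run when the next step might not fit.

    walltime and safety_margin are in seconds; both are supplied by the
    user (e.g. copied from the batch script), not read from the scheduler.
    """

    def __init__(self, walltime, safety_margin=0.0, clock=time.monotonic):
        self.clock = clock
        start = clock()
        self.deadline = start + walltime - safety_margin
        self.last_call = start
        self.longest_step = 0.0

    def __call__(self):
        now = self.clock()
        # Time since the previous call approximates one optimization step.
        self.longest_step = max(self.longest_step, now - self.last_call)
        self.last_call = now
        # Stop if a step as long as the longest one seen would overrun.
        if now + self.longest_step > self.deadline:
            raise WallTimeLimitReached("not enough wall time for another step")


# Usage sketch (opt is the optimizer from the scripts above):
#
# opt.attach(WallTimeGuard(walltime=3600, safety_margin=300))
# try:
#     opt.run(fmax=0.05)
# except WallTimeLimitReached:
#     pass  # last completed image is in the trajectory file
```

In a parallel run all ranks would have to agree on when to stop; that is not handled here.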

Jens Jørgen

> ----- Original Message -----
>> From: "Nichols A. Romero" <naromero at alcf.anl.gov>
>> To: "Jens Jørgen Mortensen" <jensj at fysik.dtu.dk>
>> Cc: gpaw-users at listserv.fysik.dtu.dk
>> Sent: Wednesday, January 16, 2013 10:49:38 AM
>> Subject: Re: [gpaw-users] Restarting an optimization run
>> JJ,
>>
>> I deal with it by not allowing it to happen.
>> http://en.pastebin.ca/2303447
>>
>> Basically, I use the requested scheduler time (PBS, LSF, or whatever)
>> and use that as a parameter to an observer.
>>
>>
>> ----- Original Message -----
>>> From: "Jens Jørgen Mortensen" <jensj at fysik.dtu.dk>
>>> To: gpaw-users at listserv.fysik.dtu.dk
>>> Sent: Wednesday, January 16, 2013 10:34:27 AM
>>> Subject: [gpaw-users] Restarting an optimization run
>>> Hi!
>>>
>>> I'd like to know how people continue optimization runs with GPAW
>>> that are killed in the middle of a force-calculation.
>>>
>>> Do you have some if-else magic in your script to handle both the
>>> first run and a continuation run, or do you just edit the first
>>> script to start from the last image in the trajectory file from
>>> the previous run?
>>>
>>> Do you worry about not repeating the force-calculation for the last
>>> image in the trajectory file you continue from? If yes, how?
>>>
>>> Jens Jørgen
>>>
>>> _______________________________________________
>>> gpaw-users mailing list
>>> gpaw-users at listserv.fysik.dtu.dk
>>> https://listserv.fysik.dtu.dk/mailman/listinfo/gpaw-users
>> --
>> Nichols A. Romero, Ph.D.
>> Argonne Leadership Computing Facility
>> Argonne National Laboratory
>> Building 240 Room 2-127
>> 9700 South Cass Avenue
>> Argonne, IL 60490
>> (630) 252-3441


