[ase-users] [ase-developers] Genetic algorithm added to ase

Lasse Vilhelmsen lassebv at phys.au.dk
Fri Mar 14 19:14:04 CET 2014


I will look into your suggestion of using the integer ID instead, although it will require changes outside the data module. I agree that the two approaches can lead to similar results, but I think using time stamps is actually better, since it does not make any assumptions about the underlying data structure.

Not being able to use string matching will make that part of the implementation slightly more complicated, but it will probably be okay.

I am on vacation next week, but I will follow up on the implementation the week after.

Best Regards
Lasse

On 12 Mar 2014, at 09:20, Jens Jørgen Mortensen <jensj at fysik.dtu.dk> wrote:

On 11-03-2014 09:26, Lasse Vilhelmsen wrote:
Hi JJ,

I have been working on changing the data.py module to use ase.db, and so far it is going quite well. I am very impressed with the elegance this functionality introduces, and I have the overall functionality in place.

Thanks!

I have, however, come across two difficulties where I am unsure whether I am using the ase.db module incorrectly or whether some functionality could be added.

The first concerns date/time comparisons. Data is added to the database while the GA is running, and the method therefore needs to extract this data for use in generating the population. To do this as efficiently as possible, I keep track of the time at which data was last extracted and then only ask for data newer than that. I would thus like to do something like
entries = self.c.select('ctime>{0}'.format(since), relaxed=1)
where since is a datetime object. From reading the core.py file it seems like a bit of functionality needs to be added to make the above statement possible, but I am not sure I fully understand the logic behind the ctime and mtime implementation. In my application of sqlite I have simply stored times using the timestamp data type within sqlite and datetime comparisons then just work out of the box.
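As a minimal sketch of the plain-sqlite approach described above, the following stores creation times as ISO 8601 strings and selects the rows newer than a given datetime. The table name, column names and sample times are hypothetical; this is neither the ase.db schema nor the GA's actual data module.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical table; not the schema used by ase.db or by the GA code.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE candidates '
            '(id INTEGER PRIMARY KEY, ctime TEXT, relaxed INTEGER)')

# Insert three rows created 10 minutes apart.
t0 = datetime(2014, 3, 11, 9, 0, 0)
for minutes in (0, 10, 20):
    stamp = (t0 + timedelta(minutes=minutes)).isoformat(sep=' ')
    con.execute('INSERT INTO candidates (ctime, relaxed) VALUES (?, 1)',
                (stamp,))

# ISO 8601 strings of equal length sort chronologically, so a plain
# string comparison picks out the rows created after `since`.
since = t0 + timedelta(minutes=5)
new_ids = [row[0] for row in con.execute(
    'SELECT id FROM candidates WHERE ctime > ? AND relaxed = 1 ORDER BY id',
    (since.isoformat(sep=' '),))]
print(new_ids)
```

Note that with a strict comparison, a row written at exactly the same time stamp as `since` would not be returned.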

Using ctime and mtime isn't supported right now.  You are supposed to use something like:

  c.select('age<1h')

But for the GA work I would think that it would be better not to rely on time.  Could you use the integer ID instead?  It is unique and increasing with every new row added.
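The ID-based alternative can be sketched in the same way. The example below uses the standard-library sqlite3 module as a stand-in for ase.db, and the table, the column names and the helper `fetch_new` are hypothetical. It only illustrates the bookkeeping: remember the highest ID seen so far and ask for rows above it.

```python
import sqlite3

# Hypothetical table standing in for the GA database.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE candidates '
            '(id INTEGER PRIMARY KEY AUTOINCREMENT, energy REAL)')


def fetch_new(con, last_id):
    """Return the rows with id > last_id and the highest id seen so far."""
    rows = con.execute(
        'SELECT id, energy FROM candidates WHERE id > ? ORDER BY id',
        (last_id,)).fetchall()
    if rows:
        last_id = rows[-1][0]
    return rows, last_id


last_id = 0
con.executemany('INSERT INTO candidates (energy) VALUES (?)',
                [(-1.0,), (-1.5,)])
first_batch, last_id = fetch_new(con, last_id)

# A row added later is picked up by the next call only.
con.execute('INSERT INTO candidates (energy) VALUES (?)', (-2.0,))
second_batch, last_id = fetch_new(con, last_id)
print(first_batch, second_batch, last_id)
```

Because the ID grows with every inserted row, this does not depend on the clocks of the machines writing to the database.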

The second question concerns string matching in select statements. Would it perhaps be possible to implement the SQL "LIKE" functionality so that one can do string matching when using the select statement so that statements like
entries = self.c.select(description='pairing:%')
would translate into SQL along the lines of
SELECT ? FROM ?? WHERE some_field LIKE 'pairing:%'
I understand that the above functionality might be implemented with the use of a new keyword for each 'key:%' one wants to employ, but I think the above would make for a more flexible functionality in many cases since it would then also be possible to select on arbitrary substrings.
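For reference, the kind of prefix match meant here looks as follows in plain sqlite3. This is a sketch against a hypothetical table with made-up descriptions, not a proposal for the ase.db select syntax.

```python
import sqlite3

# Hypothetical table and descriptions, for illustration only.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE candidates '
            '(id INTEGER PRIMARY KEY, description TEXT)')
con.executemany('INSERT INTO candidates (description) VALUES (?)',
                [('pairing:12-17',), ('mutation:rattle',), ('pairing:3-8',)])

# '%' matches any sequence of characters, so this selects every row
# whose description starts with 'pairing:'.
pattern = 'pairing:%'
paired_ids = [row[0] for row in con.execute(
    'SELECT id FROM candidates WHERE description LIKE ? ORDER BY id',
    (pattern,))]
print(paired_ids)
```

SQLite's LIKE is case-insensitive for ASCII characters by default, which may or may not be what one wants for such keys.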

I'd prefer not to go there ...

Another thing is that it might be beneficial to automatically save everything from the Atoms object's "info" dictionary, to make the transition from traj files to the database as easy as possible. I use the info dictionary extensively in the GA since it is a very convenient way to carry around metadata and cached information about each configuration.

That's something to consider.  It currently works in the other direction only:  If you extract an Atoms object from a row in a database, keywords, key-value pairs and data can be added to Atoms.info (see ase/db/core.py).
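One way the opposite direction could look is sketched below in plain Python. It assumes, as an illustration only, that scalar entries of an info dictionary would be stored as key-value pairs and everything else as data; the helper `split_info` and the example keys are hypothetical, and no ase.db call is made.

```python
def split_info(info):
    """Split an info dictionary into scalar key-value pairs and other data.

    Assumption for this sketch: str, int, float and bool values count as
    key-value pairs; anything else (lists, arrays, dicts) counts as data.
    """
    key_value_pairs, data = {}, {}
    for key, value in info.items():
        if isinstance(value, (str, int, float, bool)):
            key_value_pairs[key] = value
        else:
            data[key] = value
    return key_value_pairs, data


# Hypothetical metadata of the kind a GA might carry on each candidate.
info = {'confid': 12, 'raw_score': -3.2, 'origin': 'pairing',
        'parents': [4, 7]}
key_value_pairs, data = split_info(info)
print(key_value_pairs, data)
```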

Jens Jørgen

Best Regards
Lasse

On 10/03/2014, at 15.20, Jens Jørgen Mortensen <jensj at fysik.dtu.dk> wrote:

On 10-03-2014 15:19, Lasse Vilhelmsen wrote:
In principle it should be possible to change it to using the ase.db interface. All data communication happens through the module data.py, so the change is fairly isolated. I will try looking into it.

Great!  The ase.db module is quite new so we may need to add some new features or look at performance for ga stuff.  Let me know if you run into problems.

Jens Jørgen


/Lasse

On 10/03/2014, at 14.20, Jens Jørgen Mortensen <jensj at fysik.dtu.dk> wrote:

On 27-02-2014 15:38, Lasse Vilhelmsen wrote:
Hi Marcin,

Thank you for adding the branch to the test build system. I have added a number of tests to the folder you suggested which verify that essential parts of the GA work as intended. I have just committed the tests to the svn and the buildbot has given them the all clear :)

Concerning your comment about how to refer to py scripts in the documentation and the location for these I am a bit confused. If I look through the other tutorials in doc/tutorials most of them reference py scripts located in doc/tutorials using the .. literalinclude:: syntax. I therefore take it that is the correct way to include py samples in the tutorials?

The tutorial scripts I have created serve as full examples of the GA. They therefore take quite some time to execute and should not be considered tests.

I see that the new GA stuff uses an SQLite database.  Could the new ase.db module be used instead?

    https://wiki.fysik.dtu.dk/ase/ase/db/db.html

Jens Jørgen


Best Regards
Lasse

On 27/02/2014, at 11.40, Marcin Dulak <Marcin.Dulak at fysik.dtu.dk> wrote:

Hi,

On 02/27/2014 11:22 AM, Lasse Vilhelmsen wrote:
Hi Michael,

It was Jens Jørgen's suggestion to first put it in a separate branch to let people test it out before moving it into the trunk version.

I have already updated the optimize.rst file in the ga branch with a short description of the method and a reference to the tutorial.

I am unsure when it is an appropriate step to move the code from the branch to the trunk, but I assume that a few need to test it out first to ensure the high quality of the code in the trunk.

Your branch is now added to automatic testing at https://ase-buildbot.fysik.dtu.dk/waterfall
Please do not add python scripts to the documentation. They should be part of the running tests,
and only referred to in the rst file using :svn:. See https://wiki.fysik.dtu.dk/ase/ase/calculators/abinit.html for an example.
Consider also creating a special ase/test/ga subdirectory.
The tests must be fast: a few seconds at most.

Best regards,

Marcin


Best Regards
Lasse

On 27/02/2014, at 10.59, Michael Walter <Michael.Walter at fmf.uni-freiburg.de> wrote:

Dear Lasse,

Great that there is a genetic algorithm in ase now!

I suggest putting the algorithm in trunk and adding the explanation (or the link) to the list of global optimization algorithms:
https://wiki.fysik.dtu.dk/ase/ase/optimize.html#global-optimization

Best,
Michael


2014-02-27 9:50 GMT+01:00 Lasse Vilhelmsen <lassebv at phys.au.dk>:
Dear ase-users and ase-developers,

Over the past couple of years I have developed and used a genetic algorithm for global structure optimization within ase. The method has been used for the optimization of metal clusters and oxide structures on supported surfaces, in metal-organic frameworks and in vacuum. The method implements the cut-and-splice pairing operator by Deaven and Ho, a set of different mutations, a way to verify whether two structures are equal, a starting population generator and a population that can propose structures to pair. The method works with all calculators in ase, and it has been developed especially for running multiple local relaxations in parallel using first-principles calculations.

The code is currently located in the svn branch ga of ase. The entire genetic algorithm code is located in ase/optimize/genetic_algorithm, with a tutorial describing the method in the documentation section.

My hope is that some of you might have an interest in trying the method and giving some feedback on which aspects of the implementation you find intuitive and easy to use, and which parts of the method you find counterintuitive and weird. Reports of any sort of bug are of course also very much appreciated!

I have compiled the current version of the tutorial and published it on the following link for easy reference. This tutorial is the optimal way to start using the method, since it includes full test examples:
http://users-phys.au.dk/lassebv/ga_optimize.html

The ga branch of ase can easily be obtained using the following command
svn co https://svn.fysik.dtu.dk/projects/ase/branches/ga

I look forward to any feedback you might have!

Best Regards
Lasse

_______________________________________________
ase-users mailing list
ase-users at listserv.fysik.dtu.dk
https://listserv.fysik.dtu.dk/mailman/listinfo/ase-users



--
------------------------------------------
PD Dr Michael Walter
Address: Fraunhofer IWM
         Wöhlerstrasse 11
         D-79108 Freiburg i. Br.
         Germany
Tel.: +49 761 5142 296
email: Michael.Walter at fmf.uni-freiburg.de
www: http://omnibus.uni-freiburg.de/~mw767
publications: http://scholar.google.com/citations?user=vlmryKEAAAAJ&hl=en










_______________________________________________
ase-developers mailing list
ase-developers at listserv.fysik.dtu.dk
https://listserv.fysik.dtu.dk/mailman/listinfo/ase-developers





