[ase-users] Database issues
Jens Jørgen Mortensen
jjmo at dtu.dk
Tue Nov 27 12:27:39 CET 2018
On 11/23/18 2:47 PM, David Kleiven via ase-users wrote:
>
> Dear ASE users,
>
>
> we are working on a project that involves many DFT calculations and we
> use the database functionality to keep track of everything. Our
> calculations have a lot of external data for machine learning
> purposes. Currently, the only way we could find to "attach" such info
> to each structure is via key_value_pairs. To our knowledge, the
> key_value_pairs are duplicated in the database: 1) they are stored as a
> serialized JSON string on each row, and 2) they are distributed across
> the number_key_values and text_key_values tables. Hence, when you have,
> for instance, 5000 key_value_pairs, it becomes cumbersome to maintain
> this duplication. Moreover, we have experienced cases where appending
> more key_value_pairs leads to errors and a corrupted database. We tested
> various solutions, and one that worked is to allow user-defined tables
> (i.e. users can create tables with the same schema as
> number_key_values). Hence, big chunks of static data can be placed in
> those tables. This avoids duplication, and appending dynamic
> key_value_pairs is no longer a problem. When you read back
> data from the database, all data from these external tables is
> automatically added to the AtomsRow object as if it were regular
> key_value_pairs.
>
>
> The syntax for storing a separate table would be
>
> db.write(atoms, tables={"some_table_name": dict_with_data}, ...
> regular key value pairs...)
>
>
> and the same syntax for read. All external tables would be added to
> AtomsRow as if they were key_value_pairs.
>
>
> Would this solution (moving big chunks of data into separate
> tables) be interesting to include in ASE, via a class that inherits from
> SQLite3Database and provides this extra functionality in addition to
> everything that SQLite3Database already supports? (Note that in our
> case using the data field is not a good option, as it is essentially
> a big binary chunk, which makes it hard to 1) inspect the data manually
> with external tools and 2) update it.)
>
That's an interesting idea. If I understand correctly, you would like to
be able to do key=value where value can be (almost) anything and not
just float or str as it must be now. Is that correct? Maybe you can
explain a bit more what your use case is and how using this new feature
would look?
Jens Jørgen
PS: Can you create a simple example script that demonstrates the error
you mentioned?
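
To make the proposal in the quoted message concrete, here is a minimal sketch that uses only Python's sqlite3 module, not ASE itself. The table layout is a simplified assumption modelled on the names mentioned above (a key_value_pairs JSON column plus a number_key_values table). The names write_row, read_row and ml_features are hypothetical and are not part of the ASE API; only numeric values are handled.

```python
import json
import sqlite3


def _check(name):
    # Table names cannot be bound as SQL parameters, so validate them.
    if not name.isidentifier():
        raise ValueError(f"invalid table name: {name!r}")
    return name


def create_schema(con, extra_tables=()):
    # Simplified stand-in for the real schema (an assumption).
    con.execute(
        "CREATE TABLE systems (id INTEGER PRIMARY KEY, key_value_pairs TEXT)")
    con.execute(
        "CREATE TABLE number_key_values (key TEXT, value REAL, id INTEGER)")
    for name in extra_tables:
        # User-defined tables share the number_key_values schema.
        con.execute(
            f"CREATE TABLE {_check(name)} (key TEXT, value REAL, id INTEGER)")


def write_row(con, key_value_pairs, tables=None):
    # Regular pairs are stored twice: as JSON and in number_key_values.
    cur = con.execute("INSERT INTO systems (key_value_pairs) VALUES (?)",
                      (json.dumps(key_value_pairs),))
    row_id = cur.lastrowid
    con.executemany("INSERT INTO number_key_values VALUES (?, ?, ?)",
                    [(k, v, row_id) for k, v in key_value_pairs.items()])
    # External data is stored once, only in its own table.
    for name, data in (tables or {}).items():
        con.executemany(f"INSERT INTO {_check(name)} VALUES (?, ?, ?)",
                        [(k, v, row_id) for k, v in data.items()])
    return row_id


def read_row(con, row_id, tables=()):
    (blob,) = con.execute(
        "SELECT key_value_pairs FROM systems WHERE id = ?",
        (row_id,)).fetchone()
    merged = json.loads(blob)
    # Merge external tables so they look like regular key-value pairs.
    for name in tables:
        merged.update(con.execute(
            f"SELECT key, value FROM {_check(name)} WHERE id = ?",
            (row_id,)).fetchall())
    return merged


con = sqlite3.connect(":memory:")
create_schema(con, ["ml_features"])
rid = write_row(con, {"energy": -3.5},
                tables={"ml_features": {"f0": 0.1, "f1": 0.2}})
row = read_row(con, rid, ["ml_features"])
```

In this sketch the large static data lives only in ml_features, so the JSON string on the row stays small and later appends to the regular key-value pairs do not have to rewrite it.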
>
>
> Cheers,
>
> David Kleiven
>
>
>
> _______________________________________________
> ase-users mailing list
> ase-users at listserv.fysik.dtu.dk
> https://listserv.fysik.dtu.dk/mailman/listinfo/ase-users