[gpaw-users] small test HDF5 test case is running out of memory
Nichols A. Romero
naromero at alcf.anl.gov
Wed Apr 13 05:20:56 CEST 2011
Hi,
Here is my small HDF5 test case:
http://en.pastebin.ca/2045654
I should have plenty of memory at 64 nodes, but instead
I run out of memory on the restart. Here is the traceback:
GPAW CLEANUP (node 0): <type 'exceptions.MemoryError'> occurred. Calling MPI_Abort!
Traceback (most recent call last):
File "Au_bulk3x3x3_restartwfs.py", line 61, in <module>
calc.initialize_positions()
File "./gpaw/paw.py", line 285, in initialize_positions
File "./gpaw/density.py", line 104, in set_positions
File "./gpaw/lfc.py", line 265, in set_positions
File "./gpaw/lfc.py", line 365, in _update
File "./gpaw/lfc.py", line 177, in normalize
MemoryError
If the restart is done properly, should the _update method be called
by the set_positions method in NewLFC?
https://trac.fysik.dtu.dk/projects/gpaw/browser/branches/gpaw_custom_hdf5/gpaw/lfc.py#L265
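To make the question concrete, here is a rough sketch (mine, not the actual
NewLFC code) of what I would have expected set_positions to do on a restart.
The 'restarted' flag is a made-up placeholder for "the coefficients were just
read back from the HDF5 file":

import numpy as np

class LFCSketch:
    """Illustration only -- not GPAW's NewLFC."""

    def __init__(self):
        self.spos_ac = None  # scaled positions currently in use

    def set_positions(self, spos_ac, restarted=False):
        moved = (self.spos_ac is None or
                 not np.allclose(spos_ac, self.spos_ac))
        self.spos_ac = np.array(spos_ac, copy=True)
        if moved and not restarted:
            # _update() -> normalize() is where the MemoryError is raised;
            # on a clean restart I would have expected this step to be
            # skipped (or at least not to need large temporaries).
            self._update(spos_ac)

    def _update(self, spos_ac):
        pass  # re-expand and normalize the localized functions here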
My current hypothesis is that we may not be reading D_asp and dH_asp onto the
correct MPI tasks.
https://trac.fysik.dtu.dk/projects/gpaw/browser/branches/gpaw_custom_hdf5/gpaw/io/__init__.py#L567
https://trac.fysik.dtu.dk/projects/gpaw/browser/branches/gpaw_custom_hdf5/gpaw/io/__init__.py#L602
Which MPI tasks should end up with these arrays?
For D_asp, we seem to read on domain_comm.rank == 0, followed by
initialize_direct_arrays(nt_sG, D_asp),
but at that point only the domain_comm master has D_asp. Does this look correct?
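For reference, this is the kind of distribution I had in mind (an mpi4py
sketch, not GPAW's actual reader; 'owner_of_atom' and the array shapes are
placeholders): read each D_sp on the domain_comm master and send it to
whichever rank owns that atom, so every task ends up with the D_asp entries
it needs.

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD                       # stands in for domain_comm
natoms, nspins, npairs = 4, 1, 6            # placeholder sizes
owner_of_atom = [a % comm.size for a in range(natoms)]  # placeholder ownership

D_asp = {}
for a in range(natoms):
    owner = owner_of_atom[a]
    if comm.rank == 0:
        D_sp = np.zeros((nspins, npairs))   # would come from the HDF5 file
        if owner == 0:
            D_asp[a] = D_sp                 # master owns this atom itself
        else:
            comm.Send(D_sp, dest=owner, tag=a)
    elif comm.rank == owner:
        D_sp = np.empty((nspins, npairs))
        comm.Recv(D_sp, source=0, tag=a)
        D_asp[a] = D_sp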
--
Nichols A. Romero, Ph.D.
Argonne Leadership Computing Facility
Argonne National Laboratory
Building 240 Room 2-127
9700 South Cass Avenue
Argonne, IL 60490
(630) 252-3441