[gpaw-users] small HDF5 test case is running out of memory

Nichols A. Romero naromero at alcf.anl.gov
Wed Apr 13 05:20:56 CEST 2011


Hi,

Here is my small HDF5 test case:
http://en.pastebin.ca/2045654

I should have plenty of memory at 64 nodes, but instead
I run out of memory on the restart. Here is the traceback:

GPAW CLEANUP (node 0): <type 'exceptions.MemoryError'> occurred.  Calling MPI_Abort!
Traceback (most recent call last):
  File "Au_bulk3x3x3_restartwfs.py", line 61, in <module>
    calc.initialize_positions()
  File "./gpaw/paw.py", line 285, in initialize_positions
  File "./gpaw/density.py", line 104, in set_positions
  File "./gpaw/lfc.py", line 265, in set_positions
  File "./gpaw/lfc.py", line 365, in _update
  File "./gpaw/lfc.py", line 177, in normalize
MemoryError

If the restart is done properly, should the _update method be called 
by the set_positions method in NewLFC?
https://trac.fysik.dtu.dk/projects/gpaw/browser/branches/gpaw_custom_hdf5/gpaw/lfc.py#L265

My current hypothesis is that we may not be reading D_asp or dH_asp onto the
correct MPI tasks:
https://trac.fysik.dtu.dk/projects/gpaw/browser/branches/gpaw_custom_hdf5/gpaw/io/__init__.py#L567
https://trac.fysik.dtu.dk/projects/gpaw/browser/branches/gpaw_custom_hdf5/gpaw/io/__init__.py#L602

Which MPI tasks should receive these arrays?

For D_asp, we seem to read it on domain_comm.rank == 0, followed by
initialize_direct_arrays(nt_sG, D_asp), but then only the domain_comm
master has D_asp. Does this look correct?
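
To make the question concrete, here is a minimal sketch of the kind of
redistribution step I suspect is missing. The mpi4py calls, toy shapes and
the rank-ownership comments are my own illustration, not GPAW's actual
reader:

from mpi4py import MPI
import numpy as np

# Toy illustration only: mpi4py stand-ins, not GPAW's I/O code.
domain_comm = MPI.COMM_WORLD      # stand-in for GPAW's domain communicator

if domain_comm.rank == 0:
    # Only the domain master has read the atomic density matrices,
    # so here it holds toy per-atom packed arrays ...
    D_asp = dict((a, np.zeros((1, 6))) for a in range(4))
else:
    # ... while every other domain rank has nothing.
    D_asp = None

# Either broadcast the whole dict so all ranks see consistent data, or
# scatter each atom's block only to the rank whose domain owns that atom.
# Without a step like this, the non-master ranks never see D_asp at all.
D_asp = domain_comm.bcast(D_asp, root=0)

If initialize_direct_arrays() assumes that each domain rank already holds
the D_asp blocks for its own atoms, reading only on the domain master would
leave the other ranks inconsistent, which is the kind of mismatch I suspect
here.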

-- 
Nichols A. Romero, Ph.D.
Argonne Leadership Computing Facility
Argonne National Laboratory
Building 240 Room 2-127
9700 South Cass Avenue
Argonne, IL 60490
(630) 252-3441
