[gpaw-users] small test HDF5 test case is running out of memory
Nichols A. Romero
naromero at alcf.anl.gov
Tue Apr 19 16:30:22 CEST 2011
For anyone who is following the HDF5 work: even though I have fixed
HDF5 for the largest test case (Pt 1415 cluster), this small test case
is still failing at 64 MPI tasks.
I wonder if this is a special case, because 64 nodes is one Pset
(there is only a single I/O node on a 64-node partition of BG/P).
----- Original Message -----
> Hi,
>
> Here is my small HDF5 test case:
> http://en.pastebin.ca/2045654
>
> I should have plenty of memory at 64 nodes, but instead
> I run out of memory on the restart. Here is the traceback:
>
> GPAW CLEANUP (node 0): <type 'exceptions.MemoryError'> occurred. Calling MPI_Abort!
> Traceback (most recent call last):
>   File "Au_bulk3x3x3_restartwfs.py", line 61, in <module>
>     calc.initialize_positions()
>   File "./gpaw/paw.py", line 285, in initialize_positions
>   File "./gpaw/density.py", line 104, in set_positions
>   File "./gpaw/lfc.py", line 265, in set_positions
>   File "./gpaw/lfc.py", line 365, in _update
>   File "./gpaw/lfc.py", line 177, in normalize
> MemoryError
>
> If the restart is done properly, should the _update method be called
> by the set_positions method in NewLFC?
> https://trac.fysik.dtu.dk/projects/gpaw/browser/branches/gpaw_custom_hdf5/gpaw/lfc.py#L265
>
> My current hypothesis is that we may not be reading D_asp or dH_asp
> to the correct MPI task.
> https://trac.fysik.dtu.dk/projects/gpaw/browser/branches/gpaw_custom_hdf5/gpaw/io/__init__.py#L567
> https://trac.fysik.dtu.dk/projects/gpaw/browser/branches/gpaw_custom_hdf5/gpaw/io/__init__.py#L602
>
> Which MPI tasks should get these arrays?
>
> For D_asp, we seem to read from domain_comm.rank == 0, followed by
> initialize_direct_arrays(nt_sG, D_asp)
>
> but only the domain_comm master has D_asp. Does this look correct?
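
To make the question above concrete, here is a minimal, MPI-free sketch of
the ownership check I have in mind. It assumes that each atom is owned by
exactly one domain rank and that D_asp is a dict keyed by atom index. The
names distribute_from_master, check_ownership and the toy rank_a list are
hypothetical and only for illustration; this is not GPAW's actual I/O code.

```python
# Toy, MPI-free model of distributing per-atom arrays after a restart read.
# All names here (distribute_from_master, check_ownership, the toy rank_a
# and D_asp values) are illustrative assumptions, not GPAW's actual I/O code.

def distribute_from_master(D_asp_master, rank_a, comm_size):
    """Return one dict per rank holding only the atoms that rank owns.

    D_asp_master: dict {atom index a: array for atom a}, as read on rank 0.
    rank_a: list where rank_a[a] is the domain rank assumed to own atom a.
    """
    per_rank = [{} for _ in range(comm_size)]
    for a, D_sp in D_asp_master.items():
        owner = rank_a[a]
        if not 0 <= owner < comm_size:
            raise ValueError("atom %d assigned to invalid rank %d" % (a, owner))
        # In a real MPI code this would be a send from rank 0 to `owner`.
        per_rank[owner][a] = list(D_sp)
    return per_rank


def check_ownership(per_rank, rank_a):
    """True if every atom is held by its owner rank and by no other rank."""
    for a, owner in enumerate(rank_a):
        holders = [r for r, D_asp in enumerate(per_rank) if a in D_asp]
        if holders != [owner]:
            return False
    return True


# Four atoms spread over two domain ranks.
rank_a = [0, 0, 1, 1]
D_asp_master = {a: [float(a)] * 3 for a in range(4)}

distributed = distribute_from_master(D_asp_master, rank_a, comm_size=2)
# What "only the domain_comm master has D_asp" would look like:
master_only = [D_asp_master, {}]
```

With this toy layout, check_ownership(distributed, rank_a) passes, while
check_ownership(master_only, rank_a) fails because rank 1 never receives
atoms 2 and 3. A similar per-rank check after the restart read might show
whether D_asp and dH_asp end up where set_positions expects them.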
>
> --
> Nichols A. Romero, Ph.D.
> Argonne Leadership Computing Facility
> Argonne National Laboratory
> Building 240 Room 2-127
> 9700 South Cass Avenue
> Argonne, IL 60490
> (630) 252-3441
>