[gpaw-users] general comment on memory leaks.

abhishek khetan askhetan at gmail.com
Fri Jan 15 14:58:51 CET 2016


So there are two kinds of output. Mostly, it's like this:
  ___ ___ ___ _ _ _
 |   |   |_  | | | |
 | | | | | . | | | |
 |__ |  _|___|_____|  0.12.0.13279
 |___|_|

(really, it doesn't print anything beyond this point). And sometimes it
goes a bit further:
  ___ ___ ___ _ _ _
 |   |   |_  | | | |
 | | | | | . | | | |
 |__ |  _|___|_____|  0.12.0.13279
 |___|_|

User:   ak498084 at linuxitvc08.rz.RWTH-Aachen.DE
Date:   Sun Jan 10 22:28:09 2016
Arch:   x86_64
Pid:    24578
Python: 2.7.9
gpaw:   /home/ak498084/Utility/GPAW/gpaw_devel/gpaw-0.12/gpaw
_gpaw:  /home/ak498084/Utility/GPAW/gpaw_devel/gpaw-0.12/build/bin.linux-x86_64-2.7/gpaw-python
ase:    /home/ak498084/Utility/GPAW/gpaw_devel/ase/ase (version 3.10.0)
numpy:  /usr/local_rwth/sw/python/2.7.9/x86_64/lib/python2.7/site-packages/numpy (version 1.9.1)
scipy:  /usr/local_rwth/sw/python/2.7.9/x86_64/lib/python2.7/site-packages/scipy (version 0.15.1)
units:  Angstrom and eV
cores:  84

Memory estimate
---------------
Process memory now: 77.73 MiB
Calculator  1476.36 MiB
    Density  21.41 MiB
        Arrays  6.10 MiB
        Localized functions  13.55 MiB
        Mixer  1.76 MiB
    Hamiltonian  8.87 MiB
        Arrays  4.53 MiB
        XC  0.00 MiB
        Poisson  3.37 MiB
        vbar  0.98 MiB
    Wavefunctions  1446.08 MiB
        Arrays psit_nG  406.12 MiB
        Eigensolver  610.42 MiB
        Projections  1.57 MiB
        Projectors  1.59 MiB
        Overlap op  426.38 MiB
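
For what it's worth, the estimate above is internally consistent: each parent entry is the sum of its children, up to rounding in the printed MiB values. A quick check (numbers copied straight from the table; the tolerance only absorbs rounding):

```python
# Numbers copied from GPAW's "Memory estimate" table above (MiB, per process).
density = [6.10, 13.55, 1.76]             # Arrays, Localized functions, Mixer
hamiltonian = [4.53, 0.00, 3.37, 0.98]    # Arrays, XC, Poisson, vbar
wavefunctions = [406.12, 610.42, 1.57, 1.59, 426.38]

# Each parent line is the sum of its children, up to rounding.
calculator = sum(density) + sum(hamiltonian) + sum(wavefunctions)
print('Calculator total: %.2f MiB' % calculator)  # close to the reported 1476.36 MiB
```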

But this happens only when I give it something on the order of 12+ GiB
per core for 84 cores.
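
To correlate GPAW's estimate with what actually happens at runtime, one option is to log each process's peak resident memory from inside the run script. A minimal sketch using only the standard-library resource module (the helper names are my own, not GPAW API; note that ru_maxrss is reported in KiB on Linux):

```python
import resource
import sys


def peak_memory_mib():
    """Return this process's peak resident set size in MiB.

    ru_maxrss is in kilobytes on Linux (it is bytes on macOS), so this
    assumes a Linux cluster node.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def log_memory(label, stream=sys.stderr):
    """Print a tagged memory reading, e.g. before/after SCF steps."""
    stream.write('%s: peak RSS %.1f MiB\n' % (label, peak_memory_mib()))


# In a GPAW run one would call log_memory('before SCF') and
# log_memory('after SCF') around calc.get_potential_energy() to
# compare actual per-rank usage with the printed "Memory estimate".
```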

As I mentioned in my earlier posts, for a similar system, which I was
eventually able to get to convergence after a lot of comparable segfault
difficulties, the memory usage typically looks like this:

top - 23:35:13 up 5 days, 12:23,  0 users,  load average: 6.03, 6.04, 5.93
Tasks: 705 total,   7 running, 698 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.1%us,  0.1%sy,  0.0%ni, 74.7%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    47.127G total,   11.120G used,   36.007G free,   36.012M buffers
Swap:    0.000k total,    0.000k used,    0.000k free,  357.062M cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
24578 ak498084  20   0 2123m 1.7g  22m R 100.0  3.6  67:00.30 gpaw-python
24579 ak498084  20   0 2031m 1.6g  21m R 100.0  3.4  67:06.94 gpaw-python
24580 ak498084  20   0 2008m 1.6g  20m R 100.0  3.3  67:06.98 gpaw-python
24581 ak498084  20   0 2008m 1.6g  21m R 100.0  3.3  67:07.00 gpaw-python
24582 ak498084  20   0 2065m 1.6g  20m R 100.0  3.4  66:59.07 gpaw-python
24583 ak498084  20   0 2009m 1.6g  20m R 100.0  3.3  67:05.41 gpaw-python
23787 ak498084  20   0 88292 5504 1636 S  0.5  0.0   0:19.07 res
24390 ak498084  25   5 19676 1784  924 R  0.1  0.0   0:06.01 top
24452 ak498084  20   0 57300 7176 3432 S  0.1  0.0   0:00.35 mpiexec
23992 ak498084  20   0  9396 1304  940 S  0.0  0.0   0:00.00 1452461243.2555
23996 ak498084  20   0 11344 1320 1092 S  0.0  0.0   0:00.00 sh
24218 ak498084  20   0 18792 2184 1184 S  0.0  0.0   0:00.09 zsh

What you see above is the memory used by 6 (of 84) processes while the
job is running properly and some ionic/electronic relaxation steps have
been completed. This is when I provide about 8+ GiB per core. I really
don't understand why the seg fault arises when the actual requirement is
so modest.
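
One way to make such a crash more diagnosable is the faulthandler module (standard library from Python 3.3; for the Python 2.7.9 used here it existed as a separately installable backport of the same name, so treat its availability as an assumption). Enabled at the top of the run script, it turns a bare "terminated with signal 11" into a Python-level traceback:

```python
import sys
import faulthandler

# Enable as early as possible in the GPAW run script: on SIGSEGV the
# handler dumps every thread's Python traceback to stderr before the
# process dies, which narrows down which call in the script was active
# when the C-level crash happened.
faulthandler.enable(file=sys.stderr, all_threads=True)

# ... the rest of the script (imports, calculator setup, relaxation)
# follows unchanged.
```

This does not fix the crash, but the dumped traceback usually shows which GPAW call was running when the segfault fired.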

Best,



On Fri, Jan 15, 2016 at 2:26 PM, Ask Hjorth Larsen <asklarsen at gmail.com>
wrote:

> The files are appreciated but the text output (stdout), being the most
> important one, is still missing.
>
> Best regards
> Ask
>
> 2016-01-15 14:09 GMT+01:00 abhishek khetan <askhetan at gmail.com>:
> > Attached are the files. The cif file is actually a gpaw-converged
> > output that I extracted using ase and then changed one atom in it. It
> > is quite huge in size, though.
> >
> > Maybe such errors are related to my installation, although I cannot
> > find anything wrong with it.
> >
> > Thanks and Best,
> >
> >
> > On Thu, Jan 14, 2016 at 5:47 PM, Ask Hjorth Larsen <asklarsen at gmail.com>
> > wrote:
> >>
> >> Please attach both input script and text output.
> >>
> >> Best regards
> >> Ask
> >>
> >> 2016-01-14 17:38 GMT+01:00 abhishek khetan <askhetan at gmail.com>:
> >> > Dear gpaw developers,
> >> >
> >> > I have found that, in general, for large systems (> 150 atoms) or
> >> > systems with memory-intensive methods like GW, there are always
> >> > segfault errors of a similar kind. I have a scalapack-compiled,
> >> > working version of gpaw-0.12 which passes all tests in the suite.
> >> > For a small system, the various methods in gpaw run properly, but
> >> > for bigger systems of the desired sizes of the same kind, gpaw fails
> >> > with the exact same kind of error.
> >> >
> >> > gpaw-python:18622 terminated with signal 11 at PC=3d8d6acba8
> >> > SP=7ffe9b9d47b0.  Backtrace:
> >> >
> >> > I have posted about this in the context of the GW method on the
> >> > gpaw forums a couple of dozen times before, but I haven't seen
> >> > anyone else report similar errors. Now I am encountering the same
> >> > unsolved errors even in simple relaxation problems where the unit
> >> > cell happens to be quite large. For slightly smaller cases where the
> >> > systems do converge, I see that the memory requirements are actually
> >> > very modest, 1-2 GiB per core for 60 cores.
> >> >
> >> > Any ideas, methods, or procedures by which I can resolve this error
> >> > as a user? Am I allowed to open a ticket on this, or request one, on
> >> > the Trac?
> >> >
> >> > Thanks and Best,
> >> >
> >> > askhetan
> >> >
> >> > _______________________________________________
> >> > gpaw-users mailing list
> >> > gpaw-users at listserv.fysik.dtu.dk
> >> > https://listserv.fysik.dtu.dk/mailman/listinfo/gpaw-users
> >
> >
> >
> >
> > --
> > || radhe radhe ||
> >
> > abhishek
> >
> > _______________________________________________
> > gpaw-users mailing list
> > gpaw-users at listserv.fysik.dtu.dk
> > https://listserv.fysik.dtu.dk/mailman/listinfo/gpaw-users
>



-- 
|| radhe radhe ||

abhishek