[gpaw-users] general comment on memory leaks.
abhishek khetan
askhetan at gmail.com
Fri Jan 15 14:58:51 CET 2016
So there are two kinds of output. Mostly it is like this:
  ___ ___ ___ _ _ _
 |   |   |_  | | | |
 | | | | | . | | | |
 |__ |  _|___|_____|  0.12.0.13279
 |___|_|
(really, it doesn't print anything beyond that). And sometimes it goes a bit
further:
  ___ ___ ___ _ _ _
 |   |   |_  | | | |
 | | | | | . | | | |
 |__ |  _|___|_____|  0.12.0.13279
 |___|_|
User:   ak498084 at linuxitvc08.rz.RWTH-Aachen.DE
Date:   Sun Jan 10 22:28:09 2016
Arch:   x86_64
Pid:    24578
Python: 2.7.9
gpaw:   /home/ak498084/Utility/GPAW/gpaw_devel/gpaw-0.12/gpaw
_gpaw:  /home/ak498084/Utility/GPAW/gpaw_devel/gpaw-0.12/build/bin.linux-x86_64-2.7/gpaw-python
ase:    /home/ak498084/Utility/GPAW/gpaw_devel/ase/ase (version 3.10.0)
numpy:  /usr/local_rwth/sw/python/2.7.9/x86_64/lib/python2.7/site-packages/numpy (version 1.9.1)
scipy:  /usr/local_rwth/sw/python/2.7.9/x86_64/lib/python2.7/site-packages/scipy (version 0.15.1)
units:  Angstrom and eV
cores:  84
Memory estimate
---------------
Process memory now: 77.73 MiB
Calculator  1476.36 MiB
    Density  21.41 MiB
        Arrays  6.10 MiB
        Localized functions  13.55 MiB
        Mixer  1.76 MiB
    Hamiltonian  8.87 MiB
        Arrays  4.53 MiB
        XC  0.00 MiB
        Poisson  3.37 MiB
        vbar  0.98 MiB
    Wavefunctions  1446.08 MiB
        Arrays psit_nG  406.12 MiB
        Eigensolver  610.42 MiB
        Projections  1.57 MiB
        Projectors  1.59 MiB
        Overlap op  426.38 MiB
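Since the run dies right after (or even before) this estimate is printed, it can help to log how much memory each rank actually uses as the SCF loop progresses. Below is a minimal sketch, assuming GPAW's observer interface (calc.attach) and the standard-library resource module behave as documented; the toy H2 system, the grid spacing and the file names are placeholders, not my actual input. GPAW also has a dry-run mode that prints this memory estimate without running the calculation; the exact invocation differs between versions, so check the installed version's documentation.

import resource

from ase import Atoms
from gpaw import GPAW
from gpaw.mpi import world


def log_memory():
    # On Linux, ru_maxrss is reported in KiB.
    rss_mib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0
    print('rank %d: %.1f MiB' % (world.rank, rss_mib))


# Placeholder system; the real input would be the large cell from the failing run.
atoms = Atoms('H2', positions=[(0, 0, 0), (0, 0, 0.74)],
              cell=(6.0, 6.0, 6.0), pbc=True)
calc = GPAW(h=0.2, txt='memlog.txt')
atoms.set_calculator(calc)

calc.attach(log_memory, 1)   # call the observer after every SCF iteration
atoms.get_potential_energy()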
This longer output appears only when I give the job something on the order of
12+ GB per core on 84 cores. As I mentioned in my earlier posts, the memory
requirement for a very similar system, which I eventually managed to converge
after many similar segfault difficulties, typically looks like this:
top - 23:35:13 up 5 days, 12:23,  0 users,  load average: 6.03, 6.04, 5.93
Tasks: 705 total,   7 running, 698 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.1%us,  0.1%sy,  0.0%ni, 74.7%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   47.127G total, 11.120G used, 36.007G free,  36.012M buffers
Swap:   0.000k total,  0.000k used,  0.000k free, 357.062M cached

  PID USER      PR  NI  VIRT   RES   SHR S  %CPU %MEM    TIME+  COMMAND
24578 ak498084  20   0 2123m  1.7g   22m R 100.0  3.6  67:00.30 gpaw-python
24579 ak498084  20   0 2031m  1.6g   21m R 100.0  3.4  67:06.94 gpaw-python
24580 ak498084  20   0 2008m  1.6g   20m R 100.0  3.3  67:06.98 gpaw-python
24581 ak498084  20   0 2008m  1.6g   21m R 100.0  3.3  67:07.00 gpaw-python
24582 ak498084  20   0 2065m  1.6g   20m R 100.0  3.4  66:59.07 gpaw-python
24583 ak498084  20   0 2009m  1.6g   20m R 100.0  3.3  67:05.41 gpaw-python
23787 ak498084  20   0 88292  5504  1636 S   0.5  0.0   0:19.07 res
24390 ak498084  25   5 19676  1784   924 R   0.1  0.0   0:06.01 top
24452 ak498084  20   0 57300  7176  3432 S   0.1  0.0   0:00.35 mpiexec
23992 ak498084  20   0  9396  1304   940 S   0.0  0.0   0:00.00 1452461243.2555
23996 ak498084  20   0 11344  1320  1092 S   0.0  0.0   0:00.00 sh
24218 ak498084  20   0 18792  2184  1184 S   0.0  0.0   0:00.09 zsh
What you see above is the memory used by 6 (of 84) processes while the job is
running properly and some ionic/electronic relaxation steps have already been
completed. This is when I provide about 8+ GB per core. I really don't
understand why the segfault arises when the actual requirement is so modest.
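One thing that might be worth trying when a large cell segfaults even though the per-process footprint looks modest is to change how GPAW distributes the work, since band parallelization and ScaLAPACK can reduce the memory held by each rank. The sketch below only illustrates the idea; whether these particular keys and values are supported by gpaw-0.12, and which values make sense for this system, would need to be checked against that version's parallelization documentation.

# Illustrative (unverified for gpaw-0.12) parallelization settings that can
# reduce the memory held by each rank for a large cell.
from gpaw import GPAW

calc = GPAW(h=0.2,
            txt='relax.txt',
            parallel={'band': 4,                  # split bands into 4 groups
                      'sl_default': (4, 4, 64)})  # 4x4 ScaLAPACK grid, block size 64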
Best,
On Fri, Jan 15, 2016 at 2:26 PM, Ask Hjorth Larsen <asklarsen at gmail.com>
wrote:
> The files are appreciated but the text output (stdout), being the most
> important one, is still missing.
>
> Best regards
> Ask
>
> 2016-01-15 14:09 GMT+01:00 abhishek khetan <askhetan at gmail.com>:
> > Attached are the files. The cif file is actually a gpaw-converged output
> > that I extracted using ase and then changed one atom in. It is quite huge
> > in size, though.
> >
> > Maybe such errors are related to my installation, although I cannot find
> > any problem with it.
> >
> > Thanks and Best,
> >
> >
> > On Thu, Jan 14, 2016 at 5:47 PM, Ask Hjorth Larsen <asklarsen at gmail.com>
> > wrote:
> >>
> >> Please attach both input script and text output.
> >>
> >> Best regards
> >> Ask
> >>
> >> 2016-01-14 17:38 GMT+01:00 abhishek khetan <askhetan at gmail.com>:
> >> > Dear gpaw developers,
> >> >
> >> > I have found that, in general, for large systems (> 150 atoms) or for
> >> > systems with memory-intensive methods like GW, there are always
> >> > segfault errors of a similar kind. I have a ScaLAPACK-compiled,
> >> > working version of gpaw-0.12 which passes all tests in the suite. For
> >> > a small system, the various methods in gpaw run properly, but for
> >> > bigger systems of the desired sizes of the same kind, gpaw fails with
> >> > the exact same kind of error:
> >> >
> >> > gpaw-python:18622 terminated with signal 11 at PC=3d8d6acba8
> >> > SP=7ffe9b9d47b0. Backtrace:
> >> >
> >> > I have posted about this in the context of the GW method on the gpaw
> >> > forums a couple of dozen times before, but I haven't seen anyone else
> >> > report similar errors. Now I am encountering the same unsolved errors
> >> > even in simple relaxation problems where the unit cell happens to be
> >> > quite large. For slightly smaller cases where the systems do converge,
> >> > I see that the memory requirements are actually very modest, 1-2 GB
> >> > per core on 60 cores.
> >> >
> >> > Any ideas, methods or procedures by which I, as a user, can resolve
> >> > this error? Am I allowed to open a ticket on this, or request one, on
> >> > the TRAC?
> >> >
> >> > Thanks and Best,
> >> >
> >> > askhetan
> >> >
> >> > _______________________________________________
> >> > gpaw-users mailing list
> >> > gpaw-users at listserv.fysik.dtu.dk
> >> > https://listserv.fysik.dtu.dk/mailman/listinfo/gpaw-users
> >
> >
> >
> >
> > --
> > || radhe radhe ||
> >
> > abhishek
> >
> > _______________________________________________
> > gpaw-users mailing list
> > gpaw-users at listserv.fysik.dtu.dk
> > https://listserv.fysik.dtu.dk/mailman/listinfo/gpaw-users
>
--
|| radhe radhe ||
abhishek