[gpaw-users] general comment on memory leaks.
abhishek khetan
askhetan at gmail.com
Mon Jan 18 19:35:29 CET 2016
You're right, "memory leak" is the wrong description. I made the
mistake of invariably associating it with the segfault error, which is
what the problem actually is. I will run these tests and get back.
On Mon, Jan 18, 2016 at 6:32 PM, Ask Hjorth Larsen <asklarsen at gmail.com> wrote:
> Why are you so sure that there are memory leaks? So far we have only
> seen indications that a lot of memory is allocated.
>
> You could, for example, coarsen the grid spacing (increase h) until the
> job runs, then check whether memory usage increases linearly over
> subsequent identical calculations. That would indicate a memory leak.
> If you do not observe this behaviour, then I don't know what you are
> seeing, but it is certainly not a memory leak!
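>
> A minimal sketch of such a test (the small H2 system, grid spacing, and
> run count here are placeholders; adjust them to match the failing job):
>
> import resource
> from ase import Atoms
> from gpaw import GPAW
>
> for i in range(5):
>     # Rebuild an identical small system each run, so the previous
>     # atoms/calculator pair can be garbage collected.
>     atoms = Atoms('H2', positions=[(0, 0, 0), (0, 0, 0.74)])
>     atoms.center(vacuum=3.0)
>     atoms.set_calculator(GPAW(h=0.25, txt='leaktest-%d.txt' % i))
>     atoms.get_potential_energy()
>     # Peak resident memory so far (kB on Linux).  If this keeps
>     # growing run after run, memory from earlier runs is not freed.
>     rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
>     print('after run %d: peak RSS = %.1f MiB' % (i + 1, rss / 1024.0))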
>
> 2016-01-18 13:26 GMT+01:00 abhishek khetan <askhetan at gmail.com>:
> > I tried using the cluster interactively, and it gives the output below. I
> > couldn't make the r_memusage tool work, but it's easily visible that the
> > memory requirements are quite modest. I do not know why there is a
> > segfault when I submit the same job to the regular cluster for production
> > runs.
> >
> >  ___ ___ ___ _ _ _
> > |   |   |_  | | | |
> > | | | | | . | | | |
> > |__ |  _|___|_____| 0.12.0.13279
> > |___|_|
> >
> > User: ak498084 at linuxbmc0002.rz.RWTH-Aachen.DE
> > Date: Mon Jan 18 13:22:24 2016
> > Arch: x86_64
> > Pid: 20443
> > Python: 2.7.9
> > gpaw: /home/ak498084/Utility/GPAW/gpaw_devel/gpaw-0.12/gpaw
> > _gpaw: /home/ak498084/Utility/GPAW/gpaw_devel/gpaw-0.12/build/bin.linux-x86_64-2.7/gpaw-python
> > ase: /home/ak498084/Utility/GPAW/gpaw_devel/ase/ase (version 3.10.0)
> > numpy: /usr/local_rwth/sw/python/2.7.9/x86_64/lib/python2.7/site-packages/numpy (version 1.9.1)
> > scipy: /usr/local_rwth/sw/python/2.7.9/x86_64/lib/python2.7/site-packages/scipy (version 0.15.1)
> > units: Angstrom and eV
> > cores: 32
> >
> > Memory estimate
> > ---------------
> > Process memory now: 75.02 MiB
> > Calculator  1145.24 MiB
> >     Density  56.04 MiB
> >         Arrays  15.91 MiB
> >         Localized functions  35.58 MiB
> >         Mixer  4.55 MiB
> >     Hamiltonian  23.19 MiB
> >         Arrays  11.82 MiB
> >         XC  0.00 MiB
> >         Poisson  8.81 MiB
> >         vbar  2.56 MiB
> >     Wavefunctions  1066.01 MiB
> >         Arrays psit_nG  523.69 MiB
> >         Eigensolver  2.29 MiB
> >         Projections  2.06 MiB
> >         Projectors  4.17 MiB
> >         Overlap op  533.81 MiB
> >
> >
> > On Mon, Jan 18, 2016 at 1:01 PM, abhishek khetan <askhetan at gmail.com> wrote:
> >>
> >> Dear Marcin and Ask,
> >>
> >> I am indeed on this cluster, and I have already used both these tools.
> >> When I use r_memusage (to check the peak physical memory), the peak is
> >> on the order of a few MB, and the process gets killed right at the
> >> beginning, with only this output:
> >>
> >>  ___ ___ ___ _ _ _
> >> |   |   |_  | | | |
> >> | | | | | . | | | |
> >> |__ |  _|___|_____| 0.12.0.13279
> >> |___|_|
> >>
> >>
> >> The same is not the case when I take a pre-converged system and run the
> >> r_memusage script: it shows a good 2.5 GB (and rising) before I kill the
> >> process, and I can see it is running fine. This is what I mean when I say
> >> that the allocation doesn't even start for the unconverged cases. Using
> >> eigensolver=RMM_DIIS(keep_htpsit=False) has exactly the same problem. Is
> >> there a way I can trick GPAW into requesting much less memory from the
> >> cluster? I want to try this because, as I have mentioned, at peak my jobs
> >> need no more than 2 GB per core, while I usually request 8 GB (albeit to
> >> no avail).
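> >>
> >> For reference, a minimal sketch of how that option is passed (assuming
> >> the RMM_DIIS import path of this GPAW version; the grid spacing is a
> >> placeholder):
> >>
> >> from gpaw import GPAW
> >> from gpaw.eigensolvers import RMM_DIIS
> >>
> >> # keep_htpsit=False avoids storing H|psit_nG>, which should roughly
> >> # halve the wavefunction memory ('Overlap op' in the estimate above).
> >> calc = GPAW(h=0.20,  # a larger h also reduces memory
> >>             eigensolver=RMM_DIIS(keep_htpsit=False),
> >>             txt='out.txt')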
> >>
> >> Best,
> >>
> >>
> >> On Sat, Jan 16, 2016 at 1:10 PM, Marcin Dulak <mdul at dtu.dk> wrote:
> >>>
> >>> Hi,
> >>>
> >>> are you on this cluster?
> >>> https://doc.itc.rwth-aachen.de/display/CC/r_memusage
> >>>
> >>>
> >>> https://doc.itc.rwth-aachen.de/display/CC/Resource+limitations+on+dialog+systems
> >>> It may be that the batch system (LSF) kills your jobs when they exceed
> >>> the requested resident memory; the two links above may help you
> >>> diagnose that.
> >>> I recall that GPAW's memory estimate is not very accurate (~20%) for
> >>> standard ground-state PW or grid-mode jobs, and may be very inaccurate
> >>> (an order of magnitude) for vdW or LCAO jobs (Ask, correct me if this
> >>> is not the case anymore).
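> >>>
> >>> (One way to obtain that estimate up front, without allocating anything,
> >>> is GPAW's dry-run mode, assuming this version supports it:
> >>>
> >>>     python script.py --dry-run=32
> >>>
> >>> which prints the memory estimate for a 32-core run and exits.)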
> >>>
> >>> Best regards,
> >>>
> >>> Marcin
--
|| radhe radhe ||
abhishek