[gpaw-users] Computational cost of vdW-DF

Duy Le ttduyle at gmail.com
Tue Apr 12 18:42:25 CEST 2011


2011/4/12 Jens Jørgen Mortensen <jensj at fysik.dtu.dk>:
> On Tue, 2011-04-12 at 15:29 +0200, Duy Le wrote:
>> 2011/4/12 Jens Jørgen Mortensen <jensj at fysik.dtu.dk>:
>> > On Mon, 2011-04-11 at 23:56 +0200, Duy Le wrote:
>> >> Dear all,
>> >> As I understand it, the approach of Román-Pérez and Soler (PRL 2009) has
>> >> been implemented in GPAW.  I wonder why a self-consistent vdW-DF
>> >> calculation is still so expensive (3 to 4 times slower in most of my
>> >> calculations) compared with a regular GGA-PBE calculation.
>> >> Has anyone else experienced this slowness? I would like to know the reason.
>> >
>> > For a small system with few electrons, the calculation of the vdW-DF
>> > potential will dominate.  With more electrons the vdW-DF part should
>> > become smaller unless you run on so many cores that the parallelization
>> > of the vdW-DF stuff stops scaling.  So, what is your system and how many
>> > CPUs do you use?
>> That is the case: my systems are big, on the order of 70-150
>> atoms. Of course I have to use a lot of cores (64 to 512 in most
>> cases); otherwise I don't have enough memory.
>> Is there any way to improve this? The PBE calculation scales very well.
>
> The current implementation of the vdW-DF xc-potential scales only up to
> 20 cores.  Could you show us the text output from one of your
> calculations - you should be able to see the timings for the different
> parts of the calculation in the text output.
>
Sure. See the attachments for the SCF output of a PBE and a vdW-DF1
calculation. I deleted many lines that are not important.
Both were run with 64 cores on an AIX POWER6 machine.
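
As a rough way to read the two attachments, the headline numbers can be
compared with a few lines of Python. This is only a sketch: the values are
copied by hand from the timing tables below, and the dictionary keys and the
helper non_xc are names made up for illustration, not anything GPAW provides.

```python
# Timings in seconds, copied by hand from the two attached logs (64 cores each).
pbe = {"total": 430.510, "xc_init": 0.378, "xc_scf": 9.495}
vdw = {"total": 1399.840, "xc_init": 74.636, "xc_scf": 897.773,
       "gather": 270.157, "potential": 351.354}


def non_xc(run):
    """Wall time spent outside the 'XC 3D grid' timers (hypothetical helper)."""
    return run["total"] - run["xc_init"] - run["xc_scf"]


# Overall slowdown of the vdW-DF run relative to the PBE run.
slowdown = vdw["total"] / pbe["total"]
# Fraction of the vdW-DF run spent in 'XC 3D grid' (initialization + SCF cycle).
xc_share = (vdw["xc_init"] + vdw["xc_scf"]) / vdw["total"]
# Fraction of the vdW-DF run spent in the 'gather' timer during the SCF cycle.
gather_share = vdw["gather"] / vdw["total"]

print(f"vdW-DF / PBE wall time: {slowdown:.2f}x")
print(f"XC share of vdW-DF run: {xc_share:.1%}")
print(f"gather share of vdW-DF run: {gather_share:.1%}")
print(f"non-XC time: PBE {non_xc(pbe):.1f} s, vdW-DF {non_xc(vdw):.1f} s")
```

With these numbers the vdW-DF run takes roughly 3.25 times the PBE wall time,
while the time outside the XC timers is nearly the same in both runs (about
421 s versus 427 s), so the extra cost sits almost entirely in the vdW-DF
integral.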
--------------------------------------------------
Duy Le
PhD Student
Department of Physics
University of Central Florida.

"Men don't need hand to do things"
> Jens Jørgen
>
>> > Jens Jørgen
>> >
>>
>> >> Thank you.
-------------- next part --------------

  ___ ___ ___ _ _ _  
 |   |   |_  | | | | 
 | | | | | . | | | | 
 |__ |  _|___|_____|  0.8.0.7028
 |___|_|             

User: XXXX at vip032
Date: Thu Dec 23 23:47:13 2010
Arch: 00CCF5D14C00
Pid:  3408662
Dir:  /u/XXXX/usr/GPAW-0.8.0.7028-SCA/lib/python/gpaw
ase:   /u/XXXX/usr/ase/ase  version:  3.5.0.1921
numpy: /u/XXXX/usr/lib/python2.6/site-packages/numpy
units: Angstrom and eV

Memory estimate
---------------
Calculator  693.41 MiB
    Initial overhead  356.53 MiB
    Density  13.88 MiB
        Arrays  4.98 MiB
        Localized functions  2.53 MiB
        Mixer  1.15 MiB
        Interpolator  5.23 MiB
    Hamiltonian  34.11 MiB
        Arrays  3.25 MiB
        Restrictor  3.38 MiB
        XC 3D grid  14.27 MiB
        Poisson  13.00 MiB
        vbar  0.20 MiB
    Wavefunctions  288.89 MiB
        Arrays psit_nG  95.13 MiB
        Eigensolver  96.09 MiB
        Projectors  0.32 MiB
        Overlap op  97.01 MiB
        Kinetic operator  0.34 MiB


Unit Cell:
           Periodic     X           Y           Z      Points  Spacing
  --------------------------------------------------------------------
  1. axis:    yes   17.147303    0.000000    0.000000   112     0.1531
  2. axis:    yes    0.000000   17.325000    0.000000   112     0.1547
  3. axis:    yes    0.000000    0.000000   19.800000   128     0.1547


Using the PBE Exchange-Correlation Functional.
Spin-Paired Calculation
Total Charge:      0.000000
Fermi Temperature: 0.000000
Mode: Finite-difference
Eigensolver:       rmm-diis
                   (3 nearest neighbors central finite-difference stencil)
Poisson Solver:    Jacobi 
                   (3 nearest neighbors central finite-difference stencil)
Interpolation:     6th Order
Reference Energy:  -129744.446005

Gamma Point Calculation
Total number of cores used: 64
Domain Decomposition: 4 x 4 x 4
Diagonalizer layout: Serial LAPACK
Orthonormalizer layout: Serial LAPACK

1 k-point in the Irreducible Part of the Brillouin Zone (total: 1)
Linear Mixing Parameter:           0.1
Pulay Mixing with 3 Old Densities
Damping of Long Wave Oscillations: 50

Convergence Criteria:
Total Energy Change per Atom:           0.0001 eV / electron
Integral of Absolute Density Change:    0.0001 electrons
Integral of Absolute Eigenstate Change: 1e-09
Number of Bands in Calculation:         497
Bands to Converge:                      Occupied States Only
Number of Valence Electrons:            504
                     log10-error:    Total        Iterations:
           Time      WFS    Density  Energy       Fermi  Poisson
iter:   1  23:47:48  +0.7            -930.40924   0      31     
iter:  29  23:54:23  -9.2   -5.2     -1146.53569  0      2      
------------------------------------
Converged After 29 Iterations.

Energy Contributions Relative to Reference Atoms:(reference = -129744.44601)
-------------------------
Kinetic:       +1012.57284
Potential:     -1073.40028
External:        +0.00000
XC:            -1092.47786
Entropy (-ST):   -0.00000
Local:           +6.76962
-------------------------
Free Energy:   -1146.53569
Zero Kelvin:   -1146.53569

Fermi Level: -2.36483
 Band   Eigenvalues  Occupancy
Total Charge:  -0.000000 electrons
Dipole Moment: [-58.80760801 -42.45355839  -0.16996932]
Memory usage: 1.89 GB

============================================================
Timing:                               incl.     excl.
============================================================
IO:                                   0.260     0.260   0.1% |
Initialization:                      16.377     3.633   0.8% |
 Hamiltonian:                         4.083     0.000   0.0% |
  Atomic:                             0.003     0.003   0.0% |
  Communicate energies:               1.838     1.838   0.4% |
  Hartree integrate/restrict:         0.011     0.011   0.0% |
  Initialize Hamiltonian:             0.008     0.008   0.0% |
  Poisson:                            1.844     1.844   0.4% |
  XC 3D grid:                         0.378     0.378   0.1% |
  vbar:                               0.002     0.002   0.0% |
 LCAO initialization:                 8.662     0.613   0.1% |
  LCAO eigensolver:                   1.246     0.001   0.0% |
   Atomic Hamiltonian:                0.000     0.000   0.0% |
   Calculate projections:             0.000     0.000   0.0% |
   Distribute overlap matrix:         0.496     0.496   0.1% |
   Orbital Layouts:                   0.494     0.494   0.1% |
   Potential matrix:                  0.255     0.255   0.1% |
  LCAO to grid:                       4.296     4.296   1.0% |
  Set positions (LCAO WFS):           2.506     1.289   0.3% |
   Basic WFS set positions:           0.000     0.000   0.0% |
   Basis functions set positions:     0.002     0.002   0.0% |
   TCI: Calculate S, T, P:            1.215     1.215   0.3% |
SCF-cycle:                          413.675     0.362   0.1% |
 Density:                             2.189     0.001   0.0% |
  Atomic density matrices:            0.001     0.001   0.0% |
  Mix:                                0.331     0.331   0.1% |
  Multipole moments:                  1.028     1.028   0.2% |
  Pseudo density:                     0.828     0.828   0.2% |
 Hamiltonian:                        71.015     0.003   0.0% |
  Atomic:                             0.073     0.073   0.0% |
  Communicate energies:              49.488    49.488  11.5% |----|
  Hartree integrate/restrict:         0.159     0.159   0.0% |
  Poisson:                           11.776    11.776   2.7% ||
  XC 3D grid:                         9.495     9.495   2.2% ||
  vbar:                               0.021     0.021   0.0% |
 Orthonormalize:                     62.902    11.420   2.7% ||
  Band Layouts:                       0.357     0.001   0.0% |
   Distribute results:                0.127     0.127   0.0% |
   Inverse Cholesky:                  0.229     0.229   0.1% |
  calc_matrix:                       25.761    25.761   6.0% |-|
  rotate_psi:                        25.365    25.365   5.9% |-|
 RMM-DIIS:                          172.218    95.120  22.1% |--------|
  precondition:                      77.099    77.099  17.9% |------|
 Subspace diag:                     104.989     0.002   0.0% |
  Band Layouts:                      12.788     0.001   0.0% |
   Diagonalize:                      12.660    12.660   2.9% ||
   Distribute results:                0.126     0.126   0.0% |
  calc_matrix:                       43.176    43.176  10.0% |---|
  rotate_psi:                        49.022    49.022  11.4% |----|
Other:                                0.198     0.198   0.0% |
============================================================
Total:                                        430.510 100.0%
============================================================
date: Thu Dec 23 23:54:24 2010
-------------- next part --------------

  ___ ___ ___ _ _ _  
 |   |   |_  | | | | 
 | | | | | . | | | | 
 |__ |  _|___|_____|  0.8.0.7028
 |___|_|             

User: XXXX at vip032
Date: Thu Dec 23 22:54:50 2010
Arch: 00CCF5D14C00
Pid:  5112266
Dir:  /u/XXXX/usr/GPAW-0.8.0.7028-SCA/lib/python/gpaw
ase:   /u/XXXX/usr/ase/ase  version:  3.5.0.1921
numpy: /u/XXXX/usr/lib/python2.6/site-packages/numpy
units: Angstrom and eV

Memory estimate
---------------
Calculator  4222.41 MiB
    Initial overhead  3885.53 MiB
    Density  13.88 MiB
        Arrays  4.98 MiB
        Localized functions  2.53 MiB
        Mixer  1.15 MiB
        Interpolator  5.23 MiB
    Hamiltonian  34.11 MiB
        Arrays  3.25 MiB
        Restrictor  3.38 MiB
        XC 3D grid  14.27 MiB
        Poisson  13.00 MiB
        vbar  0.20 MiB
    Wavefunctions  288.89 MiB
        Arrays psit_nG  95.13 MiB
        Eigensolver  96.09 MiB
        Projectors  0.32 MiB
        Overlap op  97.01 MiB
        Kinetic operator  0.34 MiB

Positions:
Unit Cell:
           Periodic     X           Y           Z      Points  Spacing
  --------------------------------------------------------------------
  1. axis:    yes   17.147303    0.000000    0.000000   112     0.1531
  2. axis:    yes    0.000000   17.325000    0.000000   112     0.1547
  3. axis:    yes    0.000000    0.000000   19.800000   128     0.1547

Using the vdW-DF Exchange-Correlation Functional.
Spin-Paired Calculation
Total Charge:      0.000000
Fermi Temperature: 0.000000
Mode: Finite-difference
Eigensolver:       rmm-diis
                   (3 nearest neighbors central finite-difference stencil)
Poisson Solver:    Jacobi 
                   (3 nearest neighbors central finite-difference stencil)
Interpolation:     6th Order
Reference Energy:  -129906.984651

Gamma Point Calculation
Total number of cores used: 64
Domain Decomposition: 4 x 4 x 4
Diagonalizer layout: Serial LAPACK
Orthonormalizer layout: Serial LAPACK

1 k-point in the Irreducible Part of the Brillouin Zone (total: 1)
Linear Mixing Parameter:           0.1
Pulay Mixing with 3 Old Densities
Damping of Long Wave Oscillations: 50

Convergence Criteria:
Total Energy Change per Atom:           0.0001 eV / electron
Integral of Absolute Density Change:    0.0001 electrons
Integral of Absolute Eigenstate Change: 1e-09
Number of Bands in Calculation:         497
Bands to Converge:                      Occupied States Only
Number of Valence Electrons:            504
                     log10-error:    Total        Iterations:
           Time      WFS    Density  Energy       Fermi  Poisson
iter:   1  22:56:43  +0.7            -1440.82686  0      31     
iter:  29  23:18:09  -9.0   -5.3     -1673.29994  0      2      
------------------------------------
Converged After 29 Iterations.

Energy Contributions Relative to Reference Atoms:(reference = -129906.98465)
-------------------------
Kinetic:       +1223.68913
Potential:     -1264.71867
External:        +0.00000
XC:            -1639.17934
Entropy (-ST):   -0.00000
Local:           +6.90894
-------------------------
Free Energy:   -1673.29994
Zero Kelvin:   -1673.29994

Fermi Level: -2.44333
 Band   Eigenvalues  Occupancy


Total Charge:  -0.000000 electrons
Dipole Moment: [-57.93971148 -42.43806154  -0.16853521]
Memory usage: 11.40 GB

============================================================
Timing:                               incl.     excl.
============================================================
Initialization:                      94.503     7.771   0.6% |
 Hamiltonian:                        78.103     0.000   0.0% |
  Atomic:                             0.004     0.004   0.0% |
  Communicate energies:               1.611     1.611   0.1% |
  Hartree integrate/restrict:         0.006     0.006   0.0% |
  Initialize Hamiltonian:             0.006     0.006   0.0% |
  Poisson:                            1.839     1.839   0.1% |
  XC 3D grid:                        74.636     1.257   0.1% |
   VdW-DF integral:                  73.380     0.622   0.0% |
    Convolution:                      0.778     0.778   0.1% |
    FFT:                              1.102     1.102   0.1% |
    gather:                          11.532    11.532   0.8% |
    hmm1:                             1.474     1.474   0.1% |
    hmm2:                             3.808     3.808   0.3% |
    iFFT:                             1.700     1.700   0.1% |
    potential:                       13.661     0.057   0.0% |
     collect:                         0.975     0.975   0.1% |
     p1:                              6.533     6.533   0.5% |
     p2:                              4.199     4.199   0.3% |
     sum:                             1.898     1.898   0.1% |
    splines:                         38.702    38.702   2.8% ||
  vbar:                               0.001     0.001   0.0% |
 LCAO initialization:                 8.629     0.629   0.0% |
  LCAO eigensolver:                   1.224     0.001   0.0% |
   Atomic Hamiltonian:                0.000     0.000   0.0% |
   Calculate projections:             0.000     0.000   0.0% |
   Distribute overlap matrix:         0.484     0.484   0.0% |
   Orbital Layouts:                   0.485     0.485   0.0% |
   Potential matrix:                  0.254     0.254   0.0% |
  LCAO to grid:                       4.090     4.090   0.3% |
  Set positions (LCAO WFS):           2.686     1.381   0.1% |
   Basic WFS set positions:           0.000     0.000   0.0% |
   Basis functions set positions:     0.003     0.003   0.0% |
   TCI: Calculate S, T, P:            1.302     1.302   0.1% |
SCF-cycle:                         1305.043     0.414   0.0% |
 Density:                             2.267     0.001   0.0% |
  Atomic density matrices:            0.000     0.000   0.0% |
  Mix:                                0.327     0.327   0.0% |
  Multipole moments:                  1.111     1.111   0.1% |
  Pseudo density:                     0.827     0.827   0.1% |
 Hamiltonian:                       956.456     0.003   0.0% |
  Atomic:                             0.073     0.073   0.0% |
  Communicate energies:              44.944    44.944   3.2% ||
  Hartree integrate/restrict:         0.160     0.160   0.0% |
  Poisson:                           13.482    13.482   1.0% |
  XC 3D grid:                       897.773    32.028   2.3% ||
   VdW-DF integral:                 865.745    16.210   1.2% |
    Convolution:                     20.996    20.996   1.5% ||
    FFT:                             28.850    28.850   2.1% ||
    gather:                         270.157   270.157  19.3% |-------|
    hmm1:                            38.619    38.619   2.8% ||
    hmm2:                            96.337    96.337   6.9% |--|
    iFFT:                            43.221    43.221   3.1% ||
    potential:                      351.354     1.512   0.1% |
     collect:                        26.903    26.903   1.9% ||
     p1:                            169.233   169.233  12.1% |----|
     p2:                            104.305   104.305   7.5% |--|
     sum:                            49.402    49.402   3.5% ||
    splines:                          0.000     0.000   0.0% |
  vbar:                               0.021     0.021   0.0% |
 Orthonormalize:                     62.967    11.492   0.8% |
  Band Layouts:                       0.357     0.001   0.0% |
   Distribute results:                0.127     0.127   0.0% |
   Inverse Cholesky:                  0.229     0.229   0.0% |
  calc_matrix:                       25.714    25.714   1.8% ||
  rotate_psi:                        25.404    25.404   1.8% ||
 RMM-DIIS:                          176.775    96.831   6.9% |--|
  precondition:                      79.944    79.944   5.7% |-|
 Subspace diag:                     106.165     0.002   0.0% |
  Band Layouts:                      12.906     0.001   0.0% |
   Diagonalize:                      12.785    12.785   0.9% |
   Distribute results:                0.120     0.120   0.0% |
  calc_matrix:                       44.168    44.168   3.2% ||
  rotate_psi:                        49.088    49.088   3.5% ||
Other:                                0.294     0.294   0.0% |
============================================================
Total:                                       1399.840 100.0%
============================================================
date: Thu Dec 23 23:18:10 2010
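
For anyone who wants to rank the timers in logs like the two above, the
following is a minimal sketch of a parser for the timing rows. It assumes each
row has the layout seen here (indented label, colon, inclusive seconds,
exclusive seconds); the function names parse_timing and top_exclusive are
hypothetical and not part of GPAW.

```python
import re

# One timing row: indentation, label, colon, inclusive and exclusive seconds.
# The lazy label match lets labels that contain a colon (e.g. "TCI: ...") through.
ROW = re.compile(r"^(?P<indent>\s*)(?P<name>.+?):\s+"
                 r"(?P<incl>\d+\.\d+)\s+(?P<excl>\d+\.\d+)\s")


def parse_timing(text):
    """Return (indent, name, inclusive_s, exclusive_s) for each timing row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # headers, separators and the 'Total:' line do not match
            rows.append((len(m["indent"]), m["name"].strip(),
                         float(m["incl"]), float(m["excl"])))
    return rows


def top_exclusive(rows, n=3):
    """Return the n rows with the largest exclusive time."""
    return sorted(rows, key=lambda r: r[3], reverse=True)[:n]


# A few rows copied from the SCF-cycle part of the vdW-DF log.
sample = """\
   VdW-DF integral:                 865.745    16.210   1.2% |
    gather:                         270.157   270.157  19.3% |-------|
    potential:                      351.354     1.512   0.1% |
     p1:                            169.233   169.233  12.1% |----|
     p2:                            104.305   104.305   7.5% |--|
Total:                                       1399.840 100.0%
"""

for indent, name, incl, excl in top_exclusive(parse_timing(sample)):
    print(f"{name:10s} {excl:9.3f} s exclusive")
```

Sorting on the exclusive column rather than the inclusive one avoids counting
a parent timer such as "VdW-DF integral" together with its children.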

