[gpaw-users] Computational cost of vdW-DF
Duy Le
ttduyle at gmail.com
Tue Apr 12 18:42:25 CEST 2011
2011/4/12 Jens Jørgen Mortensen <jensj at fysik.dtu.dk>:
> On Tue, 2011-04-12 at 15:29 +0200, Duy Le wrote:
>> 2011/4/12 Jens Jørgen Mortensen <jensj at fysik.dtu.dk>:
>> > On Mon, 2011-04-11 at 23:56 +0200, Duy Le wrote:
>> >> Dear all,
>> >> As I understand it, the approach of Román-Pérez and Soler (PRL 2009) has
>> >> been implemented in GPAW. I wonder why the self-consistent vdW-DF
>> >> calculation is still so expensive (3 to 4 times slower in most of my
>> >> calculations) compared with a regular GGA-PBE calculation.
>> >> Has anyone else experienced this slowness? I would like to know the reason.
>> >
>> > For a small system with few electrons, the calculation of the vdW-DF
>> > potential will dominate. With more electrons the vdW-DF part should
>> > become smaller unless you run on so many cores that the parallelization
>> > of the vdW-DF stuff stops scaling. So, what is your system and how many
>> > CPUs do you use?
>> That is the case here: my systems are big, on the order of 70-150
>> atoms. Of course I have to use a lot of cores (64 to 512 in most
>> cases); otherwise I don't have enough memory.
>> Is there any way to improve it? The PBE calculation scales very well.
>
> The current implementation of the vdW-DF xc-potential scales only up to
> 20 cores. Could you show us the text output from one of your
> calculations - you should be able to see the timings for the different
> parts of the calculation in the text output.
>
Sure. Please see the attachments for an SCF PBE run and an SCF vdW-DF1 run.
I deleted many lines that are not important.
Both runs used 64 cores on an AIX POWER6 machine.
--------------------------------------------------
Duy Le
PhD Student
Department of Physics
University of Central Florida.
"Men don't need hand to do things"
> Jens Jørgen
>
>> > Jens Jørgen
>> >
>>
>> >> Thank you.
>> >> _______________________________________________
>> >> gpaw-users mailing list
>> >> gpaw-users at listserv.fysik.dtu.dk
>> >> https://listserv.fysik.dtu.dk/mailman/listinfo/gpaw-users
-------------- next part --------------
___ ___ ___ _ _ _
| | |_ | | | |
| | | | | . | | | |
|__ | _|___|_____| 0.8.0.7028
|___|_|
User: XXXX at vip032
Date: Thu Dec 23 23:47:13 2010
Arch: 00CCF5D14C00
Pid: 3408662
Dir: /u/XXXX/usr/GPAW-0.8.0.7028-SCA/lib/python/gpaw
ase: /u/XXXX/usr/ase/ase version: 3.5.0.1921
numpy: /u/XXXX/usr/lib/python2.6/site-packages/numpy
units: Angstrom and eV
Memory estimate
---------------
Calculator 693.41 MiB
Initial overhead 356.53 MiB
Density 13.88 MiB
Arrays 4.98 MiB
Localized functions 2.53 MiB
Mixer 1.15 MiB
Interpolator 5.23 MiB
Hamiltonian 34.11 MiB
Arrays 3.25 MiB
Restrictor 3.38 MiB
XC 3D grid 14.27 MiB
Poisson 13.00 MiB
vbar 0.20 MiB
Wavefunctions 288.89 MiB
Arrays psit_nG 95.13 MiB
Eigensolver 96.09 MiB
Projectors 0.32 MiB
Overlap op 97.01 MiB
Kinetic operator 0.34 MiB
Unit Cell:
Periodic X Y Z Points Spacing
--------------------------------------------------------------------
1. axis: yes 17.147303 0.000000 0.000000 112 0.1531
2. axis: yes 0.000000 17.325000 0.000000 112 0.1547
3. axis: yes 0.000000 0.000000 19.800000 128 0.1547
Using the PBE Exchange-Correlation Functional.
Spin-Paired Calculation
Total Charge: 0.000000
Fermi Temperature: 0.000000
Mode: Finite-difference
Eigensolver: rmm-diis
(3 nearest neighbors central finite-difference stencil)
Poisson Solver: Jacobi
(3 nearest neighbors central finite-difference stencil)
Interpolation: 6th Order
Reference Energy: -129744.446005
Gamma Point Calculation
Total number of cores used: 64
Domain Decomposition: 4 x 4 x 4
Diagonalizer layout: Serial LAPACK
Orthonormalizer layout: Serial LAPACK
1 k-point in the Irreducible Part of the Brillouin Zone (total: 1)
Linear Mixing Parameter: 0.1
Pulay Mixing with 3 Old Densities
Damping of Long Wave Oscillations: 50
Convergence Criteria:
Total Energy Change per Atom: 0.0001 eV / electron
Integral of Absolute Density Change: 0.0001 electrons
Integral of Absolute Eigenstate Change: 1e-09
Number of Bands in Calculation: 497
Bands to Converge: Occupied States Only
Number of Valence Electrons: 504
log10-error: Total Iterations:
Time WFS Density Energy Fermi Poisson
iter: 1 23:47:48 +0.7 -930.40924 0 31
iter: 29 23:54:23 -9.2 -5.2 -1146.53569 0 2
------------------------------------
Converged After 29 Iterations.
Energy Contributions Relative to Reference Atoms:(reference = -129744.44601)
-------------------------
Kinetic: +1012.57284
Potential: -1073.40028
External: +0.00000
XC: -1092.47786
Entropy (-ST): -0.00000
Local: +6.76962
-------------------------
Free Energy: -1146.53569
Zero Kelvin: -1146.53569
Fermi Level: -2.36483
Band Eigenvalues Occupancy
Total Charge: -0.000000 electrons
Dipole Moment: [-58.80760801 -42.45355839 -0.16996932]
Memory usage: 1.89 GB
============================================================
Timing: incl. excl.
============================================================
IO: 0.260 0.260 0.1% |
Initialization: 16.377 3.633 0.8% |
Hamiltonian: 4.083 0.000 0.0% |
Atomic: 0.003 0.003 0.0% |
Communicate energies: 1.838 1.838 0.4% |
Hartree integrate/restrict: 0.011 0.011 0.0% |
Initialize Hamiltonian: 0.008 0.008 0.0% |
Poisson: 1.844 1.844 0.4% |
XC 3D grid: 0.378 0.378 0.1% |
vbar: 0.002 0.002 0.0% |
LCAO initialization: 8.662 0.613 0.1% |
LCAO eigensolver: 1.246 0.001 0.0% |
Atomic Hamiltonian: 0.000 0.000 0.0% |
Calculate projections: 0.000 0.000 0.0% |
Distribute overlap matrix: 0.496 0.496 0.1% |
Orbital Layouts: 0.494 0.494 0.1% |
Potential matrix: 0.255 0.255 0.1% |
LCAO to grid: 4.296 4.296 1.0% |
Set positions (LCAO WFS): 2.506 1.289 0.3% |
Basic WFS set positions: 0.000 0.000 0.0% |
Basis functions set positions: 0.002 0.002 0.0% |
TCI: Calculate S, T, P: 1.215 1.215 0.3% |
SCF-cycle: 413.675 0.362 0.1% |
Density: 2.189 0.001 0.0% |
Atomic density matrices: 0.001 0.001 0.0% |
Mix: 0.331 0.331 0.1% |
Multipole moments: 1.028 1.028 0.2% |
Pseudo density: 0.828 0.828 0.2% |
Hamiltonian: 71.015 0.003 0.0% |
Atomic: 0.073 0.073 0.0% |
Communicate energies: 49.488 49.488 11.5% |----|
Hartree integrate/restrict: 0.159 0.159 0.0% |
Poisson: 11.776 11.776 2.7% ||
XC 3D grid: 9.495 9.495 2.2% ||
vbar: 0.021 0.021 0.0% |
Orthonormalize: 62.902 11.420 2.7% ||
Band Layouts: 0.357 0.001 0.0% |
Distribute results: 0.127 0.127 0.0% |
Inverse Cholesky: 0.229 0.229 0.1% |
calc_matrix: 25.761 25.761 6.0% |-|
rotate_psi: 25.365 25.365 5.9% |-|
RMM-DIIS: 172.218 95.120 22.1% |--------|
precondition: 77.099 77.099 17.9% |------|
Subspace diag: 104.989 0.002 0.0% |
Band Layouts: 12.788 0.001 0.0% |
Diagonalize: 12.660 12.660 2.9% ||
Distribute results: 0.126 0.126 0.0% |
calc_matrix: 43.176 43.176 10.0% |---|
rotate_psi: 49.022 49.022 11.4% |----|
Other: 0.198 0.198 0.0% |
============================================================
Total: 430.510 100.0%
============================================================
date: Thu Dec 23 23:54:24 2010
-------------- next part --------------
___ ___ ___ _ _ _
| | |_ | | | |
| | | | | . | | | |
|__ | _|___|_____| 0.8.0.7028
|___|_|
User: XXXX at vip032
Date: Thu Dec 23 22:54:50 2010
Arch: 00CCF5D14C00
Pid: 5112266
Dir: /u/XXXX/usr/GPAW-0.8.0.7028-SCA/lib/python/gpaw
ase: /u/XXXX/usr/ase/ase version: 3.5.0.1921
numpy: /u/XXXX/usr/lib/python2.6/site-packages/numpy
units: Angstrom and eV
Memory estimate
---------------
Calculator 4222.41 MiB
Initial overhead 3885.53 MiB
Density 13.88 MiB
Arrays 4.98 MiB
Localized functions 2.53 MiB
Mixer 1.15 MiB
Interpolator 5.23 MiB
Hamiltonian 34.11 MiB
Arrays 3.25 MiB
Restrictor 3.38 MiB
XC 3D grid 14.27 MiB
Poisson 13.00 MiB
vbar 0.20 MiB
Wavefunctions 288.89 MiB
Arrays psit_nG 95.13 MiB
Eigensolver 96.09 MiB
Projectors 0.32 MiB
Overlap op 97.01 MiB
Kinetic operator 0.34 MiB
Positions:
Unit Cell:
Periodic X Y Z Points Spacing
--------------------------------------------------------------------
1. axis: yes 17.147303 0.000000 0.000000 112 0.1531
2. axis: yes 0.000000 17.325000 0.000000 112 0.1547
3. axis: yes 0.000000 0.000000 19.800000 128 0.1547
Using the vdW-DF Exchange-Correlation Functional.
Spin-Paired Calculation
Total Charge: 0.000000
Fermi Temperature: 0.000000
Mode: Finite-difference
Eigensolver: rmm-diis
(3 nearest neighbors central finite-difference stencil)
Poisson Solver: Jacobi
(3 nearest neighbors central finite-difference stencil)
Interpolation: 6th Order
Reference Energy: -129906.984651
Gamma Point Calculation
Total number of cores used: 64
Domain Decomposition: 4 x 4 x 4
Diagonalizer layout: Serial LAPACK
Orthonormalizer layout: Serial LAPACK
1 k-point in the Irreducible Part of the Brillouin Zone (total: 1)
Linear Mixing Parameter: 0.1
Pulay Mixing with 3 Old Densities
Damping of Long Wave Oscillations: 50
Convergence Criteria:
Total Energy Change per Atom: 0.0001 eV / electron
Integral of Absolute Density Change: 0.0001 electrons
Integral of Absolute Eigenstate Change: 1e-09
Number of Bands in Calculation: 497
Bands to Converge: Occupied States Only
Number of Valence Electrons: 504
log10-error: Total Iterations:
Time WFS Density Energy Fermi Poisson
iter: 1 22:56:43 +0.7 -1440.82686 0 31
iter: 29 23:18:09 -9.0 -5.3 -1673.29994 0 2
------------------------------------
Converged After 29 Iterations.
Energy Contributions Relative to Reference Atoms:(reference = -129906.98465)
-------------------------
Kinetic: +1223.68913
Potential: -1264.71867
External: +0.00000
XC: -1639.17934
Entropy (-ST): -0.00000
Local: +6.90894
-------------------------
Free Energy: -1673.29994
Zero Kelvin: -1673.29994
Fermi Level: -2.44333
Band Eigenvalues Occupancy
Total Charge: -0.000000 electrons
Dipole Moment: [-57.93971148 -42.43806154 -0.16853521]
Memory usage: 11.40 GB
============================================================
Timing: incl. excl.
============================================================
Initialization: 94.503 7.771 0.6% |
Hamiltonian: 78.103 0.000 0.0% |
Atomic: 0.004 0.004 0.0% |
Communicate energies: 1.611 1.611 0.1% |
Hartree integrate/restrict: 0.006 0.006 0.0% |
Initialize Hamiltonian: 0.006 0.006 0.0% |
Poisson: 1.839 1.839 0.1% |
XC 3D grid: 74.636 1.257 0.1% |
VdW-DF integral: 73.380 0.622 0.0% |
Convolution: 0.778 0.778 0.1% |
FFT: 1.102 1.102 0.1% |
gather: 11.532 11.532 0.8% |
hmm1: 1.474 1.474 0.1% |
hmm2: 3.808 3.808 0.3% |
iFFT: 1.700 1.700 0.1% |
potential: 13.661 0.057 0.0% |
collect: 0.975 0.975 0.1% |
p1: 6.533 6.533 0.5% |
p2: 4.199 4.199 0.3% |
sum: 1.898 1.898 0.1% |
splines: 38.702 38.702 2.8% ||
vbar: 0.001 0.001 0.0% |
LCAO initialization: 8.629 0.629 0.0% |
LCAO eigensolver: 1.224 0.001 0.0% |
Atomic Hamiltonian: 0.000 0.000 0.0% |
Calculate projections: 0.000 0.000 0.0% |
Distribute overlap matrix: 0.484 0.484 0.0% |
Orbital Layouts: 0.485 0.485 0.0% |
Potential matrix: 0.254 0.254 0.0% |
LCAO to grid: 4.090 4.090 0.3% |
Set positions (LCAO WFS): 2.686 1.381 0.1% |
Basic WFS set positions: 0.000 0.000 0.0% |
Basis functions set positions: 0.003 0.003 0.0% |
TCI: Calculate S, T, P: 1.302 1.302 0.1% |
SCF-cycle: 1305.043 0.414 0.0% |
Density: 2.267 0.001 0.0% |
Atomic density matrices: 0.000 0.000 0.0% |
Mix: 0.327 0.327 0.0% |
Multipole moments: 1.111 1.111 0.1% |
Pseudo density: 0.827 0.827 0.1% |
Hamiltonian: 956.456 0.003 0.0% |
Atomic: 0.073 0.073 0.0% |
Communicate energies: 44.944 44.944 3.2% ||
Hartree integrate/restrict: 0.160 0.160 0.0% |
Poisson: 13.482 13.482 1.0% |
XC 3D grid: 897.773 32.028 2.3% ||
VdW-DF integral: 865.745 16.210 1.2% |
Convolution: 20.996 20.996 1.5% ||
FFT: 28.850 28.850 2.1% ||
gather: 270.157 270.157 19.3% |-------|
hmm1: 38.619 38.619 2.8% ||
hmm2: 96.337 96.337 6.9% |--|
iFFT: 43.221 43.221 3.1% ||
potential: 351.354 1.512 0.1% |
collect: 26.903 26.903 1.9% ||
p1: 169.233 169.233 12.1% |----|
p2: 104.305 104.305 7.5% |--|
sum: 49.402 49.402 3.5% ||
splines: 0.000 0.000 0.0% |
vbar: 0.021 0.021 0.0% |
Orthonormalize: 62.967 11.492 0.8% |
Band Layouts: 0.357 0.001 0.0% |
Distribute results: 0.127 0.127 0.0% |
Inverse Cholesky: 0.229 0.229 0.0% |
calc_matrix: 25.714 25.714 1.8% ||
rotate_psi: 25.404 25.404 1.8% ||
RMM-DIIS: 176.775 96.831 6.9% |--|
precondition: 79.944 79.944 5.7% |-|
Subspace diag: 106.165 0.002 0.0% |
Band Layouts: 12.906 0.001 0.0% |
Diagonalize: 12.785 12.785 0.9% |
Distribute results: 0.120 0.120 0.0% |
calc_matrix: 44.168 44.168 3.2% ||
rotate_psi: 49.088 49.088 3.5% ||
Other: 0.294 0.294 0.0% |
============================================================
Total: 1399.840 100.0%
============================================================
date: Thu Dec 23 23:18:10 2010
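A quick way to read the two timing tables above side by side is to pull out the inclusive times and compare them. The following is only a minimal sketch and is not part of GPAW: `parse_timings` and the regular expression are hypothetical helpers, and the sample lines are copied from the two logs above. It assumes each timer line has the form `name: incl [excl] percent%`.

```python
import re

# A few lines copied from the PBE timing table above.
PBE_LOG = """\
Hamiltonian: 71.015 0.003 0.0% |
XC 3D grid: 9.495 9.495 2.2% ||
RMM-DIIS: 172.218 95.120 22.1% |--------|
Total: 430.510 100.0%
"""

# The corresponding lines from the vdW-DF timing table above.
VDW_LOG = """\
Hamiltonian: 956.456 0.003 0.0% |
XC 3D grid: 897.773 32.028 2.3% ||
gather: 270.157 270.157 19.3% |-------|
potential: 351.354 1.512 0.1% |
RMM-DIIS: 176.775 96.831 6.9% |--|
Total: 1399.840 100.0%
"""

# Assumed line layout: "name: incl [excl] pct%".
# The "Total" line has no exclusive-time column, so that group is optional.
LINE = re.compile(r"^\s*(.+?):\s+([\d.]+)(?:\s+([\d.]+))?\s+([\d.]+)%")


def parse_timings(text):
    """Return {timer name: inclusive seconds}.

    If a timer name occurs more than once (for example "Hamiltonian"
    under both Initialization and SCF-cycle in the full log), the last
    occurrence wins.
    """
    timings = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            timings[match.group(1).strip()] = float(match.group(2))
    return timings


pbe = parse_timings(PBE_LOG)
vdw = parse_timings(VDW_LOG)

# Overall slowdown of the vdW-DF run relative to the PBE run.
slowdown = vdw["Total"] / pbe["Total"]

# Fraction of each run spent in the "XC 3D grid" timer.
xc_share_pbe = pbe["XC 3D grid"] / pbe["Total"]
xc_share_vdw = vdw["XC 3D grid"] / vdw["Total"]

print(f"vdW-DF / PBE total time: {slowdown:.2f}")
print(f"XC share of total, PBE:    {xc_share_pbe:.1%}")
print(f"XC share of total, vdW-DF: {xc_share_vdw:.1%}")
```

With the numbers from these two logs, the vdW-DF run takes about 3.25 times as long as the PBE run, and the "XC 3D grid" timer accounts for roughly 64% of the vdW-DF total, compared with about 2% for PBE. The RMM-DIIS time is nearly the same in both runs, so the extra cost sits almost entirely in the XC part.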