[gpaw-users] Installing GPAW with parallel and scalapack support
Ask Hjorth Larsen
asklarsen at gmail.com
Tue Apr 30 17:09:17 CEST 2013
2013/4/30 Gaël Donval <gael.donval at cnrs-imn.fr>:
>> > Hi,
>> >
>> > I'd like to use GPAW to study quite big systems (thousands of bands). In
>> > this case, the use of ScaLAPACK is recommended.
>> >
>> > My system:
>> > - GCC 4.1.2 (old; I tried to upgrade but ran into a bug with the
>> > current libc...)
>> > - OpenBLAS 0.2.5
>> > - LAPACK 3.4.2 (integrated into OpenBLAS)
>> > - ScaLAPACK 2.0.2
>> > - FFTW 3.3.3
>> > - OpenMPI 1.6.4
>> > - Python 2.7.4
>> > - NumPy 1.7.1
>> > - SciPy 0.12.0
>> > - GPAW svn
>> >
>> > Everything has been linked against OpenBLAS successfully, the NumPy and
>> > SciPy test suites all pass, and I verified that the optimized _dotblas.so
>> > is picked up by timing the accelerated np.dot() against the non-accelerated
>> > one and against scipy.linalg.blas.cblas.dgemm(). All of that is perfectly fine.
>> >
>> > Now, GPAW...
>> >
>> > I have 2 problems:
>> >
>> > 1) fileio/parallel.py test fails: "RuntimeError: MPI barrier
>> > timeout."
>> > I'm going to try other configuration flags with MPI. I suspect this
>> > is due to some strange interaction with our SGE scheduler. Does
>> > that ring a bell with anyone?
>> >
>> > 2) I can't compile GPAW with ScaLAPACK.
>> > I get errors such as:
>> > c/blacs.c: In function ‘pblas_tran’:
>> > c/blacs.c:314: error: ‘PyArrayObject’ has no member named ‘descr’
>> > c/blacs.c:314: error: ‘PyArray_DOUBLE’ undeclared (first use in this function)
>> > c/blacs.c:314: error: (Each undeclared identifier is reported only once
>> > c/blacs.c:314: error: for each function it appears in.)
>> > c/blacs.c:323: error: ‘PyArrayObject’ has no member named ‘data’
>> > c/blacs.c:325: error: ‘PyArrayObject’ has no member named ‘data’
>> > The same errors appear throughout c/blacs.c, just in different
>> > functions, and compilation of the object files stops there. (As a
>> > result the hdf5.o object file never gets built and the linker
>> > complains that it is missing, but that is only a side effect.)
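For context: these are the classic symptoms of compiling pre-NumPy-1.7 C code, which
accesses PyArrayObject fields directly (arr->descr, arr->data) and uses the old
PyArray_DOUBLE constant, while NPY_NO_DEPRECATED_API is defined. That define hides the
old names so only the accessor-style API remains visible. A minimal sketch of the two
styles (illustrative only, not GPAW's actual c/blacs.c):

    /* Illustrative only -- not GPAW's c/blacs.c.  The define below is
     * what triggers the errors quoted above when old-style code is
     * compiled unchanged against NumPy 1.7. */
    #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
    #include <Python.h>
    #include <numpy/arrayobject.h>

    static double *get_double_data(PyArrayObject *arr)
    {
        /* Old (pre-1.7) style -- no longer compiles with the define above:
         *     if (arr->descr->type_num == PyArray_DOUBLE)
         *         return (double *) arr->data;
         */

        /* NumPy 1.7 accessor style: */
        if (PyArray_TYPE(arr) == NPY_DOUBLE)
            return (double *) PyArray_DATA(arr);
        return NULL;
    }

Dropping the define makes the old member names and PyArray_DOUBLE visible again (at
worst with deprecation warnings), which is consistent with the workaround reported
further down in the thread.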
>>
>> The serial version passes all the tests.
>>
>> The parallel version:
>> 1) with 2 cores (mpirun -np 2 gpaw-python ...) throws
>> "TypeError: Not a proper NumPy array for MPI
>> communication."
>> on some tests (parallel/overlap.py, pw/slab.py, exx_acdf.py to
>> name a few).
>> 2) with 4+ cores (mpirun -np 4 ...) the fileio/parallel.py
>> test fails with:
>> "RuntimeError: MPI barrier timeout."
>>
>
>> ScaLAPACK is still not working, even though OpenBLAS, PBLAS, BLACS and
>> ScaLAPACK itself passed all their respective built-in tests perfectly.
> Got it working by disabling the NPY_NO_DEPRECATED_API preprocessor flag.
> However, the "TypeError: Not a proper NumPy array for MPI communication."
> remains, as well as the RuntimeError.
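On the remaining TypeError: that message presumably comes from a guard in the C-level
MPI wrapper that rejects arrays which are not C-contiguous, not well-behaved, or not of
a plain numeric dtype. A hypothetical sketch of such a guard (the function name is made
up and this is not GPAW's actual c/mpi.c; the stack traces asked for below will show
which call actually hits it):

    /* Hypothetical sketch, not GPAW's actual source: the kind of check a
     * C-level MPI wrapper typically performs before handing a NumPy
     * buffer to MPI. */
    #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
    #include <Python.h>
    #include <numpy/arrayobject.h>

    static int array_ok_for_mpi(PyObject *obj)
    {
        if (!PyArray_Check(obj) ||
            !PyArray_ISCARRAY((PyArrayObject *) obj) ||   /* C-contiguous, well-behaved */
            !PyArray_ISNUMBER((PyArrayObject *) obj)) {   /* plain numeric dtype */
            PyErr_SetString(PyExc_TypeError,
                            "Not a proper NumPy array for MPI communication.");
            return 0;
        }
        return 1;
    }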
What are the stack traces?
Regards
Ask