[gpaw-users] Fwd: can't use gpaw-python
Marcin Dulak
Marcin.Dulak at fysik.dtu.dk
Thu Oct 24 11:26:31 CEST 2013
Hi,
On 10/24/2013 10:59 AM, 謝其軒 wrote:
> I've compiled gpaw without any reference to the mkl library.
> I exported $PATH and $LD_LIBRARY_PATH myself and gpaw compiled fine.
> Then I ran gpaw-test on a node with 8 cores. All tests passed, but
> the parallel part behaved a bit strangely:
> ===================
>
> [z955018 at node05 test]$ mpiexec -np 4 gpaw-python `which gpaw-test`
> 2>&1 | tee testgpaw.log
>
> --------------------------------------------------------------------------
> An MPI process has executed an operation involving a call to the
> "fork()" system call to create a child process. Open MPI is currently
> operating in a condition that could result in memory corruption or
> other system errors; your MPI job may hang, crash, or produce silent
> data corruption. The use of fork() (or system() or other calls that
> create child processes) is strongly discouraged.
>
> The process that invoked fork was:
>
> Local host: node05 (PID 7485)
> MPI_COMM_WORLD rank: 3
>
> If you are *absolutely sure* that your application will successfully
> and correctly survive a call to fork(), you may disable this warning
> by setting the mpi_warn_on_fork MCA parameter to 0.
> --------------------------------------------------------------------------
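If you want to silence the fork() warning, the MCA parameter it mentions can
be passed on the mpiexec command line; assuming Open MPI's mpiexec, something
like:

mpiexec --mca mpi_warn_on_fork 0 -np 4 gpaw-python `which gpaw-test`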
> python 2.7.5 GCC 4.1.2 20080704 (Red Hat 4.1.2-52) 64bit ELF on Linux
> x86_64 redhat 5.8 Final
> Running tests in /tmp/gpaw-test-ujm_vy
> Jobs: 1, Cores: 4, debug-mode: False
> =============================================================================
> gemm_complex.py 0.014 OK
> mpicomm.py 0.010 OK
> ase3k_version.py 0.007 OK
>
> ....................
>
>
> integral4.py 0.025 OK
> parallel/ut_parallel.py 1.074 OK
> [node05:07477] 3 more processes have sent help message
> help-mpi-runtime.txt / mpi_init:warn-fork
> [node05:07477] Set MCA parameter "orte_base_help_aggregate" to 0 to
> see all help / error messages
> transformations.py 0.017 OK
> parallel/parallel_eigh.py 0.010 OK
> spectrum.py 0.037 OK
> xc.py 0.064 OK
> ......................
>
> parallel/realspace_blacs.py 0.018 OK
> AA_exx_enthalpy.py 139.244 OK
> cmrtest/cmr_test.py 0.023 SKIPPED
> cmrtest/cmr_test3.py 0.012 SKIPPED
> cmrtest/cmr_test4.py 0.018 SKIPPED
> cmrtest/cmr_append.py 0.015 SKIPPED
> cmrtest/Li2_atomize.py 0.015 SKIPPED
> =============================================================================
> Ran 223 tests out of 230 in 2511.9 seconds
> Tests skipped: 7
> All tests passed!
> =============================================================================
> The ground-state calculation with the parallel version is fine.
>
> But when I run the DF calculations, a familiar problem shows up:
I think this uses scipy, so libmkl_intel_thread.so probably comes from scipy:
python -c "import scipy; print scipy.__config__.show(); print
scipy.__version__"
If so, rebuild scipy. Scipy uses the build settings from the available numpy
(the one you get with python -c "import numpy; print numpy").
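You can also check directly which libraries the compiled scipy extensions are
linked against; assuming a standard install, something like this (the path and
the exact extension filename below are only examples and depend on the scipy
version):

python -c "import scipy; print scipy.__file__"
# then inspect one of the compiled extensions found under that directory:
ldd /path/to/site-packages/scipy/linalg/_fblas.so | grep -i mkl

If mkl libraries show up there, that scipy build is the one pulling in
libmkl_intel_thread.so.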
Marcin
>
> python2.7: symbol lookup error:
> /opt/intel/composerxe-2011.3.174/mkl/lib/intel64/libmkl_intel_thread.so:
> undefined symbol: omp_get_num_procs
> (this is run with the serial version... and it pissed me off)
>
> ===============================================================================
> --------------------------------------------------------------------------
> An MPI process has executed an operation involving a call to the
> "fork()" system call to create a child process. Open MPI is currently
> operating in a condition that could result in memory corruption or
> other system errors; your MPI job may hang, crash, or produce silent
> data corruption. The use of fork() (or system() or other calls that
> create child processes) is strongly discouraged.
>
> The process that invoked fork was:
>
> Local host: node03 (PID 24987)
> MPI_COMM_WORLD rank: 3
>
> If you are *absolutely sure* that your application will successfully
> and correctly survive a call to fork(), you may disable this warning
> by setting the mpi_warn_on_fork MCA parameter to 0.
> --------------------------------------------------------------------------
> [node03:24983] 7 more processes have sent help message
> help-mpi-runtime.txt / mpi_init:warn-fork
> [node03:24983] Set MCA parameter "orte_base_help_aggregate" to 0 to
> see all help / error messages
> gpaw-python: symbol lookup error:
> /opt/intel/composerxe-2011.3.174/mkl/lib/intel64/libmkl_intel_thread.so:
> undefined symbol: omp_get_num_procs
> (the same symbol lookup error is printed by each of the remaining ranks)
> --------------------------------------------------------------------------
> mpirun has exited due to process rank 6 with PID 24990 on
> node node03 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --------------------------------------------------------------------------
>
> =============================================================================
>
> And this is the parallel version.
>
> I think right now I might try to build against the mkl library instead, or
> I'll go crazy...
>
> BR,
>
> chi-hsuan
>
>