[ase-users] Dacapo/ Altix
Marcin Dulak
Marcin.Dulak at fysik.dtu.dk
Fri Apr 29 08:49:55 CEST 2011
Hi,
make sure that the stack size is actually increased during the run on the
compute node - print it in the batch job.
There is a chance that another compiler (e.g. http://www.open64.net/)
would not show this problem, but that is probably too much work.
If this does not help, limit this job to a maximum of 8 cores.
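The check suggested above can be sketched as a batch-script preamble (a minimal sketch, assuming a Torque/PBS-style job script; the script layout is illustrative, not taken from the cluster in this thread):

```shell
#!/bin/sh
# Try to lift the stack limit for this job; a hard limit may refuse the
# request, in which case we keep whatever the system allows.
ulimit -s unlimited 2>/dev/null || true

# Print the limit actually in effect on the compute node, so it appears
# in the job's output log and can be compared with the login node.
echo "stack size on $(hostname): $(ulimit -s)"
```

Because batch systems often start jobs with a different (smaller) soft limit than an interactive login shell, printing the value from inside the job is the only reliable way to see what dacapo actually gets.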
Best regards,
Marcin
Tadeu Leonardo Soares e Silva wrote:
> Dear Marcin,
>
> Stack size=1GB
>
> This job runs on 8 cores (please see below), but it does not run on 16
> cores. What should I do? Should I compile it again? Could that be the
> problem?
>
>
>
> 8 cores
>
> ulimit -a
>
>
> core file size (blocks, -c) unlimited
> data seg size (kbytes, -d) unlimited
> scheduling priority (-e) 0
> file size (blocks, -f) unlimited
> pending signals (-i) 63955
> max locked memory (kbytes, -l) unlimited
> max memory size (kbytes, -m) 6966088
> open files (-n) 16384
> pipe size (512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> real-time priority (-r) 0
> stack size (kbytes, -s) 1024000
> cpu time (seconds, -t) 300
> max user processes (-u) 63955
> virtual memory (kbytes, -v) 8156400
> file locks (-x) unlimited
>
>
>
> * 8 cores
> /sw/mpi/intel/openmpi-1.4.2/bin/mpirun -np 8 \
>   -machinefile /var/spool/torque/aux//19631.service0.ice.nacad.ufrj.br \
>   /home/users/tadeu33/bin/dacapo_intellinux_mpi.run \
>   teste332.nc -out teste332.txt -scratch /scratch/tadeu/
>
>
> * 8 cores
> =================================================================
> NACAD Supercomputer Center
> =================================================================
> --- Job Information ---
> Cores available per node = 8
> Cores allocated = 8
> Cores used = 8
> --- Nodes:cores Used ---
> r1i1n5:8
> =================================================================
>
> Start : Wed Apr 27 12:37:00 BRT 2011
> BFGSLineSearch: 0 13:33:43 -19023.094270 0.6165
> BFGSLineSearch: 1 15:03:59 -19023.351442 0.2375
> BFGSLineSearch: 2 16:21:32 -19023.410924 0.0359
> -19023.4109243
> [[ 0. 0. 0. ]
> [ 0. 0. 0. ]
> [ 0. 0. 0. ]
> [ 0. 0. 0. ]
> [ 0. 0. 0. ]
> [ 0. 0. 0. ]
> [ 0. 0. 0. ]
> [ 0. 0. 0. ]
> [ 0. 0. 0. ]
> [-0.00116696 -0.00022275 0.03587158]
> [-0.00090078 -0.00051726 0.03563763]
> [-0.00137682 -0.00057754 0.03560661]
> [-0.00102949 -0.00064692 0.03547285]
> [-0.00119101 -0.00090526 0.0355944 ]
> [-0.00126022 -0.00073166 0.03546429]
> [-0.00160995 -0.00092908 0.03582606]
> [-0.00077982 -0.00089796 0.03586229]
> [-0.0010722 -0.00056965 0.03547968]]
> End : Wed Apr 27 16:21:32 BRT 2011
>
>
> * 16 cores
>
> /sw/mpi/intel/openmpi-1.4.2/bin/mpirun -np 16 \
>   -machinefile /var/spool/torque/aux//19637.service0.ice.nacad.ufrj.br \
>   /home/users/tadeu33/bin/dacapo_intellinux_mpi.run \
>   teste332.nc -out teste332.txt -scratch /scratch/tadeu/
>
> * 16 cores
>
>
> =================================================================
> NACAD Supercomputer Center
> =================================================================
> --- Job Information ---
> Cores available per node = 8
> Cores allocated = 16
> Cores used = 16
> --- Nodes:cores Used ---
> r1i1n12:8 r1i1n11:8
> =================================================================
>
> Start : Wed Apr 27 18:29:56 BRT 2011
> Traceback (most recent call last):
>   File "./teste.py", line 24, in <module>
>     dyn.run(fmax=0.05)
>   File "/home/users/tadeu33/local/ase-3.4.1/ase/optimize/optimize.py", line 114, in run
>     f = self.atoms.get_forces()
>   File "/home/users/tadeu33/local/ase-3.4.1/ase/atoms.py", line 571, in get_forces
>     forces = self._calc.get_forces(self)
>   File "/home/users/tadeu33/local/ase-3.4.1/ase/calculators/jacapo/jacapo.py", line 2307, in get_forces
>     self.calculate()
>   File "/home/users/tadeu33/local/ase-3.4.1/ase/calculators/jacapo/jacapo.py", line 2810, in calculate
>     raise DacapoAbnormalTermination(s % txt)
> ase.calculators.jacapo.jacapo.DacapoAbnormalTermination: Dacapo output
> txtfile (teste332.txt) did not end normally.
> KPT: Chadi-Cohen asymptotic error estimate: 0.000118148205
> KPT: (see PRB 8, 5747 (1973); 13, 5188 (1976))
>
> KPT: nkpmem : 1
> Parallel : -- parallel configuration --
> Parallel : There are 16 processors divided into 8 groups
> Parallel : Processors per group : 2
> Parallel : k-points per processor group : 1
> Parallel : Each k-point is parallelized over 2 processors
>
>
>
> Sincerely,
>
> Tadeu
>
>
>
>
>
>
>
>
>
>
> On Wed, 27 Apr 2011 10:17:28 +0200, Marcin Dulak wrote
>
>> Hi,
>>
>> does this job run on 8 cores?
>> Increasing the stack size may help:
>> https://wiki.fysik.dtu.dk/dacapo/Installation#id33
>>
>> Best regards,
>>
>> Marcin
>>
>> Tadeu Leonardo Soares e Silva wrote:
>>
>>> Dear Marcin
>>>
>>> I have installed Dacapo on Rocks/CentOS, but I cannot install Dacapo
>>> on OpenSUSE/SGI Altix. What could cause this problem? Could you help me,
>>> please?
>>> Please find as attachment some files.
>>>
>>>
>>> * ldd dacapo_intellinux_mpi.run
>>>
>>> linux-vdso.so.1 => (0x00007fffcbd8e000)
>>> libmkl_lapack.so
>>> => /sw/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so
>>> (0x00007ff44378f000)
>>> libmkl_intel_lp64.so
>>> => /sw/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_lp64.so
>>> (0x00007ff443395000)
>>> libmkl_core.so
>>> => /sw/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_core.so
>>> (0x00007ff442fe2000)
>>> libguide.so => /sw/intel/Compiler/11.1/072/lib/intel64/libguide.so
>>> (0x00007ff44446a000)
>>> libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff442dc5000)
>>> libmkl_intel_thread.so
>>> => /sw/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so
>>> (0x00007ff441b81000)
>>> libmpi_f90.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libmpi_f90.so.0
>>> (0x00007ff44197d000)
>>> libmpi_f77.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libmpi_f77.so.0
>>> (0x00007ff441741000)
>>> libmpi.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libmpi.so.0
>>> (0x00007ff44146f000)
>>> libopen-rte.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libopen-rte.so.0
>>> (0x00007ff441207000)
>>> libopen-pal.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libopen-pal.so.0
>>> (0x00007ff440f8a000)
>>> libdl.so.2 => /lib64/libdl.so.2 (0x00007ff440d86000)
>>> libnsl.so.1 => /lib64/libnsl.so.1 (0x00007ff440b6e000)
>>> libutil.so.1 => /lib64/libutil.so.1 (0x00007ff44096b000)
>>> libm.so.6 => /lib64/libm.so.6 (0x00007ff440715000)
>>> libc.so.6 => /lib64/libc.so.6 (0x00007ff4403b7000)
>>> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ff4401a0000)
>>> /lib64/ld-linux-x86-64.so.2 (0x00007ff4443f6000)
>>> libifport.so.5 => /sw/intel/Compiler/11.1/072/lib/intel64/libifport.so.5
>>> (0x00007ff440067000)
>>> libifcoremt.so.5
>>> => /sw/intel/Compiler/11.1/072/lib/intel64/libifcoremt.so.5
>>> (0x00007ff43fdc2000)
>>> libimf.so => /sw/intel/Compiler/11.1/072/lib/intel64/libimf.so
>>> (0x00007ff43fa2e000)
>>> libsvml.so => /sw/intel/Compiler/11.1/072/lib/intel64/libsvml.so
>>> (0x00007ff43f818000)
>>> libintlc.so.5 => /sw/intel/Compiler/11.1/072/lib/intel64/libintlc.so.5
>>> (0x00007ff43f6da000)
>>>
>>>
>>> * Log openmpi.o19491
>>> =================================================================
>>> NACAD Supercomputer Center
>>> =================================================================
>>> --- Job Information ---
>>> Cores available per node = 8
>>> Cores allocated = 16
>>> Cores used = 16
>>> --- Nodes:cores Used ---
>>> r1i1n10:8 r1i1n9:8
>>> =================================================================
>>>
>>> Start : Mon Apr 25 18:59:17 BRT 2011
>>> Traceback (most recent call last):
>>>   File "./teste.py", line 24, in <module>
>>>     dyn.run(fmax=0.05)
>>>   File "/home/users/tadeu33/local/ase-3.4.1/ase/optimize/optimize.py", line 114, in run
>>>     f = self.atoms.get_forces()
>>>   File "/home/users/tadeu33/local/ase-3.4.1/ase/atoms.py", line 571, in get_forces
>>>     forces = self._calc.get_forces(self)
>>>   File "/home/users/tadeu33/local/ase-3.4.1/ase/calculators/jacapo/jacapo.py", line 2307, in get_forces
>>>     self.calculate()
>>>   File "/home/users/tadeu33/local/ase-3.4.1/ase/calculators/jacapo/jacapo.py", line 2810, in calculate
>>>     raise DacapoAbnormalTermination(s % txt)
>>> ase.calculators.jacapo.jacapo.DacapoAbnormalTermination: Dacapo output
>>> txtfile (teste332.txt) did not end normally.
>>> KPT: Chadi-Cohen asymptotic error estimate: 0.000118148205
>>> KPT: (see PRB 8, 5747 (1973); 13, 5188 (1976))
>>>
>>> KPT: nkpmem : 1
>>> Parallel : -- parallel configuration --
>>> Parallel : There are 16 processors divided into 8 groups
>>> Parallel : Processors per group : 2
>>> Parallel : k-points per processor group : 1
>>> Parallel : Each k-point is parallelized over 2 processors
>>>
>>> End : Mon Apr 25 18:59:19 BRT 2011
>>>
>>> * Command Line executed by dacapo.run script
>>>
>>> /sw/mpi/intel/openmpi-1.4.2/bin/mpirun -np 16 \
>>>   -machinefile /var/spool/torque/aux//19491.service0.ice.nacad.ufrj.br \
>>>   /home/users/tadeu33/bin/dacapo_intellinux_mpi.run \
>>>   teste332.nc -out teste332.txt -scratch /scratch/tadeu/
>>>
>>> * Netcdf Compilation
>>> cd netcdf-3.6.3
>>> export CC=icc
>>> export CXX=icpc
>>> export CFLAGS='-O3 -xssse3 -ip -no-prec-div -static'
>>> export CXXFLAGS='-O3 -xssse3 -ip -no-prec-div -static'
>>> export F77=ifort
>>> export FC=ifort
>>> export F90=ifort
>>> export FFLAGS='-O3 -xssse3'
>>> export CPP='icc -E'
>>> export CXXCPP='icpc -E'
>>> export CPPFLAGS='-DNDEBUG -DpgiFortran'
>>> ./configure --prefix=$HOME/local/netcdf-3.6.3_intel
>>> mkdir $HOME/local/netcdf-3.6.3_intel
>>> make check
>>> make install
>>>
>>>
>>>
>>> * Dacapo Compilation
>>> export BLASLAPACK='-L/sw/intel/Compiler/11.1/064/mkl/lib/em64t -lmkl_lapack -lmkl_intel_lp64 -lmkl_core -lguide -lpthread -lmkl_intel_thread'
>>> export NETCDF=$HOME/local/netcdf-3.6.3_intel/lib
>>> export FFTW=$HOME/local/fftw2-2.1.5-1.intel/lib
>>> export CC=icc
>>> export CXX=icpc
>>> export CFLAGS='-O3 -xssse3'
>>> export CXXFLAGS='-O3 -xssse3'
>>> export F77=ifort
>>> export FC=ifort
>>> export F90=ifort
>>> export FFLAGS='-O3 -xssse3'
>>> export CPP='icc -E'
>>> export CXXCPP='icpc -E'
>>> export MPIDIR=/sw/mpi/intel/openmpi-1.4.2
>>> export MPI_LIBDIR=${MPIDIR}/lib64
>>> export MPI_BINDIR=${MPIDIR}/bin
>>> export MPI_INCLUDEDIR=${MPIDIR}/include
>>> cp -a ../src/dacapo .
>>> cd dacapo/src
>>> make intellinux
>>> make intellinux MP=mpi
>>> cp intellinux_mpi/dacapo.run $HOME/bin/dacapo_intellinux_mpi.run
>>> cp intellinux_serial/dacapo.run $HOME/bin/dacapo_intellinux_serial.run
>>> cd $HOME/work/dacapo/Python
>>> python setup.py install --verbose --prefix='' --home=$HOME/local/dacapo
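After a build like the one quoted above, it can help to confirm that the MPI executable has no unresolved shared libraries before submitting jobs (a hedged sketch; `BIN` defaults to `/bin/sh` here only so the snippet runs as-is - point it at the dacapo binary in practice):

```shell
#!/bin/sh
# Report any shared libraries the dynamic linker cannot find for a binary.
# BIN is a placeholder; substitute e.g. $HOME/bin/dacapo_intellinux_mpi.run.
BIN=${BIN:-/bin/sh}
if ldd "$BIN" | grep -q "not found"; then
    echo "unresolved libraries in $BIN:"
    ldd "$BIN" | grep "not found"
else
    echo "all libraries resolved for $BIN"
fi
```

An "=> not found" entry in the ldd output usually means LD_LIBRARY_PATH differs between the build environment and the compute node.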
>>>
>>>
>>> Sincerely,
>>>
>>> Tadeu Leonardo
>>>
>>>
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>
>>> PEQ/COPPE renews its level-7 course rating, the highest from CAPES:
>>> 45 years of excellence in graduate teaching and research in
>>> Chemical Engineering.
>>>
>>> ************************************
>>>
>>> PEQ/COPPE : 45 years of commitment to excellence in teaching and
>>> research in Chemical Engineering.
>>>
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>
>>>
>>>
>> --
>> ***********************************
>>
>> Marcin Dulak
>> Technical University of Denmark
>> Department of Physics
>> Building 307, Room 229
>> DK-2800 Kongens Lyngby
>> Denmark
>> Tel.: (+45) 4525 3157
>> Fax.: (+45) 4593 2399
>> email: Marcin.Dulak at fysik.dtu.dk
>>
>> ***********************************
>>
>
>
>
--
***********************************
Marcin Dulak
Technical University of Denmark
Department of Physics
Building 307, Room 229
DK-2800 Kongens Lyngby
Denmark
Tel.: (+45) 4525 3157
Fax.: (+45) 4593 2399
email: Marcin.Dulak at fysik.dtu.dk
***********************************