[ase-users] Dacapo/ Altix

Marcin Dulak Marcin.Dulak at fysik.dtu.dk
Fri Apr 29 08:49:55 CEST 2011


Hi,

make sure that the stack size is actually increased during the run on the 
compute node - print it in the batch job (see the sketch below).
There is a chance that another compiler (for example http://www.open64.net/) 
would not show the problem, but that is probably too much work.
If this does not help, limit this job to a maximum of 8 cores.
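
For example, a minimal sketch of what the batch job script could do before 
the mpirun line (the PBS directive and the fallback value are illustrative 
only, and on a multi-node job the limit that matters is the one in effect 
on every compute node):

    #!/bin/sh
    #PBS -l nodes=2:ppn=8
    # print the limits actually in effect where the script runs
    ulimit -a
    # try to lift the stack limit; fall back to a large finite value
    # if the hard limit does not allow 'unlimited'
    ulimit -s unlimited || ulimit -s 4096000
    # verify the value that dacapo will inherit
    ulimit -s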

Best regards,

Marcin

Tadeu Leonardo Soares e Silva wrote:
> Dear Marcin,
>
> Stack size = 1 GB
>
> This job runs on 8 cores (please see below), but it does not run on 16 
> cores. What should I do? Should I compile again? Could that be the 
> problem?
>
>
>
> * 8 cores
>
> ulimit -a
>
>
> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 63955
> max locked memory       (kbytes, -l) unlimited
> max memory size         (kbytes, -m) 6966088
> open files                      (-n) 16384
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 1024000
> cpu time               (seconds, -t) 300
> max user processes              (-u) 63955
> virtual memory          (kbytes, -v) 8156400
> file locks                      (-x) unlimited
>
>
>
> * 8 cores
> /sw/mpi/intel/openmpi-1.4.2/bin/mpirun -np 8 -machinefile /var/spool/torque/aux//19631.service0.ice.nacad.ufrj.br /home/users/tadeu33/bin/dacapo_intellinux_mpi.run teste332.nc -out teste332.txt -scratch /scratch/tadeu/
>
>
> * 8 cores 
> =================================================================
>                    NACAD Supercomputer Center
> =================================================================
> --- Job Information ---
>    Cores available per node = 8
>    Cores allocated = 8
>    Cores used      = 8
> --- Nodes:cores Used ---
> r1i1n5:8
> =================================================================
>  
> Start : Wed Apr 27 12:37:00 BRT 2011
> BFGSLineSearch:   0  13:33:43   -19023.094270       0.6165
> BFGSLineSearch:   1  15:03:59   -19023.351442       0.2375
> BFGSLineSearch:   2  16:21:32   -19023.410924       0.0359
> -19023.4109243
> [[ 0.          0.          0.        ]
>  [ 0.          0.          0.        ]
>  [ 0.          0.          0.        ]
>  [ 0.          0.          0.        ]
>  [ 0.          0.          0.        ]
>  [ 0.          0.          0.        ]
>  [ 0.          0.          0.        ]
>  [ 0.          0.          0.        ]
>  [ 0.          0.          0.        ]
>  [-0.00116696 -0.00022275  0.03587158]
>  [-0.00090078 -0.00051726  0.03563763]
>  [-0.00137682 -0.00057754  0.03560661]
>  [-0.00102949 -0.00064692  0.03547285]
>  [-0.00119101 -0.00090526  0.0355944 ]
>  [-0.00126022 -0.00073166  0.03546429]
>  [-0.00160995 -0.00092908  0.03582606]
>  [-0.00077982 -0.00089796  0.03586229]
>  [-0.0010722  -0.00056965  0.03547968]]
> End : Wed Apr 27 16:21:32 BRT 2011
>
>
> * 16 cores
>
> /sw/mpi/intel/openmpi-1.4.2/bin/mpirun -np 16 -machinefile /var/spool/torque/aux//19637.service0.ice.nacad.ufrj.br /home/users/tadeu33/bin/dacapo_intellinux_mpi.run teste332.nc -out teste332.txt -scratch /scratch/tadeu/
>
> *  16 cores
>
>
> =================================================================
>                    NACAD Supercomputer Center
> =================================================================
> --- Job Information ---
>    Cores available per node = 8
>    Cores allocated = 16
>    Cores used      = 16
> --- Nodes:cores Used ---
> r1i1n12:8	  r1i1n11:8
> =================================================================
>  
> Start : Wed Apr 27 18:29:56 BRT 2011
> Traceback (most recent call last):
>   File "./teste.py", line 24, in <module>
>     dyn.run(fmax=0.05)
>   File "/home/users/tadeu33/local/ase-3.4.1/ase/optimize/optimize.py", line 114, in run
>     f = self.atoms.get_forces()
>   File "/home/users/tadeu33/local/ase-3.4.1/ase/atoms.py", line 571, in get_forces
>     forces = self._calc.get_forces(self)
>   File "/home/users/tadeu33/local/ase-3.4.1/ase/calculators/jacapo/jacapo.py", line 2307, in get_forces
>     self.calculate()
>   File "/home/users/tadeu33/local/ase-3.4.1/ase/calculators/jacapo/jacapo.py", line 2810, in calculate
>     raise DacapoAbnormalTermination(s % txt)
> ase.calculators.jacapo.jacapo.DacapoAbnormalTermination: Dacapo output txtfile (teste332.txt) did not end normally.
>  KPT: Chadi-Cohen asymptotic error estimate:  0.000118148205
>  KPT: (see PRB 8, 5747 (1973); 13, 5188 (1976))
>   
>  KPT: nkpmem :            1
>  Parallel :   --  parallel configuration -- 
>  Parallel : There are 16 processors divided into  8 groups
>  Parallel : Processors per group         :  2
>  Parallel : k-points per processor group :  1
>  Parallel : Each k-point is parallelized over  2 processors
>
>
>
> Sincerely,
>
> Tadeu
>
> On Wed, 27 Apr 2011 10:17:28 +0200, Marcin Dulak wrote:
>> Hi,
>>
>> does this job run on 8 cores?
>> Increasing the stack size may help: 
>> https://wiki.fysik.dtu.dk/dacapo/Installation#id33
>>
>> Best regards,
>>
>> Marcin
>>
>> Tadeu Leonardo Soares e Silva wrote:
>>> Dear Marcin
>>>
>>> I have installed Dacapo on Rocks/CentOS, but I cannot install Dacapo 
>>> on OpenSUSE/SGI Altix. What could be causing this problem? Could you 
>>> help me, please?
>>>
>>> Please find some files attached.
>>>
>>>
>>> * ldd dacapo_intellinux_mpi.run
>>>
>>> linux-vdso.so.1 =>  (0x00007fffcbd8e000)
>>> libmkl_lapack.so => /sw/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so (0x00007ff44378f000)
>>> libmkl_intel_lp64.so => /sw/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_lp64.so (0x00007ff443395000)
>>> libmkl_core.so => /sw/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_core.so (0x00007ff442fe2000)
>>> libguide.so => /sw/intel/Compiler/11.1/072/lib/intel64/libguide.so (0x00007ff44446a000)
>>> libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff442dc5000)
>>> libmkl_intel_thread.so => /sw/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so (0x00007ff441b81000)
>>> libmpi_f90.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libmpi_f90.so.0 (0x00007ff44197d000)
>>> libmpi_f77.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libmpi_f77.so.0 (0x00007ff441741000)
>>> libmpi.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libmpi.so.0 (0x00007ff44146f000)
>>> libopen-rte.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libopen-rte.so.0 (0x00007ff441207000)
>>> libopen-pal.so.0 => /sw/mpi/intel/openmpi-1.4.2/lib64/libopen-pal.so.0 (0x00007ff440f8a000)
>>> libdl.so.2 => /lib64/libdl.so.2 (0x00007ff440d86000)
>>> libnsl.so.1 => /lib64/libnsl.so.1 (0x00007ff440b6e000)
>>> libutil.so.1 => /lib64/libutil.so.1 (0x00007ff44096b000)
>>> libm.so.6 => /lib64/libm.so.6 (0x00007ff440715000)
>>> libc.so.6 => /lib64/libc.so.6 (0x00007ff4403b7000)
>>> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ff4401a0000)
>>> /lib64/ld-linux-x86-64.so.2 (0x00007ff4443f6000)
>>> libifport.so.5 => /sw/intel/Compiler/11.1/072/lib/intel64/libifport.so.5 (0x00007ff440067000)
>>> libifcoremt.so.5 => /sw/intel/Compiler/11.1/072/lib/intel64/libifcoremt.so.5 (0x00007ff43fdc2000)
>>> libimf.so => /sw/intel/Compiler/11.1/072/lib/intel64/libimf.so (0x00007ff43fa2e000)
>>> libsvml.so => /sw/intel/Compiler/11.1/072/lib/intel64/libsvml.so (0x00007ff43f818000)
>>> libintlc.so.5 => /sw/intel/Compiler/11.1/072/lib/intel64/libintlc.so.5 (0x00007ff43f6da000)
>>>
>>>
>>> * Log openmpi.o19491
>>> =================================================================
>>>                    NACAD Supercomputer Center
>>> =================================================================
>>> --- Job Information ---
>>>    Cores available per node = 8
>>>    Cores allocated = 16
>>>    Cores used      = 16
>>> --- Nodes:cores Used ---
>>> r1i1n10:8	  r1i1n9:8
>>> =================================================================
>>>  
>>> Start : Mon Apr 25 18:59:17 BRT 2011
>>> Traceback (most recent call last):
>>>   File "./teste.py", line 24, in <module>
>>>     dyn.run(fmax=0.05)
>>>   File "/home/users/tadeu33/local/ase-3.4.1/ase/optimize/optimize.py", line 114, in run
>>>     f = self.atoms.get_forces()
>>>   File "/home/users/tadeu33/local/ase-3.4.1/ase/atoms.py", line 571, in get_forces
>>>     forces = self._calc.get_forces(self)
>>>   File "/home/users/tadeu33/local/ase-3.4.1/ase/calculators/jacapo/jacapo.py", line 2307, in get_forces
>>>     self.calculate()
>>>   File "/home/users/tadeu33/local/ase-3.4.1/ase/calculators/jacapo/jacapo.py", line 2810, in calculate
>>>     raise DacapoAbnormalTermination(s % txt)
>>> ase.calculators.jacapo.jacapo.DacapoAbnormalTermination: Dacapo output txtfile (teste332.txt) did not end normally.
>>>  KPT: Chadi-Cohen asymptotic error estimate:  0.000118148205
>>>  KPT: (see PRB 8, 5747 (1973); 13, 5188 (1976))
>>>   
>>>  KPT: nkpmem :            1
>>>  Parallel :   --  parallel configuration -- 
>>>  Parallel : There are 16 processors divided into  8 groups
>>>  Parallel : Processors per group         :  2
>>>  Parallel : k-points per processor group :  1
>>>  Parallel : Each k-point is parallelized over  2 processors
>>>
>>> End : Mon Apr 25 18:59:19 BRT 2011
>>>
>>> * Command line executed by the dacapo.run script
>>>
>>> /sw/mpi/intel/openmpi-1.4.2/bin/mpirun -np 16 -machinefile /var/spool/torque/aux//19491.service0.ice.nacad.ufrj.br /home/users/tadeu33/bin/dacapo_intellinux_mpi.run teste332.nc -out teste332.txt -scratch /scratch/tadeu/
>>>
>>> * NetCDF compilation
>>> cd netcdf-3.6.3
>>> export CC=icc
>>> export CXX=icpc
>>> export CFLAGS='-O3 -xssse3 -ip -no-prec-div -static'
>>> export CXXFLAGS='-O3 -xssse3 -ip -no-prec-div -static'
>>> export F77=ifort
>>> export FC=ifort
>>> export F90=ifort
>>> export FFLAGS='-O3 -xssse3'
>>> export CPP='icc -E'
>>> export CXXCPP='icpc -E'
>>> export CPPFLAGS='-DNDEBUG -DpgiFortran'
>>> ./configure --prefix=$HOME/local/netcdf-3.6.3_intel
>>> mkdir $HOME/local/netcdf-3.6.3_intel
>>> make check
>>> make install
>>>
>>>
>>>
>>> * Dacapo compilation
>>> export BLASLAPACK='-L/sw/intel/Compiler/11.1/064/mkl/lib/em64t -lmkl_lapack -lmkl_intel_lp64 -lmkl_core -lguide -lpthread -lmkl_intel_thread'
>>> export NETCDF=$HOME/local/netcdf-3.6.3_intel/lib
>>> export FFTW=$HOME/local/fftw2-2.1.5-1.intel/lib
>>> export CC=icc
>>> export CXX=icpc
>>> export CFLAGS='-O3 -xssse3'
>>> export CXXFLAGS='-O3 -xssse3'
>>> export F77=ifort
>>> export FC=ifort
>>> export F90=ifort
>>> export FFLAGS='-O3 -xssse3'
>>> export CPP='icc -E'
>>> export CXXCPP='icpc -E'
>>> export MPIDIR=/sw/mpi/intel/openmpi-1.4.2
>>> export MPI_LIBDIR=${MPIDIR}/lib64
>>> export MPI_BINDIR=${MPIDIR}/bin
>>> export MPI_INCLUDEDIR=${MPIDIR}/include
>>> cp -a ../src/dacapo .
>>> cd dacapo/src
>>> make intellinux
>>> make intellinux MP=mpi
>>> cp intellinux_mpi/dacapo.run $HOME/bin/dacapo_intellinux_mpi.run
>>> cp intellinux_serial/dacapo.run $HOME/bin/dacapo_intellinux_serial.run
>>> cd $HOME/work/dacapo/Python
>>> python setup.py install --verbose --prefix='' --home=$HOME/local/dacapo
>>>
>>>
>>> Sincerely,
>>>
>>> Tadeu Leonardo
>>>
>>>
>> -- 
>> ***********************************
>>
>> Marcin Dulak
>> Technical University of Denmark
>> Department of Physics
>> Building 307, Room 229
>> DK-2800 Kongens Lyngby
>> Denmark
>> Tel.: (+45) 4525 3157
>> Fax.: (+45) 4593 2399
>> email: Marcin.Dulak at fysik.dtu.dk
>>
>> ***********************************
>>     
>
>

-- 
***********************************
 
Marcin Dulak
Technical University of Denmark
Department of Physics
Building 307, Room 229
DK-2800 Kongens Lyngby
Denmark
Tel.: (+45) 4525 3157
Fax.: (+45) 4593 2399
email: Marcin.Dulak at fysik.dtu.dk

***********************************



