[ase-users] ase.io.read error with mpirun

Ask Hjorth Larsen asklarsen at gmail.com
Thu Aug 22 14:30:53 CEST 2019


Hi,

I would look at this but don't have time.  Can someone open an issue
and attach a small script which reproduces the error?
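
For example, something along these lines (untested; substitute any small
.cif or .traj file for the filename) should be enough, run with e.g.
mpirun -np 2 gpaw-python repro.py:

from ase.io import read

# reading a structure file is what triggers the failing broadcast below
atoms = read('anyfile.cif')
print(len(atoms))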

Best regards
Ask

On Wed, Aug 21, 2019 at 16:17, Zeeshan Ahmad via ase-users
<ase-users at listserv.fysik.dtu.dk> wrote:
>
> On another cluster, the error when using mpirun with read is: TypeError: Not a proper NumPy array for MPI communication.
>
> rank=000 L00: Traceback (most recent call last):
> rank=000 L01:   File "/home/azeeshan/miniconda3/envs/gpaw/lib/python3.7/site-packages/ase/parallel.py", line 244, in new_generator
> rank=000 L02:     broadcast((None, result))
> rank=000 L03:   File "/home/azeeshan/miniconda3/envs/gpaw/lib/python3.7/site-packages/ase/parallel.py", line 181, in broadcast
> rank=000 L04:     comm.broadcast(string, root)
> rank=000 L05: TypeError: Not a proper NumPy array for MPI communication.
> rank=000 L06:
> rank=000 L07: During handling of the above exception, another exception occurred:
> rank=000 L08:
> rank=000 L09: Traceback (most recent call last):
> rank=000 L10:   File "/home/azeeshan/miniconda3/envs/gpaw/lib/python3.7/site-packages/gpaw/__init__.py", line 201, in main
> rank=000 L11:     runpy.run_path(gpaw_args.script, run_name='__main__')
> rank=000 L12:   File "/home/azeeshan/miniconda3/envs/gpaw/lib/python3.7/runpy.py", line 263, in run_path
> rank=000 L13:     pkg_name=pkg_name, script_name=fname)
> rank=000 L14:   File "/home/azeeshan/miniconda3/envs/gpaw/lib/python3.7/runpy.py", line 96, in _run_module_code
> rank=000 L15:     mod_name, mod_spec, pkg_name, script_name)
> rank=000 L16:   File "/home/azeeshan/miniconda3/envs/gpaw/lib/python3.7/runpy.py", line 85, in _run_code
> rank=000 L17:     exec(code, run_globals)
> rank=000 L18:   File "F_str.py", line 7, in <module>
> rank=000 L19:     atoms = read('../Lisurf_lc_rot_F_532.traj')
> rank=000 L20:   File "/home/azeeshan/miniconda3/envs/gpaw/lib/python3.7/site-packages/ase/io/formats.py", line 498, in read
> rank=000 L21:     parallel=parallel, **kwargs))
> rank=000 L22:   File "/home/azeeshan/miniconda3/envs/gpaw/lib/python3.7/site-packages/ase/parallel.py", line 247, in new_generator
> rank=000 L23:     broadcast((ex, None))
> rank=000 L24:   File "/home/azeeshan/miniconda3/envs/gpaw/lib/python3.7/site-packages/ase/parallel.py", line 181, in broadcast
> rank=000 L25:     comm.broadcast(string, root)
> rank=000 L26: TypeError: Not a proper NumPy array for MPI communication.
> GPAW CLEANUP (node 0): <class 'TypeError'> occurred.  Calling MPI_Abort!
> application called MPI_Abort(MPI_COMM_WORLD, 42) - process 0
>
>
>
>
> On Aug 20, 2019, at 5:15 PM, Zeeshan Ahmad <azeeshan at cmu.edu> wrote:
>
> Hi,
>
> I am able to run the following file in serial (gpaw-python file.py) but not with mpirun:
>
> from ase.io import read, write
> from ase.build import bulk
>
> atoms = read('anyfile.cif')
>
> The error when running mpirun -np 4 gpaw-python file.py is the following (I have reproduced the same error with different cif and traj files):
>
> Fatal error in PMPI_Bcast: Other MPI error, error stack:
> PMPI_Bcast(1600)........: MPI_Bcast(buf=0x5653ca2b0f60, count=17432, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
> MPIR_Bcast_impl(1452)...:
> MPIR_Bcast(1476)........:
> MPIR_Bcast_intra(1249)..:
> MPIR_SMP_Bcast(1088)....:
> MPIR_Bcast_binomial(250): message sizes do not match across processes in the collective routine: Received 8 but expected 17432
> Fatal error in PMPI_Bcast: Other MPI error, error stack:
> PMPI_Bcast(1600)........: MPI_Bcast(buf=0x555eb9c60020, count=17432, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
> MPIR_Bcast_impl(1452)...:
> MPIR_Bcast(1476)........:
> MPIR_Bcast_intra(1249)..:
> MPIR_SMP_Bcast(1088)....:
> MPIR_Bcast_binomial(250): message sizes do not match across processes in the collective routine: Received 8 but expected 17432
> Fatal error in PMPI_Bcast: Other MPI error, error stack:
> PMPI_Bcast(1600)........: MPI_Bcast(buf=0x55b0603a7f60, count=17432, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
> MPIR_Bcast_impl(1452)...:
> MPIR_Bcast(1476)........:
> MPIR_Bcast_intra(1249)..:
> MPIR_SMP_Bcast(1088)....:
> MPIR_Bcast_binomial(310): Failure during collective
>
> The problem is with the read method. When I replace the script contents with:
>
> from ase.io import read, write
> from ase.build import bulk
>
> atoms = bulk('Li', a=3.4)
>
> I don't get any error.
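> A possible workaround (I have not verified it) would be to skip the parallel broadcast and let every rank read the file on its own, using the parallel keyword that also appears in the traceback above:
>
> from ase.io import read
>
> # each MPI rank reads the file itself instead of rank 0 reading and broadcasting
> atoms = read('anyfile.cif', parallel=False)
>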
> I installed ase and gpaw using conda: conda install -c conda-forge gpaw
>
> My gpaw and ase versions are 1.5.2 and 3.18.0, respectively.
>
> Thanks,
> -Zeeshan
>
>
> --
> Zeeshan Ahmad
> PhD candidate, Mechanical Engineering
> Carnegie Mellon University
>
> _______________________________________________
> ase-users mailing list
> ase-users at listserv.fysik.dtu.dk
> https://listserv.fysik.dtu.dk/mailman/listinfo/ase-users

