[gpaw-users] MPI Error, Fatal Error in PMPI_Comm_dup

Jens Jørgen Mortensen jjmo at dtu.dk
Mon Aug 5 12:23:18 CEST 2019


On 8/1/19 10:38 AM, Ali Malik via gpaw-users wrote:
> Dear gpaw-users,
> 
> I have been running calculations on slabs of many systems, including 
> Cr2AlC, Cr2GaC etc. Recently I have been hitting an MPI error that 
> seems to occur randomly during execution of the job. Sometimes the 
> jobs complete, but most of the time the MPI error terminates the 
> job. I have been unable to identify the root cause of this error.
> 
> I am running the calculations on an HPC cluster, using gpaw-1.5.2, 
> intelmpi-2018.4, python-3.7.2 and scalapack-2.0.2. The HPC support desk 
> told me the error is probably due to a bug in gpaw.logger 
> (gpaw/io/logger.py, lines 32-46). Their response:
> 
>    "when you call calc.set(txt = "...") the old logfile is not closed 
> properly, only a new one is created. I suspect that you reach the limit 
> of concurrently open files".

Does it work OK if you comment out the "calc.set(txt=...)" line?
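
One way to check the support desk's hypothesis in isolation is to count how many file handles stay open when the log file is switched repeatedly. The sketch below is only a toy: ToyLogger and run are hypothetical names made up for this illustration, not GPAW's logger, and it uses nothing beyond the Python standard library.

```python
import os
import tempfile


class ToyLogger:
    """Hypothetical stand-in for a logger; this is NOT GPAW's class."""

    def __init__(self, close_old):
        self.close_old = close_old
        self.fd = None
        self.handles = []  # every handle ever opened, kept for inspection

    def set_txt(self, path):
        # Switch to a new log file, optionally closing the previous one.
        if self.close_old and self.fd is not None:
            self.fd.close()
        self.fd = open(path, "w")
        self.handles.append(self.fd)

    def open_count(self):
        # Number of handles that are still open.
        return sum(1 for f in self.handles if not f.closed)


def run(close_old, n=5):
    """Switch log files n times and report how many handles remain open."""
    log = ToyLogger(close_old)
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(n):
            log.set_txt(os.path.join(tmp, "step-%d.txt" % i))
        count = log.open_count()
        for f in log.handles:
            f.close()  # clean up before the temporary directory is removed
    return count


leaky = run(close_old=False)  # old handles are never closed
fixed = run(close_old=True)   # only the current handle stays open
print(leaky, fixed)
```

If the leaky variant reflects what happens in your job, the number of open files grows with every calc.set(txt=...) call. On Linux, resource.getrlimit(resource.RLIMIT_NOFILE) reports the per-process limit for comparison.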

Jens Jørgen

> 
> The input script and the error output file are attached. If you need 
> anything else, please feel free to ask. Any help debugging the issue 
> would be highly appreciated. Or should I report this on the bug tracker? 
> Thanks
> 
> Here is the function gpaw_optimize used in the input script; it is 
> just a wrapper:
> 
> def gpaw_optimize(atoms, calc, relax='', fmax=0.01, relaxalgorithm="BFGS",
>                   mask=None, attach=False, gpawwrite="", verbose=True,
>                   **alargs):
>      """
>          wrapper function for relaxation
> 
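>      :param atoms: ASE Atoms object
>      :param calc: calculator object
>      :param relax: string, type of relaxation: "full", "cell" or "ions"
>      :param fmax: number, force convergence criterion
>      :param relaxalgorithm: string, key into optimizer_algorithms
>      :param mask: mask passed to UnitCellFilter / StrainFilter
>      :param attach: bool, default False; if False, calc is attached to atoms
>      :param gpawwrite: filename; if non-empty, the final state is written there
>      :param verbose: bool, default True
>      :return: atoms object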
>      """
> 
>      if not attach:
> 
>          atoms.set_calculator(calc)
> 
>          if verbose:
> 
>              parprint("attaching the calculator", flush=True)
> 
>      if atoms.get_calculator() is None: # recheck
> 
>          if verbose:
> 
>              parprint("The Calculator is not attached", flush=True)
> 
>          atoms.set_calculator(calc)
>          if verbose:
> 
>              parprint("It has been attached", flush=True)
>          attach=True
> 
> 
> 
>      # relaxation algorithms
>      optimizer_algorithms = {"QuasiNewton": QuasiNewton, "BFGS": BFGS,
>                              "CG": CG, "ScBFGS": ScBFGS, "BFGSLS": BFGSLS}
> 
>      if relaxalgorithm not in optimizer_algorithms:
>          # list the valid names (the dict keys), not the class objects
>          raise KeyError("The %s is invalid or not found.\n The available "
>                         "algorithms are: %s"
>                         % (relaxalgorithm, list(optimizer_algorithms)))
> 
>      #TODO: single relax statement outside if.
> 
>      if relax == 'full':
> 
>          uf = UnitCellFilter(atoms, mask=mask)
>          relax = optimizer_algorithms[relaxalgorithm](
>              uf, logfile="rel-all.log", **alargs)
> 
>          if verbose:
> 
>              parprint("Full relaxation", flush=True)
> 
> 
>      elif relax == 'cell':
> 
>          cf = StrainFilter(atoms, mask=mask)
>          relax = optimizer_algorithms[relaxalgorithm](
>              cf, logfile="rel-cell.log", **alargs)
> 
>          if verbose:
> 
>              parprint("Cell relaxation only", flush=True)
> 
> 
>      elif relax == 'ions':  # ionic_relaxation
> 
> 
>          relax = optimizer_algorithms[relaxalgorithm](
>              atoms, logfile="rel-ionic.log", **alargs)
> 
>          if verbose:
> 
>              parprint("Ions relaxation only", flush=True)
> 
>      else:
> 
>          raise RelaxationTypeException(
>              "The entered relaxation string is incorrect")
> 
> 
>      relax.run(fmax=fmax)
> 
>      if gpawwrite:  # last state only
> 
>          calc.write(gpawwrite, mode="all")
> 
>      return atoms
> 
> 
> Best Regards,
> 
> Ali Muhammad Malik


