From MetaCentrum


Related topics
Applications with MPI

Parallel computing can significantly shorten the run time of your job, because the job uses multiple resources at once.

MetaCentrum offers two approaches to parallel computing, OpenMP and MPI, which can also be combined.


Warning: Make sure the OMP_NUM_THREADS environment variable is set before running your job.

If your application can use multiple threads via shared memory, request a single node with multiple processors. For example:

qsub -l select=1:ncpus=4:mem=16gb:scratch_local=5gb -l walltime=24:00:00

Before running your application, make sure that the OMP_NUM_THREADS environment variable is set appropriately (the batch system should already set it automatically to the number of requested cores). Otherwise your application will use all the available cores on the node and interfere with other jobs, and the batch system will kill your job. You can check the value by issuing the command:

echo $OMP_NUM_THREADS

Notice: If you unset this variable, the batch system will probably kill your job.

Setting OMP_NUM_THREADS variable

Add the following line to your batch script:

export OMP_NUM_THREADS=$PBS_NUM_PPN # limits the number of OpenMP threads to the number of processors assigned by the scheduling system
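
The line above trusts that PBS_NUM_PPN is set and numeric. The following is a minimal sketch of a more defensive variant, not an official MetaCentrum recipe. The helper name pick_omp_threads and the fallback value of 1 (for runs outside the batch system) are assumptions made for this example.

```shell
#!/bin/bash
# Sketch: pick a thread count that never exceeds what the scheduler granted.
# PBS_NUM_PPN is the variable used in the text above; pick_omp_threads and
# the fallback of 1 are assumptions for illustration.
pick_omp_threads() {
    local granted="${PBS_NUM_PPN:-1}"
    # Fall back to a single thread on empty or non-numeric values.
    case "$granted" in
        ''|*[!0-9]*) granted=1 ;;
    esac
    # A value of 0 is not a usable thread count either.
    [ "$granted" -ge 1 ] || granted=1
    echo "$granted"
}

export OMP_NUM_THREADS="$(pick_omp_threads)"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Falling back to one thread is the conservative choice here: an under-used allocation only costs time, whereas using more cores than requested can get the job killed.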


Note: An MPI computation is run via the mpirun command.

If your application consists of multiple processes communicating via a message passing interface (see the category Applications with MPI), request a set of nodes (with an arbitrary number of processors). For example:

qsub -l select=2:ncpus=2:mem=1gb:scratch_local=2gb -l walltime=1:00:00

If you want to be sure that each chunk is placed on a different node, include the place=scatter parameter:

qsub -l select=2:ncpus=2:mem=1gb:scratch_local=2gb -l place=scatter -l walltime=1:00:00
  • Make sure the appropriate openmpi/mpich2/mpich3/lam module is loaded into your environment before running your computation.
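
A quick way to confirm that an MPI module is loaded is to check whether mpirun is on the PATH before launching. This is a sketch only. The helper name have_cmd is invented for this example, and the check says nothing about which MPI implementation was loaded.

```shell
#!/bin/bash
# Sketch: verify that an MPI launcher is available before starting the job.
# have_cmd is a hypothetical helper, not part of any MPI distribution.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd mpirun; then
    echo "mpirun found at: $(command -v mpirun)"
else
    echo "mpirun not found; load an MPI module before running the computation" >&2
fi
```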

Speed acceleration

You can request special nodes interconnected by a low-latency InfiniBand network to speed up your job. For example:

qsub -l select=4:ncpus=4:mem=1gb:scratch_local=1gb -l walltime=1:00:00 -l place=group=infiniband
  • InfiniBand is detected automatically when an MPI computation runs on such nodes, so you can start your computation in the usual way. For example:
mpirun myMPIapp

MPI and OpenMP interaction

If your application supports both types of parallelization (MPI and OpenMP), you can combine them. You have to launch your computation correctly, otherwise the job may come into conflict with the scheduling system, which will kill it. Examples of correct use together with the OpenMP library:

Note: The first two examples are interchangeable, but they can differ in calculation speed. Try both and select the faster one.

Requested resources             Example
1 node, multiple processors     mpirun -n 1 /path/to/program ...
1 node, multiple processors     mpirun /path/to/program ...
2 nodes, multiple processors    cat $PBS_NODEFILE | uniq > nodes.txt
                                mpirun -n 2 --hostfile nodes.txt /path/to/program ...
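
The two-node example hard-codes the process count. A minimal sketch of deriving it from the node file could look as follows, assuming one MPI process per node with OpenMP threads filling the cores on each node. The helper name make_hostfile is invented here, and sort -u is swapped in for uniq so that duplicates are removed even if the node file is not grouped by host. The script only prints the launch command instead of running it.

```shell
#!/bin/bash
# Sketch: build a hostfile with one line per distinct node and derive the
# MPI process count from it. PBS_NODEFILE comes from the text above;
# make_hostfile and the use of `sort -u` are choices made for this example.

# Write the distinct hosts from $1 to $2 and print how many there are.
make_hostfile() {
    sort -u "$1" > "$2" || return 1
    wc -l < "$2" | tr -d '[:space:]'
}

if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    nodes=$(make_hostfile "$PBS_NODEFILE" nodes.txt)
    # One MPI process per node (an assumption of this sketch).
    echo "mpirun -n $nodes --hostfile nodes.txt /path/to/program ..."
else
    echo "PBS_NODEFILE is not set; run this inside a batch job" >&2
fi
```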