Applications with MPI
Parallel computing can significantly shorten your job's run time, because the job uses multiple resources at once. More details on parallel computing are available in the documentation.
If your application can use multiple threads via shared memory, request a single node with multiple processors. For example:
qsub -l select=1:ncpus=4:mem=16gb:scratch_local=5gb -l walltime=24:00:00 script.sh
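For illustration, a minimal sketch of what such a script.sh could contain for a multithreaded (OpenMP) job; the program path is a placeholder, and a fallback value is added so the sketch also runs outside the batch system, where $PBS_NUM_PPN is unset:

```shell
#!/bin/bash
# Minimal sketch of a batch script for a multithreaded (OpenMP) job.
# /path/to/openmp_program is a placeholder, not a real binary.

# Use the CPU count assigned by the scheduler; fall back to 1 so the
# script also works outside the batch system (PBS_NUM_PPN unset there).
export OMP_NUM_THREADS=${PBS_NUM_PPN:-1}

echo "Running with $OMP_NUM_THREADS OpenMP thread(s)"
# /path/to/openmp_program   # placeholder for the real application
```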
Before running your application, make sure the OMP_NUM_THREADS environment variable is set appropriately (the batch system should already set it automatically to the number of requested cores). Otherwise your application will use all available cores on the node and interfere with other jobs (and the batch system will kill your job). You can check the value with:
echo $OMP_NUM_THREADS
Notice: if you unset this variable, your job will probably be killed by the batch system.
Setting OMP_NUM_THREADS variable
Type the following line into your batch script.
export OMP_NUM_THREADS=$PBS_NUM_PPN # restricts the number of parallel threads to the number of processors assigned by the scheduling system
If your application consists of multiple processes communicating via a message passing interface (MPI), request a set of nodes, each with an arbitrary number of processors. For example:
qsub -l select=2:ncpus=2:mem=1gb:scratch_local=2gb -l walltime=1:00:00 script.sh
If you want to be sure that each chunk lands on a different node, include the place=scatter parameter:
qsub -l select=2:ncpus=2:mem=1gb:scratch_local=2gb -l place=scatter -l walltime=1:00:00 script.sh
- Make sure the appropriate openmpi/mpich2/mpich3/lam module is loaded into your environment before running your computation
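For example, inside the batch script (the exact module name is site-specific; openmpi here is only an assumption, and the program path is a placeholder):

```shell
module add openmpi      # or mpich2/mpich3/lam, whichever your program was built with
mpirun /path/to/program ...
```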
You can request special nodes interconnected by a low-latency InfiniBand network to speed up inter-node communication. For example:
qsub -l select=4:ncpus=4:mem=1gb:scratch_local=1gb -l walltime=1:00:00 -l place=group=infiniband script.sh
- InfiniBand is detected automatically when an MPI computation runs on InfiniBand-connected nodes; start your computation in the usual way.
MPI and OpenMP interaction
If your application supports both types of parallelization (MPI and OpenMP), you can combine them. You have to launch your computation correctly, otherwise the job might conflict with the scheduling system, which will kill your job. Examples of correct use of the OpenMP library:
|1 node, multiple processors|
export OMP_NUM_THREADS=$PBS_NUM_PPN
mpirun -n 1 /path/to/program ...
|more nodes, 1 processor each|
export OMP_NUM_THREADS=1
mpirun /path/to/program ...
|2 nodes, multiple processors|
cat $PBS_NODEFILE | uniq >nodes.txt
export OMP_NUM_THREADS=$PBS_NUM_PPN
mpirun -n 2 --hostfile nodes.txt /path/to/program ...
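The last example hard-codes -n 2; the node count can instead be derived from $PBS_NODEFILE, which lists one line per allocated CPU (so node names repeat). A sketch, using a fabricated node file for illustration since $PBS_NODEFILE exists only inside a running job:

```shell
# Illustration only: fabricate a node file of the kind PBS provides
# (one line per allocated CPU; names repeat for multi-CPU chunks).
PBS_NODEFILE=$(mktemp)
printf 'node1\nnode1\nnode2\nnode2\n' > "$PBS_NODEFILE"

# One MPI process per node, with OpenMP threads inside each process.
sort -u "$PBS_NODEFILE" > nodes.txt
NNODES=$(wc -l < nodes.txt)
echo "node count: $NNODES"    # 2 for the sample file above

# Inside a real job one would then run (placeholder program path):
# export OMP_NUM_THREADS=$PBS_NUM_PPN
# mpirun -n "$NNODES" --hostfile nodes.txt /path/to/program ...

rm -f "$PBS_NODEFILE" nodes.txt
```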