How to compute/Requesting resources

Related topics
  • PBS Professional
  • Scheduling system
  • Resources request examples

In January 2017 we switched to a new scheduling system, PBS Professional. The old TORQUE environment is no longer accessible. How to request resources in the new PBS Pro syntax is described in detail on this page.

General syntax of the PBS Pro qsub command:


qsub -l select=1:ncpus=1:mem=1gb -l walltime=1:00:00 -l option1 -l option2 ... script.sh
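
For example, a typical request combining these options (one chunk with 4 CPUs, 8 GB of memory, 10 GB of local scratch and a two-hour walltime; the values are purely illustrative) could look like:

qsub -l select=1:ncpus=4:mem=8gb:scratch_local=10gb -l walltime=2:00:00 script.sh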


Basic options

Please note: Only one select argument is allowed at a time.

  • maximal duration of a job – set by -l walltime=[[hh:]mm:]ss; the default walltime is 24:00:00. The q_* queues (such as q_2h, q_2d etc.) are not accessible for job submission; the default (routing) queue automatically chooses an appropriate time queue based on the specified walltime. Examples:
    • -l walltime=1:00:00 (one hour)
    • -l walltime=24:00:00 (one day)
    • -l walltime=120:00:00 (5 days)
  • number of machines and processors – the number of processors and "chunks" is set with -l select=[number]:ncpus=[number]. PBS Pro terminology defines a "chunk" as a further indivisible set of resources allocated to a job on one physical node; a job with more chunks is the analogy of a job with more nodes in the old TORQUE system. Chunks can be placed next to each other on one machine, always on different machines, or freely according to the available resources. Examples:
    • -l select=1:ncpus=2 – two processors on one chunk
    • -l select=2:ncpus=1 – two chunks each with one processor
    • -l select=1:ncpus=1+1:ncpus=2 – two chunks, one with one processor and second with two processors
    • -l select=2:ncpus=1 -l place=pack – all chunks must be on one node (if there is no node big enough, the job will never run)
    • -l select=2:ncpus=1 -l place=scatter – each chunk will be placed on a different node (the default in the old TORQUE system)
    • -l select=2:ncpus=1 -l place=free – chunks may be placed on nodes arbitrarily, according to the actual resource availability (chunks can end up on one or more nodes; default behaviour for PBS Pro)
    • if you are unsure about the number of needed processors, ask for an exclusive reservation of the whole machine using the "-l place=" parameter:
    • -l select=2:ncpus=1 -l place=exclhost – request for 2 exclusive nodes (without CPU and memory limit control)
    • -l select=3:ncpus=1 -l place=scatter:excl – exclusivity can be combined with the chunk placement specification
    • -l select=102 -l place=group=cluster – 102 chunks (102 CPUs with the default ncpus=1) placed within a single cluster
  • amount of temporary scratch – fast and reliable storage required for job processing. Always specify the type and size of scratch; a job has no default scratch assigned. The scratch type can be one of scratch_local|scratch_ssd|scratch_shared (see the example script after this list). Examples:
    • -l select=1:ncpus=1:mem=4gb:scratch_local=10gb
    • -l select=1:ncpus=1:mem=4gb:scratch_ssd=1gb
    • -l select=1:ncpus=1:mem=4gb:scratch_shared=1gb
  • after requesting scratch, the following variables are present in the job environment:
$SCRATCH_VOLUME=<dedicated capacity>
$SCRATCHDIR=<directory>
$SCRATCH_TYPE=<scratch_local|scratch_ssd|scratch_shared>
  • amount of needed memory – a job is implicitly assigned 400 MB of memory unless specified otherwise. Examples:
    • -l select=1:ncpus=1:mem=1gb
    • -l select=1:ncpus=1:mem=10gb
    • -l select=1:ncpus=1:mem=200mb
  • licence – requested with the -l parameter. Example:
    • -l select=3:ncpus=1 -l walltime=1:00:00 -l matlab=1 – one licence for Matlab
  • sending informational emails about the job state. For example:
    • -m abe – sends an email when the job aborts (a), begins (b) and completes/ends (e)
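
As an illustration only, a job script working in the assigned scratch directory might look like the sketch below (the data path and program name are hypothetical; it assumes the job was submitted with a scratch request such as -l select=1:ncpus=1:mem=4gb:scratch_local=10gb):

 #!/bin/bash
 # hypothetical input/output location in home storage
 DATADIR=/storage/brno1/home/$LOGNAME/mydata

 # stop if no scratch directory was assigned (it must be requested at submission time)
 test -n "$SCRATCHDIR" || { echo "SCRATCHDIR is not set" >&2; exit 1; }

 # copy input data to scratch, compute there, copy the results back
 cp "$DATADIR/input.txt" "$SCRATCHDIR/" || exit 1
 cd "$SCRATCHDIR"
 "$DATADIR/my_program" input.txt > output.txt   # hypothetical program
 cp output.txt "$DATADIR/" || exit 2

 # clean the scratch before the job ends
 rm -rf "$SCRATCHDIR"/*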

You can use the Command qsub refining tool to help define these conditions.

To submit a job on special nodes with a particular OS

  • To submit a job to a machine with Debian9, please use "os=debian9" in the job specification:
 zuphux$ qsub -l select=1:ncpus=2:mem=1gb:scratch_local=1gb:os=debian9 …
  • To run tasks on a machine with any OS, use "os=^any":
 zuphux$ qsub -l select=1:ncpus=2:mem=1gb:scratch_local=1gb:os=^any …
  • If you experience any problems with library or application compatibility on Debian9, please add the debian8-compat module.
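
For example, at the beginning of a job script or interactive session (assuming the standard module command used on MetaCentrum machines):

 module add debian8-compat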

Advanced options

Resources of computational nodes

The provided list of attributes may not be complete. You can find the current list on the web in the Node properties section.
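
Alternatively, the resources a particular node advertises can be listed directly with pbsnodes (the node name below is only an example):

 pbsnodes tarkil3.grid.cesnet.cz | grep resources_available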

  • node with a specific feature – the value of the feature must always be specified (either True or False). Examples:
    • -l select=1:ncpus=1:cluster=tarkil – request for a node from cluster tarkil
    • -l select=1:ncpus=1:cluster=^tarkil – request for a node from any cluster except tarkil
  • request for a specific node – always use the shortened node name. Example:
    • -l select=1:ncpus=1:vnode=tarkil3 – request for the node tarkil3.metacentrum.cz
  • request for a host – use the full host name. Example:
    • -l select=1:ncpus=1:host=tarkil3.grid.cesnet.cz
  • cgroups – request limiting of memory usage via cgroups; memory limiting by cgroups is not enabled on all machines. Example:
    • -l select=1:ncpus=1:mem=5gb:cgroups=memory
  • cgroups – request limiting of CPU usage via cgroups; CPU limiting by cgroups is not enabled on all machines. Example:
    • -l select=1:ncpus=1:mem=5gb:cgroups=cpuacct
  • networking cards – "-l place" is also used to request InfiniBand:
    • -l select=3:ncpus=1 -l walltime=1:00:00 -l place=group=infiniband
  • CPU flags – restrict submission to nodes with specific CPU flags
    • -l select=1:ncpus=1:cpu_flag=sse3
    • the list of available flags can be obtained with the command pbsnodes -a | grep resources_available.cpu_flag | awk '{print $3}' | tr ',' '\n' | sort | uniq – this list changes whenever nodes are added or removed, so it is wise to check the available flags before requesting anything special.
  • multi-CPU job on the same cluster
    • qsub -l place=group=cluster

Moving a job to another queue

To move a job to another queue, use qmove with the destination queue (including its server) and the job ID:

qmove uv@wagap-pro.cerit-sc.cz 475337.wagap-pro.cerit-sc.cz
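
To check which queue a job is currently assigned to (run qstat against the PBS server that owns the job; the job ID is taken from the example above):

 qstat -f 475337.wagap-pro.cerit-sc.cz | grep queue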


GPU computing

  • For computing on GPUs a GPU queue is used (either gpu or gpu_long can be specified). GPU queues are accessible to all MetaCentrum members; one GPU card is assigned by default. The IDs of the assigned GPU cards are stored in the CUDA_VISIBLE_DEVICES variable (see the sketch below).
    • -l select=1:ncpus=1:ngpus=2 -q gpu
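
A minimal sketch of a GPU job script that only reports which cards were assigned (the application name is hypothetical):

 #!/bin/bash
 # the scheduler exports the IDs of the assigned GPU cards
 echo "Assigned GPU card IDs: $CUDA_VISIBLE_DEVICES"
 ./my_gpu_program        # hypothetical application using the assigned cards

Such a script could be submitted, for example, with qsub -q gpu -l select=1:ncpus=1:ngpus=2 -l walltime=4:00:00 gpu_script.sh (script name and walltime are illustrative).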

Job Array

  • The job array is submitted as:
 # general command
 $ qsub -J X-Y[:Z] script.sh
 # example
 $ qsub -J 2-7:2 script.sh
  • X is the first index, Y is the upper bound of the index and Z is an optional index step; the example command above will therefore generate sub-jobs with indexes 2, 4 and 6.
  • The job array is represented by a single job whose job number is followed by "[]"; this main job provides an overview of the unfinished sub-jobs.
$ qstat -f 969390'[]' -x | grep array_state_count
    array_state_count = Queued:0 Running:0 Exiting:0 Expired:0 
  • An example of a sub-job ID is 969390[1].arien-pro.ics.muni.cz.
  • Sub-jobs can be queried with the qstat command (qstat -t).
  • PBS Pro uses PBS_ARRAY_INDEX inside a sub-job instead of TORQUE's PBS_ARRAYID. The variable PBS_ARRAY_ID contains the job ID of the main job (see the sketch below).
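
A minimal sketch of a job array script, assuming input files named input2.txt, input4.txt and input6.txt exist in the submission directory (file names and the program are hypothetical):

 #!/bin/bash
 cd "$PBS_O_WORKDIR"                                    # directory the job was submitted from
 INPUT=input${PBS_ARRAY_INDEX}.txt                      # e.g. input2.txt for the sub-job with index 2
 ./my_program "$INPUT" > output${PBS_ARRAY_INDEX}.txt   # hypothetical program

Submitted with qsub -J 2-7:2 script.sh as above, this produces three sub-jobs processing input2.txt, input4.txt and input6.txt.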

MPI processes

  • The number of MPI processes that will run on one chunk is specified by mpiprocs=[number] (see the launch sketch after this list).
  • For each MPI process there is one line in the nodefile $PBS_NODEFILE that specifies the allocated vnode.
    • -l select=3:ncpus=2:mpiprocs=2 – 6 MPI processes (the nodefile contains 6 lines with vnode names), 2 MPI processes always share 1 vnode with 2 CPUs
  • The number of OpenMP threads that will run in one chunk is set by ompthreads=[number]; the default is ompthreads=ncpus, i.e. 2 OpenMP threads per chunk in the example above.
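
A minimal launch sketch matching the example above (the module name and program are hypothetical; some MPI implementations integrated with PBS pick up the nodefile automatically, in which case -machinefile can be omitted):

 #!/bin/bash
 module add openmpi                                    # hypothetical module name
 cd "$PBS_O_WORKDIR"
 # 6 MPI processes, one per line of $PBS_NODEFILE (3 chunks x 2 mpiprocs)
 mpirun -machinefile "$PBS_NODEFILE" ./my_mpi_program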

Default working directory setting in shell

For a shell in tmux, add the following command to the .bashrc in your /home directory:

    case "$-" in *i*) cd /storage/brno1/home/LOGNAME/ ;; esac

To use your /home directory instead:

    case "$-" in *i*) cd ;; esac