How to compute/Job management
Related topics: A job tracking video tutorial
This page explains how to
- track the status of a job
- retrieve the data if a job fails
- clean the scratch directory manually
Tracking job status
You can track your jobs in the online application PBSmon:
- Waiting jobs http://metavo.metacentrum.cz/pbsmon2/queues/jobsQueued
- Personal view http://metavo.metacentrum.cz/pbsmon2/person (replace "person" with your META login)
qstat
- A completed job is in the state "F" (finished).
- The qstat command shows only waiting and running jobs. To display finished jobs as well, use qstat -x.
- For smaller groups of jobs, PBS Pro can display the estimated start time (Est Start Time). To see it, use qstat -T.
You can also track a job from the terminal using its ID (jobID). For example:
- qstat -u <login>: list all of the user's running or queued jobs on the current PBS server
- qstat -u <login> @arien-pro.ics.muni.cz @wagap-pro.cerit-sc.cz @pbs.elixir-czech.cz: list all of the user's running or queued jobs on all PBS servers
- qstat -x -u <login>: list the user's jobs, including finished ones
- qstat -f <jobID>: list details of a running or queued job
- qstat -x -f <jobID>: list details of a finished job
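The qstat calls above can be combined into a small polling helper. The following is a minimal sketch, not an official MetaCentrum tool: the function names job_state and wait_for_job are made up for illustration, and the sketch assumes that qstat -x -f prints a line of the form "job_state = X" for the job.

```shell
# Print the one-letter state of a job (e.g. Q, R or F).
# Assumption: the output of `qstat -x -f <jobID>` contains a line
# of the form "    job_state = X".
job_state() {
    qstat -x -f "$1" | sed -n 's/^[[:space:]]*job_state = //p'
}

# Poll until the job reaches state F (finished).
# $1 = job ID, $2 = polling interval in seconds (default 60).
# Returns 1 if qstat reports no state at all (e.g. unknown job ID).
wait_for_job() {
    while :; do
        state=$(job_state "$1")
        case "$state" in
            F)  return 0 ;;
            "") echo "no state reported for $1" >&2; return 1 ;;
        esac
        sleep "${2:-60}"
    done
}
```

A call such as wait_for_job 1234.arien-pro.ics.muni.cz 120 would then block until the job is reported as finished. A long interval is preferable so that the PBS server is not queried more often than needed.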
Tracking running jobs
Follow these steps to check the output of a job that has not finished yet:
1. Find out which machine your job is running on, e.g. using PBSmon (https://metavo.metacentrum.cz/pbsmon2/person)
2. Log in to that machine from any frontend with the ssh target_machine command, e.g.:
ssh zapat112.cerit-sc.cz
3. Navigate to the /var/spool/pbs/spool/ directory and examine the files:
- $PBS_JOBID.OU for standard output (stdout – e.g., “1234.arien-pro.ics.muni.cz.OU”)
- $PBS_JOBID.ER for standard error output (stderr – e.g., “1234.arien-pro.ics.muni.cz.ER”)
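The steps above can be sketched as a single helper that runs from a frontend. This is only an illustration under stated assumptions: the names SPOOL_DIR and peek_job are hypothetical, the host name has to be looked up beforehand (for example in PBSmon), and you need read access to your job's files in the spool directory on that node.

```shell
# Spool directory on the execution node, as given in step 3.
SPOOL_DIR=/var/spool/pbs/spool

# Show the last lines of a running job's stdout (.OU) and stderr (.ER).
# $1 = execution host, $2 = job ID, $3 = number of lines (default 20).
peek_job() {
    ssh "$1" "tail -n ${3:-20} $SPOOL_DIR/$2.OU $SPOOL_DIR/$2.ER"
}
```

For example, peek_job zapat112.cerit-sc.cz 1234.arien-pro.ics.muni.cz 50 would print the last 50 lines of both files for that job. Because tail receives two files, it prefixes each part of the output with a header naming the file.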