Monitoring Job Status

PBS and Moab provide several tools for viewing queue, system, and job status. The most commonly used are described below.

qstat

Use qstat -a to check the status of submitted jobs.

nid00004: ORNL/CCS
                                           Time In Req'd  Req'd   Elap
Job ID Username Queue    Jobname    SessID  Queue  Nodes  Time  S Time
------ -------- -------- ---------- ------ ------- ------ ----- - -----
107    user1    short    run128       5095  000:14    128 02:00 R 00:13
108    user2    long     job1         6860  000:55   1024 12:00 R 00:54
109    user1    sys      job           --      --    3500 12:00 Q   --  

Total compute nodes allocated: 1152

The first column is the ID of each job (truncated in this example), and the second column is the job's owner. The S column gives the status of each job. Common job-status values are listed below.

Status value   Meaning
------------   -------
E              Exiting after having run
H              Held
Q              Queued; eligible to run
R              Running
S              Suspended
T              Being moved to new location
W              Waiting for its execution time
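
When many jobs are listed, it can help to summarize the S column with a script. The following is a minimal Python sketch, not part of PBS or Moab. It assumes the column layout shown in the qstat -a listing above (ten whitespace-separated fields per job row, with a purely numeric truncated job ID); the function name parse_jobs and the SAMPLE text are illustrative only, and a real listing may differ.

```python
from collections import Counter

# Text copied from the qstat -a listing above. On a real system this would
# be the captured output of the command, and the layout may differ.
SAMPLE = """\
Job ID Username Queue    Jobname    SessID  Queue  Nodes  Time  S Time
------ -------- -------- ---------- ------ ------- ------ ----- - -----
107    user1    short    run128       5095  000:14    128 02:00 R 00:13
108    user2    long     job1         6860  000:55   1024 12:00 R 00:54
109    user1    sys      job           --      --    3500 12:00 Q   --
"""


def parse_jobs(text):
    """Return one dict per job row (hypothetical helper, assumed layout)."""
    jobs = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: a job row has exactly 10 fields and a numeric job ID.
        # This skips the header and the dashed separator line.
        if len(fields) != 10 or not fields[0].isdigit():
            continue
        jobs.append({
            "id": fields[0],
            "user": fields[1],
            "queue": fields[2],
            "name": fields[3],
            "nodes": int(fields[6]),
            "status": fields[8],
        })
    return jobs


jobs = parse_jobs(SAMPLE)
status_counts = Counter(job["status"] for job in jobs)
running_nodes = sum(job["nodes"] for job in jobs if job["status"] == "R")

print(dict(status_counts))
print("Nodes in use by running jobs:", running_nodes)
```

For the sample listing, the node count of the two running jobs (128 + 1024) agrees with the "Total compute nodes allocated" line.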

showq

The Moab utility showq gives a more detailed view of the queue. It groups jobs into the following states:

Active
These jobs are currently running.
Eligible
These jobs are currently queued and awaiting resources. Each user may have at most two jobs in the eligible state.
Blocked
These jobs are currently queued but are not eligible to run. Common reasons are that the job is on hold or that its owner already has two jobs in the eligible state.
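
The relationship between the three states can be illustrated with a short Python sketch. This is a simplified model, not Moab's actual algorithm: it assumes jobs are considered in list order (Moab uses its own priority ordering), and the job records, field names, and the classify function are hypothetical. Only the per-user limit of two eligible jobs comes from the description above.

```python
from collections import defaultdict

ELIGIBLE_LIMIT = 2  # per-user limit on eligible jobs, as described above

# Hypothetical job records, listed in the order they are considered.
EXAMPLE_JOBS = [
    {"id": "201", "user": "user1", "running": True,  "held": False},
    {"id": "202", "user": "user1", "running": False, "held": False},
    {"id": "203", "user": "user1", "running": False, "held": True},
    {"id": "204", "user": "user1", "running": False, "held": False},
    {"id": "205", "user": "user1", "running": False, "held": False},
    {"id": "206", "user": "user2", "running": False, "held": False},
]


def classify(jobs):
    """Map each job ID to Active, Eligible, or Blocked (simplified model)."""
    eligible_count = defaultdict(int)
    states = {}
    for job in jobs:
        if job["running"]:
            states[job["id"]] = "Active"
        elif job["held"] or eligible_count[job["user"]] >= ELIGIBLE_LIMIT:
            # On hold, or the owner already has two eligible jobs.
            states[job["id"]] = "Blocked"
        else:
            eligible_count[job["user"]] += 1
            states[job["id"]] = "Eligible"
    return states


states = classify(EXAMPLE_JOBS)
for job_id, state in states.items():
    print(job_id, state)
```

In this model, user1's third unheld queued job (205) is blocked only because two of that user's jobs are already eligible; it would become eligible once one of them starts running.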

checkjob

The Moab utility checkjob displays details of a job in the queue. For example, if job 736 is queued in a blocked state, the following command shows why it is blocked:

> checkjob 736

The output may contain a line similar to the following:

BlockMsg: job 736 violates idle HARD MAXJOB limit of 2 for user <userid> (Req: 1 InUse: 2)

This line indicates that the job is blocked because the owning user has reached the limit of two jobs in the eligible state.
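
If the limit and usage figures need to be extracted from such a line, a regular expression is usually enough. The Python sketch below is written against the single example line above; the pattern, the group names, and the parse_block_msg function are assumptions for illustration, and other BlockMsg variants may not match.

```python
import re

# The example line from above, copied verbatim (including the placeholder user).
BLOCK_MSG = (
    "BlockMsg: job 736 violates idle HARD MAXJOB limit of 2 "
    "for user <userid> (Req: 1 InUse: 2)"
)

# Assumed message shape, inferred only from the example line.
PATTERN = re.compile(
    r"BlockMsg: job (?P<job>\S+) violates (?P<policy>.+?) limit of (?P<limit>\d+) "
    r"for user (?P<user>\S+)\s+\(Req: (?P<req>\d+)\s+InUse: (?P<in_use>\d+)\)"
)


def parse_block_msg(line):
    """Return the fields of a BlockMsg line as a dict, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("limit", "req", "in_use"):
        info[key] = int(info[key])
    return info


info = parse_block_msg(BLOCK_MSG)
print(info)
```

Here InUse (2) already equals the limit (2), so the one additional job requested (Req: 1) cannot become eligible.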

showstart

The Moab utility showstart displays an estimated start time for a given job. For example:

> showstart 736
 job 736 requires 2048 procs for 1:00:00:00  

 Estimated Rsv based start in           3:41:18 on Tue March 1 19:21:18
 Estimated Rsv based completion in   1:03:41:18 on Wed March 2 19:21:18
>
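
The figures in this output are related by simple arithmetic: the estimated completion offset is the estimated start offset plus the requested walltime. The Python sketch below checks this for the example, assuming the durations use a [[DD:]HH:]MM:SS layout, which is inferred from the sample rather than from a format specification. The parse_duration helper is hypothetical.

```python
from datetime import timedelta


def parse_duration(text):
    """Convert a [[DD:]HH:]MM:SS string into a timedelta (assumed layout)."""
    parts = [int(part) for part in text.strip().split(":")]
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"unexpected duration format: {text!r}")
    # Left-pad with zeros so the fields are always days, hours, minutes, seconds.
    days, hours, minutes, seconds = [0] * (4 - len(parts)) + parts
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


walltime = parse_duration("1:00:00:00")       # requested time for job 736
start_in = parse_duration("3:41:18")          # estimated time until start
completion_in = parse_duration("1:03:41:18")  # estimated time until completion

print("Start offset + walltime:", start_in + walltime)
print("Matches reported completion offset:", start_in + walltime == completion_in)
```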

psview

psview is a Cray utility that displays job information as seen by psched, the underlying system scheduler. It shows whether a job is waiting to start, migrating, or in a system-queued state.

Queued jobs are listed in the Posted list. If a job is being migrated by the system or is waiting to start, this is indicated in the Notes column.