How to check the state of your jobs
High level overview via web browser
If you want to monitor whether and how your jobs are progressing, a poor man's high-level overview named CondorWatch is available. There you can find all jobs currently known to the system, with submit hosts as columns and user/accounting tags as rows.
Each populated field shows three pieces of information. The first is the number of jobs currently running; the second is the number of jobs that encountered an error and are now awaiting further handling by the user (see below). The third is the number of jobs currently waiting to be scheduled to a free resource. If jobs require multiple CPU cores, the total number of CPU cores is shown in parentheses after each number.
Command line interface on the submit host
For a rough overview, the command
will yield a summary for your user - your user name is an implicit argument to
If you want more details, you can get them by running
As with all command line options for condor executables, you can shorten an option name as long as the shortened form stays unique, e.g.
will still work but
Where are my jobs running?
To list on which machine a job runs, the option
may be used, e.g.
condor_q -nob -run
1517656.13 username 5/29 13:12 0+00:19:44 email@example.com
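means that job number 1517656.13 was submitted on May 29th at 13:12 UTC. When the command was issued, the job had accumulated a total wall clock run time of 19 minutes and 44 seconds (the run time column reads days+hours:minutes:seconds) and was running on the node shown in the last column.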
One could now log in to that node and inspect the system state with the usual monitoring tools.
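When many jobs are running, it can be handy to process such lines in a script. The following is only a minimal sketch: it assumes each job line has exactly the six whitespace-separated fields of the example above (job ID, owner, submit date, submit time, run time, host), and the function names `parse_run_time` and `parse_run_line` are hypothetical helpers, not part of HTCondor. Header and summary lines of the real output are not handled.

```python
from datetime import timedelta


def parse_run_time(text: str) -> timedelta:
    """Convert a run time such as '0+00:19:44' (days+HH:MM:SS) to a timedelta."""
    days, clock = text.split("+")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return timedelta(days=int(days), hours=hours, minutes=minutes, seconds=seconds)


def parse_run_line(line: str) -> dict:
    """Split one job line into named fields.

    Assumes the six-column layout of the example above; raises ValueError
    if the line has a different number of fields.
    """
    job_id, owner, date, time, run_time, host = line.split()
    cluster, proc = job_id.split(".")
    return {
        "cluster": int(cluster),   # part before the dot in the job ID
        "proc": int(proc),         # part after the dot in the job ID
        "owner": owner,
        "submitted": f"{date} {time}",
        "run_time": parse_run_time(run_time),
        "host": host,
    }


if __name__ == "__main__":
    sample = "1517656.13 username 5/29 13:12 0+00:19:44 email@example.com"
    job = parse_run_line(sample)
    print(job["cluster"], job["proc"], job["run_time"], job["host"])
```

Converting the run time to a `timedelta` makes it easy to sort jobs by run time or to flag jobs that have been running longer than expected.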
How to debug failed jobs?
A failed job (usually one that exited with a non-zero exit code) will typically be placed into the "hold" state. The error messages can be displayed by running
condor_q -nob -hold
45344094.0 user name 5/28 11:27 Error from firstname.lastname@example.org: Job has gone over memory limit of 8064 megabytes. Peak usage: 8048 megabytes.
In this case, the job requested about 8 GB of memory and was put on hold once its usage reached or exceeded that limit. The slight discrepancy between the two numbers shown here may arise from differences in how they are measured or computed.
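To pull the two numbers out of such a message automatically, a small script can help. The sketch below assumes the hold reason has exactly the wording shown in the example above; other hold reasons, or a differently worded message, simply yield `None`. The name `memory_hold_details` is a hypothetical helper, not an HTCondor function.

```python
import re

# Pattern for the memory-limit hold reason as worded in the example above.
MEMORY_HOLD = re.compile(
    r"memory limit of (?P<limit>\d+) megabytes\. "
    r"Peak usage: (?P<peak>\d+) megabytes"
)


def memory_hold_details(hold_reason: str):
    """Return (limit_mb, peak_mb) for a memory-limit hold reason, else None."""
    match = MEMORY_HOLD.search(hold_reason)
    if match is None:
        return None
    return int(match["limit"]), int(match["peak"])


if __name__ == "__main__":
    reason = (
        "Error from firstname.lastname@example.org: Job has gone over memory "
        "limit of 8064 megabytes. Peak usage: 8048 megabytes."
    )
    print(memory_hold_details(reason))
```

Comparing the reported peak with the limit gives a starting point for deciding how much memory to request when resubmitting the job.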