Durham COSMA utilities

Batch queue utilities

The COSMA batch queues are implemented using PLATFORM LSF, which provides a number of commands for inspecting the queues. These do not always present information in a convenient or targeted format, so a number of local utilities are also available.

Inspecting the state of the queues (showq,cutilisation)

The most useful utility is:

      showq

It shows a more readable form than the standard bjobs command, and also shows how many nodes and cores are in use, likely start times, etc.

You can also see which nodes are not currently running any jobs using the command:

      cutilisation

This gives a rough idea of what resources are not being used (ordered by queue), although for the main time limited queues (cosma, cosma-prince, cosma5, cosma5-prince, cosma5-pauper) the backfill utilities give a better idea.
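For comparison, the standard LSF commands that the local utilities improve on can be wrapped in a small shell function. This is only a sketch: the function name lsf_overview is hypothetical (it is not a COSMA utility), and the default queue name is simply one of the queues listed above. It relies on the standard LSF commands bqueues and bjobs with their -u and -q options.

```shell
# Hypothetical helper, not a COSMA utility: print the raw LSF view of one queue.
lsf_overview() {
    queue="${1:-cosma5}"    # queue to inspect; the default is an arbitrary choice

    # Bail out cleanly on machines where LSF is not installed.
    if ! command -v bjobs >/dev/null 2>&1; then
        echo "LSF commands not found on this machine" >&2
        return 1
    fi

    bqueues "$queue"            # status and job counts for the queue
    bjobs -u all -q "$queue"    # jobs from all users in that queue
}
```

Calling lsf_overview cosma5-pauper on a login node would then print the standard LSF listing for that queue.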

Getting a job to run when the queues are busy (c4backfill, c4backfill)

For the time limited queues the best way to see what resources are free is to use one of the backfill commands, such as c4backfill. The no limit queues, cordelia and the shm4 and shm5 queues can be understood using the cutilisation and showq commands.

Miscellaneous utilities

Finding out your disk quota (c4quota)

Your quota and usage on the main /cosma/home, /gpfs and /cosma5 GPFS partitions can be reported using the c4quota command. Here's an example:

   > c4quota
   Quota for pdraper
   Filesystem     usage      quota      limit      number-of-files
   ---------------------------------------------------------------
   /cosma         2130.47MB  14.5371GB  14.9082GB  25030
   /gpfs          61.5715GB  1024GB     1100GB     486093
   /cosma5        0GB        0GB        0GB        
    

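If you want a quick view of how close you are to each quota, the report can be post-processed with awk. The following is a minimal sketch, not part of c4quota: the function name quota_percent is hypothetical, it assumes the column layout shown in the example above, it assumes 1GB = 1024MB, and it treats a quota of 0GB as "no quota set".

```shell
# Hypothetical helper: read c4quota-style output on stdin and print the
# percentage of each quota that is in use.
quota_percent() {
    awk '
        # Convert a value such as "2130.47MB" or "14.5371GB" to GB.
        # Adding 0 makes awk read the leading number and ignore the unit.
        function to_gb(v,    n) {
            n = v + 0
            if (v ~ /MB$/) return n / 1024    # assumption: 1GB = 1024MB
            return n
        }
        # Only lines whose first field is a filesystem path are data rows.
        $1 ~ /^\// {
            used  = to_gb($2)
            quota = to_gb($3)
            if (quota > 0)
                printf "%-10s %5.1f%% of quota used\n", $1, 100 * used / quota
            else
                printf "%-10s no quota set\n", $1
        }
    '
}
```

On COSMA this would be used as c4quota | quota_percent; the header and separator lines are skipped because they do not start with a path.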

Running PLATFORM MPI jobs interactively (module load interactive_platform_mpi)

Normally PLATFORM MPI jobs will only run when submitted to a batch queue, which is awkward when developing code (although you can use the -I flag to interact directly with a running job, and use the ddt debugger this way). To work around this, set the environment variable MPI_USELSF to "no". The simplest way to do that is to load the module interactive_platform_mpi:

      module load interactive_platform_mpi
    

and then run your job using mpirun in the normal way (it is also possible to use ddt like this). Because this is a module, you can unload it again before submitting the job to run in the queues.
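If you prefer to see the mechanism explicitly, the same switch can be made by hand. This is a rough sketch under the assumption that setting MPI_USELSF is all that is needed, which is what the text above describes; whether the module does anything further is not stated here. The function names, the program name and the rank count are all hypothetical.

```shell
# Hypothetical helpers, not COSMA utilities.

# Allow PLATFORM MPI jobs to start outside the batch system.
interactive_mpi_on() {
    MPI_USELSF=no
    export MPI_USELSF
}

# Restore the default behaviour before submitting to the queues.
interactive_mpi_off() {
    unset MPI_USELSF
}

interactive_mpi_on
echo "MPI_USELSF is now: ${MPI_USELSF}"

# With the variable set, an interactive run would look like this
# (./my_mpi_program and the rank count are placeholders):
#   mpirun -np 4 ./my_mpi_program
```

Remember to call interactive_mpi_off (or unload the module) before submitting a batch job, for the same reason given above.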

Usage of projects and users

Each night the usage of the batch queues and filesystems is processed to produce a number of summary web pages. These show whether your project has gone over budget for the current quarter (which would explain why your jobs have been demoted to the cosma5-pauper queue), who is using all the disk space, and so on. These pages are at:

To access these pages you'll need your COSMA username and password.