LSF

Using the Scheduler: IBM Platform LSF

El Gato uses the IBM Platform LSF scheduler to provide queues and schedule jobs on the nodes. LSF is similar to the PBS Pro scheduler used on the other UITS HPC systems, but with a different set of commands and syntax. The main LSF commands you'll need are listed below; email el-gato-support with any questions.

$bjobs provides a list of your pending and running jobs. See $man bjobs for more information.

$bqueues provides a list of currently available queues. See $man bqueues for more information.

Descriptions of the queues are provided below. The windfall queue is the default, and all users may submit jobs to it. Jobs in the windfall queue can be pre-empted by jobs in higher priority queues, consistent with the windfall queue on the other UITS HPC systems and as described below.

$bkill terminates a running job. See $man bkill for more information.

$bsub is the command used to submit LSF scripts to the queue. The use of bsub is described in detail below, but an example follows.
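Both bjobs and bkill accept a numeric job ID, which bsub prints when a job is accepted, typically in a line of the form "Job <12345> is submitted to queue <windfall>." The sketch below pulls the ID out of such a line so it can be reused in later commands. The function name job_id_from_bsub and the sample ID 12345 are illustrative only, and the message format is assumed to match standard LSF output on El Gato.

```shell
#!/bin/bash
# Extract the numeric job ID from a bsub confirmation line read on stdin.
# Assumes the standard LSF message: Job <ID> is submitted to queue <NAME>.
job_id_from_bsub() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Demonstration on a sample line (12345 is a made-up job ID).
echo 'Job <12345> is submitted to queue <windfall>.' | job_id_from_bsub

# On the cluster, the same function could be used as:
#   jobid=$(bsub < lsf.sh | job_id_from_bsub)
#   bjobs "$jobid"     # check the job's status
#   bkill "$jobid"     # terminate the job if needed
```

Capturing the ID at submission time avoids having to look it up again in the bjobs listing.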

Example LSF Script

The following LSF script requests a 32-core job on the windfall queue, using 16 cores per node. Exclusive use of each node can additionally be requested with #BSUB -x on the queues that allow it (see the option descriptions below). To submit this job from the command line, place this text in a file called lsf.sh and then execute the command


$bsub < lsf.sh

Example LSF script:
#!/bin/bash
###========================================
#BSUB -n 32
#BSUB -R "span[ptile=16]"
#BSUB -o lsf.out
#BSUB -e lsf.err
#BSUB -q "windfall"
#BSUB -J hello_world
#---------------------------------------------------------------------
# Load the MPI environment, then launch one MPI rank per requested core.
module load openmpi
mpirun -np 32 ./mpi_hello_world > output.txt
###end of script

Here is a description of each #BSUB directive you are likely to need:

  • #BSUB -n 32 requests a total of 32 cores. To request a single core, use #BSUB -n 1.
  • #BSUB -R "span[ptile=16]" requests 16 cores per node. ptile=1 would request 1 core per node.
  • #BSUB -o lsf.out requests that stdout from the job be placed in a file lsf.out in the current directory.
  • #BSUB -e lsf.err requests that stderr from the job be placed in a file lsf.err in the current directory.
  • #BSUB -q "windfall" requests that the job be submitted to the windfall queue.
  • #BSUB -x requests exclusive use of each node, preventing other jobs from sharing the node (available only on standard, medium and debug).
  • #BSUB -J hello_world requests that the job name be "hello_world".
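The -n and ptile values together determine how many nodes a job spans: 32 cores at 16 cores per node occupy 2 nodes. As a rough illustration, the hypothetical helper below (bsub_header is not an LSF command) prints the corresponding #BSUB lines and the implied node count, rounding up when the total is not a multiple of ptile.

```shell
#!/bin/bash
# Print #BSUB directives for a job of TOTAL cores at PER_NODE cores per node.
# bsub_header is a hypothetical helper for illustration, not part of LSF.
bsub_header() {
    local total=$1 per_node=$2
    # Round up: a partially filled node still counts as a node.
    local nodes=$(( (total + per_node - 1) / per_node ))
    echo "#BSUB -n ${total}"
    echo "#BSUB -R \"span[ptile=${per_node}]\""
    echo "# spans ${nodes} node(s)"
}

bsub_header 32 16
```

The last line printed is only a comment for the reader; LSF itself works out the node count from the two directives.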

Available Queues

El Gato has a variety of queues available for various users. Descriptions of these queues follow:

  • windfall is the default queue, available to all users. This queue may be requested using #BSUB -q "windfall" or by omitting the -q flag from your LSF script. Windfall users are limited to 192 cores or 12 complete nodes. There is no time limit for a job.
  • standard is a queue available to users who have received approval for "standard" accounts. These accounts come with guaranteed time and disk space, but require a science justification. Please contact el-gato-support for more information. This queue may be requested using #BSUB -q "standard". Standard users are limited to 128 cores or 8 complete nodes. This limit is independent of the windfall limit, so you can run jobs in both queues at once. There is a 48-hour time limit on jobs.
  • medium is a collective queue for users associated with the NSF MRI project that funded El Gato. If you believe you should have access to medium but do not, please contact el-gato-support.
  • medium_gpu is a collective GPU queue for users associated with the NSF MRI project that funded El Gato. You will generally select #BSUB -R "span[ptile=2]" since there are two GPUs in each of those nodes.
  • debug is a queue for short runs to test code. It has a high priority but a limited duration. It is for the groups associated with the NSF MRI project that funded El Gato.
  • If you have a justifiable need for computational resources beyond the available queues, please contact el-gato-support to discuss access.
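The core limits quoted above can be checked before submitting. The following sketch encodes only the two limits stated in this section (192 cores for windfall, 128 for standard); within_queue_limit is a hypothetical helper, and because limits for the other queues are not listed here they are reported as unknown.

```shell
#!/bin/bash
# Report whether a core request fits the per-user limit for a queue.
# Limits come from the queue descriptions above; other queues are unknown.
within_queue_limit() {
    local queue=$1 cores=$2 limit
    case "$queue" in
        windfall) limit=192 ;;
        standard) limit=128 ;;
        *) echo "unknown"; return 0 ;;
    esac
    if [ "$cores" -le "$limit" ]; then
        echo "ok"
    else
        echo "over limit (${limit} cores max)"
    fi
}

within_queue_limit windfall 192
within_queue_limit standard 160
```

If the limits change, bqueues remains the authoritative source; the numbers in this sketch would need to be updated by hand.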

Requesting GPU Nodes with LSF

To request that your job run on nodes enabled with GPUs, add the following line to your LSF script:

#BSUB -R gpu

The -R requests an LSF Resource. The GPU nodes are marked with the "gpu" resource.
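As an illustration of where this line sits in a complete job script, the sketch below writes a small single-core GPU job to a file named gpu_job.sh and lists its directives. The file name, the job name gpu_test, the output file names, and the program ./my_gpu_app are all placeholders; no -q line is given, so the job would go to the default windfall queue.

```shell
#!/bin/bash
# Write a sample GPU job script; ./my_gpu_app is a placeholder program name.
cat > gpu_job.sh <<'EOF'
#!/bin/bash
#BSUB -n 1
#BSUB -R gpu
#BSUB -o gpu.out
#BSUB -e gpu.err
#BSUB -J gpu_test
./my_gpu_app
EOF

# Show the directives that were written.
grep '^#BSUB' gpu_job.sh

# On the cluster, submit with:
#   bsub < gpu_job.sh
```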

Requesting Phi Nodes with LSF

To request that your job run on nodes enabled with Phis, add the following line to your LSF script:

#BSUB -R phi

The -R requests an LSF Resource. The Phi nodes are marked with the "phi" resource.

Requesting an Interactive Job with LSF

To request an interactive job with LSF (including X-forwarding), use the following command:

$bsub -XF -Is bash

The -Is bash flag requests an interactive bash shell, while the -XF flag requests X-forwarding. This bsub command can be combined with the -q flag to request a specific queue (default is windfall) or with the -R flag to request resources (e.g., gpu or phi).
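To show how these flags combine, the sketch below assembles an interactive bsub command line from an optional queue and resource and prints it without running it. interactive_cmd is a hypothetical helper, not an LSF command; it only demonstrates the ordering, with bsub options first and the shell to start last.

```shell
#!/bin/bash
# Build (but do not run) an interactive bsub command line.
# interactive_cmd is a hypothetical helper; the queue defaults to windfall.
interactive_cmd() {
    local queue=${1:-windfall} resource=${2:-}
    local cmd=(bsub -XF -Is -q "$queue")
    if [ -n "$resource" ]; then
        cmd+=(-R "$resource")
    fi
    cmd+=(bash)
    echo "${cmd[*]}"
}

interactive_cmd
interactive_cmd windfall gpu
```

Printing the command first makes it easy to check the flags before pasting the line into a login shell on the cluster.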