Using the Scheduler: IBM Platform LSF
El Gato uses the IBM Platform LSF scheduler to provide queues and schedule jobs on the nodes. LSF is a lot like the PBS Pro scheduler used on the other UITS HPC systems, but with a different set of commands and syntax. The main LSF commands you'll need are listed below, but feel free to mail el-gato-support with any questions.
bjobs lists your unfinished jobs (pending, running, or suspended). See man bjobs for more information.
bqueues lists the available queues. See man bqueues for more information.
Descriptions of the queues are provided below. The windfall queue is the default, and all users may submit jobs to it. Jobs in the windfall queue can be preempted by jobs in higher-priority queues, consistent with the windfall queue on the other UITS HPC systems and as described below.
bkill terminates a job, identified by its job ID. See man bkill for more information.
bsub submits LSF scripts to the queue. The use of bsub is described in detail below, but an example follows.
Example LSF Script
The following LSF script requests a 32-core job on the windfall queue, using 16 cores per node, with exclusive use of each node. To submit this job from the command line, place this text in a file called lsf.sh and then execute the command
$ bsub < lsf.sh
Example LSF script:
#BSUB -n 32
#BSUB -R "span[ptile=16]"
#BSUB -o lsf.out
#BSUB -e lsf.err
#BSUB -q "windfall"
#BSUB -x
#BSUB -J hello_world
module load openmpi
mpirun -np 32 ./mpi_hello_world > output.txt
###end of script
Here is a description of each line:
- #BSUB -n 32 requests a total of 32 cores. To request a single core, use #BSUB -n 1.
- #BSUB -R "span[ptile=16]" requests 16 cores per node. ptile=1 would request 1 core per node.
- #BSUB -o lsf.out requests that stdout from the job be placed in a file lsf.out in the current directory.
- #BSUB -e lsf.err requests that stderr from the job be placed in a file lsf.err in the current directory.
- #BSUB -q "windfall" requests that the job be submitted to the windfall queue.
- #BSUB -x requests exclusive use of each node, preventing other jobs from sharing the node (available only on standard, medium and debug).
- #BSUB -J hello_world requests that the job name be "hello_world".
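Once a script is submitted, bsub prints a confirmation line containing the job ID, which bjobs and bkill accept as an argument. The following is a minimal sketch of that workflow, assuming the confirmation has the usual LSF form "Job <12345> is submitted to queue <windfall>."; the helper name job_id_from_bsub is illustrative and not part of LSF.

```shell
#!/bin/bash
# Extract the numeric job ID from bsub's confirmation line.
# Assumes the usual LSF wording: "Job <ID> is submitted to ... queue <NAME>."
job_id_from_bsub() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Only talk to the scheduler when LSF is installed and the script exists.
if command -v bsub >/dev/null 2>&1 && [ -f lsf.sh ]; then
    job_id=$(bsub < lsf.sh | job_id_from_bsub)
    echo "Submitted job ${job_id}"
    bjobs "${job_id}"        # show the status of this job only
    # bkill "${job_id}"      # uncomment to terminate the job
fi
```

Capturing the ID at submission time avoids having to pick the job out of the full bjobs listing later.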
El Gato has a variety of queues available for various users. Descriptions of these queues follow:
- windfall is the default queue, available to all users. This queue may be requested using #BSUB -q "windfall" or by omitting the -q flag from your LSF script. Windfall users are limited to 192 cores or 12 complete nodes. There is no time limit for a job.
- standard is a queue available to users who have received approval for "standard" accounts. These accounts come with guaranteed time and disk space, but require a science justification. Please contact el-gato-support for more information. This queue may be requested using #BSUB -q "standard". Standard users are limited to 128 cores or 8 complete nodes. This limit is independent of the windfall limit, so you can run jobs in both queues. There is a 48-hour time limit on jobs.
- medium is a collective queue for users associated with the NSF MRI project that funded El Gato. If you believe you should have access to medium but do not, please contact el-gato-support.
- medium_gpu is a collective GPU queue for users associated with the NSF MRI project that funded El Gato. You will generally select #BSUB -R "span[ptile=2]" since there are two GPUs in each of those nodes.
- debug is a queue for short runs to test code. It has a high priority but a limited duration. It is for the groups associated with the NSF MRI project that funded El Gato.
- If you have a justifiable need for computational resources beyond the available queues, please contact el-gato-support to discuss access.
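The core limits above correspond to whole nodes of 16 cores each (192 cores is 12 nodes, 128 cores is 8 nodes). The following sketch shows that arithmetic and a simple check of a request against the documented limits before submitting; the function names are hypothetical, and only the windfall and standard limits stated above are encoded.

```shell
#!/bin/bash
# Whole nodes needed for a job: ceil(cores / cores_per_node).
nodes_needed() {
    local cores=$1 per_node=$2
    echo $(( (cores + per_node - 1) / per_node ))
}

# Per-user core limit for the queues whose limits are listed above.
# Prints nothing and returns 1 for queues without a documented limit.
queue_core_limit() {
    case "$1" in
        windfall) echo 192 ;;
        standard) echo 128 ;;
        *) return 1 ;;
    esac
}

# Succeeds when the requested core count is within the queue's limit;
# returns 2 when the queue has no documented limit.
fits_queue() {
    local queue=$1 cores=$2 limit
    limit=$(queue_core_limit "$queue") || return 2
    [ "$cores" -le "$limit" ]
}

nodes_needed 192 16     # prints 12
fits_queue standard 128 && echo "128 cores fit in standard"
fits_queue windfall 256 || echo "256 cores exceed the windfall limit"
```

The scheduler enforces the real limits; a check like this only catches an oversized request before it is queued.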
Requesting GPU Nodes with LSF
To request that your job run on nodes enabled with GPUs, add the following line to your LSF script:
#BSUB -R gpu
-R requests an LSF resource. The GPU nodes are marked with the "gpu" resource.
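As a sketch of how the resource request fits into a complete job, the following writes a small GPU job script and submits it only if bsub is available. The file name gpu_job.sh, the job and output names, and the executable ./my_gpu_app are hypothetical. Combining the gpu resource and the per-node layout in a single -R string, as select[gpu] span[ptile=2], is an assumption based on standard LSF resource requirement syntax, and the medium_gpu queue applies only if you have access to it.

```shell
#!/bin/bash
# Write a hypothetical GPU job script; adjust the names to your application.
cat > gpu_job.sh <<'EOF'
#BSUB -n 2
#BSUB -R "select[gpu] span[ptile=2]"
#BSUB -q "medium_gpu"
#BSUB -o gpu.out
#BSUB -e gpu.err
#BSUB -J gpu_test
./my_gpu_app > output.txt
EOF

# Submit only where LSF is installed.
if command -v bsub >/dev/null 2>&1; then
    bsub < gpu_job.sh
fi
```

The same pattern should apply to other resources by changing the name inside select[...].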
Requesting Phi Nodes with LSF
To request that your job run on nodes enabled with Phis, add the following line to your LSF script:
#BSUB -R phi
-R requests an LSF resource. The Phi nodes are marked with the "phi" resource.
Requesting an Interactive Job with LSF
To request an interactive job with LSF (including X-forwarding), use the following command:
$ bsub -XF -Is bash
The -Is flag, followed by bash, requests an interactive bash shell, while the -XF flag requests X-forwarding. This bsub command can be combined with the -q flag to request a specific queue (default is windfall) or with the -R flag to request resources.
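To illustrate how these flags combine, here is a small sketch that assembles the interactive command line from an optional queue and resource. The helper interactive_cmd is hypothetical; it prints the command rather than running it, so you can inspect it before pasting it into your shell.

```shell
#!/bin/bash
# Build an interactive bsub command line from an optional queue and resource.
# Usage: interactive_cmd [queue] [resource]
interactive_cmd() {
    local queue=$1 resource=$2
    local cmd="bsub -XF -Is"
    if [ -n "$queue" ]; then
        cmd="$cmd -q $queue"
    fi
    if [ -n "$resource" ]; then
        cmd="$cmd -R $resource"
    fi
    # The shell to run goes last, after all bsub options.
    echo "$cmd bash"
}

interactive_cmd                  # bsub -XF -Is bash
interactive_cmd standard         # bsub -XF -Is -q standard bash
interactive_cmd windfall gpu     # bsub -XF -Is -q windfall -R gpu bash
```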