Slurm Job Submission and Scheduling

MSI uses Slurm as its job queueing and resource management system to efficiently and fairly manage the resources used by all users. When jobs are submitted to a queue (‘partition’ in Slurm nomenclature), they wait until the appropriate computational resources are available.

To submit jobs to a Slurm partition, users must first write Slurm job scripts. Slurm job scripts contain information on the resources requested for the calculation, as well as the commands for executing the calculation.

Writing Job Scripts

A Slurm job script is a shell script with special lines that specify the resources required for the job as well as the commands to perform the computation. The script must be in plain text (i.e., no formatting such as bold or italics and not a word processor document). An example job script is shown below:

#!/bin/bash -l        
#SBATCH --time=8:00:00
#SBATCH --ntasks=8
#SBATCH --mem=10g
#SBATCH --tmp=10g
#SBATCH --mail-type=BEGIN,END,FAIL  
cd ~/program_directory
module load intel 
module load ompi/intel
mpirun -np 8 program_name < inputfile > outputfile

The first line in the Slurm script defines which shell the script will be run with (how the system will read the file).  This is required of all shell scripts, and Slurm job scripts are no exception. It is recommended to use bash by specifying this as the first line in the job script:
#!/bin/bash -l

Directives for the Slurm queuing system are used to specify the resources requested by the job; these lines begin with #SBATCH. Lines 2–5 in the above sample script contain the Slurm resource request. The sample job will require 8 hours (--time=8:00:00), 8 processor cores (--ntasks=8), 10 gigabytes of memory (--mem=10g), and 10 gigabytes of temporary or local scratch space (--tmp=10g). The resource request must contain appropriate values; if the requested time, processors, or memory are not suitable for the hardware, the job will not be able to run.

The #SBATCH --mail-type and #SBATCH --mail-user directives both control the notification emails sent to the user. The --mail-type directive instructs the Slurm system to send an email when the job begins, aborts with an error, or finishes successfully. Other options here include NONE or ALL. The --mail-user directive specifies the email address to be used. Using the message emails is recommended because the reason for a job failure can often be determined using the information in the emails.

The rest of the sample Slurm script contains the commands that will be executed to perform the calculation. A Slurm script must contain the appropriate commands to set up and run the calculation, including ‘cd’ (change directory) and ‘module load’ (load software modules) commands. The last lines of a Slurm script contain the commands that execute the calculation; in the above example, the final line starts a program that uses MPI communication to run on 8 processor cores.
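Before submitting, it can help to sanity-check a job script locally. The following is a minimal sketch, not an MSI-provided tool: it writes the sample script to a file, asks bash to parse it without running anything, and lists its #SBATCH directives. The file name job.sh is a hypothetical choice, and program_name, inputfile, and outputfile are the placeholders from the sample above. Nothing here contacts Slurm.

```shell
#!/bin/bash
# Sketch: create the sample job script and sanity-check it locally.
# "job.sh" is a hypothetical file name chosen for illustration.

cat > job.sh <<'EOF'
#!/bin/bash -l
#SBATCH --time=8:00:00
#SBATCH --ntasks=8
#SBATCH --mem=10g
#SBATCH --tmp=10g
#SBATCH --mail-type=BEGIN,END,FAIL
cd ~/program_directory
module load intel
module load ompi/intel
mpirun -np 8 program_name < inputfile > outputfile
EOF

# bash -n parses the script without executing any command in it,
# so it catches shell syntax errors but not bad resource requests.
if bash -n job.sh; then
    echo "syntax OK"
fi

# Count and show the Slurm directives (lines starting with #SBATCH).
directive_count=$(grep -c '^#SBATCH' job.sh)
echo "directives: ${directive_count}"
grep '^#SBATCH' job.sh
```

Because the heredoc delimiter is quoted ('EOF'), the shell writes the script text exactly as typed, without expanding ~ or any variables.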

Submitting Job Scripts

Once a job script is written it is submitted using the sbatch command:

sbatch -p partitionname scriptname 

Here partitionname is the name of the partition being submitted to, and scriptname is the name of the job script. The -p option is not strictly required, but we recommend that you submit your jobs to specific partitions because this makes it easier to troubleshoot potential problems arising from insufficient requested resources. Alternatively, the partition specification can be placed inside the job script as a directive (see “Slurm Directives” below).
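When a submission is part of a larger workflow, it is often convenient to keep the job ID that sbatch reports. The sketch below assumes it runs on a system where sbatch is installed; the helper name submit_job and the script name job.sh are hypothetical, and “small” is simply the partition name used as an example in the directives table.

```shell
#!/bin/bash
# Sketch: submit a job script and keep the job ID for later use.
# submit_job and job.sh are hypothetical names used for illustration.

submit_job() {
    local partition="$1"
    local script="$2"
    local out
    # --parsable makes sbatch print only the job ID (optionally followed
    # by ";cluster") instead of "Submitted batch job <id>".
    out=$(sbatch --parsable -p "$partition" "$script") || return 1
    # Strip the optional ";cluster" suffix.
    echo "${out%%;*}"
}

if command -v sbatch >/dev/null 2>&1; then
    job_id=$(submit_job small job.sh) && echo "Submitted job ${job_id}"
else
    echo "sbatch not found; run this on a Slurm login node"
fi
```

The captured job ID can then be passed to commands such as scancel.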

Viewing and Canceling Jobs

To view the jobs submitted by a particular user use the ‘squeue’ command:

squeue -u username 

This will display the status of the specified jobs, and the associated job ID numbers. When run with no options, the squeue command will show all jobs on the system.

To cancel a submitted job use the ‘scancel’ command:

scancel jobIDnumber 

Here jobIDnumber should be replaced with the appropriate job ID number determined by using the squeue command.
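The two commands can be combined in a small helper. The following is a sketch under stated assumptions, not a recommended site procedure: it assumes squeue and scancel are on the PATH, and the function names list_job_ids and cancel_pending_jobs are hypothetical. It prints the current user's job IDs and, if called, cancels only the jobs that are still pending.

```shell
#!/bin/bash
# Sketch: list the current user's job IDs and cancel the pending ones.
# list_job_ids and cancel_pending_jobs are hypothetical helper names.

list_job_ids() {
    # -h drops the header line; -o "%i" prints only the job ID column.
    # An optional state name (for example PENDING) narrows the list via -t.
    local state="${1:-}"
    local user="${USER:-$(id -un)}"
    if [ -n "$state" ]; then
        squeue -h -u "$user" -t "$state" -o "%i"
    else
        squeue -h -u "$user" -o "%i"
    fi
}

cancel_pending_jobs() {
    local id
    for id in $(list_job_ids PENDING); do
        echo "Cancelling job ${id}"
        scancel "$id"
    done
}

if command -v squeue >/dev/null 2>&1; then
    list_job_ids
else
    echo "squeue not found; run this on a Slurm login node"
fi
```

Calling cancel_pending_jobs is left to the reader on purpose, since cancelling jobs cannot be undone.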

Slurm Directives

Below is a table summarizing some directives that can be used inside Slurm job scripts. Each directive must be on its own line in the job script and must be specified before any other commands (cd, module load, etc.).

For a full list of possible directives, please see the “OPTIONS” section of the documentation for the ‘sbatch’ command. All options to ‘sbatch’ can be specified as directives in a job script.

| Slurm Directive | Effect |
| --- | --- |
| #SBATCH --time=8:00:00 | Specifies the maximum limit for how long the job will be allowed to run. (8 hours) |
| #SBATCH -N 1 | Specifies the number of nodes allocated for this job. Nodes are physically separated, and special instructions (like MPI) are required to use multiple nodes. |
| #SBATCH --ntasks=8 | Specifies the number of processors (cores) that will be reserved for this job (8). These cores may be spread across multiple nodes, limited by -N. |
| #SBATCH --mem=10g | Specifies the maximum limit for memory usage. This job will die if the application tries to use more than 10 GB of memory. |
| #SBATCH --tmp=10g | Specifies 10 GB of temporary disk will be available for this job in /tmp. |
| #SBATCH --mail-type=ALL | Specifies which events will trigger an email message. Other options here include NONE, BEGIN, END, and FAIL. |
| #SBATCH --mail-user=me@umn.edu | Specifies the email address that should be used when the Slurm system sends message emails. |
| #SBATCH -p small,mygroup | Specifies the partition to be the “small” partition, or a partition named “mygroup”. The job will start at the earliest time one of these partitions can accommodate the job. |
| #SBATCH --gres=gpu:v100:2 together with #SBATCH -p v100 | Request two v100 GPUs for a job submitted to the V100 partition. |
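To show how the directives combine in practice, here is a minimal sketch of a GPU job script. The --gres and -p lines come from the table; the time, task count, and memory values are arbitrary assumptions for illustration, and the commands only report what the job was given rather than running a real workload.

```shell
#!/bin/bash -l
#SBATCH --time=1:00:00
#SBATCH --ntasks=1
#SBATCH --mem=10g
#SBATCH -p v100
#SBATCH --gres=gpu:v100:2

# Slurm normally exports CUDA_VISIBLE_DEVICES for GPUs granted through
# --gres; the fallback text is printed when the variable is not set
# (for example, when this script is run outside of a job).
gpu_list="${CUDA_VISIBLE_DEVICES:-none assigned}"
echo "GPUs visible to this job: ${gpu_list}"

# SLURM_JOB_ID is set by Slurm inside a running job.
echo "Job ID: ${SLURM_JOB_ID:-not running under Slurm}"
```

Note that all #SBATCH lines appear before the first command, as required for directives.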
