Slurm Job Partitions at MSI
MSI uses the Slurm scheduler to fairly allocate compute resources to users of our systems. Slurm uses partitions to organize jobs with similar resource requirements. The job partitions on our systems manage different sets of hardware and have different limits for various compute resources. For example, we have partitions that support long execution times, high memory (RAM) requirements, or GPUs. When submitting a job, it is important to choose a partition whose hardware and resource limits suit the job.
Selecting a Partition
MSI has job partitions that are specific to a single cluster and partitions that are federated across clusters. Which partition to choose depends largely on the resources your software or script requires. More information about selecting a partition and the different partition parameters can be found on the Choosing A Partition (Slurm) page.
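As an illustration only, a batch script selects a partition with the -p/--partition directive; the partition, resource values, module, and program names below are placeholders, not recommendations.

#!/bin/bash -l
#SBATCH --partition=msismall        # partition chosen from the tables below
#SBATCH --time=01:00:00             # walltime; must not exceed the partition's limit
#SBATCH --ntasks=1                  # one task
#SBATCH --cpus-per-task=4           # CPU cores for that task
#SBATCH --mem=8g                    # total job memory; keep within the node memory listed below

module load python3                 # example module; load whatever your software needs
python3 my_script.py                # placeholder for your own program

The script would then be submitted with sbatch (for example, sbatch my_job.sh).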
In the tables below you will find a summary of the available partitions, organized by system, along with their associated limits. The quantities listed are totals or upper limits.
Partitions for jobs that can run on either Agate or Mesabi/Mangi.
Job arrays will only run on the cluster you submit the job on.
| Partition name | Node sharing? | Cores per node | Walltime limit | Total node memory | Advised memory per core | Local scratch per node | Maximum nodes per job |
| --- | --- | --- | --- | --- | --- | --- | --- |
| msismall | Yes | 24-128 | 96:00:00 | 40-499 GB | 1900 MB | 380-850 GB | 1 |
| msilarge | No | 24-128 | 24:00:00 | 40-499 GB | 1900 MB | 380-850 GB | 32 |
| msibigmem | Yes | 24-128 | 24:00:00 | 499-1995 GB | 3999 MB | 380-850 GB | 1 |
| msigpu | Yes | 24-128 | 24:00:00 | 40-499 GB | 1900 MB | 380-850 GB | 1 |
| interactive | Yes | 24-128 | 24:00:00 | 60-499 GB | 2 GB | 228-850 GB | 2 |
| interactive-gpu | Yes | 24-128 | 24:00:00 | 60-499 GB | 2 GB | 228-850 GB | 2 |
| preempt | Yes | 24-64 | 24:00:00 | 60-499 GB | 2 GB | 228-850 GB | 1-2 |
| preempt-gpu | Yes | 24-64 | 24:00:00 | 60-499 GB | 2 GB | 228-850 GB | 1-2 |
Note: Partitions with the same name across clusters are federated (interactive, interactive-gpu, preempt, preempt-gpu).
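For interactive work, a session in the federated interactive partition can also be started directly with srun; the resource values here are illustrative and must stay within the limits in the table above.

srun --partition=interactive --time=02:00:00 --ntasks=1 --cpus-per-task=2 --mem=4g --pty bash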
Partitions for jobs that run on Agate.

| Partition name | Node sharing? | Cores per node | Walltime limit | Total node memory | Advised memory per core | Local scratch per node | Maximum nodes per job |
| --- | --- | --- | --- | --- | --- | --- | --- |
| agsmall | Yes | 128 | 96:00:00 | 499 GB | 3999 MB | 850 GB | 1 |
| aglarge | No | 128 | 24:00:00 | 499 GB | 3999 MB | 850 GB | 32 |
| ag2tb | Yes | 128 | 96:00:00 | 1995 GB | 15.5 GB | 850 GB | 1 |
| a100-4 (1) | Yes | 64 | 96:00:00 | 499 GB | 3999 MB | 850 GB | 4 |
| a100-8 (1) | Yes | 128 | 24:00:00 | 1002 GB | 7.5 GB | 850 GB | 1 |
| interactive | Yes | 128 | 24:00:00 | 499 GB | 3999 MB | 850 GB | 2 |
| interactive-gpu (3) | Yes | 64 | 24:00:00 | 499 GB | 8000 MB | 850 GB | 2 |
| preempt | Yes | 128 | 24:00:00 | 499 GB | 3999 MB | 850 GB | 1 |
| preempt-gpu (5) | Yes | 64 | 24:00:00 | 499 GB | 8000 MB | 850 GB | 1 |
Partitions for jobs that run on Mesabi/Mangi.

| Partition name | Node sharing? | Cores per node | Walltime limit | Total node memory | Advised memory per core | Local scratch per node | Maximum nodes per job |
| --- | --- | --- | --- | --- | --- | --- | --- |
| amdsmall | Yes | 128 | 96:00:00 | 248 GB | 1900 MB | 415 GB | 1 |
| amdlarge | No | 128 | 24:00:00 | 248 GB | 1900 MB | 415 GB | 32 |
| amd512 | Yes | 128 | 96:00:00 | 499 GB | 4000 MB | 415 GB | 1 |
| amd2tb | Yes | 128 | 96:00:00 | 1995 GB | 15 GB | 415 GB | 1 |
| v100 (1) | Yes | 24 | 24:00:00 | 374 GB | 15 GB | 859 GB | 1 |
| interactive (4) | Yes | 24 | 24:00:00 | 60 GB | 2 GB | 228 GB | 2 |
| interactive-gpu (4) | Yes | 24 | 24:00:00 | 60 GB | 2 GB | 228 GB | 2 |
(1) Note: Jobs using an Agate partition must be submitted from an Agate host (e.g., an Agate login node).
(2) Note: In addition to selecting the a100-4 or a100-8 GPU partition, GPUs must be requested explicitly for all GPU jobs.
One or more A100 GPUs can be requested by including the following two lines in your submission script.
This example asks for a single A100 GPU, using the a100-4 partition:
#SBATCH -p a100-4
#SBATCH --gres=gpu:a100:1
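For context, a complete submission script built around these two directives might look like the following sketch; the walltime, core, and memory values are placeholders, and the module and program names are examples only.

#!/bin/bash -l
#SBATCH -p a100-4                   # A100 GPU partition on Agate
#SBATCH --gres=gpu:a100:1           # request one A100 GPU
#SBATCH --time=04:00:00             # walltime within the partition limit
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8           # CPU cores to pair with the GPU
#SBATCH --mem=32g                   # total job memory

module load cuda                    # example module; load what your application actually needs
./my_gpu_program                    # placeholder for your own GPU-enabled executable

The script would then be submitted with sbatch from an Agate host, per note (1).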
(3) Note: The interactive-gpu and preempt-gpu partitions contain A40 GPUs, so include the following two lines in your submission script.
This example asks for a single A40 GPU, using the interactive-gpu partition:
#SBATCH -p interactive-gpu
#SBATCH --gres=gpu:a40:1
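Because interactive-gpu is intended for interactive sessions, the same request can also be made on the command line with srun; the values shown are illustrative.

srun -p interactive-gpu --gres=gpu:a40:1 --time=01:00:00 --ntasks=1 --cpus-per-task=4 --mem=16g --pty bash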
(4) Note: Users are limited to 2 jobs in the interactive and interactive-gpu partitions.
(5) Note: Jobs in the preempt and preempt-gpu partitions may be killed at any time to make room for jobs in the interactive or interactive-gpu partitions.