Job Partitions

Slurm Job Partitions at MSI

MSI uses the Slurm scheduler to fairly allocate compute resources to users of our systems. Slurm uses partitions to organize jobs with similar resource requirements. The job partitions on our systems manage different sets of hardware and have different limits for various compute resources. For example, we have partitions that support long execution times, high memory (RAM) requirements, or GPUs. When submitting a job, it is important to choose a partition whose hardware and resource limits suit the job.

Selecting a Partition

MSI has job partitions that are specific to a single cluster and partitions that are federated across clusters. Which partition to choose depends largely on the resources your software or script requires. More information about selecting a partition and the different partition parameters can be found on the Choosing A Partition (Slurm) page.
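
For reference, a partition is selected in a Slurm job script with the --partition (or -p) directive. The following is a minimal sketch of a serial batch job; the partition name, resource values, module, and script name are placeholders rather than MSI-specific recommendations, and should be adapted to your own work and to the limits in the tables below.

#!/bin/bash -l
#SBATCH --partition=msismall       # choose a partition from the tables below
#SBATCH --time=01:00:00            # must be within the partition's walltime limit
#SBATCH --ntasks=1                 # placeholder: a single-core job
#SBATCH --mem-per-cpu=1900M        # placeholder: near the partition's advised memory per core

# Placeholder workload: load your software environment and run your program
module load python3
python3 my_script.py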

The tables below summarize the available partitions, organized by system, along with their associated limits. The quantities listed are totals or upper limits.


Federated Partitions

These partitions accept jobs that can run on either Agate or Mesabi/Mangi.
Job arrays will only run on the cluster from which they are submitted.

Partition name  | Node sharing? | Cores per node | Walltime limit | Total node memory | Advised memory per core | Local scratch per node | Maximum nodes per job
msismall        | Yes | 24-128 | 96:00:00 | 40-499 GB   | 1900 MB | 380-850 GB | 1
msilarge        | No  | 24-128 | 24:00:00 | 40-499 GB   | 1900 MB | 380-850 GB | 32
msibigmem       | Yes | 24-128 | 24:00:00 | 499-1995 GB | 3999 MB | 380-850 GB | 1
msigpu          | Yes | 24-128 | 24:00:00 | 40-499 GB   | 1900 MB | 380-850 GB | 1
interactive     | Yes | 24-128 | 24:00:00 | 60-499 GB   | 2 GB    | 228-850 GB | 2
interactive-gpu | Yes | 24-128 | 24:00:00 | 60-499 GB   | 2 GB    | 228-850 GB | 2
preempt         | Yes | 24-64  | 24:00:00 | 60-499 GB   | 2 GB    | 228-850 GB | 1-2
preempt-gpu     | Yes | 24-64  | 24:00:00 | 60-499 GB   | 2 GB    | 228-850 GB | 1-2

Note: Partitions with the same name across clusters are federated (interactive, interactive-gpu, preempt, preempt-gpu).
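
For example, a single-node, high-memory job in the federated msibigmem partition could be sketched as follows; the core count and memory request are placeholders chosen to stay within the limits listed above.

#SBATCH -p msibigmem               # federated high-memory partition
#SBATCH --time=24:00:00            # at the partition's 24-hour walltime limit
#SBATCH --ntasks=24                # placeholder core count
#SBATCH --mem=750G                 # placeholder; node memory in this partition ranges from 499 to 1995 GB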


 

Agate Partitions

Partition name      | Node sharing? | Cores per node | Walltime limit | Total node memory | Advised memory per core | Local scratch per node | Maximum nodes per job
agsmall             | Yes | 128 | 96:00:00 | 499 GB  | 3999 MB | 850 GB | 1
aglarge             | No  | 128 | 24:00:00 | 499 GB  | 3999 MB | 850 GB | 32
ag2tb               | Yes | 128 | 96:00:00 | 1995 GB | 15.5 GB | 850 GB | 1
a100-4 (1)          | Yes | 64  | 96:00:00 | 499 GB  | 3999 MB | 850 GB | 4
a100-8 (1)          | Yes | 128 | 24:00:00 | 1002 GB | 7.5 GB  | 850 GB | 1
interactive         | Yes | 128 | 24:00:00 | 499 GB  | 3999 MB | 850 GB | 2
interactive-gpu (2) | Yes | 64  | 24:00:00 | 499 GB  | 8000 MB | 850 GB | 2
preempt             | Yes | 128 | 24:00:00 | 499 GB  | 3999 MB | 850 GB | 1
preempt-gpu (5)     | Yes | 64  | 24:00:00 | 499 GB  | 8000 MB | 850 GB | 1
amdsmall            | Yes | 128 | 96:00:00 | 248 GB  | 1900 MB | 415 GB | 1
amdlarge            | No  | 128 | 24:00:00 | 248 GB  | 1900 MB | 415 GB | 32
amd512              | Yes | 128 | 96:00:00 | 499 GB  | 4000 MB | 415 GB | 1
amd2tb              | Yes | 128 | 96:00:00 | 1995 GB | 15 GB   | 415 GB | 1
v100 (1)            | Yes | 24  | 24:00:00 | 374 GB  | 15 GB   | 859 GB | 1
interactive (4)     | Yes | 24  | 24:00:00 | 60 GB   | 2 GB    | 228 GB | 2
interactive-gpu (4) | Yes | 24  | 24:00:00 | 60 GB   | 2 GB    | 228 GB | 2

(1)Note: Jobs using an Agate partition must be submitted from an Agate host (e.g., an Agate login node).

(2)Note: In addition to selecting the a100-4 or a100-8 GPU partition, GPUs must be explicitly requested for all GPU jobs.

One or more A100 GPUs can be requested by including the following two lines in your submission script:

This example asks for a single A100 GPU, using the a100-4 partition:

#SBATCH -p a100-4     
#SBATCH --gres=gpu:a100:1 

(3)Note: The interactive-gpu and preempt-gpu partitions contain A40 GPUs, so include the following two lines in your submission script:

This example asks for a single A40 GPU, using the interactive-gpu partition:

#SBATCH -p interactive-gpu
#SBATCH --gres=gpu:a40:1

(4)Note: Users are limited to 2 jobs in the interactive and interactive-gpu partitions.
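
Interactive sessions are usually started with srun (or salloc) rather than sbatch. A minimal sketch, assuming a short single-core session; the time, memory, and GPU values are placeholders within the limits listed above:

srun -p interactive --time=01:00:00 --ntasks=1 --mem=4G --pty bash

# Or, for an interactive session with one A40 GPU (see note (3) above):
srun -p interactive-gpu --gres=gpu:a40:1 --time=01:00:00 --ntasks=1 --mem=8G --pty bash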

(5)Note: Jobs in the preempt and preempt-gpu partitions may be killed at any time to make room for jobs in the interactive or interactive-gpu partitions.
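
Because preempt jobs can be terminated at any time, it can be useful to mark them as eligible for requeueing with Slurm's standard --requeue option, so a preempted job returns to the queue rather than simply ending. A sketch, assuming your workload can safely restart (whether preempted jobs are requeued automatically on these partitions is not stated here, and the resource values are placeholders):

#SBATCH -p preempt
#SBATCH --time=08:00:00            # placeholder, within the 24-hour limit
#SBATCH --ntasks=4                 # placeholder core count
#SBATCH --mem-per-cpu=2G           # placeholder memory request per core
#SBATCH --requeue                  # standard Slurm option: allow this job to be requeued if preempted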

 
