High Performance Computing : Partitions (Queues)
The batch or job scheduling system on PALMA-II is called SLURM. If you are used to PBS/Maui and want to switch to SLURM, this document might help you. The job scheduler starts and manages computations on the cluster and distributes resources among all users according to their needs. Computation jobs (but also interactive sessions) can be submitted to different queues (or partitions, in Slurm terminology), which serve different purposes (a minimal example batch script that selects a partition is sketched below the first table):
Partitions
Available for everyone:
Name | Purpose | CPU Arch | # Nodes | # GPUs / node | Compute capability of GPU | max. CPUs (threads) / node | max. Mem / node | max. Walltime | BeeOND storage |
---|---|---|---|---|---|---|---|---|---|
normal | general computations | Skylake (Gold 6140) | 143 | - | - | 36 | 92 GB / 192 GB | 24 hours | 350 GB |
long | general computations | Skylake (Gold 6140) | - | - | - | 36 | 92 GB / 192 GB | 7 days | 350 GB |
express | short running (test) jobs, compilation | Skylake (Gold 6140) | 5 | - | - | 36 | 92 GB | 2 hours | 350 GB |
bigsmp | SMP | Skylake (Gold 6140) | 3 | - | - | 72 | 1.5 TB | 7 days | 350 GB |
largesmp | SMP | Skylake (Gold 6140) | 2 | - | - | 72 | 3 TB | 7 days | 350 GB |
requeue* | This queue will use the free nodes from the group exclusive nodes listed below. | Skylake (Gold 6140) | 68 / 50 / 3 | - | - | 36 / 36 / 72 | 92 GB / 192 GB / 1.5 TB | 1 day | 350 GB |
gpuv100 | Nvidia V100 GPUs | Skylake (Gold 6140) | 1 | 4 | 7.0 | 24 | 192 GB | 7 days | 930 GB |
vis-gpu | Nvidia Titan XP | Skylake (Gold 6140) | 1 | 8 | 6.1 | 24 | 192 GB | 2 days | -- |
vis | Visualization / GUIs | Skylake (Gold 6140) | 1 | - | - | 36 | 92 GB | 2 hours | -- |
zen2-128C-496G | SMP | Zen2 (EPYC 7742) | 12 | - | - | 128 | 496 GB | 7 days | 1.8 TB |
gpu2080 | GeForce RTX 2080 Ti | Zen3 (EPYC 7513) | 5 | 8 | 7.5 | 32 | 240 GB | 7 days | 930 GB |
gpuexpress | GeForce RTX 2080 Ti | Zen3 (EPYC 7513) | 1 | 8 | 7.5 | 32 | 240 GB | 2 hours | 930 GB |
gputitanrtx | Nvidia Titan RTX | Zen3 (EPYC 7343) | 1 | 4 | 7.5 | 32 | 240 GB | 7 days | 1.4 TB |
gpu3090 | GeForce RTX 3090 | Zen3 (EPYC 7413) | 2 | 8 | 8.6 | 48 | 240 GB | 7 days | -- |
gpua100 | Nvidia A100 | Zen3 (EPYC 7513) | 5 | 4 | 8.0 | 32 | 240 GB | 7 days | 930 GB |
gpuhgx | Nvidia A100 SXM 80GB | Zen3 (EPYC 7343) | 2 | 8 | 8.0 | 64 | 990 GB | 7 days | 7 TB |
gpuexpress
You can allocate a maximum of one job with 2 GPUs, 8 CPU cores, and 60 GB of RAM on this node.
requeue*
If your job is running on one of the requeue nodes while that node is requested by its owning exclusive group partition (listed below), the job will be terminated and resubmitted, so use this partition with care!
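As a hedged illustration of how a partition is selected at submission time, the following minimal batch script requests resources on gpuexpress within the limits stated above. The job name, output file, and the final command are placeholders and not part of the official PALMA-II documentation; adapt them to your own application.

```bash
#!/bin/bash
#SBATCH --job-name=gpu-test            # arbitrary job name (placeholder)
#SBATCH --partition=gpuexpress         # partition (queue) to submit to
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8              # within the 8-core limit of gpuexpress
#SBATCH --gres=gpu:2                   # within the 2-GPU limit of gpuexpress
#SBATCH --mem=60G                      # within the 60 GB limit of gpuexpress
#SBATCH --time=01:00:00                # must not exceed the 2-hour walltime
#SBATCH --output=gpu-test_%j.log       # %j expands to the job ID

# Placeholder for the actual computation:
srun ./my_gpu_program
```

Submit the script with `sbatch job.sh` and monitor it with `squeue -u $USER`. Interactive sessions can be requested in a similar way, e.g. `srun --partition=express --ntasks=1 --time=00:30:00 --pty bash` (again only a sketch; adjust the resource requests to the partition limits above).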
Group exclusive:
Name | # Nodes | max. CPUs (threads) / node | max. Mem / node | max. Walltime |
---|---|---|---|---|
p0fuchs | 9 | 36 | 92 GB | 7 days |
p0kulesz | 6 / 3 | 36 | 92 GB / 192 GB | 7 days |
p0kapp | 1 | 36 | 92 GB | 7 days |
p0klasen | 1 / 1 | 36 | 92 GB / 192 GB | 7 days |
hims | 25 / 1 | 36 | 92 GB / 192 GB | 7 days |
d0ow | 1 | 36 | 92 GB | 7 days |
q0heuer | 15 | 36 | 92 GB | 7 days |
e0mi | 2 | 36 | 192 GB | 7 days |
e0bm | 1 | 36 | 192 GB | 7 days |
p0rohlfi | 7 / 8 | 36 | 92 GB / 192 GB | 7 days |
SFB858 | 3 | 72 | 1.5 TB | 21 days |
The partitions listed above and their resources may change from time to time. To get further and up-to-date information about the partitions, use scontrol show partition. More details about a single partition can be shown with sinfo -p <partition_name>, or with plain sinfo for an overview of all partitions.
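For example, the following standard Slurm commands query the current partition configuration; the partition name normal is only used for illustration:

```bash
# Configuration (limits, node lists, state) of all partitions
scontrol show partition

# Configuration of a single partition, e.g. normal
scontrol show partition normal

# Node and availability details for one partition ...
sinfo -p normal

# ... or a compact overview of all partitions
sinfo
```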