High Performance Computing : Partitions (Queues)
The batch or job scheduling system on PALMA-II is called SLURM. If you are used to PBS/Maui and want to switch to SLURM, this document might help you. The job scheduler starts and manages computations on the cluster and distributes resources among all users according to their needs. Computation jobs (but also interactive sessions) can be submitted to different queues (or partitions, in Slurm terminology), which serve different purposes (a minimal example batch script that selects a partition is sketched below the first table):
Partitions
Available for everyone:
Name | Purpose | CPU Arch | # Nodes | # GPUs / node | Compute capability of GPU | max. CPUs (threads) / node | max. Mem / node | max. Walltime | BeeOND storage |
---|---|---|---|---|---|---|---|---|---|
normal | general computations | Skylake (Gold 6140) | 143 | - | - | 36 | 92 GB / 192 GB | 24 hours | 350 GB |
long | general computations | Skylake (Gold 6140) | - | - | - | 36 | 92 GB / 192 GB | 7 days | 350 GB |
express | short running (test) jobs, compilation | Skylake (Gold 6140) | 5 | - | - | 36 | 92 GB | 2 hours | 350 GB |
bigsmp | SMP | Skylake (Gold 6140) | 3 | - | - | 72 | 1.5 TB | 7 days | 350 GB |
largesmp | SMP | Skylake (Gold 6140) | 2 | - | - | 72 | 3 TB | 7 days | 350 GB |
requeue* | This queue will use the free nodes from the group exclusive nodes listed below. | Skylake (Gold 6140) | 68 / 50 / 3 | - | - | 36 / 36 / 72 | 92 GB / 192 GB / 1.5 TB | 1 day | 350 GB |
gpuv100 | Nvidia V100 GPUs | Skylake (Gold 6140) | 1 | 4 | 7.0 | 24 | 192 GB | 7 days | 930 GB |
vis-gpu | Nvidia Titan XP | Skylake (Gold 6140) | 1 | 8 | 6.1 | 24 | 192 GB | 2 days | -- |
vis | Visualization / GUIs | Skylake (Gold 6140) | 1 | - | - | 36 | 92 GB | 2 hours | -- |
zen2-128C-496G | SMP | Zen2 (EPYC 7742) | 12 | - | - | 128 | 496 GB | 7 days | 1.8 TB |
gpu2080 | GeForce RTX 2080 Ti | Zen3 (EPYC 7513) | 5 | 8 | 7.5 | 32 | 240 GB | 7 days | 930 GB |
gpuexpress | GeForce RTX 2080 Ti | Zen3 (EPYC 7513) | 1 | 8 | 7.5 | 32 | 240 GB | 2 hours | 930 GB |
gputitanrtx | Nvidia Titan RTX | Zen3 (EPYC 7343) | 1 | 4 | 7.5 | 32 | 240 GB | 7 days | 1.4 TB |
gpu3090 | GeForce RTX 3090 | Zen3 (EPYC 7413) | 2 | 8 | 8.6 | 48 | 240 GB | 7 days | -- |
gpua100 | Nvidia A100 | Zen3 (EPYC 7513) | 5 | 4 | 8.0 | 32 | 240 GB | 7 days | 930 GB |
gpuhgx | Nvidia A100 SXM 80GB | Zen3 (EPYC 7343) | 2 | 8 | 8.0 | 64 | 990 GB | 7 days | 7 TB |
gpuexpress
You can allocate a maximum of one job with 2 GPUs, 8 CPU cores, and 60 GB of RAM on this node.
requeue*
If your job is running on one of the requeue nodes while that node is requested by its owning exclusive group partition (listed below), the job will be terminated and resubmitted, so use this partition with care!
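As a hedged illustration of how a partition is selected at submission time, the following minimal batch script requests resources on gpuexpress within the limits stated above. The job name, output file, and the final command are placeholders and not part of the official PALMA-II documentation; adapt them to your own application.

```bash
#!/bin/bash
#SBATCH --job-name=gpu-test            # arbitrary job name (placeholder)
#SBATCH --partition=gpuexpress         # partition (queue) to submit to
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8              # within the 8-core limit of gpuexpress
#SBATCH --gres=gpu:2                   # within the 2-GPU limit of gpuexpress
#SBATCH --mem=60G                      # within the 60 GB limit of gpuexpress
#SBATCH --time=01:00:00                # must not exceed the 2-hour walltime
#SBATCH --output=gpu-test_%j.log       # %j expands to the job ID

# Placeholder for the actual computation:
srun ./my_gpu_program
```

Submit the script with `sbatch job.sh` and monitor it with `squeue -u $USER`. Interactive sessions can be requested in a similar way, e.g. `srun --partition=express --ntasks=1 --time=00:30:00 --pty bash` (again only a sketch; adjust the resource requests to the partition limits above).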
Group exclusive:
Name | # Nodes | max. CPUs (threads) / node | max. Mem / node | max. Walltime |
---|---|---|---|---|
p0fuchs | 9 | 36 | 92 GB | 7 days |
p0kulesz | 6 / 3 | 36 | 92 GB / 192 GB | 7 days |
p0kapp | 1 | 36 | 92 GB | 7 days |
p0klasen | 1 / 1 | 36 | 92 GB / 192 GB | 7 days |
hims | 25 / 1 | 36 | 92 GB / 192 GB | 7 days |
d0ow | 1 | 36 | 92 GB | 7 days |
q0heuer | 15 | 36 | 92 GB | 7 days |
e0mi | 2 | 36 | 192 GB | 7 days |
e0bm | 1 | 36 | 192 GB | 7 days |
p0rohlfi | 7 / 8 | 36 | 92 GB / 192 GB | 7 days |
SFB858 | 3 | 72 | 1.5 TB | 21 days |
The partitions listed above and their resources may change from time to time. To get further and up-to-date information about the partitions, use scontrol show partition. More details about a single partition can be shown with sinfo -p <partition_name>, or with plain sinfo for an overview of all partitions.
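For example, the following standard Slurm commands query the current partition configuration; the partition name normal is only used for illustration:

```bash
# Configuration (limits, node lists, state) of all partitions
scontrol show partition

# Configuration of a single partition, e.g. normal
scontrol show partition normal

# Node and availability details for one partition ...
sinfo -p normal

# ... or a compact overview of all partitions
sinfo
```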