Linuxcluster: Hardware

Hardware configuration

The HPC cluster at TUHH-RZ consists of 200 compute nodes, several login nodes, and a parallel storage system with a capacity of 300 TB. In total, about 8700 CPU cores, 50 TB of RAM, and several GPUs are available for compute-intensive workloads.

Login nodes

The HPC cluster has several login nodes. Some login nodes may be temporarily unavailable due to maintenance. If you do not have specific hardware or software requirements, you are advised to use the alias hpclogin.rz.tuhh.de.
Nodes            Cores   CPU Type                  RAM     Recommended usage
hpc1.rz.tuhh.de  2       (virtual)                 4 GB    managing batch jobs, data transfer
hpc2.rz.tuhh.de  2× 16   2× AMD Epyc 9124          384 GB  managing batch jobs, data transfer, building software, pre- and postprocessing, short test runs
hpc3.rz.tuhh.de  2× 16   2× AMD Epyc 9124          384 GB  managing batch jobs, data transfer, building software, pre- and postprocessing, short test runs
hpc4.rz.tuhh.de  2× 10   2× Intel Xeon E5-2660v3   128 GB  managing batch jobs, data transfer, building software, pre- and postprocessing, short test runs
hpc5.rz.tuhh.de  2× 10   2× Intel Xeon E5-2660v3   128 GB  managing batch jobs, data transfer, building software, pre- and postprocessing, short test runs

Compute nodes

Nodes       Cores   CPU Type                      RAM     Comment
n[001-056]  2× 32   2× AMD Epyc 9354              384 GB
n[057-112]  2× 32   2× AMD Epyc 9354              768 GB
g[209-216]  2× 14   2× Intel Xeon E5-2680v4       128 GB
g[217-224]  2× 16   2× Intel Xeon Gold 6130       192 GB
g[225-228]  2× 24   2× Intel Xeon Gold 5318Y      512 GB
u[008-009]  2× 36   2× Intel Xeon Platinum 8352V  512 GB  with four NVIDIA A100 GPUs (80 GB memory each)
u[010-011]  2× 32   2× AMD Epyc 9334              768 GB  with four NVIDIA H100 GPUs (80 GB memory each)

Storage

  • Home directory
    • The home directory is mounted from central file servers of TUHH-RZ and is also available in the Linux PC pools. The file system is backed up and snapshots are available.
    • The standard quota is 10 GB; it can be increased on request.
    • Slow storage for crucial data; not suitable for large scientific data sets.
  • Local file systems
    • Each node has local storage. Below /usertemp a personal subdirectory is created for each user, following the pattern /usertemp/<unix-group>/<username>, e.g. /usertemp/rzt/rztkm.
    • The path /usertemp exists on all nodes but always points to that node's local storage; each node can only access its own /usertemp.
    • Data below /usertemp are not backed up and are subject to deletion after 14 days of inactivity or after a reboot of the node.
    • Fast storage as a local working directory.
  • Parallel Lustre network file system
    • The HPC cluster is equipped with a parallel storage system (Lustre).
    • Below /work a personal subdirectory /work/<unix-group>/<username> is created for each user, e.g. /work/rzt/rztkm.
    • The parallel file system is intended for temporary data during the simulation. All data is subject to automatic deletion after 90 days of inactivity.
    • Globally visible.
    • Tradeoff between home directory (globally visible, secure, slow, small) and local storage (locally visible, fast).
    • This storage class has no backup; do not use it for permanent storage of important data!
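The per-user directory layout described above can be sketched in shell; the group rzt and user rztkm are the example names from the text, so substitute your own:

```shell
# Sketch of the two scratch locations for the example user "rztkm"
# in Unix group "rzt" (names taken from the examples above).
UNIX_GROUP=rzt
USER_NAME=rztkm

LOCAL_SCRATCH=/usertemp/${UNIX_GROUP}/${USER_NAME}  # node-local: fast, purged after 14 days of inactivity or a reboot
WORK_DIR=/work/${UNIX_GROUP}/${USER_NAME}           # Lustre: globally visible, purged after 90 days of inactivity

echo "$LOCAL_SCRATCH"
echo "$WORK_DIR"
```

A common pattern is to stage job input in the Lustre directory so all nodes can read it, write intermediate files to the node-local scratch for speed, and copy final results back to the backed-up home directory.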