Hardware Overview
The Future Technologies Partition (FTP) is currently split into two cluster installations with different hardware components, different cluster management software stacks, and different login nodes.
FTP is a distributed-memory parallel computer consisting of multiple individual servers called "nodes". The nodes are divided into two clusters, one supporting the ARM instruction set and the other x86. Nodes typically have two x86 (Intel Xeon or AMD EPYC) or ARM processors, at least 32 GiB of local memory, local NVMe or SATA SSD disks, and two high-performance network adapters. All nodes are connected by an extremely fast, low-latency InfiniBand 4X HDR interconnect. In addition, two large parallel file systems are connected to FTP.
The operating system installed on every node is Red Hat Enterprise Linux (RHEL) 8.x. On top of this operating system, a set of (open source) software components such as the Slurm workload manager is installed. Some of these components are of special interest to end users and are briefly discussed here; others are mainly of importance to system administrators and are therefore not covered by this documentation.
The server systems in FTP have different roles and offer different services.
Login Nodes
The login nodes are the only nodes directly accessible to end users. These nodes can be used for interactive logins, file management, software development, and interactive pre- and post-processing. Two nodes, one per cluster, are dedicated as login nodes.
Compute Nodes
The majority of the nodes (25 out of 29) are dedicated to computations. These nodes are not directly accessible to users; instead, calculations have to be submitted to a so-called batch system. The batch system manages all compute nodes and executes the queued jobs according to their priority as soon as the required resources become available. A single job may use many compute nodes and thousands of CPU cores at once.
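Such computations are typically distributed-memory MPI programs. As a minimal, hedged sketch (generic MPI in C, not FTP-specific; the compiler wrapper and submission procedure depend on the locally installed MPI stack and the Slurm configuration), the following program has every process report its rank and the compute node it runs on:

```c
/* Minimal MPI example: each process prints its rank and host name.
 * Generic MPI C code; compile with the cluster's MPI compiler wrapper
 * (e.g. mpicc) and submit through the batch system. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's ID */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */
    MPI_Get_processor_name(host, &len);    /* compute node we run on */

    printf("Rank %d of %d running on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}
```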
Administrative Service Nodes
Some nodes provide additional services such as resource management, external network connections, monitoring, and security. These nodes can be accessed only by system administrators.
FTP-A64 ARM cluster
| | Login node | ARM-A100 | A64FX nodes | Dual Socket ARM Altra Max | Grace-Grace |
|---|---|---|---|---|---|
| No. Nodes | 1 | 4 | 8 | 6 | 6 |
| CPU | 1x Ampere Altra Q80-33 | 1x Ampere Altra Q80-33 | 1x Fujitsu A64FX | 2x Ampere Altra Max M128-30 | 2x Grace CPU Superchip |
| Total Cores | 80 | 80 | 48 | 256 | 144 |
| CPU Clock | 3.3 GHz | 3.3 GHz | 1.8 GHz | 3.0 GHz | 3.4 GHz |
| Instruction Set | ARMv8.2+ | ARMv8.2+ | ARMv8.2-A + SVE | ARMv8.2+ | ARMv9 + SVE2 |
| Memory | 512 GiB | 512 GiB | 32 GiB | 512 GiB | 480 GB |
| Accelerator Type | - | NVIDIA A100 PCIe 40GB | - | - | - |
| No. Accelerators | - | 2 | - | - | - |
| Disk (usable) | 890 GB | 890 GB | 372 GB | 1.8 TB | 3.6 TB |
| Disk Type | SATA SSD | SATA SSD | NVMe | SATA SSD | NVMe |
| InfiniBand | 200 GBit/s BlueField-2 | 200 GBit/s BlueField-2 | 100 GBit/s ConnectX-6 | 200 GBit/s ConnectX-6 | 200 GBit/s ConnectX-6 |
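The instruction-set row above matters in practice: code built for the SVE units of the A64FX or the SVE2 units of Grace should be vector-length-agnostic. As a small sketch using standard ACLE intrinsics (not FTP-specific code, and assuming a compiler invoked with SVE support, e.g. `-march=armv8.2-a+sve`), the following DAXPY kernel adapts at run time to whatever vector length the hardware provides:

```c
/* Vector-length-agnostic DAXPY (y = a*x + y) using SVE ACLE intrinsics.
 * The same binary adapts to the hardware vector length at run time,
 * e.g. 512 bit on A64FX and 128 bit on Grace. */
#include <arm_sve.h>
#include <stdint.h>

void daxpy_sve(int64_t n, double a, const double *x, double *y)
{
    for (int64_t i = 0; i < n; i += svcntd()) {    /* doubles per vector */
        svbool_t pg = svwhilelt_b64_s64(i, n);     /* mask off the tail */
        svfloat64_t vx = svld1_f64(pg, &x[i]);
        svfloat64_t vy = svld1_f64(pg, &y[i]);
        vy = svmla_n_f64_x(pg, vy, vx, a);         /* vy += vx * a */
        svst1_f64(pg, &y[i], vy);
    }
}
```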
FTP-X86 cluster
| | Login node | Intel Cascade Lake + NVIDIA V100 | AMD EPYC Milan + Graphcore | AMD EPYC Milan + MI100 | AMD EPYC Milan + MI210 | AMD EPYC Milan + MI250 |
|---|---|---|---|---|---|---|
| No. Nodes | 1 | 2 | 1 | 1 | 1 | 2 |
| CPU | 2x Intel Xeon Gold 6230 | 2x Intel Xeon Gold 6230 | 2x AMD EPYC 7543 | 2x AMD EPYC 7543 | 2x AMD EPYC 7543 | 2x AMD EPYC 7713 |
| Sockets | 2 | 2 | 2 | 2 | 2 | 2 |
| Total Cores | 40 | 40 | 64 | 64 | 64 | 128 |
| Total Threads | 40 | 40 | 128 | 128 | 128 | 128 |
| CPU Base Clock | 2.1 GHz | 2.1 GHz | 2.8 GHz | 2.8 GHz | 2.8 GHz | 2.0 GHz |
| Instruction Set | x86_64 | x86_64 | x86_64 | x86_64 | x86_64 | x86_64 |
| Memory | 192 GB | 192 GB | 512 GB | 512 GB | 512 GB | 1 TB |
| Accelerator Type | - | NVIDIA V100 | Graphcore IPU-M2000 | AMD MI100 | AMD MI210 | AMD MI250 |
| No. Accelerators | - | 2 | 4 (= 16 IPUs) | 4 | 4 | 4 |
| Accelerator TFLOPS (FP64) | - | 7 | - | 11.5 | 22.6 | 45.3 |
| Accelerator TFLOPS (FP32) | - | 14 | - | 23 | 22.6 | 45.3 |
| Accelerator TFLOPS (FP16) | - | 28.2 | 250 | 184.6 | 181 | 362.1 |
| Accelerator Mem b/w (GB/s) | - | 900 | - | 1228 | 1600 | 3276 |
| Disk | 890 GB | 890 GB | 3 TB | 3 TB | 3 TB | 1.8 TB |
| Disk Type | NVMe | NVMe | NVMe | NVMe | NVMe | NVMe |
| | Intel Sapphire Rapids | Intel Sapphire Rapids + HBM | Intel Sapphire Rapids + Ponte Vecchio | AMD Instinct MI300A |
|---|---|---|---|---|
| No. Nodes | 2 | 2 | 1 | 1 |
| CPU | 2x Intel Xeon Platinum 8480+ | 2x Intel Xeon CPU Max 9470 | 2x Intel Xeon Platinum 8460Y+ | 4x AMD Instinct MI300A (Zen 4) |
| Sockets | 2 | 2 | 2 | 4 |
| Total Cores | 112 | 104 | 80 | 96 |
| Total Threads | 224 | 208 | 160 | 192 |
| CPU Base Clock | 3.8 GHz | 3.5 GHz | 3.7 GHz | |
| Instruction Set | x86_64 | x86_64 | x86_64 | x86_64 |
| Memory | 512 GB | 640 GB | 512 GB | 512 GB |
| Accelerator Type | - | - | Intel Ponte Vecchio XT | AMD Instinct MI300A |
| No. Accelerators | - | - | 4 | 4 |
| Accelerator TFLOPS (FP64) | - | - | 52 | 61.3 |
| Accelerator TFLOPS (FP32) | - | - | 52 | 122.6 |
| Accelerator TFLOPS (FP16) | - | - | 52 | 980.6 |
| Accelerator Mem b/w (GB/s) | - | - | 3276 | 5300 |
| Disk | 890 GB | 890 GB | 890 GB | 29 TB |
| Disk Type | NVMe | NVMe | NVMe | NVMe |
Interconnect
An important component of FTP is the InfiniBand 4X HDR 200 GBit/s interconnect. All nodes are attached to this high-throughput, very low-latency (~1 microsecond) network. InfiniBand is therefore well suited to communication-intensive applications, for example applications that perform many collective MPI operations.
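As a brief, hedged illustration of such a collective (generic MPI in C, nothing FTP-specific), the following program computes a global sum with MPI_Allreduce; at larger node counts the cost of this operation is dominated by per-message latency, which is exactly what a low-latency fabric like HDR InfiniBand minimizes:

```c
/* Global reduction: every rank contributes a local partial sum and
 * receives the global total. Collectives like this stress the
 * interconnect's latency rather than its bandwidth. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = (double)rank;   /* stand-in for a locally computed value */
    double global = 0.0;

    /* Sum 'local' across all ranks; every rank receives the result. */
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %f\n", global);

    MPI_Finalize();
    return 0;
}
```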