PERUN Architecture
The PERUN Supercomputer serves as the central computing system within the Supercomputing Center of the Technical University of Košice.
It is designed for highly parallel computations, scientific simulations, big data processing, and GPU-accelerated tasks using the latest generation of computing technologies.
The system is built upon two complementary partitions — a universal computing partition and an accelerated GPU partition — together providing a balanced combination of performance, flexibility, and energy efficiency.
Universal Computing Partition (PERUN Universal)
This partition is designed for a wide range of HPC workloads — from parallel scientific simulations to data-intensive applications.
- Number of nodes: 32 × HPE Cray XD2000 (XD225v)
- Processor: 2 × AMD EPYC 9745 (256 cores per node)
- Memory (RAM): 1,536 GB DDR5 ECC
- Networking:
  - 2 × 100 Gb/s Ethernet
  - 1 × 200 Gb/s NDR200 InfiniBand
This partition delivers high computational performance for traditional HPC applications, optimized parallel algorithms, and workloads with demanding memory requirements.
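The per-node figures above can be rolled up into partition-wide totals. The following is a minimal sketch of that arithmetic; the per-node values come from the list above, while the aggregate numbers are derived here and are not quoted from official documentation.

```python
# Per-node figures are taken from the specification list above.
# The totals below are derived values, not officially published numbers.
NODES = 32
CORES_PER_NODE = 256      # 2 x AMD EPYC 9745
RAM_GB_PER_NODE = 1536    # DDR5 ECC

total_cores = NODES * CORES_PER_NODE
total_ram_gb = NODES * RAM_GB_PER_NODE
ram_gb_per_core = RAM_GB_PER_NODE / CORES_PER_NODE

print(f"Total CPU cores: {total_cores}")           # 8192
print(f"Total RAM: {total_ram_gb} GB")             # 49152 GB
print(f"RAM per core: {ram_gb_per_core:.0f} GB")   # 6 GB
```

The roughly 6 GB of RAM per core is a useful rule of thumb when sizing memory-bound jobs that use every core of a node.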
Accelerated GPU Partition (PERUN AI)
This partition is dedicated to the most demanding computations in artificial intelligence, machine learning, numerical simulations, and data-intensive workloads that use GPU acceleration.
- Number of nodes: 26 × HPE ProLiant Compute XD685
- Processor (CPU): 2 × AMD EPYC 9535 (128 cores per node)
- GPU Accelerator: 8 × NVIDIA H200 with 141 GB HBM3e memory each
- Memory (RAM): 2,304 GB DDR5 ECC
- Networking:
  - 2 × 100 Gb/s Ethernet
  - 4 × 400 Gb/s NDR InfiniBand
- GPU-to-GPU bandwidth: 900 GB/s
- GPU-to-CPU bandwidth: 128 GB/s
- Internal GPU Network: NVIDIA NVLink
Thanks to the NVIDIA Hopper architecture and its high-throughput interconnects, this partition provides exceptional performance for massively parallel workloads, neural-network training, and large-scale data modeling.
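As a rough illustration of the partition's scale, the per-node figures listed above can be aggregated as follows. This is a sketch of the arithmetic only; the totals are derived from the listed specifications and are not official figures.

```python
# Per-node figures come from the PERUN AI specification list above.
# All totals are derived values, not officially published numbers.
GPU_NODES = 26
GPUS_PER_NODE = 8
HBM_GB_PER_GPU = 141       # NVIDIA H200, HBM3e
GPU_NODE_CORES = 128       # 2 x AMD EPYC 9535
GPU_NODE_RAM_GB = 2304     # DDR5 ECC

total_gpus = GPU_NODES * GPUS_PER_NODE
hbm_gb_per_node = GPUS_PER_NODE * HBM_GB_PER_GPU
total_hbm_gb = total_gpus * HBM_GB_PER_GPU
gpu_partition_cores = GPU_NODES * GPU_NODE_CORES
gpu_partition_ram_gb = GPU_NODES * GPU_NODE_RAM_GB

print(f"GPUs in partition: {total_gpus}")                 # 208
print(f"HBM3e per node: {hbm_gb_per_node} GB")            # 1128 GB
print(f"HBM3e in partition: {total_hbm_gb} GB")           # 29328 GB
print(f"CPU cores in partition: {gpu_partition_cores}")   # 3328
print(f"System RAM in partition: {gpu_partition_ram_gb} GB")  # 59904 GB
```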
Performance and Flexibility
By combining these two components — the universal CPU partition and the accelerated GPU partition — the PERUN system enables optimal allocation of computing resources based on task requirements.
Its architecture ensures high performance, stability, and scalability to support research, education, and innovative technological projects.
- Total Performance: 10.7 PFLOP/s (Rmax)
The PERUN Supercomputer is directly connected to a high-speed InfiniBand network and the HPC PERUN data storage system, enabling efficient data processing and full utilization of available computational power.
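The idea of matching a job to a partition based on its requirements can be sketched as a small helper. The function below is purely hypothetical: it is not part of any PERUN tooling, and the rule of sending large-memory CPU jobs to the GPU nodes is an assumption made for illustration, not a documented site policy. Only the partition names and per-node RAM figures come from this page.

```python
# Per-node RAM figures from this page.
UNIVERSAL_RAM_GB = 1536   # PERUN Universal
AI_RAM_GB = 2304          # PERUN AI


def suggest_partition(needs_gpu: bool, ram_gb_per_node: int) -> str:
    """Hypothetical helper: pick a partition name from simple job requirements.

    Assumption (not a documented policy): a CPU-only job that needs more RAM
    per node than PERUN Universal offers is pointed at PERUN AI, because that
    is the only node type with enough memory.
    """
    if ram_gb_per_node > AI_RAM_GB:
        raise ValueError("exceeds the RAM of any single node; split the job across nodes")
    if needs_gpu or ram_gb_per_node > UNIVERSAL_RAM_GB:
        return "PERUN AI"
    return "PERUN Universal"


print(suggest_partition(needs_gpu=False, ram_gb_per_node=512))   # PERUN Universal
print(suggest_partition(needs_gpu=True, ram_gb_per_node=512))    # PERUN AI
print(suggest_partition(needs_gpu=False, ram_gb_per_node=2000))  # PERUN AI
```

In practice the actual choice is made through the center's job scheduler and its usage policies, which this page does not describe.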
What Makes This Cluster Unique in Slovakia
- High-Performance NVIDIA H200 GPU Accelerators: TUKE operates H200 accelerators equipped with 141 GB of HBM3e memory per accelerator, roughly 47% more than the 96 GB of HBM3 found in standard GH200 systems. For large datasets and AI training workloads, the extra high-bandwidth memory lets larger models and batches fit on a single GPU, which makes training faster and more efficient.
- Ultra-Fast GPU Interconnect with NVLink: Each accelerated compute node features 8 GPUs interconnected by a 900 GB/s NVLink fabric. This is well above the bandwidth of InfiniBand communication between nodes (a single 400 Gb/s NDR link carries about 50 GB/s), enabling far more efficient parallel training of large language models and other computationally intensive AI workloads within a node.
- Exceptional System Memory Capacity: Memory is crucial for large-scale simulations, HPC tasks, big-data processing, and modern AI research. The cluster provides outstanding RAM capacity:
  - 2,304 GB DDR5 ECC in each accelerated node
  - 1,536 GB DDR5 ECC in each general-purpose node
- Integrated Quantum Simulator: The infrastructure includes both hardware and software components of a quantum simulator, with the ability to integrate a future quantum computer seamlessly. This ensures a smooth transition to real quantum processing without the need to redesign the existing software architecture.
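The memory and interconnect comparisons in the list above reduce to simple unit conversions. The sketch below works through them using only the nominal figures quoted on this page; real-world throughput depends on topology, protocol overhead, and whether a figure is counted per direction, so the ratios should be read as rough indications rather than measurements.

```python
def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbit_per_s / 8


# Nominal figures quoted on this page.
NVLINK_GB_S = 900            # GPU-to-GPU bandwidth
IB_LINK_GBIT_S = 400         # one NDR InfiniBand link
IB_LINKS_PER_NODE = 4        # per accelerated node
H200_HBM_GB = 141
GH200_HBM_GB = 96

ib_link_gb_s = gbit_to_gbyte(IB_LINK_GBIT_S)
ib_node_gb_s = ib_link_gb_s * IB_LINKS_PER_NODE

print(f"One NDR link: {ib_link_gb_s:.0f} GB/s")                        # 50 GB/s
print(f"All four NDR links: {ib_node_gb_s:.0f} GB/s")                  # 200 GB/s
print(f"NVLink vs one link: {NVLINK_GB_S / ib_link_gb_s:.1f}x")        # 18.0x
print(f"NVLink vs four links: {NVLINK_GB_S / ib_node_gb_s:.1f}x")      # 4.5x
print(f"H200 vs GH200 memory: {H200_HBM_GB / GH200_HBM_GB - 1:.0%} more")  # 47% more
```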
