Łukasz Szustak

36195063000

Publications - 2

Prediction model of performance–energy trade-off for CFD codes on AMD-based cluster

Publication Name: Future Generation Computer Systems

Publication Date: 2025-08-01

Volume: 169

Issue: Unknown

Page Range: Unknown

Description:

This work explores the importance of performance–energy correlation for CFD codes, highlighting the need for sustainable and efficient use of clusters. The primary goal is to select and predict the optimal number of computational nodes in order to reduce energy consumption and/or improve calculation time. The utilisation cost of the cluster, measured in core-hours, is used as a crucial factor in both energy consumption and the selection of the optimal number of computational nodes. The work is conducted on a cluster with AMD EPYC Milan-based CPUs, using the OpenFOAM application with the Urban Air Pollution model. To investigate performance–energy correlation on the cluster, the CVOPTS (Core VOlume Points per TimeStep) metric is introduced, which allows a direct comparison of parallel efficiency for applications on modern HPC architectures. This metric becomes essential for evaluating and balancing performance against energy consumption to achieve a cost-effective hardware configuration. The results were confirmed by numerous tests on a 40-node cluster, considering representative grid sizes. Based on the empirical results, a prediction model was derived that takes into account both the computational and communication costs of the simulation. The research reveals the impact of the AMD EPYC architecture on superspeedup, where performance increases superlinearly with the addition of more computational resources. This phenomenon enables a priori prediction of performance–energy trade-offs (computing-faster or energy-saving setups) for a specific application scenario by varying the number of computing nodes.

Open Access: Yes

DOI: 10.1016/j.future.2025.107810

Evaluating AMD EPYC CPU architectures on CFD applications

Publication Name: Future Generation Computer Systems

Publication Date: 2026-04-01

Volume: 177

Issue: Unknown

Page Range: Unknown

Description:

In this work, the authors assess the impact of the AMD EPYC processor architecture on the performance of CFD applications. Several generations of the architecture were analyzed (Rome, Milan, Milan X, Genoa, Genoa X and Bergamo), which differ in the number of cores (64-128), L3 cache size (256-1152 MB) and RAM type (8-channel DDR4 or 12-channel DDR5). The research was conducted with the OpenFOAM application using two memory-bound models: motorBike and Urban Air Pollution. To compare application performance across architectures directly, the FVOPS (Finite VOlumes solved Per Second) metric was introduced. It was observed that the local performance maximum occurs at different numbers of grid elements per CPU depending on the processor type. Additionally, the behaviour of the models was analyzed in detail using the AMD µProf and LIKWID profiling tools to reveal the applications’ interaction with the hardware. This enabled fine-grained monitoring of CPU behaviour and identified potential inefficiencies in AMD EPYC CPUs. Particular attention was paid to the effective use of L2 and L3 cache memory in the context of their capacity and the bandwidth of the memory channels, which are key factors in memory-bound applications. Processor features were analyzed from a cross-platform perspective, which made it possible to determine the metrics with the greatest impact on the performance achieved by CFD applications.

Open Access: Yes

DOI: 10.1016/j.future.2025.108237