Csaba Nemes

55185335100

Publications - 2

Accelerating unstructured finite volume computations on field-programmable gate arrays

Publication Name: Concurrency and Computation: Practice and Experience

Publication Date: 2014-03-10

Volume: 26

Issue: 3

Page Range: 615-643

Description:

In the paper, a field-programmable gate array (FPGA)-based framework is described to efficiently accelerate unstructured finite volume computations where the same mathematical expression has to be evaluated at every point of the mesh. The irregular memory access patterns caused by the unstructured spatial discretization are eliminated by a novel mesh node reordering technique, and a special architecture is designed to fully utilize the benefits of the predictable memory access patterns. In the proposed architecture, a fixed-size moving window of the input stream of the reordered state variables is cached into the on-chip memory, and a pipelined chain of processing elements, which gets input only from the fast on-chip memory, is used to carry out the computations. The arithmetic unit (AU) of the processing elements is generated from the data flow graph extracted from the given mathematical expression. The data flow graph is partitioned with a novel graph partitioning algorithm to break up the AU into smaller locally controlled parts, which can be implemented more efficiently in an FPGA than the globally controlled AU. The proposed architecture and algorithms are presented via a case study solving the Euler equations on an unstructured mesh. On the largest currently available FPGA, the generated architecture contains three processing elements working in a pipelined fashion to provide one order of magnitude speedup compared with a high-performance microprocessor and three times speedup compared with a high-performance graphics processing unit. Copyright © 2013 John Wiley & Sons, Ltd.

Open Access: Yes

DOI: 10.1002/cpe.3022
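
Illustrative note: the abstract above does not say which reordering algorithm the authors use, so the following is only a minimal sketch of the general idea, with a Cuthill-McKee-style breadth-first renumbering swapped in as a stand-in. The function names `bandwidth` and `bfs_renumber` and the toy mesh are hypothetical. The sketch shows how renumbering can shrink the largest index distance between neighbouring cells, which is the quantity that would bound the size of a fixed moving window over the input stream.

```python
from collections import deque


def bandwidth(adj, order):
    """Largest index distance between neighbouring nodes under `order`."""
    pos = {node: i for i, node in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def bfs_renumber(adj, start):
    """Cuthill-McKee-style renumbering (a stand-in, not the paper's method).

    Visits nodes breadth-first, taking low-degree neighbours first.
    Assumes the mesh graph is connected.
    """
    order, seen = [], {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in sorted(adj[u], key=lambda n: (len(adj[n]), n)):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order


# Hypothetical toy mesh: a chain of 8 cells whose labels are scrambled,
# so neighbouring cells sit far apart in the original numbering.
chain = [0, 5, 2, 7, 1, 6, 3, 4]
adj = {n: set() for n in chain}
for a, b in zip(chain, chain[1:]):
    adj[a].add(b)
    adj[b].add(a)

original = sorted(adj)                  # cells stored in label order 0..7
reordered = bfs_renumber(adj, start=0)  # cells stored in BFS order

print(bandwidth(adj, original), bandwidth(adj, reordered))  # 6 1
```

In this toy case every neighbour of a cell lies within one position of it after renumbering, so a symmetric window of 2 * 1 + 1 = 3 entries would hold all operands; with the original numbering the same window would need 2 * 6 + 1 = 13 entries.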

FPGA based acceleration of computational fluid flow simulation on unstructured mesh geometry

Publication Name: Proceedings - 22nd International Conference on Field Programmable Logic and Applications, FPL 2012

Publication Date: 2012-12-12

Volume: Unknown

Issue: Unknown

Page Range: 128-135

Description:

Numerical simulation of complex computational fluid dynamics problems evolving in time plays an important role in scientific and engineering applications. Accurate behavior of dynamical systems can be understood using large-scale simulations, which traditionally require expensive supercomputing facilities. In the paper, a Field Programmable Gate Array (FPGA) based framework is described to accelerate simulation of complex physical spatio-temporal phenomena. Simulating complicated geometries requires unstructured spatial discretization, which results in irregular memory access patterns that severely limit computing performance. Data locality is improved by a mesh node renumbering technique, which results in a sequential memory access pattern. Additionally, storing a small window of cell-centered state values in the on-chip memory of the FPGA can increase data reuse and decrease memory bandwidth requirements. Generation of the floating-point data path and control structure of the arithmetic unit containing dozens of operators is a very challenging task when the goal is high operating frequency. The efficiency and use of the framework are described by a case study solving the Euler equations on an unstructured mesh using the finite volume technique. On the largest currently available FPGA, the generated architecture contains three processing elements working in parallel, providing a 75-fold speedup compared to a high-performance microprocessor. © 2012 IEEE.

Open Access: Yes

DOI: 10.1109/FPL.2012.6339276