Straight to the Point: Intra- and Intercluster LUT Connections to Mitigate the Delay of Programmable Routing

Published in FPGA'20, 2020

Abstract: Technology scaling makes metal delay ever more problematic, but routing between Look-Up Tables (LUTs) still passes through a series of transistors. It seems wise to avoid the corresponding delay whenever possible. Direct connections between LUTs, both within and across multiple clusters, can eschew the transistor delays of crossbars, connection blocks, and switch blocks. In this paper we investigate the usefulness of enhancing classical Field-Programmable Gate Array (FPGA) architectures with direct connections between LUTs. We present an efficient algorithm for searching automatically the most interesting patterns of such direct connections. Despite our methods being fairly conservative and relying on the use of unmodified standard CAD tools, we obtain a 2.77% improvement of the geometric mean critical path delay of a standard benchmark set, with improvement ranging from -0.17% to 7.3% for individual circuits. As modest as these results may seem at first glance, we believe that they position direct connections between LUTs as a promising topic for future research. Extending this work with dedicated CAD algorithms and exploiting the increased possibilities for optimal buffering, diagonal routing, and pipelining could prove direct connections important to the continuation of performance improvement into next generation FPGAs.

[DOI] [Author’s local PDF copy of the paper] [Slides]