Developing FPGAs as an Acceleration Platform for Data-Intensive Applications

Speaker: James Thomas, PhD Student, Stanford University
Date: July 13, 2022

As the use of FPGAs has grown in datacenters, it has become increasingly clear that they are fairly unproductive for developers compared to competing platforms like GPUs and CPUs. We aim to close this gap by taking a domain-specific approach. We argue that general-purpose FPGA development tools are fundamentally limited by the complexity of the platform, and therefore focus on building faster and simpler tools for the specific case of streaming data-intensive applications.

We first present Fleet, a system for accelerating massively parallel streaming workloads on FPGAs. Fleet provides a simple language for users to define a compute unit that processes a single stream of data, and then automatically replicates the compute unit many times into a memory controller fabric so that the final design can process many independent streams at once. We next present a fast compilation system for Fleet-like applications. Our system leverages the fact that the memory controller design is fixed across applications, and therefore compiles it ahead of time, leaving empty slots for copies of the user's compute unit. The user-visible compile time is thus reduced to only the time required to compile a few copies of the compute unit and replicate them into the prebuilt memory controller. Finally, we leverage the design patterns of identical compute units and streaming DRAM access to design an FPGA accelerator for the problem of finding interesting subgroups in large tabular datasets. This accelerator is able to outperform GPUs and CPUs on a cost-per-throughput basis due to its customized partitioning of SRAM resources across compute units.
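The programming model described above, in which the user writes one compute unit for a single stream and the system replicates it across many independent streams, can be illustrated with a small software analogy. The following is a minimal Python sketch, not Fleet's language or API: the names `windowed_sum_unit` and `replicate` are hypothetical, and the windowed sum is an arbitrary stand-in for a user's streaming computation. It models only the "one unit per stream, no shared state" pattern, not the hardware memory controller or the compilation flow.

```python
from typing import Callable, Iterable, Iterator, List


def windowed_sum_unit(stream: Iterable[int], window: int = 4) -> Iterator[int]:
    """Hypothetical compute unit: consumes one stream token by token and
    emits the sum of every `window` consecutive tokens."""
    total = 0
    count = 0
    for token in stream:
        total += token
        count += 1
        if count == window:
            yield total
            total = 0
            count = 0
    if count:
        # Flush a final partial window so that no input is dropped.
        yield total


def replicate(
    unit: Callable[[Iterable[int]], Iterator[int]],
    streams: List[Iterable[int]],
) -> List[List[int]]:
    """Hypothetical stand-in for replication: run an independent copy of
    `unit` on each stream. The copies share no state, which is the property
    that would let hardware copies run side by side."""
    return [list(unit(stream)) for stream in streams]


if __name__ == "__main__":
    results = replicate(windowed_sum_unit, [[1, 2, 3, 4, 5], [10, 20]])
    print(results)
```

Because each copy touches only its own stream, the user-written part stays the same no matter how many copies are instantiated; in this sketch only the list passed to `replicate` changes.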