While advances in software tools and frameworks have enabled individuals to create interesting new applications and products in reasonable time frames, hardware designs still take large teams multiple years. This disparity in required effort dampens hardware innovation and interest in the field. To address this issue, we must make hardware/software systems easier and more fun to develop, which means enabling a more “agile” hardware development flow: one that makes it easy and quick to modify an existing design and play with the resulting system. To foster this goal of agile hardware design, we propose to create an open-source hardware/software tool chain that rapidly creates and validates alternative hardware implementations, and a new open-source RISC-V/CGRA SoC that will enable rapid execution/emulation of the resulting designs.
Today’s CAD tools are extremely powerful and allow us to create systems of enormous complexity. Unfortunately, their evolutionary development path and focus on achieving the best possible performance mean that neither their run time nor the ramp-up time needed to learn them fits into an agile design flow. Our recent work on domain-specific languages (DSLs) for creating hardware (Genesis2, FPGen, Darkroom, Rigel, Halide) has shown that a different approach is possible, and we base our new tool chain on this approach. In addition to extending this base work on generating hardware RTL, this research adds fast physical mapping of RTL to FPGAs and CGRAs, and automatic generation of hardware/software APIs to interface the new hardware into a RISC-V/Linux system. In the spirit of agile design, we will build this toolset in an agile manner, leveraging existing tools when possible. We will use image/vision computation as our initial domain, and we plan to work closely with Kayvon’s CMU/Intel center in application selection. This research involves two mutually dependent tasks: the design tool flow and the configurable SoC. The design tool flow makes use of modern SMT solvers, and we expect to make contributions both in using SMT for these applications and in the underlying solvers themselves. Below we summarize the key challenges and approaches in each of these areas.
Tool flow for agile hardware. Enabling agile hardware design requires solving two problems: providing a lightweight environment where designers can experiment at a high level to explore the solution space, and building a tool chain that can rapidly generate an implementation of a design choice. The key to solving both problems is creating clean design abstractions, which make it possible to embed knowledge about optimization into the tools and so let designers leverage other people’s tools and expertise. We will extend our prior work to create physical implementations as well, bypassing the slow and complex FPGA tools. Because we start from a high-level description and know the system microarchitecture (it comes from a template), we can do placement at a higher level, which reduces the number of objects that we need to deal with. This should allow us to use integer linear programs or SMT solvers to address the placement and routing problems. Furthermore, since we are interested in rapid design cycles, we will create tools that can handle incremental changes rather than restarting the problem from scratch each time. This changes the objectives of the placement and routing tools: the smallest, lowest-power design might not be the best, since it might be the hardest to update incrementally.
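To make the idea of coarse-grained, incremental placement concrete, the following is a minimal sketch and not the proposed tool. It assumes a hypothetical setting in which a handful of template-level blocks must be assigned to distinct tiles of a small grid, with every connection limited to a maximum Manhattan length. A tiny backtracking search stands in for the ILP or SMT solver mentioned above; the names `place`, `place_incremental`, and `max_len` are illustrative assumptions.

```python
from itertools import product


def place(blocks, nets, width, height, max_len=1, fixed=None):
    """Assign each block a distinct (x, y) tile on a width x height grid.

    Constraint: the two endpoints of every net lie at most `max_len`
    tiles apart (Manhattan distance). `fixed` pins blocks to tiles taken
    from an earlier solution; pinned blocks are assumed to be mutually
    consistent already. Returns a dict block -> tile, or None.
    """
    tiles = list(product(range(width), range(height)))
    placement = dict(fixed or {})
    todo = [b for b in blocks if b not in placement]

    def ok(block, tile):
        # A tile holds at most one block.
        if tile in placement.values():
            return False
        # Check every net that connects `block` to an already placed block.
        for a, b in nets:
            other = b if a == block else a if b == block else None
            if other in placement:
                ox, oy = placement[other]
                if abs(ox - tile[0]) + abs(oy - tile[1]) > max_len:
                    return False
        return True

    def solve(i):
        # Plain backtracking; a stand-in for an ILP or SMT back end.
        if i == len(todo):
            return True
        for tile in tiles:
            if ok(todo[i], tile):
                placement[todo[i]] = tile
                if solve(i + 1):
                    return True
                del placement[todo[i]]
        return False

    return placement if solve(0) else None


def place_incremental(previous, blocks, nets, width, height, max_len=1):
    """Try to keep every surviving block where it was; re-solve fully only if needed."""
    keep = {b: t for b, t in previous.items() if b in blocks}
    result = place(blocks, nets, width, height, max_len, fixed=keep)
    if result is None:
        result = place(blocks, nets, width, height, max_len)
    return result


if __name__ == "__main__":
    blocks = ["load", "conv", "store"]
    nets = [("load", "conv"), ("conv", "store")]
    first = place(blocks, nets, 2, 2)
    # Design change: add one block without disturbing the existing ones.
    second = place_incremental(first, blocks + ["hist"],
                               nets + [("load", "hist")], 2, 2)
    print(first)
    print(second)
```

The fallback in `place_incremental` reflects the trade-off described above: a densely packed solution leaves little room for pinned updates and forces a full re-solve, whereas a looser one absorbs small changes cheaply.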
Coarse-Grained Reconfigurable SoC. In parallel with the DSL and tool flow development, we will also develop an open-source SoC that combines RISC-V processors and their peripherals with a coarse-grained reconfigurable array (CGRA). This array will be designed and tuned to be efficient for image and vision processing applications and will leverage our previous work on the Frankencamera project, which showed that a CGRA is more efficient than other programmable engines for this application class. Working with the tool designers, we will co-design the array’s microarchitecture to optimize the quality and speed of the CAD tools, including pathways that allow incremental updates of the FPGA programming and local caching of the next configuration data to allow rapid hardware changes. Since much of the design cost of custom hardware lies in the software needed to interface with the new hardware, our generators will customize the OS drivers as needed and will also generate the user-level API to the synthesized hardware blocks.
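As an illustration of what generating a user-level API from a hardware description could look like, here is a minimal sketch under stated assumptions. It supposes that a generated block is described by a hypothetical register map (register name to byte offset) of 32-bit little-endian registers, and it uses a `bytearray` in place of the memory-mapped device region. The function `make_api` and the `Blur` block with its register names are invented for the example and are not part of the proposed system.

```python
import struct


def make_api(block_name, registers):
    """Build a class exposing each register as a read/write attribute.

    `registers` maps a register name to its byte offset within the block.
    Registers are assumed to be 32 bits wide and little-endian.
    """
    def make_prop(offset):
        def getter(self):
            return struct.unpack_from("<I", self._mem, self._base + offset)[0]

        def setter(self, value):
            # Truncate to 32 bits so oversized values do not raise.
            struct.pack_into("<I", self._mem, self._base + offset,
                             value & 0xFFFFFFFF)

        return property(getter, setter)

    def __init__(self, mem, base=0):
        self._mem = mem    # writable buffer standing in for device memory
        self._base = base  # byte offset of this block within the buffer

    attrs = {"__init__": __init__}
    for name, offset in registers.items():
        attrs[name] = make_prop(offset)
    return type(block_name, (object,), attrs)


if __name__ == "__main__":
    # Hypothetical register map for a generated blur accelerator.
    Blur = make_api("Blur", {"ctrl": 0x0, "width": 0x4, "height": 0x8})
    mem = bytearray(16)  # a real driver would map the device region instead
    dev = Blur(mem)
    dev.width = 640
    dev.height = 480
    print(dev.width, dev.height)
```

Deriving the accessor class from the same description that produces the hardware keeps the software view of the register layout in step with each design change, which is the property the generators described above are meant to provide.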
SMT solver. Some of the new placement and routing tools will leverage SMT solvers, and we expect to increase the performance of these solvers for these applications by working at higher levels of abstraction, by developing domain-specific solver techniques, and by adding optimization capabilities.