Last week at the Embedded Systems Conference (ESC) held in San Jose, California, Xilinx disclosed additional information about its upcoming Extensible Processing Platform (EPP), which I previously discussed in a February 1 blog entry written just after RTECC (the Real Time Embedded Computing Conference, see Designing Low-Power Systems with FPGAs, Part 2). This past week at a press conference, Xilinx’s Senior VP of Worldwide Marketing and Business Development Vin Ratford again spoke of the upcoming processor-centric devices Xilinx plans to introduce next year, but this time he provided far more detail. As promised, the devices fuse features of a high-end microcontroller (hard-core implementations of a 32-bit processor, memory, and I/O) with an FPGA fabric. But wait, you say, haven’t both Xilinx and Altera (and other FPGA vendors) tried this before? Yes, they have, with uninspiring results. However, I submit that Xilinx’s EPP is substantially different and it stands a very good chance of capturing significant market share from microcontrollers and from discrete processors. It may also be very attractive to design teams considering the development of certain types of SOCs. Consequently, the Xilinx EPP family may well become the family of high-volume parts Xilinx wants to have in its product catalog. Ratford provided so much information in his ESC announcement that I’ll need multiple blog entries to cover it all. In this first entry, I’ll describe what Xilinx’s EPP is and I’ll cover some of the thinking behind the architecture; In the second entry, I’ll describe some case studies that illustrate why this component family might be very attractive for a certain class of embedded product—because it promises lower parts count, lower cost, and higher performance with lower power consumption. Please understand that Xilinx stopped short of announcing actual products. Ratford described an architecture that will be used to produce a product family with actual products starting to appear next year.
There are two major components to Xilinx’s EPP: a hard-wired, high-end, microcontroller-like block and a connected FPGA fabric based on Xilinx’s 28nm unified FPGA logic-cell design as shown in the diagram below.
First, let’s look at the hard-wired portion. It’s well known that processors don’t run very fast when implemented with FPGAs. The reason mostly revolves around the wiring congestion associated with the large register files of 32-bit RISC processors. Wiring congestion translates into “slow” and you can figure on giving up 50-75% or more of the processor’s maximum clock rate in a given process technology when comparing a synthesized ASIC implementation against a synthesized FPGA implementation. Hand optimization can reclaim some of that speed but if you’re planning on using a standard processor architecture anyway, it makes perfect sense to implement the processor on the FPGA as a hard core using a standard ASIC synthesis flow. That way, you get the full speed of the IC process technology along with the full logic density and therefore a much lower silicon cost.
Xilinx has chosen ARM’s Cortex-A9 32-bit RISC processor core for the EPP but has gone a step farther by implementing a dual-core version of this processor. That choice immediately puts the Xilinx EPP family at the high-end of the microcontroller spectrum. First, there are two 32-bit processor cores. Second, a Cortex-A9 processor can run at 2 GHz in TSMC’s 40nm, high-performance process technology. That’s one fast processor—much faster that many embedded applications require. A dual-core version, as is employed in Xilinx’s EPP family, is faster still.
In choosing a standard processor core from ARM’s extremely successful stable of processors, Xilinx has plugged directly into a broad community of embedded software developers. In other words, choosing the widely used ARM architecture telegraphs Xilinx’s recognition that embedded software development is now the largest and most expensive part of any high-end embedded project. In many such projects, software developers often outnumber hardware developers by 10:1. In announcing the EPP, Xilinx shows that it fully recognizes the need to make the software development team happy first. The company’s selection of an ARM processor core also leverages the associated large and familiar development-tool set, the good selection of operating systems, and the extended ecosystem that goes with the ARM architecture’s large and growing market dominance in the embedded space. All of these factors make the ARM processor very attractive to embedded development teams.
To the dual-core ARM Cortex-A9 processor, Xilinx has added a number of hard-core peripherals including SRAM caches, timers, interrupt controllers, switches, memory controllers, and commonly used I/O peripherals certain to be useful for many high-end embedded designs. Because these additional blocks are all hard-core implementations, they too take little room on the chip and consume much less power than they’d need if implemented in an FPGA fabric. Note that the EPP chips will contain enough SRAM for caches and small scratchpads however bulk memory, generally implemented with DRAM, will be off-chip. Consequently, the EPP architecture includes hard-core DRAM controllers to manage off-chip memory. Ratford’s talk at ESC did not elaborate on the type of memory the on-chip controller can handle however DDR2, DDR3 or both DDR2 and DDR3 would probably be a good guess, considering the high-end nature of the EPP family. The targeted applications will need a lot of memory and DDR2 and DDR3 DRAM are now the best choices in terms of cost/bit.
Key to the software-friendly approach Xilinx is taking with the EPP, the architecture boots code upon power up just like a microcontroller. Only then is the FPGA fabric configured. This approach makes the EPP look very familiar to software developers who are not at all comfortable with writing code for a fluid, amorphous system that’s not well-defined when power comes up. The FPGA vendors spent a lot of money on reconfigurable architectures learning this lesson. In addition, HLL compilers don’t much care for undefined hardware either—undefined hardware just doesn’t fit the standard software-programming models. So the implementation of a complete, hard-wired microcontroller within the EPP cuts out a lot of that old unfamiliar strangeness associated with previous attempts to marry hard processor cores and FPGA fabrics.
Speaking of the FPGA fabric, Xilinx will be using the unified 28nm FPGA fabric in the EPP. Xilinx developed this fabric for its next-generation Spartan and Virtex FPGAs. (If you want more details about this FPGA fabric, take a look at the White Paper here. According to Ratford, Xilinx’s Virtex and Spartan FPGAs will both employ this fabric, which is the first time that Xilinx has used the same FPGA fabric for its high-performance and its low-cost FPGA product families. Using the same fabric for the two Xilinx FPGA product lines and for the EPP means that Xilinx need only develop one set of hardware-design tools for the 28nm node and it also means that hardware designers only need to learn one set of tools as well.
The EPP’s hard-core embedded microcontroller communicates with the on-chip FPGA fabric using ARM’s newly announced AMBA 4/AXI bus. Ratford said at RTECC and repeated again at ESC that Xilinx worked with ARM to develop a version of this new bus specifically for FPGA use but he’s not provided details. The diagram of the EPP Ratford projected (reproduced above) shows multiple buses connecting the EPP’s hard-core embedded microcontroller and the on-chip FPGA fabric. Although Ratford provided no additional details, I plan to write a third blog entry discussing possible ways of optimally connecting the processor cores to the FPGA fabric. In the next installment of this blog, I’ll discuss some specific case studies Ratford covered in his ESC presentation that show how the EPP can reduce the parts count, cost, and the power consumption of high-end embedded systems.
(You can find a White Paper describing the Xilinx EPP here.)