It’s Raining Low-Power Microcontrollers
March 5, 2010 on 4:53 pm | In Design, Low-Power | No Comments
Wow. The Embedded World show in Nuremberg is really shaking the low-power microcontrollers out of the tree this year. Cases in point: announcements of new, low-power 8- and 32-bit microcontrollers from Microchip and Energy Micro (a Norwegian fabless microcontroller company), respectively. Microchip’s 8-bit parts are offered in packages ranging from a tiny 8-lead device to 64 leads. Energy Micro’s parts are available in 20-, 32-, and 64-lead packages.
Energy Micro’s EFM32 “Tiny Gecko” microcontrollers are the little brothers to the Gecko microcontrollers I wrote about last November when the company first announced them at the ARM Techcon 3 conference held in Santa Clara, California. Like their bigger brethren, the Tiny Gecko processors are based on the ARM Cortex-M3 processor core. They consume 180µA/MHz, have a deep-sleep current draw of 900nA, and offer an “off” mode where the part draws a mere 20nA. There are 13 new members of the Tiny Gecko family with Flash capacities of 8 to 32 Kbytes, RAM capacities of 2 or 4 Kbytes, and 24 or 56 multipurpose I/O pins. Other peripherals included in the Tiny Gecko microcontroller family are a low-energy UART, I2C serial interfaces, A/D and D/A converters, and several counters and timers. Unique to the Gecko microcontroller and continued in the Tiny Gecko line is what Energy Micro calls the “peripheral reflex system,” which allows peripherals to run and communicate autonomously while the CPU sleeps, for a big cut in energy consumption. Architecture, instruction set, and peripherals are compatible between the Gecko and Tiny Gecko families, which is the surest way to build a following. Embedded systems designers need broad lines of compatible microcontrollers to accommodate the wide-ranging, diverse needs of embedded design.
To that end, Microchip’s offerings fill in the low-power end of its line. The PIC12F182X and PIC16F182X (PIC1XF182X) microcontrollers—which consume less than 50 µA/MHz and have a rated sleep current of 20nA at 1.8V (30nA at 3V)—extend Microchip’s “Enhanced Mid-range” 8-bit core product line into the realm of 8-pin devices and bring the total number of Enhanced 8-bit core PIC microcontrollers to 16, available in packages ranging from 8 to 64 pins. The family features a range of internal peripherals including Microchip’s mTouch capacitive touch-sensing technology and multiple communications peripherals. The PIC1XF182X microcontrollers include dual I2C/SPI interfaces, more PWM outputs with independent time bases, and a “Data Signal Modulator” that implements a variety of modulation schemes including frequency-shift, phase-shift, and on-off keying along with synchronization and polarity control. Microchip is targeting these general-purpose microcontrollers at a wide range of applications in the appliance (coffee makers, blenders, dishwashers); consumer (vacuum cleaners, printers, remote controls); and automotive markets (LED lighting, keyless entry, body electronics). Here’s a video demonstrating the various low-power operating modes of Microchip’s PIC16LF1823 microcontroller:
Microchip’s Enhanced Mid-range 8-bit architecture employs 14-bit instructions that improve performance as much as 50% relative to 8-bit instructions. In addition, 14 new instructions in the Enhanced architecture improve code-execution performance by as much as 40% over Microchip’s previous-generation 8-bit PIC16 MCUs. Significantly, the “enhanced” architecture and instructions extend the instruction address space from 8K to 32K instructions and RAM space from 446 bytes to >4 Kbytes.
Although these new microcontrollers from Energy Micro and Microchip both focus on extremely low-power embedded design and essentially cost a buck each, give or take, they’re really quite different. There’s a substantive difference in both ability and ease of programming between a 32-bit device and an “8-bit” device (I still find it hard to label a processor “8-bit” when it has 14-bit instructions, but Microchip’s parts do operate on 8-bit data) and there are many differences in the available on-chip peripheral devices. Your specific application will likely not need all of the available on-chip peripherals but there are some unique ones in there from both vendors that can make the difference between an easy design and a tough one. Microchip has been in the microcontroller business for decades, has built a substantial ecosystem around its devices, and has recruited a small army of loyal users familiar with its architectures. Energy Micro is the definite newcomer, first announcing products late in 2009. It is inheriting and leveraging the huge ecosystem of the ARM 32-bit architectural empire. You, fortunately, get nothing but tremendous advantage in the form of choice.
Power Over Ethernet: One Cable Defines an Entire Product Envelope
March 1, 2010 on 7:07 pm | In Design, Low-Power | No Comments
Last week, I moderated a Power-Over-Ethernet (POE) session at the Ethernet Technology Summit held in San Jose. The idea of supplying high-speed communications and power over one standard cable that works anywhere in the world is certainly compelling. Developing a power-delivery scheme that works worldwide is difficult thanks to differing wall-socket connectors and line voltages, but the RJ45 Ethernet plug works anywhere. Consequently, POE makes a lot of sense for a large number of diverse, high-volume products including VOIP phones, thin clients, wireless access points, security cameras, and digital signage.
In an interesting twist on the concept, the amount of power delivered over a POE cable fairly well defines the power envelope of the end product. The initial 802.3af POE specification can deliver a maximum of 13W to the end product, after power losses in the cable are accounted for. The new 802.3at specification nearly doubles the available power to 25.5W, but that’s not even enough to run the laptop I’m using to write this blog post, so there’s clearly a limitation there. (However, as Microsemi’s Dan Feldman explained in his presentation, there’s also a way to get nearly 50W from a POE cable by using all four cable pairs to deliver power. Two of the pairs must then serve double duty by carrying both power and communications.)
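To make the "cable defines the envelope" idea concrete, here is a minimal sketch that picks the smallest POE option able to carry a given load. It uses only the deliverable-power figures quoted above; the function name, the option labels, and the round 50.0 W ceiling for the four-pair scheme (the text says "nearly 50W") are assumptions made for illustration.

```python
# Deliverable power at the end product, in watts, as quoted in this post.
# The four-pair value is "nearly 50W" in the text; 50.0 is a rough ceiling (assumption).
POE_OPTIONS_W = [
    ("802.3af", 13.0),
    ("802.3at", 25.5),
    ("four-pair", 50.0),
]


def smallest_poe_option(load_w):
    """Return the first option whose power budget covers load_w, or None."""
    for name, budget_w in POE_OPTIONS_W:
        if load_w <= budget_w:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical product loads, chosen only to exercise each branch.
    for load_w in (7.0, 20.0, 45.0, 65.0):
        print(load_w, "W ->", smallest_poe_option(load_w))
```

A result of None means the product needs more power than any of the schemes mentioned here can supply, which is the situation the laptop example describes.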
In addition to the lack of a universal wall power plug, most of the devices that are candidates for POE adaptation are currently powered by wall warts, the ubiquitous black boxes that most of us plug as best we can into inexpensive power strips that are not designed to accommodate these “fat plugs.” In addition to their inconvenient form factor and the extra wiring they create—which often ends up creating a rat’s nest of wiring under your desk—wall warts bleed energy because they are not efficient. Because they’re generally inexpensive, not much engineering goes into the design of a wall wart. In fact, said Matthew Tyler of ON Semiconductor, the best you can get from a 50W wall wart is 84% efficiency. Wall warts delivering less power are even less efficient.
Consequently, one of POE’s advantages is the ability to develop high-quality, highly efficient power supplies for the network switches that deliver power to POE devices. (The POE realm refers to these switches as PSEs, power sourcing equipment, and calls the end devices PDs—powered devices.) Using POE topologies, says Tyler, you can drive efficiency up to 90%. In addition, says Akros Silicon’s Amit Gattani, the PSE becomes an intelligent controller that can manage the power supplied to the network’s PDs. For example, in a digital-signage application, the PSE can automatically power down the signs 15 minutes after a store closes and can power them back up 15 minutes before a store opens.
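To put the 84% and 90% efficiency figures in perspective, the following sketch computes how much power a supply draws and wastes for a given load. The helper names are hypothetical, and treating both efficiencies as applying to the same 50W load is an assumption for the sake of comparison.

```python
def input_power_w(load_w, efficiency):
    """Power drawn from the source to deliver load_w at the given efficiency (0..1]."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    return load_w / efficiency


def wasted_power_w(load_w, efficiency):
    """Power lost as heat in the supply for the given load and efficiency."""
    return input_power_w(load_w, efficiency) - load_w


if __name__ == "__main__":
    load_w = 50.0  # same load assumed for both cases
    for label, eff in (("wall wart", 0.84), ("POE supply", 0.90)):
        print(label, "wastes", round(wasted_power_w(load_w, eff), 2), "W")
```

Under these assumptions the loss falls from roughly 9.5W to roughly 5.6W for the same delivered power.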
Reduced energy use saves companies money, so that’s one economic advantage enjoyed by POE. However, there’s an even bigger economic advantage, one you might not expect. You need licensed electricians and permits to run ac power cables while you generally do not need either to run Category 5 Ethernet cable. Consequently, a system employing POE has lower installation costs compared to a system that employs a combination of ac power and wired Ethernet or even one that uses ac power and wireless Ethernet. There are real installation savings to be had for applications such as VOIP telephony, thin clients, and digital signage.
There is far more subtlety in the design of POE equipment than I’ve described here. All three of the companies listed above have developed expertise in the design of POE equipment and are willing to share that expertise with you.
Designing Low-Power Systems with FPGAs, Part 2
February 1, 2010 on 5:34 pm | In Design, FPGA, SOC | No Comments
Literally within an hour of posting my last blog entry on designing low-power systems with FPGAs, Altera’s marketing engine issued a related email and dropped it into my inbox. Altera’s email pre-announces the company’s upcoming FPGAs based on 28nm lithography. The email included the following marketing graph (with no scale) to explain the advantages of the smaller geometries for FPGA manufacture.

The first set of bars in the graph sets the baseline using Altera’s 40nm devices as a reference. The next set of bars shows that the feature shrink alone improves FPGA gate density by 25% and power consumption by about 12.5%. (Note: That’s my eyeball talking, not Altera’s official numbers.)
The next set of bars shows what happens incrementally when Altera takes some major logic blocks and hard-codes them. Suddenly, gate density doubles and power consumption drops by 40% compared to 40nm FPGAs.
The last set of bars shows what happens when you combine the lithography shrink and hard-coded IP. Suddenly you’re getting 4x the gate density at a mere 25% of the power consumption compared to 40nm devices. (Note: I’m not sure what suddenly happened to the transceiver count, that third bar in the group, which had been constant until everything got combined in the last set. My guess is that the marketing artist who drew the graph got overzealous, cut everything 75% for visual consistency, and the proofreaders missed it. I think the number of transceivers is supposed to stay constant, based on the first three sets of bars in the graph.)
Two things to note here. First, you get a lot of bang out of hard-coded IP. Coincidentally, MIPS announced that Altera had licensed the MIPS32 architecture back in October, 2008 but Altera was mum on the subject back then. RISC processor cores make lousy targets for programmable FPGA fabrics, largely because of the routing congestion around their large register files, so processor core IP is one of the IP types that really should be hard-coded onto an FPGA. Although neither Altera nor Xilinx had much success with their first-generation FPGAs that incorporated hard-coded processor cores, that doesn’t mean they’re not going to try again, and the MIPS announcement late last year telegraphed that move.
Want more proof? Last week at the Real Time Embedded Computing Conference held in Santa Clara, California, Xilinx’s Senior VP of Worldwide Marketing and Business Development Vin Ratford did more than telegraph his company’s intent to put processor cores back into FPGAs. He announced and elaborated on that intent. Xilinx will be adopting the ARM architecture and an FPGA-friendly version of ARM’s AMBA interconnect in future FPGA generations.
Make no mistake. Processors are coming to FPGAs for several reasons. First, a RISC processor core consumes between 25,000 and 50,000 gates. You can drop one of those puppies into an FPGA fabric and never see it. In essence, those transistors are “free.” That’s the nature of an FPGA’s programmable interconnect. Logic just sort of disappears.
Second, you can’t build a system without at least one processor these days. Which immediately leads to the third reason. If Xilinx and Altera truly wish to turn their “We’re taking over everything” or “All your chips are belong to us” attitudes into reality, then the processor will just have to live on the FPGA silicon. Otherwise, the FPGA companies don’t get all of the chips. It’s as simple as that.
However, as both Altera and Xilinx discovered last time they tried this, dropping a processor core into an FPGA and making it usable is not just a matter of burying some gates into the FPGA fabric. Effective ways of connecting the processor to the programmable FPGA fabric must also exist and the software developers—who represent more than 90% of modern embedded development teams—must also be happy with the integration. You only make them happy with good development, profiling, and debugging tools.
And there’s the rub.
(It’s possible that Shakespeare’s Hamlet was indeed an embedded systems developer.)
Designing Low-Power Systems with FPGAs
February 1, 2010 on 3:47 pm | In Design, FPGA, Flash, Low-Power | No Comments
Actel has published a White Paper discussing low-power aspects of using FPGAs. It should not surprise you that the White Paper’s points and conclusions favor Actel’s Flash-based FPGAs over SRAM-based FPGAs from other vendors but that bias should not stop you from extracting some good meat from the document.
The first important point from the White Paper: designers considering the use of an FPGA have decided not to take the ASIC/SOC route for one of several reasons. Carefully tailored ASICs and SOCs should always deliver the lowest unit-cost system chip with the lowest power—but there’s always a cost. That cost involves a large and complex design process that requires a substantial team of trained silicon designers, a big stack of expensive ASIC design tools, expensive fabrication masks, and weeks or months of fabrication delay after tapeout. Contrast that with no up-front NRE costs for an FPGA, inexpensive FPGA design tools, and no need to be familiar with the arcane world of chip design when using an FPGA to implement a system. For system designs shipping in lower volumes, FPGAs are mighty attractive.
Once you decide to use an FPGA, you must then decide on the FPGA technology you’ll use (SRAM-based, Flash-based, or antifuse-based) and you must pick an FPGA vendor. Given that you’ve selected to take the FPGA route, there are five components of device power consumption for you to examine when evaluating different FPGA technologies:
- Static power (leakage)
- Dynamic power (frequency dependent)
- Power-up (or inrush power)
- Configuration power
- Sleep-mode power
The total energy consumed by the FPGA (which is the most important design criterion for battery-powered designs) combines all five of these power components over time. It’s here that the Actel White Paper unsurprisingly starts to make the case for Actel’s Flash-based FPGAs, but again, the information provided in the White Paper is instructive.
Figure 1 shows a startup scenario for SRAM-based and Flash-based FPGAs. Power is applied to the system at T0 (time = 0) on the graph. As the input power supply voltage rises from zero volts, the SRAM-based FPGA draws a large inrush current as its SRAM configuration array powers on. Is the inrush current really as large for an SRAM-based FPGA as shown in Figure 1? Is it as small for a Flash-based FPGA as shown in Figure 1? Well, there’s no scale (making Figure 1 a marketing graph), so who’s to say? What you should get from this point is that you need to find out what that inrush current is for the FPGAs you’re considering.

Figure 1: FPGA power consumption for power-up stage
Something else of interest is happening in Figure 1 and you might be tempted to misinterpret it. The blue line representing the Flash-based FPGA power consumption starts to ramp up well before the purple line representing the SRAM-based FPGA. At first glance, the lines make it appear that the Flash-based FPGA will consume more power over time than the SRAM-based FPGA. However, what the curves actually show is that the SRAM-based FPGA needs time to download configuration data into its configuration SRAM while the Flash-based FPGA starts to perform its system duties more quickly because there’s no configuration overhead.
Figure 2, another marketing graph, compares the power consumption of an SRAM-based FPGA with that of a Flash-based FPGA. Keep in mind that this is a marketing graph comparing two unspecified FPGAs which may or may not have similar gate counts performing some sort of unspecified workload. However, what’s shown that is useful is that you do need to consider the FPGA’s power consumption in these various operating phases and you need to weight the power use by the amount of time your system will spend in each phase to arrive at an estimate for battery life.

Figure 2: FPGA power consumption in various operating stages
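The time-weighting of power across operating phases can be sketched as follows. Every number in the example profile is a made-up placeholder, not data from the Actel White Paper or from any real FPGA, and the phase names and function names are likewise assumptions; substitute datasheet or measured values for the devices you are evaluating.

```python
def energy_per_cycle_mj(phases):
    """Sum power (mW) times duration (s) over all phases; the result is in millijoules."""
    return sum(power_mw * duration_s for _name, power_mw, duration_s in phases)


def wake_cycles_per_battery(battery_j, phases):
    """Number of complete wake cycles a battery holding battery_j joules can supply."""
    return battery_j * 1000.0 / energy_per_cycle_mj(phases)


# Hypothetical profile for one wake cycle: (phase name, power in mW, duration in s).
EXAMPLE_PHASES = [
    ("power-up", 200.0, 0.01),
    ("configuration", 100.0, 0.2),
    ("active", 80.0, 1.0),
    ("sleep", 2.0, 60.0),
]

if __name__ == "__main__":
    print("energy per cycle (mJ):", energy_per_cycle_mj(EXAMPLE_PHASES))
```

In this placeholder profile the long sleep phase contributes more energy than the active phase even though its power is 40 times lower, which is why the time spent in each phase matters as much as the power drawn in it.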
One final note of interest in the Actel White Paper is that a Flash-based FPGA configuration cell is smaller than an SRAM-based configuration cell, so leakage currents are also smaller for Flash-based FPGAs. This point appears in the “Static” sections of Figure 2.
TI MSP430 Low-Power Microcontroller Demo Runs on Grapes
January 4, 2010 on 8:00 pm | In Low-Power | No Comments
This TI video has been on YouTube for more than a year, but it’s new to me and pretty interesting. With all of the new low-power microcontroller announcements lately, this video is an excellent reminder that there are lots of good choices for low-power processors out there. If you don’t want to run your design on grapes, the video demonstrates that strawberries, kiwis, and other fruits are just as powerful.
Touchless Slider is One Cool User Interface, Driven by Low-Power Microcontroller
January 4, 2010 on 12:48 am | In Low-Power | No Comments
Silicon Labs has a diverse set of chips on offer and I’m really taken by the video demo of its new Si1120 Touchless Slider evaluation kit. The QuickSense Si1120 is an active infrared proximity sensor that you can use to build a variety of products with innovative, ultra-low power, touchless human interfaces. The chip itself incorporates an infrared LED driver, an infrared photodiode, an ambient light sensor, and control logic. The high-sensitivity infrared photodiode provides a single-pulse infrared proximity measurement allowing you to implement user interfaces using infrared light emitting diodes operating at unusually low power levels. The device is packaged in a tiny 3×3 mm clear surface-mount package and when it’s combined with a Silicon Labs low-power microcontroller, the Si1120 can be used for advanced motion and gesture recognition in products such as:
- Touch screens
- Instrumentation panels
- Kiosks
- Gaming systems
- Industrial interface
- Security
- Smoke detectors
- Residential HVAC
- Home appliances
- Toys
- Keyboards
- Fax/printer/scanner front panels
All this is just words. A video speaks volumes. So here’s the video:
The demonstration shows a user-interface slider board that incorporates an Si1120, two infrared LEDs, and eight visible LEDs. This board is controlled by Silicon Labs’ new ultra-low power C8051F900 microcontroller, which consumes as little as 160 μA/MHz in active mode and 10 nA in sleep mode with full memory retention. It will run on supply voltages as low as 0.9V. The microcontroller is based on an 8-bit, 25-MIPS 8051 controller core with a slew of peripheral devices including four timers, a UART, and a 12-bit A/D converter with a 15-channel analog multiplexer. The microcontroller is available with either 8 or 16 kbytes of on-chip flash and has 768 bytes of on-chip RAM.
Perhaps just as important, Silicon Labs supports this unique demonstration board with its QuickSense Studio development environment, a graphical environment wrapping multiple applications that guide user-interface developers through a development flow that includes graphical configuration wizards, firmware templates, and performance monitoring tools. These programs interface with Silicon Labs’ QuickSense firmware API, which is a configurable firmware library that supports the development of many different interface types, from simple buttons to full gesture recognition. After you configure a project using the QuickSense Studio Configuration Wizard, the software generates all the C code required for the selected functions, which simplifies human-interface integration.
I still find it hard to believe that a small 3×3 mm package can do all of this, but seeing is believing and the video makes a believer out of me. I’ve long been a user-interface enthusiast and the Silicon Labs Si1120 evaluation kit and demo board is simply one of the snazziest new user-interface components I’ve seen in quite a while. We’ve been watching lead characters use gestures to control sophisticated equipment for decades in science fiction movies and TV shows, most memorably perhaps in 2002’s Minority Report. Gesture interfaces, when combined with graphical displays, are some of the most intuitive and most usable interfaces for all sorts of high-tech products and the Silicon Labs Si1120 looks to be one truly inexpensive way to implement a user interface that appears pretty darn sophisticated to an end user. Sophisticated user interfaces entice consumers to buy, so be sure to check out the new way to interact with your product. It’s clearly worth a few minutes of consideration.
7 Tricks from Microchip to Drop Power Consumption on any Microcontroller
January 3, 2010 on 1:56 am | In Low-Power | No Comments
Microchip is an incredibly successful microcontroller vendor with a massive array of chips to choose from. The company has a series of low-power microcontrollers and refers to them as NanoWatt XLP (extremely low power) devices. In support of those devices, Microchip published a chapter on “Tips ‘n Tricks” to wring every nanoWatt of waste out of a design using its brand of microcontroller, but the first seven tricks will work with any vendor’s microcontroller, so these seven are well worth reviewing.
1. Switch Off Unneeded External Circuits and Control Duty Cycle
Almost all microcontrollers from all vendors have multiple similar-sounding low-power modes (sleepy, snoozy, droopy, drowsy, etc.). Sounds like the silicon version of the Seven Dwarfs, right? Well, all the low-power modes in the world won’t help if your application code doesn’t manage the power consumed by circuits that are external to the microcontroller. Microchip’s document uses lighting an LED as an example. Just a single lit LED is equivalent to running most of Microchip’s PIC microcontrollers at 5V and 20 MHz. When you design your microcontroller-based embedded system, always decide what physical modes or states it requires and make sure the microcontroller can cut power to external circuits when their function isn’t required.
For example, cut power to that boot EPROM after your circuit boots if the first thing the microcontroller does is download code from the EPROM to the microcontroller’s internal RAM. Alternatively, if you’ve got a high-resolution A/D converter outside of the microcontroller—because perhaps you needed more than the 12-bit resolution provided by the on-chip converter—be sure to include a transistor in the external converter’s Vcc line so that you can cut its power when it’s not needed.
2. Budget Your Power
Calculate the amount of charge used by each system mode by multiplying the current in mA by the amount of time spent in that mode during one loop of the application. Then sum the results (in mA*sec) across all modes to get the total charge consumed during one iteration of the application. Divide that sum by the length of the application loop to get the average operating current. Use that figure to help you size the battery needed by using the battery’s mAh rating and the number of days, weeks, or years you want the battery to last.
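The budgeting arithmetic can be sketched in a few lines. The mode currents, mode durations, and the 220 mAh battery capacity below are placeholder values chosen for illustration, not figures from Microchip’s document, and the function names are hypothetical.

```python
def average_current_ma(modes):
    """modes: iterable of (current_mA, seconds) pairs covering one application loop."""
    modes = list(modes)
    total_charge_ma_s = sum(current_ma * seconds for current_ma, seconds in modes)
    loop_length_s = sum(seconds for _current_ma, seconds in modes)
    return total_charge_ma_s / loop_length_s


def battery_life_hours(capacity_mah, avg_current_ma):
    """Ideal battery life; ignores self-discharge and capacity derating."""
    return capacity_mah / avg_current_ma


if __name__ == "__main__":
    # Hypothetical loop: 10 ms active at 5 mA, then 990 ms asleep at 1 uA.
    loop = [(5.0, 0.010), (0.001, 0.990)]
    avg_ma = average_current_ma(loop)
    print("average current (mA):", avg_ma)
    print("battery life (days):", battery_life_hours(220.0, avg_ma) / 24.0)
```

With these placeholder numbers, the 10 ms active burst accounts for almost all of the charge used, so shortening the active time pays off far more than trimming the sleep current.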
3. Do Something Smart with Port Pins
All microcontrollers have configurable ports that may serve as input, output, input/output, or analog input pins. Make sure you always configure all of the microcontroller’s pins to use the minimum amount of power.
4. Use High-Value Pull-up Resistors
If you use a pull-up resistor to keep an input high, then make sure to use a big resistor to minimize current consumption. Don’t just use a 2.2K or 4.7K resistor from habit or rule of thumb. Maybe you can use a 10K resistor. Maybe 100K or 1M. The bigger the resistor, the smaller the drain on your battery.
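Ohm’s law shows how much the resistor value matters. A pull-up conducts whenever something, such as a closed switch to ground, holds the pin low, and the current is then the supply voltage divided by the resistance. The 3.0V supply in this sketch is an assumed value and the function name is hypothetical.

```python
def pullup_current_ua(supply_v, resistance_ohms):
    """Current in microamps through a pull-up while the pin is held at ground."""
    return supply_v / resistance_ohms * 1e6


if __name__ == "__main__":
    supply_v = 3.0  # assumed supply voltage
    for r_ohms in (2_200, 4_700, 10_000, 100_000, 1_000_000):
        print(r_ohms, "ohms ->", round(pullup_current_ua(supply_v, r_ohms), 1), "uA")
```

Very large resistors make the input more susceptible to noise and leakage, so check that the pin still reads reliably before settling on the biggest value.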
5. Reduce the Clock Speed and Operating Voltage to Minimums
Don’t run the microcontroller any faster than needed for the system design. Then set the operating voltage accordingly. Each clock cycle drives charge through the microcontroller and that charge comes straight from the battery. Fewer clocks per second means fewer charge packets to suck from the battery and fewer clock cycles per second also mean the operating voltage can be lower.
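The combined effect of clock speed and voltage can be estimated with the usual first-order CMOS switching-power approximation, P ≈ C·V²·f, which is a standard textbook relation and not something taken from Microchip’s document. Because the effective switched capacitance C cancels when comparing two operating points of the same chip, the sketch below only computes a ratio. The voltages and frequencies are example values and the function name is hypothetical.

```python
def relative_dynamic_power(v_new, f_new, v_old, f_old):
    """Ratio of new to old switching power under the P ~ C * V^2 * f approximation."""
    return (v_new / v_old) ** 2 * (f_new / f_old)


if __name__ == "__main__":
    # Example: drop from 3.6 V at 8 MHz to 1.8 V at 4 MHz (illustrative values only).
    ratio = relative_dynamic_power(1.8, 4e6, 3.6, 8e6)
    print("dynamic power falls to", ratio * 100.0, "% of the original")
```

The estimate covers switching power only; static leakage and any fixed analog currents do not scale this way.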
6. Disable the Microcontroller’s Internal Voltage Regulator and Get the Core Voltage Elsewhere
If your selected microcontroller operates the processor core on a separate voltage from the peripheral circuitry, chances are you can disable the internal voltage regulator and supply that core voltage externally. The advantage here is that you can then set the core operating voltage exactly where you need it, not where the internal regulator wants it.
7. Use Schottky Diodes to Switch Between a Power Supply and Battery
If your system can be powered from either a mains-powered supply or a battery, you can put a diode in series with each supply and the diodes will automatically supply power from the source with the highest voltage. Use Schottky diodes to minimize power loss through the diode.
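A simple model of this diode-OR arrangement shows both the automatic source selection and the cost of the forward drop. The forward voltages used here (about 0.3V for a Schottky diode and 0.7V for an ordinary silicon diode) are typical rule-of-thumb values assumed for illustration; real drops vary with current and temperature. The function names are hypothetical.

```python
def diode_or(v_mains, v_battery, v_forward):
    """Return (active_source, output_voltage) for two supplies OR-ed through diodes.

    Idealized model: the higher supply conducts and loses a fixed forward drop.
    """
    if v_mains >= v_battery:
        return "mains", v_mains - v_forward
    return "battery", v_battery - v_forward


def diode_loss_mw(v_forward, load_ma):
    """Power dissipated in the conducting diode: volts times milliamps gives milliwatts."""
    return v_forward * load_ma


if __name__ == "__main__":
    print(diode_or(5.0, 3.0, 0.3))  # mains present
    print(diode_or(0.0, 3.0, 0.3))  # mains absent, battery takes over
    print("Schottky loss (mW):", diode_loss_mw(0.3, 10.0))
    print("silicon loss (mW):", diode_loss_mw(0.7, 10.0))
```

At the same load current, the lower forward drop of the Schottky diode translates directly into less wasted power and more headroom at the output.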
If you’d like to peruse the full text of the Microchip document, you’ll find it here.
More on Mentor’s Catapult C from John Cooley and Other Designers
December 18, 2009 on 11:15 pm | In Design, EDA, SOC | No Comments
Earlier this month, I wrote about Mentor’s C-to-gates synthesis tool Catapult C and low-power design. The EDA industry’s self-appointed gadfly and uber-user John Cooley has just written an extensive blog posting about Catapult C complete with detailed comments from several of his reader/users. These comments and Cooley’s conclusions are very, very interesting for people in the ESL space, as well as anyone involved in chip design, so I thought I’d highlight some of Cooley’s conclusions.
First, Cooley quotes EDA analyst Gary Smith’s published numbers to quantify Catapult C’s lead in the high-level synthesis arena. He then uses the anecdotal evidence of the large number of comments (both good and bad) that his reader/users make about Catapult C relative to the other high-level synthesis tools to conclude that there do seem to be more IC designers using Catapult C than competing tools.
Cooley then hands the microphone over to his designer/readers for comments. One thing that really strikes me about these comments is the number of people who want to use C++ to describe hardware. Now C++ compilers have a tough enough time creating streamlined object code out of C++ descriptions. C++ allows such a high level of abstract description that algorithmic descriptions more resemble poetry than precise engineering-style descriptions. My opinion is that expecting any and all C++ descriptions to result in efficient hardware is a bit of a reach. No matter how good the compiler is, C++ descriptions can be so abstract that it can be tremendously difficult to infer any sort of efficient hardware design from such descriptions. The likelihood of developing mind-reading compilers in the near future seems mighty slim to me.
Other designer/readers seem to share my concerns. One engineer who sent a comment to Cooley and who preferred to remain anonymous wrote: “I remain concerned that quality-of-results derived from designs developed in ANSI C/C++ will not compare well to hand-coded RTL for our design area (hardware accelerators for broadband communications), regardless of claimed market share of Catapult C.”
Now don’t make something out of this skepticism (mine and that of “anonymous”) that’s not there. When logic synthesis first appeared in the early 1980s, it too “suffered” from a quality-of-results issue. The earliest logic-synthesis tools could not generate gate-level designs that were as efficient as those created manually by even moderately good human logic designers, just as C compilers could not initially generate assembly code that was as efficient as code written by a good human codesmith. However, two things happened to make this issue a non-issue.
First, the tools simply got better. Adoption of Verilog and VHDL as description languages helped to standardize the sea of HDL slopping over the EDA bucket back in the 1980s. Standardization gave compiler designers a focused target and channeled their creative energy into building better synthesis tools rather than creating ever-more-elegant description languages.
Second, Moore’s Law made irrelevant the difference between 10 and 100 gates or even between 100 and 1000 gates. At some point, we stopped counting gates just as we’d previously stopped counting polygons and transistors. We don’t really know how many gates there are on a chip any more and what’s more, we no longer care. Not really. Because it’s square millimeters of silicon that actually cost money, not gates. So today we use square millimeters and then use a fudge factor to estimate the number of gates represented by the square-millimeter metric.
Perhaps something like that will happen with high-level synthesis. The jury’s still out.
Laser Spike Annealing of Nickel in Nanometer CMOS ICs Cuts Leakage 10x
December 6, 2009 on 8:22 pm | In CMOS, Design, EDA, Green Design, Low-Power, SOC | No Comments
One of the sad facts of life for nanometer silicon has been the rise of leakage current as device geometries shrink. At 65nm, CMOS leakage currents roughly equal operating currents, making it virtually impossible to reduce overall operating current by more than half. I’ve long thought this was the result of low-Vt transistors that can never fully turn off, a consequence of the drive to recover speed that’s lost when supply voltages are cut to reduce operating power. Turns out there’s another culprit: nickel contamination that occurs when nickel atoms drift away from the nickel-silicide interface layer used to improve the connectivity of metal inter-layer contact plugs. The nickel atoms drift during the annealing process, which is used to drive the deposited nickel atoms into the transistors’ source and drain contact pads. The first of two annealing cycles drives the metallic nickel atoms into the silicon source and drain pads creating Ni2Si silicide. A second, higher-temperature annealing process converts the Ni2Si into NiSi, which has lower resistance and thus provides good electrical connectivity between the contact pad and the metal interconnect plug.
It turns out that the current “soak” annealing processes (which last for tens of seconds) allow the nickel atoms to drift far afield. Like beach sand in your bathing suit, the nickel gets into places you’d rather not have it. The drifting nickel atoms seem to have an affinity for silicon lattice discontinuities, which can be found at the outside ends of the transistor where source and drain diffusions meet the isolation trenches and in long, narrow voids that run from the source and drain regions towards and into the FET channel. Both of these hiding places cause leakage because the metallic nickel conducts electricity where there should be insulator or semiconductor material. Nickel at the ends of the transistor causes substrate leakage and nickel atoms in the channel naturally cause channel leakage.
Applied Materials and European semiconductor research powerhouse IMEC have jointly developed a laser-annealing process that lasts one millisecond instead of tens of seconds. As a result, the diffusing nickel doesn’t have time to drift into these unwanted places during the second annealing step that generates NiSi. Applied Materials described a similar laser-spike annealing process back in 2004 (see article here), but reportedly achieved only a 3-4% leakage reduction back then. This latest development appears to be a refinement of that earlier technique. The two organizations will be presenting their findings at this week’s IEDM conference in Baltimore, Maryland.
IMEC and Applied Materials will indeed have pulled a rabbit out of the hat if this laser-spike annealing process, combined with appropriate transistor-design rules, cuts leakage currents by 90% for nanometer CMOS. Leakage-driven power loss has become a significant problem for advanced IC design and had appeared to be insurmountable, even with the addition of high-K and metal-gate processing. Now, it appears there’s a real solution with the best of all possible implications for system and logic designers: they don’t need to learn anything new. They can leave this fix to the design tools and to the process engineers and once again skirt the system-level and architectural issues of low-power design.
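A quick back-of-the-envelope sketch shows what a 90% leakage cut could mean at the chip level. It assumes, per the 65nm rule of thumb mentioned earlier, that leakage is about half of total current and that the 90% reduction applies to all of it. Both are simplifying assumptions, and the function name total_savings is mine, not anything published by IMEC or Applied Materials.

```c
#include <assert.h>

/* Fraction of total current saved when only the leakage component is cut.
 * leak_share: leakage as a fraction of total current (0..1), an assumed input.
 * leak_cut:   fractional reduction in leakage (0..1), an assumed input.
 * Neither value is measured data; this is arithmetic, not a device model. */
double total_savings(double leak_share, double leak_cut)
{
    /* Sanity checks on the illustrative inputs. */
    assert(leak_share >= 0.0 && leak_share <= 1.0);
    assert(leak_cut >= 0.0 && leak_cut <= 1.0);

    /* Only the leakage slice of the total shrinks. */
    return leak_share * leak_cut;
}
```

With leak_share = 0.5 and leak_cut = 0.9, the result is about 0.45: under these assumptions, total current would fall by roughly 45% with no change to the logic design itself.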
C-to-Gates Synthesis and Low-Power Design
December 4, 2009 on 1:42 pm | In Design, Low-Power, SOC | No Comments
One of the many “pushbutton” design-automation tools that chip designers have sought is a “C-to-Gates” tool that would allow the automated development of hardware from algorithmic descriptions written in the C programming language.
The place to start almost any system design is with the most fundamental aspect of system design: algorithm development. Systems are collections of independently and dependently operating algorithms. For example, a DVD player uses one algorithm to decompress the combined media stream, another to decode the resulting video stream, and yet another algorithm to decode the resulting audio stream. All systems are based on the execution of one or more algorithms.
Most algorithm development begins and ends with C. One exception to this rule is MATLAB from The MathWorks, which is very popular with many algorithm developers. However, there are ways to get MATLAB to produce C from MATLAB-based algorithms as well.
Given the close association between algorithm development and the C language, it’s only natural to want a tool that automatically converts C descriptions to hardware netlists. However, early attempts to create such tools didn’t meet with much commercial success, largely because the quality of results (number of gates, operational speed, and resulting power consumption) compared poorly to manual RTL-generation design techniques.
The tools have been getting better over the years, leading to this recent announcement by Mentor Graphics:
“Mentor Graphics Corp. (NASDAQ: MENT), today announced that Fujitsu Kyushu Network Technologies Limited (Fujitsu QNET) has chosen the Catapult C Synthesis tool for use in its design tool environment to implement complex algorithms in hardware that were previously processed by a processor implemented on LSI. Fujitsu’s growing expertise with the Catapult C tool is also a key enabler in the expansion of their design services business.
Fujitsu QNET was able to dramatically cut power consumption by using the Catapult C Synthesis tool to create a dedicated hardware accelerator for mobile voice processing algorithm versus running in software. The resulting silicon implementation yielded a reduction in power consumption of 83%. This was made possible by the ability of the Catapult C tool to find the optimal trade-off between power, performance and area, in this case, implementing a design satisfying voice performance requirements while running at a lower clock frequency than the previous implementation using a processor.”
Based on this announcement, it would seem that C-to-gates tools, at least Mentor’s Catapult C, are getting closer to reality. In fact, the above announcement would lead you to believe that such tools are here and production-ready today. Indeed, Mentor’s description of the Catapult C synthesis tool appears mighty attractive:
“Catapult C Synthesis reduces design time and verification effort. When writing pure C++, designers focus on the functional intent of their application. Timing and architectural information is abstracted away from the source description. With fewer details in the model, testbench development is also simplified.
Implementation of specific details are automatically added during the synthesis process, eliminating error-prone manual interventions and resulting in RTL designs correct by construction. Debug of the resulting RTL is in turn eliminated, further reducing the overall verification effort.
The Catapult C automated verification environment allows any RTL implementation of a C++ model to be verified using the original C++ testbench. This eliminates the need to write pin-level interfacing and bit-timed RTL environments to verify the RTL blocks created by Catapult before moving to system integration.”
There’s a caveat or two to remember, however. First, there’s good C code and bad C code whether you’re writing code to run on a processor or code that’s to be synthesized into gates. In the case of C-to-gates synthesis, good code signals the design intent as clearly as possible so that the synthesis tool needs to infer as little as possible. Machine inference is the second caveat. Every detail that Catapult C—or any synthesis tool for that matter—must infer is a design detail that you didn’t put into the design.
Conventional RTL-driven logic synthesis makes such inferences all the time. Over the years, designers have gotten savvy to the kinds of inferences their logic-synthesis tools will make and have compensated by adapting their coding styles when writing hardware descriptions in Verilog and VHDL. However, C has largely been used as a sequential algorithm description tool to create software that runs on single processors, and engineers’ usage patterns reflect that long history. In addition, C describes algorithms at a higher level of abstraction than descriptions written in hardware description languages. As a result, there’s always more to infer from a C algorithm description just due to the higher abstraction level.
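To make “signaling design intent” a little more concrete, here is a minimal sketch of C written in a style that leaves less for a synthesis tool to infer: compile-time loop bounds, statically sized arrays, explicit bit widths, and no dynamic memory or recursion. It is ordinary portable C, not Catapult-specific code, and it uses no vendor pragmas. The names NUM_TAPS and fir_step are hypothetical, and whether any particular C-to-gates tool accepts this exact form is an assumption on my part.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TAPS 4  /* fixed at compile time, so loop trip counts are known */

/* Computes one output sample of a NUM_TAPS-tap FIR filter.
 * delay_line holds the most recent NUM_TAPS inputs and is updated in place. */
int32_t fir_step(int16_t sample, int16_t delay_line[NUM_TAPS],
                 const int16_t coeff[NUM_TAPS])
{
    int32_t acc = 0;
    int i;

    /* Simulation-only sanity check; not part of the datapath. */
    assert(delay_line != 0 && coeff != 0);

    /* Shift register: constant trip count, no data-dependent exit. */
    for (i = NUM_TAPS - 1; i > 0; i--) {
        delay_line[i] = delay_line[i - 1];
    }
    delay_line[0] = sample;

    /* Multiply-accumulate with explicit operand and result widths. */
    for (i = 0; i < NUM_TAPS; i++) {
        acc += (int32_t)delay_line[i] * (int32_t)coeff[i];
    }
    return acc;
}
```

Feeding an impulse of 10 followed by zeros into a cleared delay line with coefficients {1, 2, 3, 4} returns 10, 20, 30, and 40 over four calls, which is the filter’s impulse response scaled by 10. The same function written with a run-time tap count, heap-allocated buffers, or pointer arithmetic would compute the same thing on a processor but would leave far more for a synthesis tool to guess.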
So eye that 83% power reduction that Fujitsu QNET achieved for its voice-processing algorithm with envy. Just remember that engineering isn’t as simple as pushing a button.