NOCs: The Undead of the SOC World

November 8, 2009 on 6:14 pm | In SOC, Uncategorized | 3 Comments

The 7th International SOC Conference in Newport Beach featured a session on NOCs (networks on chip). Perhaps it’s the undue influence of the recent Halloween festivities, but NOCs remind me of vampires, of the undead. They just keep coming back no matter what, despite the lack of uptake in the commercial sector.

Academics love NOCs because they can be analyzed to death and they provide wonderful fodder for postgraduate work. You can come up with increasingly elegant, time-consuming, and costly routing algorithms for NOCs, which has permitted the creation of many, many academic papers. Each and every paper lists the prior failings of earlier NOC approaches, analyzes the shortcomings, and then proposes an even more elegant and costly NOC that solves the technical problems of predecessors. But these more elegant solutions have even less commercial potential because of the costs.

When will it end?

Perhaps never.

One of the speakers at last week’s International SOC Conference was Professor Nader Bagherzadeh of UC Irvine’s EECS Department. His presentation was sensibly titled “Is Network-on-Chip (NoC) a Viable Choice for the Future?” That’s a very reasonable question and Professor Bagherzadeh gave a reasoned presentation. One of his first slides contrasted three approaches to SOC interconnect design. The first approach, popular with most of today’s SOC designers, is the use of bus hierarchies.

Buses are the dinosaurs of system design. The fossils of bus-based, board-level designs from decades past form the bones of new SOC designs even though the economics of on-chip nanometer silicon interconnect now bear no resemblance to the copper-and-fiberglass design rules and economics of the 1980s. As Professor Bagherzadeh said, bus-based designs are not scalable, they enforce centralized control in increasingly decentralized systems of growing complexity, and they force the use of long wires on the SOC, which severely degrades performance and needlessly exposes system designs to the newest bugaboo for deep-submicron design: on-chip variability.

The current leader for efficient, fast SOC designs is point-to-point interconnect, which offers low latency, application-specific optimization, very high bandwidth, and low cost. Deep-submicron wires are plentiful and cheap. System designers should use them accordingly.

And then there are NOCs, which also promise shorter wiring runs between on-chip routers. High levels of interconnectivity mean that NOCs can provide high bandwidth with distributed traffic control. However, said Professor Bagherzadeh, NOCs are not as efficient as point-to-point wiring for carrying traffic on application-specific SOCs and consequently we have still not seen many tapeouts that use NOCs for real chips in real applications.

But that doesn’t mean that NOCs are elegantly useless. I think Professor Bagherzadeh made a good case for NOCs to be used as flexible interconnect when designing a platform chip. Here, you don’t have all of the knowledge to predict traffic flows over an entire chip and need some flexibility when routing high-bandwidth traffic. In such cases, you might be willing to suffer the silicon overhead of a NOC in exchange for interconnect flexibility.

It was at that point that Professor Bagherzadeh started to discuss his work with a 7-channel NOC router, which is even bigger, better, and more elegant than the conventional 5-port NOC router, offers more effective traffic bandwidth and throughput, and requires even more elegant routing algorithms. We now return you to our regular NOC programming where the usual solution to low uptake in NOC usage is to create bigger, better, and more elegant NOC hardware and routing algorithms.
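For readers who have never looked inside one of these routers, here is a minimal sketch of what a conventional 5-port router decides for each packet. It uses dimension-order (XY) routing on a 2D mesh, a common textbook scheme that I am substituting for illustration; it is not the algorithm from the presentation. The port names, the coordinate convention, and the function names are all assumptions.

```python
# Sketch: dimension-order (XY) routing on a 2D mesh of 5-port routers.
# Ports are East, West, North, South, and Local. The coordinate
# convention (x grows east, y grows north) is an assumption.

STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}

def xy_next_port(current, dest):
    """Return the output port a router at `current` picks for `dest`."""
    cx, cy = current
    dx, dy = dest
    if dx > cx:
        return "E"
    if dx < cx:
        return "W"
    if dy > cy:
        return "N"
    if dy < cy:
        return "S"
    return "L"  # packet has arrived; deliver it to the local core

def xy_route(src, dest):
    """List the output ports taken, hop by hop, from src to dest."""
    ports, pos = [], src
    while True:
        port = xy_next_port(pos, dest)
        ports.append(port)
        if port == "L":
            return ports
        pos = (pos[0] + STEP[port][0], pos[1] + STEP[port][1])

print(xy_route((0, 0), (2, 1)))  # ['E', 'E', 'N', 'L']
```

The scheme is simple because each router looks only at its own coordinates and the destination. The more elegant algorithms discussed at the conference adapt to traffic, which is where the extra silicon goes.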

A Low-Power, ARM-based Microcontroller from Oslo with a Winning Presentation

November 1, 2009 on 3:07 pm | In Uncategorized | 1 Comment

Last month at the ARM Techcon 3 conference, I watched as the CEO of a Norwegian fabless semiconductor company named Energy Micro leapt on stage, imitated Tom Cruise in his Mission Impossible role, opened his black-and-silver attache case, and announced the company’s EFM32 low-power microcontroller based on an ARM Cortex-M3 processor core. What really impressed me was not the over-amped Mission Impossible intro video or the bright green neckties that served as the company uniform at the conference. No, I was impressed by the strikingly graphical way the Energy Micro marketing crew demonstrated why their microcontroller draws the least power. I was impressed enough to go through those slides here with you. See if you don’t agree with me about the effectiveness of this graphical presentation.

Energy Micro 1

This first slide shows a power consumption profile curve for a microcontroller as it wakes up, does its thing, and then goes back to sleep. The area shown under the curve is the total expended energy for this profile. Reduce the area under the curve and you’ve cut energy consumption. Are you with me so far?

The first and most obvious thing to do to cut energy consumption is reduce the amount of power drawn by the microcontroller while it’s running in active mode. At 3V and with a 25 to 35 MHz clock, Energy Micro’s EFM32 consumes 180 microamps/MHz when executing code from internal Flash memory. At 3V and 1 MHz, the current consumption is 220 microamps/MHz. (In other words, at 1 MHz the current consumption is 220 microamps.)
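To put those per-MHz figures into more familiar units, here is a small sketch. The 180 and 220 microamps/MHz values and the 3V supply come from the numbers above; the 32 MHz clock is my assumption (it sits inside the quoted 25 to 35 MHz range), and the function names are made up for illustration.

```python
# Figures quoted for the EFM32 at 3 V, executing from internal Flash.
SUPPLY_V = 3.0

def active_current_ua(ua_per_mhz, clock_mhz):
    """Active-mode current in microamps at a given clock rate."""
    return ua_per_mhz * clock_mhz

def active_power_mw(ua_per_mhz, clock_mhz, supply_v=SUPPLY_V):
    """Active-mode power in milliwatts (uA * V = uW, then / 1000)."""
    return active_current_ua(ua_per_mhz, clock_mhz) * supply_v / 1000.0

def energy_per_cycle_pj(ua_per_mhz, supply_v=SUPPLY_V):
    """Energy per clock cycle in picojoules (uA/MHz * V = pJ/cycle)."""
    return ua_per_mhz * supply_v

# Assumed 32 MHz clock: 180 uA/MHz -> 5760 uA (5.76 mA), about 17.3 mW.
print(active_current_ua(180, 32), active_power_mw(180, 32))
# At 1 MHz: 220 uA, about 0.66 mW.
print(active_current_ua(220, 1), active_power_mw(220, 1))
# Energy per cycle is lower at the higher clock: 540 pJ versus 660 pJ.
print(energy_per_cycle_pj(180), energy_per_cycle_pj(220))
```

The last line is the interesting one. On these figures, running fast and then going back to sleep costs less energy per clock cycle than running slowly, which is exactly the argument the slides go on to make.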

Energy Micro 2

The next step towards reducing the microcontroller’s energy consumption is to use a processor core that executes code efficiently so that the microcontroller spends less time in active mode. The EFM32 employs a 32-bit ARM core, which is way more efficient than older 8- and 16-bit microcontroller processor architectures at performing today’s more advanced tasks, so tasks can be executed more quickly—in fewer clock cycles.

Energy Micro 3

Next, you need to deal with the energy consumed between the time the processor starts to wake up from sleep mode and the time it starts executing code. This is dead time when the processor isn’t doing anything useful (just like in sleep mode). However, during this time the microcontroller draws way more current than it does in sleep mode and that energy is essentially wasted with respect to “getting the work done.” Some processors don’t wake up very fast, so they waste a non-negligible amount of energy between the time they exit sleep mode and the time they start to execute code. The EFM32 wakes up from its deep-sleep and stop modes in 2 microseconds, which appears to be relatively fast for this sort of thing compared to the numbers for competing processors in Energy Micro’s ARM Techcon 3 presentation.

Energy Micro 4

In both of these modes, the EFM32 draws less than one microamp of current. The difference between the modes is that in deep-sleep mode, various low-frequency (32-kHz) peripherals continue to operate and can wake the processor. In stop mode, only interrupts, the I2C interface, and the on-chip analog comparators can wake the processor.

Energy Micro 5

Because many embedded applications that have extremely low power and energy consumption requirements tend to put processors to sleep most of the time, it’s critical that the microcontroller have extremely low current consumption during its deepest sleep mode. The EFM32’s shutoff-mode current rating is a mere 20 nanoamps but it takes the processor 160 microseconds to come out of this mode, versus 2 microseconds for the lesser sleep modes. However, with 20 nanoamps of current consumption, the dirt on the board could consume more current than the processor through surface leakage if you’re not careful in cleaning the circuit board.

Energy Micro 6

You need to assert the reset pin to bring the EFM32 out of shutoff mode. Above shutoff, there are four other operating modes (stop, deep sleep, sleep, and run) with increasing levels of on-chip activity and increasing amounts of current consumption (from 0.6 microamps/MHz to 180 microamps/MHz).

What do you get by nibbling away various rectangles from the area under the original power-profile curve? You get a processor that might be able to run for more than 4 years from a CR2032 coin cell, which is longer than competing microcontrollers according to Energy Micro.
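How little work does a 4-year coin-cell life imply? Here is a rough back-of-the-envelope sketch. Several inputs are my assumptions rather than Energy Micro’s numbers: a 225 mAh capacity for the CR2032 (a typical nominal figure), a 32 MHz clock (so 180 microamps/MHz gives 5760 microamps active), and 0.9 microamps in sleep (consistent with “less than one microamp”). The sketch ignores battery self-discharge, peripheral current, and wake-up overhead.

```python
HOURS_PER_YEAR = 24 * 365

def average_current_ua(active_ua, sleep_ua, duty):
    """Time-weighted average current for a given active duty cycle (0..1)."""
    return duty * active_ua + (1.0 - duty) * sleep_ua

def battery_life_years(capacity_mah, avg_current_ua):
    """Ideal battery life: capacity divided by average current."""
    hours = capacity_mah * 1000.0 / avg_current_ua  # mAh -> uAh, then / uA
    return hours / HOURS_PER_YEAR

def max_duty_for_life(capacity_mah, years, active_ua, sleep_ua):
    """Largest active duty cycle that still reaches the target life."""
    budget_ua = capacity_mah * 1000.0 / (years * HOURS_PER_YEAR)
    return (budget_ua - sleep_ua) / (active_ua - sleep_ua)

# Assumed values: 225 mAh cell, 5760 uA active, 0.9 uA asleep.
duty = max_duty_for_life(225, 4, 5760, 0.9)
print(f"active duty cycle for 4 years: {duty:.4%}")
print(battery_life_years(225, average_current_ua(5760, 0.9, duty)))
```

Under these assumptions the average current budget is about 6.4 microamps, so the processor can be awake roughly 0.1% of the time. That is why every rectangle shaved off the curve matters.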

But wait, there’s more! The EFM32 sports “smart” autonomous peripherals, so the internal ARM Cortex-M3 processor core can spend even more time sleeping and less time working. The EFM32’s intelligent peripherals, which can be time- or data-triggered, include a 6-to-12-bit A/D converter with 8 analog input channels that draws from 500 nanoamps (running at 1K 6-bit samples/sec) to 200 microamps (running at 1M 12-bit samples/sec), a 4×40-segment LCD driver with built-in voltage booster that draws 900 nanoamps, a low-energy UART (a “LUART”) that draws 100 nanoamps running at 9600 bps, and a 32-kHz clock/counter that draws 50 nanoamps.

Energy Micro 7

Energy Micro claims that the autonomous peripherals in the EFM32 microcontroller can chop a few more rectangles out of the energy-consumption curve, keeping the processor dormant longer, so that it can get 10 years out of that CR2032 coin-cell battery. That’s four times longer than the next competitive microcontroller, according to Energy Micro.

Energy Micro 8

In addition to these autonomous peripherals there’s a DAC, a power-on reset circuit, real-time clock/counter, watchdog timer, power-monitor, etc. Oh yes, there’s 16 to 128Kbytes of Flash and 8 to 16 Kbytes of RAM on the chip along with the ARM Cortex-M3 processor core and the assorted peripherals. A large number of family members (22) with the usual mix-and-match combinations of peripherals and memory found in most microcontroller families are planned.

What might you do with such low-power devices? Energy Micro’s Web site lists a lot of interesting applications including energy and utility metering (electricity meters, water meters, gas meters, and heat cost allocators), home and building control (HVAC systems, lighting control, smart home systems), alarm and security systems (burglar alarms, fire and safety alarms, smoke detectors, surveillance systems), industrial automation (temperature sensors, pressure sensors, vibration sensors, motion sensors), medical devices (pacemakers and defibrillators, glucose meters, blood-pressure monitors), remote controls (IR and RF remote controls, keyless entry), identification systems (RFID, tracking systems, access control), sporting goods and equipment (GPS, sport watches, MP3 players, pulse and pace monitors), and climate monitoring (humidity sensors, CO2 and gas sensors, temperature sensors, and corrosion detectors). That list is hardly exhaustive, but it’s a darn good start.

The first EFM32 microcontroller chips come in QFN64 and BGA112 packages, which are currently sampling with lead customers. Pricing starts at $1.55 in 100k quantities for 32-pin packages. Interested? Development kits are supposed to be available this month. Samples will be available next month, in December. Volume deliveries are scheduled for February 2010. www.energymicro.com.

Give OTP a chance for low-power, on-chip storage

October 4, 2009 on 6:58 pm | In CMOS, Design, Flash, Hubble, Low-Power, Space, Uncategorized | No Comments

The on-chip memories that get most of the attention are read/write memories such as SRAM, DRAM, Flash, and MRAM (which I just covered in my previous blog entry). However, there’s a place for OTP (one-time programmable) memory on chip, so the technology bears some thought. I discussed OTP at last week’s GSA Emerging Opportunities Expo and Conference in Santa Clara, California with Jim Lipman of Sidense, a vendor that offers hard IP for on-chip OTP memory.

Sidense’s SiPROM memory cell consists of one specially designed FET as shown in the figure below. The special part of the FET’s design is a stepped gate-oxide layer with two thicknesses: thick and thin. Unprogrammed, the FET looks like a FET. Programming causes a controlled disruption in the thin part of the FET’s channel-oxide insulation to produce a conduction path from the FET’s gate to the conduction channel. Charge-coupled sense amps can detect whether a FET in the OTP array has been programmed.

It’s because of the charge-coupled sense amps that Sidense’s SiPROM technology qualifies as a low-power memory technology. These sense amps are only on for tens of nanoseconds during a read cycle and are not powered continuously. This is a patented feature of Sidense’s technology.

Although designers have an obvious bias towards read/write technologies for on-chip memory, OTP memory can be quite useful for storing infrequently programmed or reprogrammed data such as calibration and trim settings, serial numbers, configurations, boot code, and security keys. This last application is particularly interesting. Lipman provided an example. The security keys for the HDMI digital display interface spec need about 2.5 kbits for storage. However, there’s the possibility that the security can be broken and that new keys will need to be distributed. A 16-kbit array of OTP memory can store about six sets of HDMI keys, which should be enough storage to last beyond the expected life of the end equipment.
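The arithmetic behind Lipman’s “about six sets” is easy to check. This is only a sketch of the division, using the two sizes quoted above; the function name is made up.

```python
def key_sets_that_fit(array_kbits, key_set_kbits):
    """Whole key sets that fit in the array (a partial set is useless)."""
    return int(array_kbits // key_set_kbits)

print(key_sets_that_fit(16, 2.5))  # 6
```

Six sets use 15 kbits, which leaves about 1 kbit of the 16-kbit array for other data.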

You should also be aware of the factors that argue in favor of on-chip OTP memory. Sidense’s cells are about 1.2x larger than ROM cells, so there’s a 20% size penalty for the flexibility of programmability. In exchange for this size penalty, there’s no need for a mask change if the data stored in the OTP ROM needs to be changed in the factory or in the field (for an update).

In addition, Sidense’s OTP memory easily tracks IC manufacturing process changes. It is hard IP, however, so Sidense must tailor the IP for each vendor’s process technology. Sidense’s SiPROM products are currently available from 180nm to 55nm and are portable to 40nm and below. Supported foundries include TSMC, UMC, Fujitsu Microelectronics, SMIC, Tower, IBM and Chartered.

It’s also interesting to compare OTP memory with Flash. Lipman says that Sidense’s OTP SiPROM cells are about half the size of Flash cells for a given semiconductor technology. In addition, the creation of Flash-cell floating gates adds process changes that can add roughly 30% to wafer production costs. Finally, Flash process technology is clearly getting into trouble as lithographies shrink. Some presenters at the recent Flash Memory Summit were predicting that the 22nm node might be the last node to support Flash memory, although such end-of-the-world prognostications from the semiconductor pundits are often wrong. By contrast, Sidense’s SiPROM cells require only standard CMOS processing, so the company claims it’s easier for their OTP memory than it is for Flash cells to track process improvements.

FPGAs as ASSP/Microcontroller Helpers – When ASICs and SOCs Won’t Do

September 1, 2009 on 2:40 pm | In Uncategorized | No Comments

Without doubt, an ASIC or SOC is the way to create systems with the lowest power dissipation. However, many other factors can offset the advantages of custom system silicon. Those factors include a critical and looming market window, a lack of funds for the resulting ASIC/SOC NRE charges and design-tool costs, a design team that simply lacks experience with chip design, or inadequate projected sales volumes to justify the time and expense of ASIC/SOC design. In such circumstances, the design team will usually try to find an ASSP (application-specific standard product) or an off-the-shelf microcontroller that closely meets the design specs and will then fill the inevitable functional gaps with software or firmware.

But what if that’s not possible? What if there is no such ASSP or microcontroller? What if software can’t fill the gap? Then the only choice is to get the hardware as close as possible and then plug the gap with additional circuitry. But extra circuitry brings added disadvantages. First, it adds to the BOM, assembly, and unit-test costs. Second, it consumes space on the circuit board, and in applications that are really short on room (such as mobile phone handsets) it consumes additional cubic millimeters that probably cannot be spared. Finally, it consumes added power.

Enter the FPGA vendors, who would have you believe that one of their components can save the day by adding huge numbers of “system gates” at low cost. You can get both large gate counts and low price from FPGAs, but rarely at the same time. Further, FPGAs with large gate counts have large accompanying power-consumption specs. That’s because the major FPGA vendors have pursued performance over all other characteristics and their static- and dynamic-power consumption specs reflect that chase. As FPGAs have become IC process-technology drivers, they have continued to push lithography limits using performance-tuned process parameters. The resulting FPGAs make the most of process speed at the cost of dynamic current consumption and high leakage.

However, if all you’re doing with the FPGA is making relatively simple additions to an ASSP or microcontroller, you may need an FPGA tuned for a different design approach. That’s the philosophy behind SiliconBlue Technology’s iCE65 FPGAs, which employ a low-leakage, albeit slower version of TSMC’s 65nm process to produce low-cost FPGAs (on the order of a buck or two) with microwatt power requirements at moderate gate capacities (a few thousand 4-input LUTs). These small, low-power FPGAs are designed to be ASSP/Microcontroller helpers. They’re designed to allow the needed customization while relegating most of a system to a well-optimized standard chip or chip set.

What can you use such devices for? I asked that of Denny Steele, SiliconBlue’s Director of Marketing and Applications. Here’s the list he reeled off the top of his head:

  • An interrupt queue for a GUI-driven application to reduce how often software must bring a host processor out of sleep mode, thus minimizing processor power consumption
  • A port multiplexer to add an extra SD memory card to an ASSP with only one storage port
  • A buffered port switch to allow host and application processors to share storage media such as an SD card
  • An interface adapter that allows an existing LCD interface port to communicate with a different sort of display—such as an ePaper or eInk display that has radically different timing requirements
  • A parallel-to-serial or serial-to-parallel converter to mate one type of display interface to the other
  • A display-format converter so that an ASSP/microcontroller designed for one display size can more easily control displays of other sizes
  • A cafeteria of virtual, configurable legacy interfaces that aren’t all needed for any one design but are needed over the full usage spectrum for the final hardware design
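
The first item on that list, the interrupt queue, is easy to picture with a toy software model. This is a sketch only: in a real design the queue would be logic in the FPGA, and the class name, the `wake_threshold` parameter, and the `urgent` flag are all hypothetical.

```python
from collections import deque

class InterruptQueue:
    """Toy model of a helper-side queue that batches events for a sleeping host."""

    def __init__(self, wake_threshold):
        self.wake_threshold = wake_threshold  # hypothetical tuning parameter
        self.pending = deque()
        self.wakeups = 0

    def post(self, event, urgent=False):
        """Queue an event; wake the host only if it is urgent or the queue is full."""
        self.pending.append(event)
        if urgent or len(self.pending) >= self.wake_threshold:
            return self.wake_host()
        return []

    def wake_host(self):
        """Hand every pending event to the host in a single wake-up."""
        self.wakeups += 1
        batch = list(self.pending)
        self.pending.clear()
        return batch

q = InterruptQueue(wake_threshold=4)
for i in range(10):
    q.post(f"touch-{i}")
last_batch = q.post("power-button", urgent=True)
print(q.wakeups)     # 3 wake-ups for 11 events
print(last_batch)    # ['touch-8', 'touch-9', 'power-button']
```

In this model the host wakes 3 times instead of 11. The right threshold depends on how much latency the GUI can tolerate.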

Of course, these are just a few of the application ideas for an ASSP/microcontroller helper in the form of an FPGA. Significantly, the FPGA system-augmentation design path can help when product life cycles are quite short, as they are for mobile phone handsets in the developed world (as it so happens, outside of the US in the case of cellular telephony). In such markets, product life cycles are measured in months and design teams may be creating three or four designs per year.

An FPGA-augmented design based on standard handset chip sets comes in quite handy in such situations because the FPGA can be used to add features that end-users notice such as enhanced display resolution, touch screens, extra SD or SIM cards, and so on. Such desirable features prompt customers to unsheathe their credit cards. One board-level design with FPGA augmentation can accommodate more than one product design and more than one product generation without requiring a BOM change. That’s a real competitive advantage in today’s quick-turn world of consumer electronics.

Consequently, this is the world for which SiliconBlue optimized its iCE65 FPGA. The device is tiny (as small as 3×4 mm) to fit in small, handheld consumer products. A 4000-LUT iCE65 device draws a mere 15 microamps at 1V when running at 32 kHz, the standard heartbeat of a mobile phone handset in standby mode. That’s slow enough to use very little power but fast enough to catch an event that merits waking the host processor. Of course, the FPGA can run much faster, with higher resulting dynamic power consumption.

Are these low-cost, low-power, small-size optimizations enough to create a niche for SiliconBlue in the fiercely competitive FPGA market? “We’re betting the farm that customers will want to do this,” replied Steele.

Free Pass to DAC Exhibits, All Week Long

July 9, 2009 on 11:27 pm | In Uncategorized | No Comments

Are you an EDA user with a hankering to attend DAC in a couple of weeks but don’t have the dough-re-mi and your company won’t spring for such a “frill” this year? Recently laid off as an EDA user? Denali, Atrenta, and Springsoft want to make you an offer you can’t refuse: a full-week’s pass to DAC exhibits in exchange for a bit of information from you. Only for the first 600 people though, so better sign up quickly. Like right now! Where? Here.
