How low can you go? ARM does the limbo with Cortex-M0+ processor core. Tiny. Ultra-low-power.

Jack be limbo, Jack be quick
Jack go unda limbo stick
All around the limbo clock
Hey, let’s do the limbo rock

Limbo lower now
Limbo lower now

(From “Limbo Rock” by Chubby Checker)

How low can you go? ARM has pushed further into small-processor territory, going whole hog after the 8-bit processor cores, with the newly announced ARM Cortex-M0+ processor core. This is still a 32-bit processor core with the same 56-instruction Thumb ISA implemented by the ARM Cortex-M0 processor core. However, ARM has tossed more hardware overboard to cut the transistor count and the power requirements relative to the ARM Cortex-M0 core. Most noticeable is the drop from the ARM Cortex-M0 processor core’s 3-stage pipeline to the ARM Cortex-M0+ core’s 2-stage pipeline.

As a result, the ARM Cortex-M0+ core draws even less power than the already low-powered ARM Cortex-M0. For example, in 180nm process technology, the ARM Cortex-M0 core draws 73µW/MHz while the ARM Cortex-M0+ core draws 52µW/MHz. That’s nearly 30% less power. In 40nm process technology, the ARM Cortex-M0 draws 4µW/MHz while the ARM Cortex-M0+ processor core draws 3µW/MHz. OK, so it’s only a single microwatt of difference in power, but it’s 25% less (as they say in Marketing). Meanwhile, ARM gets slightly more performance from the ARM Cortex-M0+ core with the reduction in pipeline stages: a CoreMark/MHz score of 1.77 versus 1.62 for the ARM Cortex-M0 core.
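The percentages above are easy to check. Here is a minimal Python sketch that reruns the arithmetic using only the figures quoted in this post; the function and variable names are made up for illustration and do not come from ARM.

```python
# Figures quoted in this post, in microwatts per MHz.
POWER_UW_PER_MHZ = {
    "180nm": {"Cortex-M0": 73.0, "Cortex-M0+": 52.0},
    "40nm": {"Cortex-M0": 4.0, "Cortex-M0+": 3.0},
}

# CoreMark/MHz scores quoted in this post.
COREMARK_PER_MHZ = {"Cortex-M0": 1.62, "Cortex-M0+": 1.77}


def percent_change(old, new):
    """Signed percentage change going from old to new."""
    return (new - old) / old * 100.0


def power_savings():
    """Percentage power reduction of the M0+ versus the M0, per process node."""
    return {
        node: -percent_change(cores["Cortex-M0"], cores["Cortex-M0+"])
        for node, cores in POWER_UW_PER_MHZ.items()
    }


if __name__ == "__main__":
    for node, saving in power_savings().items():
        print(f"{node}: {saving:.1f}% less power per MHz")
    gain = percent_change(
        COREMARK_PER_MHZ["Cortex-M0"], COREMARK_PER_MHZ["Cortex-M0+"]
    )
    print(f"CoreMark/MHz improvement: {gain:.1f}%")
```

By this arithmetic the 180nm saving is a little under 29%, the 40nm saving is exactly 25%, and the CoreMark/MHz improvement is a bit more than 9%.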

Even though the core is tiny, ARM has enhanced the Cortex-M0+ with some notable features, including an optional 8-region Memory Protection Unit, one non-maskable interrupt and as many as 32 physical interrupts, sleep modes (with an optional data-retention mode), an optional 32×32-bit hardware single-cycle multiplier, optional CoreSight JTAG and debug ports, and an optional Micro Trace Buffer.

Often, there’s an automatic assumption that 32-bit processors require a larger code footprint in memory than required for 8-bit processors. It’s a natural assumption but it’s not necessarily true. The ARM Cortex-M0+ employs the Thumb ISA, which consists largely of 16-bit instructions that specify 32-bit operations. Meanwhile, 8-bit processors often require two- and three-byte instructions to specify 8-bit operations. The ARM Cortex-M0+ page shows a worst-case example of a 16-bit multiplication operation that requires 30 instruction bytes for an unspecified 8-bit processor while the ARM Cortex-M0+ processor executes the same operation (actually, a full 32×32-bit multiply) using one 16-bit instruction. Of course, your mileage may vary.
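To see why the 8-bit side of that comparison gets long, consider how a wider multiply has to be assembled from byte-sized pieces. The Python sketch below is only an illustration under stated assumptions: it models a hypothetical 8-bit core that has an 8×8 hardware multiply (the processor in ARM's example is unspecified and may differ), and it does not count instructions or bytes. The function names are invented for this sketch.

```python
MASK32 = 0xFFFFFFFF


def mul8x8(a, b):
    """Model of an 8x8 -> 16-bit multiply (assumed widest multiply on the 8-bit core)."""
    assert 0 <= a <= 0xFF and 0 <= b <= 0xFF
    return a * b


def mul16_on_8bit(x, y):
    """16x16 -> 32-bit multiply built from four 8x8 partial products."""
    assert 0 <= x <= 0xFFFF and 0 <= y <= 0xFFFF
    x_lo, x_hi = x & 0xFF, x >> 8
    y_lo, y_hi = y & 0xFF, y >> 8
    # Each partial product must be computed, shifted into place, and added
    # with carry propagation; on real 8-bit hardware every one of these
    # steps costs several instructions.
    result = mul8x8(x_lo, y_lo)
    result += mul8x8(x_lo, y_hi) << 8
    result += mul8x8(x_hi, y_lo) << 8
    result += mul8x8(x_hi, y_hi) << 16
    return result & MASK32


def mul32_single(x, y):
    """Model of one 32x32 multiply that keeps the low 32 bits of the product."""
    return (x * y) & MASK32


if __name__ == "__main__":
    x, y = 0x1234, 0xABCD
    # Both routes give the same answer; only the amount of work differs.
    print(hex(mul16_on_8bit(x, y)), hex(mul32_single(x, y)))
```

Four multiplies, three shifts, and three multi-byte additions stand in for what the 32-bit core does in a single multiply, which is the gist of the code-size argument.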

Because of the tiny real-estate footprint and extremely low power consumption, the ARM Cortex-M0 processor core has already caught the attention of some microcontroller vendors shooting for the very low end, including NXP, STMicroelectronics, and Nuvoton. It has also become a popular processor core for use as a firmware-programmable state-machine replacement in SoC designs because the processor itself consumes only 0.04mm² in 90nm process technology and less than 0.01mm² in 40nm process technology. That’s at least 100 ARM Cortex-M0+ processors per square millimeter at 40nm, not including memory. The ARM Cortex-M0+ processor core will likely prove even more attractive than the ARM Cortex-M0 core to these vendors and SoC designers because of the even lower power consumption, the improved performance, and the available enhancement options.

Freescale has already announced plans to bring out a new Kinetis L family of microcontrollers based on the ARM Cortex-M0+ processor core. The company plans to demonstrate some aspect of the Kinetis L family at the Design West Conference in Silicon Valley, followed by the unveiling of more details at the Freescale Technology Forum being held in San Antonio, Texas, this June.
