The Power Wall: Are we scaling it or is it just getting higher?

Cadence hosted a Low-Power Summit this month at which Jan Rabaey was the keynote speaker. Jan is the Donald O. Pederson Distinguished Professor in the EECS department at U.C. Berkeley; Scientific Co-Director of the Berkeley Wireless Research Center (BWRC); and director of the Multi-Scale Systems Center (MuSyC). As someone who literally wrote the book on low-power design, he had a lot to say on the subject of the Power Wall.

We’re now in an era where electronic devices are quickly becoming the leading consumers of electricity. We have “millions of servers, billions of mobile devices, and trillions of sensors.” Sensors tend to be energy frugal, and mobile devices are “energy bounded,” with a fixed amount of energy that must be carefully conserved; servers, however, are “energy hungry,” and there are lots of them. According to Rabaey, “The Cloud is where 99% of processing is or will be done.” A typical server in a server farm can consume up to 20 kW, so the 100 servers in a small- to mid-sized server farm can easily consume 2 MW. Low-power design has some high-power implications.

In his book Rabaey gives a tongue-in-cheek answer to the question “Why worry about power?” Assume that Moore’s Law continues unabated and that computational requirements keep doubling every year. Then:

  • The total energy of the Milky Way galaxy is 10^59 J;
  • The minimum switching energy for a digital gate (1 electron @ 100 mV) is 1.6 x 10^-20 J (limited by thermal noise);
  • The upper bound on the number of digital operations is therefore about 6 x 10^78 (10^59 J divided by 1.6 x 10^-20 J per operation);
  • The number of operations/year performed by 1 billion 100-MOPS computers is 3 x 10^24;
  • At that doubling rate, the entire energy of the Milky Way would be consumed in about 180 years.
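
For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch of my own (the variable names and loop are mine, not Rabaey’s) that reproduces the timeline:

    # Back-of-the-envelope check of the numbers above (my own sketch).
    E_GALAXY = 1e59        # total energy of the Milky Way, joules
    E_PER_OP = 1.6e-20     # minimum switching energy per operation, joules
    OPS_YEAR_1 = 3e24      # ops/year from 1 billion 100-MOPS computers today

    max_ops = E_GALAXY / E_PER_OP      # upper bound on operations, ~6 x 10^78

    years = 0
    total_ops = 0.0
    ops_this_year = OPS_YEAR_1
    while total_ops < max_ops:
        total_ops += ops_this_year     # add this year's computation
        ops_this_year *= 2             # demand doubles every year
        years += 1

    print(years)                       # 181 -- i.e., roughly the 180 years cited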

Not entirely convinced that computers will lead to cosmic catastrophe, Rabaey quotes Gordon Moore as saying, “No exponential is forever…but forever can be delayed.” Just how to delay it was the subject of his talk.

“Where is the next factor of 10 in energy reduction coming from?”

Over the last decade chip engineers have come up with a large number of techniques to reduce power consumption: clock gating; power gating; multi-VDD; dynamic, even adaptive voltage and frequency scaling; multiple power-down modes; and of course scaling to ever smaller geometries. However, according to Rabaey, “technology scaling is slowing down, leakage has made our lives miserable, and the architectural tricks are all being used.”

If all of the tricks have already been applied, then where is the next factor of 10 in energy reduction coming from? Basically it’s a system-level problem with a number of components:

  1. Continue voltage scaling. As processor geometries keep shrinking, so too do core voltages, up to a point. Sub-threshold operation has been the subject of a great deal of research, and the results are promising: it minimizes energy/operation; the problem is that it’s slow. Leakage is an issue, as is variability. But you can still operate at multiple MHz at sub-threshold voltages. Worst case, when you need speed you can always temporarily increase the voltage; but before resorting to that, look to parallelism (a toy energy/throughput sketch follows this list).
  2. Use truly energy-proportional systems. It’s very rare that any system runs at maximum utilization all the time, and if you don’t do anything you should not consume anything. This is mostly a software problem: manage the components you have effectively, but make sure that the processor has the buttons you need to power down (a server-power sketch follows this list).
  3. Use always-optimal systems. Such system modules are adaptively biased to adjust to operating, manufacturing, and environmental conditions: use sensors to track those conditions, and employ closed-loop feedback to adjust parameters for optimal operation. This is a design paradigm shift: always-optimal systems utilize sensors and a built-in controller (a controller sketch follows this list).
  4. Focus on aggressive deployment. Design for “better than worst-case”—the worst case is rarely encountered. Operate circuits at lower voltages and deal with the consequences.
  5. Use self-timing when possible. This reduces overall power consumption by not burning cycles waiting for a clock edge.
  6. Think beyond Turing. Computation does NOT have to be deterministic; design a probabilistic Turing machine. “If it’s close enough, it’s good enough.” In statistical computing, inputs and outputs are stochastic variables, and errors just add noise; the results don’t change as long as you stay within boundaries. Software should incorporate Algorithmic Noise Tolerance (ANT). Processors can then consist of a main block designed for the average case and a cheap estimator block for when that block is in error (an ANT sketch follows this list).
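
To make item 1 concrete, here is a toy model of my own (the constants and the alpha-power delay law are illustrative assumptions, not numbers from the talk) showing why parallelism pairs well with voltage scaling: dynamic energy per operation falls roughly as C·V^2, while gate delay grows sharply as the supply approaches the threshold, so two slower cores at a lower voltage can roughly match one fast core’s throughput for a fraction of the energy. The model only covers above-threshold operation; true sub-threshold delay grows exponentially, which is why parallelism matters there even more.

    # Toy model for item 1: energy/op ~ C*V^2, delay grows near threshold.
    # All constants are illustrative assumptions, not Rabaey's numbers.
    C_EFF = 1e-15     # effective switched capacitance per operation, farads
    V_TH = 0.3        # threshold voltage, volts
    ALPHA = 1.5       # velocity-saturation exponent (alpha-power law)

    def energy_per_op(vdd):
        return C_EFF * vdd ** 2                       # joules/op, dynamic only

    def relative_delay(vdd, v_ref=1.0):
        """Gate delay at vdd relative to operation at v_ref (alpha-power law)."""
        def delay(v):
            return v / (v - V_TH) ** ALPHA
        return delay(vdd) / delay(v_ref)

    # One core at 1.0 V vs. two slower cores at 0.6 V doing the same job:
    for cores, vdd in [(1, 1.0), (2, 0.6)]:
        throughput = cores / relative_delay(vdd)      # normalized ops/second
        print(f"{cores} core(s) @ {vdd} V: "
              f"{energy_per_op(vdd):.2e} J/op, throughput {throughput:.2f}x")

With these illustrative constants the two-core, 0.6 V configuration delivers about the same throughput at roughly a third of the energy per operation.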
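
To put some numbers behind item 2 (the power figures and utilizations below are my own assumptions, not from the talk), compare a server that idles at a large fraction of its peak power with one whose software actually powers down everything it isn’t using:

    # Sketch for item 2: why idle power dominates in under-utilized servers.
    # The power figures and utilizations are illustrative assumptions.
    P_PEAK = 500.0               # watts at full load
    P_IDLE_TYPICAL = 300.0       # idle draw with little power management
    P_IDLE_PROPORTIONAL = 10.0   # idle draw if unused components are powered down

    def avg_power(utilization, p_idle):
        """Average power, assuming it scales linearly from idle to peak."""
        return p_idle + utilization * (P_PEAK - p_idle)

    for util in (0.1, 0.3, 0.5):
        print(f"{util:.0%} utilized: "
              f"{avg_power(util, P_IDLE_TYPICAL):.0f} W today vs. "
              f"{avg_power(util, P_IDLE_PROPORTIONAL):.0f} W if energy-proportional")

At typical server-farm utilizations most of the energy goes into doing nothing, which is exactly the point of energy proportionality.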
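
Item 3 amounts to putting a small control loop on the chip. Here is a minimal sketch, assuming a timing-slack sensor (for example a replica of the critical path) and a programmable regulator; read_timing_slack() and set_vdd() are hypothetical placeholder interfaces, not real APIs.

    # Sketch for item 3: a closed-loop controller that keeps a module at the
    # lowest supply voltage that still meets timing. read_timing_slack() and
    # set_vdd() are hypothetical placeholders for an on-chip delay monitor
    # and a voltage-regulator interface.
    TARGET_SLACK = 0.05          # keep ~5% timing margin
    V_MIN, V_MAX = 0.5, 1.1      # allowed supply range, volts
    STEP = 0.01                  # volts per control step

    def control_step(vdd, read_timing_slack, set_vdd):
        """One iteration of a simple feedback loop around the slack target."""
        slack = read_timing_slack()                  # sensor reading, 0.0-1.0
        if slack < TARGET_SLACK and vdd < V_MAX:
            vdd += STEP                              # too little margin: raise VDD
        elif slack > 2 * TARGET_SLACK and vdd > V_MIN:
            vdd -= STEP                              # ample margin: save energy
        set_vdd(vdd)                                 # actuate the regulator
        return vdd

Because the loop reacts to what the silicon is actually doing, it absorbs manufacturing and environmental variation at run time instead of margining for it at design time.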
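
Item 6 is the most provocative, so here is a minimal sketch of the ANT idea (the averaging task, the error model, and the threshold are my own illustration, not from the talk): a main block that occasionally errs, say because it is running at an aggressively low voltage, is paired with a cheap estimator, and when the two disagree badly the estimate is used instead.

    import random

    # Sketch for item 6: Algorithmic Noise Tolerance (ANT).
    # The task, error model, and threshold are illustrative.

    def main_block(samples):
        """Exact average, but with rare large errors (e.g., timing failures)."""
        result = sum(samples) / len(samples)
        if random.random() < 0.05:                 # rare soft error
            result += random.choice([-1.0, 1.0]) * 100.0
        return result

    def estimator(samples):
        """Cheap, low-precision estimate: subsample every 4th value."""
        subset = samples[::4]
        return sum(subset) / len(subset)

    def ant_output(samples, threshold=10.0):
        exact = main_block(samples)
        approx = estimator(samples)
        # An implausibly large disagreement means the main block probably erred.
        return approx if abs(exact - approx) > threshold else exact

    data = [random.gauss(0.0, 1.0) for _ in range(1024)]
    print(ant_output(data))

The rare large error in the main block shows up only as a little extra noise in the output, which is exactly the trade the statistical-computing argument accepts.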

Rabaey emphasized several points he wanted everyone to take away to their labs:

  • Major reductions in energy/operation are not evident in the near future;
  • Major reductions in design margins are an interesting proposition;
  • Computational platforms should be dynamically self-adapting and include self-regulating feedback systems;
  • Most applications do not need high resolution or deterministic outcomes;
  • The challenge is rethinking applications, algorithms, architectures, platforms, and metrics. This requires inspiration.

What does all of this mean for design methodology? For one thing, “The time of deterministic ‘design time’ optimization is long gone!” How do you specify, model, analyze and verify systems that dynamically adapt? You can’t expect to successfully take a static approach to a dynamic system.

So what can you do? You can start using probabilistic engines in your designs, built on statistical models of components, input descriptions that capture intended statistical behavior, and outputs that are accepted as long as they fall within statistically meaningful bounds. Algorithmic optimization and software generation (aka compilers) need to be designed so that the intended behavior is obtained.
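
One way to picture that shift (a sketch of my own, not a flow Rabaey described): instead of verifying a single deterministic corner, draw the components from statistical models and check that the outputs land inside the specified window often enough. The two-stage amplifier, its spreads, and the spec window below are illustrative assumptions.

    import random
    import statistics

    # Sketch of statistical verification: component values are drawn from
    # statistical models, and the "spec" is a bound the output must meet
    # with sufficiently high probability.

    def sample_gain():
        """Overall gain of two cascaded stages, each with manufacturing spread."""
        stage1 = random.gauss(10.0, 0.5)   # nominal 10x, sigma 0.5
        stage2 = random.gauss(10.0, 0.5)
        return stage1 * stage2

    SPEC_LOW, SPEC_HIGH = 85.0, 115.0      # acceptable overall gain window
    N_TRIALS = 100_000

    gains = [sample_gain() for _ in range(N_TRIALS)]
    in_spec = sum(SPEC_LOW <= g <= SPEC_HIGH for g in gains) / N_TRIALS

    print(f"mean gain {statistics.mean(gains):.1f}, fraction within spec {in_spec:.3f}")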

For a model of the computer of the future Rabaey pointed to the best known “statistical engine,” the human brain. The brain has a memory capacity of 100K terabytes and consumes about 20 W, roughly 20% of total body dissipation from only about 2% of body weight. It has a power density of ~15 mW/cm^3 and can perform 10^15 computations/second using only 1-2 fJ per computation, a good 100 times better than we can do in silicon today.

So if we use our brains to design computers that resemble our brains, perhaps we can avoid the cosmic catastrophe alluded to earlier. Sounds like a good idea to me.
