Verification Methodology for Low Power—Part 4
Multivoltage Verification—Dynamic Verification and Hierarchical Power Management
This is the fourth of four weekly serialized installments from the Verification Methodology Manual for Low Power. Part 1 covered Multi-Voltage Testbench Architecture—Testbench Structure and Components. Part 2 covered Multi-Voltage Testbench Architecture—Coding Guidelines as well as Library Modeling for Low Power. Part 3 addressed Multivoltage Verification—Static Verification. Part 4 covers Multivoltage Verification—Dynamic Verification and Hierarchical Power Management.
6.4 DYNAMIC VERIFICATION
The first objective of dynamic verification is to exercise the power state table. Assuming that static verification yields a clean result, we can assume that in a steady multi-voltage state, there are no further obvious electrically hazardous conditions. Corner cases may well exist that need to be uncovered by dynamic verification. However, before we get there we have some basic functionality to verify.
For example, consider the situation from Figure 6-3, where the error lies not in the implementation but in the partitioning: a multiplier is incorrectly placed in a domain that can be turned off. The multiplier is needed as a resource in one of the modes, yet that mode unintentionally turns it off. If there are no tests that observe the output of the multiplier in this mode, then the error is unlikely to be detected. Note that the design could be completely correct structurally and yet encounter this error.
The reader might conclude that this is a trivial problem: the multiplier outputs would all be isolated values and the result, once used in the logic downstream, will definitely yield the error. However, consider the situation where the number of power domains is seven and each domain can be On/Off. That yields up to 2^7 = 128 possible states. The partitioning is not at the multiplier level. It is possibly at the level of an IP block such as a processor core. The error in partitioning the multiplier is caught if and only if there are tests for the multiplier in every mode where the processor core is on. Even with a small number of power domains, the problem at this granularity becomes quite a monster. We need to arrive at exhaustive coverage in each mode of the design through random testing methods.
Recommendation 6.6 — Each power state needs to be tested exhaustively by covering all the major micro-architectural elements in that state.
Recommendation 6.7 — In each power state, coverage of all logic that is on should be as close to complete as possible.
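The size of this coverage space can be made concrete with a small sketch. It assumes seven on/off domains as in the text; the domain names, the element names, and the `REQUIRED_WHEN_ON` mapping are hypothetical and stand in for whatever the architecture specification says must be usable when a given domain is on.

```python
from itertools import product

# Hypothetical domain names; the text only says there are seven on/off domains.
DOMAINS = ["core", "dsp", "video", "audio", "modem", "usb", "mem"]

# Every on/off combination of the seven domains.
ALL_STATES = [dict(zip(DOMAINS, bits))
              for bits in product([False, True], repeat=len(DOMAINS))]

# Hypothetical architectural intent: element -> domain whose "on" state
# implies the element must be usable.
REQUIRED_WHEN_ON = {"multiplier": "core", "alu": "core", "fir_filter": "dsp"}


def state_key(state):
    """Hashable name for a state: the tuple of domains that are on."""
    return tuple(d for d in DOMAINS if state[d])


def coverage_goals():
    """One goal per (state, element) pair in which the element is required."""
    return {(state_key(s), elem)
            for s in ALL_STATES
            for elem, dom in REQUIRED_WHEN_ON.items()
            if s[dom]}


def uncovered(goals, hits):
    """Goals that no test has observed yet."""
    return goals - set(hits)


goals = coverage_goals()
core_on_states = [s for s in ALL_STATES if s["core"]]
print(len(ALL_STATES), len(core_on_states), len(goals))
```

With these assumed names there are 128 candidate states, 64 of which have the core on, so the multiplier alone contributes 64 coverage goals. A test that observes the multiplier in only one of those states leaves the other 63 in the `uncovered` set.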
The opposite situation is equally troublesome: verifying that resources not needed in a given mode are actually turned off. For example, consider the situation where the multiplier is in a domain that is on while the processor core is turned off. This is quite a hard problem to detect because the multiplier inputs are isolated in this state and hence controllability becomes practically impossible. However, Recommendation 6.7 will yield the offending block as an uncovered element and the debug process should result in an architectural analysis. Therein lies the lesson: power management failure conditions tend to be difficult to debug and may involve a re-examination of the architecture.
Moving further beyond power states, state transitions are the next target of verification. The ability of the PMU to sense the need for a state transition, assert and de-assert the appropriate control signals relevant to the transition, and signal completion of the transition and resumption of execution needs to be well tested.
The verification of transitions is complicated by the following factors:
- The transition may be aborted and a safe return to the original or other state may be required.
- Electrically unsafe conditions may occur, such as a level shifter being required where none would otherwise be needed, or level shifters going out of their input/output voltage range.
- The power integrity of the design may be stressed by rush currents or by the fact that many voltage rails are changing at the same time, causing a lot of noise on the power rails.
In terms of state transitions and sequences, a must-have for verification coverage is the set of possible sequences for wakeup from and shutdown to an all-off state, such as the bring-up sequence. A number of complex tasks happen in this set of states, and the transitions often involve asynchronous and mixed-signal events. For example, the power-on reset of the chip or parts of it may be triggered by the supply voltage passing a certain trip point. This in turn may latch configuration bits, device status bits and others before proceeding to full power up or aborting the power up. This is also a place where system-level deadlocks are likely. Hence, tests must be directed at this part of the power management state space, not just at functional states/modes.
Hence, the verification of transitions needs to comply with the following rules, apart from the obvious one that all possible state transitions must be covered.
Rule 6.8 — Transitions must be tested for abort signals if applicable.
Rule 6.9 — Assertions to guard against multiple rail changes at the same time must be present.
Recommendation 6.10 — Level shifter range violations must be guarded against by writing appropriate assertions at each spatial crossing across voltage domain boundaries.
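The two electrical checks named above can be sketched as offline checkers over a sampled trace of rail voltages. This is only an illustration of the intent; in a real flow these would be assertions in the simulation environment. The rail names, the trace format, the voltage ranges, and the choice to skip crossings where either rail is at 0 V are all assumptions.

```python
def simultaneous_rail_changes(trace, tol=1e-9):
    """Return (sample_index, rails) wherever more than one rail changed.

    trace is a list of {rail_name: voltage} samples, one per time step.
    """
    violations = []
    for i in range(1, len(trace)):
        changed = [r for r in trace[i]
                   if abs(trace[i][r] - trace[i - 1][r]) > tol]
        if len(changed) > 1:
            violations.append((i, sorted(changed)))
    return violations


def level_shifter_violations(sample, crossings):
    """Return the crossings whose level shifter is outside its rated range.

    crossings is a list of (src_rail, dst_rail, (in_min, in_max),
    (out_min, out_max)) tuples, one per voltage domain boundary crossing.
    """
    bad = []
    for src, dst, (in_lo, in_hi), (out_lo, out_hi) in crossings:
        v_in, v_out = sample[src], sample[dst]
        # Assumption: a rail at 0 V is off, and that case is left to
        # isolation checks rather than level shifter checks.
        if v_in == 0.0 or v_out == 0.0:
            continue
        if not (in_lo <= v_in <= in_hi and out_lo <= v_out <= out_hi):
            bad.append((src, dst))
    return bad


# Hypothetical trace: the last step moves both rails at once.
trace = [
    {"vdd_cpu": 1.2, "vdd_dsp": 1.0},
    {"vdd_cpu": 1.3, "vdd_dsp": 1.0},
    {"vdd_cpu": 1.4, "vdd_dsp": 0.9},
]
crossings = [("vdd_cpu", "vdd_dsp", (1.0, 1.3), (0.8, 1.1))]

print(simultaneous_rail_changes(trace))
print(level_shifter_violations(trace[2], crossings))
```

In this hypothetical trace the final sample is flagged twice: both rails move in the same step, and the source rail leaves the assumed 1.0 V to 1.3 V input range of the level shifter.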
Transitions can also occur for multiple reasons and perhaps conflict with each other. For example, an incoming phone call may direct the CPU to operate at 1.2V whereas a camera “click” in progress may be operating the CPU at 1.4V.
Rule 6.10 — Power states must be tested for conflicting transition inputs and priority of transitions must be resolved as architecturally intended.
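A test for this rule needs a reference model of the intended priority to compare the design against. The sketch below is one minimal way to express such a model, using the 1.2V phone call and 1.4V camera requests from the example. The `Request` type, the `PRIORITY` table, and the assumption that the phone call wins are hypothetical; the real ordering comes from the architecture specification.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Request:
    source: str
    cpu_voltage: float


# Hypothetical architectural priority: a lower number wins.
PRIORITY = {"phone_call": 0, "camera": 1}


def resolve(requests):
    """Return the request that the (assumed) architecture says should win."""
    if not requests:
        return None
    return min(requests, key=lambda r: PRIORITY[r.source])


def winner_is_order_independent(requests, expected_source):
    """The same source must win regardless of arrival order."""
    return all(resolve(list(order)).source == expected_source
               for order in permutations(requests))


call = Request("phone_call", 1.2)
camera = Request("camera", 1.4)

print(resolve([camera, call]))
print(winner_is_order_independent([call, camera], "phone_call"))
```

Checking every arrival order matters because a conflict may be resolved correctly when the call arrives second but incorrectly when it arrives first.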
However, the phone call and camera example above involves more than resolving a conflict. Consider the situation in which the phone call is indeed the priority, but the camera data is not discarded; the process is merely sent into the background. This implies that once the phone call is done, there is a need to restore the camera mode and execute accordingly. This now brings us to sequences. Even a small design with few power states is likely to see numerous sequences, particularly with logic elements saving relevant context in various states.
Thorough coverage of sequences is not possible without resorting to some random stimulus techniques. Further, sequences should represent real life usage. In modern SoCs, coverage of sequences is best accomplished by applying as many software tests as possible. In addition, one must apply random stimuli such as interrupts, critical failure conditions, timer/counter triggers to ensure that in every case, the DUT stays within defined power states and without deadlock conditions.
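The idea of driving random events while checking for illegal states and deadlock can be sketched against a small reference state machine. The states, events, and transitions below are hypothetical, loosely modeled on the phone call and camera example; the sketch shows the shape of the check, not a real PMU.

```python
import random

# Hypothetical power state machine: state -> {event: next_state}.
FSM = {
    "OFF": {"power_button": "IDLE"},
    "IDLE": {"camera_click": "CAMERA", "incoming_call": "CALL",
             "timeout": "OFF"},
    "CAMERA": {"incoming_call": "CALL_CAMERA_BG", "done": "IDLE"},
    "CALL": {"hang_up": "IDLE"},
    # The camera was sent to the background; restore it after the call.
    "CALL_CAMERA_BG": {"hang_up": "CAMERA"},
}
EVENTS = sorted({e for edges in FSM.values() for e in edges})


def all_transitions():
    """Every (state, event, next_state) edge defined in the model."""
    return {(s, e, n) for s, edges in FSM.items() for e, n in edges.items()}


def deadlocked_states():
    """States from which no event leads to a different state."""
    return [s for s, edges in FSM.items()
            if all(n == s for n in edges.values())]


def random_walk(steps, seed):
    """Apply random events and return the set of transitions exercised."""
    rng = random.Random(seed)
    state, seen = "OFF", set()
    for _ in range(steps):
        event = rng.choice(EVENTS)
        nxt = FSM[state].get(event, state)  # unhandled events are ignored
        if nxt not in FSM:
            raise AssertionError(f"left the defined power states: {nxt}")
        if nxt != state:
            seen.add((state, event, nxt))
        state = nxt
    return seen


seen = random_walk(steps=5000, seed=1)
print(sorted(all_transitions() - seen))  # transitions still to be covered
print(deadlocked_states())
```

The set difference at the end is the transition coverage hole list; any entry that persists across many seeds points either at an unreachable transition or at stimulus that needs to be directed.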
As the number of power domains and states increases, this becomes an exponentially increasing coverage space. Such designs do not lend themselves to ordinary coverage metrics. One must necessarily think of the design as a hierarchy of connected systems and target verification at the interaction between such connected systems and within each subsystem. We must also try to prune the coverage required by being selective about which metrics are relevant to verification.
6.4.1 IMPACT OF DESIGN STYLE—ARCHITECTURE AND MICRO-ARCHITECTURE
One of the unique aspects of power management is that the design style chosen makes a significant impact on the bug types and hence the verification strategy. For example, a power gated design is susceptible to isolation errors, rush current/voltage scheduling errors or reset errors, but does not have level shifting or memory corruption issues. Likewise, a design that uses power gated state retention is unlikely to have a logic conversion error.
Apart from the generics of walking through the state table, transitions and sequences, we need to focus verification on the micro-architectural implementation of control as well. If we turned our attention to the design structure, islands, power gating enables, voltage ID codes, retention controls, isolation enables, and charge pumps, we can ask ourselves what the coverage metrics look like from this point of view. In general, it is not sufficient to walk through the power state table and transitions. In fact, this becomes prohibitive as the number of islands and hence state combinations grows.
However, when one focuses on the design elements, a more manageable coverage metric emerges. We still need to watch for illegal states and transitions, but it would help to know how many times an island has been shut down. Sometimes, this might yield a vector that exercises all power states, but never shuts down a particular island, thereby indicating an error in the state table or the partition.
Rule 6.11 — Coverage must be measured on design elements such as islands and power management controls, apart from the power state table, transitions and sequences.
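An element-level metric of this kind can be as simple as counting on-to-off transitions per island over the states a test vector visits. The sketch below assumes the visited states are available as a list of `{island: is_on}` dictionaries; the island names are hypothetical.

```python
from collections import Counter


def shutdown_counts(state_sequence, islands):
    """Count on-to-off transitions per island over a sequence of states."""
    counts = Counter({island: 0 for island in islands})
    for prev, cur in zip(state_sequence, state_sequence[1:]):
        for island in islands:
            if prev[island] and not cur[island]:
                counts[island] += 1
    return counts


def never_shut_down(state_sequence, islands):
    """Islands that the vector never turned off."""
    counts = shutdown_counts(state_sequence, islands)
    return sorted(i for i in islands if counts[i] == 0)


# Hypothetical islands and a short vector of visited power states.
islands = ["cpu", "dsp", "periph"]
vector = [
    {"cpu": True, "dsp": True, "periph": True},
    {"cpu": False, "dsp": True, "periph": True},
    {"cpu": True, "dsp": False, "periph": True},
]

print(shutdown_counts(vector, islands))
print(never_shut_down(vector, islands))
```

Here `periph` is reported as never shut down. If the architecture intends it to be switchable, that result points at a gap in the state table or the partition rather than at a missing test alone.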
Focusing further on control, one must take into consideration special effects such as selective retention, staggered power switch control, split isolation structures, and latch-based isolation, and test them at the appropriate points in the power management sequence. For example, consider the selective isolation enable shown in Figure 6-4. It is important not only to cover Iso_en1 and Iso_en2, but also to ensure that any required sequence between them is honored. It is also important to test the interval between the isolation enables and disables to see if the design's functionality is maintained.
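A sequencing check between Iso_en1 and Iso_en2 can be sketched as a scan over a time-ordered event log. The ordering rule used here is an assumption made for illustration (Iso_en1 asserts first and releases last, with a minimum gap); the actual requirement is whatever Figure 6-4 and the design specification define. The event format and `min_gap` parameter are likewise hypothetical.

```python
def too_soon(t, last_change, min_gap):
    """True if the other signal changed less than min_gap time units ago."""
    return last_change is not None and t - last_change < min_gap


def iso_sequence_errors(events, min_gap=1):
    """Check an assumed ordering between Iso_en1 and Iso_en2.

    events is a time-ordered list of (time, signal, value) tuples.
    Assumed rule: Iso_en1 must be asserted at least min_gap before
    Iso_en2 asserts, and Iso_en2 must be released at least min_gap
    before Iso_en1 releases.
    """
    errors = []
    level = {"Iso_en1": 0, "Iso_en2": 0}
    last_change = {"Iso_en1": None, "Iso_en2": None}
    for t, sig, val in events:
        if sig == "Iso_en2" and val == 1:
            if level["Iso_en1"] != 1 or too_soon(t, last_change["Iso_en1"], min_gap):
                errors.append((t, "Iso_en2 asserted too early"))
        if sig == "Iso_en1" and val == 0:
            if level["Iso_en2"] != 0 or too_soon(t, last_change["Iso_en2"], min_gap):
                errors.append((t, "Iso_en1 released too early"))
        level[sig] = val
        last_change[sig] = t
    return errors


good = [(0, "Iso_en1", 1), (2, "Iso_en2", 1),
        (10, "Iso_en2", 0), (12, "Iso_en1", 0)]
bad = [(0, "Iso_en2", 1), (1, "Iso_en1", 1),
       (5, "Iso_en1", 0), (6, "Iso_en2", 0)]

print(iso_sequence_errors(good))
print(iso_sequence_errors(bad))
```

Sweeping `min_gap` and the spacing of the events in the stimulus is one way to exercise the interval between enables and disables that the text calls out.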
Overall, micro-architectural coverage, while essential, is quite design specific. The test plan hence needs to have a dedicated section aimed at these design elements. Focusing on the design elements yields another extremely beneficial result. Critical control signals that are essential to the power management scheme will be defined rigorously in the process, which can be statically verified as described in “Recommendation 6.6” on page 114.
6.5 HIERARCHICAL POWER MANAGEMENT
In most current systems, the design is already organized as a collection of resources and subsystems, such as the memory subsystem, the video subsystem, the graphics rendering engine, and the analog subsystem. The implication here is that each of these functions presents a view (say a control interface) of power management to its master controller, which in turn presents a view of functions and interface to its master controller. This interface need not be a power management view; the same arrangement works well for any other functionality, such as a DMA transfer.
Thankfully, while most of today’s systems organize themselves in such a fashion for reasons other than power management, the boundaries of such subsystems form natural boundaries for voltage control.
We can now view such a system as a bunch of finite state machines linked to one another with a defined pecking order. This order is quite relevant. We cannot have a situation where the master controller of a device/subsystem is off, but the device is itself in the on state.
Some readers may be thinking, “Wait! My keyboard subsystem recognizes the power button or lid opening on my laptop and wakes up the entire system. So, it is not an error for a device to be on while its master is off”. While this is an excellent observation of most real life systems, the subtlety in the master-slave relationship is that the power rails may follow a different hierarchy. It is a hierarchy of voltage rails.
In the above example, while the input devices as a class are controlled by some master for their functionality, a power domain view of the keyboard may show the power and lid inputs being partitioned into a separate always-on domain. Perhaps further vexing the reader is the fact that the functionality of the power button and lid can be configured by the user, once the system is up and running, for power management options, and one option may appear to change the hierarchy. For example, the lid input may be programmed not to cause any state change. Note that this may not necessarily change the hierarchy of power domains; it may be changing the system's response to an input.
This brings us to an important rule of hierarchical power management.
Rule 6.12 — The default, unconfigured functionality of configurable power management hierarchy must be tested.
Rule 6.12a — The programmed/configured functionality of configurable power management hierarchy must be tested.
One of the hidden aspects of hierarchical power management is that once the rails / domains are ordered as masters/slaves in a tree, the state table of the system is quite amenable to derivation. We can further derive transitive relationships or disjoint properties between the power domains, thereby reducing the need to cover modes that may not be realistic.
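Deriving the state table from the tree can be sketched by enumerating every on/off combination and keeping only those in which no domain is on while its master is off. The rail hierarchy below, including the always-on root named `aon`, is hypothetical.

```python
from itertools import product

# Hypothetical rail hierarchy: child domain -> master domain.
# "aon" is an assumed always-on root of the tree.
MASTER = {"soc": "aon", "cpu": "soc", "video": "soc", "gpu": "video"}
DOMAINS = ["aon"] + sorted(MASTER)


def is_legal(state):
    """The root is always on; a domain may be on only if its master is on."""
    if not state["aon"]:
        return False
    return all(state[master] or not state[child]
               for child, master in MASTER.items())


def legal_states():
    """Filter all on/off combinations down to the hierarchy-legal ones."""
    combos = (dict(zip(DOMAINS, bits))
              for bits in product([False, True], repeat=len(DOMAINS)))
    return [s for s in combos if is_legal(s)]


states = legal_states()
print(2 ** len(DOMAINS), len(states))
```

For this assumed five-domain tree, the 32 raw combinations reduce to 7 legal states, which illustrates how much of the coverage space the hierarchy rules out before any test is written.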
An excellent example of hierarchical power management can be found in with additional background in the following references: , , .
We now return to the basic aspect of verification: writing tests and measuring coverage. This is the topic of the next chapter.
Printed with permission from Jadcherla, et al., Verification Methodology Manual for Low Power (Synopsys, Inc.: Mountain View, CA). Copyright (c) 2009 by Synopsys, Inc., ARM Limited, and Renesas Technology Corp. All rights reserved.
Synopsys customers can download a free copy of the book at www.synopsys.com/vmmlp. The companion Low-Power Methodology Manual is similarly available at www.synopsys.com/lpmm.