Optimizing Video Encoders with Digital Signal Processors
The range of features in advanced compression standards offers significant potential for trading off options to balance complexity, delay and other real-time constraints.
Video compression encodes digital video using as few bits as possible while maintaining acceptable visual quality. It involves sacrificing some degree of picture quality for a lower bit rate that facilitates transmission and storage. Compression also requires a high level of performance from the processor, as well as versatility in design, since different types of video applications have different requirements for resolution, bandwidth and resiliency. The flexibility provided by digital signal processors (DSPs) addresses these differences and takes full advantage of the options offered by advanced video compression standards, helping system developers optimize their products.
The inherent structure and complexity of video encoding and decoding (codec) algorithms drives the optimization approach. Encoders are particularly important because they must adapt to the application and they account for a major portion of the heavy processing load of video applications. While encoders are based on the mathematical principles of information theory, implementing them still requires trade-offs, which can be quite complex. Developers can benefit from encoders that are highly configurable and can provide an easy-to-use system interface and performance optimization for a wide range of video applications.
Video compression features
Figure 1: Flow of a generic motion compensation-based video encoder
Raw digital video requires a lot of data to be transmitted or stored. An advanced video codec such as H.264/MPEG-4 AVC can achieve compression ratios between 60:1 and 100:1 while sustaining throughput. This makes it possible to squeeze video with a high data rate through a narrow transmission channel and store it in a limited space.
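Some rough arithmetic shows what those ratios mean in practice. The resolution, frame rate and sampling format below are illustrative assumptions, not figures from the article:

```python
# Illustrative numbers: raw 720x480 4:2:0 video at 30 fps, and what
# 60:1 to 100:1 compression leaves of it.

width, height, fps = 720, 480, 30
bits_per_pixel = 12                  # 4:2:0 sampling with 8-bit samples
raw_bps = width * height * bits_per_pixel * fps

print(raw_bps / 1e6)                 # roughly 124 Mbit/s uncompressed
print(raw_bps / 60 / 1e6)            # about 2.1 Mbit/s at 60:1
print(raw_bps / 100 / 1e6)           # about 1.2 Mbit/s at 100:1
```

A two-order-of-magnitude reduction is what makes transmission over narrow channels and storage on disk practical.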
Like JPEG for still images, ITU and MPEG video encoding algorithms can employ a combination of transform coding (the discrete cosine transform, or similar), quantization and variable-length coding to compress macro-blocks within a frame (intra-frame). Once the algorithm has established a baseline intra-coded (I) frame, a number of subsequent predicted (P) frames are created by coding only the difference in visual content, or residual, between each of them. This inter-frame compression is achieved using a technique called motion compensation. The algorithm first estimates where the macro-blocks of an earlier reference frame have moved in the current frame, then subtracts and compresses the residual.
Figure 2: Coding the difference between frame 1 and frame 2
Figure 1 shows the flow of a generic motion compensation-based video encoder. The motion estimation stage, which creates motion vector (MV) data that describe where each of the blocks has moved, is usually the most computation-intensive stage of the algorithm.
Figure 2 shows a P frame (right) and its reference (left). Below the P frame, the residual (black) shows how little encoding remains once the motion vectors (blue) have been calculated.
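The motion estimation step described above can be sketched in a few lines. This is a toy full-search implementation using the sum of absolute differences (SAD) as the matching cost; real encoders use fast search strategies and sub-pel refinement, and the frame contents below are invented for illustration:

```python
# Illustrative full-search motion estimation for one macro-block,
# using the sum of absolute differences (SAD) as the matching cost.

def sad(ref, cur, bx, by, dx, dy, n):
    """SAD between the n x n current block at (bx, by) and the
    reference block displaced by candidate motion vector (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total

def motion_search(ref, cur, bx, by, n, search):
    """Exhaustive search over [-search, +search]; returns the best
    motion vector and its residual cost."""
    best, best_cost = None, float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # candidate block must lie inside the reference frame
            if not (0 <= by + dy and by + dy + n <= len(ref)):
                continue
            if not (0 <= bx + dx and bx + dx + n <= len(ref[0])):
                continue
            cost = sad(ref, cur, bx, by, dx, dy, n)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best, best_cost

# Toy 8x8 frames: a 2x2 bright patch moves right by 2 pixels.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
ref[2][2] = ref[2][3] = ref[3][2] = ref[3][3] = 200
cur[2][4] = cur[2][5] = cur[3][4] = cur[3][5] = 200

mv, cost = motion_search(ref, cur, bx=2, by=2, n=4, search=2)
print(mv, cost)   # (-2, 0): the content came from 2 pixels to the left
```

A zero residual, as here, means the motion vector alone describes the block; in real footage the residual is small but nonzero, and it is what gets transformed, quantized and entropy-coded.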
Video compression standards specify only the bit-stream syntax and the decoding process, leaving significant latitude for innovation within the encoders. Another opportunity for innovation is rate control, which lets the encoder assign quantization parameters and thereby “shape” the noise in the video signal in appropriate ways. In addition, the advanced H.264/MPEG-4 AVC standard adds flexibility and functionality by providing multiple options for macro-block size, quarter-pel (pixel) resolution for motion compensation, multiple reference frames, bi-directional frame prediction (B frames) and adaptive in-loop de-blocking.
Varied Application Requirements
Video application requirements can vary enormously, and advanced compression standards offer many options that can be traded off to balance complexity, delay and other real-time constraints. Consider, for instance, the different requirements of video phones, video conferencing and digital video recorders (DVRs).
Video phones and video conferencing
With video phone and video conferencing applications, transmission bandwidth is typically the most important issue. This can range from tens of kilobits per second up to multi-megabits per second depending on the link. In some cases, the bit rate is guaranteed but with the Internet and many intranets, bit rates are highly variable. As a result, video conferencing encoders frequently need to address the delivery requirements of different types of links and adapt in real time to changing bandwidth availability. When the transmitting system is notified of reception conditions, it should be able to adjust its encoded output continually so that the best possible video is delivered with minimal interruption. When delivery is poor, the encoder may respond by reducing its average bit rate, skipping frames or changing the group of pictures (GoP), the mix of I and P frames. I frames are not as heavily compressed as P frames, so a GoP with fewer I frames requires less bandwidth overall. Since the visual content of a video conference does not change frequently, it is usually acceptable to send fewer I frames than would be needed for entertainment applications.
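The adaptation policy described above — lengthen the GoP first, then cut the target bit rate — can be sketched as a small control function. The thresholds and names here are illustrative assumptions, not part of any standard or product:

```python
# Hypothetical reaction policy for a conferencing encoder: when the
# reported bandwidth drops, first lengthen the GoP (fewer costly
# I frames), then lower the target bit rate. All limits are invented
# for illustration.

def adapt(target_kbps, gop_length, available_kbps):
    """Return (new_target_kbps, new_gop_length) for the next GoP."""
    if available_kbps < target_kbps:
        if gop_length < 120:                  # send fewer I frames first
            gop_length = min(120, gop_length * 2)
        target_kbps = min(target_kbps, available_kbps)  # then cut the rate
    elif available_kbps > 2 * target_kbps and gop_length > 30:
        gop_length //= 2                      # headroom: refresh more often
    return target_kbps, gop_length

print(adapt(512, 30, 200))    # congested link: (200, 60)
print(adapt(200, 120, 500))   # bandwidth recovered: (200, 60)
```

Because conference content changes slowly, stretching the interval between I frames is usually the cheapest concession, which is why it comes before reducing the rate.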
Figure 3: Progressive strips of P frames can be intra-coded (I strips), which eliminates the need for complete I frames (after the initial frame).
H.264 uses an adaptive in-loop de-blocking filter that operates on block edges to smooth the video for current and future frames, improving the quality of the encoded video, especially at low bit rates. Alternatively, turning off the filter can increase the amount of visual data at a given bit rate, as can changing the motion estimation resolution from quarter-pel to half-pel or coarser. In some cases, it may be necessary to sacrifice the higher quality of de-blocking and fine resolution in order to reduce the complexity of encoding.
Since packet delivery via the Internet is not guaranteed, video conferencing often benefits from encoding mechanisms that increase error resilience. As Figure 3 illustrates, progressive strips of P frames can be intra-coded (I strips), which eliminates the need for complete I frames (after the initial frame) and reduces the risk that an entire I frame will be dropped and the picture broken up.
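The I-strip idea amounts to a rolling refresh schedule: one strip of macro-block rows is intra-coded per frame, so the whole picture is refreshed over a cycle without ever sending a full I frame after the first. This sketch is illustrative; real encoders choose strip width and cycle length to match the error environment:

```python
# A rolling intra-refresh schedule (illustrative): one strip of
# macro-block rows is intra-coded per frame, so the full picture is
# refreshed every `rows` frames without complete I frames.

def intra_strip(frame_index, rows):
    """Macro-block rows to intra-code in this frame.
    Frame 0 is the one full I frame; afterwards, one strip per frame."""
    if frame_index == 0:
        return list(range(rows))          # initial frame: all rows intra
    return [(frame_index - 1) % rows]     # then cycle through the strips

print([intra_strip(i, 4) for i in range(6)])
# [[0, 1, 2, 3], [0], [1], [2], [3], [0]]
```

Losing a packet now costs at most one strip of one frame rather than an entire I frame, and any corrupted region is guaranteed to be repaired within one refresh cycle.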
Digital Video Recording
Digital video recorders (DVRs) for home entertainment are perhaps the most widely used application for real-time video encoders. For these systems, achieving the best trade-off of storage against picture quality is a significant problem. Unlike video conferencing, which is not delay-tolerant, compression for video recording can tolerate some real-time delay if the system has sufficient memory for buffering. In practice, the output buffer is sized to hold several frames, which is sufficient to keep a steady flow of data to the disk. Under certain conditions, however, the buffer may become congested because the visual information is changing quickly and the algorithm is creating a large amount of P frame data. Once the congestion clears, quality can be increased again.
An effective mechanism for making this trade-off is changing the quantization parameter, Qp, on the fly. Quantization is one of the last steps in the compression algorithm. Increasing Qp reduces the bit rate output of the algorithm but creates picture distortion in direct proportion to the square of Qp. However, since the change occurs in real time, it reduces the likelihood of frame skips or picture break-up. When the visual content is changing rapidly, as it is when the buffer is congested, lower image quality is likely to be less noticeable than when the content changes slowly. When the visual content again permits a lower bit rate and the buffer clears, Qp can be reset to its normal value.
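This buffer-driven Qp adjustment can be sketched as a simple feedback loop. The thresholds, step size and Qp limits below are invented for illustration; production rate control is considerably more sophisticated:

```python
# Sketch of on-the-fly Qp adjustment driven by output-buffer fullness.
# Distortion grows roughly with Qp squared, so Qp is raised only while
# the buffer is congested and restored as soon as it drains.
# All thresholds and limits are illustrative, not from any real encoder.

QP_NORMAL, QP_MAX = 26, 40

def next_qp(qp, buffer_fullness):
    """buffer_fullness in [0, 1]: fraction of the output buffer in use."""
    if buffer_fullness > 0.8:            # congested: coarser quantization
        return min(QP_MAX, qp + 2)
    if buffer_fullness < 0.4:            # drained: restore quality
        return max(QP_NORMAL, qp - 2)
    return qp                            # steady state: leave Qp alone

qp = QP_NORMAL
for fullness in (0.9, 0.9, 0.5, 0.3, 0.2):   # a congestion burst, then calm
    qp = next_qp(qp, fullness)
print(qp)   # back at 26 once the buffer has cleared
```

Stepping Qp gradually in both directions avoids visible quality "pumping" while still guaranteeing the buffer never overflows into frame skips.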
Flexibility with Encoders
Since developers utilize DSPs for a wide range of video applications, DSP encoders should be designed to take advantage of the flexibility inherent in compression standards. An example can be found in the encoders that operate on Texas Instruments’ OMAP media processors for mobile applications, TMS320C64x+ DSPs or processors based on DaVinci. To maximize compression performance, each of the encoders is designed to leverage the DSP architecture of its platform, including the video and imaging coprocessor (VICP) that is designed into some of the processors.
A basic set of APIs with default parameters is used in all the encoders, so that the system interface remains the same, regardless of the type of system. Extended API parameters adapt an encoder to the requirements of specific applications. By default, parameters are preset to high-quality settings; a high-speed preset is also available. All preset parameters can be overridden by the program using extended parameters.
Extended parameters adapt the encoder to either H.264 or MPEG-4. The encoders support several options, including YUV 4:2:2 and YUV 4:2:0 input formats, motion resolution down to a quarter-pel, I frame intervals ranging from every frame to none after the first frame, Qp bit rate control, access to motion vectors, de-blocking filter control, simultaneous encoding of two or more channels, I strips and other options. By default, the encoders determine the motion vector search range dynamically and without restriction, a technique that improves on fixed-range searches.
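The default-preset-plus-overrides pattern described above can be illustrated generically. The parameter names below are hypothetical stand-ins, not TI's actual API:

```python
# Illustrative sketch (not TI's actual API): a base parameter set with
# high-quality defaults, plus extended parameters that override them.
# All field names and preset values here are hypothetical.

from dataclasses import dataclass, replace

@dataclass(frozen=True)
class EncoderParams:
    input_format: str = "YUV420"
    motion_resolution: str = "quarter-pel"
    i_frame_interval: int = 30        # 0 = no I frames after the first
    deblocking: bool = True
    preset: str = "high-quality"

# Extended overrides for a hypothetical high-speed preset.
HIGH_SPEED = dict(motion_resolution="half-pel",
                  deblocking=False,
                  preset="high-speed")

params = EncoderParams()              # defaults: high quality
fast = replace(params, **HIGH_SPEED)  # same interface, overridden fields
print(fast.preset, fast.deblocking)
```

The point of the pattern is that the system interface stays identical across applications: a program that never touches the extended parameters still gets sensible high-quality behavior.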
Furthermore, there are generally “sweet spots” of operation, where output bit rates are optimal for a given input resolution and frame rate (fps). Developers should be aware of these sweet spots in order to design their systems for the best trade-offs of transmission and picture quality.
As digital video continues to spread to new types of systems, developers need to be aware of the many differences that exist among the wide range of video applications. In general, compression requirements trade off bit rate and picture quality, though the many different ways of achieving these trade-offs can be complicated. DSP encoders provide systems engineers with the performance and flexibility to adapt video compression to the requirements in the rapidly expanding world of digital video.
Texas Instruments Inc.
This article originally appeared in the February, 2008 issue of Portable Design. Reprinted with permission.