Previous Work


Feedback-based Approaches

The optimal decision between inter and intra mode generally requires sufficient knowledge of the quality of reception at the receiver. This approach inevitably demands a feedback channel connecting encoder and decoder. In practice, feedback is often inconvenient, particularly because of the additional transmission delay it incurs. In broadcast applications, where more than one receiver is involved, feedback cannot be realized even in principle.

Non-Feedback-based Approaches

When a feedback channel is not available, switching between inter and intra coding at the encoder can be accomplished in various ways, all of which are based on an estimate of the distortion of the reconstructed frames at the decoder. The encoder aims to minimize the expected error in order to maintain high video quality at the receiver:

Heuristic Methods

Applying intra coding at a heuristically pre-determined frequency is perhaps the most intuitive approach. This so-called intra refresh can be carried out periodically for whole frames, for contiguous blocks of the same frame, or even for randomly chosen blocks, the refresh frequency being chosen heuristically. Clearly, this scheme does not guarantee the right balance between compression efficiency and robustness. [Turletti, Huitema]
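A cyclic intra refresh of this kind can be sketched as follows. The constant `REFRESH_PERIOD` and the function name `choose_mode` are illustrative assumptions, not part of any cited scheme:

```python
# Hypothetical sketch of periodic intra refresh: each block is forced to
# intra mode once every REFRESH_PERIOD frames, in a fixed cyclic order.
REFRESH_PERIOD = 10  # heuristic refresh frequency (frames); an assumption

def choose_mode(block_index: int, frame_number: int) -> str:
    """Cycle through the blocks so that every block is intra coded
    exactly once per REFRESH_PERIOD frames, independent of content."""
    if block_index % REFRESH_PERIOD == frame_number % REFRESH_PERIOD:
        return "intra"
    return "inter"
```

Note that the decision depends only on position and time, never on the video content or the channel, which is precisely the weakness described above.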

Threshold Methods

When a rough estimate of the decoder error exceeds a given threshold, the macro block is intra coded. Again, this technique does not adapt to the actual transmission conditions. [Liao, Villasenor]

Content Adaptive Methods

To some extent similar to the previous scheme, this technique intra codes a macro block whenever the content change of that block exceeds a given threshold value. Consequently, both the bit rate and the average error lifetime depend on the threshold value and on the actual properties of the input video. The limitations of the previous schemes apply here as well. [Haskell, Messerschmitt]
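A minimal sketch of such a content-adaptive rule, assuming the change is measured as the sum of absolute differences (SAD) against the co-located block of the previous frame; the measure and the threshold value `T` are illustrative assumptions:

```python
import numpy as np

T = 1500  # illustrative threshold on the sum of absolute differences

def mode_for_block(current: np.ndarray, previous: np.ndarray) -> str:
    """Intra code the macro block when its change relative to the
    co-located block of the previous frame exceeds the threshold T."""
    sad = int(np.abs(current.astype(int) - previous.astype(int)).sum())
    return "intra" if sad > T else "inter"
```

As the text notes, both the resulting bit rate and the error lifetime now depend on `T` and on the source material, neither of which reflects the actual channel conditions.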

Vulnerability Adaptive Methods

The error sensitivity of a particular macro block depends on its position with respect to the whole frame. Macro blocks that are closer to synchronization markers, for instance, are less frequently subject to loss than others. Furthermore, the activity of a macro block, indicated by the size of its motion vectors, is taken into account. Generally, the more bits a macro block requires, the higher the probability that it is affected by an error in the transmitted bit stream.
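The two factors named above can be combined into a per-block vulnerability score; the linear form and the weights below are assumptions for illustration only:

```python
def vulnerability(bits_since_marker: int, block_bits: int,
                  w_pos: float = 0.5, w_bits: float = 0.5) -> float:
    """Illustrative vulnerability score: blocks far from the last
    synchronization marker and blocks that consume many bits are more
    likely to be hit by a transmission error, and are therefore
    stronger candidates for intra coding."""
    return w_pos * bits_since_marker + w_bits * block_bits
```

The encoder would then intra code the blocks with the highest scores first, subject to its rate budget.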

Rate-Distortion optimized Mode Selection

Rate-distortion optimization is probably the most promising scheme currently in use. Various techniques based on this approach have been proposed so far: early ones considered only the distortion due to packet loss, while more recent ones tend to also incorporate error concealment in the computation of distortion. Although rate-distortion based selection methods represent a significant advance over the previously mentioned mode switching strategies, they suffer from a severe drawback: the encoder cannot accurately estimate the overall distortion that the decoder will face. This is mainly due to quantization and error propagation at the decoder.

Côté and Kossentini proposed to use three coding modes, denoted inter, intra and skip. The skip mode is a special case of the inter mode for which no data is transmitted; instead, the spatially corresponding blocks of the previous frame are simply repeated. The cost function on which the mode decision for each individual macro block is based depends on both the quantization error and the error due to concealment. Although this approach achieves a better tradeoff between data rate and error resilience, it cannot capture the effects of error propagation beyond a single frame. [Côté, Kossentini]
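A Lagrangian mode decision of this flavor can be sketched as follows. The per-mode rate and distortion figures, the loss probability `p_loss`, and the multiplier `lmbda` are hypothetical inputs; the blending of quantization and concealment distortion is a simplified model, not the exact cost function of the cited work:

```python
def select_mode(candidates: dict, p_loss: float, lmbda: float) -> str:
    """candidates maps mode name -> (rate_bits, d_quant, d_conceal).
    The expected distortion mixes the quantization distortion (block
    received) with the concealment distortion (block lost), and the
    Lagrangian cost J = D + lambda * R is minimized over the modes."""
    def cost(mode: str) -> float:
        rate, d_quant, d_conceal = candidates[mode]
        d_total = (1 - p_loss) * d_quant + p_loss * d_conceal
        return d_total + lmbda * rate
    return min(candidates, key=cost)

# Illustrative numbers: intra costs more bits but, by cutting off
# propagation, has a smaller expected distortion once the block is lost.
modes = {
    "skip":  (0,   400.0, 1600.0),
    "inter": (120, 100.0, 1600.0),
    "intra": (400,  80.0,  900.0),
}
```

With these numbers, inter wins on a lossless channel, while heavy loss tips the decision toward intra.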

Per-Pixel Estimation

To estimate the distortion at the decoder more accurately, this approach operates at the level of individual pixels. In other words, the overall distortion is computed as the sum of the distortion measurements of the individual pixels. This approach turns out not to be suitable in practice because of its enormous computational complexity. Furthermore, its application is limited to integer pixel accuracy, a severe disadvantage compared to, for instance, rate-distortion optimization, which allows sub-pixel accuracy to be exploited. [Zhang, Regunathan, Rose]
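A greatly simplified per-pixel recursion in the spirit of this approach can be sketched as follows (the actual method of Zhang, Regunathan and Rose tracks the first and second moments of each decoder pixel; here only an expected error per pixel is propagated, and the concealment penalty is an assumed constant):

```python
import numpy as np

def propagate_expected_error(prev_err: np.ndarray, mode: str,
                             p_loss: float) -> np.ndarray:
    """Expected per-pixel decoder error after coding one block.
    - intra: if received, the error is 0; if lost, concealment copies
      the previous frame, so the old error persists plus a penalty.
    - inter: the prediction references the previous frame, so the old
      error propagates even when the residual is received."""
    CONCEAL_PENALTY = 4.0  # illustrative per-pixel concealment error
    if mode == "intra":
        return p_loss * (prev_err + CONCEAL_PENALTY)
    received = prev_err                 # propagation through prediction
    lost = prev_err + CONCEAL_PENALTY  # concealment on top of old error
    return (1 - p_loss) * received + p_loss * lost
```

Even this toy version has to visit every pixel of every block per frame, which hints at the computational burden the text describes.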