IMPLEMENTATION

Basic Properties of the H.263 Source Code
Our Encoder's Source Code Structure
Encoder-Decoder Configuration for Simulations

Basic Properties of the H.263 Source Code

All experiments carried out in the course of this project were based on the publicly available TMN8 H.263 source code from Nortel. Despite lacking some of the most recent extensions to the H.263 standard, TMN8 provided the basic framework that allowed us to examine the effect of multiple channel realizations within the encoder's mode decision architecture. TMN8 strictly adheres to the basic hybrid video coding concept described before. Consequently, it is able to exploit both the spatial and the temporal redundancy of the video sequence. Furthermore, it supports motion compensation and prediction, which we adopted unchanged in our modified version with multi-channel realizations. Most of the advanced features, however, have been left out, as they are not expected to affect the results qualitatively. Among them are PB-frame mode, advanced prediction mode and unrestricted motion vector mode.

The basic principles of the encoder's software architecture have been preserved in our implementation and are therefore outlined briefly here. The following pseudo-code roughly summarizes the encoder's main loop, which is executed once for each frame of the input sequence.

`switch picture_type:`
`case TYPE_INTRA:`
`CodeOneIntra`	All macro blocks are intra coded. Returns the pointer to the reconstructed image, which becomes the previous reconstructed frame in the next cycle.
`case TYPE_INTER:`
`CodeOneOrTwo`	Encodes the current frame in P mode by calling the subroutines listed below.
`MotionEstimatePicture`	Mode selection and motion estimation. Returns the motion vectors.
`Predict_P`	Predicts the macro block based on the motion vectors and the previous frame. Returns the predicted frame `pred_P` and the difference `diff` with respect to the current image.
`MB_Encode`	Computes the DCT coefficients of each 8x8 block of `diff`.
`Quant_blk`	Quantization of the DCT coefficients.
`MB_Decode`	Dequantization and inverse DCT of the quantized signal.
`MB_Reconstruct`	Based on the predicted frame `pred_P` and the reconstructed coefficients, the current macro block is reconstructed in the same way as in the decoder.
`ReconImage`	Reconstruction of the entire frame from the reconstructed macro blocks.

Apart from some advanced options that are not shown here, this main loop can be subdivided into two branches corresponding to the two frame modes I and P. In I mode, every macro block is encoded in intra mode, which is accomplished by the function CodeOneIntra. In P mode, the original TMN8 decides whether to encode a macro block in intra or inter mode based on a threshold criterion, as mentioned before. The latter case deserves further scrutiny. Similar to the first case, a routine called CodeOneOrTwo takes care of encoding the current frame. It requires the contents of the frame buffer as input and provides the current reconstructed frame as output, which in turn becomes the contents of the frame buffer for the next frame to be encoded. This function is, however, far more complex than CodeOneIntra, as illustrated by the subroutine calls in the pseudo-code above.

The functions MotionEstimatePicture and Predict_P serve the purpose of obtaining information about the inherent motion of each macro block. The resulting motion vectors characterize the shift of a macro block between consecutive frames. Exploiting this information allows the encoder to transmit only the difference signal, denoted by diff, between a macro block in the current frame and its motion-shifted counterpart in the previous frame. The subsequent steps, DCT and quantization, generate the raw data to be transmitted. The encoder processes this data further by applying dequantization and the inverse DCT (MB_Decode) in order to reconstruct the encoded macro block by means of MB_Reconstruct. This makes it possible to determine the errors introduced by quantization. Of course, errors occurring during actual transmission are not captured by this approach. We seek to tackle this issue by simulating multiple channels within the encoder.
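The encode-then-reconstruct cycle described above can be sketched for a single 8x8 block as follows. This is only an illustrative sketch, not the TMN8 code: the function names are hypothetical, and the quantizer is assumed to be a plain uniform quantizer with step size 2*QP, which simplifies the actual H.263 quantizer.

```python
import math

N = 8  # the DCT operates on 8x8 blocks

def _scale(k):
    # Normalization factors that make the transform orthonormal.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def _basis(n, k):
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))

def dct2(block):
    """Forward 2-D DCT of an N x N block (the transform step of MB_Encode)."""
    return [[_scale(u) * _scale(v)
             * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(coef):
    """Inverse 2-D DCT (the transform step of MB_Decode)."""
    return [[sum(_scale(u) * _scale(v) * coef[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def quantize(coef, qp):
    # Assumption: uniform quantizer with step 2*qp and no dead zone.
    return [[round(c / (2 * qp)) for c in row] for row in coef]

def dequantize(levels, qp):
    return [[lv * 2 * qp for lv in row] for row in levels]

def encode_and_reconstruct(current, predicted, qp):
    """Return (levels to transmit, block as an error-free decoder rebuilds it)."""
    diff = [[c - p for c, p in zip(cr, pr)]
            for cr, pr in zip(current, predicted)]
    levels = quantize(dct2(diff), qp)
    diff_rec = idct2(dequantize(levels, qp))
    # Add the prediction back and clip to the valid 8-bit pixel range.
    recon = [[min(255.0, max(0.0, p + d)) for p, d in zip(pr, dr)]
             for pr, dr in zip(predicted, diff_rec)]
    return levels, recon

def mse(a, b):
    """Mean squared error between two N x N blocks."""
    return sum((x - y) ** 2
               for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / (N * N)

# Synthetic test blocks standing in for the current and the predicted block.
current = [[(7 * x + 13 * y) % 256 for y in range(N)] for x in range(N)]
predicted = [[(5 * x + 11 * y + 3) % 256 for y in range(N)] for x in range(N)]
levels, recon = encode_and_reconstruct(current, predicted, qp=4)
quantization_error = mse(recon, current)
```

The value `quantization_error` is the only distortion this loop can see. It says nothing about blocks that are lost or corrupted in transmission, which is the gap the channel realizations are meant to close.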


Our Encoder's Source Code Structure

Our proposed scheme uses multiple channel realizations in order to improve the mode decision rules. If a frame has to be completely intra coded, e.g. the first frame of each sequence, the procedure remains unchanged with respect to the original version, and therefore the same routine CodeOneIntra can be employed. The additional functionality of our encoder resides entirely in the P frame branch. The changes comprise:

Mode selection is accomplished individually for each macro block according to a rate-distortion optimized approach as opposed to the original threshold method.
A number of parallel channels simulate in advance how each macro block might be affected by errors occurring on the real channel. In this way, a rate-distortion pair is obtained for each of the two available modes, which is expected to capture more realistically the conditions that the receiver faces.

For each channel realization, a modified version of the routine CodeOneOrTwo called MultiCodeOneOrTwo is executed. Figure 9 roughly illustrates its architecture; only one realization is shown. Each input frame is fed through two separate paths in order to obtain both the intra-coded and the inter-coded version of the frame. These parallel paths show how the real encoder would have encoded a given macro block under either mode decision. Notably, the frame buffer contents at this point are those of the final encoding stage, whose output is transmitted to the receiver. Our encoder then proceeds to model the channel errors.

The error patterns are generated randomly by the routine ApplyError, with each error pattern being applied to only one pair of intra and inter paths. The errors are concealed according to one of the schemes described in the previous section on multi-channel realizations. Unlike before, the frame buffer contents used for concealment are unique to each path in order to also take the effect of error propagation into account. This idea represents the core of our approach and is particularly emphasized in Figure 9.
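A minimal sketch of this step for one channel realization is given below. It rests on assumptions that the text does not fix: errors are modeled as the loss of whole GOBs with a given probability, concealment copies the co-located GOB from the path's own frame buffer, and a frame is represented as a list of GOB rows. The function names are hypothetical stand-ins and do not reproduce the actual ApplyError routine.

```python
import random

def make_error_pattern(num_gobs, loss_prob, rng):
    """One flag per GOB; True means the GOB is lost in this realization."""
    return [rng.random() < loss_prob for _ in range(num_gobs)]

def apply_error(decoded_gobs, pattern, frame_buffer):
    """Return the frame as the simulated receiver would see it.

    decoded_gobs: GOB rows of the current frame as decoded without errors
    pattern:      loss flags from make_error_pattern
    frame_buffer: this path's own previous reconstructed frame
    """
    # Assumed concealment: a lost GOB is replaced by the co-located GOB
    # of the previous reconstructed frame of the same path.
    return [list(frame_buffer[g]) if lost else list(decoded_gobs[g])
            for g, lost in enumerate(pattern)]

NUM_GOBS = 9  # a QCIF frame consists of 9 GOBs
rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Toy frames: each GOB is a short list of sample values.
intra_decoded = [[100 + g] * 4 for g in range(NUM_GOBS)]
inter_decoded = [[110 + g] * 4 for g in range(NUM_GOBS)]
intra_buffer = [[90 + g] * 4 for g in range(NUM_GOBS)]
inter_buffer = [[95 + g] * 4 for g in range(NUM_GOBS)]

# The same error pattern hits both paths of this realization,
# but each path conceals from its own frame buffer.
pattern = make_error_pattern(NUM_GOBS, 0.1, rng)
intra_seen = apply_error(intra_decoded, pattern, intra_buffer)
inter_seen = apply_error(inter_decoded, pattern, inter_buffer)
```

Keeping a separate frame buffer per path is what lets a concealed error persist into later frames of that realization, so that error propagation shows up in the measured distortion.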

Figure 9: Mode selection and frame reconstruction in the encoder

Running the two modes in parallel enables us to determine one rate-distortion pair each for inter and intra mode. The method for computing the distortion, described in the section on multi-channel realizations, yields one distortion measurement for each macro block. A comparison of the cost of the two modes then leads to a mode decision. As these decisions are made individually for each macro block, the set of all mode decisions for an entire frame is conveniently represented by a selection matrix (refer to Figure 10). This selection matrix is passed to the actual encoder (again using CodeOneOrTwo), whose mode selection has been modified such that it strictly follows the entries of the selection matrix. Finally, the frame buffers of the channel realizations are updated using the outputs of only one of the two paths, namely the one specified in the selection matrix. This so-called previous reconstructed frame is in turn needed for concealment in the next cycle.
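The construction of the selection matrix and the subsequent buffer update might look roughly as follows. The sketch assumes a Lagrangian cost of the form D + lambda * R and a distortion that is averaged over the channel realizations; neither detail is specified in this section, and all names are hypothetical.

```python
INTRA, INTER = 0, 1

def mean_distortion(per_realization):
    """Average the per-macro-block distortion over all channel realizations."""
    k = len(per_realization)
    rows, cols = len(per_realization[0]), len(per_realization[0][0])
    return [[sum(r[i][j] for r in per_realization) / k for j in range(cols)]
            for i in range(rows)]

def selection_matrix(d_intra, r_intra, d_inter, r_inter, lam):
    """Pick, per macro block, the mode with the lower cost D + lam * R."""
    rows, cols = len(d_intra), len(d_intra[0])
    return [[INTRA
             if d_intra[i][j] + lam * r_intra[i][j]
             < d_inter[i][j] + lam * r_inter[i][j]
             else INTER
             for j in range(cols)] for i in range(rows)]

def update_buffer(selection, intra_recon, inter_recon):
    """Keep, per macro block, the output of the path named in the matrix."""
    return [[intra_recon[i][j] if mode == INTRA else inter_recon[i][j]
             for j, mode in enumerate(row)]
            for i, row in enumerate(selection)]

# Toy example with a 1 x 2 grid of macro blocks and two realizations.
# (A real QCIF frame has 9 rows of 11 macro blocks.)
d_intra = mean_distortion([[[10.0, 10.0]], [[10.0, 10.0]]])
d_inter = mean_distortion([[[40.0, 8.0]], [[60.0, 16.0]]])
r_intra = [[100, 100]]  # bits spent in intra mode
r_inter = [[20, 20]]    # bits spent in inter mode

sel = selection_matrix(d_intra, r_intra, d_inter, r_inter, lam=0.1)
new_buffer = update_buffer(sel, [["I0", "I1"]], [["P0", "P1"]])
```

In this toy case the first macro block is intra coded because its inter path suffers heavily under the simulated errors, while the second stays in inter mode because the rate saving outweighs the small loss in distortion.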

This approach takes care of error propagation and is particularly suitable for error-prone channels, but requires a certain computational effort that increases with the number of channel realizations. An accurate study of the complexity of this technique goes beyond the goals of this project, especially because the code at its present stage leaves sufficient room for efficiency improvements.

Figure 10: Selection matrix used for mode selection


Encoder-Decoder Configuration for Simulations

In order to perform the simulations, the following parameters have been set and the following conditions assumed:

No advanced H.263 coding scheme or options used. The default coder configuration has been used.
We assume that the real channel conditions (such as the error probability) are known on average, which enables us to adjust the encoder's simulation conditions accordingly (e.g. the error probability that governs the random error pattern generation for each channel realization).
We do not consider any intra refresh update for either the original coder or our scheme. This has been done in order to demonstrate the robustness of our scheme to error propagation. As a consequence, sequences of reasonable medium length (200 to 300 frames of foreman.qcif) have been used for the experiments in order to obtain consistent results.
The number of frames skipped by the encoder is 1, so for a sequence of 300 frames only every second frame, i.e. one half, is encoded and transmitted.
A synchronization marker is inserted for each group of blocks (GOB), allowing resynchronization at the beginning of each GOB in case of errors.
The full range of possible quantization parameters has been investigated in order to obtain encoded sequences at different bit rates and to measure the overall performance in terms of PSNR.
Up to 20 channel realizations have been considered for our proposed encoder.
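The settings listed above can be collected in a small configuration sketch, together with the frame-skipping arithmetic and the PSNR measure. The dictionary keys and function names are hypothetical and do not correspond to actual TMN8 options; the PSNR formula is the standard definition for 8-bit video, and the quantization parameter range 1 to 31 is the one defined by H.263.

```python
import math

# Hypothetical names; the values are those stated in the list above.
SIM_CONFIG = {
    "sequence": "foreman.qcif",
    "frames_skipped": 1,
    "gob_sync_markers": True,
    "intra_refresh": False,
    "qp_values": list(range(1, 32)),  # H.263 allows QP from 1 to 31
    "max_channel_realizations": 20,
}

def encoded_frame_indices(num_frames, frames_skipped):
    """Indices of the frames that are actually encoded and transmitted."""
    return list(range(0, num_frames, frames_skipped + 1))

def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit samples."""
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)

# With one skipped frame, half of a 300-frame sequence is encoded.
indices = encoded_frame_indices(300, SIM_CONFIG["frames_skipped"])
num_encoded = len(indices)
```

A sweep over `SIM_CONFIG["qp_values"]` would then yield one rate and PSNR point per quantization parameter.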
