All the experiments carried out during the course of this project were based on the publicly available TMN8 H.263 source code from Nortel. Despite lacking some of the most recent extensions to the H.263 standard, the TMN8 provided the basic framework that allowed us to examine the effect of multiple channel realizations within the encoder's mode decision architecture. The TMN8 strictly adheres to the basic hybrid video coding concept described earlier. Consequently, it is able to exploit both the spatial and the temporal redundancy of the video sequence. Furthermore, it supports motion compensation and prediction, which have been adopted unchanged in our modified version with multi-channel realizations. Most of the advanced features, however, have been left out, as they are not expected to affect the results qualitatively. These include the PB-frame mode, the advanced prediction mode and the unrestricted motion vector mode, to name a few.
The basic principles of the encoder's software architecture have been preserved in our implementation and are therefore summarized briefly here. The following pseudo-code outlines the operation of the encoder's main loop, which is executed once for each frame of the input sequence.
switch picture_type:
  I frame: CodeOneIntra
    All macro blocks are intra coded. Returns the pointer to the reconstructed image, which becomes the previous reconstructed frame in the next cycle.
  P frame: CodeOneOrTwo
    MotionEstimatePicture: mode selection and motion estimation. Returns the motion vectors.
    Predict_P: predicts the macro block based on the motion vectors and the previous frame. Returns the predicted frame pred_P and the difference diff with respect to the current image.
    DCT: computes the DCT coefficients of each 8x8 block of diff.
    Quantization: quantizes the DCT coefficients.
    MB_Decode: inverse DCT transform of the quantized signal.
    MB_Reconstruct: based on the predicted frame pred_P and the reconstructed coefficients, the current macro block is reconstructed in the same way as by the decoder.
    Frame reconstruction: reconstructs the entire frame from the reconstructed macro blocks.
Apart from some advanced options that are not shown here, this main loop can be subdivided into two
branches corresponding to the two frame modes P and I.
In I mode, every macro block is encoded in intra mode, which is accomplished by
the function CodeOneIntra. In P mode, the decision whether to encode
a macro block in intra or inter mode is based on a threshold criterion
in the original TMN8, as mentioned before. The latter case deserves further scrutiny.
As in the first case, a single routine, called CodeOneOrTwo, takes care of
encoding the current frame. It takes the contents of the frame buffer
as input and provides the current reconstructed frame as output, which in turn becomes the
contents of the frame buffer for the next frame to be encoded. This function is,
however, far more complex than CodeOneIntra, as illustrated by the
subroutine calls in the pseudo-code above.
The functions MotionEstimatePicture and Predict_P serve the purpose
of obtaining information about the motion of each macro block. The resulting motion
vectors characterize the shift of a macro block between consecutive frames. Exploiting this
information allows the encoder to transmit only the difference signal, denoted by diff,
between a macro block of the current frame and its motion-compensated prediction
from the previous frame. The subsequent steps, DCT and quantization, generate
the raw data to be transmitted. The encoder processes this data further
by applying dequantization and the inverse DCT (MB_Decode) in order to reconstruct
the encoded macro block by means of MB_Reconstruct. This makes it possible to
determine the errors introduced by quantization. Errors occurring during
actual transmission, however, are not captured by this approach. We seek to tackle this issue by
simulating multiple channels within the encoder.
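The chain of difference signal, DCT, quantization, dequantization, inverse DCT and reconstruction described above can be sketched for a single 8x8 block as follows. This is only an illustration in Python and not the TMN8 code: the function names are hypothetical, and the uniform quantizer with step size 2*QP is a simplifying assumption that merely stands in for the H.263 quantizer.

```python
import math

N = 8  # the DCT operates on 8x8 blocks


def _scale(k):
    # Normalization factor of the orthonormal DCT-II.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(x, u):
    # Cosine basis function of the DCT-II.
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """Forward 2-D DCT of an 8x8 block given as a list of rows."""
    return [[_scale(u) * _scale(v)
             * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeff):
    """Inverse 2-D DCT, the inverse of dct2 up to floating-point rounding."""
    return [[sum(_scale(u) * _scale(v) * coeff[u][v]
                 * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeff, qp):
    # Assumption: plain uniform quantizer with step 2*qp (not the exact H.263 rule).
    return [[round(c / (2 * qp)) for c in row] for row in coeff]


def dequantize(levels, qp):
    return [[level * 2 * qp for level in row] for row in levels]


def encode_and_reconstruct(cur, pred, qp):
    """Return the quantized levels and the block as the decoder would rebuild it."""
    diff = [[cur[i][j] - pred[i][j] for j in range(N)] for i in range(N)]
    levels = quantize(dct2(diff), qp)
    rec_diff = idct2(dequantize(levels, qp))
    rec = [[min(255, max(0, round(pred[i][j] + rec_diff[i][j])))
            for j in range(N)] for i in range(N)]
    return levels, rec


def mse(a, b):
    """Mean squared error between two 8x8 blocks."""
    return sum((a[i][j] - b[i][j]) ** 2
               for i in range(N) for j in range(N)) / (N * N)


# Synthetic current block and prediction (illustrative values only).
cur = [[100 + 5 * i + 3 * j for j in range(N)] for i in range(N)]
pred = [[100 + 4 * i + 3 * j for j in range(N)] for i in range(N)]
for qp in (1, 8, 31):
    _, rec = encode_and_reconstruct(cur, pred, qp)
    print("QP", qp, "quantization MSE", mse(cur, rec))
```

The error measured here is caused by quantization alone. Nothing in this loop models what happens to the data on the channel, which is the gap that the multiple channel realizations are meant to close.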
Our proposed scheme uses multiple channel realizations in order to improve the mode decision
rules. If a frame has to be completely intra coded, e.g. the first frame of each sequence,
the procedure remains unchanged with respect to the original version, and therefore the same
routine CodeOneIntra can be employed. The additional functionality of our encoder
resides entirely in the P frame branch. The changes comprise:
Mode selection is accomplished individually for each macro block according to a rate-distortion optimized approach as opposed to the original threshold method.
A number of parallel channels simulate in advance how each macro block might be affected by errors occurring on the real channel. In this way, a rate-distortion pair is obtained for each of the two available modes, which is expected to capture more realistically the conditions faced by the receiver.
For each channel realization, a modified version of the routine CodeOneOrTwo,
called MultiCodeOneOrTwo, is executed. Figure 9 roughly illustrates its
architecture, with only one realization shown. Each input frame is fed through
two separate paths in order to obtain the intra-coded and the
inter-coded frame, respectively. These parallel paths show how the real
encoder would have encoded a given macro block for each mode decision. Notably, the frame buffer
contents at this point are those of the final encoding stage, whose output is transmitted to the
receiver. Our encoder then proceeds to model the channel errors.
The error patterns are generated randomly by the routine ApplyError,
with each error pattern being applied to only one pair of intra and inter
paths. The errors are concealed according to one of the schemes described in the previous
section on multi-channel realizations. Unlike before, the
frame buffer contents used for concealment are unique to each path, so that
the effect of error propagation is also taken into account. This idea represents the core of our
approach and is particularly emphasized in Figure 9.
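The interplay of random error patterns and per-realization concealment can be sketched as follows. The sketch rests on several assumptions that are not taken from our implementation: errors are modeled as the loss of whole GOBs with a fixed probability, a lost GOB is concealed by copying the co-located GOB from the previous frame, and each realization keeps one frame buffer that is shared by its intra and inter path. The names apply_error, conceal and simulate_realizations are hypothetical and only mirror the role of ApplyError.

```python
import random


def apply_error(num_gobs, p_loss, rng):
    """Draw one random error pattern: True marks a GOB as lost."""
    return [rng.random() < p_loss for _ in range(num_gobs)]


def conceal(decoded_gobs, lost, prev_gobs):
    """Replace every lost GOB by the co-located GOB of the previous frame.

    Assumption: copy concealment; the actual scheme may differ.
    """
    return [prev if is_lost else cur
            for cur, is_lost, prev in zip(decoded_gobs, lost, prev_gobs)]


def simulate_realizations(intra_gobs, inter_gobs, buffers, p_loss, seed=0):
    """Apply one error pattern per realization to its intra/inter pair.

    buffers[k] is the previous reconstructed frame of realization k, so
    concealment reflects the errors that realization has seen so far.
    """
    rng = random.Random(seed)
    results = []
    for prev_gobs in buffers:
        lost = apply_error(len(prev_gobs), p_loss, rng)
        results.append({
            "lost": lost,
            "intra": conceal(intra_gobs, lost, prev_gobs),
            "inter": conceal(inter_gobs, lost, prev_gobs),
        })
    return results


# A QCIF frame consists of 9 GOBs; each GOB is represented by a label here.
intra = ["I%d" % g for g in range(9)]
inter = ["P%d" % g for g in range(9)]
buffers = [["prev%d_%d" % (k, g) for g in range(9)] for k in range(3)]
for k, result in enumerate(simulate_realizations(intra, inter, buffers, 0.2)):
    print(k, result["lost"])
```

Because every realization conceals from its own buffer, two realizations that lose the same GOB can still end up with different content in that GOB, which is how accumulated errors enter the simulation.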

Figure 9: Mode selection and frame reconstruction in the encoder
Using the two modes in parallel enables us to determine two rate-distortion pairs, one each for
inter and intra mode. The method for computing the distortion,
as described in the section on multi-channel realizations,
yields one distortion measurement for each macro block. A comparison of the cost of
the two modes then leads to a mode decision. As these decisions are made individually for each
macro block, the set of all mode decisions for an entire frame is conveniently represented by a
selection matrix (refer to Figure 10). This selection matrix is passed to the actual
encoder (again using CodeOneOrTwo), whose mode selection has been modified such
that it strictly follows the entries of the selection matrix. Finally, the frame buffers of the
channel realizations are updated using the output of only one path, namely the one
specified in the selection matrix. This so-called previous reconstructed frame
is in turn needed for concealment in the next cycle.
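The construction of the selection matrix and the subsequent buffer update can be illustrated with a short sketch. It assumes a Lagrangian cost J = D + lambda * R and a distortion that is averaged over the channel realizations; both are assumptions made for this example, as are all function names and numbers, and the cost function actually used may differ.

```python
INTRA, INTER = 0, 1


def cost(distortions, rate, lam):
    """Lagrangian cost from per-realization distortions and a rate in bits.

    Assumption: distortion is the mean over all channel realizations.
    """
    mean_distortion = sum(distortions) / len(distortions)
    return mean_distortion + lam * rate


def build_selection_matrix(intra_stats, inter_stats, lam):
    """Choose the cheaper mode for every macro block.

    intra_stats[r][c] and inter_stats[r][c] hold (distortions, rate) for the
    macro block in row r and column c.
    """
    matrix = []
    for intra_row, inter_row in zip(intra_stats, inter_stats):
        row = []
        for (d_intra, r_intra), (d_inter, r_inter) in zip(intra_row, inter_row):
            j_intra = cost(d_intra, r_intra, lam)
            j_inter = cost(d_inter, r_inter, lam)
            row.append(INTRA if j_intra < j_inter else INTER)
        matrix.append(row)
    return matrix


def select_outputs(matrix, intra_out, inter_out):
    """Build the new frame buffer of one realization from the chosen path."""
    return [[intra_out[r][c] if mode == INTRA else inter_out[r][c]
             for c, mode in enumerate(row)]
            for r, row in enumerate(matrix)]


# One row with two macro blocks and two channel realizations (made-up numbers).
intra_stats = [[([10.0, 12.0], 200), ([10.0, 10.0], 200)]]
inter_stats = [[([50.0, 70.0], 40), ([12.0, 12.0], 40)]]
selection = build_selection_matrix(intra_stats, inter_stats, lam=0.1)
print(selection)
print(select_outputs(selection, [["I00", "I01"]], [["P00", "P01"]]))
```

In the first macro block, inter coding is cheap in rate but suffers heavily under the simulated errors, so intra mode has the lower cost. In the second macro block, the distortion penalty of inter mode is small and its lower rate decides the comparison.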
This approach takes care of error propagation and is particularly suitable for error-prone channels, but requires a certain computational effort that increases with the number of channel realizations. An accurate study of the complexity of this technique goes beyond the goals of this project, especially because the code at its present stage leaves sufficient room for efficiency improvements.

Figure 10: Selection matrix used for mode selection
In order to perform the simulations, the following parameters have been set and the following conditions assumed:
No advanced H.263 coding scheme or options used. The default coder configuration has been used.
We assume that the real channel conditions (such as the error probability) are known on average, which enables us to adjust the encoder's simulation conditions accordingly (e.g. the error probability that governs the random error pattern generation for each channel realization).
We do not use any intra refresh update in either the original or our coder scheme. This has been done in order to demonstrate the robustness of our scheme to error propagation. As a consequence, sequences of moderate length have been used for the experiments in order to obtain consistent results (from 200 to 300 frames of foreman.qcif).
The number of frames skipped by the encoder is 1. So, for a sequence of 300 frames, only one half is encoded and transmitted.
A synchronization marker is inserted for each group of blocks (GOB), allowing resynchronization at the beginning of each GOB in case of errors.
The full range of possible quantization parameters has been investigated in order to obtain encoded sequences at different bit rates and to measure the overall performance in terms of PSNR.
Up to 20 channel realizations have been considered for our proposed encoder.
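Two of the conditions listed above, the frame skip and the PSNR measurement, can be made concrete with a few lines of Python. The sketch uses the standard PSNR definition for 8-bit samples; averaging the per-frame values over the sequence is an assumption of this example, and the function names are hypothetical.

```python
import math


def encoded_frame_indices(num_frames, frames_skipped=1):
    """Indices of the frames that are actually encoded and transmitted."""
    return list(range(0, num_frames, frames_skipped + 1))


def psnr(reference, decoded, peak=255):
    """PSNR in dB between two frames given as lists of rows of samples."""
    count = sum(len(row) for row in reference)
    squared_error = sum((a - b) ** 2
                        for ref_row, dec_row in zip(reference, decoded)
                        for a, b in zip(ref_row, dec_row))
    if squared_error == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak * peak * count / squared_error)


def mean_psnr(references, decodeds):
    """Assumption: sequence quality is the mean of the per-frame PSNR values."""
    values = [psnr(ref, dec) for ref, dec in zip(references, decodeds)]
    return sum(values) / len(values)


print(len(encoded_frame_indices(300)))
print(psnr([[10, 20], [30, 40]], [[11, 20], [30, 38]]))
```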