12.5. Video Telephony or Streaming with FUs and Forward Error Correction
This scheme has been implemented and has been shown to provide good performance, especially at higher packet loss rates [19].
The most efficient means to combat packet losses for scenarios where retransmissions are not applicable is forward error correction (FEC). Although application layer, end-to-end use of FEC is often less efficient than a FEC-based protection of individual links (especially when links of different characteristics are in the transmission path), application layer, end-to-end FEC is unavoidable in some scenarios. RFC 5109 [18] provides means to use generic, application layer, end-to-end FEC in packet loss environments. A binary forward error correcting code is generated by applying the XOR operation to the bits at the same bit position in different packets. The binary code can be specified by the parameters (n,k), in which k is the number of information packets used in the connection and n is the total number of packets generated for k information packets; that is, n-k parity packets are generated for k information packets.
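The XOR construction described above can be sketched as follows. This is a minimal illustration and not the RFC 5109 packet format: the function names and the (length_xor, parity) pair are hypothetical, shorter packets are assumed to be zero-padded to the longest one, and the lengths are XORed as well so that the size of a lost packet can be restored.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings bit position by bit position, zero-padding the shorter."""
    size = max(len(a), len(b))
    a = a.ljust(size, b"\x00")
    b = b.ljust(size, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Return (length_xor, parity_payload) protecting all given packets.

    Hypothetical helper: the parity payload is as long as the longest packet.
    """
    length_xor = reduce(lambda acc, p: acc ^ len(p), packets, 0)
    parity = reduce(xor_bytes, packets, b"")
    return length_xor, parity


def recover(received, length_xor, parity):
    """Rebuild the single missing packet (marked None) in `received`."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("one parity packet repairs exactly one loss")
    present = [p for p in received if p is not None]
    # XORing the surviving lengths into length_xor leaves the lost length.
    length = reduce(lambda acc, p: acc ^ len(p), present, length_xor)
    # XORing the surviving payloads into the parity leaves the lost payload,
    # padded with zeros up to the parity length.
    payload = reduce(xor_bytes, present, parity)
    return payload[:length]


packets = [b"slice-one", b"two", b"third slice"]
length_xor, parity = make_parity(packets)
repaired = recover([packets[0], None, packets[2]], length_xor, parity)
```

With k=3 information packets and one parity packet, this corresponds to an (n=4, k=3) code that tolerates the loss of any single packet in the group.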
When a code is used with parameters (n,k) within the RFC 5109 framework, the following properties are well known:
a) If applied over one RTP packet, RFC 5109 provides only packet repetition.
b) RFC 5109 is most bitrate efficient if XOR-connected packets have equal length.
c) At the same packet loss probability p and for a fixed k, the greater the value of n, the smaller the residual error probability becomes. For example, for a packet loss probability of 10%, k=1, and n=2, the residual error probability is about 1%, whereas for n=3, the residual error probability is about 0.1%.
d) At the same packet loss probability p and for a fixed code rate k/n, the greater the value of n, the smaller the residual error probability becomes. For example, at a packet loss probability of p=10%, k=1, and n=2, the residual error rate is about 1%, whereas for an extended Golay code with k=12 and n=24, the residual error rate is about 0.01%.
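The figures quoted in c) can be reproduced with a short calculation. The sketch below assumes that packet losses are independent, which the text does not state explicitly; under that assumption, a packet sent n times (k=1) is lost only if all n copies are lost. The function name is hypothetical.

```python
def repetition_residual(p: float, n: int) -> float:
    """Residual loss probability of an (n, k=1) repetition code.

    Assumes independent losses with probability p per packet, so the
    information packet is unrecoverable only when all n copies are lost.
    """
    return p ** n


for n in (2, 3):
    print(f"(n={n}, k=1): residual loss {repetition_residual(0.10, n):.2%}")
```

For p=10%, this gives about 1% for n=2 and about 0.1% for n=3, matching c). The Golay figure in d) depends on the decoder and is not covered by this sketch.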
For applying RFC 5109 in combination with H.264 baseline-coded video without using fragmentation units (FUs), several options might be considered:
1) The video encoder produces NAL units for which each video frame is coded in a single slice. Applying FEC, one could use a simple code, e.g., (n=2, k=1). That is, each NAL unit would basically just be repeated. The disadvantage is obviously the bad code performance according to d), above, and the low flexibility, as only (n, k=1) codes can be used.
2) The video encoder produces NAL units for which each video frame is encoded in one or more consecutive slices. Applying FEC, one could use a better code, e.g., (n=24, k=12), over a sequence of NAL units. Depending on the number of RTP packets per frame, a loss may introduce a significant delay, which is reduced when more RTP packets are used per frame. Packets of completely different lengths might also be connected, which decreases bitrate efficiency according to b), above. However, with some care and for slices of 1 kb or larger, slices of similar length (differing by 100-200 bytes) may be produced, which will not lower the bitrate efficiency catastrophically.
3) The video encoder produces NAL units for which a certain frame contains k slices of possibly almost equal length. Then, applying FEC, a better code, e.g., (n=24, k=12), can be used over the sequence of NAL units for each frame. The delay compared to that of 2), above, may be reduced, but several disadvantages are obvious. First, the coding efficiency of the encoded video is lowered significantly, as slice-structured coding reduces intra-frame prediction and additional slice overhead is necessary. Second, pre-encoded content, or video received when operating over a gateway, is usually not coded with k slices in a way that allows FEC to be applied. Finally, encoding video so that it produces k slices of equal length is not straightforward and might require more than one encoding pass.
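The effect of unequal packet lengths on bitrate efficiency, referred to in b) and in the options above, can be illustrated with a rough calculation. The sketch assumes a single XOR parity packet over all k packets, assumes the parity payload is as long as the longest protected packet, and ignores RTP and FEC header bytes. The function name is hypothetical.

```python
def single_parity_overhead(lengths):
    """Parity bytes sent per information byte for one XOR parity packet.

    Assumption: the parity payload is as long as the longest protected
    packet; header bytes are ignored.
    """
    return max(lengths) / sum(lengths)


equal = single_parity_overhead([1000, 1000, 1000, 1000])   # 0.25, i.e. 1/k
unequal = single_parity_overhead([1900, 700, 700, 700])    # 0.475
```

Both groups carry 4000 information bytes, but the group with one long packet needs a 1900-byte parity packet instead of a 1000-byte one.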
Many of the mentioned disadvantages can be avoided by applying FUs in combination with FEC. Each NAL unit can be split into any number of FUs of basically equal length; therefore, FEC, with a reasonable k and n, can be applied, even if the encoder made no effort to produce slices of equal length. For example, a coded slice NAL unit containing an entire frame can be split into k FUs, and a parity check code (n=k+1, k) can be applied. However, this has the disadvantage that unless all created fragments can be recovered, the whole slice will be lost. Thus, a larger section is lost than would be lost if the frame had been split into several slices.
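The splitting of one NAL unit into k fragments with an (n=k+1, k) parity check code can be sketched as follows. This is an illustration under simplifying assumptions: only the payload bytes are split, the FU indicator and FU header octets and the RFC 5109 FEC header are omitted, and the receiver is assumed to know the total payload length. All function names are hypothetical.

```python
from functools import reduce


def split_equal(payload: bytes, k: int):
    """Split payload into k fragments whose lengths differ by at most one byte."""
    q, r = divmod(len(payload), k)
    fragments, pos = [], 0
    for i in range(k):
        size = q + 1 if i < r else q
        fragments.append(payload[pos:pos + size])
        pos += size
    return fragments


def xor_parity(fragments):
    """XOR all fragments column by column, zero-padding to the longest one."""
    width = max(len(f) for f in fragments)
    padded = [f.ljust(width, b"\x00") for f in fragments]
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*padded))


def reassemble(received, parity, total_len):
    """Rebuild the NAL unit payload from k fragments (None marks a loss).

    Returns None when more fragments are missing than the single parity
    packet can repair, in which case the whole NAL unit is lost.
    """
    k = len(received)
    lost = [i for i, f in enumerate(received) if f is None]
    if len(lost) > 1 or (lost and parity is None):
        return None
    if lost:
        q, r = divmod(total_len, k)
        size = q + 1 if lost[0] < r else q
        present = [f for f in received if f is not None]
        repaired = xor_parity(present + [parity])[:size]
        received = received[:lost[0]] + [repaired] + received[lost[0] + 1:]
    return b"".join(received)


nal_payload = bytes(range(256)) * 4 + b"xyz"      # stand-in for a coded slice
fragments = split_equal(nal_payload, 4)
parity = xor_parity(fragments)
one_loss = reassemble([fragments[0], fragments[1], None, fragments[3]],
                      parity, len(nal_payload))
two_losses = reassemble([None, fragments[1], None, fragments[3]],
                        parity, len(nal_payload))
```

Here one_loss equals the original payload, whereas two_losses is None: with a single parity packet, losing two fragments loses the whole slice, which is the disadvantage noted above.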
The presented technique makes it possible to achieve good transmission error tolerance, even if no additional source coding layer redundancy (such as periodic intra frames) is present. Consequently, the same coded video sequence can be used to achieve the maximum compression efficiency and quality over error-free transmission and for transmission over error-prone networks. Furthermore, the technique allows the application of FEC to pre-encoded sequences without adding delay. In this case, pre-encoded sequences that are not encoded for error-prone networks can still be transmitted almost reliably without adding extensive delays. In addition, FUs of equal length result in a bitrate efficient use of RFC 5109.
If the error probability depends on the length of the transmitted packet (e.g., in case of mobile transmission [15]), the benefits of applying FUs with FEC are even more obvious. Basically, the flexibility of the size of FUs allows appropriate FEC to be applied for each NAL unit and unequal error protection of NAL units.
When FUs and FEC are used, the incurred overhead is substantial but is of the same order of magnitude as the number of bits that would have to be spent on intra-coded macroblocks if no FEC were applied. In [19], it was shown that the FEC-based approach improved the overall quality at the same error rate and the same overall bitrate, including the overhead.