3.6. Considerations on the Group Size
This section provides some guidelines on the group sizes at which the various feedback modes may be used.
3.6.1. ACK Mode
The RTP session MUST have exactly two members and this group size MUST NOT grow, i.e., it MUST be point-to-point communications. Unicast addresses SHOULD be used in the session description.
For unidirectional as well as bi-directional communication between two parties, 2.5% of the RTP session bandwidth is available for RTCP traffic from the receivers including feedback. For a 64-kbit/s stream this yields 1,600 bit/s for RTCP. If we assume an average of 96 bytes (=768 bits) per RTCP packet, a receiver can report 2 events per second back to the sender. If acknowledgements for 10 events are collected in each FB message, then 20 events can be acknowledged per second. At 256 kbit/s, 8 events could be reported per second; thus, the ACKs may be sent in a finer granularity (e.g., only combining three ACKs per FB message).
From 1 Mbit/s upwards, a receiver would be able to acknowledge each individual frame (not packet!) in a 30-fps video stream.
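The arithmetic behind the figures above can be reproduced with a short sketch. It is illustrative only: the function name and parameter names are hypothetical, and the defaults simply restate the 2.5% receiver share and the 96-byte average RTCP packet assumed in the text.

```python
def rtcp_packets_per_second(session_bw_bps, rtcp_fraction=0.025, avg_packet_bytes=96):
    """Average number of RTCP packets per second a receiver may send.

    session_bw_bps:   RTP session bandwidth in bit/s
    rtcp_fraction:    share of the session bandwidth available to the
                      receiver for RTCP (2.5% in the text above)
    avg_packet_bytes: assumed average RTCP packet size (96 bytes above)
    """
    rtcp_bw_bps = session_bw_bps * rtcp_fraction
    return rtcp_bw_bps / (avg_packet_bytes * 8)


# The three session bandwidths discussed in the text.
for bw in (64_000, 256_000, 1_000_000):
    print(bw, round(rtcp_packets_per_second(bw), 2))
```

Under these assumptions, the result is roughly 2, 8, and 32 packets per second, respectively, which is why per-frame acknowledgement of a 30-fps stream only becomes feasible at around 1 Mbit/s.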
ACK strategies MUST be defined to work properly with these bandwidth limitations. An indication whether or not ACKs are allowed for a session and, if so, which ACK strategy should be used, MAY be conveyed by out-of-band mechanisms, e.g., media-specific attributes in a session description using SDP.
3.6.2. NACK Mode
Negative acknowledgements (and the other types of feedback exhibiting similar reporting characteristics) MUST be used for all sessions with a group size that may grow larger than two. Of course, NACKs MAY be used for point-to-point communications as well.
Whether or not the use of Early RTCP packets should be considered depends upon a number of parameters including session bandwidth, codec, special type of feedback, and number of senders and receivers.
The most important parameters when determining the mode of operation are the allowed minimal interval between two compound RTCP packets (T_rr) and the average number of events that presumably need reporting per time interval (plus their distribution over time, of course). The minimum interval can be derived from the available RTCP bandwidth and the expected average size of an RTCP packet. The number of events to report (e.g., per second) may be derived from the packet loss rate and sender's rate of transmitting packets. From these two values, the allowable group size for the Immediate Feedback mode can be calculated.
As stated in Section 3.3:
Let N be the average number of events to be reported per interval T by a receiver, B the RTCP bandwidth fraction for this particular receiver, and R the average RTCP packet size, then the receiver operates in Immediate Feedback mode as long as N<=B*T/R.
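As a minimal sketch, the condition N<=B*T/R can be written as a predicate. The function name is hypothetical, and the sample values are borrowed from the example later in this section (9.6 kbit/s of receiver RTCP bandwidth, 120-byte packets, three events per two seconds); dividing the bandwidth equally among the receivers is an assumption made here for illustration.

```python
def in_immediate_feedback_mode(n_events, rtcp_bw_bps, interval_s, avg_packet_bits):
    """Check N <= B*T/R.

    n_events:        N, average number of events to report per interval T
    rtcp_bw_bps:     B, RTCP bandwidth share of this receiver in bit/s
    interval_s:      T, length of the interval in seconds
    avg_packet_bits: R, average RTCP packet size in bits
    """
    return n_events <= rtcp_bw_bps * interval_s / avg_packet_bits


# Assumption: 9.6 kbit/s is shared equally among the receivers.
share_6 = 9600 / 6  # per-receiver share with six receivers
share_7 = 9600 / 7  # per-receiver share with seven receivers

print(in_immediate_feedback_mode(3, share_6, 2.0, 960))
print(in_immediate_feedback_mode(3, share_7, 2.0, 960))
```

With six receivers the condition holds; with seven it no longer does, which matches the group size of 6-7 derived in the example.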
The upper bound for the Early RTCP mode then solely depends on the acceptable quality degradation, i.e., how many events per time interval may go unreported.
As stated in Section 3.3:
Using the above notation, Early RTCP mode can be roughly characterized by N > B*T/R as "lower bound". An estimate for an upper bound is more difficult. Setting N=1, we obtain for a given R and B the interval T = R/B as average interval between events to be reported. This information can be used as a hint to determine whether or not early transmission of RTCP packets is useful.
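The interval T = R/B obtained for N=1 can be computed directly. The sketch below is illustrative; the function name is hypothetical, and the sample values (a 120-byte packet and a 1,600-bit/s per-receiver share) are assumptions chosen for the illustration, not figures mandated by this document.

```python
def mean_event_interval_s(avg_packet_bits, rtcp_bw_bps):
    """Return T = R/B in seconds (the N = 1 case).

    avg_packet_bits: R, average RTCP packet size in bits
    rtcp_bw_bps:     B, RTCP bandwidth share of this receiver in bit/s
    """
    return avg_packet_bits / rtcp_bw_bps


# Assumed values: R = 120 bytes = 960 bits, B = 1,600 bit/s.
print(mean_event_interval_s(960, 1600))
```

If events to be reported occur, on average, more often than once per T seconds, the receiver's regular RTCP share cannot carry one report per event, which hints that early transmission of RTCP packets may be worth considering.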
Example: If a 256-kbit/s video with 30 fps is transmitted through a network with an MTU size of some 1,500 bytes, then, in most cases, each frame would fit into one packet, leading to a packet rate of 30 packets per second. If 5% packet loss occurs in the network (equally distributed, no inter-dependence between receivers), then each receiver will, on average, have to report 3 lost packets every two seconds. Assuming a single sender and more than three receivers, 3.75% of the session bandwidth is allocated to the receivers for RTCP, which yields 9.6 kbit/s. Assuming further an average compound RTCP packet size of 120 bytes, 10 RTCP packets can be sent per second, or 20 in two seconds. If every receiver needs to report three lost packets per two seconds, this yields a maximum group size of 6-7 receivers if all loss events are reported. The rules for transmission of Early RTCP packets should provide sufficient flexibility for most of this reporting to occur in a timely fashion.
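The example can be retraced step by step with the following sketch. The function and parameter names are hypothetical, and the calculation assumes, as the example implicitly does, that each RTCP packet reports a single loss event.

```python
def max_group_size(session_bw_bps, receiver_rtcp_fraction, avg_packet_bytes,
                   packet_rate_pps, loss_rate, window_s=2.0):
    """Estimate how many receivers can report every loss event.

    Assumption: one loss event is reported per RTCP packet.
    """
    # RTCP bandwidth shared by all receivers, in bit/s.
    rtcp_bw_bps = session_bw_bps * receiver_rtcp_fraction
    # RTCP packets that all receivers together may send per window.
    packets_per_window = rtcp_bw_bps * window_s / (avg_packet_bytes * 8)
    # Loss events each receiver has to report per window.
    events_per_receiver = packet_rate_pps * loss_rate * window_s
    return packets_per_window / events_per_receiver


# Values from the example: 256 kbit/s, 3.75% receiver share, 120-byte
# packets, 30 packets per second, 5% loss, two-second window.
size = max_group_size(256_000, 0.0375, 120, 30, 0.05)
print(round(size, 2))
```

The result lies between 6 and 7, in line with the group size stated in the example.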
Extending this example to determine the upper bound for Early RTCP mode could lead to the following considerations: assume that the underlying coding scheme and the application (as well as the tolerant users) allow on the order of one loss without repair per two seconds. Thus, the number of packets to be reported by each receiver decreases to two per two seconds, which increases the supportable group size to 10. Assuming further that some number of packet losses are correlated, feedback traffic is further reduced and group sizes of some 12 to 16 (maybe even 20) can be reasonably well supported using Early RTCP mode. Note that all these considerations are based upon statistics and will fail to hold in some cases.
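The effect of tolerating unreported losses can be sketched in the same way. The function name is hypothetical, and the sketch covers only the step from three to two reported losses per two seconds; the further gain from correlated losses depends on the loss pattern and is not modeled here.

```python
def group_size_with_tolerance(packets_per_window, events_per_window,
                              tolerated_unreported):
    """Group size when some loss events per window may go unreported.

    packets_per_window:   RTCP packets all receivers may send per window
    events_per_window:    loss events per receiver per window
    tolerated_unreported: events per receiver that need not be reported
    Assumption: one loss event is reported per RTCP packet.
    """
    to_report = events_per_window - tolerated_unreported
    if to_report <= 0:
        raise ValueError("no events left to report; group size is unbounded")
    return packets_per_window / to_report


# 20 RTCP packets per two seconds, 3 losses per receiver, 1 tolerated.
print(group_size_with_tolerance(20, 3, 1))
```

With one tolerated loss per two seconds, each receiver reports two events per window and the estimate rises to 10 receivers, as stated above.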