1. Introduction

The Real-time Transport Protocol (RTP) [1] comprises two components: a data transfer protocol and an associated control protocol (RTCP). Historically, RTP and RTCP have been run on separate UDP ports. With increased use of Network Address Port Translation (NAPT) [14], this has become problematic, since maintaining multiple NAT bindings can be costly. It also complicates firewall administration, since multiple ports must be opened to allow RTP traffic. This memo discusses how the RTP and RTCP flows for a single media type can be run on a single port, to ease NAT traversal and simplify firewall administration, and considers when such multiplexing is appropriate. The multiplexing of several types of media (e.g., audio and video) onto a single port is not considered here (but see Section 5.2 of [1]).

This memo is structured as follows: in Section 2 we discuss the design choices that led to the use of separate ports and comment on the applicability of those choices to current network environments. We discuss terminology in Section 3 and how to distinguish multiplexed packets in Section 4; we then specify when and how RTP and RTCP should be multiplexed, and how to signal multiplexed sessions, in Section 5. Quality of service and bandwidth issues are discussed in Section 6. We conclude with security considerations in Section 7 and IANA considerations in Section 8.

This memo updates Section 11 of [1].

2. Background

An RTP session comprises data packets and periodic control (RTCP) packets. RTCP packets are assumed to use "the same distribution mechanism as the data packets", and the "underlying protocol MUST provide multiplexing of the data and control packets, for example using separate port numbers with UDP" [1]. Multiplexing was deferred to the underlying transport protocol, rather than being provided within RTP, for the following reasons:

Simplicity: an RTP implementation is simplified by moving the RTP and RTCP demultiplexing to the transport layer, since it need not concern itself with the separation of data and control packets. This allows the implementation to be structured in a very natural fashion, with a clean separation of data and control planes.
Efficiency: following the principle of integrated layer processing [15], an implementation will be more efficient when demultiplexing happens in a single place (e.g., according to UDP port) than when split across multiple layers of the stack (e.g., according to UDP port and then according to packet type).
To enable third-party monitors: while unicast voice-over-IP has always been considered, RTP was also designed to support loosely coupled multicast conferences [16] and very large-scale multicast streaming media applications (such as the so-called triple-play IP television (IPTV) service). Accordingly, the design of RTP allows the RTCP packets to be multicast using a separate IP multicast group and UDP port to the data packets. This not only allows participants in a session to get reception-quality feedback but also enables deployment of third-party monitors, which listen to reception quality without access to the data packets. This was intended to provide manageability of multicast sessions, without compromising privacy.

While these design choices are appropriate for many uses of RTP, they are problematic in some cases. There are many RTP deployments that don't use IP multicast, and with the increased use of Network Address Translation (NAT) the simplicity of multiplexing at the transport layer has become a liability, since it requires complex signalling to open multiple NAT pinholes. In environments such as these, it is desirable to provide an alternative to demultiplexing RTP and RTCP using separate UDP ports, instead using only a single UDP port and demultiplexing within the application.

This memo provides such an alternative by multiplexing RTP and RTCP packets on a single UDP port, distinguished by the RTP payload type and RTCP packet type values. This pushes some additional work onto the RTP implementation, in exchange for simplified NAT traversal.

3. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [2].

4. Distinguishable RTP and RTCP Packets

When RTP and RTCP packets are multiplexed onto a single port, the RTCP packet type field occupies the same position in the packet as the combination of the RTP marker (M) bit and the RTP payload type (PT). This field can be used to distinguish RTP and RTCP packets when two restrictions are observed: 1) the RTP payload type values used are distinct from the RTCP packet types used; and 2) for each RTP payload type (PT), PT+128 is distinct from the RTCP packet types used. The first constraint precludes a direct conflict between RTP payload type and RTCP packet type; the second constraint precludes a conflict between an RTP data packet with the marker bit set and an RTCP packet.

The following conflicts between RTP and RTCP packet types are known:

RTP payload types 64-65 conflict with the (obsolete) RTCP FIR and NACK packets defined in the original "RTP Payload Format for H.261 Video Streams" [3] (which was obsoleted by [17]).
RTP payload types 72-76 conflict with the RTCP SR, RR, SDES, BYE, and APP packets defined in the RTP specification [1].
RTP payload types 77-78 conflict with the RTCP RTPFB and PSFB packets defined in the RTP/AVPF profile [4].
RTP payload type 79 conflicts with RTCP Extended Report (XR) [5] packets.
RTP payload type 80 conflicts with Receiver Summary Information (RSI) packets defined in "RTCP Extensions for Single-Source Multicast Sessions with Unicast Feedback" [6].

New RTCP packet types may be registered in the future and will further reduce the RTP payload types that are available when multiplexing RTP and RTCP onto a single port. To allow this multiplexing, future RTCP packet type assignments SHOULD be made after the current assignments in the range 209-223, then in the range 194-199, so that only the RTP payload types in the range 64-95 are blocked. RTCP packet types in the ranges 1-191 and 224-254 SHOULD only be used when other values have been exhausted.

Given these constraints, it is RECOMMENDED to follow the guidelines in the RTP/AVP profile [7] for the choice of RTP payload type values, with the additional restriction that payload type values in the range 64-95 MUST NOT be used. Specifically, dynamic RTP payload types SHOULD be chosen in the range 96-127 where possible. Values below 64 MAY be used if that is insufficient, in which case it is RECOMMENDED that payload type numbers that are not statically assigned by [7] be used first.

Note: Since Section 6.1 of [1] specifies that all RTCP packets MUST be sent as compound packets beginning with a Sender Report (SR) or a Receiver Report (RR) packet, one might wonder why RTP payload types other than 72 and 73 are prohibited when multiplexing RTP and RTCP. This is done to support [18], which allows the use of non-compound RTCP packets in some circumstances.