5. Datagram Packetization Layer PMTUD

This section specifies Datagram PLPMTUD (DPLPMTUD). The method can be introduced at various points in the IP protocol stack to discover the PLPMTU so that an application can utilize an appropriate MPS for the current network path.

DPLPMTUD SHOULD be performed only at one layer between a pair of endpoints. Therefore, when DPLPMTUD is enabled at a lower layer, an upper layer PL or application ought to avoid using DPLPMTUD. A PL MUST adjust the MPS indicated by DPLPMTUD to account for any additional overhead introduced by the PL.
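The MPS adjustment described above can be sketched as a simple subtraction. This is a minimal illustration, not part of the specification: the function name `adjusted_mps` and its parameters are hypothetical, and it assumes the PL adds a fixed number of bytes to every packet.

```python
def adjusted_mps(indicated_mps: int, pl_overhead: int) -> int:
    """Return the MPS an upper PL could pass on after subtracting its own overhead.

    indicated_mps: MPS reported by the layer that runs DPLPMTUD (hypothetical input).
    pl_overhead:   bytes this PL adds to every packet (assumed to be constant).
    """
    if indicated_mps < 0 or pl_overhead < 0:
        raise ValueError("sizes cannot be negative")
    # Never report a negative size; zero means no payload fits.
    return max(indicated_mps - pl_overhead, 0)


# Example: a lower layer indicates 1200 bytes and the PL adds a 12-byte header.
print(adjusted_mps(1200, 12))
```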

DPLPMTUD Implementation Location Examples

Application Data  *
   ↓
QUIC/RTP         * (Can implement DPLPMTUD)
   ↓
UDP              * (Can implement DPLPMTUD)
   ↓
IP Layer
   ↓
Network Interface

The central idea of DPLPMTUD is probing by the sender. Probe packets are sent to find the maximum size of user message that can be completely transferred from the sender to the destination.

The following sections identify the components required for implementation, provide an overview of the operational phases, and specify the state machine and search algorithm.

5.1. DPLPMTUD Components

This section describes the timers, constants, and variables of DPLPMTUD.

5.1.1. Timers

The method utilizes up to three timers:

PROBE_TIMER

Configuration: Timeout greater than the maximum time to receive an acknowledgment to a probe packet
Minimum: MUST NOT be less than 1 second
Recommended: SHOULD be larger than 15 seconds
Reference: Section 3.1.1 of the UDP Usage Guidelines [BCP145] provides guidance on selection of timer values

PMTU_RAISE_TIMER

Function: The period a sender will continue to use the current PLPMTU, after which it reenters the Search Phase
Period: 600 seconds, as recommended by PLPMTUD [RFC4821]
Optimization: DPLPMTUD MAY inhibit sending probe packets when no application data has been sent since the previous probe packet. A PL that prefers to use an up-to-date PMTU once user data is sent again can choose to continue PMTU discovery for each path. However, this results in sending additional packets

CONFIRMATION_TIMER

Applicability: MUST NOT be used when an Acknowledged PL is used
Function: For other PLs, configured as the period a PL sender waits before confirming the current PLPMTU is still supported
Relationship: Smaller than PMTU_RAISE_TIMER, used to reduce the PLPMTU (e.g., when a black hole is encountered)
Frequency: Confirmation needs to be frequent enough that the sending PL does not black-hole a large amount of traffic when data is flowing
Reference: Section 3.1.1 of the UDP Usage Guidelines [BCP145] provides guidance on selection of timer values
Optimization: DPLPMTUD MAY inhibit sending probe packets when no application data has been sent since the last probe packet

Note: DPLPMTUD specifies various timers; however, an implementation can choose to realize these timer functions using a single timer.
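The timer constraints listed above can be captured in a small configuration check. The following is a sketch under stated assumptions: the class name `DplpmtudTimers`, its field names, and the default values for PROBE_TIMER and CONFIRMATION_TIMER are illustrative and not taken from the specification; only the 1-second minimum, the 15-second recommendation, and the 600-second raise period come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DplpmtudTimers:
    probe_timer: float = 30.0                   # seconds; illustrative default
    pmtu_raise_timer: float = 600.0             # seconds; period given in the text
    confirmation_timer: Optional[float] = 60.0  # seconds; None when unused; illustrative

    def check(self, acknowledged_pl: bool) -> List[str]:
        """Raise on MUST-level violations; return advisory notes otherwise."""
        notes: List[str] = []
        if self.probe_timer < 1.0:
            raise ValueError("PROBE_TIMER MUST NOT be less than 1 second")
        if self.probe_timer <= 15.0:
            notes.append("PROBE_TIMER SHOULD be larger than 15 seconds")
        if acknowledged_pl:
            if self.confirmation_timer is not None:
                raise ValueError(
                    "CONFIRMATION_TIMER MUST NOT be used with an acknowledged PL")
        elif (self.confirmation_timer is not None
              and self.confirmation_timer >= self.pmtu_raise_timer):
            notes.append("CONFIRMATION_TIMER is expected to be smaller "
                         "than PMTU_RAISE_TIMER")
        return notes


# An unacknowledged PL with the illustrative defaults passes without notes.
print(DplpmtudTimers().check(acknowledged_pl=False))
```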

5.1.2. Constants

The following constants are defined:

MAX_PROBES

Definition: The maximum value of the PROBE_COUNT counter
Meaning: Represents a limit on the number of consecutive probe attempts of any size
Benefit: A MAX_PROBES value greater than 1 can provide robustness to isolated packet loss
Default: 3

MIN_PLPMTU

Definition: The smallest PLPMTU size that DPLPMTUD will attempt to use
Configuration: An endpoint might need to configure MIN_PLPMTU to provide space for extension headers and other encapsulation at layers below the PL
Path Dependency: This value can be interface and path dependent
IPv6: This size is greater than or equal to the size at the PL that results in a 1280-byte IPv6 packet, as specified in [RFC8200]
IPv4: This size is greater than or equal to the size at the PL that results in a 68-byte IPv4 packet
- Note: IPv4 routers are required to be able to forward a datagram of 68 bytes without further fragmentation. This is the combined size of a maximum-length IPv4 header (60 bytes) and the minimum fragment size of 8 bytes. In addition, receivers are required to be able to reassemble fragmented datagrams of up to at least 576 bytes, as stated in Section 3.3.3 of [RFC1122]

MAX_PLPMTU

Definition: The largest PLPMTU size
Limitation: Must be less than or equal to the maximum size of PL packet that can be sent on the outgoing interface (constrained by the local interface MTU)
Consideration: Ought also to be less than the maximum size of PL packet that the remote endpoint can receive (constrained by EMTU_R) when this is known
Design Limitation: Can be limited by the design or configuration of the PL in use
Application Limitation: An application or PL MAY choose a smaller MAX_PLPMTU when there is no need to send packets larger than a specific size

BASE_PLPMTU

Definition: A configured size expected to work for most paths
Range: Equal to or larger than MIN_PLPMTU and smaller than MAX_PLPMTU
Recommended: For most PLs, a suitable BASE_PLPMTU will be larger than 1200 bytes
IPv4: When using IPv4, there is no currently specified equivalent size, and a default BASE_PLPMTU of 1200 bytes is RECOMMENDED
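The size relationships among these constants can be sketched as follows. The helper names `size_at_pl` and `check_constants` are hypothetical. The sketch assumes that the size at the PL is what remains of an IP packet after subtracting the IP header and any encapsulation below the PL; the header sizes used in the example are the fixed 40-byte IPv6 header and an option-free 20-byte IPv4 header.

```python
def size_at_pl(ip_packet_size: int, ip_header: int, below_pl_overhead: int = 0) -> int:
    """Size left for the PL after the IP header and lower-layer encapsulation.

    below_pl_overhead covers extension headers or tunnels below the PL
    (an assumption of this sketch; it is path and interface dependent).
    """
    return ip_packet_size - ip_header - below_pl_overhead


def check_constants(min_plpmtu: int, base_plpmtu: int, max_plpmtu: int) -> None:
    """Check MIN_PLPMTU <= BASE_PLPMTU < MAX_PLPMTU, as described in the text."""
    if not (min_plpmtu <= base_plpmtu < max_plpmtu):
        raise ValueError("expected MIN_PLPMTU <= BASE_PLPMTU < MAX_PLPMTU")


# IPv6 lower bound: a 1280-byte packet with a 40-byte header.
min_v6 = size_at_pl(1280, 40)
# IPv4 lower bound: a 68-byte packet with a 20-byte header.
min_v4 = size_at_pl(68, 20)
# IPv4 upper bound for a hypothetical 1500-byte local interface MTU.
max_v4 = size_at_pl(1500, 20)
check_constants(min_v4, 1200, max_v4)
print(min_v6, min_v4, max_v4)
```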

5.1.3. Variables

This method utilizes a set of variables:

PROBED_SIZE

Definition: The size of the current probe packet as determined at the PL
Nature: This is a tentative value for the PLPMTU, awaiting confirmation

PROBE_COUNT

Definition: A count of the number of successive unsuccessful probe packets that have been sent
Reset: This is set to zero each time a probe packet is acknowledged
Note: Some loss of probes is expected during a search, so the loss of a single probe is not an indication of a PMTU problem

Packet Size Relationship Diagram

MAX_PLPMTU ────────┐
                   │
                   ↓
PROBED_SIZE ───────┤ (Under Probe)
                   │
                   ↓
PLPMTU ────────────┤ (Currently Used)
                   │
                   ↓
BASE_PLPMTU ───────┤ (Baseline)
                   │
                   ↓
MIN_PLPMTU ────────┘ (Minimum)

The diagram above illustrates the relationship between the packet size constants and variables when the DPLPMTUD algorithm performs path probing to increase the PLPMTU size. Probe packets of size PROBED_SIZE have been sent. Once acknowledged, the PLPMTU will be raised to PROBED_SIZE, allowing the DPLPMTUD algorithm to further increase PROBED_SIZE, moving toward sending probe packets of the actual PMTU size.

5.1.4. Overview of DPLPMTUD Phases

This section provides a high-level, informative view of the DPLPMTUD method by describing movement of the method through several operational phases. More detail can be found in the state machine (Section 5.2).

DPLPMTUD Phase Flow Diagram

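Base Phase → Search Phase → Search Complete Phase
Base Phase → Error Phase (connectivity or BASE_PLPMTU not confirmed)
Error Phase → Search Phase (probes no longer detect an error)
Search Complete Phase → Search Phase (PMTU_RAISE_TIMER expires)
Search Complete Phase → Base Phase (black hole detected)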

Base Phase

Purpose: The Base Phase uses packets of size BASE_PLPMTU to confirm connectivity to the remote peer
Connection Confirmation: For a connection-oriented PL, confirmation of connectivity is implicit (it can be performed in the PL connection handshake). A connectionless PL sends a probe packet and uses acknowledgment of this probe packet to confirm that the remote peer is reachable
PLPMTU Confirmation: The sender also confirms that the network path supports BASE_PLPMTU. This can be achieved by using PL mechanisms (e.g., using a handshake packet of size BASE_PLPMTU) or by sending a probe packet of BASE_PLPMTU size and confirming reception of that probe packet
Probe Timing: A probe packet of BASE_PLPMTU size can be sent immediately upon entry to the Base Phase (following the connection check). A PL not wishing to support paths with a PLPMTU less than BASE_PLPMTU can simplify this phase to a single step by performing the connection check using a probe of BASE_PLPMTU size
Success: Once confirmed, DPLPMTUD enters the Search Phase
Failure: If the Base Phase fails to confirm BASE_PLPMTU, DPLPMTUD enters the Error Phase

Search Phase

Purpose: The Search Phase utilizes a search algorithm to send probe packets to seek to increase the PLPMTU
Termination: The algorithm concludes by entering the Search Complete Phase when a suitable PLPMTU is found
PTB Response: A PL can respond to PTB messages by using the PTB to advance or terminate the search, see Section 4.6

Search Complete Phase

State: The Search Complete Phase is entered when the PLPMTU is supported on a network path
Periodic Confirmation: The PL can use the CONFIRMATION_TIMER to periodically repeat probe packets of the current PLPMTU size
Black Hole Detection: If the sender is unable to confirm reachability (e.g., if the CONFIRMATION_TIMER expires) or the PL signals a lack of reachability, then a black hole is detected and DPLPMTUD enters the Base Phase
Periodic Search: The PMTU_RAISE_TIMER is used to periodically resume the Search Phase to discover whether the PLPMTU can be raised

Error Phase

Trigger: The Error Phase is entered when the PLPMTU information for a path is conflicting or invalid (e.g., the path fails to support BASE_PLPMTU), so that DPLPMTUD is unable to progress and the PLPMTU is lowered
Mitigation: This state implements a method to mitigate oscillations in the state event engine. It signals a conservative MPS value to higher layers via the PL
Exit: This phase is exited when probe packets no longer detect an error. The PL sender then enters the Search Phase

Robustness: A method solely reducing the PLPMTU to a suitable size is sufficient to ensure reliable operation, but could be very inefficient when the actual PMTU changes or when the method (for whatever reason) makes a suboptimal choice for the PLPMTU.

Complete Implementation: A complete implementation of DPLPMTUD provides an algorithm that allows a DPLPMTUD sender to increase the PLPMTU following changes in the path characteristics, such as when a link is reconfigured with a larger MTU, or when there is a change to the set of links traversed by an end-to-end flow (e.g., after a routing or path failover decision).

5.2. State Machine

The state machine for DPLPMTUD is depicted below. If multipath or multihoming is supported, a state machine is needed for each path.

Note: For clarity, the diagram does not show all transitions.

State Machine Diagram

        [DISABLED]
            ↓ ↑
      Connection Established/Lost
            ↓ ↑
         [BASE] ←──────────────┐
            ↓                   │
        Probe Success          Black Hole Detection
            ↓                   │
       [SEARCHING] ─────────────┤
            ↓                   │
      Probe Complete/Fail       │
            ↓                   │
    [SEARCH_COMPLETE] ──────────┘
            ↑ ↓
      Periodic Raise/Black Hole Detection

        [ERROR]
      (Error Handling)

State Definitions

DISABLED

Initial State: The initial state before probing has started
Entry Condition: Entered from any other state when the PL indicates a loss of connectivity
Exit Condition: This state is left once the PL indicates connectivity to the remote PL
Transition: When transitioning to BASE state, a probe packet of size BASE_PLPMTU can be sent immediately

BASE

Purpose: Used to confirm that the network path supports the BASE_PLPMTU size, which is intended to allow an application to continue working when the actual PMTU is temporarily reduced. It also seeks to avoid long periods in which a sender searching for a larger PLPMTU is unaware that packets are not being delivered due to a packet or ICMP black hole
On Entry: PROBED_SIZE is set to the BASE_PLPMTU size and PROBE_COUNT is set to zero
Probing: Each time a probe packet is sent, the PROBE_TIMER is started
Successful Exit: The state is exited when a probe packet is acknowledged, and the PL sender then enters the SEARCHING state
Failure Exit: The state is also left when the PROBE_COUNT reaches MAX_PROBES or a validated PTB message is received. This causes the PL sender to enter the ERROR state

SEARCHING

Primary State: This is the main probing state
Entry Condition: Entered when a probe of BASE_PLPMTU completes
Successful Probe: Each time a probe packet is acknowledged, PROBE_COUNT is set to zero, PLPMTU is set to PROBED_SIZE, and PROBED_SIZE is then increased using the search algorithm (as described in Section 5.3)
Probe Failure: When a probe packet is sent without being acknowledged within the PROBE_TIMER period, PROBE_COUNT is incremented and a new probe is transmitted
Exit Condition: The state is exited to enter SEARCH_COMPLETE when PROBE_COUNT reaches MAX_PROBES, when a validated PTB is received that corresponds to the last successfully probed size (PL_PTB_SIZE = PLPMTU), or when a probe of MAX_PLPMTU size is acknowledged (PLPMTU = MAX_PLPMTU)
Black Hole Detection: When a black hole is detected while in the SEARCHING state, this causes the PL sender to enter the BASE state

SEARCH_COMPLETE

Completion Flag: Indicates the search has completed. This is the normal maintenance state where the PL is not probing to update the PLPMTU
Duration: DPLPMTUD remains in this state until either the PMTU_RAISE_TIMER expires or a black hole is detected
Unacknowledged PL: When DPLPMTUD uses an Unacknowledged PL and is in the SEARCH_COMPLETE state, the CONFIRMATION_TIMER periodically resets PROBE_COUNT and schedules a probe packet of size PLPMTU. If MAX_PROBES successive PLPMTU-sized probe packets fail to be acknowledged, the method enters the BASE state
Acknowledged PL: When used with an Acknowledged PL (e.g., SCTP), DPLPMTUD SHOULD NOT continue to generate PLPMTU probes in this state

ERROR

Failure Situation: Indicates the network path is not known to support a PLPMTU of at least BASE_PLPMTU size or there is contradictory information about the network path that could otherwise cause the MPS signal to higher layers to oscillate excessively
Oscillation Mitigation: This state implements a method to mitigate oscillations in the state event engine
Conservative Value: It signals a conservative MPS value to higher layers via the PL
Exit: This state is exited when probe packets no longer detect an error. The PL sender then enters the SEARCHING state
Endpoint Fragmentation: Implementations are permitted to enable endpoint fragmentation if DPLPMTUD is unable to validate MIN_PLPMTU within PROBE_COUNT probes
Disable: If DPLPMTUD is unable to validate MIN_PLPMTU, implementations will transition to the DISABLED state
Note: MIN_PLPMTU can be the same as BASE_PLPMTU, simplifying the operation of this state
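The state definitions above can be sketched as an event-driven class. This is a simplified illustration under several assumptions, not the specified state machine: the class and method names are hypothetical, the search uses a fixed `step` increment instead of a real search algorithm, PTB handling is omitted, the ERROR state simply probes at MIN_PLPMTU, and the caller is expected to send the probes, run the timers, and decide what MPS to signal while in the BASE state.

```python
from enum import Enum, auto


class State(Enum):
    DISABLED = auto()
    BASE = auto()
    SEARCHING = auto()
    SEARCH_COMPLETE = auto()
    ERROR = auto()


class Dplpmtud:
    """Per-path sketch; the caller sends probes of probed_size and runs timers."""

    def __init__(self, min_plpmtu, base_plpmtu, max_plpmtu, max_probes=3, step=64):
        self.min_plpmtu = min_plpmtu
        self.base_plpmtu = base_plpmtu
        self.max_plpmtu = max_plpmtu
        self.max_probes = max_probes
        self.step = step  # hypothetical fixed increment, not a real search algorithm
        self.state = State.DISABLED
        self.plpmtu = min_plpmtu
        self.probed_size = base_plpmtu
        self.probe_count = 0

    def _enter(self, state, probed_size):
        self.state, self.probed_size, self.probe_count = state, probed_size, 0

    def _next_size(self):
        return min(self.plpmtu + self.step, self.max_plpmtu)

    def connectivity_up(self):
        if self.state is State.DISABLED:
            self._enter(State.BASE, self.base_plpmtu)

    def connectivity_lost(self):
        self._enter(State.DISABLED, self.base_plpmtu)

    def probe_acked(self):
        if self.state is State.DISABLED:
            return
        self.probe_count = 0
        if self.state is State.SEARCH_COMPLETE:
            return  # confirmation probe: the current PLPMTU still works
        self.plpmtu = self.probed_size
        if self.plpmtu >= self.max_plpmtu:
            self._enter(State.SEARCH_COMPLETE, self.plpmtu)
        else:
            self._enter(State.SEARCHING, self._next_size())

    def probe_timeout(self):
        if self.state is State.DISABLED:
            return
        self.probe_count += 1
        if self.probe_count < self.max_probes:
            return  # caller sends another probe of probed_size
        if self.state is State.BASE:
            self.plpmtu = self.min_plpmtu  # conservative size while in ERROR
            self._enter(State.ERROR, self.min_plpmtu)
        elif self.state is State.SEARCHING:
            self._enter(State.SEARCH_COMPLETE, self.plpmtu)
        elif self.state is State.SEARCH_COMPLETE:
            self._enter(State.BASE, self.base_plpmtu)  # confirmation failed
        else:  # ERROR: MIN_PLPMTU could not be validated
            self._enter(State.DISABLED, self.base_plpmtu)

    def raise_timer_expired(self):
        if self.state is State.SEARCH_COMPLETE and self.plpmtu < self.max_plpmtu:
            self._enter(State.SEARCHING, self._next_size())

    def black_hole_detected(self):
        if self.state in (State.SEARCHING, State.SEARCH_COMPLETE):
            self._enter(State.BASE, self.base_plpmtu)
```

If multipath or multihoming is supported, one such object would be kept per path, matching the per-path state machine requirement stated earlier in this section.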

5.3. Search to Increase the PLPMTU

This section describes the algorithms used by DPLPMTUD to search for a larger PLPMTU.

5.3.1. Probing for a Larger PLPMTU

Implementations use a search algorithm across the search range to determine whether the network path can support a larger PLPMTU.

The method discovers the search range by confirming the minimum PLPMTU and then using probing to select a PROBED_SIZE less than or equal to MAX_PLPMTU. The MAX_PLPMTU is the minimum of the local MTU and EMTU_R (when learned from the remote endpoint). MAX_PLPMTU MAY be reduced by an application that sets a maximum to the size of datagrams it will send.

When the first probe of size greater than or equal to PLPMTU is sent, PROBE_COUNT is initialized to zero. Each probe packet that is successfully sent to the remote peer is confirmed by an acknowledgment from the PL (see Section 4.1).

Each time a probe packet is sent to the destination, the PROBE_TIMER is started. The timer is canceled when the PL receives an acknowledgment that the probe packet has been successfully sent across the path (Section 4.1). This confirms PROBED_SIZE is supported, and the PROBED_SIZE value is then assigned to PLPMTU. The search algorithm can continue to send subsequent probes of increasing size.

If the timer expires before a probe packet is acknowledged, the probe has failed to confirm PROBED_SIZE. Each time the PROBE_TIMER expires, PROBE_COUNT is incremented, the PROBE_TIMER is reinitialized, and a new probe of the same size or any other size (as determined by the search algorithm) can be sent. A maximum number of consecutive failed probes (MAX_PROBES) is configured. If the value of PROBE_COUNT reaches MAX_PROBES, probing will stop, and the PL sender enters the SEARCH_COMPLETE state.
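The probing procedure above can be sketched as a loop. This is an illustration under stated assumptions: the function name `search_larger_plpmtu` is hypothetical, `send_probe` is a caller-supplied callable that returns True when a probe of the given size is acknowledged before the PROBE_TIMER expires, and the next size is chosen with a fixed `step` rather than a real search algorithm.

```python
from typing import Callable


def search_larger_plpmtu(plpmtu: int, max_plpmtu: int,
                         send_probe: Callable[[int], bool],
                         step: int = 64, max_probes: int = 3) -> int:
    """Probe upward from plpmtu and return the largest confirmed size."""
    probe_count = 0
    while plpmtu < max_plpmtu and probe_count < max_probes:
        probed_size = min(plpmtu + step, max_plpmtu)
        if send_probe(probed_size):
            # Acknowledged: PROBED_SIZE becomes the PLPMTU, counter resets.
            plpmtu = probed_size
            probe_count = 0
        else:
            # PROBE_TIMER expired: count the failure and probe again.
            probe_count += 1
    # Reaching here corresponds to entering SEARCH_COMPLETE.
    return plpmtu


# Simulated path that delivers packets of up to 1400 bytes at the PL.
print(search_larger_plpmtu(1200, 1500, lambda size: size <= 1400, step=100))
```

Because PROBE_COUNT is reset on every acknowledgment, an isolated lost probe delays the search but does not end it; only MAX_PROBES consecutive failures do.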

5.3.2. Selection of Probe Sizes

The search algorithm determines the minimum useful increase in the PLPMTU. It is not constructive for the PL sender to attempt to probe all sizes. This would impose unnecessary load on the path. Implementations SHOULD select a set of probe packet sizes to maximize the gain in PLPMTU from each search step.

Implementations can optimize the search procedure by selecting step sizes from a table of common PMTU sizes. When selecting an appropriate next size to search, implementers ought also to consider common MPS sizes that applications might seek to use and that there could be common MTU sizes in use within the network.
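Selecting the next probe size from a table can be sketched as below. The table contents are purely illustrative numbers, not values from the specification; a real table would be derived from the MTU sizes expected on the path minus the headers below the PL. The function name `next_probe_size` is hypothetical.

```python
import bisect
from typing import Optional, Sequence

# Illustrative, sorted table of candidate PLPMTU sizes (assumed values).
EXAMPLE_SIZES = (1200, 1232, 1400, 1452, 1472)


def next_probe_size(plpmtu: int, max_plpmtu: int,
                    table: Sequence[int] = EXAMPLE_SIZES) -> Optional[int]:
    """Return the next size to probe, or None when nothing larger is allowed.

    The table must be sorted in ascending order.
    """
    if plpmtu >= max_plpmtu:
        return None
    # Index of the first table entry strictly larger than the current PLPMTU.
    i = bisect.bisect_right(table, plpmtu)
    if i < len(table) and table[i] <= max_plpmtu:
        return table[i]
    # No suitable table entry: fall back to probing MAX_PLPMTU directly.
    return max_plpmtu


print(next_probe_size(1200, 1472))
```

Stepping through a short table keeps the number of probes small while still landing on sizes that are likely to match real links, which is the gain-per-step goal described above.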

5.3.3. Resilience to Inconsistent Path Information

The decision to increase the PLPMTU needs to be resilient to the possibility of inconsistency in the information that has been learned about the network path. Inconsistency in the path can arise when probe packets are lost for reasons other than the packet size (i.e., not size-related loss) or due to frequent path changes. Frequent path changes could result from unexpected "flapping" -- where some packets from a flow are delivered along one path, but other packets follow a different path with different properties.

A PL sender is able to detect inconsistency from a sequence of acknowledged PLPMTU probes or from a sequence of PTB messages that it receives. A PL sender can use an alternative search pattern when it detects inconsistent path information, one that limits the MPS that is provided to a smaller value for a period of time. This avoids unnecessary packet loss.
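One possible heuristic for this behavior is sketched below. It is an assumption of this example, not a method defined by the specification: a path is treated as inconsistent when a size that was previously acknowledged later fails, and the MPS is then capped for a hold period. The class name `PathConsistency`, the `hold_time` value, and the rule itself are hypothetical; "failed" here means probing at that size exhausted MAX_PROBES, since a single lost probe is not an indication of a problem.

```python
class PathConsistency:
    """Track probe outcomes and cap the MPS while the path looks inconsistent."""

    def __init__(self, hold_time: float = 300.0):
        self.hold_time = hold_time  # seconds; illustrative value
        self.largest_acked = 0
        self.hold_until = 0.0

    def record(self, size: int, acked: bool, now: float) -> None:
        if acked:
            self.largest_acked = max(self.largest_acked, size)
        elif size <= self.largest_acked:
            # A size that worked before now fails: start the hold period.
            self.hold_until = now + self.hold_time

    def mps_limit(self, plpmtu: int, fallback: int, now: float) -> int:
        """Return the size to offer upward: capped during the hold period."""
        if now < self.hold_until:
            return min(plpmtu, fallback)
        return plpmtu


tracker = PathConsistency(hold_time=300.0)
tracker.record(1400, acked=True, now=0.0)
tracker.record(1300, acked=False, now=10.0)  # contradicts the earlier success
print(tracker.mps_limit(1400, 1200, now=20.0))
```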

5.4. Robustness to Inconsistent Paths

Some paths could be unable to sustain packets of size BASE_PLPMTU. The Error State can be implemented to provide robustness for such paths. This allows fallback to a PLPMTU smaller than desired rather than suffer connection failure. This can utilize methods such as endpoint IP fragmentation to enable the PL sender to communicate using packets smaller than BASE_PLPMTU.

Algorithm Summary

Key elements of the DPLPMTUD algorithm:

Conservative Start: Begin from BASE_PLPMTU
Progressive Probing: Gradually increase probe size
Confirmation Mechanism: Verify each size is available
Black Hole Detection: Quickly respond to path problems
Periodic Maintenance: Keep PLPMTU up-to-date
Error Recovery: Handle exceptional situations

These mechanisms together ensure that DPLPMTUD can work reliably and efficiently under various network conditions.
