Appendix A. Requirements Analysis
[RFC5479] describes security requirements for media keying. This section evaluates this proposal with respect to each requirement.
A.1. Forking and Retargeting (R-FORK-RETARGET, R-BEST-SECURE, R-DISTINCT)
In this document, the SDP offer (in the INVITE) is simply an advertisement of the capability to do security. This advertisement does not depend on the identity of the communicating peer, so forking and retargeting work when all the endpoints will do SRTP. When a mix of SRTP and non-SRTP endpoints is present, we use the SDP capabilities mechanism currently being defined [MMUSIC-SDP] to transparently negotiate security where possible. Because DTLS establishes a new key for each session, only the entity with which the call is finally established gets the media encryption keys (R3).
A.2. Distinct Cryptographic Contexts (R-DISTINCT)
DTLS performs a new DTLS handshake with each endpoint, which establishes distinct keys and cryptographic contexts for each endpoint.
A.3. Reuse of a Security Context (R-REUSE)
DTLS allows sessions to be resumed with the 'TLS session resumption' functionality. This feature can be used to lower the amount of cryptographic computation that needs to be done when two peers re-initiate the communication. See [RFC5764] for more on session resumption in this context.
A.4. Clipping (R-AVOID-CLIPPING)
Because the key establishment occurs in the media plane, media need not be clipped before the receipt of the SDP answer. Note, however, that only confidentiality is provided until the offerer receives the answer: the answerer knows that they are not sending data to an attacker but the offerer cannot know that they are receiving data from the answerer.
A.5. Passive Attacks on the Media Path (R-PASS-MEDIA)
The public key algorithms used by DTLS cipher suites, such as RSA, Diffie-Hellman, and Elliptic Curve Diffie-Hellman, are secure against passive attacks.
A.6. Passive Attacks on the Signaling Path (R-PASS-SIG)
DTLS provides protection against passive attacks by adversaries on the signaling path since only a fingerprint is exchanged using SIP signaling.
A.7. (R-SIG-MEDIA, R-ACT-ACT)
An attacker who controls the media channel but not the signaling channel can perform a MITM attack on the DTLS handshake, but doing so changes the certificates, which causes the fingerprint check to fail. Thus, any successful attack requires that the attacker modify the signaling messages to replace the fingerprints.
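The fingerprint check described above can be sketched as follows. This is a minimal illustration, not part of the specification: it assumes the peer's certificate is available as DER-encoded bytes, that SHA-256 is the signaled hash function, and that the fingerprint is written as colon-separated uppercase hex. The function names are hypothetical.

```python
import hashlib


def certificate_fingerprint(cert_der: bytes) -> str:
    """Return the SHA-256 digest of a DER certificate as colon-separated uppercase hex."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_matches(cert_der: bytes, signaled: str) -> bool:
    """Compare the certificate seen in the DTLS handshake with the signaled fingerprint.

    Hex digits are compared case-insensitively. A mismatch means the
    certificate presented on the media path is not the one advertised in
    signaling, which is what a media-path MITM would produce.
    """
    return certificate_fingerprint(cert_der) == signaled.strip().upper()
```

An endpoint using such a check would abort the session when fingerprint_matches returns False, rather than falling back to unprotected media.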
If RFC 4474 Identity or an equivalent mechanism is used, an attacker who controls the signaling channel at any point between the proxies performing the Identity signatures cannot modify the fingerprints without invalidating the signature. Thus, even an attacker who controls both signaling and media paths cannot successfully attack the media traffic. Note that the channel between the UA and the authentication service MUST be secured and the authentication service MUST verify the UA's identity in order for this mechanism to be secure.
Note that an attacker who controls the authentication service can impersonate the UA using that authentication service. This is an intended feature of SIP Identity -- the authentication service owns the namespace and therefore defines which user has which identity.
A.8. Binding to Identifiers (R-ID-BINDING)
When an end-to-end mechanism such as SIP-Identity [RFC4474] with SIP-Connected-Identity [RFC4916], or S/MIME, is used, it binds the endpoint's certificate fingerprints to the From: address in the signaling. The fingerprint is covered by the Identity signature. When other mechanisms (e.g., SIPS) are used, the binding is correspondingly weaker.
A.9. Perfect Forward Secrecy (R-PFS)
DTLS supports Diffie-Hellman and Elliptic Curve Diffie-Hellman cipher suites that provide PFS.
A.10. Algorithm Negotiation (R-COMPUTE)
DTLS negotiates cipher suites before performing significant cryptographic computation and therefore supports algorithm negotiation and multiple cipher suites without additional computational expense.
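The following sketch illustrates why negotiation is cheap: choosing a cipher suite is a comparison of two lists and involves no public key operations. It is an assumption-laden simplification (the function name is hypothetical, and a real DTLS stack may apply its own preference order rather than the client's).

```python
from typing import Iterable, Optional, Set


def select_cipher_suite(offered: Iterable[str], supported: Set[str]) -> Optional[str]:
    """Pick the first offered suite that the responder also supports.

    'offered' is taken to be in the client's preference order. No
    cryptographic computation happens here; key exchange only starts
    after a suite has been chosen.
    """
    for suite in offered:
        if suite in supported:
            return suite
    return None  # no common suite: the handshake would fail


offered = [
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
]
supported = {"TLS_DHE_RSA_WITH_AES_128_CBC_SHA", "TLS_RSA_WITH_AES_128_CBC_SHA"}
chosen = select_cipher_suite(offered, supported)
```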
A.11. RTP Validity Check (R-RTP-VALID)
DTLS packets do not pass the RTP validity check. The first byte of a DTLS packet is the content type, and all current DTLS content types have the first two bits set to zero. An RTP parser therefore reads a version of zero, and the packet fails the first validity check. DTLS packets can also be distinguished from STUN packets. See [RFC5764] for details on demultiplexing.
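As an illustration, a receiver can classify an incoming datagram by its first byte. The sketch below uses the value ranges given for demultiplexing in [RFC5764] (0-1 for STUN, 20-63 for DTLS, 128-191 for RTP); the function names are hypothetical and the code is not a complete demultiplexer.

```python
def classify_packet(datagram: bytes) -> str:
    """Classify a datagram by its first byte, using the ranges from RFC 5764."""
    if not datagram:
        return "empty"
    first = datagram[0]
    if first <= 1:
        return "stun"
    if 20 <= first <= 63:
        return "dtls"
    if 128 <= first <= 191:
        return "rtp"
    return "unknown"


def rtp_version(datagram: bytes) -> int:
    """Return the 2-bit RTP version field (the top two bits of the first byte)."""
    return datagram[0] >> 6


# DTLS content types: change_cipher_spec, alert, handshake, application_data.
DTLS_CONTENT_TYPES = (20, 21, 22, 23)
```

Every value in DTLS_CONTENT_TYPES has its top two bits clear, so rtp_version yields 0 for a DTLS record, whereas a valid RTP packet carries version 2.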
A.12. Third-Party Certificates (R-CERTS, R-EXISTING)
Third-party certificates are not required because signaling (e.g., [RFC4474]) is used to authenticate the certificates used by DTLS. However, if the parties share an authentication infrastructure that is compatible with TLS (third-party certificates or shared keys) it can be used.
A.13. FIPS 140-2 (R-FIPS)
TLS implementations may already be FIPS 140-2 approved, and the algorithms used here are consistent with the approval of DTLS and DTLS-SRTP.
A.14. Linkage between Keying Exchange and SIP Signaling (R-ASSOC)
The signaling exchange is linked to the key management exchange by the fingerprints carried in SIP, which identify the certificates exchanged in DTLS.
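To show where this linkage sits on the signaling side, the sketch below pulls the fingerprint attributes out of an SDP body. It assumes the "a=fingerprint:" attribute with a hash function name followed by a colon-separated hex value; the helper name and the example session description are illustrative only, and the code is not a full SDP parser.

```python
from typing import List, Tuple

PREFIX = "a=fingerprint:"


def extract_fingerprints(sdp: str) -> List[Tuple[str, str]]:
    """Return (hash_function, fingerprint) pairs from the a=fingerprint lines of an SDP body."""
    found = []
    for line in sdp.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX):
            continue
        parts = line[len(PREFIX):].split()
        if len(parts) == 2:  # ignore malformed attribute values
            found.append((parts[0].lower(), parts[1].upper()))
    return found


# Illustrative session description; addresses and fingerprint are examples only.
EXAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=- 1 1 IN IP4 192.0.2.1",
    "s=-",
    "c=IN IP4 192.0.2.1",
    "t=0 0",
    "m=audio 5004 UDP/TLS/RTP/SAVP 0",
    "a=setup:actpass",
    "a=fingerprint:SHA-1 4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB",
])
```

The pairs returned here are what an endpoint would compare against the digest of the certificate it receives in the DTLS handshake.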
A.15. Denial-of-Service Vulnerability (R-DOS)
DTLS offers some degree of Denial-of-Service (DoS) protection as a built-in feature (see Section 4.2.1 of [RFC4347]).
A.16. Crypto-Agility (R-AGILITY)
DTLS allows cipher suites to be negotiated and hence new algorithms can be incrementally deployed. Work on replacing the fixed MD5/SHA-1 key derivation function is ongoing.
A.17. Downgrading Protection (R-DOWNGRADE)
DTLS provides protection against downgrading attacks since the selection of the offered cipher suites is confirmed in a later stage of the handshake. This protection is effective unless an adversary is able to break a cipher suite in real time. RFC 4474 is able to prevent an active attacker on the signaling path from downgrading the call from SRTP to RTP.
A.18. Media Security Negotiation (R-NEGOTIATE)
DTLS allows a User Agent to negotiate media security parameters for each individual session.
A.19. Signaling Protocol Independence (R-OTHER-SIGNALING)
The DTLS-SRTP framework does not rely on SIP; every protocol that is capable of exchanging a fingerprint and the media description can be secured.
A.20. Media Recording (R-RECORDING)
An extension [SIPPING-SRTP] has been specified to support media recording without requiring intermediaries to act as an MITM.
Otherwise, intermediaries that record media need to act as an MITM.
A.21. Interworking with Intermediaries (R-TRANSCODER)
In order to interface with any intermediary that transcodes the media, the transcoder must have access to the keying material and be treated as an endpoint for the purposes of this document.
A.22. PSTN Gateway Termination (R-PSTN)
The DTLS-SRTP framework allows the media security to terminate at a PSTN gateway. This does not provide end-to-end security, but is consistent with the security goals of this framework because the gateway is authorized to speak for the PSTN namespace.
A.23. R-ALLOW-RTP
DTLS-SRTP allows RTP media to be received by the calling party until SRTP has been negotiated with the answerer, after which SRTP is preferred over RTP.
A.24. R-HERFP
The Heterogeneous Error Response Forking Problem (HERFP) is not applicable to DTLS-SRTP because the key exchange protocol is executed along the media path. Error messages are therefore communicated along this path, and proxies do not need to forward them.