Best Practices for Securing RTP Media Signaled with SIP
Neustar, Inc.
jon.peterson@team.neustar
Cisco
rlb@ipv.sx
Vigil Security, LLC
housley@vigilsec.com
SIP
RTP
security
Although the Session Initiation Protocol (SIP) includes a suite of security services that has been expanded by numerous specifications over the years, there is no single place that explains how to use SIP to establish confidential media sessions. Additionally, existing mechanisms have some feature gaps that need to be identified and resolved in order for them to address the pervasive monitoring threat model. This specification describes best practices for negotiating confidential media with SIP, including a comprehensive protection solution that binds the media layer to SIP layer identities.
Introduction
The Session Initiation
Protocol (SIP) includes a suite of security services, including
Digest Authentication for authenticating
entities with a shared secret, TLS for
transport security, and (optionally) S/MIME
for body security. SIP is frequently used to establish media sessions -- in
particular, audio or audiovisual sessions, which have their own
security mechanisms available, such as the Secure Real-time Transport Protocol (SRTP). However, the practices needed to bind security at the media layer to security at the SIP layer, to provide an assurance that protection is in place all the way up the stack, rely on a great many external security mechanisms and practices. This document explains their optimal use as a best practice.
Revelations about widespread pervasive monitoring of the Internet have led to a greater desire to protect Internet communications . In order to maximize the use of security features, especially of media confidentiality, opportunistic measures serve as a stopgap when a full suite of services cannot be negotiated all the way up the stack. Opportunistic media security for SIP is described in , which builds on the prior efforts of . With opportunistic encryption, there is an attempt to negotiate the use of encryption, but if the negotiation fails, then cleartext is used. Opportunistic encryption approaches typically have no integrity protection for the keying material.
This document contains the SIP Best-practice Recommendations Against
Network Dangers to privacY (SIPBRANDY) profile of Secure Telephone
Identity Revisited (STIR) for media
confidentiality, providing a comprehensive security solution for SIP media
that includes integrity protection for keying material and offers
application-layer assurance that media confidentiality is in place.
Various specifications that User Agents (UAs) must implement to support media
confidentiality are given in the sections below; a summary of the best
current practices appears in .
Terminology
The key words "MUST", "MUST NOT",
"REQUIRED", "SHALL",
"SHALL NOT", "SHOULD",
"SHOULD NOT",
"RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are
to be interpreted as described in BCP 14
when, and only when, they appear in all
capitals, as shown here.
Security at the SIP and SDP Layer
There are two approaches to providing confidentiality for media sessions set up with SIP: comprehensive protection and opportunistic security (as defined in ). This document only addresses comprehensive protection.
Comprehensive protection for media sessions established by SIP
requires the interaction of three protocols: the Session Initiation
Protocol (SIP), the Session Description Protocol (SDP), and the
Real-time Transport Protocol
(RTP) -- in particular, its secure profile SRTP. Broadly, it is the responsibility of SIP to provide integrity protection for the media keying attributes conveyed by SDP, and those attributes will in turn identify the keys used by endpoints in the RTP media session(s) that SDP negotiates.
Note that this framework does not apply to keys that also require confidentiality protection in the signaling layer, such as the SDP "k=" line, which MUST NOT be used in conjunction with this profile.
In that way, once SIP and SDP have exchanged the necessary information to initiate a session, media endpoints will have a strong assurance that the keys they exchange have not been tampered with by third parties and that end-to-end confidentiality is available.
To establish the identity of the endpoints of a SIP session, this
specification uses STIR. The STIR Identity header has been
designed to prevent a class of impersonation attacks that are commonly
used in robocalling, voicemail hacking, and related threats. STIR
generates a signature over certain features of SIP requests, including
header field values that contain an identity for the originator of the
request, such as the From header field or P-Asserted-Identity
field, and also over the media keys in SDP if they are present. As
currently defined, STIR provides a signature over the "a=fingerprint"
attribute, which is a fingerprint of the key used by DTLS-SRTP; consequently, STIR
only offers comprehensive protection for SIP sessions in concert with
SDP and SRTP when DTLS-SRTP is the media security service. The
underlying Personal Assertion
Token (PASSporT) object used by STIR is extensible, however, and it would be possible to provide signatures over other SDP attributes that contain alternate keying material. A profile for using STIR to provide media confidentiality is given in .
STIR Profile for Endpoint Authentication and Verification Services
STIR defines the Identity header field for SIP, which provides a cryptographic attestation of the source of communications. This document includes a profile of STIR, called the SIPBRANDY profile, where the STIR verification service will act in concert with an SRTP media endpoint to ensure that the key fingerprints, as given in SDP, match the keys exchanged to establish DTLS-SRTP. To satisfy this condition, the verification service function would in this case be implemented in the SIP User Agent Server (UAS), which would be composed with the media endpoint. If the STIR authentication service or verification service functions are implemented at an intermediary rather than an endpoint, this introduces the possibility that the intermediary could act as a man in the middle, altering key fingerprints. As this attack is not in STIR's core threat model, which focuses on impersonation rather than man-in-the-middle attacks, STIR offers no specific protections against such interference.
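The check that this profile asks of an endpoint-resident verification service can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function names are hypothetical, the "mky" entries are assumed to carry "alg" and "dig" keys with the digest in hexadecimal without colons, and requiring the signed set to equal the SDP set is a choice made by this sketch.

```python
import hashlib

def normalize(fingerprint_hex: str) -> str:
    """Drop colons and case so SDP and PASSporT digests compare equal."""
    return fingerprint_hex.replace(":", "").upper()

def sdp_fingerprints(sdp: str) -> set:
    """Collect (algorithm, digest) pairs from every a=fingerprint line."""
    found = set()
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=fingerprint:"):
            alg, _, value = line[len("a=fingerprint:"):].partition(" ")
            found.add((alg.lower(), normalize(value)))
    return found

def keys_are_bound(sdp: str, mky_claim: list, peer_cert_der: bytes) -> bool:
    """Return True only if signaling and media agree on the key.

    mky_claim is the already signature-verified "mky" array from the
    PASSporT; peer_cert_der is the certificate the peer presented in the
    DTLS handshake (both supplied by the caller in this sketch).
    """
    signed = {(e["alg"].lower(), normalize(e["dig"])) for e in mky_claim}
    offered = sdp_fingerprints(sdp)
    # Step 1: the fingerprints in SDP must be the ones that were signed.
    if not offered or offered != signed:
        return False
    # Step 2: the DTLS certificate must hash to a signed fingerprint.
    for alg, dig in signed:
        try:
            actual = hashlib.new(alg.replace("-", ""), peer_cert_der).hexdigest()
        except ValueError:
            continue  # hash algorithm not supported by hashlib
        if actual.upper() == dig:
            return True
    return False

# Placeholder bytes standing in for a real DER-encoded certificate.
cert = b"placeholder certificate bytes"
digest = hashlib.sha256(cert).hexdigest().upper()
colon_form = ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
sdp = "v=0\r\na=fingerprint:sha-256 " + colon_form + "\r\n"
mky = [{"alg": "sha-256", "dig": digest}]

print(keys_are_bound(sdp, mky, cert))           # True
print(keys_are_bound(sdp, mky, b"other cert"))  # False
```

Because both steps run in the same process as the media endpoint, an intermediary that rewrites the fingerprints in SDP without access to the signing credential causes step 1 to fail.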
The SIPBRANDY profile for media confidentiality thus shifts these responsibilities to the endpoints rather than the intermediaries. While intermediaries MAY provide the verification service function of STIR for SIPBRANDY transactions, the verification needs to be repeated at the endpoint to obtain end-to-end assurance. Intermediaries supporting this specification MUST NOT block or otherwise redirect calls if they do not trust the signing credential. The SIPBRANDY profile is based on an end-to-end trust model, so it is up to the endpoints to determine if they support signing credentials, not intermediaries.
In order to be compliant with best practices for SIP media confidentiality with comprehensive protection, UA implementations MUST implement both the authentication service and verification service roles described in . STIR authentication services MUST signal their compliance with this specification by adding the "msec" claim defined in this specification to the PASSporT payload. Implementations MUST provide key fingerprints in SDP and the appropriate signatures over them as specified in .
When generating either an offer or an answer , compliant implementations MUST include an "a=fingerprint" attribute containing the fingerprint of an appropriate key (see ).
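As an illustration of the attribute itself, the following sketch derives an "a=fingerprint" line from a DER-encoded certificate. The function name is hypothetical and the input bytes are a placeholder rather than a real certificate; the output shape (hash name, a space, then colon-separated uppercase hex octets) follows the usual SDP fingerprint syntax.

```python
import hashlib

def sdp_fingerprint_attribute(cert_der: bytes, hash_name: str = "sha-256") -> str:
    """Build an SDP "a=fingerprint" line for a DER-encoded certificate."""
    # hashlib spells the algorithm "sha256"; SDP spells it "sha-256".
    digest = hashlib.new(hash_name.replace("-", ""), cert_der).digest()
    hex_pairs = ":".join(f"{byte:02X}" for byte in digest)
    return f"a=fingerprint:{hash_name} {hex_pairs}"

# Placeholder bytes standing in for a real DER-encoded certificate.
example_cert = b"placeholder certificate bytes"
line = sdp_fingerprint_attribute(example_cert)
print(line)
```

The same certificate must then be the one presented in the DTLS handshake; otherwise the fingerprint protects nothing.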
Credentials
In order to implement the authentication service function in the UA,
SIP endpoints will need to acquire the credentials needed to
sign for their own identity. That identity is typically carried in the
From header field of a SIP request and contains either a greenfield
SIP URI (e.g., "sip:alice@example.com") or a telephone number (which
can appear in a variety of ways, e.g., "sip:+17004561212@example.com;user=phone"). contains guidance for separating the two and determining what sort of credential is needed to sign for each.
To date, few commercial certification authorities (CAs) issue
certificates for SIP URIs or telephone numbers; though work is ongoing
on systems for this purpose (such as ), it is not
yet mature enough to be recommended as a best practice. This is one
reason why STIR permits intermediaries to act as an authentication
service on behalf of an entire domain, just as in SIP a proxy server
can provide domain-level SIP service. While CAs could offer
proof-of-possession certificates for SIP similar to those used for email
-- for either greenfield identifiers or telephone
numbers -- this specification does not require their use.
For users who do not possess such certificates, DTLS-SRTP permits the use of self-signed
public keys. The profile of STIR in this document, called the
SIPBRANDY profile, employs the more relaxed authority
requirements of to allow the
use of self-signed public keys for authentication services that are
composed with UAs, by generating a certificate (per the
guidance in ) with a subject
corresponding to the user's identity. To obtain comprehensive protection with a self-signed certificate, some out-of-band verification is needed as well. Such a credential could be used for trust on first use (see ) by relying parties. Note that relying parties SHOULD NOT use certificate revocation mechanisms or real-time certificate verification systems for self-signed certificates, as they will not increase confidence in the certificate.
Users who wish to remain anonymous can instead generate self-signed certificates as described in .
Generally speaking, without access to out-of-band information about which certificates were issued to whom, it will be very difficult for relying parties to ascertain whether or not the signer of a SIP request is genuinely an "endpoint". Even the term "endpoint" is a problematic one, as SIP UAs can be composed in a variety of architectures and may not be devices under direct user control. While it is possible that techniques based on certificate transparency or similar practices could help UAs to recognize one another's certificates, those operational systems will need to ramp up with the CAs that issue credentials to end-user devices going forward.
Anonymous Communications
In some cases, the identity of the initiator of a SIP session may be withheld due to user or provider policy. Following the recommendations of , this may involve using an identity such as "anonymous@anonymous.invalid" in the identity fields of a SIP request. does not currently permit authentication services to sign for requests that supply this identity. It does, however, permit signing for valid domains, such as "anonymous@example.com", as a way of implementing an anonymization service as specified in .
Even for anonymous sessions, providing media confidentiality and
partial SDP integrity is still desirable. One-time self-signed
certificates for anonymous communications SHOULD
include a subjectAltName of "sip:anonymous@anonymous.invalid".
After a session is terminated, the
certificate SHOULD be discarded, and a new one, with
fresh keying material, SHOULD be generated before each
future anonymous call. As with self-signed certificates, relying
parties SHOULD NOT use certificate revocation
mechanisms or real-time certificate verification systems for anonymous
certificates, as they will not increase confidence in the
certificate.
Note that when using one-time anonymous self-signed certificates, any
man in the middle could strip the Identity header and replace it with
one signed by its own one-time certificate, changing the "mky"
parameters of PASSporT and any "a=fingerprint" attributes in SDP as it
chooses. This signature only provides protection against non-Identity-aware entities that might modify SDP without altering the PASSporT conveyed in the Identity header.
Connected Identity Usage
STIR provides integrity
protection for the fingerprint attributes in SIP request bodies but
not SIP responses. When a session is established, therefore, any SDP body carried by a 200-class response in the backwards direction will not be protected by an authentication service and cannot be verified. Thus, sending a secured SDP body in the backwards direction will require an extra RTT, typically a request sent in the backwards direction.
explored the problem of providing "connected
identity" to implementations of (which is obsoleted by );
uses a provisional or
mid-dialog UPDATE request in the backwards (reverse) direction to
convey an Identity header field for the recipient of an INVITE. The
procedures in are largely compatible with the
revision of the Identity header in .
However, the following need to be considered:
- The UPDATE carrying signed SDP with a fingerprint in the backwards
direction needs to be sent during dialog establishment, following the
receipt of a Provisional Response Acknowledgement (PRACK) after a provisional 1xx response.
- For use with this SIPBRANDY profile for media confidentiality, the UAS that responds to the INVITE request needs to act as an authentication service for the UPDATE sent in the backwards direction.
- Per the text in regarding the receipt at a User Agent Client (UAC)
of error code 428, 436, 437, or 438 in response to a mid-dialog
request, it is RECOMMENDED that the dialog be treated as terminated. However,
allows the retransmission of requests with repairable error conditions. In particular, an authentication service might retry a mid-dialog request rather than treating the dialog as terminated, although only one such retry is permitted.
- Note that the examples in
are based on
and will not match signatures using .
Future work may be done to revise for STIR; that work should take into account any
impacts on the SIPBRANDY profile described in this document. The use
of has some further
interactions with Interactive Connectivity Establishment (ICE) ; see .
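The single-retry behavior described in the considerations above can be sketched as follows. This is an illustrative outline only: the function name and the shape of the `send` callable are assumptions of this sketch, and how a request is actually repaired is left to the caller.

```python
# Error codes named in the text above for mid-dialog requests.
STIR_ERROR_CODES = {428, 436, 437, 438}

def send_with_single_retry(send, max_retries=1):
    """Send a mid-dialog request, retrying at most once on a STIR error.

    `send` is a hypothetical callable: it receives the status code of the
    previous failed attempt (or None on the first attempt), builds and
    sends the possibly repaired request, and returns the SIP status code
    of the final response.
    """
    attempts = 0
    status = send(None)
    while status in STIR_ERROR_CODES and attempts < max_retries:
        attempts += 1
        status = send(status)  # pass the failure so the caller can repair it
    if status in STIR_ERROR_CODES:
        return ("terminate-dialog", status)
    # Any other status is handed back to normal SIP processing.
    return ("handle-normally", status)

# First attempt fails with 436, the repaired retry succeeds.
responses = iter([436, 200])
print(send_with_single_retry(lambda previous: next(responses)))

# Both attempts fail, so the dialog is treated as terminated.
responses = iter([438, 438, 200])
print(send_with_single_retry(lambda previous: next(responses)))
```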
Authorization Decisions
grants STIR verification
services a great deal of latitude when making authorization decisions
based on the presence of the Identity header field. It is largely a
matter of local policy whether an endpoint rejects a call based on the
absence of an Identity header field, or even the presence of a header that fails an integrity check against the request.
For this SIPBRANDY profile of STIR, however, a compliant verification service that receives a dialog-forming SIP request containing an Identity header with a PASSporT type of "msec", after validating the request per the steps described in , MUST reject the request if there is any failure in that validation process with the appropriate status code per . If the request is valid, then if a terminating user accepts the request, it MUST then follow the steps in to act as an authentication service and send a signed request with the "msec" PASSporT type in its Identity header as well, in order to enable end-to-end bidirectional confidentiality.
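The terminating-side rule above can be summarized in a small decision function. The type and function names are hypothetical, and the choice of status code on failure is assumed to have been made already by the validation procedure; this sketch only shows the branching.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VerificationResult:
    """Outcome of validating the Identity header (hypothetical type)."""
    ppt: str                       # PASSporT type found in the header
    failure_status: Optional[int]  # SIP status to return, or None on success

def handle_dialog_forming_request(result: VerificationResult,
                                  user_accepts: bool) -> str:
    """Apply the SIPBRANDY authorization rule on the terminating side."""
    if result.ppt != "msec":
        # Outside this profile: ordinary STIR local policy applies.
        return "apply-local-policy"
    if result.failure_status is not None:
        # Any validation failure MUST lead to rejection.
        return f"reject:{result.failure_status}"
    if not user_accepts:
        return "decline"
    # Accepted: act as an authentication service in the backwards direction.
    return "accept-and-send-signed-msec-request"

print(handle_dialog_forming_request(VerificationResult("msec", None), True))
print(handle_dialog_forming_request(VerificationResult("msec", 438), True))
```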
For the purposes of this profile, the "msec" PASSporT type can be used
by authentication services in one of two ways: as a mandatory request
for media security or as a merely opportunistic request for media
security. As any verification service that receives an Identity header
field in a SIP request with an unrecognized PASSporT type will simply
ignore that Identity header, an authentication service will know
whether or not the terminating side supports "msec" based on whether
or not its UA receives a signed request in the backwards direction per
. If no such requests are
received, the UA may do one of two things: shut down the dialog, if
the policy of the UA requires that "msec" be supported by the
terminating side for this dialog; or, if policy permits (e.g., an
explicit acceptance by the user), allow the dialog to continue without
media security.
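On the originating side, the two uses of "msec" reduce to a policy check once dialog establishment has shown whether a signed request arrived in the backwards direction. The following is a sketch under assumed names; how policy and user acceptance are obtained is outside its scope.

```python
def originator_action(signed_backwards_request_received: bool,
                      msec_required: bool,
                      user_accepts_unprotected: bool = False) -> str:
    """Decide how the originating UA proceeds (illustrative names only)."""
    if signed_backwards_request_received:
        # The terminating side understood "msec" and signed its own keys.
        return "continue-with-media-security"
    if msec_required:
        # Mandatory use: no signed request means the dialog is shut down.
        return "shut-down-dialog"
    if user_accepts_unprotected:
        # Opportunistic use: policy permits continuing without protection.
        return "continue-without-media-security"
    return "shut-down-dialog"

print(originator_action(True, msec_required=True))
print(originator_action(False, msec_required=True))
print(originator_action(False, msec_required=False,
                        user_accepts_unprotected=True))
```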
Media Security Protocols
As there are several ways to negotiate media security with SDP, any of which might be used with either opportunistic or comprehensive protection, further guidance to implementers is needed. In , opportunistic approaches considered include DTLS-SRTP, security descriptions, and ZRTP.
Support for DTLS-SRTP is REQUIRED by this specification.
The "mky" claim of PASSporT provides integrity protection for "a=fingerprint" attributes in SDP, including cases where multiple "a=fingerprint" attributes appear in the same SDP.
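A sketch of how an authentication service might assemble the "mky" value from SDP containing several fingerprints is shown below. The entry layout ("alg" and "dig" keys, digest without colons) and the lexicographic ordering reflect a reading of the PASSporT specification that implementers should confirm against that document; the function name is hypothetical and the digests are truncated placeholders, not real hashes.

```python
import json

def mky_claim_from_sdp(sdp: str) -> list:
    """Build a "mky" array from every a=fingerprint line in an SDP body."""
    entries = set()  # a set drops fingerprints repeated across m= sections
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=fingerprint:"):
            alg, _, value = line[len("a=fingerprint:"):].partition(" ")
            # Assumption: algorithm lowercased, digest as hex without colons.
            entries.add((alg.lower(), value.replace(":", "").upper()))
    # Assumption: sort by algorithm, then digest, for a stable signed form.
    return [{"alg": alg, "dig": dig} for alg, dig in sorted(entries)]

# Truncated placeholder digests, for illustration only.
sdp = (
    "v=0\r\n"
    "a=fingerprint:sha-256 AA:BB\r\n"
    "m=audio 9 UDP/TLS/RTP/SAVP 0\r\n"
    "a=fingerprint:sha-1 CC:DD\r\n"
)
claim = mky_claim_from_sdp(sdp)
print(json.dumps({"mky": claim}))
```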
Relayed Media and Conferencing
Providing end-to-end media confidentiality for SIP is complicated by the presence of many forms of media relays. While many media relays merely proxy media to a destination, others present themselves as media endpoints and terminate security associations before re-originating media to its destination.
Centralized conference bridges are one type of entity that typically
terminates a media session in order to mux media from multiple sources
and then to re-originate the muxed media to conference
participants. In many such implementations, only hop-by-hop media
confidentiality is possible. Work is ongoing to specify a means to
encrypt both (1) the hop-by-hop media between a UA and a
centralized server and (2) the end-to-end media between UAs,
but it is not sufficiently mature at this time to become a best practice. Those protocols are expected to identify their own best-practice recommendations as they mature.
Another class of entities that might relay SIP media are Back-to-Back
User Agents (B2BUAs). If a B2BUA follows the guidance in , it may be possible for B2BUAs
to act as media relays while still permitting end-to-end
confidentiality between UAs.
Ultimately, if an endpoint can decrypt media it receives, then that
endpoint can forward the decrypted media without the knowledge or
consent of the media's originator. No media confidentiality mechanism
can protect against these sorts of relayed disclosures or against a
legitimate endpoint that can decrypt media and record a copy to be sent
elsewhere (see ).
ICE and Connected Identity
Providing confidentiality for media with comprehensive protection requires careful timing of when media streams should be sent and when a user interface should signify that confidentiality is in place.
In order to best enable end-to-end connectivity between UAs and to
avoid media relays as much as possible, implementations of this
specification MUST support ICE . To speed up call
establishment, it is RECOMMENDED that implementations
support Trickle ICE .
Note that in the comprehensive protection case, the use of connected identity with ICE implies that the answer containing the key fingerprints, and thus the STIR signature, will come in an UPDATE sent in the backwards direction, following a provisional response and a PRACK, rather than in any earlier SDP body. Only at such a time as that UPDATE is received will the media keys be considered exchanged in this case.
Similarly, in order to prevent, or at least mitigate, the
denial-of-service attack described in , this specification incorporates
best practices for ensuring that recipients of media flows have
consented to receive such flows. Implementations of this specification
MUST implement the Session Traversal Utilities for NAT (STUN) usage for consent freshness defined in .
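The bookkeeping behind consent freshness can be sketched as a small tracker. The class name is hypothetical, and the timer values (a 30-second consent lifetime and checks roughly every 5 seconds, randomized between 0.8 and 1.2 of that interval) are stated here from memory of the consent freshness mechanism; implementers should take the authoritative values from that specification.

```python
import random

CONSENT_LIFETIME = 30.0    # seconds; assumption, see lead-in
BASE_CHECK_INTERVAL = 5.0  # seconds; assumption, see lead-in

class ConsentTracker:
    """Tracks whether a peer's consent to receive media is still fresh."""

    def __init__(self, now: float):
        # Consent is assumed to start when ICE connectivity checks succeed.
        self.last_valid_response = now

    def on_valid_binding_response(self, now: float) -> None:
        """Record an authenticated STUN binding response from the peer."""
        self.last_valid_response = now

    def may_send_media(self, now: float) -> bool:
        """Media may flow only while consent has not expired."""
        return (now - self.last_valid_response) < CONSENT_LIFETIME

    def next_check_delay(self, rng=random) -> float:
        """Randomized delay before the next consent check is sent."""
        return BASE_CHECK_INTERVAL * rng.uniform(0.8, 1.2)

tracker = ConsentTracker(now=0.0)
print(tracker.may_send_media(now=29.0))  # True
print(tracker.may_send_media(now=31.0))  # False
tracker.on_valid_binding_response(now=31.0)
print(tracker.may_send_media(now=40.0))  # True
```

Times are passed in explicitly so that the sketch can be driven by any clock, including a simulated one in tests.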
Best Current Practices
The following are the best practices for SIP UAs to provide media confidentiality for SIP sessions.
- Implementations MUST support the SIPBRANDY
profile as defined in and
signal such support in PASSporT via the "msec" header element.
- Implementations MUST follow the authorization
decision behavior described in .
- Implementations MUST support DTLS-SRTP for
management of keys, as described in .
- Implementations MUST support ICE and the STUN
consent freshness mechanism, as specified in .
IANA Considerations
This specification defines a new value for the "Personal Assertion Token
(PASSporT) Extensions" registry called "msec". IANA has added
the entry to the registry with a value pointing to this document.
Security Considerations
This document describes the security features that provide media sessions established with SIP with confidentiality, integrity, and authentication.
References
Normative References
Trickle ICE: Incremental Provisioning of Candidates for the
Interactive Connectivity Establishment (ICE) Protocol
Session Description Protocol (SDP) Offer/Answer Procedures for Interactive Connectivity Establishment (ICE)
A Session Initiation Protocol (SIP) Usage for Incremental
Provisioning of Candidates for the Interactive Connectivity
Establishment (Trickle ICE)
Informative References
Acknowledgements
We thank , , , and for contributions to this problem statement and framework. We
thank and for their careful review.