Interworking between the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP): Media Sessions
Cisco
psaintan@cisco.com
Applications
Keywords: SIP, XMPP, Jingle

This document defines a bi-directional protocol mapping for use by gateways that enable the exchange of media signalling messages between systems that implement the Jingle extensions to the Extensible Messaging and Presence Protocol (XMPP) and those that implement the Session Initiation Protocol (SIP).

The Session Initiation Protocol is a widely deployed technology for the management of media sessions (such as voice calls) over the Internet. SIP itself provides a signalling channel (typically via the User Datagram Protocol), over which two or more parties can exchange messages for the purpose of negotiating a media session that uses a dedicated media channel such as the Real-time Transport Protocol.

The Extensible Messaging and Presence Protocol also provides a signalling channel, typically via the Transmission Control Protocol. Given the significant differences between XMPP and SIP, it is difficult to combine the two technologies in a single user agent. Therefore, developers wishing to add media session capabilities to XMPP clients have defined an XMPP-specific negotiation protocol called Jingle.

However, Jingle has been designed to map easily to SIP for communication through gateways or other transformation mechanisms. Therefore, consistent with existing specifications for mapping between SIP and XMPP (see the core SIP-XMPP interworking specification and other specifications in that "series"), this document describes a bi-directional protocol mapping for use by gateways that enable the exchange of media signalling messages between systems that implement SIP and those that implement the XMPP Jingle extensions.

Note: The capitalized key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

As mentioned, Jingle was designed in part to enable straightforward protocol mapping between XMPP and SIP.
However, given the significantly different technology assumptions underlying XMPP and SIP, Jingle is naturally different from SIP in several important respects:

   o  Base SIP messages and headers use a plaintext format similar in some ways to the Hypertext Transfer Protocol, whereas Jingle messages are pure XML. Mappings between SIP headers and Jingle message syntax are provided below.

   o  The SIP payloads defining session semantics use the Session Description Protocol, whereas the equivalent Jingle payloads are defined as XML child elements of the Jingle <content/> element. However, the Jingle specifications defining such child elements specify mappings to SDP for all Jingle syntax, making the mapping relatively straightforward.

   o  The SIP signalling channel is typically transported over UDP, whereas the signalling channel for Jingle is XMPP over TCP. Mapping between the transport layers typically happens within a gateway using techniques below the application level, and therefore is not addressed in this specification.

Jingle is designed in a modular fashion, so that session description data is generally carried in a payload within the generic Jingle elements, i.e., the <jingle/> element and its <content/> child. The following example illustrates this structure, where the XMPP stanza is a request to initiate an audio session using RTP over a raw UDP transport.

In the foregoing example, the syntax and semantics of the <jingle/> and <content/> elements are defined in the core Jingle specification, the syntax and semantics of the <description/> element are defined in the Jingle RTP Sessions specification, and the syntax and semantics of the <transport/> element are defined in the Jingle Raw UDP Transport specification.
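A stanza with this modular structure can be sketched in Python as follows. This is a minimal illustration rather than the normative example: the namespace URIs, the resource parts of the addresses, the session ID, the payload type, and the candidate address are all assumptions chosen for the sketch, and the helper name build_session_initiate is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs, for illustration only; check the Jingle
# specifications for the values an implementation must use.
NS_JINGLE = "urn:xmpp:jingle:1"
NS_RTP = "urn:xmpp:jingle:apps:rtp:1"
NS_RAW_UDP = "urn:xmpp:jingle:transports:raw-udp:1"


def build_session_initiate(initiator, responder, sid, ip, port):
    """Build an <iq/> carrying <jingle/> -> <content/> -> description + transport."""
    iq = ET.Element("iq", {"from": initiator, "to": responder,
                           "type": "set", "id": "jingle1"})
    # Generic Jingle layer: the <jingle/> element and its <content/> child.
    jingle = ET.SubElement(iq, f"{{{NS_JINGLE}}}jingle",
                           {"action": "session-initiate",
                            "initiator": initiator, "sid": sid})
    content = ET.SubElement(jingle, f"{{{NS_JINGLE}}}content",
                            {"creator": "initiator", "name": "voice"})
    # Application format payload: an RTP audio description.
    description = ET.SubElement(content, f"{{{NS_RTP}}}description",
                                {"media": "audio"})
    ET.SubElement(description, f"{{{NS_RTP}}}payload-type",
                  {"id": "96", "name": "speex", "clockrate": "8000"})
    # Transport method payload: a single raw UDP candidate (IP and port only).
    transport = ET.SubElement(content, f"{{{NS_RAW_UDP}}}transport")
    ET.SubElement(transport, f"{{{NS_RAW_UDP}}}candidate",
                  {"ip": ip, "port": str(port)})
    return iq


# Hypothetical values throughout; only the bare addresses come from the text.
stanza = build_session_initiate("juliet@example.com/balcony",
                                "romeo@example.net/orchard",
                                "a73sjjvkla37jfea", "192.0.2.10", 13540)
print(ET.tostring(stanza, encoding="unicode"))
```

The point of the sketch is the nesting: the generic elements stay the same while the <description/> and <transport/> payloads can be swapped for other application formats or transport methods.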
Other <description/> elements are defined in specifications for the appropriate application types (see for example ) and other <transport/> elements are defined in the specifications for appropriate transport methods (see for example the Jingle ICE-UDP Transport Method specification, which defines an XMPP profile of ICE).

At the core Jingle layer, the following mappings are defined.

The 'action' attribute of the <jingle/> element has nine allowable values. In general they should be mapped as shown in the following table, with some exceptions as described herein.

A Jingle application format for audio exchange via RTP is specified in the Jingle RTP Sessions specification. This application format effectively maps to the "RTP/AVP" profile specified in the RTP Profile for Audio and Video Conferences with Minimal Control, where the media type is "audio" and the specific mappings to SDP syntax are provided in the Jingle RTP Sessions specification.

A Jingle application format for video exchange via RTP is specified in the Jingle RTP Sessions specification. This application format effectively maps to the "RTP/AVP" profile specified in the RTP Profile for Audio and Video Conferences with Minimal Control, where the media type is "video" and the specific mappings to SDP syntax are provided in the Jingle RTP Sessions specification.

A basic Jingle transport method for exchanging media over UDP is specified in the Jingle Raw UDP Transport specification. This transport method involves the negotiation of an IP address and port only, and does not provide NAT traversal. The Jingle 'ip' attribute maps to the connection-address parameter of the SDP c= line and the 'port' attribute maps to the port parameter of the SDP m= line.

A more advanced Jingle transport method for exchanging media over UDP is specified in the Jingle ICE-UDP Transport Method specification. Under ideal conditions this transport method provides NAT traversal by following the Interactive Connectivity Establishment (ICE) methodology specified in the ICE specification. The relevant SDP mappings are provided in the Jingle ICE-UDP Transport Method specification.

The following sections provide sample scenarios (or "call flows") that illustrate the principles of interworking from Jingle to SIP. These scenarios are not exhaustive.

The following is the protocol flow for a basic voice chat in which an XMPP user (juliet@example.com) is the initiator and a SIP user (romeo@example.net) is the responder. The voice chat is consummated through a gateway.
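The raw UDP mapping described above (the Jingle 'ip' attribute to the connection-address of the SDP c= line, and the 'port' attribute to the port of the SDP m= line) can be sketched in Python as follows. The function name raw_udp_candidate_to_sdp, the default payload type number, and the sample address are assumptions made for illustration; a real gateway would also need to carry over payload type details and any other attributes.

```python
import ipaddress


def raw_udp_candidate_to_sdp(ip, port, media="audio", payload_ids=(96,)):
    """Return the SDP m= and c= lines for one Jingle raw UDP candidate."""
    # The SDP address type depends on the IP version of the candidate.
    addrtype = "IP6" if ipaddress.ip_address(ip).version == 6 else "IP4"
    formats = " ".join(str(pt) for pt in payload_ids)
    # 'port' maps to the port parameter of the m= line.
    m_line = f"m={media} {int(port)} RTP/AVP {formats}"
    # 'ip' maps to the connection-address parameter of the c= line.
    c_line = f"c=IN {addrtype} {ip}"
    return [m_line, c_line]


# Hypothetical candidate values, as they might appear in a <candidate/> element.
for line in raw_udp_candidate_to_sdp("192.0.2.10", "13540"):
    print(line)
```

The c= line is emitted after the m= line here so that it applies at the media level; a gateway could equally place a single c= line at the session level when all media share one address.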
To simplify the example, the transport method negotiated is "raw user datagram protocol" as specified in the Jingle Raw UDP Transport specification.

The packet flow is as follows.

First, the XMPP user sends a Jingle session-initiate request to the SIP user.

The gateway returns an XMPP IQ-result to the initiator on behalf of the responder.

The gateway transforms the Jingle session-initiate action into a SIP INVITE.

The responder returns a SIP 180 Ringing message.

The gateway transforms the ringing message into XMPP syntax.

The initiator returns an IQ-result acknowledging receipt of the ringing message, which is used only by the gateway and not transformed into SIP syntax.

The responder sends a SIP 200 OK to the initiator.

The gateway transforms the 200 OK into a Jingle session-accept action.

If the payload types and transport candidate can be successfully used by both parties, then the initiator acknowledges the session-accept action.

The parties now begin to exchange media. In this case they would exchange audio using the Speex codec at a clockrate of 8000 since that is the highest-priority codec for the responder (as determined by the XML order of the <payload-type/> children).

The parties may continue the session as long as desired.

Eventually, one of the parties (in this case the responder) terminates the session.

The gateway transforms the SIP BYE into XMPP syntax.

The initiator returns an IQ-result acknowledging receipt of the session termination, which is used only by the gateway and not transformed into SIP syntax.

To follow.

Detailed security considerations for session management are given for SIP in and for XMPP in (see also ).

Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols
This document describes a protocol for Network Address Translator (NAT) traversal for UDP-based multimedia sessions established with the offer/answer model. This protocol is called Interactive Connectivity Establishment (ICE).
ICE makes use of the Session Traversal Utilities for NAT (STUN) protocol and its extension, Traversal Using Relay NAT (TURN). ICE can be used by any protocol utilizing the offer/answer model, such as the Session Initiation Protocol (SIP).

Jingle
scottlu@google.com, jbeda@google.com, stpeter@jabber.org, robert.mcqueen@collabora.co.uk, seanegan@google.com, jhildebr@cisco.com

Jingle RTP Sessions
scottlu@google.com, seanegan@google.com, robert.mcqueen@collabora.co.uk

Jingle ICE-UDP Transport Method
jbeda@google.com, scottlu@google.com, jhildebrand@jabber.com, seanegan@google.com

Jingle Raw UDP Transport
jbeda@google.com, scottlu@google.com, jhildebrand@jabber.com, seanegan@google.com

RTP Profile for Audio and Video Conferences with Minimal Control
This document describes a profile called "RTP/AVP" for the use of the real-time transport protocol (RTP), version 2, and the associated control protocol, RTCP, within audio and video multiparticipant conferences with minimal control. It provides interpretations of generic fields within the RTP specification suitable for audio and video conferences. In particular, this document defines a set of default mappings from payload type numbers to encodings. This document also describes how audio and video data may be carried within RTP. It defines a set of standard encodings and their names when used within RTP. The descriptions provide pointers to reference implementations and the detailed standards. This document is meant as an aid for implementors of audio, video and other real-time multimedia applications. This memorandum obsoletes RFC 1890. It is mostly backwards-compatible except for functions removed because two interoperable implementations were not found. The additions to RFC 1890 codify existing practice in the use of payload formats under this profile and include new payload formats defined since RFC 1890 was published. [STANDARDS TRACK]

SDP: Session Description Protocol
This memo defines the Session Description Protocol (SDP).
SDP is intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. [STANDARDS TRACK]

SIP: Session Initiation Protocol

Interworking between the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP): Core
As a foundation for the definition of application-specific, bi-directional protocol mappings between the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP), this document specifies the architectural assumptions underlying such mappings as well as the mapping of addresses and error conditions.

Key words for use in RFCs to Indicate Requirement Levels
Harvard University, 1350 Mass. Ave., Cambridge, MA 02138, +1 617 495 3864
In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. Authors who follow these guidelines should incorporate this phrase near the beginning of their document:
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Note that the force of these words is modified by the requirement level of the document in which they are used.

Extensible Messaging and Presence Protocol (XMPP): Core
Jabber Software Foundation

Hypertext Transfer Protocol -- HTTP/1.1
Department of Information and Computer Science, University of California, Irvine, Irvine, CA 92697-3425, +1(949)824-1715, fielding@ics.uci.edu
World Wide Web Consortium, MIT Laboratory for Computer Science, NE43-356, 545 Technology Square, Cambridge, MA 02139, +1(617)258-8682, jg@w3.org
Compaq Computer Corporation, Western Research Laboratory, 250 University Avenue, Palo Alto, CA 94305, mogul@wrl.dec.com
World Wide Web Consortium, MIT Laboratory for Computer Science, NE43-356, 545 Technology Square, Cambridge, MA 02139, +1(617)258-8682, frystyk@w3.org
Xerox Corporation, MIT Laboratory for Computer Science, NE43-356, 3333 Coyote Hill Road, Palo Alto, CA 94034, masinter@parc.xerox.com
Microsoft Corporation, 1 Microsoft Way, Redmond, WA 98052, paulle@microsoft.com
World Wide Web Consortium, MIT Laboratory for Computer Science, NE43-356, 545 Technology Square, Cambridge, MA 02139, +1(617)258-8682, timbl@w3.org
The Hypertext Transfer Protocol (HTTP) is an application-level
protocol for distributed, collaborative, hypermedia information
systems. It is a generic, stateless, protocol which can be used for
many tasks beyond its use for hypertext, such as name servers and
distributed object management systems, through extension of its
request methods, error codes and headers. A feature of HTTP is
the typing and negotiation of data representation, allowing systems
to be built independently of the data being transferred.
HTTP has been in use by the World-Wide Web global information
initiative since 1990. This specification defines the protocol
referred to as "HTTP/1.1", and is an update to RFC 2068.

RTP: A Transport Protocol for Real-Time Applications
This memorandum describes RTP, the real-time transport protocol. RTP provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. RTP does not address resource reservation and does not guarantee quality-of-service for real-time services. The data transport is augmented by a control protocol (RTCP) to allow monitoring of the data delivery in a manner scalable to large multicast networks, and to provide minimal control and identification functionality. RTP and RTCP are designed to be independent of the underlying transport and network layers. The protocol supports the use of RTP-level translators and mixers. Most of the text in this memorandum is identical to RFC 1889 which it obsoletes. There are no changes in the packet formats on the wire, only changes to the rules and algorithms governing how the protocol is used. The biggest change is an enhancement to the scalable timer algorithm for calculating when to send RTCP packets in order to minimize transmission in excess of the intended rate when many participants join a session simultaneously. [STANDARDS TRACK]

Transmission Control Protocol
University of Southern California (USC)/Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA 90291, US

User Datagram Protocol
University of Southern California (USC)/Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA 90291, US, +1 213 822 1511