Basic operation
RTMP is a TCP-based protocol which maintains persistent connections and allows low-latency communication. To deliver streams smoothly and transmit as much information as possible, it splits streams into fragments, whose size is negotiated dynamically between the client and server. Sometimes the size is kept unchanged; the default fragment sizes are 64 bytes for audio data, and 128 bytes for video data and most other data types. Fragments from different streams may then be interleaved and multiplexed over a single connection. With longer data chunks, the protocol thus carries only a one-byte header per fragment, incurring very little overhead. However, in practice, individual fragments are not typically interleaved. Instead, the interleaving and multiplexing is done at the packet level, with RTMP packets across several different active channels being interleaved in such a way as to ensure that each channel meets its bandwidth, latency, and other quality-of-service requirements. Packets interleaved in this fashion are treated as indivisible, and are not interleaved on the fragment level.
RTMP defines several virtual channels on which packets may be sent and received, and which operate independently of each other. For example, there is a channel for handling RPC requests and responses, a channel for video stream data, a channel for audio stream data, a channel for out-of-band control messages (fragment size negotiation, etc.), and so on. During a typical RTMP session, several channels may be active simultaneously.
When RTMP data is encoded, a packet header is generated. The packet header specifies, among other things, the ID of the channel on which it is to be sent, a timestamp of when it was generated (if necessary), and the size of the packet's payload. This header is then followed by the actual payload content of the packet, which is fragmented according to the currently agreed-upon fragment size before it is sent over the connection.
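The fragmentation step described above can be sketched as follows. This is a minimal illustration rather than a complete encoder: the function name `fragment_payload` is hypothetical, the packet header is treated as an opaque byte string, and the one-byte continuation header is assumed to be `0xC0` combined with a channel ID in the range 2 to 63.

```python
def fragment_payload(header: bytes, payload: bytes, channel_id: int,
                     fragment_size: int = 128) -> bytes:
    """Return the wire bytes for one packet: header, then fragmented payload.

    Assumption: every fragment after the first is preceded by a single
    continuation byte (0xC0 | channel_id). Only the payload is counted
    against fragment_size; the packet header is never split.
    """
    if not 2 <= channel_id <= 63:
        raise ValueError("this sketch only handles one-byte channel IDs (2-63)")
    if fragment_size <= 0:
        raise ValueError("fragment_size must be positive")

    continuation = bytes([0xC0 | channel_id])
    wire = bytearray(header)
    for offset in range(0, len(payload), fragment_size):
        if offset:  # every fragment except the first gets the one-byte header
            wire += continuation
        wire += payload[offset:offset + fragment_size]
    return bytes(wire)
```

With a 300-byte payload and the 128-byte fragment size, this produces three fragments and therefore two continuation bytes, which illustrates why the per-fragment overhead stays small.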
The packet header itself is never fragmented, and its size does not count towards the data in the packet's first fragment. In other words, only the actual packet payload (the media data) is subject to fragmentation.
At a higher level, the RTMP encapsulates MP3 or
Encryption
RTMP sessions may be encrypted using either of two methods:
* Using industry standard
HTTP tunneling
In RTMP Tunneled (RTMPT), RTMP data is encapsulated and exchanged via HTTP, and messages from the client (the media player, in this case) are addressed to port 80 (the default for HTTP) on the server. While the messages in RTMPT are larger than the equivalent non-tunneled RTMP messages due to HTTP headers, RTMPT may facilitate the use of RTMP in scenarios where the use of non-tunneled RTMP would otherwise not be possible, such as when the client is behind a firewall that blocks non-HTTP and non-HTTPS outbound traffic. The protocol works by sending commands through the POST URL, and AMF messages through the POST body. An example is POST /open/1 HTTP/1.1 for a connection to be opened.
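As a rough sketch of how such a request might be assembled, the following helper places the command in the POST URL and the message bytes in the POST body. The function name `build_rtmpt_request`, the host `media.example.com`, the `application/x-fcs` content type, and the one-byte placeholder body are assumptions made for illustration; they are not taken from the text above.

```python
def build_rtmpt_request(command_path: str, body: bytes, host: str) -> bytes:
    """Build a raw HTTP/1.1 POST request carrying an RTMPT command.

    The command goes in the request URL and the message bytes go in the
    request body. The Content-Type value is an assumption of this sketch.
    """
    head = "\r\n".join([
        f"POST {command_path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-fcs",
        f"Content-Length: {len(body)}",
        "",  # blank line that ends the header block
        "",
    ])
    return head.encode("ascii") + body


# Hypothetical usage: ask the server to open a tunneled connection.
request = build_rtmpt_request("/open/1", b"\x00", "media.example.com")
```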
Specification document and patent license
Adobe has released a specification for version 1.0 of the protocol, dated 21 December 2012. The web landing page leading to that specification notes that "To benefit customers who want to protect their content, the open RTMP specification does not include Adobe's unique secure RTMP measures". A document accompanying the Adobe specification grants a "non-exclusive, royalty-free, nontransferable, non-sublicensable, personal, worldwide" patent license to all implementations of the protocol, with two restrictions: one forbids use for intercepting streaming data ("any technology that intercepts streaming video, audio and/or data content for storage in any device or medium"), and another prohibits circumvention of "technological measures for the protection of audio, video and/or data content, including any of Adobe’s secure RTMP measures".
Patents and related litigation
Stefan Richter, author of some books on Flash, noted in 2008 that while Adobe is vague as to which patents apply to RTMP, appears to be one of them. In 2011, Adobe sued Wowza Media Systems claiming, among other things, infringement of their RTMP patents. In 2015, Adobe and Wowza announced that the lawsuits had been settled and dismissed with prejudice.
Packet structure
Invoke Message Structure (0x14, 0x11)
Some of the message types shown above, such as Ping and Set Client/Server Bandwidth, are considered low-level RTMP protocol messages which do not use the AMF encoding format. Command messages, on the other hand, whether AMF0 (message type 0x14) or AMF3 (0x11), use the AMF format and have the general form shown below:
(String) command name
(Number) transaction ID
(Mixed) argument, e.g. Null, String, Object
The transaction ID is used for commands that can have a reply. The value can be either a string like in the example above or one or more objects, each composed of a set of key/value pairs where the keys are always encoded as strings while the values can be any AMF data type, including complex types like arrays.
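The general form of a command message can be illustrated with a small AMF0 encoder. This is a minimal sketch covering only the number, boolean, string, object, and null types; the function names are hypothetical, and the type markers used (0x00, 0x01, 0x02, 0x03, 0x05, and the 0x00 0x00 0x09 object terminator) are those of AMF0, which is not described in the text above.

```python
import struct


def amf0_string(value: str) -> bytes:
    """AMF0 string: marker 0x02, 16-bit big-endian length, UTF-8 bytes."""
    data = value.encode("utf-8")
    return b"\x02" + struct.pack(">H", len(data)) + data


def amf0_number(value: float) -> bytes:
    """AMF0 number: marker 0x00, 64-bit big-endian IEEE 754 double."""
    return b"\x00" + struct.pack(">d", value)


def amf0_value(value) -> bytes:
    """Encode None, bool, int/float, str, or dict as an AMF0 value."""
    if value is None:
        return b"\x05"
    if isinstance(value, bool):  # checked before int: bool is an int subclass
        return b"\x01" + (b"\x01" if value else b"\x00")
    if isinstance(value, (int, float)):
        return amf0_number(float(value))
    if isinstance(value, str):
        return amf0_string(value)
    if isinstance(value, dict):
        out = bytearray(b"\x03")
        for key, item in value.items():
            raw_key = key.encode("utf-8")  # keys are always strings, no marker
            out += struct.pack(">H", len(raw_key)) + raw_key + amf0_value(item)
        return bytes(out + b"\x00\x00\x09")  # empty key + object-end marker
    raise TypeError(f"unsupported type for this sketch: {type(value).__name__}")


def encode_command(name: str, transaction_id: float, *arguments) -> bytes:
    """Command body: name (String), transaction ID (Number), then arguments."""
    body = amf0_string(name) + amf0_number(transaction_id)
    return body + b"".join(amf0_value(argument) for argument in arguments)
```

For example, `encode_command("createStream", 2, None)` yields a string, a number, and a null value in that order, matching the three-part form above.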
Control Message Structure (0x04)
Control messages are not AMF encoded. They start with a stream ID of 0x02, which implies a full (type 0) header, and have a message type of 0x04. The header is followed by six bytes, which are interpreted as follows:
* #0-1 - Control Type.
* #2-3 - Second Parameter (this has meaning in specific Control Types)
* #4-5 - Third Parameter (same)
The first two bytes of the message body define the Ping Type, which can take the following values:
* Type 0 - Clear Stream: Sent when the connection is established and carries no further data
* Type 1 - Clear the Buffer.
* Type 2 - Stream Dry.
* Type 3 - The client's buffer time. The third parameter holds the value in milliseconds.
* Type 4 - Reset a stream.
* Type 6 - Ping the client from server. The second parameter is the current time.
* Type 7 - Pong reply from client. The second parameter is the time when the client receives the Ping.
* Type 8 - UDP Request.
* Type 9 - UDP Response.
* Type 10 - Bandwidth Limit.
* Type 11 - Bandwidth.
* Type 12 - Throttle Bandwidth.
* Type 13 - Stream Created.
* Type 14 - Stream Deleted.
* Type 15 - Set Read Access.
* Type 16 - Set Write Access.
* Type 17 - Stream Meta Request.
* Type 18 - Stream Meta Response.
* Type 19 - Get Segment Boundary.
* Type 20 - Set Segment Boundary.
* Type 21 - On Disconnect.
* Type 22 - Set Critical Link.
* Type 23 - Disconnect.
* Type 24 - Hash Update.
* Type 25 - Hash Timeout.
* Type 26 - Hash Request.
* Type 27 - Hash Response.
* Type 28 - Check Bandwidth.
* Type 29 - Set Audio Sample Access.
* Type 30 - Set Video Sample Access.
* Type 31 - Throttle Begin.
* Type 32 - Throttle End.
* Type 33 - DRM Notify.
* Type 34 - RTMFP Sync.
* Type 35 - Query IHello.
* Type 36 - Forward IHello.
* Type 37 - Redirect IHello.
* Type 38 - Notify EOF.
* Type 39 - Proxy Continue.
* Type 40 - Proxy Remove Upstream.
* Type 41 - RTMFP Set Keepalives.
* Type 46 - Segment Not Found.
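Taking the six-byte layout exactly as listed above, a control message body could be unpacked as in the following sketch. The names `ControlMessage`, `parse_control_body`, and `describe` are hypothetical, big-endian byte order is assumed, and only a handful of the listed types are given names here.

```python
import struct
from typing import NamedTuple

# A few of the control types listed above; the rest fall back to "Type N".
CONTROL_TYPE_NAMES = {
    0: "Clear Stream",
    1: "Clear the Buffer",
    2: "Stream Dry",
    3: "Client buffer time",
    4: "Reset a stream",
    6: "Ping",
    7: "Pong",
}


class ControlMessage(NamedTuple):
    control_type: int  # bytes 0-1
    second: int        # bytes 2-3
    third: int         # bytes 4-5


def parse_control_body(body: bytes) -> ControlMessage:
    """Split the six bytes after the header into three 16-bit fields."""
    if len(body) < 6:
        raise ValueError("a control message body needs at least six bytes")
    return ControlMessage(*struct.unpack(">HHH", body[:6]))


def describe(message: ControlMessage) -> str:
    """Return a readable name for the control type."""
    return CONTROL_TYPE_NAMES.get(message.control_type,
                                  f"Type {message.control_type}")
```

A body of `00 03 00 01 0B B8` would then be read as type 3 with a third parameter of 3000, that is, a client buffer time of 3000 milliseconds under this layout.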
Pong is the name for a reply to a Ping, with the values used as seen above.
ServerBw/ClientBw Message Structure (0x05, 0x06)
These messages concern the client upstream and server downstream bit rates. The body is composed of four bytes holding the bandwidth value, with a possible one-byte extension that sets the Limit Type. The Limit Type can take one of three values: hard, soft, or dynamic (either soft or hard).
Set Chunk Size (0x01)
The new chunk size is the value received in the four bytes of the body. The default value is 128 bytes, and the message is sent only when a change is wanted.
Protocol
Handshake
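As a companion to the description in this section, the client-side handshake packets can be sketched as follows. This is a minimal sketch, not a complete handshake: the function names are hypothetical, big-endian byte order is assumed for the timestamp fields, and no network I/O is performed.

```python
import os
import struct

HANDSHAKE_SIZE = 1536  # size of C1, S1, C2 and S2 in bytes


def make_c0() -> bytes:
    """C0: a single byte carrying the protocol version, 0x03."""
    return b"\x03"


def make_c1(epoch: int = 0) -> bytes:
    """C1: 4-byte epoch timestamp, 4 zero bytes, then random filler."""
    timestamp = struct.pack(">I", epoch & 0xFFFFFFFF)
    return timestamp + b"\x00" * 4 + os.urandom(HANDSHAKE_SIZE - 8)


def make_echo(peer_packet: bytes, received_at: int) -> bytes:
    """C2 (or S2): echo of the peer's S1 (or C1), with the second four
    bytes replaced by the time at which that packet was received."""
    if len(peer_packet) != HANDSHAKE_SIZE:
        raise ValueError("expected a 1536-byte handshake packet")
    stamp = struct.pack(">I", received_at & 0xFFFFFFFF)
    return peer_packet[:4] + stamp + peer_packet[8:]
```

A client would send `make_c0() + make_c1()` without waiting for the server, and answer the server's S1 with `make_echo(s1, now)`.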
After a TCP connection is established, an RTMP connection is set up by first performing a handshake through the exchange of three packets from each side (also referred to as chunks in the official documentation). These are referred to in the official spec as C0-2 for the packets sent by the client and S0-2 for those sent by the server, and are not to be confused with RTMP packets, which can be exchanged only after the handshake is complete. These packets have a structure of their own, and C1 contains a field setting the "epoch" timestamp; since this can be set to zero, as is done in third-party implementations, the packet can be simplified. The client initialises the connection by sending the C0 packet with a constant value of 0x03 representing the current protocol version. It follows immediately with C1, without waiting for S0 to be received first. C1 contains 1536 bytes: the first four represent the epoch timestamp, the second four are all 0, and the rest are random (and can be set to 0 in third-party implementations). C2 and S2 are an echo of S1 and C1 respectively, except with the second four bytes being the time the respective message was received (instead of 0). After C2 and S2 are received, the handshake is considered complete.
Connect
At this point, the client and server can negotiate a connection by exchanging AMF-encoded messages. These include key-value pairs which relate to variables that are needed for a connection to be established. An example message from the client is:
The flashVer string is the same as returned by the ActionScript getversion() function. The audioCodec and videoCodec are encoded as doubles and their meaning can be found in the original spec. The same is true for the videoFunction variable, which in this case is the self-explanatory SUPPORT_VID_CLIENT_SEEK constant. Of special interest is the objectEncoding, which defines whether the rest of the communication will make use of the extended AMF3 format or not. As version 3 is the current default, the Flash client has to be told explicitly in ActionScript code to use AMF0 if that is requested. The server then replies with a ServerBW, a ClientBW and a SetPacketSize message sequence, finally followed by an Invoke, with an example message.
The clientId will establish a number for the session to be started by the connection. Object encoding must match the value previously set.
Play video
To start a video stream, the client sends a "createStream" invocation followed by a ping message, followed by a "play" invocation with the file name as argument. The server will then reply with a series of "onStatus" commands followed by the video data as encapsulated within RTMP messages. After a connection is established, media is sent by encapsulating the content of FLV tags into RTMP messages of type 8 and 9 for audio and video, respectively.
HTTP tunneling (RTMPT)
This refers to the HTTP-tunneled version of the protocol. It communicates over port 80 and passes the AMF data inside HTTP POST requests and responses. The sequence for connection is as follows:
Software implementations
RTMP is implemented at these three stages:
* Live video encoder
* Live and on-demand media streaming server
* Live and on-demand client
rtmpdump
The open-source RTMP client command-line tool rtmpdump is designed to play back or save to disk the full RTMP stream, including the RTMPE protocol Adobe uses for encryption. RTMPdump runs on Linux, Android, Solaris, and most other Unix-derived operating systems, as well as Microsoft Windows. Originally supporting all versions of 32-bit Windows including Windows 98, from version 2.2 the software will run only on Windows XP and above (although earlier versions remain fully functional). Packages of the rtmpdump suite of software are available in the major open-source repositories (Linux distributions). These include the front-end apps "rtmpdump", "rtmpsrv" and "rtmpsuck". Development of RTMPdump was restarted in October 2009, outside the United States, at the MPlayer site. The current version features greatly improved functionality, and has been rewritten to take advantage of the benefits of the
FLVstreamer
FLVstreamer is a fork of RTMPdump without the code that Adobe claims violates the DMCA in the USA. It was developed as a response to Adobe's attempt in 2008 to suppress RTMPdump. FLVstreamer is an RTMP client that will save a stream of audio or video content from any RTMP server to disk, if encryption (RTMPE) is not enabled on the stream.
See also
References
External links