The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. It originated in the initial network implementation in which it complemented the Internet Protocol (IP). Therefore, the entire suite is commonly referred to as TCP/IP. TCP provides reliable, ordered, and error-checked delivery of a stream of octets (bytes) between applications running on hosts communicating via an IP network. Major internet applications such as the World Wide Web, email, remote administration, and file transfer rely on TCP, which is part of the transport layer of the TCP/IP suite. SSL/TLS often runs on top of TCP.
TCP is connection-oriented, and a connection between client and server is established before data can be sent. The server must be listening (passive open) for connection requests from clients before a connection is established. The three-way handshake (active open), retransmission, and error detection add to reliability but lengthen latency. Applications that do not require reliable data stream service may use the User Datagram Protocol (UDP) instead, which provides a connectionless datagram service that prioritizes time over reliability. TCP employs network congestion avoidance. However, there are vulnerabilities in TCP, including denial of service, connection hijacking, TCP veto, and reset attack.
Historical origin
In May 1974, Vint Cerf and Bob Kahn described an internetworking protocol for sharing resources using packet switching among network nodes. The authors had been working with Gérard Le Lann to incorporate concepts from the French CYCLADES project into the new network. The specification of the resulting protocol, ''Specification of Internet Transmission Control Program'', was written by Vint Cerf, Yogen Dalal, and Carl Sunshine, and published in December 1974. It contains the first attested use of the term ''internet'', as a shorthand for ''internetwork''.
A central control component of this model was the ''Transmission Control Program'' that incorporated both connection-oriented links and datagram services between hosts. The monolithic Transmission Control Program was later divided into a modular architecture consisting of the ''Transmission Control Protocol'' and the ''Internet Protocol''. This resulted in a networking model that became known informally as ''TCP/IP'', although formally it was variously referred to as the Department of Defense (DoD) model and the ARPANET model, and eventually also as the ''Internet Protocol Suite''.
In 2004, Vint Cerf and Bob Kahn received the Turing Award for their foundational work on TCP/IP.
Network function
The Transmission Control Protocol provides a communication service at an intermediate level between an application program and the Internet Protocol. It provides host-to-host connectivity at the transport layer of the Internet model. An application does not need to know the particular mechanisms for sending data via a link to another host, such as the required IP fragmentation to accommodate the maximum transmission unit of the transmission medium. At the transport layer, TCP handles all handshaking and transmission details and presents an abstraction of the network connection to the application, typically through a network socket interface.
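As a rough illustration of that socket abstraction, the following Python sketch opens a TCP connection over the loopback interface and echoes a message back. The function name `echo_over_loopback` and the use of an OS-chosen port on `127.0.0.1` are assumptions made for this example; the application only calls the socket API, while handshaking, segmentation, and acknowledgement happen inside the operating system.

```python
import socket
import threading


def echo_over_loopback(payload: bytes) -> bytes:
    """Send payload over a loopback TCP connection and return the echoed bytes.

    Intended for small payloads; a large payload could stall because the
    client does not read while it is still sending.
    """
    # Passive open: bind to an OS-chosen port on loopback and listen.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        conn, _peer = server.accept()
        with conn:
            # Echo every chunk until the client closes the connection.
            while chunk := conn.recv(4096):
                conn.sendall(chunk)

    worker = threading.Thread(target=serve)
    worker.start()

    # Active open: connect() makes the OS perform the handshake.
    received = bytearray()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        # TCP is a byte stream, so the reply may arrive in several pieces.
        while len(received) < len(payload):
            chunk = client.recv(4096)
            if not chunk:
                break
            received.extend(chunk)

    worker.join()
    server.close()
    return bytes(received)
```

Note that the receive loop counts bytes rather than messages: the socket interface exposes a stream, so one `sendall` call does not necessarily correspond to one `recv` call.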
At the lower levels of the protocol stack, due to network congestion, traffic load balancing, or unpredictable network behaviour, IP packets may be lost, duplicated, or delivered out of order. TCP detects these problems, requests re-transmission of lost data, rearranges out-of-order data, and even helps minimize network congestion to reduce the occurrence of the other problems. If the data still remains undelivered, the source is notified of this failure. Once the TCP receiver has reassembled the sequence of octets originally transmitted, it passes them to the receiving application. Thus, TCP abstracts the application's communication from the underlying networking details.
TCP is used extensively by many internet applications, including the World Wide Web (WWW), email, File Transfer Protocol, Secure Shell, peer-to-peer file sharing, and streaming media.
TCP is optimized for accurate delivery rather than timely delivery and can incur relatively long delays (on the order of seconds) while waiting for out-of-order messages or re-transmissions of lost messages. Therefore, it is not particularly suitable for real-time applications such as voice over IP. For such applications, protocols like the Real-time Transport Protocol (RTP) operating over the User Datagram Protocol (UDP) are usually recommended instead.
TCP is a reliable byte stream delivery service which guarantees that all bytes received will be identical and in the same order as those sent. Since packet transfer by many networks is not reliable, TCP achieves this using a technique known as ''positive acknowledgement with re-transmission''. This requires the receiver to respond with an acknowledgement message as it receives the data. The sender keeps a record of each packet it sends and maintains a timer from when the packet was sent. The sender re-transmits a packet if the timer expires before receiving the acknowledgement. The timer is needed in case a packet gets lost or corrupted.
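The idea of positive acknowledgement with re-transmission can be sketched with a deliberately simplified stop-and-wait model, in which only one packet is outstanding at a time. This is not how TCP itself schedules transmissions; the function name `stop_and_wait` and the `data_lost` and `ack_lost` sets, which script which transmissions the simulated network drops, are hypothetical and exist only for this illustration.

```python
from typing import List, Set, Tuple


def stop_and_wait(
    packets: List[bytes],
    data_lost: Set[Tuple[int, int]],
    ack_lost: Set[Tuple[int, int]],
    max_attempts: int = 5,
) -> Tuple[List[bytes], int]:
    """Simulate a sender that resends each packet until it is acknowledged.

    data_lost and ack_lost contain (sequence number, attempt) pairs naming
    the transmissions whose data or acknowledgement the network drops.
    Returns (payloads accepted by the receiver, total transmissions).
    """
    accepted: List[bytes] = []
    expected = 0          # receiver side: next sequence number it wants
    transmissions = 0

    for seq, payload in enumerate(packets):
        for attempt in range(max_attempts):
            transmissions += 1
            if (seq, attempt) in data_lost:
                continue  # nothing arrives; the sender's timer expires
            # Receiver: accept new data, silently discard duplicates.
            if seq == expected:
                accepted.append(payload)
                expected += 1
            if (seq, attempt) in ack_lost:
                continue  # data arrived but the ACK did not; resend
            break         # ACK received before the timer expired
        else:
            raise TimeoutError(f"packet {seq} was never acknowledged")

    return accepted, transmissions
```

In this model a lost acknowledgement causes a duplicate transmission, which is why the receiver must track the sequence number it expects rather than accept every arrival.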
While IP handles actual delivery of the data, TCP keeps track of ''segments'', the individual units of data transmission that a message is divided into for efficient routing through the network. For example, when an HTML file is sent from a web server, the TCP software layer of that server divides the file into segments and forwards them individually to the internet layer in the network stack. The internet layer software encapsulates each TCP segment into an IP packet by adding a header that includes (among other data) the destination IP address. When the client program on the destination computer receives them, the TCP software in the transport layer re-assembles the segments and ensures they are correctly ordered and error-free as it streams the file contents to the receiving application.
TCP segment structure
Transmission Control Protocol accepts data from a data stream, divides it into chunks, and adds a TCP header creating a TCP segment. The TCP segment is then encapsulated into an Internet Protocol (IP) datagram, and exchanged with peers.
The term ''TCP packet'' appears in both informal and formal usage, whereas in more precise terminology ''segment'' refers to the TCP protocol data unit (PDU), ''datagram'' to the IP PDU, and ''frame'' to the data link layer PDU:
Processes transmit data by calling on the TCP and passing buffers of data as arguments. The TCP packages the data from these buffers into segments and calls on the internet module [e.g. IP] to transmit each segment to the destination TCP.
A TCP segment consists of a segment ''header'' and a ''data'' section. The segment header contains 10 mandatory fields, and an optional extension field (''Options''). The data section follows the header and is the payload data carried for the application. The length of the data section is not specified in the segment header; it can be calculated by subtracting the combined length of the segment header and IP header from the total IP datagram length specified in the IP header.
;Source port (16 bits): Identifies the sending port.
;Destination port (16 bits): Identifies the receiving port.
;Sequence number (32 bits): Has a dual role:
:*If the SYN flag is set (1), then this is the initial sequence number. The sequence number of the actual first data byte and the acknowledged number in the corresponding ACK are then this sequence number plus 1.
:*If the SYN flag is clear (0), then this is the accumulated sequence number of the first data byte of this segment for the current session.
;Acknowledgment number (32 bits): If the ACK flag is set then the value of this field is the next sequence number that the sender of the ACK is expecting. This acknowledges receipt of all prior bytes (if any). The first ACK sent by each end acknowledges the other end's initial sequence number itself, but no data.
;Data offset (4 bits): Specifies the size of the TCP header in 32-bit words. The minimum size header is 5 words and the maximum is 15 words, giving a minimum size of 20 bytes and a maximum of 60 bytes, allowing for up to 40 bytes of options in the header. This field gets its name from the fact that it is also the offset from the start of the TCP segment to the actual data.
;Reserved (3 bits):For future use and should be set to zero.
;Flags (9 bits):Contains 9 1-bit flags (control bits) as follows:
:*NS (1 bit): ECN-nonce - concealment protection
:*CWR (1 bit): Congestion window reduced (CWR) flag is set by the sending host to indicate that it received a TCP segment with the ECE flag set and has responded with its congestion control mechanism.
:*ECE (1 bit): ECN-Echo has a dual role, depending on the value of the SYN flag. It indicates:
::*If the SYN flag is set (1), that the TCP peer is ECN capable.
::*If the SYN flag is clear (0), that a packet with Congestion Experienced flag set (ECN=11) in the IP header was received during normal transmission. This serves as an indication of network congestion (or impending congestion) to the TCP sender.
:*URG (1 bit): Indicates that the Urgent pointer field is significant
:*ACK (1 bit): Indicates that the Acknowledgment field is significant. All packets after the initial SYN packet sent by the client should have this flag set.
:*PSH (1 bit): Push function. Asks to push the buffered data to the receiving application.
:*RST (1 bit): Reset the connection
:*SYN (1 bit): Synchronize sequence numbers. Only the first packet sent from each end should have this flag set. Some other flags and fields change meaning based on this flag, and some are only valid when it is set, and others when it is clear.
:*FIN (1 bit): Last packet from sender
;Window size (16 bits):The size of the ''receive window'', which specifies the number of window size units that the sender of this segment is currently willing to receive.
;Checksum (16 bits):The 16-bit checksum field is used for error-checking of the TCP header, the payload and an IP pseudo-header. The pseudo-header consists of the source IP address, the destination IP address, the protocol number for the TCP protocol (6) and the length of the TCP headers and payload (in bytes).
;Urgent pointer (16 bits):If the URG flag is set, then this 16-bit field is an offset from the sequence number indicating the last urgent data byte.
;Options (Variable 0–320 bits, in units of 32 bits):The length of this field is determined by the ''data offset'' field. Options have up to three fields: Option-Kind (1 byte), Option-Length (1 byte), Option-Data (variable). The Option-Kind field indicates the type of option and is the only field that is not optional. Depending on the Option-Kind value, the next two fields may be set. Option-Length indicates the total length of the option, including the Option-Kind and Option-Length fields, and Option-Data contains data associated with the option, if applicable. For example, an Option-Kind byte of 1 indicates a no-operation option used only for padding, and has no Option-Length or Option-Data fields following it. An Option-Kind byte of 0 marks the end of options, and is also only one byte. An Option-Kind byte of 2 indicates the Maximum Segment Size (MSS) option, and is followed by an Option-Length byte. So while the MSS value is typically expressed in two bytes, Option-Length will be 4. As an example, an MSS option field with a value of 0x05B4 is coded as (0x02 0x04 0x05B4) in the TCP options section.
:Some options may only be sent when SYN is set; they are indicated below as [SYN]. Option-Kind and standard lengths given as (Option-Kind, Option-Length).
:The remaining Option-Kind values are historical, obsolete, experimental, not yet standardized, or unassigned. Option number assignments are maintained by the IANA.
;Padding:The TCP header padding is used to ensure that the TCP header ends, and data begins, on a 32-bit boundary. The padding is composed of zeros.
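The field layout above can be turned into a small decoder. The following Python sketch unpacks the fixed 20-byte part of the header with the standard `struct` module and then walks the options area using the Option-Kind and Option-Length rules described above. The function names `parse_tcp_header` and `parse_options` and the dictionary keys are illustrative choices, and the sketch assumes the 3-bit reserved field and 9-bit flag field described in this section; it does not verify the checksum.

```python
import struct
from typing import Dict, List, Tuple

# Flag names from most significant (NS) to least significant (FIN) bit.
FLAG_NAMES = ["NS", "CWR", "ECE", "URG", "ACK", "PSH", "RST", "SYN", "FIN"]


def parse_options(raw: bytes) -> List[Tuple[int, bytes]]:
    """Walk the options area and return (Option-Kind, Option-Data) pairs."""
    options: List[Tuple[int, bytes]] = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:        # end of options list
            break
        if kind == 1:        # no-operation, a single padding byte
            i += 1
            continue
        if i + 1 >= len(raw):
            raise ValueError("truncated option")
        length = raw[i + 1]  # total length, including kind and length bytes
        if length < 2 or i + length > len(raw):
            raise ValueError("malformed option length")
        options.append((kind, raw[i + 2 : i + length]))
        i += length
    return options


def parse_tcp_header(segment: bytes) -> Dict[str, object]:
    """Decode the mandatory header fields, the options, and the payload."""
    if len(segment) < 20:
        raise ValueError("segment shorter than the minimum TCP header")
    src, dst, seq, ack, off_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", segment[:20]
    )
    data_offset = off_flags >> 12          # header size in 32-bit words
    header_len = data_offset * 4
    if data_offset < 5 or header_len > len(segment):
        raise ValueError("invalid data offset")
    flag_bits = off_flags & 0x1FF          # low 9 bits hold NS..FIN
    flags = [n for i, n in enumerate(FLAG_NAMES) if flag_bits & (0x100 >> i)]
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "window": window,
        "checksum": checksum, "urgent": urgent,
        "options": parse_options(segment[20:header_len]),
        "payload": segment[header_len:],
    }
```

Given a SYN segment whose data offset is 6 and whose options area holds the bytes 0x02 0x04 0x05 0xB4, the decoder reports a 24-byte header and a single option of kind 2 carrying the two-byte MSS value.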
Protocol operation
TCP protocol operations may be divided into three phases. ''Connection establishment'' is a multi-step handshake process that establishes a connection before entering the ''data transfer'' phase. After data transfer is completed, the ''connection termination'' closes the connection and releases all allocated resources.
A TCP connection is managed by an operating system through a resource that represents the local end-point for communications, the ''Internet socket''. During the lifetime of a TCP connection, the local end-point undergoes a series of state changes.
Connection establishment
Before a client attempts to connect with a server, the server must first bind to and listen at a port to open it up for connections: this is called a passive open. Once the passive open is established, a client may establish a connection by initiating an active open using the three-way (or 3-step) handshake:
# SYN: The active open is performed by the client sending a SYN to the server. The client sets the segment's sequence number to a random value A.
# SYN-ACK: In response, the server replies with a SYN-ACK. The acknowledgment number is set to one more than the received sequence number i.e. A+1, and the sequence number that the server chooses for the packet is another random number, B.
# ACK: Finally, the client sends an ACK back to the server. The sequence number is set to the received acknowledgment value i.e. A+1, and the acknowledgment number is set to one more than the received sequence number i.e. B+1.
Steps 1 and 2 establish and acknowledge the sequence number for one direction. Steps 2 and 3 establish and acknowledge the sequence number for the other direction. Following the completion of these steps, both the client and server have received acknowledgments and a full-duplex communication is established.
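The sequence and acknowledgment numbers exchanged in the three steps can be traced with a short Python sketch. The `Segment` tuple and the function names are hypothetical helpers for this illustration; the arithmetic is taken modulo 2^32 because the sequence number field is 32 bits wide, and `secrets.randbits` stands in for whatever method a real stack uses to pick an unpredictable initial value.

```python
import secrets
from typing import List, NamedTuple, Optional

SEQ_MODULUS = 2**32  # sequence numbers occupy a 32-bit field and wrap around


class Segment(NamedTuple):
    sender: str
    flags: str
    seq: int
    ack: Optional[int]   # None when the ACK flag is not set


def random_isn() -> int:
    """Pick an unpredictable 32-bit initial sequence number."""
    return secrets.randbits(32)


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged for initial numbers A and B."""
    a = client_isn % SEQ_MODULUS
    b = server_isn % SEQ_MODULUS
    # Step 1: the client sends SYN with sequence number A.
    syn = Segment("client", "SYN", a, None)
    # Step 2: the server acknowledges A+1 and announces its own number B.
    syn_ack = Segment("server", "SYN-ACK", b, (syn.seq + 1) % SEQ_MODULUS)
    # Step 3: the client uses A+1 as its sequence number and acknowledges B+1.
    ack = Segment("client", "ACK", syn_ack.ack, (syn_ack.seq + 1) % SEQ_MODULUS)
    return [syn, syn_ack, ack]
```

With A = 100 and B = 300, the three segments carry (seq 100), (seq 300, ack 101), and (seq 101, ack 301) respectively.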
Connection termination
The connection termination phase uses a four-way handshake, with each side of the connection terminating independently. When an endpoint wishes to stop its half of the connection, it transmits a FIN packet, which the other end acknowledges with an ACK. Therefore, a typical tear-down requires a pair of FIN and ACK segments from each TCP endpoint. After the side that sent the first FIN has responded with the final ACK, it waits for a timeout before finally closing the connection, during which time the local port is unavailable for new connections; this state lets the TCP client resend the final acknowledgement to the server in case the ACK is lost in transit. The time duration is implementation-dependent, but some common values are 30 seconds, 1 minute, and 2 minutes. After the timeout, the client enters the CLOSED state and the local port becomes available for new connections.
It is also possible to terminate the connection by a 3-way handshake, when host A sends a FIN and host B replies with a FIN & ACK (combining two steps into one) and host A replies with an ACK.
Some operating systems, such as Linux and HP-UX, implement a half-duplex close sequence. If the host actively closes a connection while still having unread incoming data available, the host sends the signal RST (losing any received data) instead of FIN. This assures that a TCP application is aware there was a data loss.
A connection can be in a half-open state, in which case one side has terminated the connection, but the other has not. The side that has terminated can no longer send any data into the connection, but the other side can. The terminating side should continue reading the data until the other side terminates as well.
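The situation described above, where one side has stopped sending but continues to read, can be reproduced from an application with the socket API. In the following Python sketch the client calls `shutdown(socket.SHUT_WR)` to close only its sending direction, and the server replies after it has seen the end of the client's data. The function name `half_close_demo`, the loopback address, and the reply format are assumptions of this example.

```python
import socket
import threading


def half_close_demo(request: bytes) -> bytes:
    """Client stops sending, then still receives the server's reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve() -> None:
        conn, _peer = listener.accept()
        with conn:
            data = bytearray()
            # recv() returns b"" once the peer has closed its sending side.
            while chunk := conn.recv(4096):
                data.extend(chunk)
            # The server's own sending direction is still open.
            conn.sendall(b"got %d bytes" % len(data))

    worker = threading.Thread(target=serve)
    worker.start()

    reply = bytearray()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(request)
        client.shutdown(socket.SHUT_WR)    # stop sending, keep receiving
        # Keep reading until the server terminates its side as well.
        while chunk := client.recv(4096):
            reply.extend(chunk)

    worker.join()
    listener.close()
    return bytes(reply)
```

The client's read loop follows the advice in the text: after terminating its own direction, it keeps reading until the other side closes too.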
Resource usage
Most implementations allocate an entry in a table that maps a session to a running operating system process. Because TCP packets do not include a session identifier, both endpoints identify the session using the client's address and port. Whenever a packet is received, the TCP implementation must perform a lookup on this table to find the destination process. Each entry in the table is known as a Transmission Control Block or TCB. It contains information about the endpoints (IP and port), status of the connection, running data about the packets that are being exchanged and buffers for sending and receiving data.
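A lookup table of this kind might be sketched as a dictionary of control blocks. The `TCB` fields and the `ConnectionTable` class below are hypothetical and much smaller than what a real stack keeps; as an assumption of this sketch, entries are keyed by the full tuple of local address, local port, remote address, and remote port, so that several clients talking to the same server port map to different entries.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# (local IP, local port, remote IP, remote port)
ConnKey = Tuple[str, int, str, int]


@dataclass
class TCB:
    """Illustrative Transmission Control Block with a few typical fields."""
    state: str = "CLOSED"
    next_seq_to_send: int = 0
    next_seq_expected: int = 0
    send_buffer: bytearray = field(default_factory=bytearray)
    recv_buffer: bytearray = field(default_factory=bytearray)


class ConnectionTable:
    """Maps each connection's endpoints to its control block."""

    def __init__(self) -> None:
        self._tcbs: Dict[ConnKey, TCB] = {}

    def open(self, key: ConnKey) -> TCB:
        """Return the existing block for key, or allocate a new one."""
        if key not in self._tcbs:
            self._tcbs[key] = TCB()
        return self._tcbs[key]

    def lookup(self, key: ConnKey) -> Optional[TCB]:
        """Find the block for an incoming packet, or None if unknown."""
        return self._tcbs.get(key)

    def close(self, key: ConnKey) -> None:
        """Release the entry once the connection is finished."""
        self._tcbs.pop(key, None)

    def __len__(self) -> int:
        return len(self._tcbs)
```

Because every received packet triggers a lookup, a hash-based mapping is a natural fit here, although actual implementations may organize the table differently.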
The number of sessions on the server side is limited only by memory and can grow as new connections arrive, but the client must allocate an ephemeral port before sending the first SYN to the server. This port remains allocated during the whole conversation and effectively limits the number of outgoing connections from each of the client's IP addresses. If an application fails to properly close unrequired connections, a client can run out of resources and become unable to establish new TCP connections, even from other applications.
Both endpoints must also allocate space for unacknowledged packets and received (but unread) data.
Data transfer
The Transmission Control Protocol differs in several key features compared to the User Datagram Protocol:
* Ordered data transfer: the destination host rearranges segments according to a sequence number
* Retransmission of lost packets: any part of the stream that is not cumulatively acknowledged is retransmitted
* Error-free data transfer: corrupted packets are treated as lost and are retransmitted
* Flow control: limits the rate at which a sender transfers data to guarantee reliable delivery. The receiver continually hints the sender on how much data can be received. When the receiving host's buffer fills, the next acknowledgment suspends the transfer and allows the data in the buffer to be processed.
* Congestion control: lost packets (presumed due to congestion) trigger a reduction in data delivery rate
Reliable transmission
TCP uses a ''sequence number'' to identify each byte of data. The sequence number identifies the order of the bytes sent from each computer so that the data can be reconstructed in order, regardless of any out-of-order delivery that may occur. The sequence number of the first byte is chosen by the transmitter for the first packet, which is flagged SYN. This number can be arbitrary, and should, in fact, be unpredictable to defend against TCP sequence prediction attacks.
Acknowledgements (ACKs) are sent with a sequence number by the receiver of data to tell the sender that data has been received to the specified byte. ACKs do not imply that the data has been delivered to the application; they merely signify that it is now the receiver's responsibility to deliver the data.
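The interaction between sequence numbers and cumulative acknowledgements can be illustrated with a toy receiver. The sketch below is a simplification under stated assumptions: the class and attribute names are hypothetical, sequence-number wraparound at 2^32 is ignored, and buffered segments are assumed not to partially overlap one another.

```python
class Reassembler:
    """Toy receiver: buffers out-of-order segments and returns a cumulative ACK."""

    def __init__(self, initial_seq: int) -> None:
        self.rcv_nxt = initial_seq   # sequence number of the next in-order byte
        self.pending = {}            # seq -> payload, for segments that arrived early
        self.delivered = bytearray() # bytes ready for the application, in order

    def receive(self, seq: int, payload: bytes) -> int:
        end = seq + len(payload)
        if end > self.rcv_nxt:
            if seq < self.rcv_nxt:
                # Drop the part of the segment that was already received.
                payload = payload[self.rcv_nxt - seq:]
                seq = self.rcv_nxt
            self.pending.setdefault(seq, payload)
        # Move every segment that is now contiguous into the in-order stream.
        while self.rcv_nxt in self.pending:
            chunk = self.pending.pop(self.rcv_nxt)
            self.delivered += chunk
            self.rcv_nxt += len(chunk)
        # The ACK names the next byte expected, not the last byte received.
        return self.rcv_nxt
```

When a segment arrives ahead of a gap, the returned ACK value does not advance, which is what produces the duplicate acknowledgements used for loss detection.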
Reliability is achieved by the sender detecting lost data and retransmitting it. TCP uses two primary techniques to identify loss: retransmission timeout (RTO) and duplicate cumulative acknowledgements (DupAcks).
Dupack-based retransmission
If a single segment (say segment number 100) in a stream is lost, then the receiver cannot acknowledge packets above that segment number (100) because it uses cumulative ACKs. Hence the receiver acknowledges packet 99 again on the receipt of another data packet. This duplicate acknowledgement is used as a signal for packet loss. That is, if the sender receives three duplicate acknowledgements, it retransmits the earliest unacknowledged packet. A threshold of three is used because the network may reorder segments, causing duplicate acknowledgements. This threshold has been demonstrated to avoid spurious retransmissions due to reordering. Some TCP implementations use selective acknowledgements (SACKs) to provide explicit feedback about the segments that have been received. This greatly improves TCP's ability to retransmit the right segments.
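The three-duplicate rule can be sketched as a small counter on the sender side. This is a minimal illustration, not a full implementation: the class name is hypothetical, and a real stack applies further conditions before counting an ACK as a duplicate (for example, that it carries no data and does not change the advertised window).

```python
DUPACK_THRESHOLD = 3  # the threshold of three described in the text


class DupAckDetector:
    """Counts repeated cumulative ACKs and signals when to retransmit."""

    def __init__(self) -> None:
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack: int) -> bool:
        """Return True when the segment starting at `ack` should be retransmitted."""
        if ack == self.last_ack:
            self.dup_count += 1
            # Fire exactly once, on the third duplicate.
            return self.dup_count == DUPACK_THRESHOLD
        # A new ACK value resets the counter.
        self.last_ack = ack
        self.dup_count = 0
        return False
```

The first ACK for a given value is not a duplicate, so the retransmission fires on the fourth identical ACK overall.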
Timeout-based retransmission
When a sender transmits a segment, it initializes a timer with a conservative estimate of the arrival time of the acknowledgement. The segment is retransmitted if the timer expires, with a new timeout threshold of twice the previous value, resulting in exponential backoff behavior. Typically, the initial timer value is $\text{smoothed RTT} + \max(G, 4 \times \text{RTT variation})$, where $G$ is the clock granularity. This guards against excessive transmission traffic due to faulty or malicious actors, such as man-in-the-middle denial-of-service attackers.
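The doubling of the timeout on each expiry can be sketched as follows. The function name and the 60-second cap are assumptions for the example; the cap is included only to keep the schedule bounded and is not taken from the text above.

```python
from typing import List


def backoff_schedule(initial_rto: float, retries: int,
                     max_rto: float = 60.0) -> List[float]:
    """Return the timeout (in seconds) used for each successive transmission attempt."""
    rto = initial_rto
    schedule = []
    for _ in range(retries):
        schedule.append(rto)
        # Each expiry doubles the timeout, up to the assumed cap.
        rto = min(rto * 2, max_rto)
    return schedule
```

For example, `backoff_schedule(1.0, 5)` yields timeouts of 1, 2, 4, 8 and 16 seconds.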
Error detection
Sequence numbers allow receivers to discard duplicate packets and properly sequence out-of-order packets. Acknowledgments allow senders to determine when to retransmit lost packets.
To assure correctness, a checksum field is included. The TCP checksum is a weak check by modern standards and is normally paired with a CRC integrity check at layer 2, below both TCP and IP, such as is used in PPP or the Ethernet frame. However, introduction of errors in packets between CRC-protected hops is common and the 16-bit TCP checksum catches most of these.
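The 16-bit checksum is a ones' complement sum of 16-bit words. The sketch below shows only that arithmetic over an arbitrary byte string; in actual TCP the sum also covers a pseudo-header and the TCP header, which this example leaves out. The function name is an assumption.

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A receiver can verify a block by running the same function over the data with the checksum appended; an undamaged block gives zero.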
Flow control
TCP uses an end-to-end flow control protocol to avoid having the sender send data too fast for the TCP receiver to receive and process it reliably. Having a mechanism for flow control is essential in an environment where machines of diverse network speeds communicate. For example, if a PC sends data to a smartphone that is slowly processing received data, the smartphone must be able to regulate the data flow so as not to be overwhelmed.
TCP uses a sliding window flow control protocol. In each TCP segment, the receiver specifies in the ''receive window'' field the amount of additionally received data (in bytes) that it is willing to buffer for the connection. The sending host can send only up to that amount of data before it must wait for an acknowledgement and receive window update from the receiving host.
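The sender's side of the sliding window reduces to a small calculation: the advertised window minus the data already in flight. The sketch below uses hypothetical parameter names (`snd_una` for the oldest unacknowledged byte, `snd_nxt` for the next byte to send) and ignores sequence-number wraparound.

```python
def usable_window(snd_una: int, snd_nxt: int, rcv_wnd: int) -> int:
    """Bytes the sender may still transmit before waiting for an ACK."""
    in_flight = snd_nxt - snd_una  # sent but not yet acknowledged
    return max(0, rcv_wnd - in_flight)
```

With 500 bytes in flight and an advertised window of 2,000 bytes, the sender may transmit 1,500 more bytes; as ACKs arrive, `snd_una` advances and the window slides forward.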
When a receiver advertises a window size of 0, the sender stops sending data and starts its ''persist timer''. The persist timer is used to protect TCP from a deadlock situation that could arise if a subsequent window size update from the receiver is lost, and the sender cannot send more data until receiving a new window size update from the receiver. When the persist timer expires, the TCP sender attempts recovery by sending a small packet so that the receiver responds by sending another acknowledgement containing the new window size.
If a receiver is processing incoming data in small increments, it may repeatedly advertise a small receive window. This is referred to as the silly window syndrome, since it is inefficient to send only a few bytes of data in a TCP segment, given the relatively large overhead of the TCP header.
Congestion control
The final main aspect of TCP is congestion control. TCP uses a number of mechanisms to achieve high performance and avoid congestive collapse, a gridlock situation where network performance is severely degraded. These mechanisms control the rate of data entering the network, keeping the data flow below a rate that would trigger collapse. They also yield an approximately max-min fair allocation between flows.
Acknowledgments for data sent, or the lack of acknowledgments, are used by senders to infer network conditions between the TCP sender and receiver. Coupled with timers, TCP senders and receivers can alter the behavior of the flow of data. This is more generally referred to as congestion control or congestion avoidance.
Modern implementations of TCP contain four intertwined algorithms: slow start, congestion avoidance, fast retransmit, and fast recovery.
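How these algorithms interact can be sketched with a heavily simplified, Reno-style congestion window measured in segments. Everything here is an assumption for illustration: the class and method names, the initial window of one segment, the use of the current window as a stand-in for the amount of data in flight, and the omission of window inflation during fast recovery.

```python
class RenoLikeWindow:
    """Simplified congestion window (in segments); not a complete algorithm."""

    def __init__(self, ssthresh: float = 64.0) -> None:
        self.cwnd = 1.0          # assumed initial window of one segment
        self.ssthresh = ssthresh # slow start threshold

    def on_new_ack(self) -> None:
        if self.cwnd < self.ssthresh:
            # Slow start: one segment per ACK, roughly doubling each round trip.
            self.cwnd += 1.0
        else:
            # Congestion avoidance: roughly one segment per round trip.
            self.cwnd += 1.0 / self.cwnd

    def on_triple_dupack(self) -> None:
        # Fast retransmit / fast recovery: halve the window, skip slow start.
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = self.ssthresh

    def on_timeout(self) -> None:
        # A timeout is treated as severe congestion: restart from slow start.
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = 1.0
```

The asymmetry between the two loss handlers reflects the idea that duplicate ACKs show the network is still delivering data, while a timeout does not.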
In addition, senders employ a ''retransmission timeout'' (RTO) that is based on the estimated round-trip time (RTT) between the sender and receiver, as well as the variance in this round-trip time. There are subtleties in the estimation of RTT. For example, senders must be careful when calculating RTT samples for retransmitted packets; typically they use Karn's algorithm or TCP timestamps. These individual RTT samples are then averaged over time to create a smoothed round-trip time (SRTT) using Jacobson's algorithm. This SRTT value is what is used as the round-trip time estimate.
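A smoothed estimator of this kind can be sketched as below. The gains of 1/8 and 1/4, the multiplier of 4, and the one-second minimum follow the values given in RFC 6298; the class and method names are assumptions. In line with Karn's algorithm, the caller is expected not to pass in samples measured on retransmitted segments.

```python
from typing import Optional


class RttEstimator:
    """Smoothed RTT and RTO estimator using RFC 6298 style constants."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variation
    K = 4          # multiplier applied to the RTT variation

    def __init__(self, granularity: float = 0.001, min_rto: float = 1.0) -> None:
        self.granularity = granularity  # clock granularity G, in seconds
        self.min_rto = min_rto
        self.srtt: Optional[float] = None
        self.rttvar: Optional[float] = None

    def sample(self, rtt: float) -> float:
        """Feed one RTT measurement (seconds) and return the updated RTO."""
        if self.srtt is None:
            # First measurement initializes both estimates.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variation is updated before the smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return max(self.min_rto, self.srtt + max(self.granularity, self.K * self.rttvar))
```

Because the variation term is multiplied by four, a path with jittery round-trip times gets a noticeably more conservative timeout than a stable path with the same average.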
Enhancing TCP to reliably handle loss, minimize errors, manage congestion, and achieve high throughput in very high-speed environments is an ongoing area of research and standards development. As a result, there are a number of TCP congestion avoidance algorithm variations.
Maximum segment size
The maximum segment size (MSS) is the largest amount of data, specified in bytes, that TCP is willing to receive in a single segment. For best performance, the MSS should be set small enough to avoid IP fragmentation, which can lead to packet loss and excessive retransmissions. To accomplish this, typically the MSS is announced by each side using the MSS option when the TCP connection is established. The option value is derived from the maximum transmission unit (MTU) size of the data link layer of the networks to which the sender and receiver are directly attached. TCP senders can use path MTU discovery to infer the minimum MTU along the network path between the sender and receiver, and use this to dynamically adjust the MSS to avoid IP fragmentation within the network.
MSS announcement may also be called ''MSS negotiation'' but, strictly speaking, the MSS is not ''negotiated''. Two completely independent values of MSS are permitted for the two directions of data flow in a TCP connection, so there is no need to agree on a common MSS configuration for a bidirectional connection.
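The relationship between MTU and MSS is simple arithmetic: the MSS is what remains of the MTU after the IP and TCP headers. The sketch below assumes minimum-size headers with no options (20 bytes for IPv4, 40 for IPv6, 20 for TCP); the function names are hypothetical.

```python
IPV4_HEADER = 20  # bytes, without IP options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, without TCP options


def mss_for_mtu(mtu: int, ipv6: bool = False) -> int:
    """MSS a host could announce for a link with the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


def effective_send_mss(peer_mss: int, path_mtu: int, ipv6: bool = False) -> int:
    """Largest payload to send: limited by the peer's MSS and by the path MTU."""
    return min(peer_mss, mss_for_mtu(path_mtu, ipv6))
```

For a 1,500-byte Ethernet MTU this gives the familiar MSS of 1,460 bytes over IPv4. Each direction computes its own value, which is why the two directions of a connection need not agree.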
Selective acknowledgments
Relying purely on the cumulative acknowledgment scheme employed by the original TCP can lead to inefficiencies when packets are lost. For example, suppose bytes with sequence numbers 1,000 to 10,999 are sent in 10 different TCP segments of equal size, and the second segment (sequence numbers 2,000 to 2,999) is lost during transmission. In a pure cumulative acknowledgment protocol, the receiver can only send a cumulative ACK value of 2,000 (the sequence number immediately following the last sequence number of the received data) and cannot say that it received bytes 3,000 to 10,999 successfully. Thus the sender may then have to resend all data starting with sequence number 2,000.
To alleviate this issue, TCP employs the ''selective acknowledgment (SACK)'' option, defined in 1996 in RFC 2018, which allows the receiver to acknowledge discontinuous blocks of data that were received correctly, in addition to the basic TCP cumulative acknowledgment (the sequence number immediately following the last contiguous byte received). The acknowledgment can include a number of ''SACK blocks'', where each SACK block is conveyed by the ''Left Edge of Block'' (the first sequence number of the block) and the ''Right Edge of Block'' (the sequence number immediately following the last sequence number of the block), with a ''Block'' being a contiguous range that the receiver correctly received. In the example above, the receiver would send an ACK segment with a cumulative ACK value of 2,000 and a SACK option header with sequence numbers 3,000 and 11,000. The sender would accordingly retransmit only the second segment with sequence numbers 2,000 to 2,999.
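The example above can be reproduced with a short sketch that merges the received byte ranges and separates the cumulative ACK from the SACK blocks. The function name is hypothetical, ranges are half-open `(left edge, right edge)` pairs as in the option itself, and the limit of four blocks reflects the space available in the TCP options field. The sketch reports blocks in ascending order and ignores the ordering rules of RFC 2018 as well as sequence-number wraparound.

```python
from typing import List, Tuple


def ack_with_sack(initial_seq: int,
                  segments: List[Tuple[int, int]],
                  max_blocks: int = 4) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (cumulative ACK, SACK blocks) for the received half-open ranges."""
    # Merge ranges that touch or overlap into contiguous blocks.
    merged: List[List[int]] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    cumulative = initial_seq
    blocks: List[Tuple[int, int]] = []
    for start, end in merged:
        if start <= cumulative:
            # Contiguous with the in-order data: advances the cumulative ACK.
            cumulative = max(cumulative, end)
        else:
            # Data beyond a gap: reported as a SACK block.
            blocks.append((start, end))
    return cumulative, blocks[:max_blocks]
```

Feeding in the first segment plus segments three through ten yields a cumulative ACK of 2,000 and a single block with edges 3,000 and 11,000, matching the values in the text.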
A TCP sender may interpret an out-of-order segment delivery as a lost segment. If it does so, the TCP sender will retransmit the segment previous to the out-of-order packet and slow its data delivery rate for that connection. The duplicate-SACK option, an extension to the SACK option that was defined in May 2000 in RFC 2883, solves this problem. The TCP receiver sends a D-ACK to indicate that no segments were lost, and the TCP sender can then reinstate the higher transmission rate.
The SACK option is not mandatory and comes into operation only if both parties support it. This is negotiated when a connection is established. SACK uses a TCP header option. The use of SACK has become widespread; all popular TCP stacks support it. Selective acknowledgment is also used in Stream Control Transmission Protocol (SCTP).
Window scaling
For more efficient use of high-bandwidth networks, a larger TCP window size may be used. A 16-bit TCP window size field controls the flow of data and its value is limited to 65,535 bytes. Since the size field cannot be expanded beyond this limit, a scaling factor is used. The