iWARP is a computer networking protocol that implements remote direct memory access (RDMA) for efficient data transfer over Internet Protocol networks. Contrary to some accounts, iWARP is not an acronym. Because iWARP is layered on Internet Engineering Task Force (IETF)-standard congestion-aware protocols such as Transmission Control Protocol (TCP) and Stream Control Transmission Protocol (SCTP), it makes few requirements on the network and can be deployed in a broad range of environments.


History

In 2007, the IETF published five Requests for Comments (RFCs) that define iWARP:
# RFC 5040 ''A Remote Direct Memory Access Protocol Specification'' is layered over the Direct Data Placement Protocol (DDP). It defines how RDMA Send, Read, and Write operations are encoded using DDP into headers on the network.
# RFC 5041 ''Direct Data Placement over Reliable Transports'' is layered over MPA/TCP or SCTP. It defines how received data can be placed directly into an upper layer protocol's receive buffer without intermediate buffers.
# RFC 5042 ''Direct Data Placement Protocol (DDP) / Remote Direct Memory Access Protocol (RDMAP) Security'' analyzes security issues related to the iWARP DDP and RDMAP protocol layers.
# RFC 5043 ''Stream Control Transmission Protocol (SCTP) Direct Data Placement (DDP) Adaptation'' defines an adaptation layer that enables DDP over SCTP.
# RFC 5044 ''Marker PDU Aligned Framing for TCP Specification'' defines an adaptation layer that preserves DDP-level protocol record boundaries when layered over the TCP reliable connected byte stream.
These RFCs are based on the RDMA Consortium's specifications for RDMA over TCP. The RDMA Consortium's specifications were influenced by earlier RDMA standards, including Virtual Interface Architecture (VIA) and InfiniBand (IB).

Since 2007, the IETF has published three additional RFCs that maintain and extend iWARP:
# RFC 6580 ''IANA Registries for the Remote Direct Data Placement (RDDP) Protocols'', published in 2012, defines IANA registries for Remote Direct Data Placement (RDDP) error codes, operation codes, and function codes.
# RFC 6581 ''Enhanced Remote Direct Memory Access (RDMA) Connection Establishment'', published in 2012, fixes shortcomings in iWARP connection setup.
# RFC 7306 ''Remote Direct Memory Access (RDMA) Protocol Extensions'', published in 2014, extends RFC 5040 with atomic operations and RDMA Write with Immediate Data.


Protocol

The main component of the iWARP protocol is the Direct Data Placement Protocol (DDP), which enables zero-copy transmission. DDP does not perform the transmission itself; the underlying transport protocol (TCP or SCTP) does. However, TCP does not preserve message boundaries: it delivers data as a sequence of bytes without regard to protocol data units (PDUs). In this regard, DDP is better suited to SCTP, which preserves message boundaries, and the IETF defined an SCTP adaptation for DDP in RFC 5043. Running DDP over TCP requires an adaptation layer known as Marker PDU Aligned (MPA) framing to preserve message boundaries.

DDP is not intended to be accessed directly. Instead, a separate RDMA protocol (RDMAP) provides the services to read and write data. The complete RDMA over TCP specification is therefore RDMAP over DDP over either MPA/TCP or SCTP. All of these protocols can be implemented in hardware.

Unlike IB, iWARP offers only reliable connected communication, as this is the only service that TCP and SCTP provide. The original iWARP specification also omits other features of IB, such as Send with Immediate Data operations. RFC 7306 reduces these omissions by adding atomic operations and immediate data.
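
The boundary problem that MPA framing addresses can be illustrated with a minimal sketch. It is loosely modeled on the MPA frame layout (a 16-bit length, the payload, padding to a 4-byte boundary, and a 32-bit CRC), but it is a simplification and not a conforming implementation: it omits MPA's periodic markers, and it uses the CRC-32 from Python's zlib module as a stand-in, whereas MPA specifies CRC32c. The function names frame and deframe are hypothetical.

```python
import struct
import zlib


def frame(ulpdu: bytes) -> bytes:
    """Wrap one upper-layer message so its boundary survives a byte stream.

    Simplified, assumed layout: 2-byte big-endian length, payload,
    zero padding to a 4-byte boundary, 4-byte CRC over all of the preceding.
    """
    if len(ulpdu) > 0xFFFF:
        raise ValueError("payload does not fit in a 16-bit length field")
    body = struct.pack("!H", len(ulpdu)) + ulpdu
    body += b"\x00" * (-len(body) % 4)  # pad so the CRC starts 4-byte aligned
    # Stand-in checksum: zlib.crc32 is CRC-32, not the CRC32c that MPA uses.
    return body + struct.pack("!I", zlib.crc32(body))


def deframe(stream: bytes) -> tuple[list[bytes], bytes]:
    """Recover complete messages from a byte stream.

    Returns the recovered messages and any trailing bytes that do not yet
    form a complete frame (as happens when TCP delivers a partial segment).
    """
    messages = []
    offset = 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, offset)
        body_len = 2 + length
        body_len += -body_len % 4  # account for the padding
        end = offset + body_len + 4
        if end > len(stream):
            break  # frame is incomplete; wait for more bytes
        body = stream[offset:offset + body_len]
        (crc,) = struct.unpack_from("!I", stream, offset + body_len)
        if crc != zlib.crc32(body):
            raise ValueError("checksum mismatch")
        messages.append(body[2:2 + length])
        offset = end
    return messages, stream[offset:]
```

Concatenating frame(b"hello") and frame(b"rdma write") and passing the result to deframe yields the two original messages separately, even though the byte stream itself carries no boundaries. If the stream is cut short, the incomplete frame is returned as the remainder instead of being misparsed.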


Implementation

Because a kernel implementation of the TCP stack can become a bottleneck, the protocol is typically implemented in hardware, in RDMA network interface controllers (rNICs). As data losses are rare in tightly coupled network environments, TCP's error-recovery mechanisms may be performed by software, while the more frequent communication operations are handled strictly by logic embedded on the rNIC. Similarly, connections are often established entirely by software and then handed off to the hardware. Furthermore, the handling of iWARP-specific protocol details is often isolated from the TCP implementation, allowing rNICs to be used for both RDMA offload and TCP offload (in support of traditional sockets-based TCP/IP applications). The portion of the hardware implementation that implements TCP is known as the TCP offload engine (TOE). A TOE by itself does not prevent copying on the receive side and must be combined with RDMA hardware to achieve zero-copy results. The RDMA over TCP specification is a set of wire protocols intended to be implemented in hardware, although they can also be emulated in software for compatibility, without the performance benefits.
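
Why RDMA hardware is needed for zero-copy receive can be sketched with a toy model of tagged placement. This is an illustration only, not rNIC or driver code: the PlacementTable class and its methods are hypothetical, and the single slice assignment stands in for the DMA write a real rNIC would perform. The idea it shows is that each incoming segment names a pre-registered buffer (by a steering tag) and an offset, so the receiver can write the payload straight to its final location without staging it in an intermediate buffer, even when segments arrive out of order.

```python
class PlacementTable:
    """Toy model of a receiver that places tagged segments directly.

    Hypothetical names throughout; a real rNIC does this in hardware.
    """

    def __init__(self) -> None:
        self._registered: dict[int, memoryview] = {}

    def register(self, stag: int, buffer: bytearray) -> None:
        """Advertise an application buffer under a steering tag."""
        self._registered[stag] = memoryview(buffer)

    def place(self, stag: int, offset: int, payload: bytes) -> None:
        """Write a segment's payload at its final position in the buffer."""
        target = self._registered[stag]  # KeyError for an unknown tag
        if offset < 0 or offset + len(payload) > len(target):
            raise ValueError("segment falls outside the registered buffer")
        # Models the DMA write: no intermediate reassembly buffer is used.
        target[offset:offset + len(payload)] = payload


app_buffer = bytearray(12)
receiver = PlacementTable()
receiver.register(0x1234, app_buffer)

# Segments may arrive in any order; each carries its own destination.
receiver.place(0x1234, 6, b"world!")
receiver.place(0x1234, 0, b"hello ")
```

After both calls, app_buffer holds the complete message in order. A TOE without this tagging would still hand the reassembled byte stream to the host, which then copies it into the application's buffer.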


Interfaces

iWARP is a protocol, not an implementation, but it defines protocol behavior in terms of the operations that are legal for the protocol, known as Verbs. As such, iWARP does not have a single standard programming interface. However, programming interfaces tend to correspond very closely to the Verbs. Several programming interfaces have been proposed, including OpenFabrics Verbs, Network Direct, uDAPL, kDAPL, IT-API, and RNICPI. Implementations of some of these interfaces are available for different platforms, including Windows and Linux.


Services available

Networking services implemented over iWARP include those offered in the OpenFabrics Enterprise Distribution (OFED) by the OpenFabrics Alliance for Linux operating systems, and by Microsoft Windows via Network Direct.
* NVMe over Fabrics (NVMe-oF)
* iSCSI Extensions for RDMA (iSER)
* Server Message Block Direct (SMB Direct)
* Sockets Direct Protocol (SDP)
* SCSI RDMA Protocol (SRP)
* Network File System over RDMA (NFS over RDMA)
* GPUDirect


Vendors

Vendors of iWARP-enabled equipment include:
* Chelsio
* Marvell
* Bloombase


See also

* RDMA over Converged Ethernet


External links


* OpenFabrics Alliance at the University of New Hampshire InterOperability Laboratory: testing on iWARP devices
* Remote Direct Data Placement Charter (IETF)
* MPI-SCTP: Using the Stream Control Transmission Protocol for parallel programs written using the Message Passing Interface (2008-09-01; archived 2009-10-02 at https://web.archive.org/web/20091002101222/http://www.cs.ubc.ca/labs/dsg/mpi-sctp/)
* SMB2 Remote Direct Memory Access (RDMA) Transport Protocol (2017-06-01)