Zero Copy

In computer science, zero-copy refers to techniques that enable data transfer between memory spaces without requiring the CPU to copy the data. By avoiding redundant copying, zero-copy methods reduce CPU usage and memory bandwidth consumption, which can lead to substantial performance gains. This is crucial for applications demanding high data throughput, such as network communication, file I/O, and multimedia processing.


Principle

Zero-copy programming techniques can be used when exchanging data within a user space process (i.e. between two or more threads), between two or more processes (see also producer–consumer problem), and when data has to be accessed, copied, or moved inside kernel space or between a user space process and the kernel space portions of an operating system (OS).

Usually, when a user space process has to perform system operations such as reading or writing data from or to a device (e.g. a disk or a NIC) through its high-level software interface, or moving data from one device to another, it has to issue one or more system calls, which are then executed in kernel space by the operating system. If data has to be copied or moved from a source to a destination that are both located inside kernel space (e.g. two files, or a file and a network card), the unnecessary copies from kernel space to user space and from user space back to kernel space can be avoided by using special (zero-copy) system calls, which are available in recent versions of most popular operating systems.

Zero-copy versions of operating system elements, such as device drivers, file systems, and network protocol stacks, can greatly increase the performance of certain application programs (which become processes when executed) and utilize system resources more efficiently. Performance is enhanced by allowing the CPU to move on to other tasks while data copying or processing proceeds in parallel in another part of the machine. Zero-copy operations also reduce the number of time-consuming context switches between user space and kernel space. System resources are used more efficiently because having a sophisticated CPU perform extensive data copy operations, a relatively simple task, is wasteful if other, simpler system components can do the copying.

As an example, reading a file and then sending it over a network the traditional way requires 2 extra data copies (1 to read from kernel to user space and 1 to write from user to kernel space) and 4 context switches per read/write cycle. Those extra data copies use the CPU. Sending that file by using mmap of the file data and a cycle of write calls reduces the context switches to 2 per write call and avoids those 2 extra user data copies. Sending the same file via zero-copy reduces the context switches to 2 per sendfile call and eliminates all extra CPU data copies (both in user and in kernel space).

Zero-copy protocols are especially important for very high-speed networks in which the capacity of a network link approaches or exceeds the CPU's processing capacity. In such a case the CPU may spend nearly all of its time copying transferred data, and thus becomes a bottleneck that limits the communication rate to below the link's capacity. A rule of thumb used in the industry is that roughly one CPU clock cycle is needed to process one bit of incoming data.
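The difference between the traditional read/write cycle and a sendfile-based transfer can be sketched in Python, whose os.sendfile function wraps the sendfile system call. This is a minimal illustration, not a benchmark: the function names, the CHUNK size, and the use of a local socket pair in place of a real network connection are assumptions made for the example, and it presumes a platform (such as Linux) where sendfile accepts a Unix socket as its destination.

```python
import os
import socket
import tempfile

CHUNK = 64 * 1024  # illustrative buffer size, not taken from the article


def send_with_read_write(in_fd, sock):
    """Traditional path: every chunk is copied kernel -> user -> kernel."""
    total = 0
    while True:
        chunk = os.read(in_fd, CHUNK)  # copy from kernel into a user buffer
        if not chunk:
            return total
        sock.sendall(chunk)            # copy from user buffer back to kernel
        total += len(chunk)


def send_with_sendfile(in_fd, sock, size):
    """Zero-copy path: the kernel moves file data to the socket itself."""
    offset = 0
    while offset < size:
        sent = os.sendfile(sock.fileno(), in_fd, offset, size - offset)
        if sent == 0:                  # unexpected end of file
            break
        offset += sent
    return offset


def demo():
    """Send the same file both ways and return what the receiver got."""
    payload = b"zero-copy " * 1000
    results = []
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        for use_sendfile in (False, True):
            a, b = socket.socketpair()
            with a, b:
                os.lseek(f.fileno(), 0, os.SEEK_SET)
                if use_sendfile:
                    send_with_sendfile(f.fileno(), a, len(payload))
                else:
                    send_with_read_write(f.fileno(), a)
                a.shutdown(socket.SHUT_WR)  # signal end of data to receiver
                results.append(b"".join(iter(lambda: b.recv(CHUNK), b"")))
    return results[0], results[1]
```

Both functions deliver the same bytes. In the first, the payload passes through a user-space bytes object on every iteration, while in the second the user process only passes file descriptors, an offset, and a length to the kernel.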


Hardware implementations

An early implementation was IBM OS/360, where a program can instruct the channel subsystem to read blocks of data from one file or device into a buffer and write to another from the same buffer without moving the data.

Techniques for creating zero-copy software include the use of direct memory access (DMA)-based copying and memory-mapping through a memory management unit (MMU). These features require specific hardware support and usually involve particular memory alignment requirements.

A newer approach used by the Heterogeneous System Architecture (HSA) facilitates the passing of pointers between the CPU and the GPU and also other processors. This requires a unified address space for the CPU and the GPU.
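Memory-mapping and its alignment requirement can be illustrated from user space with Python's mmap module. The following is a sketch under stated assumptions: map_window and demo are hypothetical helper names, and the example only shows the software-visible side of MMU-based mapping (it does not demonstrate DMA). The mapping offset must be a multiple of mmap.ALLOCATIONGRANULARITY, so the helper rounds the requested offset down and exposes the wanted bytes through a memoryview slice, which refers to the mapped pages instead of copying them.

```python
import mmap
import os
import tempfile


def map_window(path, offset, length):
    """Map `length` bytes of a file starting at `offset`.

    Returns (mapping, view). The caller must release the view and then
    close the mapping.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    aligned = (offset // gran) * gran  # mmap offsets must be aligned
    delta = offset - aligned           # position of our data in the mapping
    with open(path, "rb") as f:
        mapping = mmap.mmap(f.fileno(), delta + length,
                            access=mmap.ACCESS_READ, offset=aligned)
    # Slicing a memoryview creates a new view onto the same pages.
    view = memoryview(mapping)[delta:delta + length]
    return mapping, view


def demo():
    """Map 16 bytes from the middle of a temporary file."""
    gran = mmap.ALLOCATIONGRANULARITY
    data = bytes(range(256)) * ((2 * gran) // 256 + 1)
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(data)
        path = f.name
    try:
        mapping, view = map_window(path, gran + 10, 16)
        try:
            got = bytes(view)  # explicit copy, made only for the comparison
        finally:
            view.release()
            mapping.close()
    finally:
        os.unlink(path)
    return got, data[gran + 10:gran + 26]
```

The only copy in this sketch is the explicit bytes(view) call used to compare results. A consumer that accepts buffer objects could work on the view directly.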


Program interfaces

Several operating systems support zero-copying of user data and file contents through specific APIs. Only a few well-known system calls and APIs available in the most popular OSs are listed here.

Novell NetWare supports a form of zero-copy through Event Control Blocks (ECBs), see NCOPY. The internal COPY command in some versions of DR-DOS since 1992 initiates this as well when COMMAND.COM detects that the files to be copied are stored on a NetWare file server; otherwise it falls back to normal file copying. The external MOVE command since DR DOS 6.0 (1991) and MS-DOS 6.0 (1993) internally performs a RENAME (causing just the directory entries to be modified in the file system instead of physically copying the file data) when the source and destination are located on the same logical volume.

The Linux kernel supports zero-copy through various system calls, such as:
* sendfile, sendfile64;
* splice;
* tee;
* vmsplice;
* process_vm_readv;
* process_vm_writev;
* copy_file_range;
* raw sockets with packet mmap or AF_XDP.

Some of them are specified in POSIX and thus also present in the BSD kernels or IBM AIX; some are unique to the Linux kernel API.

FreeBSD, NetBSD, OpenBSD, DragonFly BSD, etc. support zero-copy through at least these system calls:
* sendfile;
* write, writev + mmap when writing data to a network socket.

macOS should support zero-copy through the FreeBSD portion of the kernel because it offers the same system calls (and its manual pages are still tagged BSD), such as:
* sendfile.

Oracle Solaris supports zero-copy through at least these system calls:
* sendfile;
* sendfilev;
* write, writev + mmap.

Microsoft Windows supports zero-copy through at least this API:
* TransmitFile.

Java input streams can support zero-copy through the java.nio.channels.FileChannel's transferTo() method if the underlying operating system also supports zero-copy.

RDMA (Remote Direct Memory Access) protocols rely heavily on zero-copy techniques.
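As an illustration of one of the Linux calls listed above, copy_file_range can be reached from Python (3.8 or later) through os.copy_file_range. The sketch below is an assumption-laden example, not a reference implementation: copy_in_kernel and demo are hypothetical names, and the fallback to an ordinary user-space copy when the call is missing or fails is an illustrative policy chosen for portability, not something prescribed by the article.

```python
import os
import shutil
import tempfile


def copy_in_kernel(src_path, dst_path):
    """Copy a file, preferring copy_file_range. Returns the method used."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        if hasattr(os, "copy_file_range"):
            try:
                while remaining > 0:
                    # The kernel copies between the two descriptors; the
                    # data never enters this process's buffers.
                    n = os.copy_file_range(src.fileno(), dst.fileno(),
                                           remaining)
                    if n == 0:  # source ended earlier than expected
                        break
                    remaining -= n
                return "copy_file_range"
            except OSError:
                # Assumed fallback policy: start over with a plain copy.
                src.seek(0)
                dst.seek(0)
                dst.truncate()
        shutil.copyfileobj(src, dst)  # ordinary user-space copy loop
        return "fallback"


def demo():
    """Copy a temporary file and return (method, source, destination)."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "src.bin")
        dst = os.path.join(d, "dst.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(200_000))
        method = copy_in_kernel(src, dst)
        with open(src, "rb") as a, open(dst, "rb") as b:
            return method, a.read(), b.read()
```

Which branch runs depends on the platform, the kernel, and the file system, so the returned method name is reported rather than assumed.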


See also

* AF_XDP
* Call by reference
* Device driver
* Embedded system
* InfiniBand
* Locality of reference
* NCOPY
* netsniff-ng
* Programmed input/output
* Socket Direct Protocol
* Scatter/gather I/O

