"Zero-copy" describes computer operations in which the CPU does not perform the task of copying data from one memory area to another, or in which unnecessary data copies are avoided. The technique is frequently used to save CPU cycles and memory bandwidth in time-consuming tasks, such as transmitting a file at high speed over a network, thus improving the performance of programs (processes) executed by a computer.


Principle

Zero-copy programming techniques can be used when exchanging data within a user space process (i.e. between two or more threads), between two or more processes (see also the producer–consumer problem), and when data has to be accessed, copied, or moved inside kernel space or between a user space process and the kernel space portions of an operating system (OS).

Usually, when a user space process has to execute system operations such as reading or writing data from or to a device (e.g. a disk or a NIC) through its high-level software interface, or moving data from one device to another, it has to perform one or more system calls that are then executed in kernel space by the operating system. If data has to be copied or moved from a source to a destination and both are located inside kernel space (e.g. two files, or a file and a network card), then the unnecessary data copies from kernel space to user space and back again can be avoided by using special (zero-copy) system calls, which are available in recent versions of most popular operating systems.

Zero-copy versions of operating system elements, such as device drivers, file systems, and network protocol stacks, greatly increase the performance of certain application programs (which become processes when executed) and use system resources more efficiently. Performance is enhanced by allowing the CPU to move on to other tasks while data copies or processing proceed in parallel in another part of the machine. Zero-copy operations also reduce the number of time-consuming context switches between user space and kernel space. System resources are used more efficiently because having a sophisticated CPU perform extensive data copy operations, a relatively simple task, is wasteful if other, simpler system components can do the copying.

As an example, reading a file and then sending it over a network the traditional way requires 2 extra data copies (1 to read from kernel to user space + 1 to write from user to kernel space) and 4 context switches per read/write cycle. Those extra data copies use the CPU. Sending that file by mapping the file data with mmap and issuing a cycle of write calls reduces the context switches to 2 per write call and avoids those 2 extra user data copies. Sending the same file via zero copy reduces the context switches to 2 per sendfile call and eliminates all extra CPU data copies (both in user and in kernel space).

Zero-copy protocols are especially important for very high-speed networks in which the capacity of a network link approaches or exceeds the CPU's processing capacity. In such a case the CPU may spend nearly all of its time copying transferred data, and thus becomes a bottleneck that limits the communication rate to below the link's capacity. A rule of thumb used in the industry is that roughly one CPU clock cycle is needed to process one bit of incoming data.
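
The difference between the traditional read/write cycle and a sendfile-style transfer can be sketched in Python. This is a minimal illustration under stated assumptions, not a benchmark: the function names, the temporary file, and the local socket pair are all hypothetical stand-ins for a real file and network connection. socket.sendfile() uses os.sendfile() where the platform provides it and otherwise falls back to ordinary send() calls, so whether the transfer is actually zero-copy depends on the operating system.

```python
import os
import socket
import tempfile


def send_with_copies(path, sock, chunk_size=65536):
    """Traditional loop: every chunk crosses the user/kernel boundary twice."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # kernel -> user-space buffer
            if not chunk:
                break
            sock.sendall(chunk)         # user-space buffer -> kernel socket buffer
            sent += len(chunk)
    return sent


def send_with_sendfile(path, sock):
    """Ask the kernel to move the data; returns the number of bytes sent."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo(payload=b"zero-copy " * 400):
    """Send a small temporary file over a local socket pair with both methods.

    The payload is kept small (4000 bytes) so that it fits in the socket
    buffer without a concurrent reader.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    results = {}
    try:
        for name, sender in (("read/write", send_with_copies),
                             ("sendfile", send_with_sendfile)):
            left, right = socket.socketpair()
            with left, right:
                sent = sender(path, left)
                left.shutdown(socket.SHUT_WR)  # signal end of stream
                received = bytearray()
                while True:
                    part = right.recv(65536)
                    if not part:
                        break
                    received += part
            results[name] = (sent, bytes(received))
    finally:
        os.unlink(path)
    return results
```

Both senders deliver identical bytes; the difference is only in where the copying happens. In the first function the file data passes through a Python bytes object on every iteration, while in the second the application never touches the file contents.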


Hardware implementations

An early implementation was IBM OS/360, where a program can instruct the channel subsystem to read blocks of data from one file or device into a buffer and write to another from the same buffer without moving the data.

Techniques for creating zero-copy software include the use of direct memory access (DMA)-based copying and memory-mapping through a memory management unit (MMU). These features require specific hardware support and usually involve particular memory alignment requirements.

A newer approach used by the Heterogeneous System Architecture (HSA) facilitates the passing of pointers between the CPU and the GPU, and also other processors. This requires a unified address space for the CPU and the GPU.
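
The idea of handing over a pointer instead of a copy has a rough in-process analogue that can be shown without special hardware. The sketch below is only an analogy, assuming Python's memoryview as the "pointer"; it does not involve DMA, an MMU, or HSA, and the function name and sample bytes are made up for illustration.

```python
def share_without_copy():
    """Contrast a view onto a buffer with an independent copy of it."""
    buf = bytearray(b"header:payload-bytes")
    view = memoryview(buf)
    payload = view[7:]        # a window onto buf; no bytes are copied
    copied = bytes(buf[7:])   # slicing the bytearray itself makes a copy
    buf[7] = ord("P")         # change one byte in the underlying buffer
    # The view observes the change, the copy does not.
    return bytes(payload), copied
```

A consumer that receives `payload` sees later changes to the underlying buffer, which is exactly why shared-buffer designs need agreement on who may write and when.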


Program interfaces

Several operating systems support zero-copying of user data and file contents through specific APIs. Only a few well-known system calls and APIs available in popular operating systems are listed here.

Novell NetWare supports a form of zero-copy through Event Control Blocks (ECBs), see NCOPY. The internal COPY command in some versions of DR-DOS since 1992 initiates this as well when COMMAND.COM detects that the files to be copied are stored on a NetWare file server; otherwise it falls back to normal file copying. The external MOVE command since DR DOS 6.0 (1991) and MS-DOS 6.0 (1993) internally performs a RENAME (causing just the directory entries to be modified in the file system instead of physically copying the file data) when the source and destination are located on the same logical volume.

The Linux kernel supports zero-copy through various system calls, such as:
* sendfile, sendfile64
* splice
* tee
* vmsplice
* process_vm_readv
* process_vm_writev
* copy_file_range
* raw sockets with packet mmap or AF_XDP

Some of them have counterparts in the BSD kernels or IBM AIX, although sendfile is not specified in POSIX and its signature differs between systems; others are unique to the Linux kernel API.
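
As an illustration of one of the calls listed above, the following sketch copies a file with copy_file_range through Python's os.copy_file_range wrapper (available in Python 3.8 and later on Linux). The function name kernel_copy and the returned labels are hypothetical, and the fallback path is an assumption added so that the sketch also runs where the call is missing or the file system rejects it.

```python
import os


def kernel_copy(src_path, dst_path):
    """Copy a file inside the kernel where possible; return the method used."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        remaining = os.fstat(src).st_size
        if hasattr(os, "copy_file_range"):
            try:
                while remaining > 0:
                    # The kernel moves the bytes; nothing enters user space.
                    copied = os.copy_file_range(src, dst, remaining)
                    if copied == 0:
                        break
                    remaining -= copied
            except OSError:
                pass  # e.g. unsupported file system; use the fallback below
            if remaining == 0:
                return "copy_file_range"
        # Fallback: rewind and copy through a user-space buffer.
        os.lseek(src, 0, os.SEEK_SET)
        os.lseek(dst, 0, os.SEEK_SET)
        os.ftruncate(dst, 0)
        while True:
            chunk = os.read(src, 65536)
            if not chunk:
                break
            pending = memoryview(chunk)
            while pending:
                pending = pending[os.write(dst, pending):]
        return "read/write"
    finally:
        os.close(src)
        os.close(dst)
```

Calling `kernel_copy("a.bin", "b.bin")` returns a label naming the path that was taken, which makes it easy to check on a given system whether the in-kernel copy was actually used.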

FreeBSD, NetBSD, OpenBSD, DragonFly BSD, etc. support zero-copy through at least these system calls:
* sendfile
* write, writev + mmap when writing data to a network socket
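
The mmap-plus-write combination can be sketched as follows. This is a minimal example under stated assumptions: the names send_mapped and demo are hypothetical, and a local socket pair stands in for a network socket. Mapping the file removes the explicit read into a user-space buffer; the kernel may still copy the mapped pages into its socket buffers when the data is written.

```python
import mmap
import os
import socket
import tempfile


def send_mapped(path, sock):
    """Map a file read-only and write the mapping straight to a socket."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            sock.sendall(mapped)  # no intermediate bytes object is created
        return size


def demo(payload=b"mapped data " * 300):
    """Send a small temporary file over a local socket pair via mmap.

    The payload (3600 bytes) is small enough to fit in the socket buffer
    without a concurrent reader.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        left, right = socket.socketpair()
        with left, right:
            sent = send_mapped(path, left)
            left.shutdown(socket.SHUT_WR)  # signal end of stream
            received = bytearray()
            while True:
                part = right.recv(65536)
                if not part:
                    break
                received += part
    finally:
        os.unlink(path)
    return sent, bytes(received)
```

For large files a real sender would usually map and send the file in windows rather than mapping it whole, so that the mapping does not have to fit in the process address space.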

macOS should support zero-copy through the FreeBSD-derived portion of its kernel, because it offers the same system calls (and its manual pages are still tagged BSD), such as:
* sendfile

Oracle Solaris supports zero-copy through at least these system calls:
* sendfile
* sendfilev
* write, writev + mmap

Microsoft Windows supports zero-copy through at least this API function:
* TransmitFile

Java programs can use zero-copy through the transferTo() method of java.nio.channels.FileChannel if the underlying operating system also supports zero copy.

RDMA (Remote Direct Memory Access) protocols rely heavily on zero-copy techniques.


See also

* AF_XDP
* Call by reference
* Device driver
* Embedded system
* InfiniBand
* Locality of reference
* NCOPY
* netsniff-ng
* Programmed input/output
* Sockets Direct Protocol
* Scatter/gather I/O

