In computing, particularly in the context of the Unix operating system and its workalikes, fork is an operation whereby a process creates a copy of itself. It is an interface which is required for compliance with the POSIX and Single UNIX Specification standards. It is usually implemented as a C standard library (libc) wrapper to the fork, clone, or other system calls of the kernel. Fork is the primary method of process creation on Unix-like operating systems.


Overview

In multitasking operating systems, processes (running programs) need a way to create new processes, e.g. to run other programs. Fork and its variants are typically the only way of doing so in Unix-like systems. For a process to start the execution of a different program, it first forks to create a copy of itself. Then the copy, called the "child process", calls the exec system call to overlay itself with the other program: it ceases execution of its former program in favor of the other.

The fork operation creates a separate address space for the child. The child process has an exact copy of all the memory segments of the parent process. In modern UNIX variants that follow the virtual memory model from SunOS-4.0, copy-on-write semantics are implemented and the physical memory need not actually be copied. Instead, virtual memory pages in both processes may refer to the same pages of physical memory until one of them writes to such a page, at which point that page is copied. This optimization is important in the common case where fork is used in conjunction with exec to execute a new program: typically, the child process performs only a small set of actions before it ceases execution of its program in favor of the program to be started, and it requires very few, if any, of its parent's data structures.

When a process calls fork, it is deemed the parent process and the newly created process is its child. After the fork, both processes not only run the same program, but they resume execution as though both had called the system call. They can then inspect the call's return value (zero in the child, the child's process identifier in the parent) to determine their status, child or parent, and act accordingly.
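The fork-then-exec sequence described above can be sketched in C as follows. This is a minimal illustration rather than a canonical implementation: the helper name run_in_child is hypothetical, and the path /bin/sh is assumed to name a POSIX shell on the target system.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `sh -c command` in a child process and
   return its exit status, or -1 if anything goes wrong. */
int run_in_child(const char *command)
{
    pid_t pid = fork();            /* both processes return from this call */

    if (pid == -1)
        return -1;                 /* fork failed: no child was created */

    if (pid == 0) {
        /* Child: overlay this program with the shell (assumed at /bin/sh). */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                /* reached only if execl itself failed */
    }

    /* Parent: pid holds the child's process ID; wait for it to finish. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_in_child("exit 3") would be expected to return 3, since the shell started in the child exits with that status. The child touches almost none of the parent's memory before calling execl, which is the case where copy-on-write pays off.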


History

One of the earliest references to a fork concept appeared in "A Multiprocessor System Design" by Melvin Conway, published in 1962. Conway's paper motivated the implementation by L. Peter Deutsch of fork in the GENIE time-sharing system, where the concept was borrowed by Ken Thompson for its earliest appearance in Research Unix. Fork later became a standard interface in POSIX.


Communication

The child process starts off with a copy of its parent's file descriptors. For interprocess communication, the parent process will often create one or several pipes, and then after forking the processes will close the ends of the pipes that they don't need.
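The pipe pattern described above might look like the following sketch, in which the child writes a message and the parent reads it. The helper name read_from_child and its buffer-based interface are assumptions made for illustration.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes msg into a pipe and the parent
   reads it into buf (NUL-terminated). Returns the number of bytes read,
   or -1 if the pipe or the child could not be created. */
ssize_t read_from_child(const char *msg, char *buf, size_t buflen)
{
    int fds[2];                    /* fds[0]: read end, fds[1]: write end */

    if (buflen == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();            /* the child inherits both descriptors */
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);             /* child only writes: drop the read end */
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w == -1 ? 1 : 0);
    }

    close(fds[1]);                 /* parent only reads: drop the write end,
                                      otherwise read() never sees end-of-file */
    size_t total = 0;
    ssize_t n;
    while (total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);         /* reap the child */
    return (ssize_t)total;
}
```

Closing the unused ends is not only tidiness: as long as any process holds the write end open, a reader blocked on the pipe will not receive end-of-file.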


Variants


Vfork

Vfork is a variant of fork with the same calling convention and much the same semantics, but only to be used in restricted situations. It originated in the 3BSD version of Unix, the first Unix to support virtual memory. It was standardized by POSIX, which permitted vfork to have exactly the same behavior as fork, but it was marked obsolescent in the 2004 edition and was replaced by posix_spawn() (which is typically implemented via vfork) in subsequent editions.

When a vfork system call is issued, the parent process is suspended until the child process has either completed execution or been replaced with a new executable image via one of the "exec" family of system calls. The child borrows the MMU setup from the parent, and memory pages are shared between the parent and child process with no copying done, and in particular with no copy-on-write semantics; hence, if the child process modifies any of the shared pages, no new page is created and the modification is visible to the parent process too. Since no page copying is involved (which would consume additional memory), this technique is an optimization over plain fork in full-copy environments when used with exec. In POSIX, using vfork for any purpose except as a prelude to an immediate call to a function from the exec family (and a select few other operations) gives rise to undefined behavior. Because the child borrows data structures rather than copying them, vfork is still faster than a fork that uses copy-on-write semantics.

System V did not support this function call before System V Release 4 (SVR4) was introduced, because the memory sharing that it causes is error-prone. Similarly, the Linux man page for vfork strongly discourages its use. Other problems with vfork include deadlocks that might occur in multi-threaded programs due to interactions with dynamic linking.

As a replacement for the vfork interface, POSIX introduced the posix_spawn family of functions that combine the actions of fork and exec. These functions may be implemented as library routines in terms of fork, as is done in Linux, or in terms of vfork for better performance, as is done in Solaris, but the POSIX specification notes that they were "designed as kernel operations", especially for operating systems running on constrained hardware and real-time systems.

While the 4.4BSD implementation got rid of the vfork implementation, causing vfork to have the same behavior as fork, it was later reinstated in the NetBSD operating system for performance reasons. Some embedded operating systems such as uClinux omit fork and only implement vfork, because they need to operate on devices where copy-on-write is impossible to implement due to lack of an MMU.
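Since POSIX steers new code toward posix_spawn rather than vfork, a brief sketch of that interface may be useful. The helper name spawn_and_wait is hypothetical, /bin/sh is assumed to be a POSIX shell, and the file-actions and attributes arguments are left as NULL to request default behavior.

```c
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;             /* the calling process's environment */

/* Hypothetical helper: start `sh -c command` with posix_spawn and
   return its exit status, or -1 on failure. */
int spawn_and_wait(const char *command)
{
    char *const argv[] = { "sh", "-c", (char *)command, NULL };
    pid_t pid;

    /* posix_spawn combines the fork and exec steps in one call; it
       returns 0 on success and an error number otherwise. */
    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike a hand-written vfork followed by exec, no user code runs in the child here, so the restrictions on what a vfork child may safely do never come into play.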


Rfork

The Plan 9 operating system, created by the designers of Unix, includes fork but also a variant called "rfork" that permits fine-grained sharing of resources between parent and child processes, including the address space (except for a stack segment, which is unique to each process), environment variables and the filesystem namespace; this makes it a unified interface for the creation of both processes and threads within them. Both FreeBSD and IRIX adopted the rfork system call from Plan 9, the latter renaming it "sproc".


Clone

clone is a system call in the Linux kernel that creates a child process that may share parts of its execution context with the parent. Like FreeBSD's rfork and IRIX's sproc, Linux's clone was inspired by Plan 9's rfork and can be used to implement threads (though application programmers will typically use a higher-level interface such as pthreads, implemented on top of clone). The "separate stacks" feature from Plan 9 and IRIX has been omitted because (according to Linus Torvalds) it causes too much overhead.
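To illustrate how clone differs from fork, the following Linux-only sketch uses the glibc clone wrapper with the CLONE_VM flag, so that the child runs in the parent's address space and a write made by the child is visible to the parent. The function names and the 64 KiB stack size are arbitrary choices for this example, and real programs would normally use pthreads instead.

```c
#define _GNU_SOURCE                /* exposes clone() and CLONE_* in <sched.h> */
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

/* Entry point of the child: sets the int that arg points to. */
static int set_flag(void *arg)
{
    *(int *)arg = 1;
    return 0;                      /* becomes the child's exit status */
}

/* Hypothetical helper: returns 1 if the child's write was visible to
   the parent, 0 if it was not, or -1 on failure. */
int clone_shares_memory(void)
{
    int flag = 0;
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The caller supplies the child's stack. Stacks grow downward on
       most architectures, so the top of the buffer is passed. SIGCHLD
       makes the child waitable in the same way as a forked child. */
    pid_t pid = clone(set_flag, stack + CHILD_STACK_SIZE,
                      CLONE_VM | SIGCHLD, &flag);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }

    free(stack);
    return flag;                   /* 1: the address space was shared */
}
```

With fork in place of clone, the same experiment would leave flag at 0 in the parent, because the child would have modified its own copy of the page.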


Forking in other operating systems

In the original design of the VMS operating system (1977), a copy operation with subsequent mutation of the content of a few specific addresses for the new process, as in forking, was considered risky, because errors in the current process state may be copied to a child process. Instead, the metaphor of process spawning is used: each component of the memory layout of the new process is newly constructed from scratch. The spawn metaphor was later adopted in Microsoft operating systems (1993).

The POSIX-compatibility component of VM/CMS (OpenExtensions) provides a very limited implementation of fork, in which the parent is suspended while the child executes, and the child and the parent share the same address space. This is essentially a vfork labelled as a fork. (Note this applies to the CMS guest operating system only; other VM guest operating systems, such as Linux, provide standard fork functionality.)


Application usage

The following variant of the Hello World program demonstrates the mechanics of the fork system call in the C programming language. The program forks into two processes, each deciding what functionality it performs based on the return value of the fork system call. Boilerplate code such as header inclusions has been omitted. The program consists of a single function, int main(void), which is dissected statement by statement below.

    pid_t pid = fork();

The first statement in main calls the fork system call to split execution into two processes. The return value of fork is recorded in a variable of type pid_t, which is the POSIX type for process identifiers (PIDs).

    if (pid == -1)

Minus one indicates an error in fork: no new process was created, so an error message is printed.

If fork was successful, then there are now two processes, both executing the main function from the point where fork has returned. To make the processes perform different tasks, the program must branch on the return value of fork to determine whether it is executing as the child process or the parent process.

    else if (pid == 0)

In the child process, the return value appears as zero (which is an invalid process identifier). The child process prints the desired greeting message, then exits. (The POSIX _exit function must be used here instead of the C standard exit function, because exit would run the atexit handlers and flush the stdio buffers that the child inherited from the parent, which could duplicate output.)

    else

The other process, the parent, receives from fork the process identifier of the child, which is always a positive number. The parent process passes this identifier to the waitpid system call to suspend execution until the child has exited. When this has happened, the parent resumes execution and exits by means of the return statement.
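The statements dissected above can be assembled into the following sketch. It is a reconstruction under stated assumptions, not the original listing: the message strings are invented for illustration, the body is wrapped in a hypothetical function named fork_hello rather than main so that it can be called from other code, and the greeting is written with write rather than printf so that it is not left behind in a stdio buffer that _exit does not flush.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wrapper; in the program described above this body is main. */
int fork_hello(void)
{
    pid_t pid = fork();

    if (pid == -1) {
        /* No child was created: report the error. */
        perror("fork");
        return EXIT_FAILURE;
    }
    else if (pid == 0) {
        /* Child: print the greeting (assumed wording) and leave via
           _exit, skipping the stdio cleanup inherited from the parent. */
        static const char msg[] = "Hello from the child process!\n";
        (void)write(STDOUT_FILENO, msg, sizeof msg - 1);
        _exit(EXIT_SUCCESS);
    }
    else {
        /* Parent: pid is the child's PID; block until the child exits. */
        int status;
        (void)waitpid(pid, &status, 0);
    }
    return EXIT_SUCCESS;
}
```

A complete program would only need a main function that returns the result of fork_hello().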


See also

* Fork bomb
* Fork–exec
* exit (system call)
* wait (system call)
* spawn (computing)

