
In software development, a fork is a codebase that is created by duplicating an existing codebase and, generally, is subsequently modified independently of the original. Software built from a fork initially behaves identically to software built from the original code, but as the source code is increasingly modified, the resulting software tends to behave increasingly differently from the original. A fork is a form of branching, but generally involves storing the forked files separately from the original rather than in the original repository. Reasons for forking a codebase include user preference, stagnated or discontinued development of the original software, or a schism in the developer community. Forking proprietary software (such as Unix) is prohibited by copyright law without explicit permission, but free and open-source software, by definition, may be forked without permission.
Etymology
The word ''fork'' has been used to mean "to divide in branches, go separate ways" as early as the 14th century. In the context of software development, ''fork'' was used in the sense of creating a revision control branch by Eric Allman as early as 1980, in the context of the Source Code Control System. The term was in use on Usenet by 1983 for the process of creating a subgroup to move topics of discussion to.
Although ''fork'' is not known to have been used in the sense of a community schism during the origins of Lucid Emacs (now XEmacs) (1991) or the Berkeley Software Distributions (BSDs) (1993–1994), Russ Nelson used the term ''shattering'' in this sense in 1993 (attributing it to John Gilmore). In 1995, ''fork'' was used to describe the XEmacs split, and was an understood usage in the GNU Project by 1996.
The word is used similarly for the fork() system call, which causes a running process to split into two, typically to allow the two processes to perform different tasks in parallel.
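As an illustration of the system call, the following minimal sketch uses Python's `os.fork()` wrapper on a Unix-like system (it is not available on Windows). The function name `fork_demo` and the counter values are hypothetical and chosen only for this example. It shows that after the split, parent and child each work on their own copy of the program state.

```python
import os


def fork_demo():
    """Split the running process in two and show that each copy has its own state.

    Hypothetical helper for illustration; requires a Unix-like system.
    """
    counter = 0
    read_fd, write_fd = os.pipe()  # channel for the child to report its value

    pid = os.fork()  # returns 0 in the child, the child's PID in the parent
    if pid == 0:
        # Child process: works on its own copy of `counter`.
        os.close(read_fd)
        counter += 100
        os.write(write_fd, str(counter).encode())
        os.close(write_fd)
        os._exit(0)  # leave immediately without running the parent's cleanup

    # Parent process: its copy of `counter` is unaffected by the child.
    os.close(write_fd)
    counter += 1
    with os.fdopen(read_fd) as reader:
        child_counter = int(reader.read())
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return counter, child_counter
```

Calling `fork_demo()` returns the parent's counter and the value reported by the child, which differ because each process modified only its own copy.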
Forking of free and open-source software
Free and open-source software may be legally forked without prior approval of those currently developing, managing, or distributing the software, per both The Free Software Definition and The Open Source Definition.
In free software, forks often result from a schism over different goals or personality clashes. In a fork, both parties assume nearly identical code bases, but typically only the larger group, or whoever controls the web site, will retain the full original name and the associated user community. Thus, there is a reputation penalty associated with forking. The relationship between the different teams can be cordial or very bitter. On the other hand, a ''friendly fork'' or a ''soft fork'' is a fork that does not intend to compete, but wants to eventually merge with the original.
Eric S. Raymond, in his essay ''Homesteading the Noosphere'', stated that "The most important characteristic of a fork is that it spawns competing projects that cannot later exchange code, splitting the potential developer community". He also addresses forking in the Jargon File.
David A. Wheeler, in "Why Open Source Software / Free Software (OSS/FS, FLOSS, or FOSS)? Look at the Numbers!", notes four possible outcomes of a fork, with examples:
# The death of the fork. This is by far the most common case. It is easy to declare a fork, but considerable effort to continue independent development and support.
# A re-merging of the fork (''e.g.'', egcs becoming "blessed" as the new version of the GNU Compiler Collection).
# The death of the original (''e.g.'', the X.Org Server succeeding and XFree86 dying).
# Successful branching, typically with differentiation (''e.g.'', OpenBSD and NetBSD).
Distributed revision control (DVCS) tools have popularised a less emotive use of the term "fork", blurring the distinction with "branch". With a DVCS such as Mercurial or Git, the normal way to contribute to a project is to first create a personal branch of the repository, independent of the main repository, and later seek to have your changes integrated with it. Sites such as GitHub, Bitbucket and Launchpad provide free DVCS hosting expressly supporting independent branches, such that the technical, social and financial barriers to forking a source code repository are massively reduced. GitHub uses "fork" as its term for this method of contribution to a project.
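The contribution workflow described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: the `Repository` class, its methods, and the commit messages are hypothetical, commits are identified by their message text, and nothing here reflects how Git or Mercurial actually store history.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Toy repository: an owner name plus an ordered list of commit messages."""
    owner: str
    commits: list = field(default_factory=list)

    def commit(self, message):
        self.commits.append(message)

    def fork(self, new_owner):
        # A fork starts as a full, independent copy of the existing history.
        return Repository(new_owner, list(self.commits))

    def unmerged_from(self, other):
        """Commits present in `other` that this repository does not have yet."""
        return [c for c in other.commits if c not in self.commits]

    def merge(self, other):
        # Integrate the other repository's new commits (no conflict handling).
        self.commits.extend(self.unmerged_from(other))


upstream = Repository("upstream", ["initial import", "add parser"])
personal = upstream.fork("contributor")   # personal copy, independent of upstream
personal.commit("fix parser bug")         # work happens in the fork
upstream.commit("update docs")            # upstream keeps moving meanwhile
proposed = upstream.unmerged_from(personal)  # what the contributor asks to integrate
upstream.merge(personal)
```

In this sketch, `proposed` holds only the contributor's new commit, and after `merge` the upstream history contains both lines of work while the personal copy remains unchanged.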
Forks often restart version numbering from numbers typically used for initial versions of programs, such as 0.0.1, 0.1, or 1.0, even if the original software was at another version such as 3.0, 4.0, or 5.0. An exception is sometimes made when the forked software is designed to be a drop-in replacement for the original project, ''e.g.'' MariaDB for MySQL or LibreOffice for OpenOffice.org.
The BSD licenses permit forks to become proprietary software, and copyleft proponents say that commercial incentives thus make proprietisation almost inevitable. (Copyleft licenses can, however, be circumvented via dual-licensing with a proprietary grant in the form of a Contributor License Agreement.) Examples include macOS (based on the proprietary NeXTSTEP and the open-source FreeBSD), Cedega and CrossOver (proprietary forks of Wine, though CrossOver tracks Wine and contributes considerably), EnterpriseDB (a fork of PostgreSQL, adding Oracle compatibility features), Supported PostgreSQL with their proprietary ESM storage system, and Netezza's proprietary highly scalable derivative of PostgreSQL. Some of these vendors contribute back changes to the community project, while some keep their changes as their own competitive advantages.
Forking proprietary software
In proprietary software, the copyright is usually held by the employing entity, not by the individual software developers. Proprietary code is thus more commonly forked when the owner needs to develop two or more versions, such as a windowed version and a command line version, or versions for differing operating systems, such as a word processor for IBM PC compatible machines and Macintosh computers. Generally, such internal forks will concentrate on having the same look, feel, data format, and behavior between platforms so that a user familiar with one can also be productive or share documents generated on the other. This is almost always an economic decision to generate a greater market share and thus pay back the associated extra development costs created by the fork.
Notable proprietary forks not of this kind are the many varieties of proprietary Unix, almost all derived from AT&T Unix under license and all called "Unix", but increasingly mutually incompatible (see "Fear of forking", an essay about forking in free software projects, by Rick Moen). ''See'' Unix wars.
External links
* Right to Fork at Meatball Wiki
* A PhD examining forking (Nyman, 2015): "Understanding Code Forking in Open Source Software – An examination of code forking, its effect on open source software, and how it is viewed and practiced by developers"