HOME

TheInfoList



OR:

Git () is a distributed version control system: tracking changes in any set of files, usually used for coordinating work among
programmer A computer programmer, sometimes referred to as a software developer, a software engineer, a programmer or a coder, is a person who creates computer programs — often for larger computer software. A programmer is someone who writes/creates ...
s collaboratively developing
source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the ...
during
software development Software development is the process of conceiving, specifying, designing, programming, documenting, testing, and bug fixing involved in creating and maintaining applications, frameworks, or other software components. Software development invo ...
. Its goals include speed,
data integrity Data integrity is the maintenance of, and the assurance of, data accuracy and consistency over its entire life-cycle and is a critical aspect to the design, implementation, and usage of any system that stores, processes, or retrieves data. The ter ...
, and support for distributed, non-linear workflows (thousands of parallel branches running on different systems). "So I'm writing some scripts to try to track things a whole lot faster." Git was originally authored by
Linus Torvalds Linus Benedict Torvalds ( , ; born 28 December 1969) is a Finnish software engineer who is the creator and, historically, the lead developer of the Linux kernel, used by Linux distributions and other operating systems such as Android. He also ...
in 2005 for development of the
Linux kernel The Linux kernel is a free and open-source, monolithic, modular, multitasking, Unix-like operating system kernel. It was originally authored in 1991 by Linus Torvalds for his i386-based PC, and it was soon adopted as the kernel for the GNU ...
, with other kernel developers contributing to its initial development. Since 2005, Junio Hamano has been the core maintainer. As with most other distributed version control systems, and unlike most client–server systems, every Git
directory Directory may refer to: * Directory (computing), or folder, a file system structure in which to store computer files * Directory (OpenVMS command) * Directory service, a software application for organizing information about a computer network' ...
on every
computer A computer is a machine that can be programmed to carry out sequences of arithmetic or logical operations ( computation) automatically. Modern digital electronic computers can perform generic sets of operations known as programs. These prog ...
is a full-fledged repository with complete history and full version-tracking abilities, independent of network access or a central server. Git is
free and open-source software Free and open-source software (FOSS) is a term used to refer to groups of software consisting of both free software and open-source software where anyone is freely licensed to use, copy, study, and change the software in any way, and the source ...
distributed under the GPL-2.0-only license.


History

Git development began in April 2005, after many developers of the
Linux kernel The Linux kernel is a free and open-source, monolithic, modular, multitasking, Unix-like operating system kernel. It was originally authored in 1991 by Linus Torvalds for his i386-based PC, and it was soon adopted as the kernel for the GNU ...
gave up access to
BitKeeper BitKeeper is a software tool for distributed revision control of computer source code. Originally developed as proprietary software by BitMover Inc., a privately held company based in Los Gatos, California, it was released as open-source sof ...
, a proprietary source-control management (SCM) system that they had been using to maintain the project since 2002.BitKeeper and Linux: The end of the road?
The copyright holder of BitKeeper,
Larry McVoy Larry McVoy (born 1962 in Concord, Massachusetts, United States) is the CEO of BitMover, the company that makes BitKeeper, a version control system that was used from February 2002 to early 2005 to manage the source code of the Linux kernel. ...
, had withdrawn free use of the product after claiming that Andrew Tridgell had created SourcePuller by
reverse engineering Reverse engineering (also known as backwards engineering or back engineering) is a process or method through which one attempts to understand through deductive reasoning how a previously made device, process, system, or piece of software accompli ...
the BitKeeper protocols. The same incident also spurred the creation of another version-control system, Mercurial.
Linus Torvalds Linus Benedict Torvalds ( , ; born 28 December 1969) is a Finnish software engineer who is the creator and, historically, the lead developer of the Linux kernel, used by Linux distributions and other operating systems such as Android. He also ...
wanted a distributed system that he could use like BitKeeper, but none of the available free systems met his needs. Torvalds cited an example of a source-control management system needing 30 seconds to apply a patch and update all associated metadata, and noted that this would not scale to the needs of Linux kernel development, where synchronizing with fellow maintainers could require 250 such actions at once. For his design criterion, he specified that patching should take no more than three seconds, and added three more goals: * Take
Concurrent Versions System Concurrent Versions System (CVS, also known as the Concurrent Versioning System) is a revision control system originally developed by Dick Grune in July 1986. CVS operates as a front end to RCS, an earlier system which operates on single file ...
(CVS) as an example of what ''not'' to do; if in doubt, make the exact opposite decision. * Support a distributed,
BitKeeper BitKeeper is a software tool for distributed revision control of computer source code. Originally developed as proprietary software by BitMover Inc., a privately held company based in Los Gatos, California, it was released as open-source sof ...
-like workflow. * Include very strong safeguards against corruption, either accidental or malicious. These criteria eliminated every version-control system in use at the time, so immediately after the 2.6.12-rc2 Linux kernel development release, Torvalds set out to write his own. The development of Git began on 3 April 2005. Torvalds announced the project on 6 April and became self-hosting the next day. The first merge of multiple branches took place on 18 April. Torvalds achieved his performance goals; on 29 April, the nascent Git was benchmarked recording patches to the Linux kernel tree at the rate of 6.7 patches per second. On 16 June, Git managed the kernel 2.6.12 release. Torvalds turned over
maintenance Maintenance may refer to: Biological science * Maintenance of an organism * Maintenance respiration Non-technical maintenance * Alimony, also called ''maintenance'' in British English * Champerty and maintenance, two related legal doct ...
on 26 July 2005 to Junio Hamano, a major contributor to the project. Hamano was responsible for the 1.0 release on 21 December 2005.


Naming

Torvalds sarcastically quipped about the name ''git'' (which means "unpleasant person" in
British English British English (BrE, en-GB, or BE) is, according to Oxford Dictionaries, "English as used in Great Britain, as distinct from that used elsewhere". More narrowly, it can refer specifically to the English language in England, or, more broadl ...
slang): "I'm an egotistical bastard, and I name all my projects after myself. First '
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, whi ...
', now 'git'." The
man page A man page (short for manual page) is a form of software documentation usually found on a Unix or Unix-like operating system. Topics covered include computer programs (including library and system calls), formal standards and conventions, and e ...
describes Git as "the stupid content tracker". The read-me file of the source code elaborates further: The source code for Git refers to the program as, "the information manager from hell."


Releases

List of Git releases:


Design

Git's design was inspired by
BitKeeper BitKeeper is a software tool for distributed revision control of computer source code. Originally developed as proprietary software by BitMover Inc., a privately held company based in Los Gatos, California, it was released as open-source sof ...
and
Monotone Monotone refers to a sound, for example music or speech, that has a single unvaried tone. See: monophony. Monotone or monotonicity may also refer to: In economics *Monotone preferences, a property of a consumer's preference ordering. *Monotonic ...
. Git was originally designed as a low-level version-control system engine, on top of which others could write front ends, such as Cogito or StGIT. The core Git project has since become a complete version-control system that is usable directly. While strongly influenced by BitKeeper, Torvalds deliberately avoided conventional approaches, leading to a unique design.


Characteristics

Git's design is a synthesis of Torvalds's experience with Linux in maintaining a large distributed development project, along with his intimate knowledge of file-system performance gained from the same project and the urgent need to produce a working system in short order. These influences led to the following implementation choices: ; Strong support for non-linear development: Git supports rapid branching and merging, and includes specific tools for visualizing and navigating a non-linear development history. In Git, a core assumption is that a change will be merged more often than it is written, as it is passed around to various reviewers. In Git, branches are very lightweight: a branch is only a reference to one commit. With its parental commits, the full branch structure can be constructed. ; Distributed development: Like
Darcs Darcs is a distributed version control system created by David Roundy. Key features include the ability to choose which changes to accept from other repositories, interaction with either other local (on-disk) repositories or remote repositories v ...
,
BitKeeper BitKeeper is a software tool for distributed revision control of computer source code. Originally developed as proprietary software by BitMover Inc., a privately held company based in Los Gatos, California, it was released as open-source sof ...
, Mercurial,
Bazaar A bazaar () or souk (; also transliterated as souq) is a marketplace consisting of multiple small stalls or shops, especially in the Middle East, the Balkans, North Africa and India. However, temporary open markets elsewhere, such as in t ...
, and
Monotone Monotone refers to a sound, for example music or speech, that has a single unvaried tone. See: monophony. Monotone or monotonicity may also refer to: In economics *Monotone preferences, a property of a consumer's preference ordering. *Monotonic ...
, Git gives each developer a local copy of the full development history, and changes are copied from one such repository to another. These changes are imported as added development branches and can be merged in the same way as a locally developed branch. ; Compatibility with existing systems and protocols: Repositories can be published via
Hypertext Transfer Protocol Secure Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network, and is widely used on the Internet. In HTTPS, the communication protocol is en ...
(HTTPS),
Hypertext Transfer Protocol The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide We ...
(HTTP),
File Transfer Protocol The File Transfer Protocol (FTP) is a standard communication protocol used for the transfer of computer files from a server to a client on a computer network. FTP is built on a client–server model architecture using separate control and da ...
(FTP), or a Git protocol over either a plain socket or
Secure Shell The Secure Shell Protocol (SSH) is a cryptographic network protocol for operating network services securely over an unsecured network. Its most notable applications are remote login and command-line execution. SSH applications are based ...
(ssh). Git also has a CVS server emulation, which enables the use of existing CVS clients and IDE plugins to access Git repositories.
Subversion Subversion () refers to a process by which the values and principles of a system in place are contradicted or reversed in an attempt to transform the established social order and its structures of power, authority, hierarchy, and social norms ...
repositories can be used directly with git-svn. ; Efficient handling of large projects: Torvalds has described Git as being very fast and scalable, and performance tests done by Mozilla showed that it was an
order of magnitude An order of magnitude is an approximation of the logarithm of a value relative to some contextually understood reference value, usually 10, interpreted as the base of the logarithm and the representative of values of magnitude one. Logarithmic di ...
faster diffing large repositories than Mercurial and
GNU Bazaar GNU Bazaar (formerly Bazaar-NG, command line tool bzr) is a distributed and client–server revision control system sponsored by Canonical. Bazaar can be used by a single developer working on multiple branches of local content, or by teams col ...
; fetching version history from a locally stored repository can be one hundred times faster than fetching it from the remote server. ; Cryptographic authentication of history: The Git history is stored in such a way that the ID of a particular version (a ''commit'' in Git terms) depends upon the complete development history leading up to that commit. Once it is published, it is not possible to change the old versions without it being noticed. The structure is similar to a Merkle tree, but with added data at the nodes and leaves. ( Mercurial and
Monotone Monotone refers to a sound, for example music or speech, that has a single unvaried tone. See: monophony. Monotone or monotonicity may also refer to: In economics *Monotone preferences, a property of a consumer's preference ordering. *Monotonic ...
also have this property.) ; Toolkit-based design: Git was designed as a set of programs written in C and several shell scripts that provide wrappers around those programs. Although most of those scripts have since been rewritten in C for speed and portability, the design remains, and it is easy to chain the components together. ; Pluggable merge strategies: As part of its toolkit design, Git has a well-defined model of an incomplete merge, and it has multiple algorithms for completing it, culminating in telling the user that it is unable to complete the merge automatically and that manual editing is needed. ;
Garbage Garbage, trash, rubbish, or refuse is waste material that is discarded by humans, usually due to a perceived lack of utility. The term generally does not encompass bodily waste products, purely liquid or gaseous wastes, or toxic waste produ ...
accumulates until collected: Aborting operations or backing out changes will leave useless dangling objects in the database. These are generally a small fraction of the continuously growing history of wanted objects. Git will automatically perform
garbage collection Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclabl ...
when enough loose objects have been created in the repository. Garbage collection can be called explicitly using git gc. ; Periodic explicit object packing: Git stores each newly created object as a separate file. Although individually compressed, this takes up a great deal of space and is inefficient. This is solved by the use of ''packs'' that store a large number of objects delta-compressed among themselves in one file (or network byte stream) called a ''packfile''. Packs are compressed using the
heuristic A heuristic (; ), or heuristic technique, is any approach to problem solving or self-discovery that employs a practical method that is not guaranteed to be optimal, perfect, or rational, but is nevertheless sufficient for reaching an immediate ...
that files with the same name are probably similar, without depending on this for correctness. A corresponding index file is created for each packfile, telling the offset of each object in the packfile. Newly created objects (with newly added history) are still stored as single objects, and periodic repacking is needed to maintain space efficiency. The process of packing the repository can be very computationally costly. By allowing objects to exist in the repository in a loose but quickly generated format, Git allows the costly pack operation to be deferred until later, when time matters less, e.g., the end of a workday. Git does periodic repacking automatically, but manual repacking is also possible with the git gc command. For data integrity, both the packfile and its index have an
SHA-1 In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160- bit (20- byte) hash value known as a message digest – typically rendered as 40 hexa ...
checksum inside, and the file name of the packfile also contains an SHA-1 checksum. To check the integrity of a repository, run the git fsck command. Another property of Git is that it snapshots directory trees of files. The earliest systems for tracking versions of source code,
Source Code Control System Source Code Control System (SCCS) is a version control system designed to track changes in source code and other text files during the development of a piece of software. This allows the user to retrieve any of the previous versions of the origin ...
(SCCS) and
Revision Control System Revision Control System (RCS) is an early implementation of a version control system (VCS). It is a set of UNIX commands that allow multiple users to develop and maintain program code or documents. With RCS, users can make their own revisions of ...
(RCS), worked on individual files and emphasized the space savings to be gained from
interleaved deltas Interleaved deltas, or SCCS weave is a method used by the Source Code Control System to store all revisions of a file. All lines from all revisions are "woven" together in a single block of data, with interspersed control instructions indicating ...
(SCCS) or delta encoding (RCS) the (mostly similar) versions. Later revision-control systems maintained this notion of a file having an identity across multiple revisions of a project. However, Torvalds rejected this concept. Consequently, Git does not explicitly record file revision relationships at any level below the source-code tree. These implicit revision relationships have some significant consequences: * It is slightly more costly to examine the change history of one file than the whole project. To obtain a history of changes affecting a given file, Git must walk the global history and then determine whether each change modified that file. This method of examining history does, however, let Git produce with equal efficiency a single history showing the changes to an arbitrary set of files. For example, a subdirectory of the source tree plus an associated global header file is a very common case. * Renames are handled implicitly rather than explicitly. A common complaint with CVS is that it uses the name of a file to identify its revision history, so moving or renaming a file is not possible without either interrupting its history or renaming the history and thereby making the history inaccurate. Most post-CVS revision-control systems solve this by giving a file a unique long-lived name (analogous to an
inode The inode (index node) is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block locations of the object's data. File-system object attribut ...
number) that survives renaming. Git does not record such an identifier, and this is claimed as an advantage.
Source code In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the ...
files are sometimes split or merged, or simply renamed, and recording this as a simple rename would freeze an inaccurate description of what happened in the (immutable) history. Git addresses the issue by detecting renames while browsing the history of snapshots rather than recording it when making the snapshot. (Briefly, given a file in revision ''N'', a file of the same name in revision ''N'' − 1 is its default ancestor. However, when there is no like-named file in revision ''N'' − 1, Git searches for a file that existed only in revision ''N'' − 1 and is very similar to the new file.) However, it does require more
CPU A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, a ...
-intensive work every time the history is reviewed, and several options to adjust the heuristics are available. This mechanism does not always work; sometimes a file that is renamed with changes in the same commit is read as a deletion of the old file and the creation of a new file. Developers can work around this limitation by committing the rename and the changes separately. Git implements several merging strategies; a non-default strategy can be selected at merge time: * ''resolve'': the traditional
three-way merge In version control, merging (also called integration) is a fundamental operation that reconciles multiple changes made to a version-controlled collection of files. Most often, it is necessary when a file is modified on two independent branches an ...
algorithm. * ''recursive'': This is the default when pulling or merging one branch, and is a variant of the three-way merge algorithm. * ''octopus'': This is the default when merging more than two heads.


Data structures

Git's primitives are not inherently a source-code management system. Torvalds explains: From this initial design approach, Git has developed the full set of features expected of a traditional SCM, with features mostly being created as needed, then refined and extended over time. Git has two
data structure In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, ...
s: a mutable ''index'' (also called ''stage'' or ''cache'') that caches information about the working directory and the next revision to be committed; and an immutable, append-only ''object database''. The index serves as a connection point between the object database and the working tree. The object database contains five types of objects: * A blob ( binary large object) is the content of a file. Blobs have no proper file name, time stamps, or other metadata (A blob's name internally is a hash of its content.). In git each blob is a version of a file, it holds the file's data. * A tree object is the equivalent of a directory. It contains a list of file names, each with some type bits and a reference to a blob or tree object that is that file, symbolic link, or directory's contents. These objects are a snapshot of the source tree. (In whole, this comprises a Merkle tree, meaning that only a single hash for the root tree is sufficient and actually used in commits to precisely pinpoint to the exact state of whole tree structures of any number of sub-directories and files.) * A commit object links tree objects together into history. It contains the name of a tree object (of the top-level source directory), a timestamp, a log message, and the names of zero or more parent commit objects. * A tag object is a container that contains a reference to another object and can hold added meta-data related to another object. Most commonly, it is used to store a
digital signature A digital signature is a mathematical scheme for verifying the authenticity of digital messages or documents. A valid digital signature, where the prerequisites are satisfied, gives a recipient very high confidence that the message was created b ...
of a commit object corresponding to a particular release of the data being tracked by Git. * A packfile object collects various other objects into a zlib-compressed bundle for compactness and ease of transport over network protocols. Each object is identified by a SHA-1
hash Hash, hashes, hash mark, or hashing may refer to: Substances * Hash (food), a coarse mixture of ingredients * Hash, a nickname for hashish, a cannabis product Hash mark *Hash mark (sports), a marking on hockey rinks and gridiron football fiel ...
of its contents. Git computes the hash and uses this value for the object's name. The object is put into a directory matching the first two characters of its hash. The rest of the hash is used as the file name for that object. Git stores each revision of a file as a unique blob. The relationships between the blobs can be found through examining the tree and commit objects. Newly added objects are stored in their entirety using zlib compression. This can consume a large amount of disk space quickly, so objects can be combined into ''packs'', which use
delta compression Delta encoding is a way of storing or transmitting data in the form of '' differences'' (deltas) between sequential data rather than complete files; more generally this is known as data differencing. Delta encoding is sometimes called delta compre ...
to save space, storing blobs as their changes relative to other blobs. Additionally, git stores labels called refs (short for references) to indicate the locations of various commits. They are stored in the reference database and are respectively: * Heads (branches): Named references that are advanced automatically to the new commit when a commit is made on top of them. * HEAD: A reserved head that will be compared against the working tree to create a commit. * Tags: Like branch references but fixed to a particular commit. Used to label important points in history.


References

Every object in the Git database that is not referred to may be cleaned up by using a garbage collection command or automatically. An object may be referenced by another object or an explicit reference. Git knows different types of references. The commands to create, move, and delete references vary. git show-ref lists all references. Some types are: * ''heads'': refers to an object locally, * ''remotes'': refers to an object which exists in a remote repository, * ''stash'': refers to an object not yet committed, * ''meta'': e.g. a configuration in a bare repository, user rights; the refs/meta/config namespace was introduced retrospectively, gets used by Gerrit, * ''tags'': see above.


Implementations

Git (the main implementation in C) is primarily developed on
Linux Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, whi ...
, although it also supports most major operating systems, including the BSDs (
DragonFly BSD DragonFly BSD is a free and open-source Unix-like operating system forked from FreeBSD 4.8. Matthew Dillon, an Amiga developer in the late 1980s and early 1990s and FreeBSD developer between 1994 and 2003, began working on DragonFly BSD ...
,
FreeBSD FreeBSD is a free and open-source Unix-like operating system descended from the Berkeley Software Distribution (BSD), which was based on Research Unix. The first version of FreeBSD was released in 1993. In 2005, FreeBSD was the most popular ...
,
NetBSD NetBSD is a free and open-source Unix operating system based on the Berkeley Software Distribution (BSD). It was the first open-source BSD descendant officially released after 386BSD was forked. It continues to be actively developed and is ava ...
, and
OpenBSD OpenBSD is a security-focused, free and open-source, Unix-like operating system based on the Berkeley Software Distribution (BSD). Theo de Raadt created OpenBSD in 1995 by forking NetBSD 1.0. According to the website, the OpenBSD project e ...
), Solaris,
macOS macOS (; previously OS X and originally Mac OS X) is a Unix operating system developed and marketed by Apple Inc. since 2001. It is the primary operating system for Apple's Mac computers. Within the market of desktop and la ...
, and
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for se ...
. The first Windows
port A port is a maritime facility comprising one or more wharves or loading areas, where ships load and discharge cargo and passengers. Although usually situated on a sea coast or estuary, ports can also be found far inland, such as H ...
of Git was primarily a Linux-emulation framework that hosts the Linux version. Installing Git under Windows creates a similarly named Program Files directory containing the
Mingw-w64 Mingw-w64 is a free and open source software development environment to create (cross-compile) Microsoft Windows PE applications. It was forked in 2005–2010 from MinGW (''Minimalist GNU for Windows''). Mingw-w64 includes a port of the GN ...
port of the
GNU Compiler Collection The GNU Compiler Collection (GCC) is an optimizing compiler produced by the GNU Project supporting various programming languages, hardware architectures and operating systems. The Free Software Foundation (FSF) distributes GCC as free softwar ...
,
Perl Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offic ...
5, MSYS2 (itself a fork of
Cygwin Cygwin ( ) is a POSIX-compatible programming and runtime environment that runs natively on Microsoft Windows. Under Cygwin, source code designed for Unix-like operating systems may be compiled with minimal modification and executed. The Cygwin in ...
, a Unix-like emulation environment for Windows) and various other Windows ports or emulations of Linux utilities and libraries. Currently, native Windows builds of Git are distributed as 32- and 64-bit installers. The git official website currently maintains a build of Git for Windows, still using the MSYS2 environment. The JGit implementation of Git is a pure
Java Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's mo ...
software library, designed to be embedded in any Java application. JGit is used in the Gerrit code-review tool, and in EGit, a Git client for the
Eclipse An eclipse is an astronomical event that occurs when an astronomical object or spacecraft is temporarily obscured, by passing into the shadow of another body or by having another body pass between it and the viewer. This alignment of three c ...
IDE. Go-git is an
open-source Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized so ...
implementation of Git written in pure Go. It is currently used for backing projects as a SQL interface for Git code repositories and providing
encryption In cryptography, encryption is the process of encoding information. This process converts the original representation of the information, known as plaintext, into an alternative form known as ciphertext. Ideally, only authorized parties can d ...
for Git. The Dulwich implementation of Git is a pure Python software component for Python 2.7, 3.4 and 3.5. The libgit2 implementation of Git is an ANSI C software library with no other dependencies, which can be built on multiple platforms, including Windows, Linux, macOS, and BSD. It has bindings for many programming languages, including
Ruby A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called ...
, Python, and Haskell. JS-Git is a
JavaScript JavaScript (), often abbreviated as JS, is a programming language that is one of the core technologies of the World Wide Web, alongside HTML and CSS. As of 2022, 98% of websites use JavaScript on the client side for webpage behavior, of ...
implementation of a subset of Git.


Git server

As Git is a distributed version control system, it could be used as a server out of the box. It's shipped with a built-in command git daemon which starts a simple TCP server running on the GIT protocol. Dedicated Git HTTP servers help (amongst other features) by adding access control, displaying the contents of a Git repository via the web interfaces, and managing multiple repositories. Already existing Git repositories can be cloned and shared to be used by others as a centralized repo. It can also be accessed via remote shell just by having the Git software installed and allowing a user to log in. Git servers typically listen on TCP port 9418.


Open source

* Hosting the Git server using the Git Binary. * Gerrit, a Git server configurable to support code reviews and provide access via ssh, an integrated Apache MINA or OpenSSH, or an integrated
Jetty A jetty is a structure that projects from land out into water. A jetty may serve as a breakwater, as a walkway, or both; or, in pairs, as a means of constricting a channel. The term derives from the French word ', "thrown", signifying somet ...
web server. Gerrit provides integration for LDAP, Active Directory, OpenID, OAuth, Kerberos/GSSAPI, X509 https client certificates. With Gerrit 3.0 all configurations will be stored as Git repositories, and no database is required to run. Gerrit has a pull-request feature implemented in its core but lacks a GUI for it. * Phabricator, a spin-off from Facebook. As Facebook primarily uses Mercurial, Git support is not as prominent. *
RhodeCode RhodeCode is an open source self-hosted platform for behind-the-firewall source code management. It provides centralized control over Git, Mercurial, and Subversion repositories within an organization, with common authentication and permission man ...
Community Edition (CE), supporting Git, Mercurial and
Subversion Subversion () refers to a process by which the values and principles of a system in place are contradicted or reversed in an attempt to transform the established social order and its structures of power, authority, hierarchy, and social norms ...
with an AGPLv3 license. * Kallithea, supporting both Git and Mercurial, developed in Python with
GPL license The GNU General Public License (GNU GPL or simply GPL) is a series of widely used free software licenses that guarantee end users the four freedoms to run, study, share, and modify the software. The license was the first copyleft for general us ...
. * External projects like gitolite, which provide scripts on top of Git software to provide fine-grained access control. * There are several other FLOSS solutions for self-hosting, including Gogs and Gitea, a fork of Gogs, both developed in Go language with
MIT license The MIT License is a permissive free software license originating at the Massachusetts Institute of Technology (MIT) in the late 1980s. As a permissive license, it puts only very limited restriction on reuse and has, therefore, high license comp ...
.


Git server as a service

There are many offerings of Git repositories as a service. The most popular are
GitHub GitHub, Inc. () is an Internet hosting service for software development and version control using Git. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, cont ...
,
SourceForge SourceForge is a web service that offers software consumers a centralized online location to control and manage open-source software projects and research business software. It provides source code repository hosting, bug tracking, mirroring ...
, Bitbucket and
GitLab GitLab Inc. is an open-core company that operates GitLab, a DevOps software package which can develop, secure, and operate software. The open source software project was created by Ukrainian developer Dmitriy Zaporozhets and Dutch developer ...
.


Adoption

The Eclipse Foundation reported in its annual community survey that as of May 2014, Git is now the most widely used source-code management tool, with 42.9% of professional software developers reporting that they use Git as their primary source-control system compared with 36.3% in 2013, 32% in 2012; or for Git responses excluding use of
GitHub GitHub, Inc. () is an Internet hosting service for software development and version control using Git. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, cont ...
: 33.3% in 2014, 30.3% in 2013, 27.6% in 2012 and 12.8% in 2011. Open-source directory
Black Duck Open Hub Black Duck Open Hub, formerly Ohloh, is a website which provides a web services suite and online community platform that aims to index the open-source software development community. It was founded by former Microsoft managers Jason Allen and Sc ...
reports a similar uptake among open-source projects. Stack Overflow has included
version control In software engineering, version control (also known as revision control, source control, or source code management) is a class of systems responsible for managing changes to computer programs, documents, large web sites, or other collections o ...
in their annual developer survey in 2015 (16,694 responses), 2017 (30,730 responses), 2018 (74,298 responses) and 2022 (71,379 reponses). Git was the overwhelming favorite of responding developers in these surveys, reporting as high as 93.9% in 2022. Version control systems used by responding developers: The UK IT jobs website itjobswatch.co.uk reports that as of late September 2016, 29.27% of UK permanent software development job openings have cited Git, ahead of 12.17% for Microsoft Team Foundation Server, 10.60% for
Subversion Subversion () refers to a process by which the values and principles of a system in place are contradicted or reversed in an attempt to transform the established social order and its structures of power, authority, hierarchy, and social norms ...
, 1.30% for Mercurial, and 0.48% for Visual SourceSafe.


Extensions

There are many ''Git extensions'', lik
Git LFS
which started as an extension to Git in the GitHub community and is now widely used by other repositories. Extensions are usually independently developed and maintained by different people, but at some point in the future, a widely used extension can be merged with Git. Other open-source Git extensions include: *
git-annex git-annex is a distributed file synchronization system written in Haskell. It aims to solve the problem of sharing and synchronizing collections of large files independent from a commercial service or even a central server. History The develop ...
, a distributed file synchronization system based on Git * git-flow, a set of Git extensions to provide high-level repository operations fo
Vincent Driessen's branching model

git-machete
a repository organizer & tool for automating rebase/merge/pull/push operations Microsoft developed the
Virtual File System for Git Virtual File System for Git (VFS for Git), developed by Microsoft, is an extension to the Git version control system. Overview VFS for Git is designed to ease the handling of enterprise-scale Git repositories, such as the Microsoft Windows operat ...
(VFS for Git; formerly Git Virtual File System or GVFS) extension to handle the size of the
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for se ...
source-code tree as part of their 2017 migration from
Perforce Perforce, legally Perforce Software, Inc., is an American developer of software used for developing and running applications, including version control software, web-based repository management, developer collaboration, application lifecycle man ...
. VFS for Git allows cloned repositories to use placeholders whose contents are downloaded only once a file is accessed.


Conventions

Git does not impose many restrictions on how it should be used, but some conventions are adopted in order to organize histories, especially those which require the cooperation of many contributors. * The ''master'' branch is created by default with ''git init'' and is often used as the branch that other changes are merged into. Correspondingly, the default name of the upstream remote is ''origin'' and so the name of the default remote branch is ''origin/master''. The use of ''master'' as the default branch name is not universally true. Repositories created in GitHub and GitLab will initialize with a ''main'' branch instead of ''master''. * Pushed commits should usually not be overwritten, but should rather be ''revert''ed (a commit is made on top which reverses the changes to an earlier commit). This prevents shared new commits based on shared commits from being invalid because the commit on which they are based does not exist in the remote. If the commits contain sensitive information, they should be removed, which involves a more complex procedure to rewrite history. * The ''git-flow'' workflow and naming conventions are often adopted to distinguish feature specific unstable histories (feature/*), unstable shared histories (develop), production ready histories (main), and emergency patches to released products (hotfix). *''Pull requests'' are not a feature of git, but are commonly provided by git cloud services. A pull request is a request by one user to merge a branch of their repository fork into another repository sharing the same history (called the ''upstream'' remote). The underlying function of a pull request is no different than that of an administrator of a repository pulling changes from another remote (the repository that is the source of the pull request). However, the pull request itself is a ticket managed by the hosting server which initiates scripts to perform these actions; it is not a feature of git SCM.


Security

Git does not provide access-control mechanisms, but was designed for operation with other tools that specialize in access control. On 17 December 2014, an exploit was found affecting the
Windows Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for se ...
and
macOS macOS (; previously OS X and originally Mac OS X) is a Unix operating system developed and marketed by Apple Inc. since 2001. It is the primary operating system for Apple's Mac computers. Within the market of desktop and la ...
versions of the Git client. An attacker could perform
arbitrary code execution In computer security, arbitrary code execution (ACE) is an attacker's ability to run any commands or code of the attacker's choice on a target machine or in a target process. An arbitrary code execution vulnerability is a security flaw in softw ...
on a target computer with Git installed by creating a malicious Git tree (directory) named ''.git'' (a directory in Git repositories that stores all the data of the repository) in a different case (such as .GIT or .Git, needed because Git does not allow the all-lowercase version of ''.git'' to be created manually) with malicious files in the ''.git/hooks'' subdirectory (a folder with executable files that Git runs) on a repository that the attacker made or on a repository that the attacker can modify. If a Windows or Mac user ''pulls'' (downloads) a version of the repository with the malicious directory, then switches to that directory, the .git directory will be overwritten (due to the case-insensitive trait of the Windows and Mac filesystems) and the malicious executable files in ''.git/hooks'' may be run, which results in the attacker's commands being executed. An attacker could also modify the ''.git/config'' configuration file, which allows the attacker to create malicious Git aliases (aliases for Git commands or external commands) or modify extant aliases to execute malicious commands when run. The vulnerability was patched in version 2.2.1 of Git, released on 17 December 2014, and announced the next day. Git version 2.6.1, released on 29 September 2015, contained a patch for a security vulnerability () that allowed arbitrary code execution. The vulnerability was exploitable if an attacker could convince a victim to clone a specific URL, as the arbitrary commands were embedded in the URL itself. An attacker could use the exploit via a
man-in-the-middle attack In cryptography and computer security, a man-in-the-middle, monster-in-the-middle, machine-in-the-middle, monkey-in-the-middle, meddler-in-the-middle, manipulator-in-the-middle (MITM), person-in-the-middle (PITM) or adversary-in-the-middle (AiTM) ...
if the connection was unencrypted, as they could redirect the user to a URL of their choice. Recursive clones were also vulnerable since they allowed the controller of a repository to specify arbitrary URLs via the gitmodules file. Git uses
SHA-1 In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160- bit (20- byte) hash value known as a message digest – typically rendered as 40 hexa ...
hashes internally. Linus Torvalds has responded that the hash was mostly to guard against accidental corruption, and the security a cryptographically secure hash gives was just an accidental side effect, with the main security being signing elsewhere. Since a demonstration of the SHAttered attack against git in 2017, git was modified to use a SHA-1 variant resistant to this attack. A plan for hash function transition is being written since February 2020.


Trademark

"Git" is a registered word
trademark A trademark (also written trade mark or trade-mark) is a type of intellectual property consisting of a recognizable sign, design, or expression that identifies products or services from a particular source and distinguishes them from ot ...
of
Software Freedom Conservancy Software Freedom Conservancy, Inc. is an organization that provides a non-profit home and infrastructure support for free and open source software projects. The organization was established in 2006, and as of June 2022, had over 40 member pr ...
unde
US500000085961336
since 2015-02-03.


See also

*
Comparison of version-control software In software development, version control is a class of systems responsible for managing changes to computer programs or other collections of information such that revisions have a logical and consistent organization. The following tables inclu ...
*
Comparison of source-code-hosting facilities A source-code-hosting facility (also known as forge) is a file archive and web hosting facility for source code of software, documentation, web pages, and other works, accessible either publicly or privately. They are often used by open-source s ...
*
List of version-control software This is a list of notable software for version control. Local data model In the local-only approach, all developers must use the same file system. Open source * Revision Control System (RCS) – stores the latest version and backward del ...


Notes


References


External links

* * {{Authority control 2005 software Distributed version control systems Free version control software Free software programmed in C Free software programmed in Perl Self-hosting software Linus Torvalds Software using the GPL license Software that uses Tk (software) Version control systems