The bit is the most basic
unit of information in
computing
and digital
communication
s. The name is a
portmanteau of binary digit.
The bit represents a
logical state with one of two possible
values. These values are most commonly represented as either "1" or "0", but other representations such as ''true''/''false'', ''yes''/''no'', ''on''/''off'', or ''+''/''−'' are also widely used.
The relation between these values and the physical states of the underlying
storage
or
device
is a matter of convention, and different assignments may be used even within the same device or
program. A bit may be physically implemented with a two-state device.
The symbol for the binary digit is either "bit" per recommendation by the
IEC 80000-13:2008 standard, or the lowercase character "b", as recommended by the
IEEE 1541-2002 standard.
A contiguous group of binary digits is commonly called a ''
bit string'', a bit vector, or a single-dimensional (or multi-dimensional) ''
bit array''.
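As a rough illustration of a bit array, the sketch below packs bits into a Python `bytearray`, eight per byte. The class name `BitArray` and its methods are hypothetical, chosen for this example only. Bit 0 is assumed to be the least significant bit of the first byte, which is one convention among several.

```python
class BitArray:
    """Fixed-length array of bits packed eight per byte (illustrative only)."""

    def __init__(self, length: int) -> None:
        self.length = length
        # Round up so that every bit has a byte to live in.
        self._data = bytearray((length + 7) // 8)

    def _check(self, index: int) -> None:
        if not 0 <= index < self.length:
            raise IndexError("bit index out of range")

    def get(self, index: int) -> int:
        self._check(index)
        byte, offset = divmod(index, 8)
        return (self._data[byte] >> offset) & 1

    def set(self, index: int, value: int) -> None:
        self._check(index)
        byte, offset = divmod(index, 8)
        if value:
            self._data[byte] |= 1 << offset
        else:
            # Mask to eight bits so the result stays a valid byte value.
            self._data[byte] &= ~(1 << offset) & 0xFF


bits = BitArray(12)
bits.set(3, 1)
bits.set(10, 1)
print([bits.get(i) for i in range(12)])
```

Twelve bits occupy two bytes here, so four bits of the second byte are simply unused padding.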
A group of eight bits is called one ''
byte
'', but historically the size of the byte is not strictly defined.
Frequently, half, full, double and quadruple words consist of a number of bytes which is a low power of two. A string of four bits is a ''
nibble''.
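To make the byte and nibble relationship concrete, the following minimal sketch splits an eight-bit value into its two four-bit halves. The function name `split_nibbles` is an assumption made for illustration, and the sketch assumes the common eight-bit byte.

```python
from typing import Tuple


def split_nibbles(byte: int) -> Tuple[int, int]:
    """Return (high nibble, low nibble) of an 8-bit value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value that fits in eight bits")
    # Shift the upper four bits down; mask off the lower four bits.
    return byte >> 4, byte & 0x0F


high, low = split_nibbles(0xA7)
print(f"{high:04b} {low:04b}")  # prints "1010 0111"
```

Each nibble corresponds to exactly one hexadecimal digit, which is why `0xA7` splits into `A` (1010) and `7` (0111).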
In
information theory, one bit is the
information entropy of a random
binary variable that is 0 or 1 with equal probability,
or the information that is gained when the value of such a variable becomes known.
As a
unit of information, the bit is also known as a ''
shannon'',
named after
Claude E. Shannon
.
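The information-theoretic definition can be checked numerically. The sketch below evaluates the standard binary entropy formula H(p) = −p log₂ p − (1 − p) log₂(1 − p). The function name `binary_entropy` is chosen here for illustration.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits (shannons) of a variable that is 1 with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


print(binary_entropy(0.5))  # 1.0: a fair binary variable carries one bit
print(binary_entropy(0.9))  # a biased variable carries less than one bit
```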
History
The encoding of data by discrete bits was used in the
punched cards invented by
Basile Bouchon and Jean-Baptiste Falcon (1732), developed by
Joseph Marie Jacquard (1804), and later adopted by
Semyon Korsakov,
Charles Babbage,
Hermann Hollerith, and early computer manufacturers like
IBM. A variant of that idea was the perforated
paper tape. In all those systems, the medium (card or tape) conceptually carried an array of hole positions; each position could be either punched through or not, thus carrying one bit of information. The encoding of text by bits was also used in
Morse code
(1844) and early digital communications machines such as
teletypes
and
stock ticker machines (1870).
Ralph Hartley suggested the use of a logarithmic measure of information in 1928.
Claude E. Shannon first used the word "bit" in his seminal 1948 paper "
A Mathematical Theory of Communication".
He attributed its origin to
John W. Tukey, who had written a Bell Labs memo on 9 January 1947 in which he contracted "binary information digit" to simply "bit".
Vannevar Bush had written in 1936 of "bits of information" that could be stored on the
punched cards used in the mechanical computers of that time.
The first programmable computer, built by
Konrad Zuse, used binary notation for numbers.
Physical representation
A bit can be stored by a digital device or other physical system that exists in either of two possible distinct
states. These may be the two stable states of a flip-flop, two positions of an
electrical switch, two distinct
voltage or
current levels allowed by a
circuit
, two distinct levels of
light intensity, two directions of
magnetization or
polarization
, the orientation of reversible double stranded
DNA, etc.
Bits can be implemented in several forms. In most modern computing devices, a bit is usually represented by an
electrical voltage or
current pulse, or by the electrical state of a flip-flop circuit.
For devices using
positive logic
, a digit value of 1 (or a logical value of true) is represented by a more positive voltage relative to the representation of 0. The specific voltages are different for different logic families and variations are permitted to allow for component aging and noise immunity. For example, in
transistor–transistor logic (TTL) and compatible circuits, digit values 0 and 1 at the output of a device are represented by no higher than 0.4 volts and no lower than 2.6 volts, respectively; while TTL inputs are specified to recognize 0.8 volts or below as 0 and 2.2 volts or above as 1.
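The input thresholds quoted above can be turned into a small classifier. This is only a sketch under positive logic. The constant and function names are hypothetical, and the numeric thresholds are the ones stated in the text rather than values from any particular datasheet.

```python
from typing import Optional

# Input thresholds quoted in the text; real parts should be checked
# against their datasheets.
TTL_INPUT_LOW_MAX = 0.8   # volts: at or below this reads as 0
TTL_INPUT_HIGH_MIN = 2.2  # volts: at or above this reads as 1


def read_ttl_input(volts: float) -> Optional[int]:
    """Map an input voltage to a bit under positive logic.

    Returns None for voltages in the undefined band between the thresholds.
    """
    if volts <= TTL_INPUT_LOW_MAX:
        return 0
    if volts >= TTL_INPUT_HIGH_MIN:
        return 1
    return None


for v in (0.3, 1.5, 3.1):
    print(v, read_ttl_input(v))
```

The gap between the two thresholds is what provides noise immunity: a signal must move well away from one level before it can be read as the other.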
Transmission and processing
Bits are transmitted one at a time in
serial transmission, and by a multiple number of bits in
parallel transmission. A
bitwise operation optionally processes bits one at a time. Data transfer rates are usually measured in decimal SI multiples of the unit
bit per second (bit/s), such as kbit/s.
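As a simple illustration of serial transmission and decimal rate units, the sketch below emits the bits of a byte one at a time and estimates a transfer time. Both function names are assumptions. Sending the least significant bit first is also just one possible convention, since real links define their own bit order.

```python
from typing import Iterator


def serialize(byte: int) -> Iterator[int]:
    """Yield the eight bits of a byte one at a time, least significant first."""
    for position in range(8):
        yield (byte >> position) & 1


def transfer_seconds(num_bits: int, rate_kbit_per_s: float) -> float:
    """Time to send num_bits at a rate given in kbit/s (1 kbit = 1000 bits)."""
    return num_bits / (rate_kbit_per_s * 1000)


print(list(serialize(0b01000001)))   # [1, 0, 0, 0, 0, 0, 1, 0]
print(transfer_seconds(8_000, 64))   # 0.125
```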
Storage
In the earliest non-electronic information processing devices, such as Jacquard's loom or Babbage's
Analytical Engine, a bit was often stored as the position of a mechanical lever or gear, or the presence or absence of a hole at a specific point of a
paper card or
tape
. The first electrical devices for discrete logic (such as
elevator
and
traffic light control
circuits,
telephone switches, and Konrad Zuse's computer) represented bits as the states of
electrical relays which could be either "open" or "closed". When relays were replaced by
vacuum tube
s, starting in the 1940s, computer builders experimented with a variety of storage methods, such as pressure pulses traveling down a
mercury delay line
, charges stored on the inside surface of a
cathode-ray tube, or opaque spots printed on
glass discs by
photolithographic techniques.
In the 1950s and 1960s, these methods were largely supplanted by
magnetic storage devices such as
magnetic-core memory,
magnetic tapes,
drums, and
disks, where a bit was represented by the polarity of
magnetization of a certain area of a
ferromagnetic film, or by a change in polarity from one direction to the other. The same principle was later used in the
magnetic bubble memory
developed in the 1980s, and is still found in various
magnetic strip items such as
metro
tickets and some
credit cards.
In modern
semiconductor memory, such as
dynamic random-access memory, the two values of a bit may be represented by two levels of
electric charge stored in a
capacitor
. In certain types of
programmable logic arrays and
read-only memory, a bit may be represented by the presence or absence of a conducting path at a certain point of a circuit. In
optical discs, a bit is encoded as the presence or absence of a
microscopic pit on a reflective surface. In one-dimensional
bar codes, bits are encoded as the thickness of alternating black and white lines.
Unit and symbol
The bit is not defined in the
International System of Units (SI). However, the
International Electrotechnical Commission issued standard
IEC 60027, which specifies that the symbol for binary digit should be 'bit', and this should be used in all multiples, such as 'kbit', for kilobit.
However, the lower-case letter 'b' is widely used as well and was recommended by the
IEEE 1541 Standard (2002). In contrast, the upper case letter 'B' is the standard and customary symbol for byte.
Multiple bits
Multiple bits may be expressed and represented in several ways. For convenience of representing commonly recurring groups of bits in information technology, several
units of information have traditionally been used. The most common is the unit
byte
, coined by
Werner Buchholz in June 1956, which historically was used to represent the group of bits used to encode a single
character
of text (until
UTF-8
multibyte encoding took over) in a computer
and for this reason it was used as the basic
addressable element in many
computer architecture
s. The trend in hardware design converged on the most common implementation of using eight bits per byte, as it is widely used today. However, because of the ambiguity of relying on the underlying hardware design, the unit
octet
was defined to explicitly denote a sequence of eight bits.
Computers usually manipulate bits in groups of a fixed size, conventionally named "
words
". Like the byte, the number of bits in a word also varies with the hardware design, and is typically between 8 and 80 bits, or even more in some specialized computers. In the 21st century, retail personal or server computers have a word size of 32 or 64 bits.
The
International System of Units defines a series of decimal prefixes for multiples of standardized units which are commonly also used with the bit and the byte. The prefixes kilo (10³) through yotta (10²⁴) increment by multiples of one thousand, and the corresponding units are the kilobit (kbit) through the yottabit (Ybit).
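The decimal prefixes can be applied mechanically. The following sketch formats a bit count with the prefixes kilo through yotta. The helper name `format_bits` and the list `SI_PREFIXES` are hypothetical names introduced for this example.

```python
# Decimal SI prefixes from kilo (10^3) through yotta (10^24);
# the empty string stands for the unprefixed unit.
SI_PREFIXES = ["", "k", "M", "G", "T", "P", "E", "Z", "Y"]


def format_bits(count: int) -> str:
    """Render a bit count with the largest prefix that keeps the value >= 1."""
    value = float(count)
    index = 0
    while value >= 1000 and index < len(SI_PREFIXES) - 1:
        value /= 1000  # each prefix step is a factor of one thousand
        index += 1
    return f"{value:g} {SI_PREFIXES[index]}bit"


print(format_bits(512))        # 512 bit
print(format_bits(1_500))      # 1.5 kbit
print(format_bits(8_000_000))  # 8 Mbit
```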
Information capacity and information compression
When the information capacity of a storage system or a communication channel is presented in ''bits'' or ''bits per second'', this often refers to binary digits, that is, to the
computer hardware capacity to store binary data (0 or 1, up or down, current or not, etc.).
Information capacity of a storage system is only an upper bound to the quantity of information stored therein. If the two possible values of one bit of storage are not equally likely, that bit of storage contains less than one bit of information. If the value is completely predictable, then the reading of that value provides no information at all (zero entropic bits, because no resolution of uncertainty occurs and therefore no information is available). If a computer file that uses ''n'' bits of storage contains only ''m'' < ''n'' bits of information, then that information can in principle be encoded in about ''m'' bits, at least on the average. This principle is the basis of
data compression
technology. By analogy, the hardware binary digits correspond to the amount of storage space available (like the number of buckets available to store things), and the information content corresponds to the filling, which comes in different levels of granularity (fine or coarse, that is, compressed or uncompressed information). When the granularity is finer, that is, when the information is more compressed, the same bucket can hold more.
For example, it is estimated that the combined technological capacity of the world to store information provides 1,300
exabytes of hardware digits. However, when this storage space is filled and the corresponding content is optimally compressed, this only represents 295 exabytes of information.
When optimally compressed, the resulting carrying capacity approaches
Shannon information or
information entropy.
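The gap between storage used and information held can be observed with a general-purpose compressor. The sketch below is illustrative only. `empirical_entropy_bits` is a hypothetical helper that gives a crude estimate by treating each byte as an independent symbol, and `zlib` is a practical compressor, not an optimal one.

```python
import math
import zlib
from collections import Counter


def empirical_entropy_bits(data: bytes) -> float:
    """Zeroth-order entropy estimate of data in bits, from byte frequencies."""
    counts = Counter(data)
    total = len(data)
    # Average information per byte, in bits, under the observed frequencies.
    per_symbol = sum(c / total * math.log2(total / c) for c in counts.values())
    return per_symbol * total


predictable = b"\x00" * 10_000        # every byte is fully predictable
storage_bits = len(predictable) * 8   # n: bits of storage used
compressed_bits = len(zlib.compress(predictable)) * 8

print(storage_bits)                   # 80000
print(compressed_bits)                # far fewer bits than were stored
print(empirical_entropy_bits(predictable))
```

Because the content is completely predictable, the estimated information is zero bits even though 80,000 bits of storage are occupied. The compressed size is small but not zero, since the compressed stream still carries framing and length information.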
Bit-based computing
Certain
bitwise
computer
processor instructions (such as ''bit set'') operate at the level of manipulating bits rather than manipulating data interpreted as an aggregate of bits.
In the 1980s, when
bitmapped computer displays became popular, some computers provided specialized
bit block transfer instructions to set or copy the bits that corresponded to a given rectangular area on the screen.
In most computers and programming languages, when a bit within a group of bits, such as a byte or word, is referred to, it is usually specified by a number from 0 upwards corresponding to its position within the byte or word. However, 0 can refer to either the
most or
least significant bit depending on the context.
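The bit-level operations and the two numbering conventions can be sketched as follows. All function names are assumptions made for illustration. The helpers number bits from 0 at the least significant end (LSB 0), and `msb0_to_lsb0` converts an index counted from the most significant end.

```python
def is_bit_set(value: int, position: int) -> bool:
    """True if the bit at position (0 = least significant) is set."""
    return ((value >> position) & 1) == 1


def set_bit(value: int, position: int) -> int:
    """Return value with the bit at position forced to 1."""
    return value | (1 << position)


def clear_bit(value: int, position: int) -> int:
    """Return value with the bit at position forced to 0."""
    return value & ~(1 << position)


def msb0_to_lsb0(position: int, width: int = 8) -> int:
    """Convert an index counted from the most significant bit to LSB-0."""
    return width - 1 - position


byte = 0b1000_0000
print(is_bit_set(byte, 7))                # True: bit 7 in LSB-0 numbering
print(is_bit_set(byte, msb0_to_lsb0(0)))  # True: the same bit is bit 0 in MSB-0
print(bin(set_bit(byte, 0)))              # 0b10000001
print(bin(clear_bit(byte, 7)))            # 0b0
```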
Other information units
Similar to
torque
and
energy
in physics,
information-theoretic information and data storage size have the same
dimensionality of
units of measurement
, but there is in general no meaning to adding, subtracting or otherwise combining the units mathematically, although one may act as a bound on the other.
Units of information used in information theory include the ''
shannon'' (Sh), the ''
natural unit of information'' (nat) and the ''
hartley'' (Hart). One shannon is the maximum amount of information needed to specify the state of one bit of storage. These are related by 1 Sh ≈ 0.693 nat ≈ 0.301 Hart.
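The conversion factors follow from the change of logarithm base: one shannon equals ln 2 nats and log₁₀ 2 hartleys. A minimal sketch, with constant and function names chosen here for illustration:

```python
import math

NATS_PER_SHANNON = math.log(2)        # about 0.693
HARTLEYS_PER_SHANNON = math.log10(2)  # about 0.301


def shannons_to_nats(sh: float) -> float:
    """Convert an amount of information from shannons to nats."""
    return sh * NATS_PER_SHANNON


def shannons_to_hartleys(sh: float) -> float:
    """Convert an amount of information from shannons to hartleys."""
    return sh * HARTLEYS_PER_SHANNON


print(round(shannons_to_nats(1), 3))      # 0.693
print(round(shannons_to_hartleys(1), 3))  # 0.301
```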
Some authors also define a binit as an arbitrary information unit equivalent to some fixed but unspecified number of bits.
See also
*
Byte
*
Integer (computer science)
*
Primitive data type
*
Trit (Trinary digit)
*
Qubit (quantum bit)
*
Bitstream
*
Entropy (information theory)
*
Bit rate and
baud rate
*
Binary numeral system
*
Ternary numeral system
*
Shannon (unit)
*
Nibble
External links
Bit Calculator – a tool providing conversions between bit, byte, kilobit, kilobyte, megabit, megabyte, gigabit, gigabyte
BitXByteConverter – a tool for computing file sizes, storage capacity, and digital information in various units