alphabet (computer science)

TheInfoList

In
formal language theory In mathematics, computer science, and linguistics, a formal language consists of string (computer science), words whose symbol (formal), letters are taken from an alphabet (computer science), alphabet and are well-formedness, well-formed accordin ...
, an alphabet is a non-empty set of
symbols A symbol is a mark, sign, or word In linguistics, a word of a spoken language can be defined as the smallest sequence of phonemes that can be uttered in isolation with semantic, objective or pragmatics, practical meaning (linguistics), meanin ...
/
glyph In typography File:metal movable type.jpg, 225px, Movable type being assembled on a composing stick using pieces that are stored in the type case shown below it Typography is the art and technique of typesetting, arranging type to make wr ...
s, typically thought of as representing letters, characters, or digits but among other possibilities the "symbols" could also be a set of
phoneme In phonology and linguistics, a phoneme is a unit of sound that distinguishes one word from another in a particular language. For example, in most List of dialects of English, dialects of English, with the notable exception of the West Midlan ...
s (sound units). Alphabets in this technical sense of a set are used in a diverse range of fields including logic, mathematics, computer science, and linguistics. An alphabet may have any
cardinality In mathematics Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algebra), space (geometry), and calculus, change (mathematical analysis, analysis). It ...
("size") and depending on its purpose maybe be
finite Finite is the opposite of Infinity, infinite. It may refer to: * Finite number (disambiguation) * Finite set, a set whose cardinality (number of elements) is some natural number * Finite verb, a verb form that has a subject, usually being inflected ...
(e.g., the alphabet of letters "a" through "z"),
countable In mathematics Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algebra), space (geometry), and calculus, change (mathematical analysis, analysis). It ...
(e.g., $\$), or even
uncountable In mathematics, an uncountable set (or uncountably infinite set) is an infinite set that contains too many Element (mathematics), elements to be countable set, countable. The uncountability of a set is closely related to its cardinal number: a set ...

(e.g., $\$).
String String or strings may refer to: *String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects Arts, entertainment, and media Films * Strings (1991 film), ''Strings'' (1991 fil ...
s, also known as "words", over an alphabet are defined as a
sequence In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order theory, order matters. Like a Set (mathematics), set, it contains Element (mathematics), members (also called ''elements'', or ''terms''). ...

of the symbols from the set. For example, the alphabet of lowercase letters "a" through "z" can be used to form English words like "iceberg" while the alphabet of both upper and lower case letters can also be used to form proper names like "Wikipedia". A common alphabet is , the binary alphabet, and a "00101111" is an example of a
binary string In computer programming, a string is traditionally a sequence of Character (computing), characters, either as a literal (computer programming), literal constant or as some kind of variable. The latter may allow its elements to be mutated and the ...
. Infinite
sequence In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order theory, order matters. Like a Set (mathematics), set, it contains Element (mathematics), members (also called ''elements'', or ''terms''). ...
of symbols may be considered as well. It is often necessary for practical purposes to restrict the symbols in an alphabet so that they are unambiguous when interpreted. For instance, if the two-member alphabet is , a string written on paper as "000" is ambiguous because it is unclear if it is a sequence of three "0" symbols, a "00" followed by a "0", or a "0" followed by a "00".

# Notation

If ''L'' is a formal language, i.e. a (possibly infinite) set of finite-length strings, the alphabet of ''L'' is the set of all symbols that may occur in any string in ''L''. For example, if ''L'' is the set of all variable identifiers in the programming language C, ''L''’s alphabet is the set . Given an alphabet $\Sigma$, the set of all strings of length $n$ over the alphabet $\Sigma$ is indicated by $\Sigma^n$. The set $\bigcup_ \Sigma^i$ of all finite strings (regardless of their length) is indicated by the
Kleene star In mathematical logic Mathematical logic, also called formal logic, is a subfield of mathematics Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (alg ...
operator as $\Sigma^*$, and is also called the Kleene closure of $\Sigma$. The notation $\Sigma^\omega$ indicates the set of all infinite sequences over the alphabet $\Sigma$, and $\Sigma^\infty$ indicates the set $\Sigma^\ast \cup \Sigma^\omega$ of all finite or infinite sequences. For example, using the binary alphabet , the strings ε, 0, 1, 00, 01, 10, 11, 000, etc. are all in the Kleene closure of the alphabet (where ε represents the
empty string In formal language theory, the empty string, or empty word, is the unique String (computer science), string of length zero. Formal theory Formally, a string is a finite, ordered sequence of character (symbol), characters such as letters, digits o ...
).

# Applications

Alphabets are important in the use of
formal languages In logic, mathematics, computer science, and linguistics, a formal language consists of string (computer science), words whose symbol (formal), letters are taken from an alphabet (computer science), alphabet and are well-formedness, well-formed a ...

,
automata automaton. An automaton (; plural: automata or automatons) is a relatively self-operating machine A machine is a man-made device that uses power to apply forces and control movement to perform an action. Machines can be driven by anim ...
and semiautomata. In most cases, for defining instances of automata, such as
deterministic finite automata In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as Finite-state machine#Acceptors (recognizers), deterministic finite acceptor (DFA), deterministic finite-state machine ( ...
(DFAs), it is required to specify an alphabet from which the input strings for the automaton are built. In these applications, an alphabet is usually required to be a
finite set In mathematics Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algebra), space (geometry), and calculus, change (mathematical analysis, analysis). It ...
, but is not otherwise restricted. When using automata, regular expressions, or formal grammars as part of string-processing algorithms, the alphabet may be assumed to be the character set of the text to be processed by these algorithms, or a subset of allowable characters from the character set.