In

formal language theory
In mathematics, computer science, and linguistics, a formal language consists of string (computer science), words whose symbol (formal), letters are taken from an alphabet (computer science), alphabet and are well-formedness, well-formed accordin ...

, an alphabet is a non-empty set of symbols
A symbol is a mark, sign, or word
In linguistics, a word of a spoken language can be defined as the smallest sequence of phonemes that can be uttered in isolation with semantic, objective or pragmatics, practical meaning (linguistics), meanin ...

/glyph
In typography
File:metal movable type.jpg, 225px, Movable type being assembled on a composing stick using pieces that are stored in the type case shown below it
Typography is the art and technique of typesetting, arranging type to make wr ...

s, typically thought of as representing letters, characters, or digits but among other possibilities the "symbols" could also be a set of phoneme
In phonology and linguistics, a phoneme is a unit of sound that distinguishes one word from another in a particular language.
For example, in most List of dialects of English, dialects of English, with the notable exception of the West Midlan ...

s (sound units). Alphabets in this technical sense of a set are used in a diverse range of fields including logic, mathematics, computer science, and linguistics. An alphabet may have any cardinality
In mathematics
Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algebra), space (geometry), and calculus, change (mathematical analysis, analysis). It ...

("size") and depending on its purpose maybe be finite
Finite is the opposite of Infinity, infinite. It may refer to:
* Finite number (disambiguation)
* Finite set, a set whose cardinality (number of elements) is some natural number
* Finite verb, a verb form that has a subject, usually being inflected ...

(e.g., the alphabet of letters "a" through "z"), countable
In mathematics
Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algebra), space (geometry), and calculus, change (mathematical analysis, analysis). It ...

(e.g., $\backslash $), or even uncountable
In mathematics, an uncountable set (or uncountably infinite set) is an infinite set that contains too many Element (mathematics), elements to be countable set, countable. The uncountability of a set is closely related to its cardinal number: a set ...

(e.g., $\backslash $).
String
String or strings may refer to:
*String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects
Arts, entertainment, and media Films
* Strings (1991 film), ''Strings'' (1991 fil ...

s, also known as "words", over an alphabet are defined as a sequence
In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order theory, order matters. Like a Set (mathematics), set, it contains Element (mathematics), members (also called ''elements'', or ''terms''). ...

of the symbols from the set. For example, the alphabet of lowercase letters "a" through "z" can be used to form English words like "iceberg" while the alphabet of both upper and lower case letters can also be used to form proper names like "Wikipedia". A common alphabet is , the binary alphabet, and a "00101111" is an example of a binary string
In computer programming, a string is traditionally a sequence of Character (computing), characters, either as a literal (computer programming), literal constant or as some kind of variable. The latter may allow its elements to be mutated and the ...

. Infinite sequence
In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order theory, order matters. Like a Set (mathematics), set, it contains Element (mathematics), members (also called ''elements'', or ''terms''). ...

of symbols may be considered as well.
It is often necessary for practical purposes to restrict the symbols in an alphabet so that they are unambiguous when interpreted. For instance, if the two-member alphabet is , a string written on paper as "000" is ambiguous because it is unclear if it is a sequence of three "0" symbols, a "00" followed by a "0", or a "0" followed by a "00".
Notation

If ''L'' is a formal language, i.e. a (possibly infinite) set of finite-length strings, the alphabet of ''L'' is the set of all symbols that may occur in any string in ''L''. For example, if ''L'' is the set of all variable identifiers in the programming language C, ''L''’s alphabet is the set . Given an alphabet $\backslash Sigma$, the set of all strings of length $n$ over the alphabet $\backslash Sigma$ is indicated by $\backslash Sigma^n$. The set $\backslash bigcup\_\; \backslash Sigma^i$ of all finite strings (regardless of their length) is indicated by theKleene star
In mathematical logic
Mathematical logic, also called formal logic, is a subfield of mathematics
Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (alg ...

operator as $\backslash Sigma^*$, and is also called the Kleene closure of $\backslash Sigma$. The notation $\backslash Sigma^\backslash omega$ indicates the set of all infinite sequences over the alphabet $\backslash Sigma$, and $\backslash Sigma^\backslash infty$ indicates the set $\backslash Sigma^\backslash ast\; \backslash cup\; \backslash Sigma^\backslash omega$ of all finite or infinite sequences.
For example, using the binary alphabet , the strings ε, 0, 1, 00, 01, 10, 11, 000, etc. are all in the Kleene closure of the alphabet (where ε represents the empty string
In formal language theory, the empty string, or empty word, is the unique String (computer science), string of length zero.
Formal theory
Formally, a string is a finite, ordered sequence of character (symbol), characters such as letters, digits o ...

).
Applications

Alphabets are important in the use offormal languages
In logic, mathematics, computer science, and linguistics, a formal language consists of string (computer science), words whose symbol (formal), letters are taken from an alphabet (computer science), alphabet and are well-formedness, well-formed a ...

, automata
automaton.
An automaton (; plural: automata or automatons) is a relatively self-operating machine
A machine is a man-made device that uses power to apply forces and control movement to perform an action. Machines can be driven by anim ...

and semiautomata. In most cases, for defining instances of automata, such as deterministic finite automata
In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as Finite-state machine#Acceptors (recognizers), deterministic finite acceptor (DFA), deterministic finite-state machine ( ...

(DFAs), it is required to specify an alphabet from which the input strings for the automaton are built. In these applications, an alphabet is usually required to be a finite set
In mathematics
Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algebra), space (geometry), and calculus, change (mathematical analysis, analysis). It ...

, but is not otherwise restricted.
When using automata, regular expressions, or formal grammars as part of string-processing algorithms, the alphabet may be assumed to be the character set of the text to be processed by these algorithms, or a subset of allowable characters from the character set.
See also

*Combinatorics on wordsReferences

Literature

* John E. Hopcroft and Jeffrey D. Ullman, ''Introduction to Automata Theory, Languages, and Computation'', Addison-Wesley Publishing, Reading Massachusetts, 1979. . {{DEFAULTSORT:Alphabet (formal languages) Formal languages Combinatorics on words