In the theory of formal languages, the Myhill–Nerode theorem provides a necessary and sufficient condition for a language to be regular. The theorem is named for John Myhill and Anil Nerode, who proved it at the University of Chicago in 1958.
Statement
Given a language $L$ and a pair of strings $x$ and $y$, define a distinguishing extension to be a string $z$ such that exactly one of the two strings $xz$ and $yz$ belongs to $L$. Define a relation $\sim_L$ on strings as $x \sim_L y$ if and only if there is no distinguishing extension for $x$ and $y$. It is easy to show that $\sim_L$ is an equivalence relation on strings, and thus it divides the set of all strings into equivalence classes.
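The definition translates directly into a membership test. The following is a minimal sketch, assuming the language is given as a membership predicate and that only extensions up to a chosen length bound are searched, so a result of None means that no short distinguishing extension was found, not that the two strings are proven equivalent. The helper names and the example language (strings over {a, b} ending in ab) are hypothetical and not part of the theorem.

```python
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def is_distinguishing(in_lang, x, y, z):
    """True if exactly one of xz and yz belongs to the language."""
    return in_lang(x + z) != in_lang(y + z)

def find_distinguishing_extension(in_lang, x, y, alphabet, max_len):
    """Return the first distinguishing extension of length <= max_len, or None."""
    for z in all_strings(alphabet, max_len):
        if is_distinguishing(in_lang, x, y, z):
            return z
    return None

# Hypothetical example language: strings over {a, b} that end in "ab".
def ends_in_ab(s):
    return s.endswith("ab")
```

For instance, find_distinguishing_extension(ends_in_ab, "a", "b", "ab", 3) returns "b", because ab is in the example language while bb is not.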
The Myhill–Nerode theorem states that a language $L$ is regular if and only if $\sim_L$ has a finite number of equivalence classes, and moreover, that this number is equal to the number of states in the minimal deterministic finite automaton (DFA) recognizing $L$. In particular, this implies that there is a unique minimal DFA for each regular language, up to renaming of states.
Some authors refer to the $\sim_L$ relation as the Nerode congruence, in honor of Anil Nerode.
Proof
If $L$ is a regular language, then by definition there is a DFA $A$ that recognizes it, with only finitely many states. If there are $n$ states, then partition the set of all finite strings into $n$ subsets, where subset $S_i$ is the set of strings that, when given as input to automaton $A$, cause it to end in state $i$. For every two strings $x$ and $y$ that belong to the same subset, and for every choice of a third string $z$, the automaton $A$ reaches the same state on input $xz$ as it reaches on input $yz$, and therefore must either accept both of the inputs $xz$ and $yz$ or reject both of them. Therefore, no string $z$ can be a distinguishing extension for $x$ and $y$, so they must be related by $\sim_L$. Thus, $S_i$ is a subset of an equivalence class of $\sim_L$. Combining this fact with the fact that every member of one of these equivalence classes belongs to one of the sets $S_i$, this gives a surjective function from the reachable states of $A$ (those with nonempty $S_i$) to equivalence classes, implying that the number of equivalence classes is finite and at most $n$.
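The partition used in this direction can be made concrete. The following is a minimal sketch, assuming a hypothetical three-state DFA over {0, 1} whose state is the value of the input read so far, modulo 3. The names step, run, and subsets are illustrative. Only a finite sample of strings is checked, so the code illustrates the argument and does not replace it.

```python
from itertools import product

# Hypothetical DFA A over {0, 1}: the state is the value of the input read
# so far, modulo 3. State 0 is the start state and the only accepting state.
START = 0
ACCEPTING = {0}

def step(state, symbol):
    """Transition function of A."""
    return (2 * state + int(symbol)) % 3

def run(word, state=START):
    """State reached by A after reading `word`, starting from `state`."""
    for symbol in word:
        state = step(state, symbol)
    return state

def accepts(word):
    return run(word) in ACCEPTING

# All binary strings of length 0..4 (a finite sample of the infinite set).
words = ["".join(p) for n in range(5) for p in product("01", repeat=n)]

# subsets[i] plays the role of S_i: the sampled strings that end in state i.
subsets = {}
for w in words:
    subsets.setdefault(run(w), []).append(w)

# Strings in the same S_i reach the same state after any extension z,
# so no sampled z can be a distinguishing extension for them.
no_distinguishing_extension = all(
    accepts(x + z) == accepts(y + z)
    for group in subsets.values()
    for x in group
    for y in group
    for z in words
)
```

The key step of the proof appears here as the identity run(x + z) == run(z, state=run(x)): what happens after reading x depends only on the state that x reaches.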
In the other direction, suppose that $\sim_L$ has finitely many equivalence classes. In this case, it is possible to design a deterministic finite automaton that has one state for each equivalence class. The start state of the automaton corresponds to the equivalence class containing the empty string, and the transition function from a state $X$ on input symbol $y$ takes the automaton to a new state, the state corresponding to the equivalence class containing string $xy$, where $x$ is an arbitrarily chosen string in the equivalence class corresponding to $X$. The definition of the Myhill–Nerode relation implies that the transition function is well-defined: no matter which representative string $x$ is chosen for state $X$, the same transition function value will result. A state of this automaton is accepting if the corresponding equivalence class contains a string in $L$; in this case, again, the definition of the relation implies that all strings in the same equivalence class must also belong to $L$, for otherwise the empty string would be a distinguishing extension for some pair of strings in the class.
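The construction in this direction can be sketched in code. The version below is an approximation under stated assumptions: the language is given as a membership predicate, and two strings are treated as equivalent when no extension up to a fixed length distinguishes them. That bound is an assumption chosen for the example; in general it is not known in advance. The names build_dfa and equivalent and the example language are hypothetical.

```python
from itertools import product

def extensions(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def equivalent(in_lang, x, y, alphabet, max_len):
    """Bounded stand-in for the relation: no distinguishing extension
    of length <= max_len exists for x and y."""
    return all(in_lang(x + z) == in_lang(y + z)
               for z in extensions(alphabet, max_len))

def build_dfa(in_lang, alphabet, max_len):
    """One state per class, each named by a representative string."""
    reps = [""]        # the class of the empty string is the start state
    delta = {}
    for x in reps:     # the list grows during iteration and acts as a work queue
        for a in alphabet:
            # Find the known class that contains the string xa, if any.
            target = next((r for r in reps
                           if equivalent(in_lang, x + a, r, alphabet, max_len)),
                          None)
            if target is None:      # xa starts a class not seen before
                target = x + a
                reps.append(target)
            delta[(x, a)] = target
    # A state is accepting if its representative belongs to the language.
    accepting = {r for r in reps if in_lang(r)}
    return reps, delta, accepting

def run(delta, word):
    """Follow the transitions from the start state and return the final state."""
    state = ""
    for a in word:
        state = delta[(state, a)]
    return state

# Hypothetical example language: strings over {a, b} that end in "ab".
def ends_in_ab(s):
    return s.endswith("ab")

reps, delta, accepting = build_dfa(ends_in_ab, "ab", 2)
```

With this example language and a bound of 2, the sketch finds three representatives (the empty string, a, and ab), and ab is the only accepting state.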
Thus, the existence of a finite automaton recognizing $L$ implies that the Myhill–Nerode relation has a finite number of equivalence classes, at most equal to the number of states of the automaton, and the existence of a finite number of equivalence classes implies the existence of an automaton with that many states.
Use and consequences
The Myhill–Nerode theorem may be used to show that a language $L$ is regular by proving that the number of equivalence classes of $\sim_L$ is finite. This may be done by an exhaustive case analysis in which, beginning from the empty string, distinguishing extensions are used to find additional equivalence classes until no more can be found. For example, the language consisting of binary representations of numbers divisible by 3 is regular. Given the empty string, $00$ (or $11$), $01$, and $10$ are distinguishing extensions resulting in the three classes (corresponding to numbers that give remainders 0, 1, and 2 when divided by 3), but after this step there are no further distinguishing extensions. The minimal automaton accepting this language would have three states corresponding to these three equivalence classes.
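The divisibility example can be checked by brute force. The following is a minimal sketch, assuming that the empty string is read as the number 0 and that extensions of length at most 3 are enough to separate the classes (both are assumptions of this sketch). Each string is summarized by a signature, the list of membership results for all of its short extensions; strings with equal signatures have no short distinguishing extension.

```python
from itertools import product

def divisible_by_3(bits):
    # Assumption: the empty string is read as the number 0.
    return int(bits or "0", 2) % 3 == 0

def binary_strings(max_len):
    """Yield every binary string of length 0..max_len."""
    for n in range(max_len + 1):
        for p in product("01", repeat=n):
            yield "".join(p)

def signature(x, max_ext_len=3):
    """Membership of xz for every extension z up to the length bound."""
    return tuple(divisible_by_3(x + z) for z in binary_strings(max_ext_len))

# Group all binary strings of length 0..6 by signature.
classes = {}
for x in binary_strings(6):
    classes.setdefault(signature(x), []).append(x)
```

On this sample the grouping produces exactly three classes, 00 and 11 fall into the same class, and every class contains strings with a single remainder modulo 3.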
Another immediate corollary of the theorem is that if for a language $L$ the relation $\sim_L$ has infinitely many equivalence classes, then $L$ is not regular. It is this corollary that is frequently used to prove that a language is not regular.
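As an illustration of how this corollary is typically applied, consider the language of strings $a^n b^n$. This is a standard textbook example and is not taken from the text above. A sketch of the argument:

```latex
\[
L = \{\, a^n b^n : n \ge 0 \,\}, \qquad
a^i b^i \in L, \quad a^j b^i \notin L \quad (i \ne j).
\]
% So z = b^i is a distinguishing extension for x = a^i and y = a^j.
% The strings a^0, a^1, a^2, \dots are therefore pairwise inequivalent,
% \sim_L has infinitely many equivalence classes, and L is not regular.
```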
Generalizations
The Myhill–Nerode theorem can be generalized to tree automata.
See also
* Pumping lemma for regular languages, an alternative method for proving that a language is not regular. Because the pumping lemma gives only a necessary condition for regularity, it cannot always prove that a language is not regular.
* Syntactic monoid