Indexed languages are a class of
formal language
In logic, mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules.
The alphabet of a formal language consists of s ...
s discovered by
Alfred Aho
Alfred Vaino Aho (born August 9, 1941) is a Canadian computer scientist best known for his work on programming languages, compilers, and related algorithms, and his textbooks on the art and science of computer programming.
Aho was elected into ...
;
they are described by
indexed grammar Indexed grammars are a generalization of context-free grammars in that nonterminals are equipped with lists of ''flags'', or ''index symbols''.
The language produced by an indexed grammar is called an indexed language.
Definition
Modern definition ...
s and can be recognized by
nested stack automata
In automata theory, a nested stack automaton is a finite automaton that can make use of a stack containing data which can be additional stacks.
Like a stack automaton, a nested stack automaton may step up or down in the stack, and read the cur ...
.
Indexed languages are a
proper subset
In mathematics, set ''A'' is a subset of a set ''B'' if all elements of ''A'' are also elements of ''B''; ''B'' is then a superset of ''A''. It is possible for ''A'' and ''B'' to be equal; if they are unequal, then ''A'' is a proper subset o ...
of
context-sensitive language
In formal language theory, a context-sensitive language is a language that can be defined by a context-sensitive grammar (and equivalently by a noncontracting grammar). Context-sensitive is one of the four types of grammars in the Chomsky hierarc ...
s.
They qualify as an
abstract family of languages In computer science, in particular in the field of formal language theory,
an abstract family of languages is an abstract mathematical notion generalizing characteristics common to the regular languages, the context-free languages and the recursi ...
(furthermore a full AFL) and hence satisfy many closure properties. However, they are not closed under intersection or complement.
The class of indexed languages has generalization of
context-free languages, since
indexed grammar Indexed grammars are a generalization of context-free grammars in that nonterminals are equipped with lists of ''flags'', or ''index symbols''.
The language produced by an indexed grammar is called an indexed language.
Definition
Modern definition ...
s can describe many of the nonlocal constraints occurring in natural languages.
Gerald Gazdar
Gerald James Michael Gazdar, FBA (born 24 February 1950) is a British linguist and computer scientist.
Education
He was educated at Heath Mount School, Bradfield College, the University of East Anglia (BA, 1970) and the University of Reading (M ...
(1988)
and Vijay-Shanker (1987) introduced a
mildly context-sensitive language In computational linguistics, the term mildly context-sensitive grammar formalisms refers to several grammar formalisms that have been developed in an effort to provide adequate descriptions of the syntactic structure of natural language.
Every ...
class now known as linear indexed grammars (LIG).
Linear indexed grammars have additional restrictions relative to IG. LIGs are
weakly equivalent (generate the same language class) as
tree adjoining grammars
Tree-adjoining grammar (TAG) is a grammar formalism defined by Aravind Joshi. Tree-adjoining grammars are somewhat similar to context-free grammars, but the elementary unit of rewriting is the tree rather than the symbol. Whereas context-free gram ...
.
Examples
The following languages are indexed, but are not
context-free:
:
:
These two languages are also indexed, but are not even
mildly context sensitive under Gazdar's characterization:
:
:
On the other hand, the following language is not indexed:
:
Properties
Hopcroft and
Ullman Ullman is a surname. Notable people with the surname include:
*Al Ullman (1914–1986), American politician
*Berthold Ullman (1882–1965), American classical scholar
*Edward Ullman (1912–1976), American geographer
*Ellen Ullman, American author
...
tend to consider indexed languages as a "natural" class, since they are generated by several formalisms, such as:
*
Aho's
indexed grammar Indexed grammars are a generalization of context-free grammars in that nonterminals are equipped with lists of ''flags'', or ''index symbols''.
The language produced by an indexed grammar is called an indexed language.
Definition
Modern definition ...
s
*
Aho's one-way
nested stack automata
In automata theory, a nested stack automaton is a finite automaton that can make use of a stack containing data which can be additional stacks.
Like a stack automaton, a nested stack automaton may step up or down in the stack, and read the cur ...
*
Fischer
Fischer is a German occupational surname, meaning fisherman. The name Fischer is the fourth most common German surname. The English version is Fisher.
People with the surname A
* Abraham Fischer (1850–1913) South African public official
* ...
's macro grammars
*
Greibach's automata with stacks of stacks
*
Maibaum Maibaum is a German surname meaning "maypole". Notable people with the surname include:
*Richard Maibaum (1909–1991), American film producer, playwright, and screenwriter
*Tom Maibaum
Thomas Stephen Edward Maibaum Fellow of the Royal Society of ...
's algebraic characterization
Hayashi
generalized the
pumping lemma to indexed grammars.
Conversely, Gilman
gives a "shrinking lemma" for indexed languages.
See also
*
Chomsky hierarchy
In formal language theory, computer science and linguistics, the Chomsky hierarchy (also referred to as the Chomsky–Schützenberger hierarchy) is a containment hierarchy of classes of formal grammars.
This hierarchy of grammars was described ...
*
Indexed grammar Indexed grammars are a generalization of context-free grammars in that nonterminals are equipped with lists of ''flags'', or ''index symbols''.
The language produced by an indexed grammar is called an indexed language.
Definition
Modern definition ...
*
Mildly context-sensitive language In computational linguistics, the term mildly context-sensitive grammar formalisms refers to several grammar formalisms that have been developed in an effort to provide adequate descriptions of the syntactic structure of natural language.
Every ...
References
External links
"NLP in Prolog" chapter on indexed grammars and languages
{{DEFAULTSORT:Indexed Language
Formal languages