In
linguistics
Linguistics is the scientific study of language. The areas of linguistic analysis are syntax (rules governing the structure of sentences), semantics (meaning), Morphology (linguistics), morphology (structure of words), phonetics (speech sounds ...
and
theoretical computer science
Theoretical computer science is a subfield of computer science and mathematics that focuses on the Abstraction, abstract and mathematical foundations of computation.
It is difficult to circumscribe the theoretical areas precisely. The Associati ...
, literal movement grammars (LMGs) are a grammar formalism intended to characterize certain
extraposition
Extraposition is a mechanism of syntax that alters word order in such a manner that a relatively "heavy" constituent appears to the right of its canonical position. Extraposing a constituent results in a discontinuity and in this regard, it i ...
phenomena of
natural language
A natural language or ordinary language is a language that occurs naturally in a human community by a process of use, repetition, and change. It can take different forms, typically either a spoken language or a sign language. Natural languages ...
such as
topicalization
Topicalization is a mechanism of syntax that establishes an expression as the sentence or clause topic (linguistics), topic by having it appear at the front of the sentence or clause (as opposed to in a canonical position later in the sentence). T ...
and
cross-serial dependency. LMGs extend the class of
context free grammar
In formal language theory, a context-free grammar (CFG) is a formal grammar whose Production (computer science), production rules
can be applied to a Terminal and nonterminal symbols, nonterminal symbol regardless of its context.
In particular ...
s (CFGs) by adding introducing pattern-matched function-like rewrite semantics, as well as the operations of
variable binding and slash deletion. LMGs were introduced by A.V. Groenink in 1995.
[Groenink, Annius V. 1995. Literal Movement Grammars. In ''Proceedings of the 7th EACL Conference''.]
Description
The basic
rewrite operation of an LMG is very similar to that of a CFG, with the addition of
arguments
An argument is a series of sentences, statements, or propositions some of which are called premises and one is the conclusion. The purpose of an argument is to give reasons for one's conclusion via justification, explanation, and/or persua ...
to the non-terminal symbols. Where a context-free rewrite rule obeys the general
schema
Schema may refer to:
Science and technology
* SCHEMA (bioinformatics), an algorithm used in protein engineering
* Schema (genetic algorithms), a set of programs or bit strings that have some genotypic similarity
* Schema.org, a web markup vocab ...
for some non-terminal
and some
string
String or strings may refer to:
*String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects
Arts, entertainment, and media Films
* ''Strings'' (1991 film), a Canadian anim ...
of terminals and/or non-terminals
, an LMG rewrite rule obeys the general schema
, where X is a non-terminal with
arity
In logic, mathematics, and computer science, arity () is the number of arguments or operands taken by a function, operation or relation. In mathematics, arity may also be called rank, but this word can have many other meanings. In logic and ...
n (called a
predicate in LMG terminology), and
is a string of "items", as defined below. The arguments
are strings of terminal symbols and/or variable symbols defining an argument pattern. In the case where an argument pattern has multiple adjacent variable symbols, the argument pattern will match any and all partitions of the actual value that unify. Thus, if the predicate is
and the actual pattern is
, there are three valid matches:
. In this way, a single rule is actually a family of alternatives.
An "item" in a literal movement grammar is one of
*
, a predicate of arity n,
*
, a variable binding x to the string produced by
, or
*
, a slash deletion of
by the string of terminals and/or variables
.
In a rule like
, the variable y is bound to whatever terminal string the g predicate produces, and in
and
, all occurrences of y are replaced by that string, and
and
are produced as if terminal string had always been there.
An item
, where x is something that produces a terminal string (either a terminal string itself or some predicate), and y is a string of terminals and/or variables, is rewritten as the empty string (
) if and only if
, and otherwise cannot be rewritten at all.
Example
LMGs can characterize the non-CF language
as follows:
:
:
:
:
:
The derivation for ', using parentheses also for grouping, is therefore
:
Computational power
Languages generated by LMGs contain the context-free languages as a proper
subset
In mathematics, a Set (mathematics), set ''A'' is a subset of a set ''B'' if all Element (mathematics), elements of ''A'' are also elements of ''B''; ''B'' is then a superset of ''A''. It is possible for ''A'' and ''B'' to be equal; if they a ...
, as every CFG is an LMG where all predicates have arity 0 and no
production rule contains variable bindings or slash deletions.
References
{{Formal languages and grammars
Formal languages
Grammar frameworks