HOME

TheInfoList



OR:

In
linguistics Linguistics is the scientific study of language. The areas of linguistic analysis are syntax (rules governing the structure of sentences), semantics (meaning), Morphology (linguistics), morphology (structure of words), phonetics (speech sounds ...
and
theoretical computer science Theoretical computer science is a subfield of computer science and mathematics that focuses on the Abstraction, abstract and mathematical foundations of computation. It is difficult to circumscribe the theoretical areas precisely. The Associati ...
, literal movement grammars (LMGs) are a grammar formalism intended to characterize certain
extraposition Extraposition is a mechanism of syntax that alters word order in such a manner that a relatively "heavy" constituent appears to the right of its canonical position. Extraposing a constituent results in a discontinuity and in this regard, it i ...
phenomena of
natural language A natural language or ordinary language is a language that occurs naturally in a human community by a process of use, repetition, and change. It can take different forms, typically either a spoken language or a sign language. Natural languages ...
such as
topicalization Topicalization is a mechanism of syntax that establishes an expression as the sentence or clause topic (linguistics), topic by having it appear at the front of the sentence or clause (as opposed to in a canonical position later in the sentence). T ...
and cross-serial dependency. LMGs extend the class of
context free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose Production (computer science), production rules can be applied to a Terminal and nonterminal symbols, nonterminal symbol regardless of its context. In particular ...
s (CFGs) by adding introducing pattern-matched function-like rewrite semantics, as well as the operations of variable binding and slash deletion. LMGs were introduced by A.V. Groenink in 1995.Groenink, Annius V. 1995. Literal Movement Grammars. In ''Proceedings of the 7th EACL Conference''.


Description

The basic rewrite operation of an LMG is very similar to that of a CFG, with the addition of
arguments An argument is a series of sentences, statements, or propositions some of which are called premises and one is the conclusion. The purpose of an argument is to give reasons for one's conclusion via justification, explanation, and/or persua ...
to the non-terminal symbols. Where a context-free rewrite rule obeys the general
schema Schema may refer to: Science and technology * SCHEMA (bioinformatics), an algorithm used in protein engineering * Schema (genetic algorithms), a set of programs or bit strings that have some genotypic similarity * Schema.org, a web markup vocab ...
S \to \alpha for some non-terminal S and some
string String or strings may refer to: *String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects Arts, entertainment, and media Films * ''Strings'' (1991 film), a Canadian anim ...
of terminals and/or non-terminals \alpha, an LMG rewrite rule obeys the general schema X(x_1, ..., x_n) \to \alpha, where X is a non-terminal with
arity In logic, mathematics, and computer science, arity () is the number of arguments or operands taken by a function, operation or relation. In mathematics, arity may also be called rank, but this word can have many other meanings. In logic and ...
n (called a predicate in LMG terminology), and \alpha is a string of "items", as defined below. The arguments x_i are strings of terminal symbols and/or variable symbols defining an argument pattern. In the case where an argument pattern has multiple adjacent variable symbols, the argument pattern will match any and all partitions of the actual value that unify. Thus, if the predicate is f(xy) and the actual pattern is f(ab), there are three valid matches: x = \epsilon,\ y = ab;\ x = a,\ y = b;\ x = ab,\ y = \epsilon. In this way, a single rule is actually a family of alternatives. An "item" in a literal movement grammar is one of * f(x_1, \ldots, x_n), a predicate of arity n, * x \textf(x_1, \ldots, x_n), a variable binding x to the string produced by f(x_1, ..., x_n), or * f(x_1, \ldots, x_n)/\alpha, a slash deletion of f(x_1, ..., x_n) by the string of terminals and/or variables \alpha. In a rule like f(x_1, ..., x_m) \to \alpha\ y \text g(z_1, ... z_n)\ \beta, the variable y is bound to whatever terminal string the g predicate produces, and in \alpha and \beta, all occurrences of y are replaced by that string, and \alpha and \beta are produced as if terminal string had always been there. An item x/y, where x is something that produces a terminal string (either a terminal string itself or some predicate), and y is a string of terminals and/or variables, is rewritten as the empty string (\epsilon) if and only if g(y_1, ..., y_n) = z, and otherwise cannot be rewritten at all.


Example

LMGs can characterize the non-CF language \ as follows: :S() \to x\textA()\ B(x) :A() \to a\ A() :A() \to \epsilon :B(xy) \to a/x\ b\ B(y) c :B(\epsilon) \to \epsilon The derivation for ', using parentheses also for grouping, is therefore S() \to x\textA()\ B(x) \to x\text(a\ A())\ B(x) \to x\text(aa\ A())\ B(x) \to x\textaa\ B(x) \to aa\ B(aa) : \to aa\ a/a\ b\ B(a)\ c \to aab\ B(a)\ c \to aab\ a/a\ b\ B()\ cc \to aabb\ B()\ cc\ \to aabbcc


Computational power

Languages generated by LMGs contain the context-free languages as a proper
subset In mathematics, a Set (mathematics), set ''A'' is a subset of a set ''B'' if all Element (mathematics), elements of ''A'' are also elements of ''B''; ''B'' is then a superset of ''A''. It is possible for ''A'' and ''B'' to be equal; if they a ...
, as every CFG is an LMG where all predicates have arity 0 and no production rule contains variable bindings or slash deletions.


References

{{Formal languages and grammars Formal languages Grammar frameworks