Cross-serial Dependencies

In linguistics, cross-serial dependencies (also called crossing dependencies by some authors) occur when the lines representing the dependency relations between two series of words cross over each other. They are of particular interest to linguists who wish to determine the syntactic structure of natural language; languages containing an arbitrary number of them are non-context-free. By this fact, Dutch and Swiss German have been proven to be non-context-free.


Example

As Swiss German allows verbs and their arguments to be ordered cross-serially, we have the following example, taken from Shieber:

''... das mer em Hans es huus hälfed aastriiche.''

That is, "... that we help Hans paint the house." Notice that the sequential noun phrases ''em Hans'' (''Hans'') and ''es huus'' (''the house''), and the sequential verbs ''hälfed'' (''help'') and ''aastriiche'' (''paint''), form two separate series of constituents. Notice also that the dative verb ''hälfed'' and the accusative verb ''aastriiche'' take the dative ''em Hans'' and the accusative ''es huus'' as their arguments, respectively, so the dependency lines between the verbs and their objects cross.


Non-context-freeness

Let L_{SG} be the set of all Swiss German sentences. We will prove mathematically that L_{SG} is not context-free. In Swiss German sentences, the number of verbs of a grammatical case (dative or accusative) must match the number of objects of that case. Additionally, a sentence containing an arbitrary number of such objects is admissible (in principle). Hence, we can define the following formal language
, a subset of L_{SG}:

L = \{\text{Jan säit das mer }(\text{d'chind})^m\,(\text{em Hans})^n\,\text{es huus händ wele }(\text{laa})^m\,(\text{hälfe})^n\,\text{aastriiche} \mid m, n \ge 0\}

Thus, we have L = L_{SG} \cap L_r, where L_r is the
regular language defined by

L_r = \text{Jan säit das mer }(\text{d'chind})^+\,(\text{em Hans})^+\,\text{es huus händ wele }(\text{laa})^+\,(\text{hälfe})^+\,\text{aastriiche}

where the superscript plus means "one or more copies". Since the set of context-free languages is closed under intersection with regular languages, we need only prove that L is not context-free (pp. 130–135). After a word substitution, L is of the form \{x\,a^m b^n\,y\,c^m d^n\,z \mid m, n \ge 0\}. Since L can be mapped to L' by the map x, y, z \mapsto \epsilon;\ a \mapsto a;\ b \mapsto b;\ c \mapsto c;\ d \mapsto d, and since the context-free languages are closed under mappings from terminal symbols to terminal strings (that is, homomorphisms) (pp. 130–135), we need only prove that L' is not context-free. L' = \{a^m b^n c^m d^n \mid m, n \ge 0\} is a standard example of a non-context-free language (p. 128). This can be shown by Ogden's lemma.
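The two reductions above can be sketched in code. This is an illustrative sketch only: the exact phrase-to-letter assignment is an assumption chosen to match the definition of L, not something fixed by the source.

```python
# Hypothetical sketch of the proof's two reductions: a word substitution
# taking Swiss-German phrases to single letters (yielding strings of the
# form x a^m b^n y c^m d^n z), then a homomorphism erasing x, y, z.
SUBST = {
    "Jan säit das mer": "x", "d'chind": "a", "em Hans": "b",
    "es huus händ wele": "y", "laa": "c", "hälfe": "d",
    "aastriiche": "z",
}

def substitute(phrases):
    # Replace each phrase by its single-letter stand-in.
    return "".join(SUBST[p] for p in phrases)

def erase_xyz(w):
    # Homomorphism: x, y, z -> empty string; a, b, c, d unchanged.
    return "".join(ch for ch in w if ch in "abcd")

s = substitute(["Jan säit das mer", "d'chind", "em Hans",
                "es huus händ wele", "laa", "hälfe", "aastriiche"])
print(s)            # -> "xabycdz"
print(erase_xyz(s)) # -> "abcd", an element of L'
```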
Suppose L' is generated by a context-free grammar. Let p be the constant given by Ogden's lemma, consider the word a^p b^p c^p d^p in the language, and mark the letters b^p c^p. Then the three conditions implied by Ogden's lemma cannot all be satisfied: the pumped substrings must contain a marked letter, so pumping changes the number of ''b''s or ''c''s, but pumping the matching ''d''s or ''a''s in step would force the pumped region to span more than p marked letters. Pumping therefore produces a word outside L', a contradiction.
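As a quick illustrative check (a sketch, not part of the proof), membership in L' is easy to decide, and pumping only the marked ''b''s, without touching the ''d''s, immediately leaves the language:

```python
import re

def in_Lprime(w):
    # L' = { a^m b^n c^m d^n : m, n >= 0 }: letters in block order,
    # with the a-count matching the c-count and b matching d.
    match = re.fullmatch(r"(a*)(b*)(c*)(d*)", w)
    if not match:
        return False
    a, b, c, d = (len(g) for g in match.groups())
    return a == c and b == d

print(in_Lprime("aabccd"))   # m=2, n=1 -> True
print(in_Lprime("aabbccd"))  # extra b, unmatched by a d -> False
```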
All known spoken languages which contain cross-serial dependencies can be similarly proved to be non-context-free. This led to the abandonment of generalized phrase structure grammar (GPSG), which generates only context-free languages, once cross-serial dependencies were identified in natural languages in the 1980s.


Treatment

Research in mildly context-sensitive languages has attempted to identify a narrower and more computationally tractable subclass of context-sensitive languages that can capture the context sensitivity found in natural languages. For example, cross-serial dependencies can be expressed in linear context-free rewriting systems (LCFRS); one can write an LCFRS grammar for L', for example.http://user.phil-fak.uni-duesseldorf.de/~kallmeyer/GrammarFormalisms/4lcfrs-intro.pdf
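As an illustration (a sketch, not taken from the cited notes), the cross-serial pattern L' = \{a^m b^n c^m d^n\} from the proof above can be generated by a small LCFRS whose nonterminals A and B each derive a ''pair'' of strings; the start rule interleaves the pairs, which is exactly what makes the dependencies cross:

```python
# Sketch of an LCFRS for L' = { a^m b^n c^m d^n } using fan-out-2
# nonterminals. Each function returns the pair of string components
# that the nonterminal derives.

def A(m):
    # Rule A(a x1, c x2) -> A(x1, x2), base case A(eps, eps):
    # derives the pair (a^m, c^m).
    return ("a" * m, "c" * m)

def B(n):
    # Rule B(b y1, d y2) -> B(y1, y2), base case B(eps, eps):
    # derives the pair (b^n, d^n).
    return ("b" * n, "d" * n)

def S(m, n):
    # Start rule S(x1 y1 x2 y2) -> A(x1, x2) B(y1, y2):
    # interleaving the components yields a^m b^n c^m d^n.
    x1, x2 = A(m)
    y1, y2 = B(n)
    return x1 + y1 + x2 + y2

print(S(2, 3))  # -> "aabbbccddd"
```

Because each component of A must be concatenated around a component of B, the a-c dependency and the b-d dependency cross, which a context-free grammar's single contiguous yields cannot express.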

