HOME

TheInfoList



OR:

Deep linguistic processing is a
natural language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to proc ...
framework which draws on theoretical and
descriptive linguistics In the study of language, description or descriptive linguistics is the work of objectively analyzing and describing how language is actually used (or how it was used in the past) by a speech community. François & Ponsonnet (2013). All acad ...
. It models language predominantly by way of theoretical syntactic/semantic theory (e.g. CCG,
HPSG Head-driven phrase structure grammar (HPSG) is a highly lexicalized, constraint-based grammar developed by Carl Pollard and Ivan Sag. It is a type of phrase structure grammar, as opposed to a dependency grammar, and it is the immediate successor to ...
, LFG, TAG, the
Prague School The Prague school or Prague linguistic circle is a language and literature society. It started in 1926 as a group of linguists, philologists and literary critics in Prague. Its proponents developed methods of structuralist literary analysis and ...
). Deep linguistic processing approaches differ from "shallower" methods in that they yield more expressive and structural representations which directly capture long-distance dependencies and underlying
predicate Predicate or predication may refer to: * Predicate (grammar), in linguistics * Predication (philosophy) * several closely related uses in mathematics and formal logic: **Predicate (mathematical logic) **Propositional function **Finitary relation, ...
-
argument An argument is a statement or group of statements called premises intended to determine the degree of truth or acceptability of another statement called conclusion. Arguments can be studied from three main perspectives: the logical, the dialect ...
structures.
The knowledge-intensive approach of deep linguistic processing requires considerable computational power, and has in the past sometimes been judged as being intractable. However, research in the early 2000s had made considerable advancement in efficiency of deep processing. Today, efficiency is no longer a major problem for applications using deep linguistic processing.


Contrast to "shallow linguistic processing"

Traditionally, deep linguistic processing has been concerned with computational grammar development (for use in both
parsing Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term ''parsing'' comes from Lati ...
and generation). These grammars were manually developed, maintained and were computationally expensive to run. In recent years, machine learning approaches (also known as shallow linguistic processing) have fundamentally altered the field of
natural language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to proc ...
. The rapid creation of robust and wide-coverage machine learning NLP tools requires substantially lesser amount of manual labor. Thus deep linguistic processing methods have received less attention. However, it is the belief of some computational linguists that in order for computers to understand natural language or
inference Inferences are steps in reasoning, moving from premises to logical consequences; etymologically, the word '' infer'' means to "carry forward". Inference is theoretically traditionally divided into deduction and induction, a distinction that ...
, detailed syntactic and
semantic representation Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference, meaning, or truth. The term can be used to refer to subfields of several distinct disciplines, including philosophy, linguistics and comput ...
is necessary. Moreover, while humans can easily understand a sentence and its meaning, shallow linguistic processing might lack human language 'understanding'. For example:U. Schafer. 2007. �
Integrating Deep and Shallow Natural Language Processing Components – Representations and Hybrid Architectures
Ph.D. thesis, Faculty of Mathematics and Computer Science, Saarland University, Saarbrucken, Germany.

:a) ''Things would be different if Microsoft were located in Georgia.'' In sentence (a), a shallow
information extraction Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. In most of the cases this activity concer ...
system might infer wrongly that Microsoft's headquarters was located in Georgia. While as humans, we understand from the sentence that Microsoft office was never in Georgia.
:b) ''The National Institute for Psychology in Israel was established in May 1971 as the Israel Center for Psychobiology by Prof. Joel.'' In sentence (b), a shallow system could wrongly infer that Israel was established in May 1971. Humans know that it is the National Institute for Psychobiology that was established in 1971.
In summary of the comparison between deep and shallow language processing, deep linguistic processing provides a knowledge-rich analysis of language through manually developed grammars and language resources. Whereas, shallow linguistic processing provides a knowledge-lean analysis of language through statistical/machine learning manipulation of texts and/or annotated linguistic resource.


Sub-communities

"Deep" computational linguists are divided in different sub-communities based on the grammatical formalism they adopted for deep linguistic processing. The major sub-communities includes the: *DEep Linguistic Processing with HPSG - INitiative (
DELPH-IN Deep Linguistic Processing with HPSG - INitiative (DELPH-IN) is a collaboration where computational linguists worldwide develop natural language processing tools for deep linguistic processing of human language. The goal of DELPH-IN is to combine ...
) collaboration working with the
HPSG Head-driven phrase structure grammar (HPSG) is a highly lexicalized, constraint-based grammar developed by Carl Pollard and Ivan Sag. It is a type of phrase structure grammar, as opposed to a dependency grammar, and it is the immediate successor to ...
formalism. Th
HPSG Conference
is the central conference to share knowledge/advancement of
HPSG Head-driven phrase structure grammar (HPSG) is a highly lexicalized, constraint-based grammar developed by Carl Pollard and Ivan Sag. It is a type of phrase structure grammar, as opposed to a dependency grammar, and it is the immediate successor to ...
based deep processing.
ParGram/ParSem
is international collaboration on LFG-based grammar and semantics development. Th
LFG Conference
is the central conference to share knowledge/advancement of LFG based deep processing. *XTAG Research group working with the TAG formalism. Th
TAG+ conference
is the central conference to share knowledge/advancement of TAG based deep processing. The shortlist above is not exhaustively representative of all the communities working on deep linguistic processing.


See also

*
Combinatory categorial grammar Combinatory categorial grammar (CCG) is an efficiently parsable, yet linguistically expressive grammar formalism. It has a transparent interface between surface syntax and underlying semantic representation, including predicate–argument structure ...
*
Head-driven phrase structure grammar Head-driven phrase structure grammar (HPSG) is a highly lexicalized, constraint-based grammar developed by Carl Pollard and Ivan Sag. It is a type of phrase structure grammar, as opposed to a dependency grammar, and it is the immediate successor ...
* Lexical functional grammar *
Natural language processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to proc ...
*
Tree-adjoining grammar Tree-adjoining grammar (TAG) is a grammar formalism defined by Aravind Joshi. Tree-adjoining grammars are somewhat similar to context-free grammars, but the elementary unit of rewriting is the tree rather than the symbol. Whereas context-free gr ...


References

{{Natural Language Processing Natural language processing Generative linguistics Grammar frameworks