Linguistic Knowledge Builder

Deep Linguistic Processing with HPSG - INitiative (DELPH-IN) is a collaboration in which computational linguists worldwide develop natural language processing tools for deep linguistic processing of human language. The goal of DELPH-IN is to combine linguistic and statistical processing methods in order to computationally understand the meaning of texts and utterances. The tools developed by DELPH-IN adopt two linguistic formalisms for deep linguistic analysis: head-driven phrase structure grammar (HPSG) and minimal recursion semantics (MRS). All tools under the DELPH-IN collaboration are released for general use under open-source licenses. Since 2005, DELPH-IN has held an annual summit, a loosely structured unconference where people update each other about the work they are doing, seek feedback on current work, and occasionally hammer out agreement on standards and best practice.


DELPH-IN technologies and resources

The DELPH-IN collaboration has been progressively building computational tools for deep linguistic analysis, such as:

* LKB system (Linguistic Knowledge Builder): a grammar engineering environment where linguists can build unification grammars with the head-driven phrase structure grammar (HPSG) formalism
* PET parser (Platform for Experimentation with efficient HPSG processing Techniques): an open-source parser that produces HPSG parse trees with minimal recursion semantics (MRS) outputs
* ACE processor (Answer Constraint Engine): an efficient system for processing DELPH-IN grammars that provides HPSG syntactic parses with MRS outputs. The latest version of ACE can also generate natural language sentences.
* LOGON infrastructure: a collection of software and DELPH-IN grammars that provides transfer-based machine translation. The LOGON approach to machine translation has proven able to provide quality-oriented hybrid (rule-based and stochastic) translations.

Other than deep linguistic processing tools, the DELPH-IN collaboration supplies computational resources for natural language processing, such as computational HPSG grammars and language prototypes, e.g.:

* DELPH-IN grammars: a catalogue of computational HPSG grammars hand-crafted to capture deep linguistic analyses specific to their respective languages
* LinGO Grammar Matrix: an open-source starter kit for rapid prototyping of precision broad-coverage grammars compatible with the LKB. It contains a library of common language phenomena that computational grammarians can inherit for their HPSG grammars.
* CLIMB libraries (Comparative Libraries of Implementations with Matrix Basis): an extended language library built on the Grammar Matrix. The objective of the CLIMB library is to maintain alternative analyses of the same phenomenon across different languages in order to test their impact on long-term grammar development.

Another range of DELPH-IN resources resembles the data used for shallow linguistic processing, such as text corpora and treebanks:

* MRS Test Suite: a short but representative set of sentences designed to capture some minimal recursion semantics phenomena. The test suites are available in Bulgarian, English, French, German, Greek, Japanese, Mandarin, Norwegian, Portuguese, Russian and Spanish.
* WikiWoods: a parsed corpus that provides rich syntacto-semantic annotations for the English Wikipedia
* DeepBank: an ongoing project to annotate the one million words of 1989 Wall Street Journal text (the same set of sentences annotated in the original Penn Treebank project) with the English Resource Grammar, augmented with a robust approximating PCFG for complete coverage
* Cathedral and the Bazaar: a compilation of an early essay on open source by Eric Raymond with translations into multiple languages. It was proposed as a multilingual shared test suite for comparing parses across different grammars.

The open-source culture of the DELPH-IN collaboration provides the natural language processing community with an array of deep linguistic processing tools and resources. However, the usability of DELPH-IN tools has been an issue for users and application developers new to the DELPH-IN ecology. The DELPH-IN developers are aware of these usability issues, and there are ongoing attempts to improve the documentation and tutorials of DELPH-IN technologies (DELPH-IN 2013 Summit: Special Interest Group in Useability).
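The grammars handled by the LKB, PET and ACE are unification grammars: linguistic constraints are expressed as feature structures, and two structures combine only if their information is compatible. The following is a minimal Python sketch of that idea, not the implementation used by any DELPH-IN tool. It assumes untyped feature structures represented as nested dictionaries with string atoms, and it omits the type hierarchy and structure sharing (reentrancy) that real HPSG grammars depend on. The function name `unify` and the feature names are hypothetical, chosen only for illustration.

```python
from typing import Any, Optional


def unify(a: Any, b: Any) -> Optional[Any]:
    """Unify two toy feature structures.

    A feature structure is either an atom (a string) or a dict mapping
    feature names to feature structures. Returns the merged structure,
    or None if the two structures carry conflicting information.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # copy so the inputs are left unmodified
        for feature, b_value in b.items():
            if feature in result:
                merged = unify(result[feature], b_value)
                if merged is None:
                    return None  # a clash anywhere fails the whole unification
                result[feature] = merged
            else:
                # b adds information that a does not constrain
                result[feature] = b_value
        return result
    # Atoms (or an atom against a dict) unify only if they are identical
    return a if a == b else None


# Hypothetical constraints: a verb requiring a third-person singular subject
verb_requirements = {"HEAD": "verb", "SUBJ": {"PER": "3", "NUM": "sg"}}
# A candidate subject that is singular and nominative
singular_subject = {"SUBJ": {"NUM": "sg", "CASE": "nom"}}
# A candidate subject that is plural
plural_subject = {"SUBJ": {"NUM": "pl"}}

compatible = unify(verb_requirements, singular_subject)
incompatible = unify(verb_requirements, plural_subject)
print(compatible)
print(incompatible)
```

In this sketch the singular subject unifies with the verb's requirements and the result accumulates information from both sides, while the plural subject fails because `NUM` cannot be both `sg` and `pl`.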


See also

* Head-driven phrase structure grammar
* Minimal recursion semantics




External links


DELPH-IN website

DELPH-IN wiki forum

Short tutorial to DELPH-IN's ecology of tools and resources