Text simplification is an operation used in natural language processing to change, enhance, classify, or otherwise process an existing body of human-readable text so that its grammar and structure are greatly simplified while the underlying meaning and information remain the same. Text simplification is an important area of research because of communication needs in an increasingly complex and interconnected world that is more and more dominated by science, technology, and new media. Natural human languages pose substantial problems because they ordinarily contain large vocabularies and complex constructions that machines, no matter how fast and well programmed, cannot easily process. To reduce this linguistic diversity, researchers can use methods of semantic compression to limit and simplify the set of words used in given texts.
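The idea of limiting the set of words in a text can be illustrated with a minimal sketch. It assumes that compression works by replacing a word outside an allowed vocabulary with a more general term; the `hypernym` mapping, the `allowed` vocabulary, and the function name `compress` are all hypothetical and chosen only for illustration, not taken from any particular system.

```python
def compress(tokens, hypernym, allowed):
    """Replace each token outside `allowed` with a more general term.

    `hypernym` maps a word to a more general word (a toy, hand-made mapping).
    Tokens with no known generalization are left unchanged.
    """
    result = []
    for token in tokens:
        current = token
        seen = set()  # guards against cycles in the toy mapping
        while current not in allowed and current in hypernym and current not in seen:
            seen.add(current)
            current = hypernym[current]
        result.append(current)
    return result


# Hypothetical data: two specific dog breeds collapse into the single word "dog".
hypernym = {"spaniel": "dog", "poodle": "dog", "oak": "tree"}
allowed = {"the", "dog", "chased", "under", "tree"}

tokens = "the spaniel chased the poodle under the oak".split()
compressed = compress(tokens, hypernym, allowed)
print(" ".join(compressed))
print(len(set(tokens)), "distinct words before,", len(set(compressed)), "after")
```

In this toy case the number of distinct words drops because several specific terms map to the same general one; a real system would need a lexical resource and a principled way to choose the allowed vocabulary.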
Example
Text simplification is illustrated with an example used by Siddharthan (2006). The first sentence contains two relative clauses and one conjoined verb phrase. A text simplification system aims to change the first sentence into a group of simpler sentences, as seen just below the first sentence.
* ''Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents, which precedes the full purchasing agents report that is due out today and gives an indication of what the full report might hold.''
* ''Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents. The Chicago report precedes the full purchasing agents report. The Chicago report gives an indication of what the full report might hold. The full report is due out today.''
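The kind of transformation shown above can be sketched, in a very reduced form, as a single pattern-based rule that splits off one trailing non-restrictive relative clause. This is only an illustration and not the method of Siddharthan (2006): the function name `split_relative_clause` is hypothetical, the rule handles only sentences of the form "main clause, which clause.", and the referring expression that starts the new sentence is supplied by the caller rather than generated.

```python
import re

# Matches "<main>, which <clause>." with a single trailing relative clause.
PATTERN = re.compile(r"^(?P<main>.+?), which (?P<clause>.+?)\.$")


def split_relative_clause(sentence, referent):
    """Split one trailing ', which ...' clause into its own sentence.

    `referent` is the noun phrase used as the subject of the new sentence;
    choosing it automatically is a separate problem not addressed here.
    """
    match = PATTERN.match(sentence)
    if match is None:
        return [sentence]  # the rule does not apply; leave the sentence as is
    return [match.group("main") + ".", f"{referent} {match.group('clause')}."]


sentence = (
    "The analyst cited a report by Chicago purchasing agents, "
    "which precedes the full purchasing agents report."
)
for part in split_relative_clause(sentence, "The Chicago report"):
    print(part)
```

A practical system would rely on syntactic analysis rather than a regular expression, and would also have to order the resulting sentences and keep the text cohesive.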
One approach to text simplification is lexical simplification via lexical substitution, a two-step process of first identifying complex words and then replacing them with simpler synonyms. A key challenge here is identifying complex words, which is performed by a machine learning classifier trained on labeled data. The classical method of asking research subjects to label words as either simple or complex has proven problematic; researchers have found that they obtain higher consistency, across more levels of complexity, if they instead ask labelers to sort the words presented to them in order of complexity.
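The two steps, together with the sorting-based labeling, can be sketched as follows. This is a minimal illustration under stated assumptions: the labelers' orderings are averaged into a mean rank per word, a simple threshold on that score stands in for the trained classifier, and the `synonyms` dictionary is a hand-made toy. The names `mean_rank_scores` and `simplify` and all data are hypothetical.

```python
import re
from collections import defaultdict


def mean_rank_scores(rankings):
    """Average each word's position over all labelers' orderings.

    Each ordering lists words from simplest (position 0) to most complex.
    """
    positions = defaultdict(list)
    for ordering in rankings:
        for position, word in enumerate(ordering):
            positions[word].append(position)
    return {word: sum(p) / len(p) for word, p in positions.items()}


def simplify(sentence, scores, synonyms, threshold):
    """Replace words scored at or above `threshold` with a simpler synonym."""

    def replace(match):
        word = match.group(0)
        key = word.lower()
        # Step 1: identify complex words (a threshold stands in for a classifier).
        if scores.get(key, 0.0) < threshold:
            return word
        # Step 2: substitute the simplest synonym that has a known score.
        candidates = [c for c in synonyms.get(key, []) if c in scores]
        if not candidates:
            return word
        best = min(candidates, key=lambda c: scores[c])
        return best if scores[best] < scores[key] else word

    return re.sub(r"[A-Za-z]+", replace, sentence)


# Hypothetical orderings from three labelers, simplest word first.
rankings = [
    ["use", "help", "utilize", "facilitate"],
    ["help", "use", "facilitate", "utilize"],
    ["use", "help", "utilize", "facilitate"],
]
scores = mean_rank_scores(rankings)
synonyms = {"utilize": ["use"], "facilitate": ["help"]}

text = "Researchers utilize tools that facilitate reading."
print(simplify(text, scores, synonyms, threshold=2.0))
```

Sorting yields a graded score rather than a binary label, which is why the threshold can be moved to target different reading levels. The sketch ignores word sense, inflection, and capitalization, all of which a real substitution step would need to handle.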
See also
* Automated paraphrasing
* Controlled natural language
* Language reform
* Lexical simplification
* Lexical substitution
* Semantic compression
* Text normalization
* Simplified English
* Basic English
References
* Wei Xu, Chris Callison-Burch and Courtney Napoles. Problems in Current Text Simplification Research. In Transactions of the Association for Computational Linguistics (TACL), Volume 3, 2015, Pages 283–297.
* Advaith Siddharthan. Syntactic Simplification and Text Cohesion. In Research on Language and Computation, Volume 4, Issue 1, Jun 2006, Pages 77–109, Springer Science, the Netherlands.
* Siddhartha Jonnalagadda, Luis Tari, Joerg Hakenberg, Chitta Baral and Graciela Gonzalez. Towards Effective Sentence Simplification for Automatic Processing of Biomedical Text. In Proc. of the NAACL-HLT 2009, Boulder, USA, June
External links
* Automatic Induction of Rules for Text Simplification (pdf)
* Text Simplification for Information-Seeking Applications