Lexical substitution



Lexical substitution is the task of identifying a substitute for a word in the context of a clause. For instance, given the text "After the ''match'', replace any remaining fluid deficit to prevent chronic dehydration throughout the tournament", the substitute ''game'' might be given. Lexical substitution is closely related to word sense disambiguation (WSD), in that both aim to determine the meaning of a word. However, while WSD consists of automatically assigning the appropriate sense from a fixed sense inventory, lexical substitution does not impose any constraint on which substitute to choose as the best representative for the word in context. By not prescribing the inventory, lexical substitution overcomes the issue of the granularity of sense distinctions and provides a level playing field for systems that automatically acquire word senses (a task referred to as Word Sense Induction).


In order to evaluate automatic systems on lexical substitution, a task was organized at the SemEval-2007 evaluation competition held in Prague in 2007. A task on cross-lingual lexical substitution has also taken place.

Skip-gram model

The skip-gram model maps words into a vector space (a collection of objects that can be added together and multiplied by numbers) in which words with similar meanings lie close to each other in N dimensions. The vectors are learned by a neural network (a computing system loosely modeled on the brain) that operates over the dimensions of the vocabulary. The model has been used in lexical substitution automation and prediction algorithms. One such algorithm, developed by Oren Melamud, Omer Levy, and Ido Dagan, uses the skip-gram model to find a vector for each word and its synonyms. It then calculates the cosine distance between vectors to determine which words will be the best substitutes.
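The ranking step described above can be sketched in a few lines of Python. This is only an illustration of comparing word vectors by cosine similarity, not the algorithm of Melamud, Levy, and Dagan itself. The three-dimensional vectors in `TOY_VECTORS` are made up for the example (real skip-gram vectors are learned from a corpus and usually have hundreds of dimensions), and the names `cosine_similarity` and `rank_substitutes` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings, invented for illustration only.
TOY_VECTORS = {
    "match": [0.9, 0.1, 0.3],
    "game": [0.8, 0.2, 0.4],
    "contest": [0.7, 0.3, 0.2],
    "lighter": [0.1, 0.9, 0.1],
}


def rank_substitutes(target, candidates, vectors):
    """Sort candidate substitutes by similarity to the target, best first."""
    target_vec = vectors[target]
    scored = [(c, cosine_similarity(target_vec, vectors[c])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_substitutes("match", ["game", "contest", "lighter"], TOY_VECTORS)
for word, score in ranking:
    print(f"{word}: {score:.3f}")
```

Sorting by highest cosine similarity is equivalent to sorting by lowest cosine distance, since the distance is one minus the similarity. With these toy vectors, ''game'' ranks first and ''lighter'' last.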


In a sentence like "The dog walked at a quick pace", each word is represented by its own vector in relation to the others. The vector for "The" would be [1,0,0,0,0,0,0], because the 1 marks the position of the word itself in the vocabulary and the 0s mark the positions of the surrounding words.
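The encoding in this example is commonly called a one-hot vector. A minimal sketch, assuming the vocabulary is simply the seven words of the example sentence in order (the helper name `one_hot` is hypothetical):

```python
def one_hot(word, vocabulary):
    """Return a vector with 1 at the word's own position and 0 elsewhere."""
    return [1 if entry == word else 0 for entry in vocabulary]


sentence = "The dog walked at a quick pace"
vocabulary = sentence.split()  # seven distinct words, in sentence order

the_vector = one_hot("The", vocabulary)
dog_vector = one_hot("dog", vocabulary)
print(the_vector)
print(dog_vector)
```

Each word gets a vector as long as the vocabulary, so "dog" differs from "The" only in where the single 1 appears.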

See also

* Lexical semantics
* Semantic compression
* SemEval
* Word sense


* D. McCarthy, R. Navigli. The English Lexical Substitution Task. ''Language Resources and Evaluation'', 43(2), Springer, 2009, pp. 139–159.
* D. McCarthy, R. Navigli. SemEval-2007 Task 10: English Lexical Substitution Task. ''Proc. of Semeval-2007 Workshop (SEMEVAL)'', in the 45th Annual Meeting of the Association for Computational Linguistics (ACL 2007), Prague, Czech Republic, 23–24 June 2007, pp. 48–53.
* D. McCarthy. Lexical substitution as a task for WSD evaluation. In Proceedings of the ACL workshop on word sense disambiguation: Recent successes and future directions, Philadelphia, USA, 2002, pp. 109–115.
* R. Navigli. Word Sense Disambiguation: A Survey. ''ACM Computing Surveys'', 41(2), 2009, pp. 1–69.

