Trigram Tagger
   HOME

TheInfoList



OR:

In
computational linguistics Computational linguistics is an Interdisciplinarity, interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, comput ...
, a trigram tagger is a statistical method for automatically identifying words as being nouns, verbs, adjectives, adverbs, etc. based on second order
Markov model In probability theory, a Markov model is a stochastic model used to model pseudo-randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, it assumes the Mark ...
s that consider triples of consecutive words. It is trained on a
text corpus In linguistics, a corpus (plural ''corpora'') or text corpus is a language resource consisting of a large and structured set of texts (nowadays usually electronically stored and processed). In corpus linguistics, they are used to do statistical ...
as a method to predict the next word, taking the product of the probabilities of
unigram In the fields of computational linguistics and probability, an ''n''-gram (sometimes also called Q-gram) is a contiguous sequence of ''n'' items from a given sample of text or speech. The items can be phonemes, syllables, letters, words or ...
,
bigram A bigram or digram is a sequence of two adjacent elements from a string of tokens, which are typically letters, syllables, or words. A bigram is an ''n''-gram for ''n''=2. The frequency distribution of every bigram in a string is commonly used fo ...
and
trigram Trigrams are a special case of the ''n''-gram, where ''n'' is 3. They are often used in natural language processing for performing statistical analysis of texts and in cryptography for control and use of ciphers and codes. Frequency Context ...
. In speech recognition, algorithms utilizing trigram-tagger score better than those algorithms utilizing IIMM tagger but less well than Net tagger. The description of the trigram tagger is provided by Brants (2000).


References

* Kempe Andre (1993). "A stochastic Tagger and an Analysis of Tagging Errors". Internal paper. Institute for Computational Linguistics, Universität Stuttgart. * Brants, T. (2000
TnT - A Statistical Part-of-Speech Tagger
Proc 6th Applied Natural Language Processing Conference, ANLP-200


External links


TnT -- Statistical Part-of-Speech Tagging
by Thorsten Brants Statistical natural language processing {{compu-ling-stub