The factored language model (FLM) is an extension of a conventional language model, introduced by Jeff Bilmes and Katrin Kirchhoff in 2003. In an FLM, each word is viewed as a vector of ''k'' factors: w_i = \{f_i^1, \ldots, f_i^k\}.
An FLM provides the probabilistic model P(f \mid f_1, \ldots, f_N), where the prediction of a factor f is based on N parents \{f_1, \ldots, f_N\}. For example, if w represents a word token and t represents a part-of-speech (POS) tag for English, the expression P(w_i \mid w_{i-2}, w_{i-1}, t_{i-1}) gives a model for predicting the current word token based on a traditional ''n''-gram model as well as the part-of-speech tag of the previous word.
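As a minimal illustration of such a model (a sketch, not the Bilmes–Kirchhoff implementation; the function names and corpus are hypothetical), the conditional probability P(w_i | w_{i-2}, w_{i-1}, t_{i-1}) can be estimated by maximum likelihood from counts over (word, POS-tag) sequences:

```python
from collections import Counter

def train_flm(tagged_sentences):
    """Collect counts for estimating P(w_i | w_{i-2}, w_{i-1}, t_{i-1}).

    tagged_sentences: iterable of sentences, each a list of (word, tag) pairs.
    Returns (context_counts, joint_counts), keyed on the parent factors.
    """
    joint = Counter()    # counts of (w_{i-2}, w_{i-1}, t_{i-1}, w_i)
    context = Counter()  # counts of (w_{i-2}, w_{i-1}, t_{i-1})
    for sent in tagged_sentences:
        for i in range(2, len(sent)):
            w2, w1, t1, w = sent[i-2][0], sent[i-1][0], sent[i-1][1], sent[i][0]
            joint[(w2, w1, t1, w)] += 1
            context[(w2, w1, t1)] += 1
    return context, joint

def prob(context, joint, w2, w1, t1, w):
    """Maximum-likelihood estimate of P(w | w2, w1, t1); 0.0 if the context is unseen."""
    c = context.get((w2, w1, t1), 0)
    return joint.get((w2, w1, t1, w), 0) / c if c else 0.0

# Toy corpus: two tagged sentences sharing the context ("the", "dog", "NN").
corpus = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
          [("the", "DT"), ("dog", "NN"), ("sleeps", "VBZ")]]
ctx, jnt = train_flm(corpus)
print(prob(ctx, jnt, "the", "dog", "NN", "barks"))  # 0.5
```

In practice the raw maximum-likelihood estimate is never used on its own; unseen parent combinations make smoothing essential, as the next paragraph notes.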
A major advantage of factored language models is that they allow users to specify linguistic knowledge, such as the relationship between word tokens and parts of speech in English, or morphological information (stems, roots, etc.) in Arabic.
As with ''n''-gram models, smoothing techniques are necessary for parameter estimation. In particular, generalized back-off is used in training an FLM.
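The core idea of back-off over factors can be sketched as follows (a simplified variant, assuming a single fixed parent-dropping order; actual generalized back-off explores multiple orders through a back-off graph and applies discounting):

```python
from collections import Counter

def train(events):
    """events: iterable of (parents_tuple, word) pairs. Counts are kept for
    every prefix of the parent tuple, so the model can back off one level
    at a time by dropping the last parent."""
    counts = Counter()
    for parents, w in events:
        for k in range(len(parents) + 1):
            counts[(parents[:k], w)] += 1
    return counts

def backoff_prob(counts, parents, w, threshold=1):
    """Estimate P(w | parents). If the full parent context occurs fewer than
    `threshold` times, drop the last parent and recurse; the empty context
    gives a unigram-style estimate."""
    total = sum(c for (p, _), c in counts.items() if p == parents)
    if total >= threshold:
        return counts.get((parents, w), 0) / total
    if not parents:
        return 0.0  # word never seen at all
    return backoff_prob(counts, parents[:-1], w, threshold)

events = [(("dog", "NN"), "barks"), (("dog", "NN"), "sleeps"),
          (("cat", "NN"), "purrs")]
counts = train(events)
print(backoff_prob(counts, ("dog", "NN"), "barks"))  # full context seen: 0.5
print(backoff_prob(counts, ("fox", "NN"), "barks"))  # unseen context: backs off to lower-order counts
```

The point of *generalized* back-off in an FLM is that, with several heterogeneous parents (word history, tags, morphological factors), there are many reasonable orders in which to drop parents, rather than the single truncate-the-oldest-word order of standard ''n''-gram back-off.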