In computer science, learning vector quantization (LVQ) is a prototype-based supervised classification algorithm. LVQ is the supervised counterpart of vector quantization systems.
Overview
LVQ can be understood as a special case of an artificial neural network; more precisely, it applies a winner-take-all Hebbian learning-based approach. It is a precursor to self-organizing maps (SOM) and is related to neural gas and to the k-nearest neighbor algorithm (k-NN). LVQ was invented by Teuvo Kohonen.
An LVQ system is represented by prototypes which are defined in the feature space of observed data. In winner-take-all training algorithms, one determines, for each data point, the prototype that is closest to the input according to a given distance measure. The position of this so-called winner prototype is then adapted: the winner is moved closer if it correctly classifies the data point, or moved away if it classifies the data point incorrectly.
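The winner search described above can be illustrated with a minimal sketch. It assumes Euclidean distance and prototypes stored as plain lists of floats; the function names `euclidean`, `winner_index` and `classify` are hypothetical and chosen only for this example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def winner_index(x, prototypes, distance=euclidean):
    """Return the index of the prototype closest to x (the winner)."""
    return min(range(len(prototypes)), key=lambda i: distance(x, prototypes[i]))

def classify(x, prototypes, labels, distance=euclidean):
    """Assign x the label of its winner prototype."""
    return labels[winner_index(x, prototypes, distance)]

# Two prototypes in a two-dimensional feature space (illustrative values).
prototypes = [[0.0, 0.0], [1.0, 1.0]]
labels = ["a", "b"]
prediction = classify([0.9, 0.8], prototypes, labels)
```

Passing the distance as a parameter keeps the winner search independent of the particular metric, which matters because the choice of distance measure is itself a design decision in LVQ.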
An advantage of LVQ is that it creates prototypes that are easy to interpret for experts in the respective application domain.
LVQ systems can be applied to multi-class classification problems in a natural way.
A key issue in LVQ is the choice of an appropriate measure of distance or similarity for training and classification. Techniques have been developed that adapt a parameterized distance measure in the course of training the system; see e.g. (Schneider, Biehl, and Hammer, 2009) and references therein.
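To make the idea of a parameterized distance concrete, the following sketch shows one simple form: a squared Euclidean distance in which each feature has its own non-negative relevance weight. This is an illustrative assumption and not necessarily the scheme used in the cited work; the name `weighted_sq_distance` and the example values are hypothetical, and the adaptation of the weights during training is not shown.

```python
def weighted_sq_distance(x, w, relevances):
    """Squared Euclidean distance with one relevance weight per feature.

    relevances is assumed to hold non-negative numbers, one per dimension.
    """
    return sum(r * (xi - wi) ** 2 for r, xi, wi in zip(relevances, x, w))

x = [1.0, 5.0]
w = [0.0, 0.0]

# Equal weights: both features contribute to the distance.
uniform = weighted_sq_distance(x, w, [0.5, 0.5])

# All weight on the first feature: the second feature is ignored.
focused = weighted_sq_distance(x, w, [1.0, 0.0])
```

With such a measure, changing the relevance weights changes which prototype counts as closest, so tuning them alongside the prototypes can suppress features that are uninformative for the classification task.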
LVQ can also be useful for classifying text documents.
Algorithm
Below follows an informal description.
The algorithm consists of three basic steps. The algorithm's input is:
* the number of neurons $M$ the system will have (in the simplest case it is equal to the number of classes)
* the weight $\vec{w}_i$ of each neuron, for $i = 0, 1, \dots, M - 1$
* the label $c_i$ corresponding to each neuron $\vec{w}_i$
* the learning rate $\eta$, i.e. how fast the neurons are learning
* an input list $L$ containing all the vectors whose labels are already known (the training set).
The algorithm's flow is:
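# For the next input $\vec{x}$ (with label $y$) in the training set $L$, find the closest neuron $\vec{w}_m$, i.e. $d(\vec{x}, \vec{w}_m) = \min_i d(\vec{x}, \vec{w}_i)$, where $d$ is the metric used (Euclidean, etc.).
# Update $\vec{w}_m$: move it closer to the input $\vec{x}$ if $\vec{x}$ and $\vec{w}_m$ carry the same label, and move the two further apart if they do not. With learning rate $\eta$ and $c_m$ the label of $\vec{w}_m$: $\vec{w}_m \gets \vec{w}_m + \eta \, (\vec{x} - \vec{w}_m)$ if $c_m = y$ (closer together), or $\vec{w}_m \gets \vec{w}_m - \eta \, (\vec{x} - \vec{w}_m)$ if $c_m \neq y$ (further apart).
# While there are vectors left in $L$, go to step 1; otherwise terminate.
Note: $\vec{w}_i$ and $\vec{x}$ are vectors in feature space.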
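The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes Euclidean distance, a constant learning rate and a single pass over the training set, and the names `lvq1_train`, `proto_labels` and `training_set` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def lvq1_train(prototypes, proto_labels, training_set, eta=0.1, distance=euclidean):
    """Run one pass of the update rule over training_set.

    prototypes:   list of weight vectors, one per neuron
    proto_labels: label of each neuron
    training_set: list of (x, y) pairs with known labels
    eta:          learning rate
    Returns updated copies; the input prototypes are left unchanged.
    """
    W = [list(w) for w in prototypes]
    for x, y in training_set:
        # Step 1: find the closest neuron (the winner).
        m = min(range(len(W)), key=lambda i: distance(x, W[i]))
        # Step 2: attract the winner if the labels match, repel it otherwise.
        sign = 1.0 if proto_labels[m] == y else -1.0
        W[m] = [wj + sign * eta * (xj - wj) for wj, xj in zip(W[m], x)]
        # Step 3: continue with the next vector until the list is exhausted.
    return W

prototypes = [[0.0, 0.0], [1.0, 1.0]]
proto_labels = [0, 1]
training_set = [([0.2, 0.0], 0), ([0.4, 0.4], 1)]
trained = lvq1_train(prototypes, proto_labels, training_set, eta=0.5)
```

In this small example the first sample attracts the first prototype, while the second sample, which carries the other label but still lies closest to the first prototype, pushes it away; the second prototype is never the winner and stays where it is.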
References
Further reading
Self-Organizing Maps and Learning Vector Quantization for Feature Sequences, Somervuo and Kohonen, 2004.
External links
lvq_pak official release (1996) by Kohonen and his team