An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning). The encoding is validated and refined by attempting to regenerate the input from the encoding. The autoencoder learns a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore insignificant data ("noise").
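As a minimal illustration of such an efficient coding (a hypothetical linear example, not a trained network): data in three dimensions that actually lies on a two-dimensional plane can be encoded in two dimensions and regenerated exactly. Here the encoder/decoder pair is constructed analytically via the pseudoinverse; a real autoencoder would have to learn such a pair from data.

```python
import numpy as np

# Hypothetical illustration of dimensionality reduction by an autoencoder:
# 3-D data lying on a 2-D plane admits a lossless 2-D code.  The decoder is
# a fixed matrix A and the encoder its pseudoinverse; a trained autoencoder
# would instead learn both maps from examples.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))       # decoder: 2-D code -> 3-D message
encode = np.linalg.pinv(A)            # encoder: 3-D message -> 2-D code

x = A @ rng.standard_normal(2)        # a data point on the 2-D plane
z = encode @ x                        # its 2-D code
x_reconstructed = A @ z               # regenerate the input from the code

print(np.allclose(x, x_reconstructed))  # True: nothing is lost for data on the plane
```

Real data rarely lies exactly on a low-dimensional subspace, which is why the reconstruction is only approximate in practice and the network is trained to discard the insignificant residual.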
Variants exist, aiming to force the learned representations to assume useful properties.
Examples are regularized autoencoders (''Sparse'', ''Denoising'' and ''Contractive''), which are effective in learning representations for subsequent
classification tasks,
and ''Variational'' autoencoders, with applications as
generative models.
Autoencoders are applied to many problems, including
facial recognition, feature detection,
anomaly detection, and acquiring the meaning of words. Some variants, such as variational autoencoders, are also generative models: they can randomly generate new data that is similar to the input data (training data).
Mathematical principles
Definition
An autoencoder is defined by the following components:
Two sets: the space of decoded messages $\mathcal{X}$; the space of encoded messages $\mathcal{Z}$. Almost always, both $\mathcal{X}$ and $\mathcal{Z}$ are Euclidean spaces, that is, $\mathcal{X} = \mathbb{R}^m$ and $\mathcal{Z} = \mathbb{R}^n$ for some $m, n$.
Two parametrized families of functions: the encoder family $E_\phi : \mathcal{X} \to \mathcal{Z}$, parametrized by $\phi$; the decoder family $D_\theta : \mathcal{Z} \to \mathcal{X}$, parametrized by $\theta$.
For any $x \in \mathcal{X}$, we usually write $z = E_\phi(x)$, and refer to it as the code, the latent variable, latent representation, latent vector, etc. Conversely, for any $z \in \mathcal{Z}$, we usually write $x' = D_\theta(z)$, and refer to it as the (decoded) message.
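These definitions can be sketched directly in code. A toy example, with hypothetical matrix-valued parameters and the choice $\mathcal{X} = \mathbb{R}^4$, $\mathcal{Z} = \mathbb{R}^2$:

```python
import numpy as np

# Illustrative sketch of the two parametrized families (names are ours,
# not from any library): X = R^4 is the space of messages, Z = R^2 the
# space of codes.
m, n = 4, 2

def E(phi, x):
    """Encoder family E_phi : X -> Z, parametrized by a matrix phi."""
    return phi @ x

def D(theta, z):
    """Decoder family D_theta : Z -> X, parametrized by a matrix theta."""
    return theta @ z

rng = np.random.default_rng(0)
phi = rng.standard_normal((n, m))    # encoder parameters
theta = rng.standard_normal((m, n))  # decoder parameters

x = rng.standard_normal(m)   # a message in X
z = E(phi, x)                # its code (latent vector) in Z
x_dec = D(theta, z)          # the decoded message in X
```

With untrained (random) parameters the decoded message bears no resemblance to the input; training, discussed below, is what makes $D_\theta(E_\phi(x))$ approximate $x$.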
Usually, both the encoder and the decoder are defined as multilayer perceptrons. For example, a one-layer-MLP encoder $E_\phi$ is:

: $E_\phi(x) = \sigma(Wx + b)$

where $\sigma$ is an element-wise activation function such as a sigmoid function or a rectified linear unit, $W$ is a matrix called "weight", and $b$ is a vector called "bias".
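The one-layer encoder can be written out concretely; a minimal sketch with the sigmoid activation and illustrative shapes (a 4-dimensional input mapped to a 2-dimensional code):

```python
import numpy as np

# One-layer MLP encoder from the formula E(x) = sigma(W x + b).
# Shapes are illustrative, not prescribed: input in R^4, code in R^2.
rng = np.random.default_rng(0)
W = rng.standard_normal((2, 4))  # "weight" matrix
b = rng.standard_normal(2)       # "bias" vector

def sigmoid(t):
    # element-wise activation function
    return 1.0 / (1.0 + np.exp(-t))

def encoder(x):
    return sigmoid(W @ x + b)

x = rng.standard_normal(4)
z = encoder(x)  # each entry of the code lies in (0, 1) after the sigmoid
```

Swapping `sigmoid` for `np.maximum(0, t)` gives the rectified-linear variant mentioned above; deeper encoders stack several such layers.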
Training an autoencoder
An autoencoder, by itself, is simply a tuple of two functions. To judge its ''quality'', we need a ''task''. A task is defined by a reference probability distribution $\mu_{\mathrm{ref}}$ over $\mathcal{X}$, and a "reconstruction quality" function