In machine learning, the Highway Network was the first working very deep feedforward neural network with hundreds of layers, much deeper than previous artificial neural networks.
It uses skip connections modulated by learned gating mechanisms to regulate information flow, inspired by Long Short-Term Memory (LSTM) recurrent neural networks.
The advantage of a Highway Network over common deep neural networks is that it solves or partially prevents the vanishing gradient problem, which makes the resulting networks easier to optimize.
The gating mechanisms facilitate information flow across many layers ("information highways").
Highway Networks have been used as part of text sequence labeling and speech recognition tasks.
An open-gated or gateless Highway Network variant called the residual neural network (ResNet) was used to win the ImageNet 2015 competition. ResNet has become the most cited neural network of the 21st century.
Model
The model has two gates in addition to the H(W_H, x) gate: the transform gate T(W_T, x) and the carry gate C(W_C, x). The latter two gates are non-linear transfer functions (by convention, sigmoid functions). The H(W_H, x) function can be any desired transfer function.
The carry gate is defined as C(W_C, x) = 1 - T(W_T, x), while the transform gate is just a gate with a sigmoid transfer function.
Structure
The structure of a hidden layer follows the equation:
y = H(W_H, x) · T(W_T, x) + x · C(W_C, x)
where · denotes element-wise multiplication.
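As a rough illustration of how the gates from the Model section combine in a single hidden layer, the following minimal sketch weights the transformed signal by the transform gate and the unchanged input by the carry gate, then adds the two. It is not a reference implementation. Several details are assumptions made for the example and do not come from the text above: tanh as the choice for H, the explicit bias vectors b_H and b_T, the helper names affine, highway_layer and random_matrix, and the use of plain Python lists instead of a numerical library.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid, the conventional transfer function for the gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    """Return W @ x + b for a list-of-rows matrix W and vectors b, x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


def highway_layer(x, W_H, b_H, W_T, b_T):
    """One highway layer: y = H * T + x * C with C = 1 - T (element-wise)."""
    h = [math.tanh(z) for z in affine(W_H, b_H, x)]  # H(W_H, x); tanh is an assumed choice
    t = [sigmoid(z) for z in affine(W_T, b_T, x)]    # transform gate T(W_T, x)
    c = [1.0 - t_i for t_i in t]                     # carry gate C(W_C, x) = 1 - T(W_T, x)
    return [h_i * t_i + x_i * c_i
            for h_i, t_i, x_i, c_i in zip(h, t, x, c)]


def random_matrix(n, rng, scale=0.1):
    """Hypothetical helper: an n-by-n matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(n)] for _ in range(n)]


rng = random.Random(0)
n = 4
x = [0.5, -1.0, 2.0, 0.0]
W_H, W_T = random_matrix(n, rng), random_matrix(n, rng)
b_H = [0.0] * n

# A strongly negative gate bias closes the transform gate (T near 0),
# so the layer carries x through almost unchanged.
y_carry = highway_layer(x, W_H, b_H, W_T, [-20.0] * n)

# A strongly positive gate bias opens the transform gate (T near 1),
# so the output is close to H(W_H, x).
y_transform = highway_layer(x, W_H, b_H, W_T, [20.0] * n)
```

Because the input x is added element-wise to the transformed signal, this sketch requires the layer's input and output to have the same dimension. The two calls at the end show the limiting cases of the gating: with the transform gate nearly closed the layer behaves like an identity mapping, and with it nearly open the layer behaves like an ordinary non-linear layer.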