
LeNet is a series of convolutional neural network architectures proposed by LeCun et al. The earliest version, LeNet-1, was trained in 1989. In general, when "LeNet" is referred to without a number, it refers to LeNet-5 (1998), the most well-known version. Convolutional neural networks are a kind of feed-forward neural network whose artificial neurons respond to parts of the surrounding cells within a coverage range, and they perform well in large-scale image processing. LeNet-5 was one of the earliest convolutional neural networks and was historically important during the development of deep learning.
Development history

In 1988, LeCun joined the Adaptive Systems Research Department at AT&T Bell Laboratories in Holmdel, New Jersey, United States, headed by Lawrence D. Jackel.
In 1988, LeCun et al. published a neural network design that recognized handwritten zip codes. However, its convolutional kernels were hand-designed.
In 1989, Yann LeCun et al. at Bell Labs first applied the backpropagation algorithm to practical applications, believing that a network's ability to generalize could be greatly enhanced by providing constraints from the task's domain. He combined a convolutional neural network trained by backpropagation to read handwritten numbers, and successfully applied it to identifying handwritten zip code numbers provided by the US Postal Service. This was the prototype of what later came to be called LeNet-1.
In the same year, LeCun described a small handwritten digit recognition problem in another paper, and showed that even though the problem is linearly separable, single-layer networks exhibited poor generalization capabilities. When using shift-invariant feature detectors on a multi-layered, constrained network, the model performed very well. He believed that these results proved that minimizing the number of free parameters in a neural network could enhance its generalization ability.
In 1990, their paper again described the application of backpropagation networks to handwritten digit recognition. They performed only minimal preprocessing on the data, and the model was carefully designed for the task and highly constrained. The input data consisted of images, each containing a number, and testing on zip code digit data provided by the US Postal Service showed that the model had an error rate of only 1% and a rejection rate of about 9%.
Their research continued for the next four years, and in 1994 the MNIST database was developed, for which LeNet-1 was too small; hence a new LeNet-4 was trained on it.
A year later the AT&T Bell Labs collective introduced LeNet-5 and reviewed various methods of handwritten character recognition in a paper, using standard handwritten digits as the benchmark task. These models were compared, and the results showed that the latest network outperformed the other models.
By 1998 Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner were able to provide examples of practical applications of neural networks, such as two systems for recognizing handwritten characters online and models that could read millions of checks per day.
The research achieved great success and aroused scholars' interest in the study of neural networks. While the architectures of the best performing neural networks today are not the same as that of LeNet, the network was the starting point for a large number of neural network architectures, and it also brought inspiration to the field.
Architecture
LeNet has several common motifs of modern convolutional neural networks, such as the convolutional layer, pooling layer and fully connected layer.
* Every convolutional layer includes three parts: convolution, pooling, and a nonlinear activation function
* Using convolution to extract spatial features (convolution was originally called receptive fields)
* Subsampling with average pooling layers
* tanh activation function
* Fully connected layers in the final layers for classification
* Sparse connections between layers to reduce the complexity of computation
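The three parts of such a convolutional stage can be sketched in pure Python. This is a minimal illustration under stated assumptions, not the original implementation; the helper names, the toy 8x8 input, and the uniform kernel weights are invented for this example:

```python
import math

def conv2d_valid(image, kernel, bias=0.0):
    """'Valid' 2D convolution (cross-correlation) of a 2D image with a kernel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            s = bias
            for a in range(kh):
                for b in range(kw):
                    s += image[i + a][j + b] * kernel[a][b]
            row.append(s)
        out.append(row)
    return out

def avg_pool2x2(fmap):
    """2x2 average pooling (subsampling), as in LeNet's pooling layers."""
    return [[(fmap[i][j] + fmap[i][j+1] + fmap[i+1][j] + fmap[i+1][j+1]) / 4.0
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

def tanh_map(fmap):
    """Element-wise tanh activation."""
    return [[math.tanh(v) for v in row] for row in fmap]

# One LeNet-style stage on a toy 8x8 checkerboard input with a 5x5 kernel:
image = [[float((i + j) % 2) for j in range(8)] for i in range(8)]
kernel = [[0.04] * 5 for _ in range(5)]  # arbitrary illustrative weights
stage = tanh_map(avg_pool2x2(conv2d_valid(image, kernel)))
# 8x8 input -> 4x4 feature map (valid convolution) -> 2x2 after pooling
```

Note the ordering convolution, then pooling, then nonlinearity, matching the three parts listed above.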
LeNet-1
Before LeNet-1, the 1988 architecture
was a hybrid approach. The first stage scaled, deskewed, and
skeletonized the input image. The second stage was a convolutional layer with 18 hand-designed kernels. The third stage was a fully connected network with one hidden layer.
The LeNet-1 architecture has 3 hidden layers (H1-H3) and an output layer.
It has 1256 units, 64660 connections, and 9760 independent parameters.
* H1 (Convolutional): 12 feature maps of 8x8, with 5x5 kernels applied at stride 2.
* H2 (Convolutional): 12 feature maps of 4x4, with 5x5 kernels applied at stride 2; each H2 map takes input from a subset of the H1 maps.
* H3: 30 units fully connected to H2.
* Output: 10 units fully connected to H3, representing the 10 digit classes (0-9).
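The stated unit total can be checked by arithmetic. The layer sizes below are assumptions taken from the 1989 configuration (a 16x16 input, 12 feature maps of 8x8 in H1, 12 feature maps of 4x4 in H2, 30 units in H3, and 10 outputs), with the input units included in the count:

```python
# Unit count for LeNet-1, assuming the 1989 configuration:
# 16x16 input, 12 feature maps of 8x8 (H1), 12 feature maps of 4x4 (H2),
# 30 hidden units (H3), and 10 output units.
units = 16 * 16 + 12 * 8 * 8 + 12 * 4 * 4 + 30 + 10
```

Under these assumptions the sum reproduces the 1256 units quoted above.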
The dataset was 9298 grayscale images, digitized from handwritten zip codes that appeared on U.S. mail passing through the Buffalo, New York post office. The training set had 7291 data points, and the test set had 2007. Both training and test sets contained ambiguous, unclassifiable, and misclassified data. Training took 3 days on a Sun workstation.
Compared to the previous 1988 architecture, there was no skeletonization, and the convolutional kernels were learned automatically by backpropagation.
A later version of LeNet-1 has four hidden layers (H1-H4) and an output layer. It takes a 28x28 pixel image as input, though the active region is 16x16 to avoid boundary effects.
* H1 (Convolutional): 4 feature maps of 24x24, with 5x5 kernels. This layer has 104 trainable parameters (100 from kernels, 4 from biases).
* H2 (Pooling): 4 feature maps of 12x12, by 2x2 average pooling.
* H3 (Convolutional): 12 feature maps of 8x8, with 5x5 kernels. Some kernels take input from 1 feature map, while others take inputs from 2 feature maps.
* H4 (Pooling): 12 feature maps of 4x4, by 2x2 average pooling.
* Output: 10 units fully connected to H4, representing the 10 digit classes (0-9).
The network has 4635 units, 98442 connections, and 2578 trainable parameters. It was initialized from a previous CNN with 4 times as many trainable parameters, then pruned by Optimal Brain Damage. One forward pass requires about 140,000 multiply-add operations.
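The parameter counts follow directly from the layer shapes. The helpers below are a sketch assuming standard 'valid' convolutions with one bias per output map (the function names are illustrative, not from the original papers):

```python
def conv_out(n, k, stride=1):
    """Spatial size of a 'valid' convolution output."""
    return (n - k) // stride + 1

def conv_params(in_maps, out_maps, k):
    """Trainable parameters of a convolutional layer: one k x k kernel per
    (input map, output map) pair, plus one bias per output map."""
    return out_maps * (in_maps * k * k + 1)

# H1 of the later LeNet-1: assuming 4 feature maps with 5x5 kernels over a
# single input image, this reproduces the 104 parameters stated above
# (100 from kernels + 4 from biases).
h1_params = conv_params(1, 4, 5)
# A 5x5 valid convolution over a 28x28 input yields 24x24 feature maps.
h1_size = conv_out(28, 5)
```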
LeNet-4
LeNet-4 was a larger version of LeNet-1 designed to fit the larger MNIST database. It had more feature maps in its convolutional layers, and had an additional layer of hidden units, fully connected to both the last convolutional layer and to the output units. It has 2 convolutions, 2 average poolings, and 2 fully connected layers. It has about 17000 trainable parameters.
One forward pass requires about 260,000
multiply-add operations.
LeNet-5


LeNet-5 is similar to LeNet-4, but with more fully connected layers. Its architecture is shown in the image on the right. It has 2 convolutions, 2 average poolings, and 3 fully connected layers.
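As a sketch of how the feature-map sizes follow from this layout, the following traces the commonly cited LeNet-5 configuration from the 1998 paper (32x32 input, 5x5 kernels, 2x2 average pooling); the exact numbers are assumptions taken from that paper rather than from the text above:

```python
def conv_out(n, k):
    """'Valid' convolution output size."""
    return n - k + 1

def pool_out(n, p):
    """Non-overlapping pooling output size."""
    return n // p

# Commonly cited LeNet-5 configuration: 32x32 input, 5x5 kernels,
# 2x2 average pooling.
size = 32
size = conv_out(size, 5)   # C1: 6 feature maps of 28x28
size = pool_out(size, 2)   # S2: 6 feature maps of 14x14
size = conv_out(size, 5)   # C3: 16 feature maps of 10x10
size = pool_out(size, 2)   # S4: 16 feature maps of 5x5
# Followed by fully connected layers of 120, 84, and 10 units.
```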
LeNet-5 was trained for about 20 epochs over MNIST. It took 2 to 3 days of CPU time on a Silicon Graphics Origin 2000 server, using a single 200 MHz R10000 processor.
Application
Recognizing simple digit images is the most classic application of LeNet, as it was created for that purpose.
Yann LeCun et al. created LeNet-1 in 1989. The paper ''Backpropagation Applied to Handwritten Zip Code Recognition'' demonstrates how such constraints can be integrated into a backpropagation network through the architecture of the network. It was successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service.
After the development of LeNet-1, as a demonstration of real-time application, they loaded the neural network into an AT&T DSP-32C digital signal processor with a peak performance of 12.5 million multiply-add operations per second. It could normalize and classify 10 digits per second, or classify 30 pre-normalized digits per second. Shortly afterwards, the research group started working with a development group and a product group at NCR (acquired by AT&T in 1991). This resulted in ATM machines that could read the numerical amounts on checks using a LeNet loaded on the DSP-32C. Later, NCR deployed a similar system in large cheque-reading machines in bank back offices.
Development analysis
LeNet-5 marks the emergence of the CNN and defines its basic components.
But it was not popular at the time because of the lack of hardware, especially GPUs, and because other algorithms, such as the SVM, could achieve similar effects or even exceed LeNet.
Since the success of AlexNet in 2012, the CNN has become the best choice for computer vision applications, and many different types of CNN have been created, such as the R-CNN series. Today's CNN models are quite different from LeNet, but they were all developed on the basis of LeNet.
A three-layer tree architecture imitating LeNet-5 and consisting of only one convolutional layer has achieved a similar success rate on the CIFAR-10 dataset. Increasing the number of filters in the LeNet architecture results in a power-law decay of the error rate. These results indicate that a shallow network can achieve the same performance as deep learning architectures.
References
{{reflist}}
External links
LeNet-5, convolutional neural networks: an online project page for LeNet maintained by Yann LeCun, containing animations and a bibliography.
Lush (leon.bottou.org/projects/lush), an object-oriented programming language. It contains SN, a neural network simulator. The LeNet series was written in SN.
Artificial neural networks