
LeNet is a series of convolutional neural network (CNN) architectures proposed by LeCun et al. The earliest version, LeNet-1, was trained in 1989. In general, when "LeNet" is referred to without a number, it means LeNet-5 (1998), the most well-known version. Convolutional neural networks are a kind of feed-forward neural network whose artificial neurons respond to part of the surrounding cells within their coverage range, and they perform well in large-scale image processing. LeNet-5 was one of the earliest convolutional neural networks and was historically important in the development of deep learning.


Development history

In 1988, LeCun joined the Adaptive Systems Research Department at AT&T Bell Laboratories in Holmdel, New Jersey, United States, headed by Lawrence D. Jackel. In 1988, LeCun et al. published a neural network design that recognized handwritten zip codes; however, its convolutional kernels were hand-designed. In 1989, Yann LeCun et al. at Bell Labs first applied the backpropagation algorithm to practical applications, and believed that a network's ability to generalize could be greatly enhanced by providing constraints from the task's domain. He combined a convolutional neural network trained by backpropagation with the task of reading handwritten numbers and successfully applied it to identifying handwritten zip code numbers provided by the US Postal Service. This was the prototype of what later came to be called LeNet-1.

In the same year, LeCun described a small handwritten digit recognition problem in another paper and showed that, even though the problem is linearly separable, single-layer networks exhibited poor generalization. When shift-invariant feature detectors were used in a multi-layered, constrained network, the model performed very well. He argued that these results showed that minimizing the number of free parameters in a neural network enhances its generalization ability.

In 1990, their paper again described the application of backpropagation networks to handwritten digit recognition. They performed only minimal preprocessing on the data, and the model was carefully designed and highly constrained for this task. The input data consisted of images, each containing a single digit, and tests on the zip code digit data provided by the US Postal Service showed an error rate of only 1% and a rejection rate of about 9%.

Their research continued for the next four years. In 1994 the MNIST database was developed, for which LeNet-1 was too small, so a new LeNet-4 was trained on it. A year later the AT&T Bell Labs group introduced LeNet-5 and published a review of various methods for handwritten character recognition, compared on a standard handwritten-digit benchmark; the results showed that the newest network outperformed the other models. By 1998 Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner were able to provide examples of practical applications of neural networks, such as two systems for recognizing handwritten characters online and models that could read millions of checks per day.

The research achieved great success and aroused the interest of scholars in the study of neural networks. While the architectures of today's best-performing neural networks are not the same as LeNet's, the network was the starting point for a large number of neural network architectures and brought inspiration to the field.


Architecture

LeNet has several common motifs of modern convolutional neural networks, such as convolutional layers, pooling layers, and fully connected layers.
* Every convolutional stage includes three parts: convolution, pooling, and a nonlinear activation function
* Convolution is used to extract spatial features (convolutions were originally called receptive fields)
* Subsampling by average pooling layers
* tanh activation function
* Fully connected layers at the end for classification
* Sparse connections between layers to reduce the complexity of computation
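The motifs above can be illustrated with a minimal NumPy sketch of one LeNet-style stage: convolution, then 2×2 average subsampling, then tanh. This is an illustrative sketch, not the original implementation; in the actual networks the subsampling step also applied a trainable coefficient and bias per feature map (omitted here for brevity), and the kernel and bias values below are random placeholders.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def avg_pool2x2(fmap):
    """2x2 average pooling (subsampling), halving each spatial dimension."""
    h, w = fmap.shape
    return fmap[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def lenet_stage(image, kernels, biases):
    """One LeNet-style stage: convolution -> average pooling -> tanh."""
    return np.stack([np.tanh(avg_pool2x2(conv2d_valid(image, k) + b))
                     for k, b in zip(kernels, biases)])

rng = np.random.default_rng(0)
image = rng.standard_normal((28, 28))     # a 28x28 grayscale input
kernels = rng.standard_normal((4, 5, 5))  # 4 learned 5x5 kernels (random here)
biases = rng.standard_normal(4)
out = lenet_stage(image, kernels, biases)
print(out.shape)  # (4, 12, 12): 28 -> 24 after the 5x5 conv, -> 12 after pooling
```

Note how the spatial size shrinks at each step (28 to 24 to 12), which is the same pattern the layer listings below follow.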


LeNet-1

Before LeNet-1, the 1988 architecture was a hybrid approach. The first stage scaled, deskewed, and skeletonized the input image. The second stage was a convolutional layer with 18 hand-designed kernels. The third stage was a fully connected network with one hidden layer.

The LeNet-1 architecture has 3 hidden layers (H1-H3) and an output layer, with 1256 units, 64660 connections, and 9760 independent parameters.
* H1 (Convolutional): 16×16 → 12×8×8 with 5×5 kernels.
* H2 (Convolutional): 12×8×8 → 12×4×4 with 8×5×5 kernels.
* H3: 30 units fully connected to H2.
* Output: 10 units fully connected to H3, representing the 10 digit classes (0-9).

The dataset was 9298 grayscale images, digitized from handwritten zip codes that appeared on U.S. mail passing through the Buffalo, New York post office. The training set had 7291 data points and the test set had 2007; both contained ambiguous, unclassifiable, and misclassified samples. Training took 3 days on a Sun workstation. Compared to the previous 1988 architecture, there was no skeletonization, and the convolutional kernels were learned automatically by backpropagation.

A later version of LeNet-1 has four hidden layers (H1-H4) and an output layer. It takes a 28×28 pixel image as input, though the active region is 16×16 to avoid boundary effects.
* H1 (Convolutional): 28×28 → 4×24×24 with 5×5 kernels. This layer has 104 trainable parameters (100 from kernels, 4 from biases).
* H2 (Pooling): 4×24×24 → 4×12×12 by 2×2 average pooling.
* H3 (Convolutional): 4×12×12 → 12×8×8 with 5×5 kernels. Some kernels take input from 1 feature map, while others take input from 2 feature maps.
* H4 (Pooling): 12×8×8 → 12×4×4 by 2×2 average pooling.
* Output: 10 units fully connected to H4, representing the 10 digit classes (0-9).

The network has 4635 units, 98442 connections, and 2578 trainable parameters. It was initialized from a previous CNN with 4 times as many trainable parameters, then pruned with Optimal Brain Damage. One forward pass requires about 140,000 multiply-add operations.


LeNet-4

LeNet-4 was a larger version of LeNet-1 designed to fit the larger MNIST database. It had more feature maps in its convolutional layers, and had an additional layer of hidden units, fully connected to both the last convolutional layer and to the output units. It has 2 convolutions, 2 average poolings, and 2 fully connected layers. It has about 17000 trainable parameters. One forward pass requires about 260,000 multiply-add operations.


LeNet-5

LeNet-5 is similar to LeNet-4, but with more fully connected layers. Its architecture is shown in the image on the right. It has 2 convolutions, 2 average poolings, and 3 fully connected layers. LeNet-5 was trained for about 20 epochs over MNIST. Training took 2 to 3 days of CPU time on a Silicon Graphics Origin 2000 server, using a single 200 MHz R10000 processor.
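The layer shapes of this stack can be traced with a short sketch. The sizes below follow the commonly cited layout from the 1998 paper (32×32 input, 6 and 16 feature maps), which this article does not spell out, so treat the specific map counts as an assumption taken from that paper:

```python
# Shape walkthrough of the commonly cited LeNet-5 layout:
# 2 convolutions (5x5, 'valid'), 2 average poolings (2x2), 3 fully connected layers.
def conv_out(size, k=5):   # a 'valid' convolution shrinks each side by k - 1
    return size - k + 1

def pool_out(size):        # 2x2 subsampling halves each side
    return size // 2

s = 32                     # 32x32 input (MNIST digits padded from 28x28)
s = conv_out(s)            # C1: 6 feature maps, 28x28
s = pool_out(s)            # S2: 6 maps, 14x14
s = conv_out(s)            # C3: 16 feature maps, 10x10
s = pool_out(s)            # S4: 16 maps, 5x5
flat = 16 * s * s          # 400 inputs feeding the fully connected stack
print(s, flat)  # 5 400
# Fully connected stack: 400 -> 120 -> 84 -> 10 output classes.
```

The same two helper formulas reproduce the LeNet-1 shape chain above (28 → 24 → 12, then 12 → 8 → 4).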


Application

Recognizing simple digit images is the classic application of LeNet, since the network was created for that task. Yann LeCun et al. created LeNet-1 in 1989. The paper ''Backpropagation Applied to Handwritten Zip Code Recognition'' demonstrates how such constraints can be integrated into a backpropagation network through the network's architecture, and the approach was successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service.

After the development of LeNet-1, as a demonstration of real-time application, the team loaded the neural network onto an AT&T DSP-32C digital signal processor with a peak performance of 12.5 million multiply-add operations per second. It could normalize and classify 10 digits a second, or classify 30 pre-normalized digits a second. Shortly afterwards, the research group started working with a development group and a product group at NCR (acquired by AT&T in 1991). This resulted in ATMs that could read the numerical amounts on checks using a LeNet running on the DSP-32C. Later, NCR deployed a similar system in large cheque-reading machines in bank back offices.


Development analysis

LeNet-5 marked the emergence of CNNs and defined their basic components. However, it was not popular at the time because of the lack of hardware, especially GPUs, and because other algorithms, such as SVMs, could achieve similar or even better results. Since the success of AlexNet in 2012, CNNs have become the best choice for computer vision applications, and many different types of CNN have been created, such as the R-CNN series. Today's CNN models are quite different from LeNet, but they were all developed on the basis of LeNet.

A three-layer tree architecture imitating LeNet-5 and consisting of only one convolutional layer has achieved a similar success rate on the CIFAR-10 dataset. Increasing the number of filters in the LeNet architecture results in a power-law decay of the error rate. These results indicate that a shallow network can achieve the same performance as deep learning architectures.




External links


LeNet-5, convolutional neural networks: an online project page for LeNet maintained by Yann LeCun, containing animations and a bibliography.

projects:lush (leon.bottou.org): Lush, an object-oriented programming language. It contains SN, a neural network simulator; the LeNet series was written in SN.