Handwriting recognition (HWR), also known as handwritten text recognition (HTR), is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other devices. The image of the written text may be sensed "off line" from a piece of paper by optical scanning (optical character recognition) or intelligent word recognition. Alternatively, the movements of the pen tip may be sensed "on line", for example by a pen-based computer screen surface, a generally easier task as there are more clues available. A handwriting recognition system handles formatting, performs correct segmentation into characters, and finds the most plausible words.
Offline recognition
Offline handwriting recognition involves the automatic conversion of text in an image into letter codes that are usable within computer and text-processing applications. The data obtained in this form is regarded as a static representation of handwriting. Offline handwriting recognition is comparatively difficult, as different people have different handwriting styles. As of today, OCR engines are primarily focused on machine-printed text, and ICR on hand-"printed" text (written in capital letters).
Traditional techniques
Character extraction
Offline character recognition often involves scanning a form or document. The individual characters contained in the scanned image must then be extracted. Tools exist that are capable of performing this step, but it has several common imperfections. The most common occurs when connected characters are returned as a single sub-image containing both characters, which causes a major problem in the recognition stage. Many algorithms are available that reduce the risk of connected characters.
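As an illustration of the extraction step and its most common failure, the following minimal sketch labels 4-connected ink regions in a tiny binary image and returns their bounding boxes. The function name `extract_components` and the toy images are hypothetical and chosen only for this example; real tools use more elaborate segmentation.

```python
from collections import deque


def extract_components(image):
    """Return bounding boxes (top, left, bottom, right) of 4-connected ink
    regions, sorted left to right. `image` is a list of equal-length rows
    of 0 (background) and 1 (ink)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda box: box[1])


# Two separate vertical strokes: extracted as two sub-images.
SEPARATE = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]

# The same strokes joined by a connecting line: extracted as one sub-image.
TOUCHING = [
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
]

print(extract_components(SEPARATE))
print(extract_components(TOUCHING))
```

In the second image the two strokes come back as a single box, which is the connected-character problem described above.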
Character recognition
After individual characters have been extracted, a recognition engine is used to identify the corresponding computer character. Several different recognition techniques are currently available.
Feature extraction
Feature extraction works in a similar fashion to neural network recognizers. However, programmers must manually determine the properties they feel are important. This approach gives the recognizer more control over the properties used in identification. Yet any system using this approach requires substantially more development time than a neural network because the properties are not learned automatically.
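A minimal sketch of what manually chosen properties can look like: the glyph is divided into a grid of zones and the fraction of ink in each zone is used as a feature, followed by a nearest-prototype comparison. The zoning feature, the function names and the prototype glyphs are assumptions made for this example and are not taken from any particular recognizer.

```python
def zoning_features(glyph, zones=2):
    """Split a binary glyph into a zones x zones grid and return the fraction
    of ink pixels in each cell, in row-major order. The glyph is assumed to
    have at least `zones` rows and columns."""
    rows, cols = len(glyph), len(glyph[0])
    features = []
    for zr in range(zones):
        for zc in range(zones):
            r0, r1 = zr * rows // zones, (zr + 1) * rows // zones
            c0, c1 = zc * cols // zones, (zc + 1) * cols // zones
            cell = [glyph[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            features.append(sum(cell) / len(cell))
    return features


def classify(glyph, prototypes):
    """Return the label whose prototype feature vector is closest to the
    glyph's features (squared Euclidean distance)."""
    features = zoning_features(glyph)
    return min(
        prototypes,
        key=lambda label: sum(
            (a - b) ** 2 for a, b in zip(features, prototypes[label])
        ),
    )


GLYPH_L = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
GLYPH_T = [
    [1, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
# An "L" with one pixel missing, standing in for a noisy scan.
NOISY_L = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]

PROTOTYPES = {"L": zoning_features(GLYPH_L), "T": zoning_features(GLYPH_T)}
print(classify(NOISY_L, PROTOTYPES))
```

Every design decision here (grid size, distance measure, which glyphs serve as prototypes) is made by the programmer, which is where the extra development time comes from.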
Modern techniques
Where traditional techniques focus on segmenting individual characters for recognition, modern techniques focus on recognizing all the characters in a segmented line of text. In particular, they focus on machine learning techniques that are able to learn visual features, avoiding the limiting feature engineering previously used. State-of-the-art methods use convolutional networks to extract visual features over several overlapping windows of a text line image, which a recurrent neural network uses to produce character probabilities.
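The text above does not say how the per-window character probabilities are turned into a string. One common choice, assumed here purely for illustration, is greedy best-path decoding with a "blank" symbol: take the most probable symbol in each window, merge consecutive repeats, and drop blanks. The alphabet, the blank symbol and the probability values below are hypothetical.

```python
BLANK = "-"  # hypothetical "no character" symbol (an assumption of this sketch)


def greedy_decode(frame_probs, alphabet):
    """Decode a sequence of per-window probability vectors into text.

    frame_probs: list of lists, one probability per symbol in `alphabet`.
    Picks the most probable symbol per window, merges consecutive repeats,
    then removes blanks.
    """
    best = [
        alphabet[max(range(len(probs)), key=probs.__getitem__)]
        for probs in frame_probs
    ]
    decoded = []
    previous = None
    for symbol in best:
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)


ALPHABET = [BLANK, "a", "t"]

# Five overlapping windows over a text line image that reads "ta".
FRAMES = [
    [0.1, 0.1, 0.8],  # t
    [0.1, 0.2, 0.7],  # t (same character seen by the next window)
    [0.8, 0.1, 0.1],  # blank
    [0.1, 0.8, 0.1],  # a
    [0.2, 0.7, 0.1],  # a
]

print(greedy_decode(FRAMES, ALPHABET))
```

Because overlapping windows often see the same character twice, repeats are merged; a blank between two identical symbols keeps genuine double letters apart.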
Online recognition
Online handwriting recognition involves the automatic conversion of text as it is written on a special digitizer or PDA, where a sensor picks up the pen-tip movements as well as pen-up/pen-down switching. This kind of data is known as digital ink and can be regarded as a digital representation of handwriting. The obtained signal is converted into letter codes that are usable within computer and text-processing applications.
The elements of an online handwriting recognition interface typically include:
* a pen or stylus for the user to write with,
* a touch-sensitive surface, which may be integrated with, or adjacent to, an output display, and
* a software application which interprets the movements of the stylus across the writing surface, translating the resulting strokes into digital text.
The process of online handwriting recognition can be broken down into a few general steps:
* preprocessing,
* feature extraction and
* classification
The purpose of preprocessing is to discard irrelevant information in the input data that can negatively affect recognition, in terms of both speed and accuracy. Preprocessing usually consists of binarization, normalization, sampling, smoothing and denoising. The second step is feature extraction. From the two- or higher-dimensional vector field produced by the preprocessing algorithms, higher-dimensional data is extracted. The purpose of this step is to highlight important information for the recognition model. This data may include information such as pen pressure, velocity or changes of writing direction. The last major step is classification. In this step, various models are used to map the extracted features to different classes, thus identifying the characters or words the features represent.
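The three steps can be sketched for a single pen stroke as follows. This is a toy example under stated assumptions: the stroke is a list of (x, y) samples with y increasing upward, the only feature used is quantized writing direction, and classification is a simple template lookup. All function names and templates are hypothetical.

```python
import math


def preprocess(stroke):
    """Drop repeated samples and scale the stroke into a unit box,
    preserving its aspect ratio."""
    points = [p for i, p in enumerate(stroke) if i == 0 or p != stroke[i - 1]]
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    size = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / size, (y - min_y) / size) for x, y in points]


def direction_codes(points):
    """Quantize the writing direction of each segment into 8 codes
    (0 = +x, counted counter-clockwise), merging consecutive repeats."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)
        code = round(angle / (math.pi / 4)) % 8
        if not codes or codes[-1] != code:
            codes.append(code)
    return codes


def classify(stroke, templates):
    """Map a stroke to a label by exact match of its direction codes,
    or None if no template matches."""
    return templates.get(tuple(direction_codes(preprocess(stroke))))


# Hypothetical templates: "L" is down then right, "V" is down-right then up-right.
TEMPLATES = {(6, 0): "L", (7, 1): "V"}

STROKE_L = [(0, 10), (0, 10), (0, 5), (0, 0), (5, 0), (10, 0)]
STROKE_V = [(0, 10), (5, 0), (10, 10)]

print(classify(STROKE_L, TEMPLATES))
print(classify(STROKE_V, TEMPLATES))
```

A practical system would also resample and smooth the stroke and would use a trained model rather than exact template matching; the lookup stands in for the classification step only.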
Hardware
Commercial products incorporating handwriting recognition as a replacement for keyboard input were introduced in the early 1980s. Examples include handwriting terminals such as the Pencept Penpad and the Inforite point-of-sale terminal.
With the advent of the large consumer market for personal computers, several commercial products were introduced to replace the keyboard and mouse on a personal computer with a single pointing/handwriting system, such as those from Pencept, CIC and others.
The first commercially available tablet-type portable computer was the Write-Top from Linus Technologies, released in July 1988. Its operating system was based on MS-DOS.
In the early 1990s, hardware makers including NCR, IBM and EO released tablet computers running the PenPoint operating system developed by GO Corp. PenPoint used handwriting recognition and gestures throughout and provided the facilities to third-party software. IBM's tablet computer was the first to use the ThinkPad name and used IBM's handwriting recognition. This recognition system was later ported to Microsoft Windows for Pen Computing, and IBM's Pen for OS/2. None of these were commercially successful.
Advancements in electronics allowed the computing power necessary for handwriting recognition to fit into a smaller form factor than tablet computers, and handwriting recognition is often used as an input method for hand-held PDAs. The first PDA to provide written input was the Apple Newton, which exposed the public to the advantage of a streamlined user interface. However, the device was not a commercial success, owing to the unreliability of the software, which tried to learn a user's writing patterns. By the time Newton OS 2.0 was released, with greatly improved handwriting recognition and unique features still not found in current recognition systems, such as modeless error correction, the largely negative first impression had been made. After the Apple Newton was discontinued, the feature was incorporated in Mac OS X 10.2 and later as Inkwell.
Palm later launched a successful series of PDAs based on the Graffiti recognition system. Graffiti improved usability by defining a set of "unistrokes", or one-stroke forms, for each character. This narrowed the possibility for erroneous input, although memorization of the stroke patterns did increase the learning curve for the user. The Graffiti handwriting recognition was found to infringe on a patent held by Xerox, and Palm replaced Graffiti with a licensed version of the CIC handwriting recognition which, while also supporting unistroke forms, pre-dated the Xerox patent. The court finding of infringement was reversed on appeal, and then reversed again on a later appeal. The parties involved subsequently negotiated a settlement concerning this and other patents.
A Tablet PC is a notebook computer with a digitizer tablet and a stylus, which allows a user to handwrite text on the unit's screen. The operating system recognizes the handwriting and converts it into text. Windows Vista and Windows 7 include personalization features that learn a user's writing patterns or vocabulary for English, Japanese, Chinese Traditional, Chinese Simplified and Korean. The features include a "personalization wizard" that prompts for samples of a user's handwriting and uses them to retrain the system for higher-accuracy recognition. This system is distinct from the less advanced handwriting recognition system employed in Microsoft's Windows Mobile OS for PDAs.
Although handwriting recognition is an input form that the public has become accustomed to, it has not achieved widespread use in either desktop computers or laptops. It is still generally accepted that keyboard input is both faster and more reliable. Many PDAs offer handwriting input, sometimes even accepting natural cursive handwriting, but accuracy is still a problem, and some people still find even a simple on-screen keyboard more efficient.
Software
Early software could understand print handwriting where the characters were separated; however, cursive handwriting with connected characters presented Sayre's Paradox, a difficulty involving character segmentation. In 1962, Shelia Guberman, then in Moscow, wrote the first applied pattern recognition program. Commercial examples came from companies such as Communications Intelligence Corporation and IBM.
In the early 1990s, two companies – ParaGraph International and Lexicus – came up with systems that could recognize cursive handwriting. ParaGraph was based in Russia and founded by computer scientist Stepan Pachikov, while Lexicus was founded by Ronjon Nag and Chris Kortge, who were students at Stanford University. The ParaGraph CalliGrapher system was deployed in the Apple Newton systems, and the Lexicus Longhand system was made available commercially for the PenPoint and Windows operating systems. Lexicus was acquired by Motorola in 1993 and went on to develop Chinese handwriting recognition and predictive text systems for Motorola. ParaGraph was acquired in 1997 by SGI and its handwriting recognition team formed a P&I division, later acquired from SGI by Vadem. Microsoft acquired CalliGrapher handwriting recognition and other digital ink technologies developed by P&I from Vadem in 1999.
Wolfram Mathematica (8.0 or later) also provides a handwriting or text recognition function TextRecognize.
Research

Handwriting recognition has an active community of academics studying it. The biggest conferences for handwriting recognition are the International Conference on Frontiers in Handwriting Recognition (ICFHR), held in even-numbered years, and the International Conference on Document Analysis and Recognition (ICDAR), held in odd-numbered years. Both of these conferences are endorsed by the IEEE and IAPR.
The ICDAR 2021 proceedings were published by Springer in the Lecture Notes in Computer Science (LNCS) series.
Active areas of research include:
* Online recognition
* Offline recognition
* Signature verification
* Postal address interpretation
* Bank check processing
* Writer recognition
Results since 2009
Since 2009, the recurrent neural networks and deep feedforward neural networks developed in the research group of Jürgen Schmidhuber at the Swiss AI Lab IDSIA have won several international handwriting competitions. In particular, the bi-directional and multi-dimensional long short-term memory (LSTM) of Alex Graves et al. won three competitions in connected handwriting recognition at the 2009 International Conference on Document Analysis and Recognition (ICDAR), without any prior knowledge about the three different languages (French, Arabic, Persian) to be learned. Recent GPU-based deep learning methods for feedforward networks by Dan Ciresan and colleagues at IDSIA won the ICDAR 2011 offline Chinese handwriting recognition contest; their neural networks were also the first artificial pattern recognizers to achieve human-competitive performance on the famous MNIST handwritten digits problem of Yann LeCun and colleagues at NYU.
Benjamin Graham of the University of Warwick won a 2013 Chinese handwriting recognition contest, with only a 2.61% error rate, by using an approach to convolutional neural networks that evolved (by 2017) into "sparse convolutional neural networks".
See also
* AI effect
* Applications of artificial intelligence
* Electronic signature
* eScriptorium
* Handwriting movement analysis
* Intelligent character recognition
* Live Ink Character Recognition Solution
* Neocognitron
* Optical character recognition
* Pen computing
* Sketch recognition
* Stylus (computing)
* Tablet PC
Lists
* Outline of artificial intelligence
* List of emerging technologies
External links
* Annotated bibliography of references to gesture and pen computing
* Notes on the History of Pen-based Computing (video on YouTube)