Image tracing

In computer graphics, image tracing, raster-to-vector conversion or raster vectorization is the conversion of raster graphics into vector graphics.


Background

An image does not have any structure: it is just a collection of marks on paper, grains in film, or pixels in a bitmap. While such an image is useful, it has limits. If the image is magnified enough, its artifacts appear: the halftone dots, film grains, and pixels become apparent, and sharp edges become fuzzy or jagged (see, for example, pixelation). Ideally, a vector image does not have the same problem. Edges and filled areas are represented as mathematical curves or gradients, and they can be magnified arbitrarily (though the final image must still be rasterized to be rendered, and its quality depends on the quality of the rasterization algorithm for the given inputs).

The task in vectorization is to convert a two-dimensional image into a two-dimensional vector representation of the image. It does not examine the image and attempt to recognize or extract a three-dimensional model that may be depicted; that is, it is not a vision system. For most applications, vectorization also does not involve optical character recognition; characters are treated as lines, curves, or filled objects without attaching any significance to them. In vectorization, the shape of the character is preserved, so artistic embellishments remain.

Vectorization is the inverse operation of rasterization, as integration is to differentiation. And, just as with those two operations, while rasterization is fairly straightforward and algorithmic, vectorization involves the reconstruction of lost information and therefore requires heuristic methods.

Synthetic images such as maps, cartoons, logos, clip art, and technical drawings are suitable for vectorization. Those images could have been originally made as vector images because they are based on geometric shapes or drawn with simple curves. Continuous tone photographs (such as live portraits) are not good candidates for vectorization.

The input to vectorization is an image, but an image may come in many forms such as a photograph, a drawing on paper, or one of several raster file formats. Programs that do raster-to-vector conversion may accept bitmap formats such as TIFF, BMP and PNG. The output is a vector file format. Common vector formats are SVG, DXF, EPS, EMF and AI.

Vectorization can be used to update images or recover work. Personal computers often come with a simple paint program that produces a bitmap output file. These programs allow users to make simple illustrations by adding text, drawing outlines, and filling outlines with a specific color. Only the results of these operations (the pixels) are saved in the resulting bitmap; the drawing and filling operations are discarded. Vectorization can be used to recapture some of the information that was lost.

Vectorization is also used to recover information that was originally in a vector format but has been lost or has become unavailable. A company may have commissioned a logo from a graphic arts firm. Although the graphics firm used a vector format, the client company may not have received a copy of that format. The company may then acquire a vector format by scanning and vectorizing a paper copy of the logo.
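The claim that rasterization discards information, so that only the pixels survive, can be illustrated with a minimal sketch. The function name `rasterize_disk`, the grid size, and the rule of sampling each pixel at its center are assumptions made for this example; they are not taken from any program mentioned in this article.

```python
def rasterize_disk(cx, cy, radius, size):
    """Return a size x size grid of 0/1 pixels for a filled disk.

    A pixel is set when its center (x, y) lies inside the disk.
    This sampling rule is an assumption chosen for simplicity.
    """
    return [
        [1 if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 else 0
         for x in range(size)]
        for y in range(size)
    ]


# Two different vector shapes (radius 3.0 and radius 3.1) ...
a = rasterize_disk(5, 5, 3.0, 11)
b = rasterize_disk(5, 5, 3.1, 11)
# ... and a third, slightly larger one.
c = rasterize_disk(5, 5, 3.2, 11)

print(a == b)  # True: both radii produce the same bitmap
print(a == c)  # False: this radius covers additional pixels

for row in a:
    print("".join("#" if p else "." for p in row))
```

Because the first two disks yield identical bitmaps, no vectorizer can tell from the pixels alone which radius was drawn; it can only pick a plausible one, which is why heuristics are needed.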


Process

Vectorization starts with an image.


Manual

The image can be vectorized manually. A person could look at the image, make some measurements, and then write the output file by hand. That was the case for the vectorization of a technical illustration about neutrinos. The illustration has a few geometric shapes and a lot of text; it was relatively easy to convert the shapes, and the SVG vector format allows the text (even subscripts and superscripts) to be entered easily. The original image did not have any curves (except for the text), so the conversion was straightforward. Curves make the conversion more complicated.

Manual vectorization of complicated shapes can be facilitated by the tracing function built into some vector graphics editing programs. If the image is not yet in machine-readable form, it has to be scanned into a usable file format. Once there is a machine-readable bitmap, the image can be imported into a graphics editing program (such as Adobe Illustrator, CorelDRAW, or Inkscape). A person can then manually trace the elements of the image using the program's editing features. Curves in the original image can be approximated with lines, arcs, and Bézier curves. An illustration program allows spline knots to be adjusted for a close fit.

Manual vectorization is possible, but it can be tedious. Although graphics drawing programs have been around for a long time, artists may find the freehand drawing facilities awkward even when a drawing tablet is used. Instead of using a program, Pepper recommends making an initial sketch on paper. Instead of scanning the sketch and tracing it freehand in the computer, Pepper states: "Those proficient with a graphic tablet and stylus could make the following changes directly in CorelDRAW by using a scan of the sketch as an underlay and drawing over it. I prefer to use pen and ink, and a light table"; most of the final image was traced by hand in ink. Later the line drawing was scanned at 600 dpi, cleaned up in a paint program, and then automatically traced with a program. Once the black and white image was in the graphics program, some other elements were added and the figure was colored.

Similarly, Ploch recreated a design from a digital photograph. The JPEG was imported and some "basic shapes" were traced by hand and colored in the graphics drawing program; more complex shapes were handled differently. Ploch used a bitmap editor to remove the background and crop the more complex image components. He then printed the image and traced it by hand onto tracing paper to get a clean black and white line drawing. That drawing was scanned and then vectorized with a program.


Automatic

There are programs that automate the vectorization process. Example programs are Adobe Streamline (discontinued), Corel's PowerTRACE, and Potrace. Some of these programs have a command-line interface, while others are interactive and allow the user to adjust the conversion settings and view the result. Adobe Streamline is not only an interactive program; it also allows a user to manually edit the input bitmap and the output curves. Corel's PowerTRACE is accessed through CorelDRAW, which can be used to modify the input bitmap and edit the output curves. Adobe Illustrator has a facility to trace individual curves.

Automated programs can have mixed results. In one case, PowerTRACE was used to convert a PNG map to SVG. The program did a good job on the map boundaries (the most tedious task in the tracing), and the settings dropped out all the text (small objects). The text was manually re-inserted. Other conversions may not go as well. The results depend on having high-quality scans, reasonable settings, and good algorithms.

Scanned images often have a lot of noise, and the bitmap may need a lot of work to clean it up: stray marks have to be erased, and lines and areas filled in. Corel advises placing the image on a light table, covering it with vellum (tracing paper), and manually inking the desired outlines. The vellum is then scanned, and an automated raster-to-vector conversion program is run on that scan.
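Part of the cleanup of stray marks can be automated before tracing. The following is a minimal sketch, not the method of any program named here: it assumes a bitmap stored as a list of rows of 0/1 values, and it clears every 4-connected group of set pixels that is smaller than a chosen size. The function name `remove_small_specks` and the parameter `min_pixels` are hypothetical.

```python
from collections import deque


def remove_small_specks(bitmap, min_pixels):
    """Return a copy of bitmap with small 4-connected components cleared."""
    height, width = len(bitmap), len(bitmap[0])
    cleaned = [row[:] for row in bitmap]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if not bitmap[y][x] or seen[y][x]:
                continue
            # Collect one connected component with a breadth-first search.
            component = []
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                px, py = queue.popleft()
                component.append((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py),
                               (px, py + 1), (px, py - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and bitmap[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Components below the size limit are treated as noise.
            if len(component) < min_pixels:
                for px, py in component:
                    cleaned[py][px] = 0
    return cleaned


# Hypothetical scan: one isolated speck and one 2 x 3 block of ink.
scan = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
cleaned = remove_small_specks(scan, min_pixels=3)
for row in cleaned:
    print(row)
```

A size limit of this kind cannot tell a speck from a small but meaningful mark such as a period or a dot on an "i", so the result usually still needs a manual check.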


Options

There are many different image styles and possibilities, and no single vectorization method works well on all images. Consequently, vectorization programs have many options that influence the result.

One issue is what the predominant shapes are. If the image is of a fill-in form, then it will probably have just vertical and horizontal lines of a constant width, and the program's vectorization should take that into account. On the other hand, a CAD drawing may have lines at any angle, there may be curved lines, and there may be several line weights (thick for objects and thin for dimension lines). Instead of (or in addition to) curves, the image may contain outlines filled with the same color. Adobe Streamline allows users to select a combination of line recognition (horizontal and vertical lines), centerline recognition, or outline recognition. Streamline also allows small outline shapes to be thrown out, on the assumption that such small shapes are noise. The user may set the noise level between 0 and 1000; an outline that has fewer pixels than that setting is discarded.

Another issue is the number of colors in the image. Even images that were created as black-on-white drawings may end up with many shades of gray. Some line-drawing routines employ anti-aliasing: a pixel completely covered by the line will be black, but a pixel that is only partially covered will be gray. If the original image is on paper and is scanned, there is a similar result: edge pixels will be gray. Sometimes images are compressed (e.g., JPEG images), and the compression will introduce gray levels.

Many vectorization programs group same-color pixels into lines, curves, or outlined shapes. If each possible color is grouped into its own object, there can be an enormous number of objects. Instead, the user is asked to select a finite number of colors (usually fewer than 256), the image is reduced to that many colors (this step is color quantization), and then the vectorization is done on the reduced image. For continuous tone images such as photographs, the result of color quantization is posterization. Gradient fills will also be posterized. Reducing the number of colors in an image is often aided by a histogram: the most common colors may be selected as the representatives, and other colors are mapped to their closest representative. When the number of colors is set to two, the user may be asked to choose threshold and contrast settings. A contrast setting looks for significant changes in pixel color rather than a particular color; consequently, it may ignore the gradual color variations in a gradient fill. Once the outline has been extracted, the user could manually reintroduce the gradient fill.

The vectorization program will want to group a region of the same color into a single object. It can clearly do that by making the region boundary exactly follow the pixel boundaries, but the result will be a boundary made of many short orthogonal lines, and the conversion will have the same pixelation problems that a bitmap has when it is magnified. Instead, the vectorization program needs to approximate the region boundary with lines and curves that closely follow the pixel boundaries without coinciding with them. A tolerance parameter tells the program how closely it should follow the pixel boundaries.

The end result of many vectorization programs is a set of curves made up of cubic Bézier curves. A region boundary is approximated with several curve segments. To keep a curve smooth, the joint of two curves is constrained so that the tangents match. One problem is determining where a curve bends so sharply that it should not be smooth. The smooth portions of a curve are then approximated with a Bézier curve fitting procedure. Successive division may be used. Such a fitting procedure tries to fit the curve with a single cubic curve; if the fit is acceptable, the procedure stops. Otherwise, it selects some advantageous point along the curve and breaks the curve into two parts. It then fits the parts while keeping the tangent at the joint. If the fit is still unacceptable, it breaks the curve into more parts.

Some vectorizers are standalone programs, but many have interactive interfaces that allow a user to adjust the program parameters and quickly see the result. PowerTRACE, for example, can display the original image and preview the converted image so the user may compare them; the program also reports information such as the number of curves.
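The successive-division fitting described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of any particular vectorizer: the single-curve fit is a least-squares fit with chord-length parameters, the split point is the sample with the largest error, and the tangent-matching constraint at the joints is omitted to keep the example short. All function names are hypothetical.

```python
import math


def bezier_point(ctrl, t):
    """Evaluate a cubic Bezier curve (four control points) at parameter t."""
    s = 1.0 - t
    weights = (s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3)
    return (sum(w * p[0] for w, p in zip(weights, ctrl)),
            sum(w * p[1] for w, p in zip(weights, ctrl)))


def chord_params(points):
    """Give each point a parameter in [0, 1] proportional to path length."""
    dists = [0.0]
    for p, q in zip(points, points[1:]):
        dists.append(dists[-1] + math.dist(p, q))
    total = dists[-1] or 1.0
    return [d / total for d in dists]


def fit_single(points):
    """Least-squares cubic that starts and ends on the first and last point."""
    p0, p3 = points[0], points[-1]
    ts = chord_params(points)
    a11 = a12 = a22 = 0.0
    rhs1, rhs2 = [0.0, 0.0], [0.0, 0.0]
    for t, p in zip(ts, points):
        s = 1.0 - t
        b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        for k in (0, 1):
            residual = p[k] - b0 * p0[k] - b3 * p3[k]
            rhs1[k] += b1 * residual
            rhs2[k] += b2 * residual
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # Too few points to solve: place the inner control points on the chord.
        p1 = tuple(p0[k] + (p3[k] - p0[k]) / 3 for k in (0, 1))
        p2 = tuple(p0[k] + 2 * (p3[k] - p0[k]) / 3 for k in (0, 1))
    else:
        p1 = tuple((a22 * rhs1[k] - a12 * rhs2[k]) / det for k in (0, 1))
        p2 = tuple((a11 * rhs2[k] - a12 * rhs1[k]) / det for k in (0, 1))
    return [p0, p1, p2, p3], ts


def fit_curve(points, tolerance):
    """Return cubic segments that pass within tolerance of every point."""
    ctrl, ts = fit_single(points)
    errors = [math.dist(bezier_point(ctrl, t), p) for t, p in zip(ts, points)]
    worst = max(range(len(points)), key=errors.__getitem__)
    if errors[worst] <= tolerance or len(points) <= 2:
        return [ctrl]
    # Break the curve at the worst-fitting point; both halves share that point.
    worst = min(max(worst, 1), len(points) - 2)
    return (fit_curve(points[:worst + 1], tolerance)
            + fit_curve(points[worst:], tolerance))


# Hypothetical input: 33 points sampled along a quarter circle of radius 10.
arc = [(10 * math.cos(i * math.pi / 64), 10 * math.sin(i * math.pi / 64))
       for i in range(33)]
segments = fit_curve(arc, tolerance=0.05)
print(len(segments), "cubic segment(s)")
```

Tightening `tolerance` generally produces more segments and a closer fit, which mirrors the trade-off that a tolerance setting exposes to the user. A production fitter would also constrain the control points next to each joint so that the tangents match.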


Example

The example is an illustration showing the operation of the radula in mollusks. The original artwork is a PNG file of 115 kB. It was traced with PowerTRACE using the detailed logo setting, smoothing 40, and detail +2.5; the result has 50 colors, 94 curves, and 2452 nodes, and takes 96 kB.

The upper portion of the illustration is mostly a one-pen-width filled outline diagram, but it has a mesh gradient fill along the bottom of the shell and along the bottom of the food. It also has some artistic brushes on the upper left of the shell. The bottom portion of the illustration has four line weights and some small characters; the color fill is simple except for a gradient at the jagged lines.

The 531×879 pixel image was traced with 50 colors. Most (if not all) lines were lost; they were turned into black regions, and their effective line widths vary. The black outline around the blue food in the upper part disappeared. The gradient fills and brushed spots were lost to color quantization and posterization; some brush spots disappeared. Some letters survived the vectorization with distortion, but most letters were discarded. Losing the letters is not a big issue; post-conversion editing would delete the annotation and replace it with text rather than curves. Thin lines crossing at a shallow angle made filled regions, and intersecting outlines of filled regions became confused (see the lower right corner).

The tracing also has some odd features. Many black outlines touch, so they become one large, complicated object rather than outlines for specific regions. Instead of just background, a rectangular white region separates the two outlined rectangles. The objects labeled "op", "rp", and "rr" are not simple layered shapes; the desired result would have "rr" overlaid by "rp", which is overlaid by "op".
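The loss of gradient fills to color quantization can be reproduced in miniature. The sketch below follows the histogram approach described in the Options section (the most common colors become representatives and every other color is mapped to the closest one). It assumes gray levels from 0 to 255 rather than full colors, and the name `quantize` and the sample data are hypothetical.

```python
from collections import Counter


def quantize(pixels, n_colors):
    """Map each gray level to the nearest of the n_colors most common levels."""
    # Histogram step: the most frequent levels become the palette.
    palette = [level for level, _ in Counter(pixels).most_common(n_colors)]
    quantized = [min(palette, key=lambda rep: abs(rep - p)) for p in pixels]
    return quantized, palette


# Hypothetical image data: three flat regions followed by a smooth gradient.
flat = [0] * 6 + [255] * 5 + [128] * 4
gradient = [30, 60, 90, 150, 180, 210]
quantized, palette = quantize(flat + gradient, n_colors=3)

print(palette)         # [0, 255, 128]
print(quantized[-6:])  # [0, 0, 128, 128, 128, 255]
```

The six distinct gradient levels collapse into three flat bands, which is the posterization effect: the rare gradient levels never make it into the palette because the flat regions dominate the histogram.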


Usage domains

* In computer-aided design (CAD), drawings (blueprints etc.) are scanned, vectorized and written as CAD files in a process called paper-to-CAD conversion or drawing conversion.
* In geographic information systems (GIS), satellite or aerial images are vectorized to create maps.
* In graphic design and photography, graphics can be vectorized for easier usage and resizing.
* Vectorization is often the first step in OCR solutions for handwritten text or signatures.

Vectorization is effective on single-colored, non-gradient input data, such as signatures. For example, a signature of Christopher Columbus stored as a JPEG image (1,308 × 481 pixels) takes 63 kB, while a vectorized two-color (black and white) variant of the same signature takes 19 kB.
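Producing a two-color image from a scanned signature requires deciding which gray pixels count as ink. A minimal sketch of a global threshold is shown below; the cutoff of 128, the function name `threshold`, and the sample values are assumptions for illustration, and real programs may expose the threshold as a user setting or combine it with a contrast setting.

```python
def threshold(gray, cutoff=128):
    """Convert a grayscale grid (0 = black, 255 = white) to 1 = ink, 0 = paper."""
    return [[1 if value < cutoff else 0 for value in row] for row in gray]


# Hypothetical patch of a scanned pen stroke with gray edge pixels.
stroke = [
    [255, 250, 240, 255],
    [200,  90,  20, 180],
    [255, 130,  60, 245],
]
binary = threshold(stroke)
for row in binary:
    print(row)
```

Gray edge pixels fall on one side of the cutoff or the other, so the chosen value changes the apparent width of the stroke that is later traced.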


Continuous tone images

Vectorization is usually inappropriate for continuous tone images such as portraits, and the result is often poor. For example, several image tracing programs were applied to a 25 kB JPEG photograph. The resulting vector images are roughly 9 to 65 times larger (225 kB to 1.64 MB) and may show pronounced posterization when a small number of colors is used.

* Original photograph in JPEG format, 25 kB
* Vectorized with RaveGrid, 1.64 MB
* Vectorized with AutoTrace in the Delineate GUI, 677 kB
* Vectorized with Inkscape's "Trace Bitmap" function, based on Potrace, 1.05 MB
* Vectorized with Scan2CAD, 340 kB
* Vectorized with Vectormagic, 12 colors, 369 kB
* Vectorized with Vectormagic, all colors, 744 kB
* Vectorized with Super Vectorizer, 12 colors, 225 kB


See also

* Rasterization
* CAD data exchange
* Comparison of raster to vector conversion software
* Digitizing
* Discretization error
* Downsampling
* Feature detection (computer vision)
* Edge detection
* Image scanner
* Optical character recognition
* Quantization error
* Subpaving




Further reading

* Itoh, Koichi; Ohno, Yoshio (September 1993). "A curve fitting algorithm for character fonts". Electronic Publishing. 6 (3): 195–205. John Wiley. CiteSeerX 10.1.1.39.537.


External links


Taking Corel PowerTRACE for a Test Drive