Cartographic generalization, or map generalization, includes all changes in a map that are made when one derives a smaller-scale map from a larger-scale map or map data. It is a core part of cartographic design. Whether done manually by a cartographer or by a computer or set of algorithms, generalization seeks to abstract spatial information at a high level of detail to information that can be rendered on a map at a lower level of detail. The cartographer has license to adjust the content within their maps to create a suitable and useful map that conveys spatial information, while striking the right balance between the map's purpose and the precise detail of the subject being mapped. Well-generalized maps are those that emphasize the most important map elements while still representing the world in the most faithful and recognizable way.

History

During the first half of the 20th century, cartographers began to think seriously about how the features they drew depended on scale. Eduard Imhof, one of the most accomplished academic and professional cartographers at the time, published a study of city plans on maps at a variety of scales in 1937, itemizing several forms of generalization that occurred, including those later termed symbolization, merging, simplification, enhancement, and displacement. As analytical approaches to geography arose in the 1950s and 1960s, generalization, especially line simplification and raster smoothing, was a target of study (Perkal, Julian (1958) "Proba obiektywnej generalizacji," ''Geodezja i Kartografia'', VII:2, pp. 130-142; English translation, 1965, "An Attempt at Objective Generalization," ''Discussion Papers of The Michigan Inter-university Community of Mathematical Geographers'').

Generalization was probably the most thoroughly studied aspect of cartography from the 1970s to the 1990s. This is probably because it fit within both of the two major research trends of the era: cartographic communication (especially signal processing algorithms based on information theory), and the opportunities afforded by technological advance (because of its potential for automation). Early research focused primarily on algorithms for automating individual generalization operations. By the late 1980s, academic cartographers were thinking bigger, developing a general theory of generalization, and exploring the use of expert systems and other nascent artificial intelligence technologies to automate the entire process, including decisions on which tools to use when. These tracks foundered somewhat in the late 1990s, coinciding with a general loss of faith in the promise of AI, and the rise of post-modern criticisms of the impacts of the automation of design.

In recent years, the generalization community has seen a resurgence, fueled in part by the renewed opportunities of AI. Another recent trend has been a focus on ''multi-scale mapping'', integrating GIS databases developed for several target scales, narrowing the scope of need for generalization to the scale "gaps" between them, a more manageable level for automation.

Theories of map detail

Generalization is often defined simply as removing detail, but it is based on the notion, originally adopted from information theory, of the volume of information or detail found on the map, and how that volume is controlled by map scale, map purpose, and intended audience. If there is an optimal amount of information for a given map project, then generalization is the process of taking existing available data, often called (especially in Europe) the ''digital landscape model'' (DLM), which usually but not always has a larger amount of information than needed, and processing it to create a new data set, often called the ''digital cartographic model'' (DCM), with the desired amount.

Many general conceptual models have been proposed for understanding this process, often attempting to capture the decision process of the human master cartographer. One of the most popular models, developed by McMaster and Shea in 1988, divides these decisions into three phases:
* ''Philosophical objectives'': the general reasons why generalization is desirable or necessary, and criteria for evaluating its success.
* ''Cartometric evaluation'': the characteristics of a given map (or feature within that map) that demand generalization.
* ''Spatial and attribute transformations'': the set of generalization operators available to use on a given feature, layer, or map.

In the first, most conceptual phase, McMaster and Shea show how generalization plays a central role in resolving the often conflicting goals of cartographic design as a whole: functionality vs. aesthetics, information richness vs. clarity, and the desire to do more vs. the limitations of technology and medium. These conflicts can be reduced to a basic conflict between the need for more data on the map, and the need for less, with generalization as the tool for balancing them.

One challenge with the information theory approach to generalization is that it depends on measuring the amount of information on the map before and after generalization procedures. One could conceive of a map being quantified by its ''map information density'', the average number of "bits" of information per unit area on the map (or, equivalently, its ''information resolution'', the average distance between bits), and by its ''ground information density'' or ''resolution'', the same measures per unit area on the Earth. Scale would thus be proportional to the ratio between them, and a change in scale would require the adjustment of one or both of them by means of generalization.

But what counts as a "bit" of map information? In specific cases, that is not difficult, such as counting the total number of features on the map, or the number of vertices in a single line (possibly reduced to the number of ''salient'' vertices); such straightforwardness explains why these were early targets for generalization research. However, it is a challenge for the map in general, in which questions arise such as "how much graphical information is there in a map label: one bit (the entire word), a bit for each character, or bits for each vertex or curve in every character, as if they were each area features?" Each option can be relevant at different times.

This measurement is further complicated by the role of map symbology, which can affect the ''apparent information density''. A map with a strong visual hierarchy (i.e., with less important layers being subdued but still present) carries an aesthetic of being "clear" because it appears at first glance to contain less data than it really does; conversely, a map with no visual hierarchy, in which all layers seem equally important, might be summarized as "cluttered" because one's first impression is that it contains more data than it really does. Designing a map to achieve the desired gestalt aesthetic is therefore about managing the apparent information density more than the actual information density. In the words of Edward Tufte, "Confusion and clutter are failures of design, not attributes of information." There is recent work that recognizes the role of map symbols, including the Roth-Brewer typology of generalization operators, although its authors clarify that symbology is not a form of generalization, just a partner with generalization in achieving a desired apparent information density.
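The relationship between ground density, map density, and scale described above can be made concrete with a little arithmetic. The sketch below is only illustrative: the function name `map_density` and the figure of 4 "bits" per square kilometre are assumptions, not values from any dataset. It shows that if the ground information density is held constant, map density grows with the square of the scale denominator, whereas information resolution (a distance) would change only linearly.

```python
def map_density(ground_density_per_km2: float, scale_denominator: float) -> float:
    """Bits per square centimetre of map, given bits per square kilometre of ground.

    At scale 1:N, 1 cm on the map covers N cm on the ground, so 1 cm^2 of map
    covers (N / 100_000)^2 km^2 of ground (there are 100,000 cm in a km).
    """
    km_per_map_cm = scale_denominator / 100_000
    return ground_density_per_km2 * km_per_map_cm ** 2


# Hypothetical source data: 4 "bits" (for example, features) per km^2.
ground = 4.0
for n in (25_000, 50_000, 100_000):
    print(f"1:{n}: {map_density(ground, n):.2f} bits per cm^2 of map")
```

Halving the scale from 1:50,000 to 1:100,000 quadruples the map density if nothing is removed, which is one way to see why a change in scale forces generalization.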

Operators

There are many cartographic techniques that are used to adjust the amount of geographic data on the map. Over the decades of generalization research, over a dozen unique lists of such ''generalization operators'' have been published, with significant differences. In fact, there are multiple reviews comparing the lists, and even they miss a few salient ones, such as that found in John Keates' first textbook (1973), which was apparently ahead of its time. Some of these operations have been automated by multiple algorithms, with tools available in geographic information systems (GIS) and other software; others have proven much more difficult, with most cartographers still performing them manually.

Select

''Also called filter, omission''

One of the first operators to be recognized and analyzed, first appearing in the 1973 Keates list, selection is the process of simply removing entire geographic features from the map. There are two types of selection, which are combined in some models, and separated in others:
* ''Layer Selection'': (also called ''class selection'' or ''add'') the choice of which data layers or themes to include or not (for example, a street map including streets but not geology).
* ''Feature Selection'': (sometimes called ''refinement'' or ''eliminate'') the choice of which specific features to include or remove within an included layer (for example, which 50 of the millions of cities to show on a world map).

In feature selection, the choice of which features to keep or exclude is more challenging than it might seem. Using a simple attribute of real-world size (city population, road width or traffic volume, river flow volume), while often easily available in existing GIS data, often produces a selection that is excessively concentrated in some areas and sparse in others. Thus, cartographers often filter features using their degree of ''regional importance'', their prominence in their local area rather than in the map as a whole, which produces a more balanced map, but is more difficult to automate. Many formulas have been developed for automatically ranking the regional importance of features, for example by balancing the raw size with the distance to the nearest feature of significantly greater size, similar to measures of topographic prominence, but this is much more difficult for line features than points, and sometimes produces undesirable results (such as the "Baltimore Problem," in which cities that seem important get left out). Another approach is to manually encode a subjective judgment of regional importance into the GIS data, which can subsequently be used to filter features; this was the approach taken for the Natural Earth dataset created by cartographers.
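One way to picture a regional-importance ranking for point features is the sketch below. It is not any published formula: the score (size multiplied by distance to the nearest larger feature) and the names `regional_importance` and `select` are assumptions made for illustration, and the towns are invented.

```python
import math


def regional_importance(features):
    """Score each (name, x, y, size) as size times distance to the nearest larger feature."""
    scores = {}
    for name, x, y, size in features:
        dists = [math.hypot(x - ox, y - oy)
                 for _, ox, oy, osize in features if osize > size]
        # The largest feature has no larger neighbour; treat its isolation as unbounded.
        isolation = min(dists) if dists else math.inf
        scores[name] = size * isolation
    return scores


def select(features, keep):
    """Return the names of the `keep` highest-scoring features."""
    scores = regional_importance(features)
    ranked = sorted(features, key=lambda f: scores[f[0]], reverse=True)
    return [f[0] for f in ranked[:keep]]


# Hypothetical towns: (name, x, y, population in thousands).
towns = [
    ("A", 0, 0, 900),
    ("B", 1, 0, 600),   # large, but right next to A
    ("C", 50, 0, 200),  # small, but dominates its own region
    ("D", 2, 1, 500),
]
print(select(towns, 2))
```

With these numbers the selection keeps A and C and drops B, the second-largest town, because it sits in A's shadow. That is the balanced distribution the operator aims for, and also the kind of omission the "Baltimore Problem" describes.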

Simplify

Another early focus of generalization research, simplification is the removal of vertices in lines and area boundaries. A variety of algorithms have been developed, but most involve searching through the vertices of the line, removing those that contribute the least to the overall shape of the line. The Ramer–Douglas–Peucker algorithm (1972/1973) is one of the earliest and still most common techniques for line simplification. Most of these algorithms, especially the early ones, were designed in the days of limited digital storage and placed a higher priority on reducing the size of datasets than on the quality of their appearance on maps; they often produce lines that look excessively angular, especially on curves such as rivers. Other algorithms include the Wang-Müller algorithm (1998), which looks for critical bends and is typically more accurate at the cost of processing time, and the Visvalingam-Whyatt algorithm (1992) and Zhou-Jones algorithm (2005), which use properties of the triangles formed by consecutive vertices to determine which vertices to remove.
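The idea of removing the vertices that contribute least to a line's shape can be sketched with a compact version of the Ramer–Douglas–Peucker approach. This is a minimal illustration, not a production implementation: it measures distance to the chord segment (a common variant of the perpendicular distance), and the function names and sample line are made up for the example.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Keep the endpoints, then recursively keep the farthest vertex if it exceeds tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        return [first, last]
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex


line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0), (5, 0.1), (6, 0)]
print(douglas_peucker(line, 0.5))
```

The small wiggles at (1, 0.1) and (5, 0.1) fall within the tolerance and are dropped, while the sharp peak at (3, 2) survives; the result also shows the angular look the text mentions.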

Smooth

For line features (and area boundaries), smoothing seems similar to simplification, and in the past was sometimes combined with simplification. The difference is that smoothing is designed to make the overall shape of the line look simpler by removing small details, which may actually require more vertices than the original. Simplification tends to make a curved line look angular, while smoothing tends to do the opposite. The smoothing principle is also often used to generalize raster representations of fields, often using a kernel smoother approach. This was actually one of the first published generalization algorithms, by Waldo Tobler in 1966.
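For rasters, the kernel idea can be sketched with the simplest possible kernel: an unweighted 3x3 moving average. This is an assumption chosen for brevity, not a reconstruction of Tobler's 1966 method; weighted kernels that favour nearer cells follow the same pattern. The function name `smooth_raster` and the elevation grid are hypothetical.

```python
def smooth_raster(grid):
    """Replace each cell with the mean of its 3x3 neighbourhood.

    Cells on the edge average only the neighbours that exist.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = sum(window) / len(window)
    return out


# Hypothetical elevation raster with a single small spike in the middle.
elevation = [
    [10, 10, 10],
    [10, 19, 10],
    [10, 10, 10],
]
smoothed = smooth_raster(elevation)
print(smoothed[1][1])
```

The spike is spread across its neighbours, so the small detail is subdued rather than deleted outright.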

Merge

''Also called dissolve, amalgamation, agglomeration, or combine''

This operation, identified by Imhof in 1937, involves combining neighboring features into a single feature of the same type, at scales where the distinction between them is not important. For example, a mountain chain may consist of several isolated ridges in the natural environment, but be shown as a continuous chain on a small-scale map. Or, adjacent buildings in a complex could be combined into a single "building." For proper interpretation, the map reader must be aware that, because of scale limitations, combined elements are not perfect depictions of natural or manmade features. Dissolve is a common GIS tool that is used for this generalization operation, but additional GIS tools have been developed for specific situations, such as finding very small polygons and merging them into neighboring larger polygons. This operator is different from aggregation because there is no change in dimensionality (i.e., lines are dissolved into lines and polygons into polygons), and the original and final objects are of the same conceptual type (e.g., building becomes building).

Aggregate

''Also called combine or regionalization''

Aggregation is the merger of multiple features into a new composite feature, often of increased dimension (usually points to areas). The new feature is of an ontological type different from the original individuals, because it conceptualizes the group. For example, a multitude of "buildings" can be turned into a single region representing an "urban area" (not a "building"), or a cluster of "trees" into a "forest". Some GIS software has aggregation tools that identify clusters of features and combine them. Aggregation differs from merging in that it can operate across dimensions, such as aggregating points to lines, points to polygons, lines to polygons, and polygons to polygons, and that there is a conceptual difference between the source and product.
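As an illustration of points-to-area aggregation, the sketch below wraps a cluster of building points in a convex hull and labels the result with a new type. Using a convex hull (computed here with the monotone chain method) is an assumption made for simplicity; GIS aggregation tools may use other boundary shapes. The coordinates and the `urban_area` dictionary are invented.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Hypothetical building locations (points) in map units.
buildings = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2), (3, 2)]

# The aggregate is a different kind of thing: one area, not many buildings.
urban_area = {"type": "urban area", "boundary": convex_hull(buildings)}
print(urban_area)
```

Seven point features of type "building" become one polygon of type "urban area", which captures both the change in dimension and the change in concept.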

Typify

''Also called distribution refinement''

Typify is a symbology operator that replaces a large set of similar features with a smaller number of representative symbols, resulting in a sparser, cleaner map. For example, an area with dozens of mines might be symbolized with only 3 or 4 mine symbols that do not represent actual mine locations, just the general presence of mines in the area. Unlike the aggregation operator, which replaces many related features with a single "group" feature, the symbols used in the typify operator still represent individuals, just "typical" individuals. It reduces the density of features while still maintaining their relative location and design. The typify operator creates a new set of symbols; it does not change the spatial data. This operator can be used on point, line, and polygon features.

Collapse

''Also called Symbolize''

This operator reduces the dimension of a feature, such as the common practice of representing cities (2-dimensional) as points (0-dimensional), and roads (2-dimensional) as lines (1-dimensional). Frequently, a map symbol is applied to the resultant geometry to give a general indication of its original extent, such as point diameter to represent city population or line thickness to represent the number of lanes in a road. Imhof (1937) discusses these particular generalizations at length. This operator frequently mimics a similar cognitive generalization practice. For example, unambiguously discussing the distance between two cities implies a point conceptualization of a city, and using phrases like "up the road" or "along the road" or even street addresses implies a line conceptualization of a road.
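A polygon-to-point collapse can be sketched by replacing a city footprint with its centroid while keeping the original area as an attribute that a symbol could later be sized by. The function name `collapse_to_point` and the rectangular footprint are hypothetical, and the sketch assumes a simple (non-self-intersecting) polygon with non-zero area.

```python
def collapse_to_point(ring):
    """Return (centroid_x, centroid_y, area) for a simple polygon.

    `ring` is a list of (x, y) vertices without a repeated closing vertex.
    Uses the standard shoelace-based centroid formula.
    """
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        w = x0 * y1 - x1 * y0
        area2 += w              # twice the signed area
        cx += (x0 + x1) * w
        cy += (y0 + y1) * w
    return cx / (3 * area2), cy / (3 * area2), abs(area2) / 2


# Hypothetical city footprint, 4 units wide and 2 units tall.
city_footprint = [(0, 0), (4, 0), (4, 2), (0, 2)]
x, y, area = collapse_to_point(city_footprint)
print((x, y), area)
```

The retained area plays the role the text describes for symbol size: the point no longer shows the city's extent, but its symbol can still hint at it.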

Reclassify

This operator primarily simplifies the attributes of the features, although a geometric simplification may also result. While categorization is used for a wide variety of purposes, in this case the task is to take a large range of values that is too complex to represent on a map of a given scale, and reduce it to a few categories that are much simpler to represent, especially if geographic patterns result in large regions of the same category. An example would be to take a land cover layer with 120 categories, and group them into 5 categories (urban, agriculture, forest, water, desert), which would make a spatially simpler map. For discrete fields (also known as categorical coverages or area-class maps) represented as vector polygons, such as land cover, climate type, soil type, city zoning, or surface geology, reclassification often results in adjacent polygons with the same category, necessitating a subsequent dissolve operation to merge them.
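The effect of reclassification followed by a dissolve can be sketched on a tiny categorical raster. The lookup table `BROAD_CLASS`, the class names, and the grid are all invented for illustration; counting 4-connected regions stands in for the number of polygons that would remain after a dissolve.

```python
from collections import deque

# Hypothetical lookup from detailed land cover codes to broad categories.
BROAD_CLASS = {
    "pine": "forest", "oak": "forest",
    "wheat": "agriculture", "maize": "agriculture",
    "lake": "water",
}


def reclassify(grid, lookup):
    """Map every cell's detailed class to its broad class."""
    return [[lookup[cell] for cell in row] for row in grid]


def count_regions(grid):
    """Count 4-connected regions of equal value (polygons left after a dissolve)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:  # flood fill the region containing (r, c)
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols and not seen[nr][nc]
                            and grid[nr][nc] == grid[cr][cc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return regions


land_cover = [
    ["pine", "oak", "wheat"],
    ["oak", "pine", "maize"],
    ["lake", "lake", "maize"],
]
print(count_regions(land_cover), count_regions(reclassify(land_cover, BROAD_CLASS)))
```

Five detailed classes in seven patches collapse to three broad classes in three patches, which is the spatial simplification the paragraph above describes.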

Exaggerate

Exaggeration is the partial adjustment of geometry or symbology to make some aspect of a feature larger than it really is, in order to make it more visible, recognizable, or higher in the visual hierarchy. For example, a set of tight switchbacks in a road would run together on a small-scale map, so the road is redrawn with the loops larger and further apart than in reality. A symbology example would be drawing highways as thick lines in a small-scale map that would be miles wide if measured according to the scale. Exaggeration often necessitates a subsequent displacement operation because the exaggerated feature overlaps the actual location of nearby features, necessitating their adjustment.

Displace

''Also called conflict resolution''

Displacement can be employed when two objects are so close to each other that they would overlap at smaller scales, especially when an exaggerate operator has made the two objects larger than they really are. A common place where this would occur is the cities of Brazzaville and Kinshasa, on either side of the Congo River in Africa. Each is the capital city of its country, and on overview maps they would be displayed with a slightly larger symbol than other cities. Depending on the scale of the map, the symbols would overlap. By displacing both of them away from the river (and away from their true locations), the symbol overlap can be avoided. Another common case is when a road and a railroad run parallel to each other. Keates (1973) was one of the first to use the modern terms for exaggeration and displacement and discuss their close relationship, but they were recognized as early as Imhof (1937).
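The two-city case can be sketched as follows: if two circular point symbols overlap, each is pushed half the missing distance along the line joining them. This is a deliberately simple rule assumed for illustration; the function name `displace`, the coordinates, and the radii are hypothetical, and real displacement tools must also consider other nearby features.

```python
import math


def displace(p, q, radius_p, radius_q):
    """Move two circular symbols apart along their joining line until they just touch."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    dist = math.hypot(dx, dy)
    needed = radius_p + radius_q
    if dist >= needed:
        return p, q  # no overlap, nothing to do
    if dist == 0:
        ux, uy = 1.0, 0.0  # coincident symbols: pick an arbitrary direction
    else:
        ux, uy = dx / dist, dy / dist
    shift = (needed - dist) / 2  # each symbol moves half of the shortfall
    return ((p[0] - ux * shift, p[1] - uy * shift),
            (q[0] + ux * shift, q[1] + uy * shift))


# Two hypothetical capitals 2 map units apart, each drawn with a symbol of radius 2.
west, east = displace((0, 0), (2, 0), 2, 2)
print(west, east)
```

For cities on opposite banks of a river, the joining line runs roughly across the river, so this rule moves both symbols away from it, at the cost of positional accuracy.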

Enhance

This is the addition of symbols or other details on a smaller-scale map to make a particular feature easier to understand, especially when such understanding is important to the map's purpose. A common example is the addition of a bridge symbol to emphasize that a road crossing is not at grade, but an overpass. At a large scale, such a symbol may not be necessary because of the different symbology and the increased space to show the actual relationship. This addition may seem counter-intuitive if one only thinks of generalization as the removal of detail. This is one of the least commonly listed operators.

GIS and automated generalization

As GIS developed from about the late 1960s onward, the need for automatic, algorithmic generalization techniques became clear. Ideally, agencies responsible for collecting and maintaining spatial data should try to keep only one canonical representation of a given feature, at the highest possible level of detail. That way there is only one record to update when that feature changes in the real world. From this large-scale data, it should ideally be possible, through automated generalization, to produce maps and other data products at any scale required. The alternative is to maintain separate databases, each at the scale required for a given set of mapping projects, each of which requires attention when something changes in the real world.

Several broad approaches to generalization were developed around this time:
* The ''representation-oriented'' view focuses on the representation of data on different scales, which is related to the field of multi-representation databases (MRDB).
* The ''process-oriented'' view focuses on the process of generalization.
* The ''ladder-approach'' is a stepwise generalization, in which each derived dataset is based on the database of the next larger scale.
* The ''star-approach'' derives the data for all scales from a single (large-scale) database.

Scaling law

There are far more small geographic features than large ones on the Earth's surface, and correspondingly far more small things than large ones on maps. This notion of far more small things than large ones is also called spatial heterogeneity, and it has been formulated as a scaling law. Cartographic generalization, or any mapping practice in general, is essentially a matter of retaining this underlying scaling: numerous smallest things, a very few largest, and some in between. This mapping process can be achieved efficiently and effectively by head/tail breaks, a classification scheme and visualization tool for data with a heavy-tailed distribution. The scaling law is considered likely to replace Töpfer's radical law as a universal law for various mapping practices. Underlying it is something of a paradigm shift from Euclidean geometry to fractal geometry, and from non-recursive to recursive thinking.
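The head/tail breaks idea can be sketched in a few lines: split the data at its mean, keep the minority above the mean (the "head"), and repeat on the head. The 40% head limit used as the stopping rule is a commonly cited default and is treated here as an assumption, as are the function name and the sample values.

```python
def head_tail_breaks(values, head_limit=0.4):
    """Return class break values by splitting at the mean and recursing on the head."""
    breaks = []
    data = list(values)
    while len(data) > 1:
        mean = sum(data) / len(data)
        head = [v for v in data if v > mean]
        if not head:
            break  # all remaining values are equal
        breaks.append(mean)
        if len(head) / len(data) > head_limit:
            break  # the head is no longer a small minority
        data = head
    return breaks


# Hypothetical feature sizes: many small, a few medium, one very large.
sizes = [1] * 10 + [2] * 4 + [5, 5, 20]
print(head_tail_breaks(sizes))
```

The two breaks (about 2.82 and 10) yield three classes holding 14, 2, and 1 values, mirroring the "numerous smallest, a very few largest" pattern.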

The 'Baltimore phenomenon'

The Baltimore phenomenon is the tendency for a city (or other object) to be omitted from maps due to space constraints while smaller cities are included on the same map simply because space is available to display them. This phenomenon owes its name to the city of Baltimore, Maryland, which tends to be omitted on maps due to the presence of larger cities in close proximity within the Mid-Atlantic United States. As larger cities near Baltimore appear on maps, smaller and lesser known cities may also appear at the same scale simply because there is enough space for them on the map. Although the Baltimore phenomenon occurs more frequently on automated mapping sites, it does not occur at every scale. Popular mapping sites like Google Maps, Bing Maps, OpenStreetMap, and Yahoo Maps will only begin displaying Baltimore at certain zoom levels: 5th, 6th, 7th, etc.


External links

The ICA commission on generalization