Land cover maps are tools that provide vital information about the Earth's land use and cover patterns. They aid policy development, urban planning, and forest and agricultural monitoring. The systematic mapping of land cover patterns, including change detection, often follows two main approaches:

* Field survey
* Remote sensing satellite image processing. This cost-efficient approach employs several techniques for image pre-processing and processing to accurately map land cover patterns. These techniques detect changes at various spatial scales following a series of machine learning simulations and statistical applications.

Image pre-processing is normally done through radiometric corrections, while image processing involves the application of either unsupervised or supervised classifications and vegetation indices quantification for land cover map production.

Supervised classification

A supervised classification is a system of classification in which the user builds a series of randomly generated training datasets or spectral signatures representing different land-use and land-cover (LULC) classes and applies these datasets in machine learning models to predict and spatially classify LULC patterns and evaluate classification accuracies.
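
The last step named above, evaluating classification accuracy, is commonly done by comparing predicted labels with reference labels held out from training. The following is a minimal sketch of that comparison using a confusion matrix and overall accuracy. The function names, the integer class codes, and the sample labels are illustrative assumptions, not part of any particular software package.

```python
import numpy as np

def confusion_matrix(reference, predicted, n_classes):
    """Count samples per (reference class, predicted class) pair.

    Rows are reference classes, columns are predicted classes.
    """
    reference = np.asarray(reference)
    predicted = np.asarray(predicted)
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly when the same (row, col) pair repeats.
    np.add.at(matrix, (reference, predicted), 1)
    return matrix

def overall_accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    return np.trace(matrix) / matrix.sum()

# Hypothetical class codes: 0 = water, 1 = forest, 2 = built-up.
reference = [0, 0, 1, 1, 1, 2, 2, 2]
predicted = [0, 1, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(reference, predicted, n_classes=3)
print(cm)
print(overall_accuracy(cm))  # 0.75 (6 of 8 samples on the diagonal)
```

Off-diagonal entries show which classes are confused with each other, which is often more informative than the single accuracy figure.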

Algorithms

Several machine learning algorithms have been developed for supervised classification.

* Maximum likelihood classification (MLC) – This approach classifies overlapping signatures by estimating the probability that an image pixel belongs to each LULC type and assigning the pixel to the type with the maximum likelihood. It depends on the mean and covariance matrices of the training datasets and assumes that the pixel values of each class are normally distributed.
* Minimum distance (MD) – A form of supervised classification that defines decision boundaries between image pixels to classify land cover. Each pixel is assigned to the class whose mean is nearest, and the standard deviation of the generated training datasets can be used to bound each class with a parallelepiped box.
* Mahalanobis distance – A system of classification that assigns land cover classes from a set of training datasets using the Mahalanobis distance, a generalization of the Euclidean distance that is scaled by the covariance of the training data.
* Spectral angle mapper (SAM) – A spectral image classification approach that uses angular measurements to determine the relationship between two spectra, treating them as vectors in a ''q''-dimensional space, where ''q'' is the number of bands.
* Discriminant analysis (DA) – A system of classification in which the classifying algorithm separates groups of closely related image pixels into classes, minimizing the variance within classes and maximizing the variance between classes following a maximum likelihood discriminant rule.
* Genetic algorithm – A system of classification that applies genetic principles for selecting appropriate clusters of training data and classifying them under the influence of predictors (satellite image bands).
* Subspace – A classification approach in which the classifier creates a low-dimensional subspace for each land cover class from a cluster of training points. The subspaces are created by performing a principal component analysis on the training points. Two types of subspace algorithms exist for minimizing land cover classification errors: class-featuring information compression (CLAFIC) and the average learning subspace method (ALSM).
* Parallelepiped classification – A feature space classifier that assigns a range of values to each land cover class within each image band, creating bounding boxes from which pixels of each land cover class are selected for training the classifier.
* Multi-perceptron artificial neural networks (MPANNs) – A system of classification in which the classifier uses a network of connected nodes (neurons), trained by backpropagation on the training samples, to classify land cover.
* Support vector machines (SVMs) – A classification approach in which the classifier uses support vectors to obtain optimal decision boundaries separating two or more land cover classes.
* Random forest (RF) – An approach in which the classifier uses bootstrap samples of the training datasets to build several decision trees, each classifying the data based on a number of satellite image bands, and combines their predictions.
* ''K''-nearest neighbors algorithm (''k''NN) – This approach draws the ''k'' closest samples from the training datasets for each pixel and classifies land cover based on the distance to these samples.
* Decision tree (DT) – Like RF, a DT consists of a set of connected nodes that partition training samples into a set of land cover clusters. It is fast, easy to construct and interpret for smaller datasets, and good at excluding background or unimportant information. Its main disadvantage is a tendency to overfit, especially for large datasets.
* Fuzzy clustering (FZ)
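
Two of the simpler classifiers in the list, minimum distance and the spectral angle mapper, can be sketched in a few lines. This is an illustrative sketch only: it assumes pixels are stored as rows of a NumPy array with one column per band, and the function names, the two-band reflectance values, and the class codes are hypothetical. The minimum distance variant shown assigns each pixel to the nearest class mean and does not apply a standard deviation threshold.

```python
import numpy as np

def class_means(samples, labels, n_classes):
    """Mean spectrum of each class from training pixels (n_samples, n_bands)."""
    return np.stack([samples[labels == c].mean(axis=0) for c in range(n_classes)])

def minimum_distance(pixels, means):
    """Assign each pixel to the class with the nearest mean (Euclidean distance)."""
    # distances has shape (n_pixels, n_classes)
    distances = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return distances.argmin(axis=1)

def spectral_angle_mapper(pixels, means):
    """Assign each pixel to the class whose mean spectrum forms the smallest angle with it."""
    dots = pixels @ means.T
    norms = (np.linalg.norm(pixels, axis=1)[:, None]
             * np.linalg.norm(means, axis=1)[None, :])
    # Clip guards against rounding pushing the cosine slightly outside [-1, 1].
    angles = np.arccos(np.clip(dots / norms, -1.0, 1.0))
    return angles.argmin(axis=1)

# Hypothetical training pixels with two bands (red, NIR) as reflectance.
train = np.array([[0.05, 0.02], [0.07, 0.04],   # class 0: water
                  [0.04, 0.50], [0.06, 0.40]])  # class 1: vegetation
train_labels = np.array([0, 0, 1, 1])
means = class_means(train, train_labels, n_classes=2)

pixels = np.array([[0.06, 0.05], [0.05, 0.42], [0.10, 0.90]])
print(minimum_distance(pixels, means))       # [0 1 1]
print(spectral_angle_mapper(pixels, means))  # [0 1 1]
```

The third pixel is exactly twice the vegetation mean, so its spectral angle to that class is zero even though its Euclidean distance is large. This illustrates why an angle-based measure is less sensitive to overall brightness than a distance-based one.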

Unsupervised classification

Unsupervised classification is a system of classification in which single pixels or groups of pixels are automatically classified by the software without the user applying signature files or training data. However, the user defines the number of classes, which the computer then generates automatically by grouping similar pixels into a single category using a clustering algorithm. This system of classification is mostly used in areas with no field observations or prior knowledge of the available land cover types.

Algorithms

* Iterative self-organizing data analysis technique (ISODATA) – In this approach, the classifier automatically groups closely related image pixels into clusters, computes the cluster means, and classifies land cover over a series of repeated iterations.
* ''K''-means clustering – An approach in which the computer automatically extracts ''k'' land cover clusters from satellite images and classifies the overall image based on the calculated means of the extracted clusters.
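
The iterative grouping that both algorithms rely on can be illustrated with a basic ''k''-means loop. This is a minimal sketch assuming pixels stored as rows of a NumPy array. The function name, the fixed random seed, and the sample values are hypothetical, and ISODATA's additional splitting and merging of clusters is omitted.

```python
import numpy as np

def k_means(pixels, k, n_iter=20, seed=0):
    """Cluster pixels (n_pixels, n_bands) into k groups by repeated mean updates."""
    rng = np.random.default_rng(seed)
    # Start from k distinct pixels chosen at random.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every pixel to its nearest centroid.
        distances = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned pixels.
        new_centroids = centroids.copy()
        for c in range(k):
            members = pixels[labels == c]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[c] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids

# Hypothetical two-band (red, NIR) reflectance values forming two groups.
pixels = np.array([[0.05, 0.03], [0.06, 0.04], [0.07, 0.02],
                   [0.05, 0.45], [0.04, 0.50], [0.06, 0.40]])
labels, centroids = k_means(pixels, k=2)
print(labels)
```

The cluster numbers carry no meaning by themselves. Which number corresponds to which land cover type depends on the random starting centroids, so the analyst still has to name the clusters after the run.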

Vegetation indices classification

Vegetation indices classification is a system in which two or more spectral bands are combined through defined statistical algorithms to reflect the spatial properties of a vegetation cover. Most of these indices make use of the relationship between the red and near-infrared (NIR) bands of satellite images to generate vegetation properties. Several vegetation indices have been developed; scientists apply these via remote sensing to classify forest cover and land use patterns. Because they combine the surface reflectance of land features in two or more bands, these spectral indices can improve classification accuracy.

Vegetation indices

* Normalized difference vegetation index (NDVI) – Defined as the difference between the near-infrared (NIR) and red bands of satellite images divided by their sum. It is calculated as:
:: <math>\text{NDVI} = \frac{\text{NIR} - \text{Red}}{\text{NIR} + \text{Red}}</math>
:This index measures vegetation greenness, with values ranging between -1 and 1. High NDVI values represent dense vegetation cover, moderate NDVI values represent sparse vegetation cover, and low NDVI values correspond to non-vegetated areas (e.g., barren or bare lands).
* Enhanced vegetation index (EVI) – Defined as a ratio between the red, NIR, and blue bands, with a gain factor (G), a soil brightness correction factor (L), and atmospheric aerosol correction factors (C1 and C2). It is calculated as:
:: <math>\text{EVI} = G \times \frac{\text{NIR} - \text{Red}}{\text{NIR} + C_1 \times \text{Red} - C_2 \times \text{Blue} + L}</math>
:with usually default values of L = 0.5 and G = 2.5.
* Soil adjusted vegetation index (SAVI) – Defined as a ratio between the red and NIR values with a soil brightness correction factor (L). It is calculated as:
:: <math>\text{SAVI} = (1 + L) \times \frac{\text{NIR} - \text{Red}}{\text{NIR} + \text{Red} + L}</math>
* Canopy shadow index (SI) – Defined as the square root of the product of the inverted red and green bands of satellite images. It evaluates the different shadow patterns of forest canopies based on age, structure, and composition, and readily differentiates dense forests from grass and bare lands. It is calculated as:
:: <math>\text{SI} = \sqrt{(256 - \text{Green}) \times (256 - \text{Red})}</math>
:where both red and green range between 0 and 256.
* Advanced vegetation index (AVI) – Used to differentiate forest cover from grassland and bare land areas. In its commonly cited form for 8-bit data it is calculated as:
:: <math>\text{AVI} = \sqrt[3]{(\text{NIR} + 1) \times (256 - \text{Red}) \times (\text{NIR} - \text{Red})}</math>
:where red ranges between 0 and 256.
* Bare soil index (BSI) – Defined as a ratio involving the NIR, red, and blue bands of satellite images; the commonly used formulation also includes a short-wave infrared (SWIR) band. It measures the amount of bare soil and as such increases as forest density decreases. It is commonly calculated as:
:: <math>\text{BSI} = \frac{(\text{SWIR} + \text{Red}) - (\text{NIR} + \text{Blue})}{(\text{SWIR} + \text{Red}) + (\text{NIR} + \text{Blue})}</math>
* Normalized difference water index (NDWI) – Developed for quantifying the water content of plants and other earth system features, using short-wave infrared (SWIR). It is calculated as:
:: <math>\text{NDWI} = \frac{\text{NIR} - \text{SWIR}}{\text{NIR} + \text{SWIR}}</math>
* Normalized difference built-up index (NDBI) – Developed for quantifying built-up areas in satellite images. It is calculated as:
:: <math>\text{NDBI} = \frac{\text{SWIR} - \text{NIR}}{\text{SWIR} + \text{NIR}}</math>
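
Several of these indices reduce to simple band arithmetic. The sketch below computes NDVI, SAVI, EVI, NDWI, and NDBI from their standard definitions, assuming each band is a NumPy array of surface reflectance and that no denominator is zero. The function names and sample reflectance values are hypothetical. The EVI coefficients `C1 = 6.0` and `C2 = 7.5` are widely used values assumed here for illustration and are not given in the text, while `L = 0.5` and `G = 2.5` follow the defaults stated above.

```python
import numpy as np

def ndvi(nir, red):
    """(NIR - Red) / (NIR + Red)"""
    return (nir - red) / (nir + red)

def savi(nir, red, L=0.5):
    """(1 + L) * (NIR - Red) / (NIR + Red + L)"""
    return (1 + L) * (nir - red) / (nir + red + L)

def evi(nir, red, blue, G=2.5, L=0.5, C1=6.0, C2=7.5):
    """G * (NIR - Red) / (NIR + C1*Red - C2*Blue + L); C1 and C2 are assumed values."""
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)

def ndwi(nir, swir):
    """(NIR - SWIR) / (NIR + SWIR)"""
    return (nir - swir) / (nir + swir)

def ndbi(nir, swir):
    """(SWIR - NIR) / (SWIR + NIR)"""
    return (swir - nir) / (swir + nir)

# Hypothetical reflectance values for three pixels.
nir = np.array([0.50, 0.30, 0.10])
red = np.array([0.05, 0.20, 0.10])
blue = np.array([0.03, 0.10, 0.08])
swir = np.array([0.20, 0.35, 0.30])

print(ndvi(nir, red))       # highest for the first pixel, zero for the third
print(savi(nir, red))
print(evi(nir, red, blue))
print(ndwi(nir, swir))
print(ndbi(nir, swir))
```

With these definitions NDBI is the negative of the SWIR-based NDWI, so the two indices rank pixels in exactly opposite order.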

External links

* Documentation for OpenStreetMap about land cover
