
Multidimensional scaling (MDS) is a means of visualizing the level of
similarity
Similarity may refer to:
In mathematics and computing
* Similarity (geometry), the property of sharing the same shape
* Matrix similarity, a relation between matrices
* Similarity measure, a function that quantifies the similarity of two objects
* ...
of individual cases of a dataset. MDS is used to translate "information about the pairwise 'distances' among a set of
objects or individuals" into a configuration of
points mapped into an abstract
Cartesian space
A Cartesian coordinate system (, ) in a plane is a coordinate system that specifies each point uniquely by a pair of numerical coordinates, which are the signed distances to the point from two fixed perpendicular oriented lines, measured i ...
.
More technically, MDS refers to a set of related
ordination
Ordination is the process by which individuals are consecrated, that is, set apart and elevated from the laity class to the clergy, who are thus then authorized (usually by the denominational hierarchy composed of other clergy) to perform v ...
techniques used in
information visualization
Information is an abstract concept that refers to that which has the power to inform. At the most fundamental level information pertains to the interpretation of that which may be sensed. Any natural process that is not completely random, ...
, in particular to display the information contained in a
distance matrix
In mathematics, computer science and especially graph theory, a distance matrix is a square matrix
In mathematics, a square matrix is a matrix with the same number of rows and columns. An ''n''-by-''n'' matrix is known as a square matrix of orde ...
. It is a form of
non-linear dimensionality reduction.
Given a distance matrix with the distances between each pair of objects in a set, and a chosen number of dimensions, ''N'', an MDS
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
places each object into ''N''-
dimension
In physics and mathematics, the dimension of a mathematical space (or object) is informally defined as the minimum number of coordinates needed to specify any point within it. Thus, a line has a dimension of one (1D) because only one coor ...
al space (a lower-dimensional representation) such that the between-object distances are preserved as well as possible. For ''N'' = 1, 2, and 3, the resulting points can be visualized on a
scatter plot
A scatter plot (also called a scatterplot, scatter graph, scatter chart, scattergram, or scatter diagram) is a type of plot or mathematical diagram using Cartesian coordinates to display values for typically two variables for a set of data. ...
.
Core theoretical contributions to MDS were made by
James O. Ramsay
James O. Ramsay (born 5 September 1942) is a Canadian statistician and Professor Emeritus at McGill University, Montreal, who developed much of the statistical theory behind multidimensional scaling (MDS). Together with co-author Bernard Silverm ...
of
McGill University
McGill University (french: link=no, Université McGill) is an English-language public research university located in Montreal, Quebec, Canada. Founded in 1821 by royal charter granted by King George IV,Frost, Stanley Brice. ''McGill Universit ...
, who is also regarded as the founder of
functional data analysis Functional data analysis (FDA) is a branch of statistics that analyses data providing information about curves, surfaces or anything else varying over a continuum. In its most general form, under an FDA framework, each sample element of functional ...
.
Types
MDS algorithms fall into a
taxonomy
Taxonomy is the practice and science of categorization or classification.
A taxonomy (or taxonomical classification) is a scheme of classification, especially a hierarchical classification, in which things are organized into groups or types. ...
, depending on the meaning of the input matrix:
Classical multidimensional scaling
It is also known as Principal Coordinates Analysis (PCoA), Torgerson Scaling or Torgerson–Gower scaling. It takes an input matrix giving dissimilarities between pairs of items and outputs a coordinate matrix whose configuration minimizes a
loss function
In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto a real number intuitively representing some "co ...
called ''strain.'',
which is given by
where
denote vectors in ''N''-dimensional space,
denotes the scalar product between
and
, and
are the elements of the matrix
defined on step 2 of the following algorithm, which are computed from the distances.
: Steps of a Classical MDS algorithm:
: Classical MDS uses the fact that the coordinate matrix
can be derived by
eigenvalue decomposition
In linear algebra, eigendecomposition is the factorization of a matrix into a canonical form, whereby the matrix is represented in terms of its eigenvalues and eigenvectors. Only diagonalizable matrices can be factorized in this way. When the matri ...
from
. And the matrix
can be computed from proximity matrix
by using double centering.
:# Set up the squared proximity matrix