Manifold alignment is a class of

machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...

algorithms that produce projections between sets of data, given that the original data sets lie on a common

manifold In mathematics, a manifold is a topological space that locally resembles Euclidean space near each point. More precisely, an n-dimensional manifold, or ''n-manifold'' for short, is a topological space with the property that each point has a ...

. The concept was first introduced as such by Ham, Lee, and Saul in 2003, adding a manifold constraint to the general problem of correlating sets of high-dimensional vectors.

Overview

Manifold alignment assumes that disparate data sets produced by similar generating processes will share a similar underlying

representation. By learning projections from each original space to the shared manifold, correspondences are recovered and knowledge from one domain can be transferred to another. Most manifold alignment techniques consider only two data sets, but the concept extends to arbitrarily many initial data sets. Consider the case of aligning two data sets,

X

and

Y

, with

X_i \in \mathbb^m

and

Y_i \in \mathbb^n

. Manifold alignment algorithms attempt to project both

X

and

Y

into a new ''d''-dimensional space such that the projections both minimize distance between corresponding points and preserve the local manifold structure of the original data. The projection functions are denoted:

\phi_:\,\mathbb^\rightarrow\mathbb^

\phi_:\,\mathbb^\rightarrow\mathbb^

Let

W

represent the binary correspondence matrix between points in

X

and

Y

W_=\begin
1 & if\, X_\leftrightarrow Y_\\
0 & otherwise
\end

Let

S_X

and

S_Y

represent pointwise similarities within data sets. This is usually encoded as the

heat kernel In the mathematical study of heat conduction and diffusion, a heat kernel is the fundamental solution to the heat equation on a specified domain with appropriate boundary conditions. It is also one of the main tools in the study of the spectr ...

of the

adjacency matrix In graph theory and computer science, an adjacency matrix is a square matrix used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph. In the special case of a finite simple ...

of a ''k''-nearest neighbor graph. Finally, introduce a coefficient

0 \le \mu \le 1

, which can be tuned to adjust the weight of the 'preserve manifold structure' goal, versus the 'minimize corresponding point distances' goal. With these definitions in place, the

loss function In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto a real number intuitively representing some "co ...

for manifold alignment can be written:

\arg\min_\mu\sum_\left\Vert \phi_\left(X_\right)-\phi_\left(X_\right)\right\Vert ^S_+\mu\sum_\left\Vert \phi_\left(Y_\right)-\phi_\left(Y_\right)\right\Vert ^S_+\left(1-\mu\right)\sum_\Vert\phi_\left(X_\right)-\phi_\left(Y_\right)\Vert^W_

Solving this optimization problem is equivalent to solving a

generalized eigenvalue problem In linear algebra, eigendecomposition is the factorization of a matrix into a canonical form, whereby the matrix is represented in terms of its eigenvalues and eigenvectors. Only diagonalizable matrices can be factorized in this way. When the matri ...

using the

graph laplacian In the mathematical field of graph theory, the Laplacian matrix, also called the graph Laplacian, admittance matrix, Kirchhoff matrix or discrete Laplacian, is a matrix representation of a graph. Named after Pierre-Simon Laplace, the graph Lapla ...

of the joint matrix, ''G'':

G=\left begin
\mu S_X & \left(1-\mu\right)W\\
\left(1-\mu\right)W^T & \mu S_Y
\end\right

Inter-data correspondences

The algorithm described above requires full pairwise correspondence information between input data sets; a

supervised learning Supervised learning (SL) is a machine learning paradigm for problems where the available data consists of labelled examples, meaning that each data point contains features (covariates) and an associated label. The goal of supervised learning alg ...

paradigm. However, this information is usually difficult or impossible to obtain in real world applications. Recent work has extended the core manifold alignment algorithm to semi-supervised , unsupervised , and multiple-instance settings.

One-step vs. two-step alignment

The algorithm described above performs a "one-step" alignment, finding embeddings for both data sets at the same time. A similar effect can also be achieved with "two-step" alignments , following a slightly modified procedure: # Project each input data set to a lower-dimensional space independently, using any of a variety of

dimension reduction Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally ...

algorithms. # Perform linear manifold alignment on the embedded data, holding the first data set fixed, mapping each additional data set onto the first's manifold. This approach has the benefit of decomposing the required computation, which lowers memory overhead and allows parallel implementations.

Instance-level vs. feature-level projections

Manifold alignment can be used to find linear (feature-level) projections, or nonlinear (instance-level) embeddings. While the instance-level version generally produces more accurate alignments, it sacrifices a great degree of flexibility as the learned embedding is often difficult to parameterize. Feature-level projections allow any new instances to be easily embedded in the manifold space, and projections may be combined to form direct mappings between the original data representations. These properties are especially important for knowledge-transfer applications.

Applications

Manifold alignment is suited to problems with several corpora that lie on a shared manifold, even when each corpus is of a different dimensionality. Many real-world problems fit this description, but traditional techniques are not able to take advantage of all corpora at the same time. Manifold alignment also facilitates

transfer learning Transfer learning (TL) is a research problem in machine learning (ML) that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize ...

, in which knowledge of one domain is used to jump-start learning in correlated domains. Applications of manifold alignment include: *

Cross-language information retrieval Cross-language information retrieval (CLIR) is a subfield of information retrieval dealing with retrieving information written in a language different from the language of the user's query. The term "cross-language information retrieval" has man ...

/ automatic translation ** By representing documents as vector of word counts, manifold alignment can recover the mapping between documents of different languages. ** Cross-language document correspondence is relatively easy to obtain, especially from multi-lingual organizations like the

European Union The European Union (EU) is a supranational political and economic union of member states that are located primarily in Europe. The union has a total area of and an estimated total population of about 447million. The EU has often been ...

. * Transfer learning of policy and state representations for reinforcement learning * Alignment of

protein NMR Nuclear magnetic resonance spectroscopy of proteins (usually abbreviated protein NMR) is a field of structural biology in which NMR spectroscopy is used to obtain information about the structure and dynamics of proteins, and also nucleic acids, and ...

structures * Accelerating model learning in robotics by sharing data generated by other robots

Overview

Inter-data correspondences

One-step vs. two-step alignment

Instance-level vs. feature-level projections

Applications

See also

References

Further reading