In persistent homology, a persistent homology group is a multiscale analog of a

homology group In mathematics, the term homology, originally introduced in algebraic topology, has three primary, closely-related usages. The most direct usage of the term is to take the ''homology of a chain complex'', resulting in a sequence of abelian grou ...

that captures information about the evolution of topological features across a

filtration Filtration is a physical separation process that separates solid matter and fluid from a mixture using a ''filter medium'' that has a complex structure through which only the fluid can pass. Solid particles that cannot pass through the filte ...

of spaces. While the ordinary homology group represents nontrivial homology classes of an individual

topological space In mathematics, a topological space is, roughly speaking, a Geometry, geometrical space in which Closeness (mathematics), closeness is defined but cannot necessarily be measured by a numeric Distance (mathematics), distance. More specifically, a to ...

, the persistent homology group tracks only those classes that remain nontrivial across multiple parameters in the underlying filtration. Analogous to the ordinary

Betti number In algebraic topology, the Betti numbers are used to distinguish topological spaces based on the connectivity of ''n''-dimensional simplicial complexes. For the most reasonable finite-dimensional spaces (such as compact manifolds, finite simplicia ...

, the ranks of the persistent homology groups are known as the persistent Betti numbers. Persistent homology groups were first introduced by Herbert Edelsbrunner, David Letscher, and Afra Zomorodian in a 2002 paper ''Topological Persistence and Simplification'', one of the foundational papers in the fields of persistent homology and

topological data analysis In applied mathematics, topological data analysis (TDA) is an approach to the analysis of datasets using techniques from topology. Extraction of information from datasets that are high-dimensional, incomplete and noisy is generally challenging. TDA ...

, based largely on the persistence barcodes and the persistence algorithm, that were first described by Serguei Barannikov in the 1994 paper. Since then, the study of persistent homology groups has led to applications in

data science Data science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processing, scientific visualization, algorithms and systems to extract or extrapolate knowledge from potentially noisy, stru ...

machine learning Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of Computational statistics, statistical algorithms that can learn from data and generalise to unseen data, and thus perform Task ( ...

materials science Materials science is an interdisciplinary field of researching and discovering materials. Materials engineering is an engineering field of finding uses for materials in other fields and industries. The intellectual origins of materials sci ...

biology Biology is the scientific study of life and living organisms. It is a broad natural science that encompasses a wide range of fields and unifying principles that explain the structure, function, growth, History of life, origin, evolution, and ...

, and

economics Economics () is a behavioral science that studies the Production (economics), production, distribution (economics), distribution, and Consumption (economics), consumption of goods and services. Economics focuses on the behaviour and interac ...

Definition

Let

K

be a

simplicial complex In mathematics, a simplicial complex is a structured Set (mathematics), set composed of Point (geometry), points, line segments, triangles, and their ''n''-dimensional counterparts, called Simplex, simplices, such that all the faces and intersec ...

, and let

f: K \to \mathbb R

be a

real-valued In mathematics, value may refer to several, strongly related notions. In general, a mathematical value may be any definite mathematical object. In elementary mathematics, this is most often a number – for example, a real number such as or an ...

monotonic function In mathematics, a monotonic function (or monotone function) is a function between ordered sets that preserves or reverses the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of or ...

. Then for some values

a_0 < a_1 < \cdots < a_n \in \mathbb R

the sublevel-sets

K(a) := f^(-\infty, a]

yield a sequence of nested subcomplexes

\emptyset = K_0 \subseteq K_1 \subseteq \cdots \subseteq K_n = K

known as a ''filtration'' of

K

. Applying

p^

homology to each complex yields a sequence of homology groups

0 = H_p (K_0) \to H_p (K_1) \to \cdots \to H_p (K_n) = H_p (K)

connected by

homomorphism In algebra, a homomorphism is a morphism, structure-preserving map (mathematics), map between two algebraic structures of the same type (such as two group (mathematics), groups, two ring (mathematics), rings, or two vector spaces). The word ''homo ...

s induced by the

inclusion map In mathematics, if A is a subset of B, then the inclusion map is the function \iota that sends each element x of A to x, treated as an element of B: \iota : A\rightarrow B, \qquad \iota(x)=x. An inclusion map may also be referred to as an inclu ...

s of the underlying filtration. When homology is taken over a field, we get a sequence of

vector space In mathematics and physics, a vector space (also called a linear space) is a set (mathematics), set whose elements, often called vector (mathematics and physics), ''vectors'', can be added together and multiplied ("scaled") by numbers called sc ...

s and

linear map In mathematics, and more specifically in linear algebra, a linear map (also called a linear mapping, linear transformation, vector space homomorphism, or in some contexts linear function) is a mapping V \to W between two vector spaces that p ...

s known as a persistence module. Let

f_p^: H_p (K_i) \to H_p (K_j)

be the homomorphism induced by the inclusion

K_i \hookrightarrow K_j

. Then the

p^

persistent homology groups are defined as the

images An image or picture is a visual representation. An image can be two-dimensional, such as a drawing, painting, or photograph, or three-dimensional, such as a carving or sculpture. Images may be displayed through other media, including a project ...

H_p^ := \operatorname f_p^

for all

1 \leq i \leq j \leq n

. In particular, the persistent homology group

H_p^ = H_p (K_i)

. More precisely, the

p^

persistent homology group can be defined as

H_p^ = Z_p (K_i) / \left( B_p (K_j) \cap Z_p(K_i) \right)

, where

Z_p(K_\bullet)

and

B_p(K_\bullet)

are the standard p-cycle and p-boundary groups, respectively.

Birth and death of homology classes

Sometimes the elements of

H_p^

are described as the homology classes that are "born" at or before

K_i

and that have not yet "died" entering

K_j

. These notions can be made precise as follows. A homology class

\gamma \in H_p (K_i)

is said to be ''born'' at

K_i

if it is not contained in the image of the previous persistent homology group, i.e.,

\gamma \notin H_p^

. Conversely,

\gamma

is said to ''die entering''

K_j

\gamma

is subsumed (i.e., merges with) another older class as the sequence proceeds from

K_ \to K_j

. That is to say,

f_p^ (\gamma) \notin H_p^

but

f_p^ (\gamma) \in H_p^

. The determination that an older class persists if it merges with a younger class, instead of the other way around, is sometimes known as the ''Elder Rule''. The indices

i,j

at which a homology class

\gamma

is born and dies entering are known as the ''birth'' and ''death'' indices of

\gamma

. The difference

j-i

is known as the ''index persistence'' of

\gamma

, while the corresponding difference

a_j - a_i

in function values corresponding to those indices is known as the ''persistence'' of

\gamma

. If there exists no index at which

\gamma

dies, it is assigned an infinite death index. Thus, the persistence of each class can be represented as an interval in the extended real line

\mathbb R \cup \

of either the form

[a_i, a_j)

[a_i', \infty)

. Since, in the case of an infinite field, the infinite number of classes always have the same persistence, the collection over ''all'' classes of such intervals does not give meaningful multiplicities for a multiset of intervals. Instead, such multiplicities and a multiset of intervals in the extended real line are given by the structure theorem of persistent homology, persistence homology. This multiset is known as the '' persistence barcode''.

Canonical form

Concretely, the structure theorem states that for any filtered complex over a field

F

, there exists a linear transformation that preserves the filtration and converts the filtered complex into so called canonical form, a canonically defined direct sum of filtered complexes of two types: two-dimensional complexes with trivial homology

d(e_)=e_

and one-dimensional complexes with trivial differential

d(e_)=0

Persistence diagram

Geometrically, a barcode can be plotted as a multiset of points (with possibly infinite coordinates)

(a_i, a_j)

in the extended plane

\left( \mathbb R \cup \ \right)^2

. By the above definitions, each point will lie above the diagonal, and the distance to the diagonal is exactly equal to the persistence of the corresponding class times

\frac{\sqrt 2}

. This construction is known as the ''persistence diagram'', and it provides a way of visualizing the structure of the persistence of homology classes in the sequence of persistent homology groups.

References

Computational topology Data analysis Homology theory Algebraic topology