topological data analysis In applied mathematics, topological data analysis (TDA) is an approach to the analysis of datasets using techniques from topology. Extraction of information from datasets that are high-dimensional, incomplete and noisy is generally challenging. TDA ...

, a simplex tree is a type of

trie In computer science, a trie (, ), also known as a digital tree or prefix tree, is a specialized search tree data structure used to store and retrieve strings from a dictionary or set. Unlike a binary search tree, nodes in a trie do not store t ...

used to represent efficiently any general

simplicial complex In mathematics, a simplicial complex is a structured Set (mathematics), set composed of Point (geometry), points, line segments, triangles, and their ''n''-dimensional counterparts, called Simplex, simplices, such that all the faces and intersec ...

. Through its nodes, this

data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...

notably explicitly represents all the simplices. Its flexible structure allows the implementation of many basic operations useful to computing persistent homology. This data structure was invented by Jean-Daniel Boissonnat and Clément Maria in 2014, in the article ''The Simplex Tree: An Efficient Data Structure for General Simplicial Complexes''. This data structure offers efficient operations on sparse simplicial complexes. For dense or maximal simplices, Skeleton-Blocker representations or Toplex Map representations are used.

Definitions

Many researchers in topological data analysis consider the simplex tree to be the most compact simplex-based data structure for simplicial complexes, and a data structure allowing an intuitive understanding of simplicial complexes due to integrated usage of their mathematical properties.

Heuristic definition

Consider any simplicial complex is a

set Set, The Set, SET or SETS may refer to: Science, technology, and mathematics Mathematics *Set (mathematics), a collection of elements *Category of sets, the category whose objects and morphisms are sets and total functions, respectively Electro ...

composed of

points A point is a small dot or the sharp tip of something. Point or points may refer to: Mathematics * Point (geometry), an entity that has a location in space or on a plane, but has no extent; more generally, an element of some abstract topologica ...

(0 dimensions),

line segment In geometry, a line segment is a part of a line (mathematics), straight line that is bounded by two distinct endpoints (its extreme points), and contains every Point (geometry), point on the line that is between its endpoints. It is a special c ...

s (1 dimension),

triangle A triangle is a polygon with three corners and three sides, one of the basic shapes in geometry. The corners, also called ''vertices'', are zero-dimensional points while the sides connecting them, also called ''edges'', are one-dimension ...

s (2 dimensions), and their ''n''-dimensional counterparts, called n-simplexes within a topological space. By the mathematical properties of simplexes, any n-simplex is composed of multiple

(n-1)

-simplexes. Thus, lines are composed of points, triangles of lines, and tetrahedrons of triangles. Notice each higher level adds 1 vertex to the vertices of the n-simplex. The data structure is simplex-based, therefore, it should represent all simplexes uniquely by the points defining the simplex. A simple way to achieve this is to define each simplex by its points in sorted order. Let

\Kappa

be a simplicial complex of dimension k,

V

its vertex set, where vertices are labeled from 1 to

\left\vert V \right\vert

and ordered accordingly. Now, construct a dictionary size

\left\vert V \right\vert

containing all vertex labels in order. This represents the 0-dimensional simplexes. Then, for the path to the initial dictionary of each entry in the initial dictionary, add as a child dictionary all vertices fully-connected to the current set of vertices, all of which have a label greater than

l

. Represent this step on k levels. Clearly, considering the first dictionary as depth 0, any entry at depth

\tau

of any dictionary in this data structure uniquely represents a

\tau

-simplex within

\Kappa

. For completeness, the point to the initial dictionary is considered the representation of the empty simplex. For the practicality of the operations, labels that are repeated on the same level are linked together, forming a looped

linked list In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes whi ...

. Finally, child dictionaries also have pointers to their parent dictionary, for fast ancestor access.

Constructive definition

Let

\Kappa

be a simplicial complex of dimension k. We begin by decomposing the simplicial complex into mutually exclusive simplexes. This can be achieved in a greedy way by iteratively removing from the simplicial complex the highest order simplexes until the simplicial complex is empty. We then need to label each vertex from 1 to

\left\vert V \right\vert

and associate each simplex with its corresponding "word", that is the ordered list of its vertices by label. Ordering the labels ensures no repetition in the simplex tree, as there is only one way to describe a simplex. We start with a null root, representing the null simplex. Then, we iterate through all simplexes, and through each label of each simplex word. If the label is available as a child to the current root, make that child the temporary root of the insertion process, otherwise, create a new node for the child, make it the new temporary root, and continue with the rest of the word. During this process, k dictionaries are maintained with all the labels and insert the address of the node for the corresponding label. If an address is already at that space in the dictionary, a pointer is created from the old node to the new node. Once the process is finished, all children of each node are entered into a dictionary, and all pointers are looped to make looped linked lists. A wide range of dictionaries could be applied here, like hash tables, but some operations assume the possibility of an ordered traversal of the entries, leading most of the implementations to use red-black trees are dictionaries.

Operations

While simplex trees are not the most space efficient data structures for simplicial

complex representation In mathematics, a complex representation is a representation of a group (or that of Lie algebra In mathematics, a Lie algebra (pronounced ) is a vector space \mathfrak g together with an operation called the Lie bracket, an alternating bil ...

, their operations on sparse data are considered state-of-art. Here, we give the bounds of different useful operations possible through this representation. Many implementations of these operations are available. We first introduce the notation. Consider

s

is a given simplex,

\sigma

is a given node corresponding to the last vertex of

s

l

is the label associate to that node,

j

is the depth of that node,

k

is the dimension of the simplicial complex,

D_\sigma

is the maximal number of operations to access

\sigma

in a dictionary (if the dictionary is a red-black tree,

D_\sigma=O(log(deg(\sigma)))

is the complexity) . Consider

C_s

is the number of cofaces of

s

, and

N_l^

is the number of nodes of the simplex tree ending with the label

l

at depth greater than

j

. Notice

N_l^ \leq C_s

. # Search, insert and remove words are done in

O(j D_\sigma)

. # Insert and remove an entire simplex is done in

O(2^j D_\sigma)

. # Computing persistent homology, or in a more involved way, computing

Betti number In algebraic topology, the Betti numbers are used to distinguish topological spaces based on the connectivity of ''n''-dimensional simplicial complexes. For the most reasonable finite-dimensional spaces (such as compact manifolds, finite simplicia ...

s, using a simplex tree most efficiently remains an open problem, however, current algorithms for this task on sparse simplicial complexes achieve state-of-art performance. # The structure of simplex trees allows for elementary collapse of collapsible simplexes, however the bounds of this operation in the general case are unknown. # A subcase of elementary collapse is edge-contraction.

Edge contraction In graph theory, an edge contraction is an operation that removes an edge from a graph while simultaneously merging the two vertices that it previously joined. Edge contraction is a fundamental operation in the theory of graph minors. Vertex id ...

can be achieved in

O(k N_l^+C_s D_\sigma)

. # Locating cofaces of given simplex can be achieved in

O(k N_l^)

. # Locating cofacets of given simplex can be achieved in

O(j^2 D_\sigma)

. As for construction, as seen in the constructive definition, construction is proportional to the number and complexity of simplexes in the simplicial complex. This can be especially expensive if the simplicial complex is dense. However, some optimizations for particular simplicial complexes, including for

Flag complex Clique complexes, independence complexes, flag complexes, Whitney complexes and conformal hypergraphs are closely related mathematical objects in graph theory and geometric topology that each describe the cliques (complete subgraphs) of an undir ...

es, Rips complexes and Witness complexes.

Applications

Simplex trees are efficient in sparse simplicial complexes. For this purpose, many persistent homology algorithms focusing on high-dimensional real data (often sparse) use simplex trees within these algorithms. While simplex trees are not as efficient as incidence matrices, their simplex-based structure allows them to be useful and efficient for simplicial complex storage within persistent homology algorithms.

References

{{reflist Trees (data structures) Simplicial sets