In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...
, a suffix array is a sorted
array
An array is a systematic arrangement of similar objects, usually in rows and columns.
Things called an array include:
{{TOC right
Music
* In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
of all
suffixes
In linguistics, a suffix is an affix which is placed after the stem of a word. Common examples are case endings, which indicate the grammatical case of nouns, adjectives, and verb endings, which form the conjugation of verbs. Suffixes can carry gr ...
of a
string
String or strings may refer to:
*String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects
Arts, entertainment, and media Films
* ''Strings'' (1991 film), a Canadian anim ...
. It is a data structure used in, among others, full-text indices, data-compression algorithms, and the field of
bibliometrics
Bibliometrics is the use of statistical methods to analyse books, articles and other publications, especially in regard with scientific contents. Bibliometric methods are frequently used in the field of library and information science. Bibliom ...
.
Suffix arrays were introduced by as a simple, space efficient alternative to
suffix tree
In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow parti ...
s. They had independently been discovered by
Gaston Gonnet
Gaston H. Gonnet is a Uruguayan Canadian computer scientist and entrepreneur. He is best known for his contributions to the Maple computer algebra system and the creation of a digital version of the Oxford English Dictionary.
Education and e ...
in 1987 under the name ''PAT array'' .
gave the first in-place
time suffix array construction algorithm that is optimal both in time and space, where ''in-place'' means that the algorithm only needs
additional space beyond the input string and the output suffix array.
Enhanced suffix arrays (ESAs) are suffix arrays with additional tables that reproduce the full functionality of suffix trees preserving the same time and memory complexity.
The suffix array for a subset of all suffixes of a string is called
sparse suffix array
Sparse is a computer software tool designed to find possible coding faults in the Linux kernel. Unlike other such tools, this static analysis tool was initially designed to only flag constructs that were likely to be of interest to kernel deve ...
. Multiple probabilistic algorithms have been developed to minimize the additional memory usage including an optimal time and memory algorithm.
Definition
Let