In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...
, splaysort is an
adaptive comparison sort
A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation (often a "less than or equal to" operator or a three-way comparison) that determines which of two elements should oc ...
ing
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
based on the
splay tree data structure
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the rel ...
.
Algorithm
The steps of the algorithm are:
# Initialize an empty splay tree
# For each data item in the input order, insert it into the splay tree
# Traverse the splay tree in
inorder to find the sorted order of the data
Thus, the algorithm may be seen as a form of
insertion sort
Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time by comparisons. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. Howe ...
or
tree sort
A tree sort is a sort algorithm that builds a binary search tree from the elements to be sorted, and then traverses the tree ( in-order) so that the elements come out in sorted order. Its typical use is sorting elements online: after each insert ...
, using a splay tree to speed up each insertion.
Analysis
Based on the
amortized analysis
In computer science, amortized analysis is a method for analyzing a given algorithm's complexity, or how much of a resource, especially time or memory, it takes to execute. The motivation for amortized analysis is that looking at the worst-case ...
of splay trees, the worst case running time of splaysort, on an input with ''n'' data items, is ''O''(''n'' log ''n''), matching the time bounds for efficient non-adaptive algorithms such as
quicksort
Quicksort is an efficient, general-purpose sorting algorithm. Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961, it is still a commonly used algorithm for sorting. Overall, it is slightly faster than ...
,
heap sort
In computer science, heapsort is a comparison-based sorting algorithm. Heapsort can be thought of as an improved selection sort: like selection sort, heapsort divides its input into a sorted and an unsorted region, and it iteratively shrinks the ...
, and
merge sort
In computer science, merge sort (also commonly spelled as mergesort) is an efficient, general-purpose, and comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the order of equal elements is the same ...
.
For an input sequence in which most items are placed close to their predecessor in the sorted order, or are out of order with only a small number of other items, splaysort can be faster than ''O''(''n'' log ''n''), showing that it is an
adaptive sort A sorting algorithm falls into the adaptive sort family if it takes advantage of existing order in its input. It benefits from the presortedness in the input sequence – or a limited amount of disorder for various definitions of measures of disord ...
. To quantify this, let ''d''
''x'' be the number of positions in the input that separate ''x'' from its predecessor, and let ''i''
''x'' be the number of items that appear on one side of ''x'' in the input and on the other side of ''x'' in the output (the number of
inversions that involve ''x''). Then it follows from the dynamic finger theorem for splay trees that the total time for splaysort is bounded by
:
and by
:
.
Splaysort can also be shown to be adaptive to the
entropy
Entropy is a scientific concept, as well as a measurable physical property, that is most commonly associated with a state of disorder, randomness, or uncertainty. The term and the concept are used in diverse fields, from classical thermodyna ...
of the input sequence.
Experimental results
In experiments by , splaysort was slower than quicksort on tables of random numbers by a factor of 1.5 to 2, and slower than mergesort by smaller factors. For data consisting of larger records, again in a random order, the additional amount of data movement performed by quicksort significantly slowed it down compared to pointer-based algorithms, and the times for splaysort and mergesort were very close to each other. However, for nearly presorted input sequences (measured in terms of the number of contiguous monotone subsequences in the data, the number of inversions, the number of items that must be removed to make a sorted subsequence, or the number of non-contiguous monotone subsequences into which the input can be partitioned) splaysort became significantly more efficient than the other algorithms.
compared splaysort to several other algorithms that are adaptive to the total number of inversions in the input, as well as to quicksort. They found that, on the inputs that had few enough inversions to make an adaptive algorithm faster than quicksort, splaysort was the fastest algorithm.
Variations
modify splaysort to be more strongly adaptive to the number of contiguous monotone subsequences in the input, and report on experiments showing that the resulting algorithm is faster on inputs that are nearly presorted according to this measure.
[.]
References
{{Sorting
Sorting algorithms