HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, a selection algorithm is an
algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
for finding the kth smallest value in a collection of ordered values, such as numbers. The value that it finds is called the
order statistic In statistics, the ''k''th order statistic of a statistical sample is equal to its ''k''th-smallest value. Together with Ranking (statistics), rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and ...
. Selection includes as special cases the problems of finding the
minimum In mathematical analysis, the maximum and minimum of a function are, respectively, the greatest and least value taken by the function. Known generically as extremum, they may be defined either within a given range (the ''local'' or ''relative ...
,
median The median of a set of numbers is the value separating the higher half from the lower half of a Sample (statistics), data sample, a statistical population, population, or a probability distribution. For a data set, it may be thought of as the “ ...
, and
maximum In mathematical analysis, the maximum and minimum of a function (mathematics), function are, respectively, the greatest and least value taken by the function. Known generically as extremum, they may be defined either within a given Interval (ma ...
element in the collection. Selection algorithms include
quickselect In computer science, quickselect is a selection algorithm to find the ''k''th smallest element in an unordered list, also known as the ''k''th order statistic. Like the related quicksort sorting algorithm, it was developed by Tony Hoare, and t ...
, and the
median of medians In computer science, the median of medians is an approximate median selection algorithm, frequently used to supply a good pivot for an exact selection algorithm, most commonly quickselect, that selects the ''k''th smallest element of an initial ...
algorithm. When applied to a collection of n values, these algorithms take
linear time In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations ...
, O(n) as expressed using
big O notation Big ''O'' notation is a mathematical notation that describes the asymptotic analysis, limiting behavior of a function (mathematics), function when the Argument of a function, argument tends towards a particular value or infinity. Big O is a memb ...
. For data that is already structured, faster algorithms may be possible; as an extreme case, selection in an already-sorted
array An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
takes


Problem statement

An algorithm for the selection problem takes as input a collection of values, and a It outputs the smallest of these values, or, in some versions of the problem, a collection of the k smallest values. For this to be well-defined, it should be possible to sort the values into an order from smallest to largest; for instance, they may be
integers An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
,
floating-point number In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a signed sequence of a fixed number of digits in some base) multiplied by an integer power of that base. Numbers of this form ...
s, or some other kind of
object Object may refer to: General meanings * Object (philosophy), a thing, being, or concept ** Object (abstract), an object which does not exist at any particular time or place ** Physical object, an identifiable collection of matter * Goal, an a ...
with a numeric key. However, they are not assumed to have been already sorted. Often, selection algorithms are restricted to a comparison-based
model of computation In computer science, and more specifically in computability theory and computational complexity theory, a model of computation is a model which describes how an output of a mathematical function is computed given an input. A model describes how ...
, as in
comparison sort A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation (often a "less than or equal to" operator or a three-way comparison) that determines which of two elements should oc ...
algorithms, where the algorithm has access to a comparison operation that can determine the relative ordering of any two values, but may not perform any other kind of
arithmetic operations Arithmetic is an elementary branch of mathematics that deals with numerical operations like addition, subtraction, multiplication, and Division (mathematics), division. In a wider sense, it also includes exponentiation, extraction of nth root, ...
on these values. To simplify the problem, some works on this problem assume that the values are all distinct from each or that some consistent tie-breaking method has been used to assign an ordering to pairs of items with the same value as each other. Another variation in the problem definition concerns the numbering of the ordered values: is the smallest value obtained by as in
zero-based numbering Zero-based numbering is a way of numbering in which the initial element of a sequence is assigned the index 0, rather than the index 1 as is typical in everyday non-mathematical or non-programming circumstances. Under zero-based number ...
of arrays, or is it obtained by following the usual English-language conventions for the smallest, second-smallest, etc.? This article follows the conventions used by Cormen et al., according to which all values are distinct and the minimum value is obtained from With these conventions, the maximum value, among a collection of n values, is obtained by When n is an
odd number In mathematics, parity is the property of an integer of whether it is even or odd. An integer is even if it is divisible by 2, and odd if it is not.. For example, −4, 0, and 82 are even numbers, while −3, 5, 23, and 69 are odd numbers. The ...
, the
median The median of a set of numbers is the value separating the higher half from the lower half of a Sample (statistics), data sample, a statistical population, population, or a probability distribution. For a data set, it may be thought of as the “ ...
of the collection is obtained by When n is even, there are two choices for the median, obtained by rounding this choice of k down or up, respectively: the ''lower median'' with k=n/2 and the ''upper median'' with


Algorithms


Sorting and heapselect

As a baseline algorithm, selection of the smallest value in a collection of values can be performed by the following two steps: * Sort the collection * If the output of the sorting algorithm is an
array An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
, retrieve its element; otherwise, scan the sorted sequence to find the element. The time for this method is dominated by the sorting step, which requires \Theta(n\log n) time using a Even when
integer sorting In computer science, integer sorting is the algorithmic problem of sorting a collection of data values by integer keys. Algorithms designed for integer sorting may also often be applied to sorting problems in which the keys are floating point numb ...
algorithms may be used, these are generally slower than the linear time that may be achieved using specialized selection algorithms. Nevertheless, the simplicity of this approach makes it attractive, especially when a highly-optimized sorting routine is provided as part of a runtime library, but a selection algorithm is not. For inputs of moderate size, sorting can be faster than non-random selection algorithms, because of the smaller constant factors in its running This method also produces a sorted version of the collection, which may be useful for other later computations, and in particular for selection with other choices For a sorting algorithm that generates one item at a time, such as
selection sort In computer science, selection sort is an in-place comparison sorting algorithm. It has a O(''n''2) time complexity, which makes it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is note ...
, the scan can be done in tandem with the sort, and the sort can be terminated once the element has been found. One possible design of a consolation bracket in a
single-elimination tournament A single-elimination knockout, or sudden-death tournament is a type of elimination tournament where the loser of a match-up is immediately eliminated from the tournament. Each winner will play another in the next round, until the final match-up, ...
, in which the teams who lost to the eventual winner play another mini-tournament to determine second place, can be seen as an instance of this Applying this optimization to
heapsort In computer science, heapsort is an efficient, comparison-based sorting algorithm that reorganizes an input array into a heap (a data structure where each node is greater than its children) and then repeatedly removes the largest node from that ...
produces the
heapselect In computer science, a heap is a tree-based data structure that satisfies the heap property: In a ''max heap'', for any given node C, if P is the parent node of C, then the ''key'' (the ''value'') of P is greater than or equal to the key of C. In ...
algorithm, which can select the smallest value in This is fast when k is small relative but degenerates to O(n\log n) for larger values such as the choice k=n/2 used for median finding.


Pivoting

Many methods for selection are based on choosing a special "pivot" element from the input, and using comparisons with this element to divide the remaining n-1 input values into two subsets: the set L of elements less than the pivot, and the set R of elements greater than the pivot. The algorithm can then determine where the smallest value is to be found, based on a comparison of k with the sizes of these sets. In particular, the smallest value is and can be found recursively by applying the same selection algorithm then the smallest value is the pivot, and it can be returned immediately. In the remaining case, the smallest value is and more specifically it is the element in position k-, L, -1 It can be found by applying a selection algorithm recursively, seeking the value in this position As with the related pivoting-based
quicksort Quicksort is an efficient, general-purpose sorting algorithm. Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961. It is still a commonly used algorithm for sorting. Overall, it is slightly faster than ...
algorithm, the partition of the input into L and R may be done by making new collections for these sets, or by a method that partitions a given list or array data type in-place. Details vary depending on how the input collection is The time to compare the pivot against all the other values However, pivoting methods differ in how they choose the pivot, which affects how big the subproblems in each recursive call will be. The efficiency of these methods depends greatly on the choice of the pivot. If the pivot is chosen badly, the running time of this method can be as slow *If the pivot were exactly at the median of the input, then each recursive call would have at most half as many values as the previous call, and the total times would add in a
geometric series In mathematics, a geometric series is a series (mathematics), series summing the terms of an infinite geometric sequence, in which the ratio of consecutive terms is constant. For example, 1/2 + 1/4 + 1/8 + 1/16 + ⋯, the series \tfrac12 + \tfrac1 ...
However, finding the median is itself a selection problem, on the entire original input. Trying to find it by a recursive call to a selection algorithm would lead to an infinite recursion, because the problem size would not decrease in each *
Quickselect In computer science, quickselect is a selection algorithm to find the ''k''th smallest element in an unordered list, also known as the ''k''th order statistic. Like the related quicksort sorting algorithm, it was developed by Tony Hoare, and t ...
chooses the pivot uniformly at random from the input values. It can be described as a
prune and search Prune and search is a method of solving optimization problems suggested by Nimrod Megiddo in 1983.Nimrod Megiddo (1983) Linear-time algorithms for linear programming in R3 and related problems. SIAM J. Comput., 12:759–776 The basic idea of ...
algorithm, a variant of
quicksort Quicksort is an efficient, general-purpose sorting algorithm. Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961. It is still a commonly used algorithm for sorting. Overall, it is slightly faster than ...
, with the same pivoting strategy, but where quicksort makes two recursive calls to sort the two subcollections L quickselect only makes one of these two calls. Its
expected time In computational complexity theory, the average-case complexity of an algorithm is the amount of some computational resource (typically time) used by the algorithm, averaged over all possible inputs. It is frequently contrasted with worst-case com ...
For any constant C, the probability that its number of comparisons exceeds Cn is superexponentially small *The
Floyd–Rivest algorithm In computer science, the Floyd-Rivest algorithm is a selection algorithm developed by Robert W. Floyd and Ronald L. Rivest that has an optimal expected number of comparisons within lower-order terms. It is functionally equivalent to quickselect ...
, a variation of quickselect, chooses a pivot by randomly sampling a subset of r data values, for some sample and then recursively selecting two elements somewhat above and below position rk/n of the sample to use as pivots. With this choice, it is likely that k is sandwiched between the two pivots, so that after pivoting only a small number of data values between the pivots are left for a recursive call. This method can achieve an expected number of comparisons that is In their original work, Floyd and Rivest claimed that the o(n) term could be made as small as O(\sqrt n) by a recursive sampling scheme, but the correctness of their analysis has been Instead, more rigorous analysis has shown that a version of their algorithm achieves O(\sqrt) for this Although the usual analysis of both quickselect and the Floyd–Rivest algorithm assumes the use of a
true random number generator In computing, a hardware random number generator (HRNG), true random number generator (TRNG), non-deterministic random bit generator (NRBG), or physical random number generator is a device that generates random numbers from a physical process ca ...
, a version of the Floyd–Rivest algorithm using a
pseudorandom number generator A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random number generation, random n ...
seeded with only logarithmically many true random bits has been proven to run in linear time with high probability. *The
median of medians In computer science, the median of medians is an approximate median selection algorithm, frequently used to supply a good pivot for an exact selection algorithm, most commonly quickselect, that selects the ''k''th smallest element of an initial ...
method partitions the input into sets of five elements, and uses some other non-recursive method to find the median of each of these sets in constant time per set. It then recursively calls itself to find the median of these n/5 medians. Using the resulting median of medians as the pivot produces a partition with Thus, a problem on n elements is reduced to two recursive problems on n/5 elements (to find the pivot) and at most 7n/10 elements (after the pivot is used). The total size of these two recursive subproblems is at allowing the total time to be analyzed as a geometric series adding Unlike quickselect, this algorithm is deterministic, not It was the first linear-time deterministic selection algorithm and is commonly taught in undergraduate algorithms classes as an example of a divide and conquer that does not divide into two equal However, the high constant factors in its O(n) time bound make it slower than quickselect in and slower even than sorting for inputs of moderate *Hybrid algorithms such as
introselect In computer science, introselect (short for "introspective selection") is a selection algorithm that is a hybrid of quickselect and median of medians which has fast average performance and optimal worst-case performance. Introselect is related ...
can be used to achieve the practical performance of quickselect with a fallback to medians of medians guaranteeing worst-case O(n)


Factories

The deterministic selection algorithms with the smallest known numbers of comparisons, for values of k that are far from 1 are based on the concept of ''factories'', introduced in 1976 by
Arnold Schönhage Arnold Schönhage (born 1 December 1934 in Lockhausen, now Bad Salzuflen) is a German mathematician and computer scientist. Schönhage was professor at the Rheinische Friedrich-Wilhelms-Universität, Bonn, and also in Tübingen and Konstanz. ...
, Mike Paterson, and These are methods that build
partial order In mathematics, especially order theory, a partial order on a set is an arrangement such that, for certain pairs of elements, one precedes the other. The word ''partial'' is used to indicate that not every pair of elements needs to be comparable ...
s of certain specified types, on small subsets of input values, by using comparisons to combine smaller partial orders. As a very simple example, one type of factory can take as input a sequence of single-element partial orders, compare pairs of elements from these orders, and produce as output a sequence of two-element totally ordered sets. The elements used as the inputs to this factory could either be input values that have not been compared with anything yet, or "waste" values produced by other factories. The goal of a factory-based algorithm is to combine together different factories, with the outputs of some factories going to the inputs of others, in order to eventually obtain a partial order in which one element (the smallest) is larger than some k-1 other elements and smaller than another n-k others. A careful design of these factories leads to an algorithm that, when applied to median-finding, uses at most 2.942n comparisons. For other values the number of comparisons is


Parallel algorithms

Parallel algorithm In computer science, a parallel algorithm, as opposed to a traditional serial algorithm, is an algorithm which can do multiple operations in a given time. It has been a tradition of computer science to describe serial algorithms in abstract mach ...
s for selection have been studied since 1975, when
Leslie Valiant Leslie Gabriel Valiant (born 28 March 1949) is a British American computer scientist and computational theorist. He was born to a chemical engineer father and a translator mother. He is currently the T. Jefferson Coolidge Professor of Compute ...
introduced the parallel comparison tree model for analyzing these algorithms, and proved that in this model selection using a linear number of comparisons requires \Omega(\log\log n) parallel steps, even for selecting the minimum or Researchers later found parallel algorithms for selection in O(\log\log n) steps, matching this In a randomized parallel comparison tree model it is possible to perform selection in a bounded number of steps and a linear number of On the more realistic parallel RAM model of computing, with exclusive read exclusive write memory access, selection can be performed in time O(\log n) with O(n/\log n) processors, which is optimal both in time and in the number of With concurrent memory access, slightly faster parallel time is possible in and the \log n term in the time bound can be replaced


Sublinear data structures

When data is already organized into a
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
, it may be possible to perform selection in an amount of time that is sublinear in the number of values. As a simple case of this, for data already sorted into an array, selecting the element may be performed by a single array lookup, in constant For values organized into a two-dimensional array of with sorted rows and columns, selection may be performed in time or faster when k is small relative to the array For a collection of m one-dimensional sorted arrays, with k_i items less than the selected item in the array, the time is Selection from data in a
binary heap A binary heap is a heap (data structure), heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues. The binary heap was introduced by J. W. J. Williams in 1964 as a data structure fo ...
takes This is independent of the size n of the heap, and faster than the O(k\log n) time bound that would be obtained from This same method can be applied more generally to data organized as any kind of heap-ordered tree (a tree in which each node stores one value in which the parent of each non-root node has a smaller value than its child). This method of performing selection in a heap has been applied to problems of listing multiple solutions to combinatorial optimization problems, such as finding the shortest paths in a weighted graph, by defining a
state space In computer science, a state space is a discrete space representing the set of all possible configurations of a system. It is a useful abstraction for reasoning about the behavior of a given system and is widely used in the fields of artificial ...
of solutions in the form of an implicitly defined heap-ordered tree, and then applying this selection algorithm to this In the other direction, linear time selection algorithms have been used as a subroutine in a
priority queue In computer science, a priority queue is an abstract data type similar to a regular queue (abstract data type), queue or stack (abstract data type), stack abstract data type. In a priority queue, each element has an associated ''priority'', which ...
data structure related to the heap, improving the time for extracting its item from O(\log n) to here \log^* n is the For a collection of data values undergoing dynamic insertions and deletions, the
order statistic tree In computer science, an order statistic tree is a variant of the binary search tree (or more generally, a B-tree) that supports two additional operations beyond insertion, lookup and deletion: * Select(''i'') – find the ''i''-th smallest element ...
augments a
self-balancing binary search tree In computer science, a self-balancing binary search tree (BST) is any node-based binary search tree that automatically keeps its height (maximal number of levels below the root) small in the face of arbitrary item insertions and deletions.Donald ...
structure with a constant amount of additional information per tree node, allowing insertions, deletions, and selection queries that ask for the element in the current set to all be performed in O(\log n) time per Going beyond the comparison model of computation, faster times per operation are possible for values that are small integers, on which binary arithmetic operations are It is not possible for a
streaming algorithm In computer science, streaming algorithms are algorithms for processing data streams in which the input is presented as a sequence of items and can be examined in only a few passes, typically one-pass algorithm, just one. These algorithms are desi ...
with memory sublinear in both n and k to solve selection queries exactly for dynamic data, but the count–min sketch can be used to solve selection queries approximately, by finding a value whose position in the ordering of the elements (if it were added to them) would be within \varepsilon n steps of k, for a sketch whose size is within logarithmic factors of 1/\varepsilon.


Lower bounds

The O(n) running time of the selection algorithms described above is necessary, because a selection algorithm that can handle inputs in an arbitrary order must take that much time to look at all of its inputs. If any one of its input values is not compared, that one value could be the one that should have been selected, and the algorithm can be made to produce an incorrect answer. Beyond this simple argument, there has been a significant amount of research on the exact number of comparisons needed for selection, both in the randomized and deterministic cases. Selecting the minimum of n values requires n-1 comparisons, because the n-1 values that are not selected must each have been determined to be non-minimal, by being the largest in some comparison, and no two of these values can be largest in the same comparison. The same argument applies symmetrically to selecting the The next simplest case is selecting the second-smallest. After several incorrect attempts, the first tight lower bound on this case was published in 1964 by Soviet mathematician Sergey Kislitsyn. It can be shown by observing that selecting the second-smallest also requires distinguishing the smallest value from the rest, and by considering the number p of comparisons involving the smallest value that an algorithm for this problem makes. Each of the p items that were compared to the smallest value is a candidate for second-smallest, and p-1 of these values must be found larger than another value in a second comparison in order to rule them out as second-smallest. With n-1 values being the larger in at least one comparison, and p-1 values being the larger in at least two comparisons, there are a total of at least n+p-2 comparisons. An adversary argument, in which the outcome of each comparison is chosen in order to maximize p (subject to consistency with at least one possible ordering) rather than by the numerical values of the given items, shows that it is possible to force p to be Therefore, the worst-case number of comparisons needed to select the second smallest the same number that would be obtained by holding a single-elimination tournament with a run-off tournament among the values that lost to the smallest value. However, the expected number of comparisons of a randomized selection algorithm can be better than this bound; for instance, selecting the second-smallest of six elements requires seven comparisons in the worst case, but may be done by a randomized algorithm with an expected number of More generally, selecting the element out of n requires at least n+\min(k,n-k)-O(1) comparisons, in the average case, matching the number of comparisons of the Floyd–Rivest algorithm up to its o(n) term. The argument is made directly for deterministic algorithms, with a number of comparisons that is averaged over all possible
permutation In mathematics, a permutation of a set can mean one of two different things: * an arrangement of its members in a sequence or linear order, or * the act or process of changing the linear order of an ordered set. An example of the first mean ...
s of the input By
Yao's principle In computational complexity theory, Yao's principle (also called Yao's minimax principle or Yao's lemma) relates the performance of randomized algorithms to deterministic (non-random) algorithms. It states that, for certain classes of algorithms, ...
, it also applies to the expected number of comparisons for a randomized algorithm on its worst-case For deterministic algorithms, it has been shown that selecting the element requires \bigl(1+H(k/n)\bigr)n+\Omega(\sqrt n) comparisons, where H(x)=x\log_2\frac1x + (1-x)\log_2\frac1 is the The special case of median-finding has a slightly larger lower bound on the number of comparisons, for


Exact numbers of comparisons

Knuth supplies the following triangle of numbers summarizing pairs of n and k for which the exact number of comparisons needed by an optimal selection algorithm is known. The row of the triangle (starting with n=1 in the top row) gives the numbers of comparisons for inputs of n values, and the number within each row gives the number of comparisons needed to select the smallest value from an input of that size. The rows are symmetric because selecting the smallest requires exactly the same number of comparisons, in the worst case, as selecting the Most, but not all, of the entries on the left half of each row can be found using the formula n-k+(k-1)\bigl\lceil\log_2(n+2-k)\bigr\rceil. This describes the number of comparisons made by a method of Abdollah Hadian and Milton Sobel, related to heapselect, that finds the smallest value using a single-elimination tournament and then repeatedly uses a smaller tournament among the values eliminated by the eventual tournament winners to find the next successive values until reaching the smallest. Some of the larger entries were proven to be optimal using a computer search.


Language support

Very few languages have built-in support for general selection, although many provide facilities for finding the smallest or largest element of a list. A notable exception is the
Standard Template Library The Standard Template Library (STL) is a software library originally designed by Alexander Stepanov for the C++ programming language that influenced many parts of the C++ Standard Library. It provides four components called ''algorithms'', '' ...
for C++, which provides a templated nth_element method with a guarantee of expected linear time. Python's standard library includes heapq.nsmallest and heapq.nlargest functions for returning the smallest or largest elements from a collection, in sorted order. The implementation maintains a
binary heap A binary heap is a heap (data structure), heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues. The binary heap was introduced by J. W. J. Williams in 1964 as a data structure fo ...
, limited to holding k elements, and initialized to the first k elements in the collection. Then, each subsequent item of the collection may replace the largest or smallest element in the heap if it is smaller or larger than this element. The algorithm's memory usage is superior to heapselect (the former only holds k elements in memory at a time while the latter requires manipulating the entire dataset into memory). Running time depends on data ordering. The best case is O((n - k) + k\log k) for already sorted data. The worst-case is O(n\log k) for reverse sorted data. In average cases, there are likely to be few heap updates and most input elements are processed with only a single comparison. For example, extracting the 100 largest or smallest values out of 10,000,000 random inputs makes 10,009,401 comparisons on average. Since 2017,
Matlab MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementat ...
has included maxk() and mink() functions, which return the maximal (minimal) k values in a vector as well as their indices. The Matlab documentation does not specify which algorithm these functions use or what their running


History

Quickselect In computer science, quickselect is a selection algorithm to find the ''k''th smallest element in an unordered list, also known as the ''k''th order statistic. Like the related quicksort sorting algorithm, it was developed by Tony Hoare, and t ...
was presented without analysis by
Tony Hoare Sir Charles Antony Richard Hoare (; born 11 January 1934), also known as C. A. R. Hoare, is a British computer scientist who has made foundational contributions to programming languages, algorithms, operating systems, formal verification, and ...
and first analyzed in a 1971 technical report by The first known linear time deterministic selection algorithm is the
median of medians In computer science, the median of medians is an approximate median selection algorithm, frequently used to supply a good pivot for an exact selection algorithm, most commonly quickselect, that selects the ''k''th smallest element of an initial ...
method, published in 1973 by
Manuel Blum Manuel Blum (born 26 April 1938) is a Venezuelan-born American computer scientist who received the Turing Award in 1995 "In recognition of his contributions to the foundations of computational complexity theory and its application to cryptography ...
,
Robert W. Floyd Robert W. Floyd (born Robert Willoughby Floyd; June 8, 1936 – September 25, 2001) was an American computer scientist. His contributions include the design of the Floyd–Warshall algorithm (independently of Stephen Warshall), which efficientl ...
,
Vaughan Pratt Vaughan Pratt (born April 12, 1944) is a Professor, Professor Emeritus at Stanford University, who was an early pioneer in the field of computer science. Since 1969, Pratt has made several contributions to foundational areas such as search algorit ...
,
Ron Rivest Ronald Linn Rivest (; born May 6, 1947) is an American cryptographer and computer scientist whose work has spanned the fields of algorithms and combinatorics, cryptography, machine learning, and election integrity. He is an Institute Profess ...
, and
Robert Tarjan Robert Endre Tarjan (born April 30, 1948) is an American computer scientist and mathematician. He is the discoverer of several graph theory algorithms, including his strongly connected components algorithm, and co-inventor of both splay trees a ...
. They trace the formulation of the selection problem to work of Charles L. Dodgson (better known as
Lewis Carroll Charles Lutwidge Dodgson (27 January 1832 – 14 January 1898), better known by his pen name Lewis Carroll, was an English author, poet, mathematician, photographer and reluctant Anglicanism, Anglican deacon. His most notable works are ''Alice ...
) who in 1883 pointed out that the usual design of single-elimination sports tournaments does not guarantee that the second-best player wins second place, and to work of
Hugo Steinhaus Hugo Dyonizy Steinhaus ( , ; 14 January 1887 – 25 February 1972) was a Polish mathematician and educator. Steinhaus obtained his PhD under David Hilbert at Göttingen University in 1911 and later became a professor at the Jan Kazimierz Univers ...
circa 1930, who followed up this same line of thought by asking for a tournament design that can make this guarantee, with a minimum number of games played (that is,


See also

* , algorithms for higher-dimensional generalizations of medians *
Median filter The median filter is a non-linear digital filtering technique, often used to remove signal noise, noise from an image, signal, and video. Such noise reduction is a typical pre-processing step to improve the results of later processing (for example ...
, application of median-finding algorithms in image processing


References

{{DEFAULTSORT:Selection Algorithm