A galactic algorithm is one that outperforms any other algorithm for problems that are sufficiently large, but where "sufficiently large" is so big that the algorithm is never used in practice. Galactic algorithms were so named by Richard Lipton and Ken Regan because they will never be used on any data sets on Earth.
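To make "sufficiently large" concrete, the sketch below searches for the input size at which an asymptotically faster algorithm starts to win. Both cost models and the constant `1e30` are hypothetical values chosen only for illustration; they do not describe any real algorithm.

```python
import math


def cost_simple(n: int) -> float:
    """Hypothetical everyday algorithm: n^2 steps with a constant of 1."""
    return 1.0 * n * n


def cost_galactic(n: int, constant: float = 1e30) -> float:
    """Hypothetical galactic algorithm: constant * n * log2(n) steps."""
    return constant * n * math.log2(n)


def crossover(constant: float = 1e30) -> int:
    """Smallest n at which the galactic cost drops below the simple cost.

    Assumes constant >= 2, so the simple algorithm wins for n = 2, 3, 4
    and the comparison flips exactly once as n grows.
    """
    lo, hi = 2, 4
    # Double hi until the galactic algorithm wins at hi.
    while cost_galactic(hi, constant) >= cost_simple(hi):
        lo, hi = hi, hi * 2
    # Binary search between a losing size (lo) and a winning size (hi).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if cost_galactic(mid, constant) < cost_simple(mid):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    print(f"crossover near n = {crossover():.3e}")
```

With these made-up constants the crossover lies beyond 10^30 input items, far more than any data set that could be stored.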
Possible use cases
Even if they are never used in practice, galactic algorithms may still contribute to computer science:
* An algorithm, even if impractical, may show new techniques that may eventually be used to create practical algorithms.
* Available computational power may catch up to the crossover point, so that a previously impractical algorithm becomes practical.
* An impractical algorithm can still demonstrate that conjectured bounds can be achieved, or that proposed bounds are wrong, and hence advance the theory of algorithms. As Lipton states:
Similarly, a hypothetical large but polynomial $O(n^{2^{100}})$ algorithm for the Boolean satisfiability problem, although unusable in practice, would settle the P versus NP problem, considered the most important open problem in computer science and one of the Millennium Prize Problems.
Examples
Integer multiplication
An example of a galactic algorithm is the fastest known way to multiply two numbers, which is based on a 1729-dimensional Fourier transform. It needs $O(n \log n)$ bit operations, but as the constants hidden by the big O notation are large, it is never used in practice. [Quote from one of the authors of the algorithm: "The new algorithm is not really practical in its current form, because the proof given in our paper only works for ludicrously large numbers. Even if each digit was written on a hydrogen atom, there would not be nearly enough room available in the observable universe to write them down."] However, it also shows why galactic algorithms may still be useful. The authors state: "we are hopeful that with further refinements, the algorithm might become practical for numbers with merely billions or trillions of digits."
Matrix multiplication
The first improvement over brute-force matrix multiplication (which needs $O(n^3)$ multiplications) was the Strassen algorithm: a recursive algorithm that needs $O(n^{2.807})$ multiplications. This algorithm is not galactic and is used in practice. Further extensions of this, using sophisticated group theory, are the Coppersmith–Winograd algorithm and its slightly better successors, needing $O(n^{2.373})$ multiplications. These are galactic – "We nevertheless stress that such improvements are only of theoretical interest, since the huge constants involved in the complexity of fast matrix multiplication usually make these algorithms impractical."
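The recursive structure of the Strassen algorithm can be sketched as follows. This is a minimal illustration, assuming square integer matrices whose size is a power of two; the helper names (`add`, `sub`, `split`) are introduced here for the example. Each level replaces the eight block products of the brute-force method with seven, which is where the exponent $\log_2 7 \approx 2.807$ comes from.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def add(a: Matrix, b: Matrix) -> Matrix:
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a: Matrix, b: Matrix) -> Matrix:
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """Split a 2k x 2k matrix into its four k x k quadrants."""
    k = len(m) // 2
    return ([r[:k] for r in m[:k]], [r[k:] for r in m[:k]],
            [r[:k] for r in m[k:]], [r[k:] for r in m[k:]])


def strassen(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two n x n matrices (n a power of two) with 7 recursive products."""
    n = len(a)
    if n == 1:
        return [[a[0][0] * b[0][0]]]
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    # The seven block products.
    m1 = strassen(add(a11, a22), add(b11, b22))
    m2 = strassen(add(a21, a22), b11)
    m3 = strassen(a11, sub(b12, b22))
    m4 = strassen(a22, sub(b21, b11))
    m5 = strassen(add(a11, a12), b22)
    m6 = strassen(sub(a21, a11), add(b11, b12))
    m7 = strassen(sub(a12, a22), add(b21, b22))
    # Recombine the products into the quadrants of the result.
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom
```

Practical implementations typically switch to ordinary multiplication below some block size; this sketch recurses all the way down to 1 x 1 blocks to keep the structure visible.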
Communication channel capacity
Claude Shannon showed a simple but impractical code that could reach the capacity of a communication channel. It requires assigning a random code word to every possible $n$-bit message, then decoding by finding the closest code word. If $n$ is chosen large enough, this beats any existing code and can get arbitrarily close to the capacity of the channel. Unfortunately, any $n$ big enough to beat existing codes is also completely impractical. These codes, though never used, inspired decades of research into more practical algorithms that today can achieve rates arbitrarily close to channel capacity.
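A toy version of this random-coding scheme is sketched below. The function names, the parameters `message_bits` and `word_bits`, and the use of Hamming distance as "closest" are assumptions made for the example, not details taken from Shannon's construction.

```python
import random
from typing import List


def random_codebook(message_bits: int, word_bits: int, seed: int = 0) -> List[int]:
    """Draw one random code word (stored as an int) for every possible message."""
    rng = random.Random(seed)
    return [rng.getrandbits(word_bits) for _ in range(2 ** message_bits)]


def encode(message: int, codebook: List[int]) -> int:
    """Look up the code word assigned to a message."""
    return codebook[message]


def decode(received: int, codebook: List[int]) -> int:
    """Return the message whose code word is nearest in Hamming distance."""
    return min(range(len(codebook)),
               key=lambda m: bin(codebook[m] ^ received).count("1"))


if __name__ == "__main__":
    book = random_codebook(message_bits=4, word_bits=32)
    sent = encode(9, book)
    print(decode(sent, book))
```

The codebook holds `2 ** message_bits` entries and the decoder scans all of them, so both memory and decoding time grow exponentially with the message length. That growth is what makes the scheme unusable at the sizes where it would outperform practical codes.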
Sub-graphs
The problem of deciding whether a graph $G$ contains $H$ as a minor is NP-complete in general, but where $H$ is fixed, it can be solved in polynomial time. The running time for testing whether $H$ is a minor of $G$ in this case is $O(n^2)$, where $n$ is the number of vertices in $G$ and the big O notation hides a constant that depends superexponentially on $H$. The constant is greater than $2 \uparrow\uparrow (2 \uparrow\uparrow (2 \uparrow\uparrow (h/2)))$ in Knuth's up-arrow notation, where $h$ is the number of vertices in $H$. Even the case of $h = 4$ cannot be reasonably computed as the constant is greater than $2 \uparrow\uparrow 65536$, a power tower of $n$ twos with $n = 65536$.
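The size of this constant can be unpacked step by step. The sketch below evaluates Knuth's up-arrow notation for tiny arguments, taking the bound to have the form $2 \uparrow\uparrow (2 \uparrow\uparrow (2 \uparrow\uparrow (h/2)))$; the function name `up_arrow` is introduced here for illustration.

```python
def up_arrow(a: int, arrows: int, b: int) -> int:
    """Evaluate a ↑^arrows b for small arguments (arrows >= 1, b >= 1)."""
    if arrows == 1:
        return a ** b  # a single arrow is ordinary exponentiation
    if b == 1:
        return a
    # a ↑^k b = a ↑^(k-1) (a ↑^k (b-1))
    return up_arrow(a, arrows - 1, up_arrow(a, arrows, b - 1))


if __name__ == "__main__":
    h = 4
    innermost = up_arrow(2, 2, h // 2)   # 2 ↑↑ 2
    middle = up_arrow(2, 2, innermost)   # 2 ↑↑ (2 ↑↑ 2)
    print(innermost, middle)
    # The outermost step, 2 ↑↑ middle, is a power tower of `middle` twos
    # and is deliberately not evaluated.
```

Only the two inner levels are computed. The outermost level would be a tower of 65536 twos, which no physical computer could store.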
Cryptographic breaks
For cryptographers, a cryptographic "break" is anything faster than a brute-force attack, i.e., performing one trial decryption for each possible key. In many cases, even though such breaks are the best known methods, they are still infeasible with current technology. One example is the best attack known against 128-bit AES, which takes only $2^{126}$ operations.
Despite being impractical, theoretical breaks can sometimes provide insight into vulnerability patterns.
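A rough calculation shows how far such a break is from being usable. The sketch below takes about $2^{126}$ operations for the best known attack on 128-bit AES and $2^{128}$ for brute force. The rate of $10^{18}$ operations per second is a hypothetical figure chosen for illustration, and treating one operation as one machine step is a simplification.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_run(log2_operations: float, ops_per_second: float = 1e18) -> float:
    """Years needed to perform 2**log2_operations steps at the assumed rate."""
    return 2.0 ** log2_operations / ops_per_second / SECONDS_PER_YEAR


brute_force_years = years_to_run(128)  # one trial per 128-bit key
best_attack_years = years_to_run(126)  # assumed cost of the best known attack

if __name__ == "__main__":
    print(f"brute force: {brute_force_years:.2e} years")
    print(f"best attack: {best_attack_years:.2e} years")
    print(f"speed-up:    {brute_force_years / best_attack_years:.0f}x")
```

Under these assumptions the attack is four times faster than brute force yet would still run for more than a trillion years.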
Traveling salesman problem
For several decades, the best known approximation to the traveling salesman problem in a metric space was the very simple Christofides algorithm, which produced a path at most 50% longer than the optimum. (Many other algorithms could ''usually'' do much better, but could not provably do so.) In 2020, a newer and much more complex algorithm was discovered that can beat this by $10^{-34}$ percent. Although no one will ever switch to this algorithm for its very slight worst-case improvement, it is still considered important because "this minuscule improvement breaks through both a theoretical logjam and a psychological one".
Hutter search
A single algorithm, "Hutter search", can solve any well-defined problem in an asymptotically optimal time, barring some caveats. It works by searching through all possible algorithms (by runtime), while simultaneously searching through all possible proofs (by length of proof), looking for a proof of correctness for each algorithm. Since the proof of correctness is of finite size, it "only" adds a constant and does not affect the asymptotic runtime. However, this constant is so big that the algorithm is entirely impractical. For example, if the shortest proof of correctness of a given algorithm is 1000 bits long, the search will examine at least $2^{999}$ other potential proofs first.
Optimization
Simulated annealing, when used with a logarithmic cooling schedule, has been proven to find the global optimum of any optimization problem. However, such a cooling schedule results in entirely impractical runtimes, and is never used. Nevertheless, knowing that this ideal algorithm exists has led to practical variants that are able to find very good (though not provably optimal) solutions to complex optimization problems.
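The sketch below shows what a logarithmic cooling schedule looks like and why it is so slow. It assumes the common form $T_k = c / \ln(k + 2)$ with a hypothetical constant `c = 1.0`, a one-dimensional search space, and a uniform random proposal step. None of these choices come from the convergence proof, which places its own conditions on the schedule.

```python
import math
import random
from typing import Callable, Tuple


def log_temperature(step: int, c: float = 1.0) -> float:
    """Logarithmic cooling: T_k = c / ln(k + 2)."""
    return c / math.log(step + 2)


def steps_until(temperature: float, c: float = 1.0) -> float:
    """Steps needed before T_k falls to `temperature`: k = e^(c/T) - 2."""
    return math.exp(c / temperature) - 2


def anneal(f: Callable[[float], float], x0: float, steps: int,
           c: float = 1.0, seed: int = 0) -> Tuple[float, float]:
    """Minimise f over the reals; returns the best (x, f(x)) seen."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    for k in range(steps):
        t = log_temperature(k, c)
        cand = x + rng.uniform(-1.0, 1.0)
        fc = f(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f


if __name__ == "__main__":
    print(anneal(lambda x: (x * x - 4) ** 2 + x, x0=3.0, steps=2000))
    print(f"steps to reach T = 0.01: {steps_until(0.01):.2e}")
```

With this schedule the number of steps grows exponentially in $1/T$, so reaching even a moderately low temperature takes an astronomical number of iterations. Practical variants usually cool much faster and give up the guarantee.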
Minimum spanning trees
The expected linear time MST algorithm is able to discover the minimum spanning tree of a graph in $O(m + n)$, where $m$ is the number of edges and $n$ is the number of nodes of the graph. However, the constant factor that is hidden by the big O notation is huge enough to make the algorithm impractical. An implementation is publicly available and given the experimentally estimated implementation constants, it would only be faster than Borůvka's algorithm for graphs in which $m + n > 9 \cdot 10^{151}$.
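For comparison, the practical baseline named here, Borůvka's algorithm, fits in a few lines. The sketch below is a minimal version, not the expected linear time algorithm itself. It assumes nodes numbered from 0, edges given as `(u, v, weight)` tuples, and ties broken by edge index. These conventions are choices made for the example.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight)


def boruvka_mst(num_nodes: int, edges: List[Edge]) -> List[Edge]:
    """Return the edges of a minimum spanning forest."""
    parent = list(range(num_nodes))

    def find(x: int) -> int:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest: List[Edge] = []
    while True:
        # cheapest[c] = index of the lightest edge leaving component c.
        cheapest: Dict[int, int] = {}
        for i, (u, v, w) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                j = cheapest.get(r)
                # Compare (weight, index) so ties are broken consistently.
                if j is None or (w, i) < (edges[j][2], j):
                    cheapest[r] = i
        if not cheapest:
            return forest  # no edge joins two different components
        for i in set(cheapest.values()):
            u, v, _ = edges[i]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                forest.append(edges[i])


if __name__ == "__main__":
    graph = [(0, 1, 1), (1, 2, 2), (2, 3, 3), (0, 3, 4), (0, 2, 5)]
    print(sorted(boruvka_mst(4, graph)))
```

Each round at least halves the number of components, so there are at most about $\log_2 n$ rounds of $O(m)$ work each. That small, predictable cost is the bar the linear-time algorithm would have to clear.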