In theoretical computer science, the computational complexity of matrix multiplication dictates how quickly the operation of matrix multiplication can be performed.
Matrix multiplication algorithms are a central subroutine in theoretical and numerical algorithms for numerical linear algebra and optimization, so finding the fastest algorithm for matrix multiplication is of major practical relevance.
Directly applying the mathematical definition of matrix multiplication gives an algorithm that requires $n^3$ field operations to multiply two $n \times n$ matrices over that field ($\Theta(n^3)$ in big O notation). Surprisingly, algorithms exist that provide better running times than this straightforward "schoolbook algorithm". The first to be discovered was
Strassen's algorithm, devised by
Volker Strassen in 1969 and often referred to as "fast matrix multiplication".
The optimal number of field operations needed to multiply two square $n \times n$ matrices up to constant factors is still unknown. This is a major open question in theoretical computer science.
As of 2024, the best bound on the asymptotic complexity of a matrix multiplication algorithm is $O(n^{2.371552})$. However, this and similar improvements to Strassen are not used in practice, because they are galactic algorithms: the constant coefficient hidden by the big O notation is so large that they are only worthwhile for matrices that are too large to handle on present-day computers.
Simple algorithms
If ''A'', ''B'' are $n \times n$ matrices over a field, then their product ''AB'' is also an $n \times n$ matrix over that field, defined entrywise as
:$(AB)_{ij} = \sum_{k=1}^{n} A_{ik}B_{kj}.$
Schoolbook algorithm
The simplest approach to computing the product of two matrices ''A'' and ''B'' is to compute the arithmetic expressions coming from the definition of matrix multiplication. In pseudocode:
input ''A'' and ''B'', both ''n'' by ''n'' matrices
initialize ''C'' to be an ''n'' by ''n'' matrix of all zeros
for ''i'' from 1 to ''n'':
    for ''j'' from 1 to ''n'':
        for ''k'' from 1 to ''n'':
            ''C''[''i''][''j''] = ''C''[''i''][''j''] + ''A''[''i''][''k''] * ''B''[''k''][''j'']
output ''C'' (as ''A''*''B'')
This algorithm requires $n^3$ multiplications and $n^3 - n^2$ additions of scalars for computing the product of two square $n \times n$ matrices. Its computational complexity is therefore $O(n^3)$, in a model of computation where field operations (addition and multiplication) take constant time (in practice, this is the case for floating point numbers, but not necessarily for integers).
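As a concrete reference point, the pseudocode above translates directly into Python; this is a minimal sketch using plain lists of lists (a real implementation would delegate to an optimized library):

import itertools

def schoolbook_multiply(A, B):
    # Multiply two n x n matrices exactly as in the pseudocode:
    # three nested loops, n^3 scalar multiplications.
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i, j, k in itertools.product(range(n), repeat=3):
        C[i][j] += A[i][k] * B[k][j]
    return C

# Example: a 2x2 product
print(schoolbook_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]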
Strassen's algorithm
Strassen's algorithm improves on naive matrix multiplication through a
divide-and-conquer approach. The key observation is that multiplying two $2 \times 2$ matrices can be done with only 7 multiplications, instead of the usual 8 (at the expense of 11 additional addition and subtraction operations). This means that, treating the input matrices as block matrices, the task of multiplying $2n \times 2n$ matrices can be reduced to 7 subproblems of multiplying $n \times n$ matrices. Applying this recursively gives an algorithm needing $O(n^{\log_2 7}) \approx O(n^{2.807})$ field operations.
Unlike algorithms with faster asymptotic complexity, Strassen's algorithm is used in practice. Its numerical stability is reduced compared to the naive algorithm, but it is faster in cases where $n > 100$ or so, and it appears in several libraries, such as BLAS. Fast matrix multiplication algorithms cannot achieve ''component-wise stability'', but some can be shown to exhibit ''norm-wise stability''. It is very useful for large matrices over exact domains such as finite fields, where numerical stability is not an issue.
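To make the block recursion concrete, the following is a minimal Python/NumPy sketch of Strassen's seven-product scheme for square matrices whose size is a power of two; the function name strassen and the cutoff fallback are illustrative choices rather than a library interface:

import numpy as np

def strassen(A, B, cutoff=64):
    # Multiply two square matrices whose size is a power of two
    # using Strassen's seven-product recursion; below the cutoff,
    # fall back to ordinary multiplication. A minimal sketch,
    # not a tuned library routine.
    n = A.shape[0]
    if n <= cutoff:
        return A @ B  # delegate small blocks to NumPy
    k = n // 2
    A11, A12, A21, A22 = A[:k, :k], A[:k, k:], A[k:, :k], A[k:, k:]
    B11, B12, B21, B22 = B[:k, :k], B[:k, k:], B[k:, :k], B[k:, k:]
    # Strassen's seven products of half-size matrices
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    # Recombine into the four blocks of the product
    C = np.empty_like(A)
    C[:k, :k] = M1 + M4 - M5 + M7
    C[:k, k:] = M3 + M5
    C[k:, :k] = M2 + M4
    C[k:, k:] = M1 - M3 + M4 + M6
    return C

# Quick check against NumPy on a random 256x256 instance
rng = np.random.default_rng(0)
A = rng.standard_normal((256, 256))
B = rng.standard_normal((256, 256))
assert np.allclose(strassen(A, B), A @ B)

The cutoff stops the recursion once subproblems are small enough that the overhead outweighs the saved multiplication; setting it to 1 would recurse all the way down, which is correct but slow in interpreted Python.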
Matrix multiplication exponent

The ''matrix multiplication exponent'', usually denoted $\omega$, is the smallest real number for which any two $n \times n$ matrices over a field can be multiplied together using $n^{\omega + o(1)}$ field operations. This notation is commonly used in algorithms research, so that algorithms using matrix multiplication as a subroutine have bounds on running time that can update as bounds on $\omega$ improve.
Using a naive lower bound and schoolbook matrix multiplication for the upper bound, one can straightforwardly conclude that $2 \le \omega \le 3$. Whether $\omega = 2$ is a major open question in theoretical computer science, and there is a line of research developing matrix multiplication algorithms to get improved bounds on $\omega$.
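In symbols, one standard way to state the definition (included here for concreteness) is
:$\omega = \inf\bigl\{\tau \in \mathbb{R} : M(n) = O(n^{\tau})\bigr\},$
where $M(n)$ denotes the number of field operations sufficient to multiply two $n \times n$ matrices; the trivial bounds above then read $2 \le \omega \le 3$.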
All recent algorithms in this line of research use the ''laser method'', a generalization of the Coppersmith–Winograd algorithm, which was given by
Don Coppersmith and
Shmuel Winograd in 1990 and was the best matrix multiplication algorithm until 2010.
The conceptual idea of these algorithms is similar to Strassen's algorithm: a method is devised for multiplying two $k \times k$-matrices with fewer than $k^3$ multiplications, and this technique is applied recursively. The laser method has limitations to its power: Ambainis, Filmus and Le Gall prove that it cannot be used to show that $\omega < 2.3725$ by analyzing higher and higher tensor powers of a certain identity of Coppersmith and Winograd, and that the same holds for a wide class of variants of this approach.
In 2022, Duan, Wu and Zhou devised a variant breaking the first of the two barriers with $\omega < 2.37188$. They do so by identifying a source of potential optimization in the laser method termed ''combination loss'', for which they compensate using an asymmetric version of the hashing method in the Coppersmith–Winograd algorithm. Nonetheless, the above are classical examples of galactic algorithms. By contrast, Strassen's algorithm of 1969 and Pan's algorithm of 1978, whose respective exponents are slightly above and below 2.78, have constant coefficients that make them feasible.
Group theory reformulation of matrix multiplication algorithms
Henry Cohn, Robert Kleinberg, Balázs Szegedy and Chris Umans put methods such as the Strassen and Coppersmith–Winograd algorithms in an entirely different group-theoretic context, by utilising triples of subsets of finite groups which satisfy a disjointness property called the triple product property (TPP). They also give conjectures that, if true, would imply that there are matrix multiplication algorithms with essentially quadratic complexity. This implies that the optimal exponent of matrix multiplication is 2, which most researchers believe is indeed the case. One such conjecture is that families of wreath products of Abelian groups with symmetric groups realise families of subset triples with a simultaneous version of the TPP. Several of their conjectures have since been disproven by Blasiak, Cohn, Church, Grochow, Naslund, Sawin, and Umans using the slice rank method. Further, Alon, Shpilka and Chris Umans have recently shown that some of these conjectures implying fast matrix multiplication are incompatible with another plausible conjecture, the sunflower conjecture, which in turn is related to the cap set problem.
Lower bounds for ω
There is a trivial lower bound of $\omega \ge 2$. Since any algorithm for multiplying two $n \times n$-matrices has to process all $2n^2$ entries, there is a trivial asymptotic lower bound of $\Omega(n^2)$ operations for any matrix multiplication algorithm. Thus $2 \le \omega$. It is unknown whether $\omega > 2$. The best known lower bound for matrix-multiplication complexity is $\Omega(n^2 \log n)$, for bounded coefficient arithmetic circuits over the real or complex numbers, and is due to Ran Raz.
The exponent ω is defined to be a limit point, in that it is the infimum of the exponent over all matrix multiplication algorithms. It is known that this limit point is not achieved. In other words, under the model of computation typically studied, there is no matrix multiplication algorithm that uses precisely $O(n^{\omega})$ operations; there must be an additional factor of $n^{o(1)}$.
Rectangular matrix multiplication
Similar techniques also apply to rectangular matrix multiplication. The central object of study is $\omega(k)$, which is the smallest $c$ such that one can multiply a matrix of size $n \times \lceil n^k \rceil$ with a matrix of size $\lceil n^k \rceil \times n$ with $O(n^{c + o(1)})$ arithmetic operations. A result in algebraic complexity states that multiplying matrices of size $n \times \lceil n^k \rceil$ and $\lceil n^k \rceil \times n$ requires the same number of arithmetic operations as multiplying matrices of size $n \times \lceil n^k \rceil$ and $n \times n$, and of size $n \times n$ and $n \times \lceil n^k \rceil$, so this encompasses the complexity of rectangular matrix multiplication. This generalizes the square matrix multiplication exponent, since $\omega(1) = \omega$.
Since the output of the matrix multiplication problem is size $n^2$, we have $\omega(k) \ge 2$ for all values of $k$. If one can prove for some values of $k$ between 0 and 1 that $\omega(k) = 2$, then such a result shows that $\omega(k) = 2$ for those $k$. The largest ''k'' such that $\omega(k) = 2$ is known as the ''dual matrix multiplication exponent'', usually denoted ''α''. ''α'' is referred to as the "dual" because showing that $\alpha = 1$ is equivalent to showing that $\omega = 2$. Like the matrix multiplication exponent, the dual matrix multiplication exponent sometimes appears in the complexity of algorithms in numerical linear algebra and optimization.
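In this notation, the dual exponent admits a compact restatement of the definitions above (written here for concreteness):
:$\alpha = \sup\{k \ge 0 : \omega(k) = 2\}, \qquad \alpha = 1 \iff \omega = 2.$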
The first bound on ''α'' is by Coppersmith in 1982, who showed that $\alpha > 0.172$. The current best peer-reviewed bound on ''α'' is $\alpha > 0.321334$, given by Williams, Xu, Xu, and Zhou.
Related problems
Problems that have the same asymptotic complexity as matrix multiplication include determinant, matrix inversion, and Gaussian elimination (see next section). Problems with complexity that is expressible in terms of $\omega$ include characteristic polynomial, eigenvalues (but not eigenvectors), Hermite normal form, and Smith normal form.
Matrix inversion, determinant and Gaussian elimination
In his 1969 paper, where he proved the complexity $O(n^{\log_2 7}) \approx O(n^{2.807})$ for matrix computation, Strassen proved also that matrix inversion, determinant and Gaussian elimination have, up to a multiplicative constant, the same computational complexity as matrix multiplication. The proof does not make any assumptions on the matrix multiplication algorithm that is used, except that its complexity is $O(n^{\omega})$ for some $\omega \ge 2$.
The starting point of Strassen's proof is using block matrix multiplication. Specifically, a matrix of even dimension $2n \times 2n$ may be partitioned in four $n \times n$ blocks
:$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}.$
Under this form, its inverse is
:$M^{-1} = \begin{bmatrix} A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\ -(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1} \end{bmatrix},$
provided that $A$ and its Schur complement $D - CA^{-1}B$ are invertible.
Thus, the inverse of a $2n \times 2n$ matrix may be computed with two inversions, six multiplications and four additions or additive inverses of $n \times n$ matrices. It follows that, denoting respectively by $I(n)$, $M(n)$ and $A(n)$ the number of operations needed for inverting, multiplying and adding $n \times n$ matrices, one has
:$I(2n) \le 2I(n) + 6M(n) + 4A(n).$
If $n = 2^k$, one may apply this formula recursively:
:$I(2^k) \le 2I(2^{k-1}) + 6M(2^{k-1}) + 4A(2^{k-1}) \le 2^k I(1) + 6\sum_{i=1}^{k} 2^{i-1}M(2^{k-i}) + 4\sum_{i=1}^{k} 2^{i-1}A(2^{k-i}).$
If $M(n) \le cn^{\omega}$ for some constant $c$ and $\omega > 2$ (and noting that $A(n) = n^2$), the sums are dominated by their largest terms, and one gets eventually
:$I(n) \le dn^{\omega}$
for some constant $d$.
For matrices whose dimension is not a power of two, the same complexity is reached by increasing the dimension of the matrix to a power of two, by padding the matrix with rows and columns whose entries are 1 on the diagonal and 0 elsewhere.
This proves the asserted complexity for matrices such that all submatrices that have to be inverted are indeed invertible. This complexity is thus proved for almost all matrices, as a matrix with randomly chosen entries is invertible with probability one.
The same argument applies to LU decomposition, as, if the block $A$ is invertible, the equality
:$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I & 0 \\ CA^{-1} & I \end{bmatrix} \begin{bmatrix} A & B \\ 0 & D - CA^{-1}B \end{bmatrix}$
defines a block LU decomposition that may be applied recursively to $A$ and $D - CA^{-1}B$ for getting eventually a true LU decomposition of the original matrix.
The argument applies also for the determinant, since it results from the block LU decomposition that
:$\det \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det(A)\,\det(D - CA^{-1}B).$
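To make the reduction concrete, here is a minimal Python/NumPy sketch of the block-inversion scheme above for sizes that are powers of two; the function name block_inverse and the base-case cutoff are illustrative, and a real implementation would also need to handle blocks that fail to be invertible:

import numpy as np

def block_inverse(M, cutoff=2):
    # Invert a 2^k x 2^k matrix with the block formula used in
    # Strassen's reduction: two recursive inversions plus six
    # multiplications of half-size blocks. A minimal sketch that
    # assumes A and the Schur complement are invertible throughout.
    n = M.shape[0]
    if n <= cutoff:
        return np.linalg.inv(M)
    k = n // 2
    A, B, C, D = M[:k, :k], M[:k, k:], M[k:, :k], M[k:, k:]
    Ainv = block_inverse(A, cutoff)   # inversion 1 (recursive)
    CAinv = C @ Ainv                  # multiplication 1
    S = D - CAinv @ B                 # multiplication 2; Schur complement
    Sinv = block_inverse(S, cutoff)   # inversion 2 (recursive)
    AinvB = Ainv @ B                  # multiplication 3
    TR = AinvB @ Sinv                 # multiplication 4
    BL = Sinv @ CAinv                 # multiplication 5
    TL = Ainv + TR @ CAinv            # multiplication 6
    return np.block([[TL, -TR], [-BL, Sinv]])

# Quick check on a random, well-conditioned 8x8 matrix
rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8)) + 8 * np.eye(8)
assert np.allclose(block_inverse(M) @ M, np.eye(8))

The two recursive calls and six block products mirror the operation count $I(2n) \le 2I(n) + 6M(n) + 4A(n)$ derived above.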
Minimizing number of multiplications
Related to the problem of minimizing the number of arithmetic operations is minimizing the number of multiplications, which is typically a more costly operation than addition. An $O(n^{\omega})$ algorithm for matrix multiplication necessarily uses only $O(n^{\omega})$ multiplication operations, but these algorithms are impractical. Improving from the naive $n^3$ multiplications of schoolbook multiplication, $4 \times 4$ matrices over $\mathbb{Z}/2\mathbb{Z}$ can be multiplied with 47 multiplications, and $3 \times 3$ matrix multiplication over a commutative ring can be done in 21 multiplications (23 if non-commutative). The lower bound on the number of multiplications needed is $2mn + 2n - m - 2$ (for multiplication of $n \times m$-matrices with $m \times n$-matrices using the substitution method, $m \ge n \ge 3$), which means the $n = 3$ case requires at least 19 multiplications and the $n = 4$ case at least 34. For $n = 2$, 7 multiplications and 15 additions are minimal, compared to only 4 additions for 8 multiplications in the schoolbook algorithm.
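As a quick check of the quoted figures, setting $m = n$ in the lower bound gives $2mn + 2n - m - 2 = 2n^2 + n - 2$, so
:$n = 3:\; 2 \cdot 9 + 3 - 2 = 19, \qquad n = 4:\; 2 \cdot 16 + 4 - 2 = 34.$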
See also
* Computational complexity of mathematical operations
* CYK algorithm, §Valiant's algorithm
* Freivalds' algorithm, a simple Monte Carlo algorithm that, given matrices $A$, $B$ and $C$, verifies in $\Theta(n^2)$ time if $AB = C$.
* Matrix chain multiplication
* Matrix multiplication, for abstract definitions
* Matrix multiplication algorithm, for practical implementation details
* Sparse matrix–vector multiplication
References
External links
Yet another catalogue of fast matrix multiplication algorithms
*{{cite journal |last1=Fawzi |first1=A. |last2=Balog |first2=M. |last3=Huang |first3=A. |first4=T. |last4=Hubert |first5=B. |last5=Romera-Paredes |first6=M. |last6=Barekatain |first7=A. |last7=Novikov |first8=F.J.R. |last8=Ruiz |first9=J. |last9=Schrittwieser |first10=G. |last10=Swirszcz |first11=D. |last11=Silver |first12=D. |last12=Hassabis |first13=P. |last13=Kohli |title=Discovering faster matrix multiplication algorithms with reinforcement learning |journal=Nature |volume=610 |issue=7930 |pages=47–53 |date=2022 |doi=10.1038/s41586-022-05172-4 |pmid=36198780 |pmc=9534758 |bibcode=2022Natur.610...47F}}