In mathematics, specifically differential calculus, the inverse function theorem gives a sufficient condition for a function to be invertible in a neighborhood of a point in its domain: namely, that its derivative is continuous and non-zero at the point. The theorem also gives a formula for the derivative of the inverse function.
In multivariable calculus, this theorem can be generalized to any continuously differentiable, vector-valued function whose Jacobian determinant is nonzero at a point in its domain, giving a formula for the Jacobian matrix of the inverse. There are also versions of the inverse function theorem for complex holomorphic functions, for differentiable maps between manifolds, for differentiable functions between Banach spaces, and so forth.
The theorem was first established by Picard and Goursat using an iterative scheme: the basic idea is to prove a fixed point theorem using the contraction mapping theorem.
Statements
For functions of a single variable, the theorem states that if $f$ is a continuously differentiable function with nonzero derivative at the point $a$, then $f$ is injective (or bijective onto the image) in a neighborhood of $a$, the inverse is continuously differentiable near $b = f(a)$, and the derivative of the inverse function at $b$ is the reciprocal of the derivative of $f$ at $a$:
: $(f^{-1})'(b) = \frac{1}{f'(a)} = \frac{1}{f'(f^{-1}(b))}.$
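The reciprocal formula can be checked numerically. The sketch below is only an illustration: the function $f(x) = x^3 + x$, the point $a = 1$, and the bisection-based helper `f_inverse` are assumptions chosen for the example, not part of the theorem.

```python
def f(x):
    # Illustrative choice (an assumption): f'(x) = 3x^2 + 1 > 0, so f is increasing.
    return x**3 + x

def f_prime(x):
    return 3 * x**2 + 1

def f_inverse(y, lo=-10.0, hi=10.0, iterations=200):
    """Invert f by bisection; valid because f is increasing on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = 1.0
b = f(a)
step = 1e-6
# Central-difference approximation of the derivative of the inverse at b.
numeric = (f_inverse(b + step) - f_inverse(b - step)) / (2 * step)
# Value predicted by the theorem: the reciprocal of f'(a).
predicted = 1.0 / f_prime(a)

print(numeric, predicted)
```

The two printed numbers should agree up to the accuracy of the finite-difference approximation.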
It can happen that a function $f$ is injective near a point $a$ while $f'(a) = 0$. An example is $f(x) = (x - a)^3$. In fact, for such a function, the inverse cannot be differentiable at $b = f(a)$, since if $f^{-1}$ were differentiable at $b$, then, by the chain rule, $1 = (f^{-1} \circ f)'(a) = (f^{-1})'(b) f'(a)$, which implies $f'(a) \ne 0$. (The situation is different for holomorphic functions; see #Holomorphic inverse function theorem below.)
For functions of more than one variable, the theorem states that if $f$ is a continuously differentiable function from an open set $A$ of $\mathbb{R}^n$ into $\mathbb{R}^n$, and the derivative $f'(a)$ is invertible at a point $a$ (that is, the determinant of the Jacobian matrix of $f$ at $a$ is non-zero), then there exist neighborhoods $U$ of $a$ in $A$ and $V$ of $b = f(a)$ such that $f(U) \subset V$ and $f : U \to V$ is bijective. [Theorem 1.1.7.] Writing $f = (f_1, \ldots, f_n)$, this means that the system of $n$ equations $y_i = f_i(x_1, \dots, x_n)$ has a unique solution for $x_1, \dots, x_n$ in terms of $y_1, \dots, y_n$ when $x \in U$, $y \in V$. Note that the theorem does not say $f$ is bijective onto the image of the whole set where $f'$ is invertible (where the determinant of the Jacobian matrix is nonzero), only that it is locally bijective where $f'$ is invertible.
Moreover, the theorem says that the inverse function $f^{-1} : V \to U$ is continuously differentiable, and its derivative at $b = f(a)$ is the inverse map of $f'(a)$; i.e.,
: $(f^{-1})'(b) = f'(a)^{-1}.$
In other words, if $Jf^{-1}(b), Jf(a)$ are the Jacobian matrices representing $(f^{-1})'(b), f'(a)$, this means:
: $Jf^{-1}(b) = Jf(a)^{-1}.$
The hard part of the theorem is the existence and differentiability of $f^{-1}$. Assuming this, the inverse derivative formula follows from the chain rule applied to $f \circ f^{-1} = I$. (Indeed, $I = (f \circ f^{-1})'(b) = f'(f^{-1}(b)) \circ (f^{-1})'(b)$.) Since taking the inverse of a matrix is infinitely differentiable, the formula for the derivative of the inverse shows that if $f$ is continuously $k$ times differentiable, with invertible derivative at the point $a$, then the inverse is also continuously $k$ times differentiable. Here $k$ is a positive integer or $\infty$.
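As a concrete check of the matrix identity $Jf^{-1}(b) = Jf(a)^{-1}$, the sketch below uses the polar-coordinate map $(r, \theta) \mapsto (r\cos\theta, r\sin\theta)$, whose local inverse is known in closed form. The choice of map, the point $a = (2, 0.7)$, and the helper names are assumptions made for illustration only.

```python
import math

def jac_f(r, theta):
    # Jacobian of f(r, theta) = (r cos(theta), r sin(theta)).
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def jac_g(x, y):
    # Jacobian of the local inverse g(x, y) = (sqrt(x^2 + y^2), atan2(y, x)).
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def matmul2(p, q):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a = (2.0, 0.7)                                      # point in (r, theta) coordinates, r > 0
b = (a[0] * math.cos(a[1]), a[0] * math.sin(a[1]))  # b = f(a)

# If Jg(b) is the inverse of Jf(a), this product is the identity matrix.
product = matmul2(jac_g(*b), jac_f(*a))
print(product)
```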
There are two variants of the inverse function theorem. Given a continuously differentiable map $f : U \to \mathbb{R}^m$ on an open set $U \subset \mathbb{R}^n$ and a point $a \in U$, the first is
*The derivative $f'(a)$ is surjective (i.e., the Jacobian matrix representing it has rank $m$) if and only if there exists a continuously differentiable function $g$ on a neighborhood $V$ of $b = f(a)$ such that $f \circ g = I$ near $b$.
and the second is
*The derivative $f'(a)$ is injective if and only if there exists a continuously differentiable function $g$ on a neighborhood $V$ of $b = f(a)$ such that $g \circ f = I$ near $a$.
In the first case (when $f'(a)$ is surjective), the point $b = f(a)$ is called a regular value. By the rank-nullity theorem, $\dim \ker(f'(a)) + \dim \operatorname{im}(f'(a))$ equals the dimension of the domain, so when the domain and the target have the same dimension, the first case is equivalent to saying that $b = f(a)$ is not the image of a critical point (a critical point is a point $a$ such that the kernel of $f'(a)$ is nonzero). The statement in the first case is sometimes also called the submersion theorem.
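These variants are restatements of the inverse function theorem. Indeed, in the first case when $f'(a)$ is surjective, we can find an (injective) linear map $T : \mathbb{R}^m \to \mathbb{R}^n$ such that $f'(a) \circ T = I$. Define $h(x) = a + Tx$ so that we have:
: $(f \circ h)'(0) = f'(a) \circ T = I.$
Thus, by the inverse function theorem, $f \circ h$ has an inverse near $0$; i.e., $f \circ h \circ (f \circ h)^{-1} = I$ near $b$, so $g = h \circ (f \circ h)^{-1}$ works. The second case ($f'(a)$ is injective) is seen in a similar way.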
Example
Consider the vector-valued function $F : \mathbb{R}^2 \to \mathbb{R}^2$ defined by:
: $F(x, y) = \begin{bmatrix} e^x \cos y \\ e^x \sin y \end{bmatrix}.$
The Jacobian matrix is:
: $J_F(x, y) = \begin{bmatrix} e^x \cos y & -e^x \sin y \\ e^x \sin y & e^x \cos y \end{bmatrix}$
with Jacobian determinant:
: $\det J_F(x, y) = e^{2x} \cos^2 y + e^{2x} \sin^2 y = e^{2x}.$
The determinant $e^{2x}$ is nonzero everywhere. Thus the theorem guarantees that, for every point $p$ in $\mathbb{R}^2$, there exists a neighborhood of $p$ over which $F$ is invertible. This does not mean $F$ is invertible over its entire domain: in this case $F$ is not even injective since it is periodic: $F(x, y) = F(x, y + 2\pi)$.
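Both claims in this example, that the Jacobian determinant equals $e^{2x}$ and that the map is periodic in $y$, can be spot-checked numerically. The following is a minimal sketch; the sample points are arbitrary choices made for illustration.

```python
import math

def F(x, y):
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def jacobian_det(x, y):
    # Determinant of the 2x2 Jacobian matrix of F, written out entry by entry.
    ex = math.exp(x)
    j11, j12 = ex * math.cos(y), -ex * math.sin(y)
    j21, j22 = ex * math.sin(y),  ex * math.cos(y)
    return j11 * j22 - j12 * j21

# Arbitrary sample points (assumed for illustration).
points = [(-1.0, 0.3), (0.0, 2.0), (1.5, -4.0)]
det_errors = [abs(jacobian_det(x, y) - math.exp(2 * x)) for x, y in points]

# Two distinct points with the same image: F is locally but not globally injective.
p = (0.5, 1.0)
q = (0.5, 1.0 + 2 * math.pi)
period_gap = max(abs(u - v) for u, v in zip(F(*p), F(*q)))

print(det_errors, period_gap)
```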
Counter-example
If one drops the assumption that the derivative is continuous, the function no longer need be invertible. For example $f(x) = x + 2x^2 \sin(\tfrac{1}{x})$ with $f(0) = 0$ has discontinuous derivative $f'(x) = 1 - 2\cos(\tfrac{1}{x}) + 4x \sin(\tfrac{1}{x})$ and $f'(0) = 1$, which vanishes arbitrarily close to $x = 0$. These critical points are local max/min points of $f$, so $f$ is not one-to-one (and not invertible) on any interval containing $x = 0$. Intuitively, the slope $f'(0) = 1$ does not propagate to nearby points, where the slopes are governed by a weak but rapid oscillation.
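The sign changes of the derivative near the origin can be observed directly. The sketch below evaluates the derivative formula at the points $x = 1/(2\pi n)$ and $x = 1/((2n+1)\pi)$, which are sample points chosen here for illustration, and also estimates the slope at $0$ by a difference quotient.

```python
import math

def f(x):
    return 0.0 if x == 0 else x + 2 * x * x * math.sin(1 / x)

def f_prime(x):
    # Derivative formula, valid for x != 0.
    return 1 - 2 * math.cos(1 / x) + 4 * x * math.sin(1 / x)

ns = (1, 10, 100, 1000)
# At x = 1/(2 pi n): cos(1/x) = 1 and sin(1/x) = 0, so the slope is about -1.
negative_slopes = [f_prime(1 / (2 * math.pi * n)) for n in ns]
# At x = 1/((2n + 1) pi): cos(1/x) = -1 and sin(1/x) = 0, so the slope is about 3.
positive_slopes = [f_prime(1 / ((2 * n + 1) * math.pi)) for n in ns]

# Difference quotients at 0 approach f'(0) = 1.
quotients_at_zero = [(f(h) - f(0)) / h for h in (1e-2, 1e-4, 1e-6)]

print(negative_slopes, positive_slopes, quotients_at_zero)
```

Since the slope is negative at points arbitrarily close to $0$ and positive at others, the function is not monotone, hence not injective, on any interval around $0$.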
Methods of proof
As an important result, the inverse function theorem has been given numerous proofs. The proof most commonly seen in textbooks relies on the contraction mapping principle, also known as the Banach fixed-point theorem (which can also be used as the key step in the proof of existence and uniqueness of solutions to ordinary differential equations).
Since the fixed point theorem applies in infinite-dimensional (Banach space) settings, this proof generalizes immediately to the infinite-dimensional version of the inverse function theorem (see Generalizations below).
An alternate proof in finite dimensions hinges on the extreme value theorem for functions on a compact set.
Yet another proof uses Newton's method, which has the advantage of providing an effective version of the theorem: bounds on the derivative of the function imply an estimate of the size of the neighborhood on which the function is invertible.
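To give a feel for the Newton's method viewpoint, the sketch below computes a local inverse numerically by Newton iteration. It is not the proof itself: the map $F(x, y) = (e^x \cos y, e^x \sin y)$, the starting point, the target, and the helper name `local_inverse` are all assumptions chosen for illustration.

```python
import math

def F(x, y):
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def jacobian(x, y):
    ex = math.exp(x)
    return ((ex * math.cos(y), -ex * math.sin(y)),
            (ex * math.sin(y),  ex * math.cos(y)))

def local_inverse(target, start, steps=20):
    """Solve F(x, y) = target by Newton's method, starting from `start`."""
    x, y = start
    for _ in range(steps):
        fx, fy = F(x, y)
        rx, ry = target[0] - fx, target[1] - fy
        (a, b), (c, d) = jacobian(x, y)
        det = a * d - b * c
        # Solve J * (dx, dy) = (rx, ry) using the explicit 2x2 inverse.
        dx = (d * rx - b * ry) / det
        dy = (-c * rx + a * ry) / det
        x, y = x + dx, y + dy
    return x, y

start = (0.0, 0.5)        # base point near which F is inverted
target = F(0.1, 0.6)      # a value close to F(start)
recovered = local_inverse(target, start)
print(recovered)
```

The iteration is only expected to converge when the target is close enough to the image of the starting point, which mirrors the local nature of the theorem.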
A proof using successive approximation
To prove existence, it can be assumed after an affine transformation that $f(0) = 0$ and $f'(0) = I$, so that $a = b = 0$.
By the fundamental theorem of calculus, if $u$ is a $C^1$ function, then $u(1) - u(0) = \int_0^1 u'(t)\,dt$, so that $\|u(1) - u(0)\| \le \sup_{0 \le t \le 1} \|u'(t)\|$. Setting $u(t) = f(x + t(x' - x)) - x - t(x' - x)$, it follows that
: $\|f(x) - f(x') - x + x'\| \le \|x - x'\| \sup_{0 \le t \le 1} \|f'(x + t(x' - x)) - I\|.$
Now choose $\delta > 0$ so that $\|f'(x) - I\| < \tfrac{1}{2}$ for $\|x\| < \delta$. Suppose that $\|y\| < \delta/2$ and define $x_n$ inductively by $x_0 = 0$ and $x_{n+1} = x_n + y - f(x_n)$. The assumptions show that if $\|x\|, \|x'\| < \delta$ then
: $\|f(x) - f(x') - x + x'\| \le \|x - x'\|/2$.
In particular $f(x) = f(x')$ implies $x = x'$. In the inductive scheme $\|x_n\| < \delta$ and $\|x_{n+1} - x_n\| < \delta/2^{n+1}$. Thus $(x_n)$ is a Cauchy sequence tending to some $x$ with $\|x\| < \delta$. By construction $f(x) = y$ as required.
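The inductive scheme $x_{n+1} = x_n + y - f(x_n)$ can be run directly. The sketch below uses an illustrative map on $\mathbb{R}^2$ with $f(0) = 0$ and $f'(0) = I$; the map, the value of $y$, and the number of steps are assumptions. For this map $\|f'(x) - I\| < 1/2$ whenever $\|x\| < 1$, so $\delta = 1$ is admissible.

```python
import math

def f(x):
    # Illustrative map (an assumption): f(0) = 0, f'(0) = I, and
    # f'(x) - I = [[0, v/2], [u/2, 0]] has operator norm max(|u|, |v|) / 2.
    u, v = x
    return (u + v * v / 4, v + u * u / 4)

def solve(y, steps=60):
    """Successive approximation x_{n+1} = x_n + y - f(x_n), starting at x_0 = 0."""
    x = (0.0, 0.0)
    for _ in range(steps):
        fx = f(x)
        x = (x[0] + y[0] - fx[0], x[1] + y[1] - fx[1])
    return x

y = (0.3, -0.2)           # satisfies ||y|| < delta / 2 with delta = 1
x = solve(y)
residual = math.hypot(f(x)[0] - y[0], f(x)[1] - y[1])
print(x, residual)
```

Each step at least halves the distance between consecutive iterates, so the residual should shrink to rounding level well within the chosen number of steps.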
To check that $g = f^{-1}$ is $C^1$, write $g(y + k) = x + h$ so that $f(x + h) = f(x) + k$. By the inequalities above, $\|h - k\| \le \|h\|/2$ so that $\|h\|/2 \le \|k\| \le 2\|h\|$.
On the other hand if $A = f'(x)$, then $\|A - I\| < 1/2$. Using the geometric series for $B = I - A$, it follows that $\|A^{-1}\| < 2$. But then
: $\frac{\|g(y + k) - g(y) - f'(g(y))^{-1} k\|}{\|k\|} = \frac{\|h - f'(x)^{-1}[f(x + h) - f(x)]\|}{\|k\|} \le 4 \frac{\|f(x + h) - f(x) - f'(x) h\|}{\|h\|}$
tends to 0 as $k$ and $h$ tend to 0, proving that $g$ is $C^1$ with $g'(y) = f'(g(y))^{-1}$.
The proof above is presented for a finite-dimensional space, but applies equally well for Banach spaces. If an invertible function $f$ is $C^k$ with $k > 1$, then so too is its inverse. This follows by induction using the fact that the map $F(A) = A^{-1}$ on operators is $C^k$ for any $k$ (in the finite-dimensional case this is an elementary fact because the inverse of a matrix is given as the adjugate matrix divided by its determinant).
The method of proof here can be found in the books of Henri Cartan, Jean Dieudonné, Serge Lang, Roger Godement and Lars Hörmander.
A proof using the contraction mapping principle
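Here is a proof based on the contraction mapping theorem. Specifically, following T. Tao, it uses the following consequence of the contraction mapping theorem.
Lemma: Let $B(0, r)$ denote the open ball of radius $r$ in $\mathbb{R}^n$ centered at 0. If $g : B(0, r) \to \mathbb{R}^n$ is a map such that $g(0) = 0$ and there is a constant $0 < c < 1$ with $|g(y) - g(x)| \le c|y - x|$ for all $x, y$ in $B(0, r)$, then $f = I + g$ is injective on $B(0, r)$ and $B(0, (1 - c)r) \subset f(B(0, r)) \subset B(0, (1 + c)r)$.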
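Basically, the lemma says that a small perturbation of the identity map by a contraction map is injective and preserves a ball in some sense. Assuming the lemma for a moment, we prove the theorem first. As in the above proof, it is enough to prove the special case when $a = 0$, $b = 0$ and $f'(0) = I$. Let $g = f - I$. The mean value inequality applied to $t \mapsto g(x + t(y - x))$ says:
: $|g(y) - g(x)| \le |y - x| \sup_{0 < t < 1} |g'(x + t(y - x))|.$
Since $g'(0) = I - I = 0$ and $g'$ is continuous, we can find an $r > 0$ such that
: $|g(y) - g(x)| \le 2^{-1}|y - x|$
for all $x, y$ in $B(0, r)$. Then the lemma says that $f = g + I$ is injective on $B(0, r)$ and $B(0, r/2) \subset f(B(0, r))$. Then
: $f : U = B(0, r) \cap f^{-1}(B(0, r/2)) \to V = B(0, r/2)$
is bijective and thus has an inverse. Next, we show the inverse $f^{-1}$ is continuously differentiable (this part of the argument is the same as that in the previous proof). This time, let $g = f^{-1}$ denote the inverse of $f$ and $A = f'(x)$. For $x = g(y)$, we write $g(y + k) = x + h$ or $y + k = f(x + h)$. Now, by the earlier estimate, we have
: $|h - k| = |f(x + h) - f(x) - h| \le |h|/2$
and so $|h|/2 \le |k|$. Writing $\|\cdot\|$ for the operator norm,
: $|g(y + k) - g(y) - A^{-1}k| = |h - A^{-1}(f(x + h) - f(x))| \le \|A^{-1}\| |Ah - f(x + h) + f(x)|.$
As $k \to 0$, we have $h \to 0$ and $|h|/|k|$ is bounded. Hence, $g$ is differentiable at $y$ with the derivative $g'(y) = f'(g(y))^{-1}$. Also, $g'$ is the same as the composition $\iota \circ f' \circ g$ where $\iota : T \mapsto T^{-1}$; so $g'$ is continuous.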
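It remains to show the lemma. Here $f = I + g$ on $B(0, r)$, where $g(0) = 0$ and $g$ has Lipschitz constant $0 < c < 1$. First, the map $f$ is injective on $B(0, r)$ since if $f(x) = f(y)$, then $g(y) - g(x) = x - y$ and so
: $|g(y) - g(x)| = |y - x|$,
which is a contradiction unless $y = x$. (This part does not need the assumption $g(0) = 0$.) Next we show $f(B(0, r)) \supset B(0, (1 - c)r)$. The idea is to note that this is equivalent to, given a point $y$ in $B(0, (1 - c)r)$, finding a fixed point of the map
: $F : \overline{B}(0, r') \to \overline{B}(0, r'), \quad x \mapsto y - g(x)$
where $0 < r' < r$ is such that $|y| \le (1 - c)r'$ and the bar means a closed ball. To find a fixed point, we use the contraction mapping theorem; checking that $F$ is a well-defined strict contraction mapping is straightforward. Finally, we have $f(B(0, r)) \subset B(0, (1 + c)r)$ since
: $|f(x)| = |x + g(x) - g(0)| \le (1 + c)|x|.$
As may be clear, this proof is not substantially different from the previous one, as the proof of the contraction mapping theorem is by successive approximation.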
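The ball inclusions established in the lemma can be sanity-checked in one dimension. In the sketch below the perturbation $g(x) = c \sin x$, the constants $c = 0.5$ and $r = 1$, and the helper `preimage` are assumptions chosen for illustration; the helper iterates the map $x \mapsto y - g(x)$ used in the fixed-point argument.

```python
import math

c, r = 0.5, 1.0

def g(x):
    # Illustrative contraction (an assumption): g(0) = 0 and |g(x) - g(y)| <= c |x - y|.
    return c * math.sin(x)

def f(x):
    return x + g(x)

def preimage(y, steps=200):
    """Find x with f(x) = y as a fixed point of x -> y - g(x)."""
    x = 0.0
    for _ in range(steps):
        x = y - g(x)
    return x

# Grid of points inside the open ball B(0, r).
grid = [-r + 2 * r * i / 200 for i in range(1, 200)]
# Upper inclusion: |f(x)| <= (1 + c) |x| on the ball.
image_bound_ok = all(abs(f(x)) <= (1 + c) * abs(x) + 1e-15 for x in grid)

# Lower inclusion: every y in B(0, (1 - c) r) has a preimage in B(0, r).
targets = [(1 - c) * r * t for t in (-0.99, -0.5, 0.0, 0.5, 0.99)]
solutions = [preimage(y) for y in targets]

print(image_bound_ok, solutions)
```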
Applications
Implicit function theorem
The inverse function theorem can be used to solve a system of equations
: $f_1(x) = y_1, \ \ldots, \ f_n(x) = y_n,$
i.e., expressing $x = (x_1, \dots, x_n)$ as functions of $y_1, \dots, y_n$, provided the Jacobian matrix is invertible. The implicit function theorem allows one to solve a more general system of equations:
: $f_1(x, y) = 0, \ \ldots, \ f_n(x, y) = 0$
for $y$ in terms of $x$. Though more general, the theorem is actually a consequence of the inverse function theorem. First, the precise statement of the implicit function theorem is as follows:
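*given a map $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$, if $f(a, b) = 0$, $f$ is continuously differentiable in a neighborhood of $(a, b)$ and the derivative of $y \mapsto f(a, y)$ at $b$ is invertible, then there exists a differentiable map $g : U \to V$ for some neighborhoods $U, V$ of $a, b$ such that $f(x, g(x)) = 0$. Moreover, if $f(x, y) = 0$, $x \in U$, $y \in V$, then $y = g(x)$; i.e., $g(x)$ is a unique solution.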
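To see this, consider the map $F(x, y) = (x, f(x, y))$. By the inverse function theorem, $F : U \times V \to W$ has the inverse $G$ for some neighborhoods $U, V, W$. We then have:
: $(x, 0) = F(G(x, 0)) = F(G_1(x, 0), G_2(x, 0)) = (G_1(x, 0), f(G_1(x, 0), G_2(x, 0))),$
implying $x = G_1(x, 0)$ and $f(x, g(x)) = 0$ with $g(x) = G_2(x, 0)$. Thus $g$ has the required property.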
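For a concrete instance of an implicitly defined function, the sketch below takes the circle equation $f(x, y) = x^2 + y^2 - 1 = 0$ near the point $(0.6, 0.8)$, solves for $y$ by Newton's method in the $y$ variable, and compares the slope of the solution with the standard implicit-differentiation value $-f_x / f_y$. The equation, the base point, and the solver are assumptions made for illustration.

```python
def f(x, y):
    return x * x + y * y - 1.0

def f_y(x, y):
    # Partial derivative of f with respect to y; nonzero near y = 0.8.
    return 2.0 * y

def g(x, y_start=0.8, steps=50):
    """Solve f(x, y) = 0 for y near y_start by Newton's method in y."""
    y = y_start
    for _ in range(steps):
        y -= f(x, y) / f_y(x, y)
    return y

a, b = 0.6, 0.8
h = 1e-6
# Central-difference slope of the implicitly defined function at a.
slope = (g(a + h) - g(a - h)) / (2 * h)
# Implicit differentiation: g'(a) = -f_x(a, b) / f_y(a, b).
predicted = -(2 * a) / (2 * b)

print(g(a), slope, predicted)
```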
Giving a manifold structure
In differential geometry, the inverse function theorem is used to show that the pre-image of a regular value under a smooth map is a manifold. Indeed, let $f : U \to \mathbb{R}^r$ be such a smooth map from an open subset of $\mathbb{R}^n$ (since the result is local, there is no loss of generality in considering such a map). Fix a point $a$ in $f^{-1}(b)$ and then, by permuting the coordinates on $\mathbb{R}^n$, assume the matrix $\left[ \frac{\partial f_i}{\partial x_j}(a) \right]_{1 \le i, j \le r}$ has rank $r$. Then the map $F : U \to \mathbb{R}^r \times \mathbb{R}^{n-r} = \mathbb{R}^n, \ x \mapsto (f(x), x_{r+1}, \dots, x_n)$ is such that $F'(a)$ has rank $n$. Hence, by the inverse function theorem, we find the smooth inverse $G$ of $F$ defined in a neighborhood $V \times W$ of $(b, a_{r+1}, \dots, a_n)$. We then have
: $x = (F \circ G)(x) = (f(G(x)), G_{r+1}(x), \dots, G_n(x)),$
which implies
: $(f \circ G)(x_1, \dots, x_n) = (x_1, \dots, x_r).$
That is, after the change of coordinates by $G$, $f$ is a coordinate projection (this fact is known as the submersion theorem). Moreover, since $G : V \times W \to U' = G(V \times W)$ is bijective, the map
: $g = G(b, \cdot) : W \to f^{-1}(b) \cap U', \quad (x_{r+1}, \dots, x_n) \mapsto G(b, x_{r+1}, \dots, x_n)$
is bijective with a smooth inverse. That is to say, $g$ gives a local parametrization of $f^{-1}(b)$ around $a$. Hence, $f^{-1}(b)$ is a manifold.
(Note the proof is quite similar to the proof of the implicit function theorem and, in fact, the implicit function theorem can also be used instead.)
More generally, the theorem shows that if a smooth map $f : P \to E$ is transversal to a submanifold $M \subset E$, then the pre-image $f^{-1}(M) \hookrightarrow P$ is a submanifold.
Global version
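The inverse function theorem is a local result; it applies to each point. A priori, the theorem thus only shows the function $f$ is locally bijective (or locally diffeomorphic of some class). The next topological lemma can be used to upgrade local injectivity to injectivity that is global to some extent.
Lemma: Let $A$ be a closed subset of a topological space $X$ that admits an exhaustion by compact subsets (for example, a second-countable manifold), and let $f : X \to Z$ be a local homeomorphism that is injective on $A$. Then $f$ is injective on some neighborhood of $A$.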
Proof: First assume $X$ is compact. If the conclusion of the lemma is false, we can find two sequences $x_i \ne y_i$ such that $f(x_i) = f(y_i)$ and $x_i, y_i$ each converge to some points $x, y$ in $A$. Since $f$ is injective on $A$, $x = y$. Now, if $i$ is large enough, $x_i, y_i$ are in a neighborhood of $x = y$ where $f$ is injective; thus, $x_i = y_i$, a contradiction.
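In general, consider the set $E = \{ (x, y) \in X^2 \mid x \ne y, f(x) = f(y) \}$. It is disjoint from $S \times S$ for any subset $S \subset X$ where $f$ is injective. Let $X_1 \subset X_2 \subset \cdots$ be an increasing sequence of compact subsets with union $X$ and with $X_i$ contained in the interior of $X_{i+1}$. Then, by the first part of the proof, for each $i$, we can find a neighborhood $U_i$ of $A \cap X_i$ such that $U_i^2 \subset X^2 - E$. Then a neighborhood $U$ of $A$ built from the $U_i$ has the required property.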
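The lemma implies the following (a sort of) global version of the inverse function theorem: let $f : U \to V$ be a continuously differentiable (or $C^k$) map between open subsets of $\mathbb{R}^n$, or more generally of manifolds. If $f$ is injective on a closed subset $A \subset U$ and the Jacobian matrix of $f$ is invertible at each point of $A$, then $f$ is injective on a neighborhood $A'$ of $A$ and $f^{-1} : f(A') \to A'$ is continuously differentiable (or $C^k$).
Note that if $A$ is a point, then the above is the usual inverse function theorem.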
Holomorphic inverse function theorem
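There is a version of the inverse function theorem for holomorphic maps: let $U, V \subset \mathbb{C}^n$ be open subsets with $0 \in U$ and $f : U \to V$ a holomorphic map whose Jacobian matrix in the variables $z_i$ is invertible (the determinant is nonzero) at $0$. Then $f$ is injective in some neighborhood $W$ of $0$ and the inverse $f^{-1} : f(W) \to W$ is holomorphic.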
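The theorem follows from the usual inverse function theorem. Indeed, let $J_{\mathbb{R}}(f)$ denote the Jacobian matrix of $f$ in the real variables $x_i, y_i$ and $J(f)$ the complex Jacobian matrix in the variables $z_j$. Then we have $\det J_{\mathbb{R}}(f) = |\det J(f)|^2$, which is nonzero by assumption. Hence, by the usual inverse function theorem, $f$ is injective near $0$ with continuously differentiable inverse. By the chain rule, with $w = f(z)$,
: $\frac{\partial}{\partial \overline{z}_l} (f_j^{-1} \circ f)(z) = \sum_k \frac{\partial f_j^{-1}}{\partial w_k}(w) \frac{\partial f_k}{\partial \overline{z}_l}(z) + \sum_k \frac{\partial f_j^{-1}}{\partial \overline{w}_k}(w) \frac{\partial \overline{f}_k}{\partial \overline{z}_l}(z)$
where the left-hand side and the first term on the right vanish since $f_j^{-1} \circ f$ and $f_k$ are holomorphic. Since the matrix $\left[ \partial \overline{f}_k / \partial \overline{z}_l \right]$ is the complex conjugate of $J(f)$ and hence invertible, it follows that $\frac{\partial f_j^{-1}}{\partial \overline{w}_k}(w) = 0$ for each $k$.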
Similarly, there is the implicit function theorem for holomorphic functions.
As already noted earlier, it can happen that an injective smooth function has an inverse that is not smooth (e.g., $f(x) = x^3$ in a real variable). This is not the case for holomorphic functions: if $f : U \to V$ is an injective holomorphic map between open subsets of $\mathbb{C}^n$, then $f^{-1} : f(U) \to U$ is holomorphic.
Formulations for manifolds
The inverse function theorem can be rephrased in terms of differentiable maps between differentiable manifolds. In this context the theorem states that for a differentiable map $F : M \to N$ (of class $C^1$), if the differential of $F$,
: $dF_p : T_p M \to T_{F(p)} N$
is a linear isomorphism at a point $p$ in $M$ then there exists an open neighborhood $U$ of $p$ such that
: $F|_U : U \to F(U)$
is a diffeomorphism. Note that this implies that the connected components of $M$ and $N$ containing $p$ and $F(p)$ have the same dimension, as is already directly implied from the assumption that $dF_p$ is an isomorphism.
If the derivative of $F$ is an isomorphism at all points $p$ in $M$ then the map $F$ is a local diffeomorphism.
Generalizations
Banach spaces
The inverse function theorem can also be generalized to differentiable maps between Banach spaces $X$ and $Y$. Let $U$ be an open neighbourhood of the origin in $X$ and $F : U \to Y$ a continuously differentiable function, and assume that the Fréchet derivative $dF_0 : X \to Y$ of $F$ at 0 is a bounded linear isomorphism of $X$ onto $Y$. Then there exists an open neighbourhood $V$ of $F(0)$ in $Y$ and a continuously differentiable map $G : V \to X$ such that $F(G(y)) = y$ for all $y$ in $V$. Moreover, $G(y)$ is the only sufficiently small solution $x$ of the equation $F(x) = y$.
There is also the inverse function theorem for Banach manifolds.
Constant rank theorem
The inverse function theorem (and the implicit function theorem) can be seen as a special case of the constant rank theorem, which states that a smooth map with constant rank near a point can be put in a particular normal form near that point.
Specifically, if $F : M \to N$ has constant rank near a point $p \in M$, then there are open neighborhoods $U$ of $p$ and $V$ of $F(p)$ and there are diffeomorphisms $u : T_p M \to U$ and $v : T_{F(p)} N \to V$ such that $F(U) \subseteq V$ and such that the derivative $dF_p : T_p M \to T_{F(p)} N$ is equal to $v^{-1} \circ F \circ u$. That is, $F$ "looks like" its derivative near $p$. The set of points $p \in M$ such that the rank is constant in a neighborhood of $p$ is an open dense subset of $M$; this is a consequence of semicontinuity of the rank function. Thus the constant rank theorem applies to a generic point of the domain.
When the derivative of $F$ is injective (resp. surjective) at a point $p$, it is also injective (resp. surjective) in a neighborhood of $p$, and hence the rank of $F$ is constant on that neighborhood, and the constant rank theorem applies.
Polynomial functions
If it is true, the Jacobian conjecture would be a variant of the inverse function theorem for polynomials. It states that if a vector-valued polynomial function has a Jacobian determinant that is an invertible polynomial (that is, a nonzero constant), then it has an inverse that is also a polynomial function. It is unknown whether this is true or false, even in the case of two variables. This is a major open problem in the theory of polynomials.
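For orientation, here is one polynomial map for which the conclusion is easy to verify by hand: its Jacobian determinant is the constant 1 and its inverse is again polynomial. The particular map is an assumption chosen for illustration, and a single example of course says nothing about the conjecture in general. Integer inputs are used so that all checks are exact.

```python
def F(x, y):
    # Illustrative polynomial map (an assumption): F(x, y) = (x + s^2, y - s^2), s = x + y.
    s = x + y
    return (x + s * s, y - s * s)

def G(u, v):
    # Candidate polynomial inverse: note that u + v = x + y is preserved by F.
    s = u + v
    return (u - s * s, v + s * s)

def jacobian_det(x, y):
    # Jacobian of F is [[1 + 2s, 2s], [-2s, 1 - 2s]] with s = x + y.
    s = x + y
    return (1 + 2 * s) * (1 - 2 * s) - (2 * s) * (-2 * s)

sample = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
dets = {jacobian_det(x, y) for x, y in sample}
round_trips_ok = all(G(*F(x, y)) == (x, y) and F(*G(x, y)) == (x, y)
                     for x, y in sample)

print(dets, round_trips_ok)
```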
Selections
When $f : \mathbb{R}^n \to \mathbb{R}^m$ with $m \le n$, $f$ is $k$ times continuously differentiable, and the Jacobian $A = \nabla f(\overline{x})$ at a point $\overline{x}$ is of rank $m$, the inverse of $f$ may not be unique. However, there exists a local selection function $s$ such that $f(s(y)) = y$ for all $y$ in a neighborhood of $\overline{y} = f(\overline{x})$, $s(\overline{y}) = \overline{x}$, $s$ is $k$ times continuously differentiable in this neighborhood, and $\nabla s(\overline{y}) = A^T (A A^T)^{-1}$ ($\nabla s(\overline{y})$ is the Moore–Penrose pseudoinverse of $A$).
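A small worked instance may help. In the sketch below $f(x_1, x_2) = x_1^2 + x_2^2$ maps $\mathbb{R}^2$ to $\mathbb{R}$, the base point is $\overline{x} = (1, 1)$, and the selection $s(y) = (\sqrt{y/2}, \sqrt{y/2})$ is one possible right inverse; all three are assumptions chosen for illustration. The code compares a finite-difference derivative of $s$ with $A^T (A A^T)^{-1}$.

```python
import math

def f(x1, x2):
    return x1 * x1 + x2 * x2

def s(y):
    # One possible selection (an assumption): the point on the diagonal with f = y.
    t = math.sqrt(y / 2.0)
    return (t, t)

x_bar = (1.0, 1.0)
y_bar = f(*x_bar)
A = (2 * x_bar[0], 2 * x_bar[1])            # Jacobian of f at x_bar, a 1x2 row
AAt = A[0] ** 2 + A[1] ** 2                 # the 1x1 matrix A A^T
pseudoinverse = (A[0] / AAt, A[1] / AAt)    # A^T (A A^T)^{-1}, a 2x1 column

h = 1e-6
s_plus, s_minus = s(y_bar + h), s(y_bar - h)
# Central-difference approximation of the derivative of s at y_bar.
grad_s = tuple((p - m) / (2 * h) for p, m in zip(s_plus, s_minus))

print(grad_s, pseudoinverse)
```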
See also
* Banach fixed-point theorem
* Implicit function theorem
* Nash–Moser theorem