In mathematics

, the Legendre transformation (or Legendre transform), first introduced by

Adrien-Marie Legendre

in 1787 when studying the minimal surface problem, is an involutive transformation on real-valued functions that are

convex

on a real variable. Specifically, if a real-valued multivariable function is convex on one of its independent real variables, then the Legendre transform with respect to this variable is applicable to the function. In physical problems, the Legendre transform is used to convert functions of one quantity (such as position, pressure, or temperature) into functions of the conjugate quantity (momentum, volume, and entropy, respectively). In this way, it is commonly used in

classical mechanics

to derive the

Hamiltonian

formalism out of the Lagrangian formalism (or vice versa) and in

thermodynamics

to derive the

thermodynamic potentials

, as well as in the solution of differential equations of several variables. For sufficiently smooth functions on the real line, the Legendre transform

f^*

of a function

f

can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other. This can be expressed in Euler's derivative notation as

Df(\cdot) = \left( D f^* \right)^{-1}(\cdot)~,

where

D

is an operator of differentiation,

\cdot

represents an argument or input to the associated function,

(\phi)^{-1}(\cdot)

is an inverse function such that

(\phi)^{-1}(\phi(x))=x

, or equivalently, as

f'(f^{*\prime}(x^*)) = x^*

and

f^{*\prime}(f'(x)) = x

in Lagrange's notation. The generalization of the Legendre transformation to affine spaces and non-convex functions is known as the

convex conjugate

(also called the Legendre–Fenchel transformation), which can be used to construct a function's

convex hull.

Definition

Definition in one-dimensional real space

Let

I \sub \R

be an interval, and

f:I \to \R

a convex function

; then the ''Legendre transform'' ''of''

f

is the function

f^*:I^* \to \R

defined by

f^*(x^*) = \sup_{x\in I}(x^*x-f(x)),\ \ \ \ I^*= \left \{x^*\in \R : f^*(x^*) < \infty \right \}

where

\sup

denotes the

supremum In mathematics, the infimum (abbreviated inf; : infima) of a subset S of a partially ordered set P is the greatest element in P that is less than or equal to each element of S, if such an element exists. If the infimum of S exists, it is unique, ...

over

I

, e.g.,

x \in I

is chosen such that

x^*x - f(x)

is maximized at each

x^*

, or

x^*

is such that

x^*x-f(x)

has a bounded value throughout

I

(e.g., when

f(x)

is a linear function). The function

f^*

is called the convex conjugate function of

f

. For historical reasons (rooted in analytic mechanics), the conjugate variable is often denoted

p

, instead of

x^*

. If the convex function

f

is defined on the whole line and is everywhere

differentiable

, then

f^*(p)=\sup_{x\in \R}(px - f(x)) = \left( p x - f(x) \right)|_{x = (f')^{-1}(p)}

can be interpreted as the negative of the

y

-intercept of the

tangent line

to the

graph of

f

that has slope

p
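The supremum in the one-dimensional definition can be approximated numerically. The following is a minimal sketch, not a definitive implementation: the helper name `legendre_transform`, the grid bounds, and the choice f(x) = x^2 are illustrative assumptions. It also checks the tangent-line reading, namely that f^*(p) is the negative of the y-intercept of the tangent with slope p.

```python
def legendre_transform(f, p, lo, hi, steps=20001):
    """Approximate f*(p) = sup over x in [lo, hi] of p*x - f(x) on a uniform grid."""
    best = float("-inf")
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        best = max(best, p * x - f(x))
    return best


def f(x):
    return x * x  # illustrative convex function; its exact transform is p**2 / 4


p = 3.0
f_star = legendre_transform(f, p, -10.0, 10.0)

# The tangent to the graph of f with slope p touches at x0 where f'(x0) = 2*x0 = p.
x0 = p / 2.0
intercept = f(x0) - p * x0  # y-intercept of that tangent line

print(f_star, -intercept)
```

The grid search only works when the maximizer lies inside the chosen interval; a root finder applied to f'(x) = p would be the more usual choice for smooth f.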

Definition in n-dimensional real space

The generalization to convex functions

f:X \to \R

on a

convex set

X \sub \R^n

is straightforward:

f^*:X^* \to \R

has domain

X^*= \left \{x^* \in \R^n:\sup_{x\in X}(\langle x^*,x\rangle-f(x))<\infty \right \}

and is defined by

f^*(x^*) = \sup_{x\in X}(\langle x^*,x\rangle-f(x)),\quad x^*\in X^* ~,

where

\langle x^*,x \rangle

denotes the

dot product of

x^*

and

x

. The Legendre transformation is an application of the duality relationship between points and lines. The functional relationship specified by

f

can be represented equally well as a set of

(x,y)

points, or as a set of tangent lines specified by their slope and intercept values.

Understanding the Legendre transform in terms of derivatives

For a differentiable convex function

f

on the real line with the first derivative

f'

and its inverse

(f')^{-1}

, the Legendre transform of

f

,

f^*

, can be specified, up to an additive constant, by the condition that the functions' first derivatives are inverse functions of each other, i.e.,

f' = ((f^*)')^{-1}

and

(f^*)' = (f')^{-1}

. To see this, first note that if

f

as a convex function on the real line is differentiable and

\overline{x}

is a critical point of the function

x \mapsto p \cdot x -f(x)

, then the supremum is achieved at

\overline{x}

(by convexity, see the first figure in this Wikipedia page). Therefore, the Legendre transform of

f

is

f^*(p)= p \cdot \overline{x} - f(\overline{x})

. Then, suppose that the first derivative

f'

is invertible and let the inverse be

g = (f')^{-1}

. Then for each

p

, the point

g(p)

is the unique critical point

\overline{x}

of the function

x \mapsto px -f(x)

(i.e.,

\overline{x} = g(p)

) because

f'(g(p))=p

and the function's first derivative with respect to

x

at

g(p)

is

p-f'(g(p))=0

. Hence we have

f^*(p) = p \cdot g(p) - f(g(p))

for each

p

. By differentiating with respect to

p

, we find

(f^*)'(p) = g(p)+ p \cdot g'(p) - f'(g(p)) \cdot g'(p).

Since

f'(g(p))=p

this simplifies to

(f^*)'(p) = g(p) = (f')^{-1}(p)

. In other words, ''

(f^*)'

and

f'

are inverses to each other''. In general, if

h' = (f')^{-1}

as the inverse of

f',

then

h' = (f^*)'

so integration gives

f^* = h +c.

with a constant

c.

In practical terms, given

f(x),

the parametric plot of

xf'(x)-f(x)

versus

f'(x)

amounts to the graph of

f^*(p)

versus

p.

In some cases (e.g. thermodynamic potentials, below), a non-standard requirement is used, amounting to an alternative definition of f^* with a ''minus sign'',

f(x) - f^*(p) = xp.
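The parametric description and the inverse-derivative property can be checked on a concrete function. This is a sketch under stated assumptions: f(x) = x^4/4 restricted to x >= 0 is an illustrative choice not taken from the text, and the closed form (3/4) p^(4/3) is what the standard definition gives for it.

```python
def f(x):
    return x ** 4 / 4.0


def f_prime(x):
    return x ** 3


def f_star(p):
    # Closed form for p >= 0: with x = p**(1/3), p*x - f(x) = (3/4) * p**(4/3).
    return 0.75 * p ** (4.0 / 3.0)


def derivative(fn, t, h=1e-5):
    """Central finite difference."""
    return (fn(t + h) - fn(t - h)) / (2.0 * h)


# Parametric points (p, f*(p)) = (f'(x), x*f'(x) - f(x)) for a few x >= 0.
xs = [0.5, 1.0, 1.5, 2.0]
curve = [(f_prime(x), x * f_prime(x) - f(x)) for x in xs]

# (f*)'(p) should return the x that produced p, i.e. the inverse of f'.
recovered = [derivative(f_star, f_prime(x)) for x in xs]

print(curve)
print(recovered)
```

Each pair in `curve` lies on the graph of f^*, and `recovered` reproduces `xs` up to finite-difference error.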

Formal definition in physics context

In analytical mechanics and thermodynamics, Legendre transformation is usually defined as follows: suppose

f

is a function of

x

; then we have

\mathrm{d} f = \frac{\mathrm{d} f}{\mathrm{d} x} \mathrm{d} x.

Performing the Legendre transformation on this function means that we take

p = \frac{\mathrm{d} f}{\mathrm{d} x}

as the independent variable, so that the above expression can be written as

\mathrm{d} f = p \mathrm{d} x,

and according to Leibniz's rule

\mathrm{d} (uv) = u\mathrm{d} v + v\mathrm{d} u,

we then have

\mathrm{d} \left(x p - f \right) = x \mathrm{d} p + p \mathrm{d} x - \mathrm{d} f = x\mathrm{d} p,

and taking

f^* = xp-f,

we have

\mathrm{d} f^* = x \mathrm{d} p,

which means

\frac{\mathrm{d} f^*}{\mathrm{d} p} = x.

When

f

is a function of

n

variables

x_1, x_2, \cdots, x_n

, then we can perform the Legendre transformation on each one or several variables: we have

\mathrm{d} f = p_1\mathrm{d} x_1 + p_2 \mathrm{d} x_2 + \cdots + p_n \mathrm{d} x_n,

where

p_i = \frac{\partial f}{\partial x_i}.

Then if we want to perform the Legendre transformation on, e.g.

x_1

, then we take

p_1

together with

x_2, \cdots, x_n

as independent variables, and with Leibniz's rule we have

\mathrm{d} (f - x_1 p_1) = -x_1 \mathrm{d} p_1 + p_2 \mathrm{d} x_2 + \cdots + p_n \mathrm{d} x_n.

So for the function

\varphi(p_1, x_2, \cdots, x_n) = f(x_1, x_2, \cdots, x_n) - x_1 p_1,

we have

\frac{\partial \varphi}{\partial p_1} = -x_1,\quad \frac{\partial \varphi}{\partial x_2} = p_2,\quad \cdots, \quad \frac{\partial \varphi}{\partial x_n} = p_n.

We can also do this transformation for variables

x_2, \cdots, x_n

. If we do it to all the variables, then we have

\mathrm{d} \varphi = -x_1 \mathrm{d} p_1 - x_2 \mathrm{d} p_2 - \cdots - x_n \mathrm{d} p_n

where

\varphi = f-x_1 p_1 - x_2 p_2 - \cdots - x_n p_n.

In analytical mechanics, people perform this transformation on variables

\dot q_1, \dot q_2, \cdots, \dot q_n

of the Lagrangian

L(q_1, \cdots, q_n, \dot q_1, \cdots, \dot q_n)

to get the Hamiltonian:

H(q_1, \cdots, q_n, p_1, \cdots, p_n) = \sum_{i=1}^n p_i \dot q_i - L(q_1, \cdots, q_n, \dot q_1, \cdots, \dot q_n).

In thermodynamics, people perform this transformation on variables according to the type of thermodynamic system they want; for example, starting from the cardinal function of state, the internal energy

U(S,V)

, we have

\mathrm{d} U = T \mathrm{d} S - p \mathrm{d} V,

so we can perform the Legendre transformation on either or both of

S, V

to yield

\mathrm{d} H = \mathrm{d} (U + pV) = T\mathrm{d} S + V \mathrm{d} p

\mathrm{d} F = \mathrm{d} (U - TS) = -S\mathrm{d} T - p \mathrm{d} V

\mathrm{d} G = \mathrm{d} (U - TS + pV) = -S\mathrm{d} T + V \mathrm{d} p,

and each of these three expressions has a physical meaning. This definition of the Legendre transformation is the one originally introduced by Legendre in his work in 1787, and is still applied by physicists nowadays. Indeed, this definition can be mathematically rigorous if we treat all the variables and functions defined above: for example,

f,x_1,\cdots,x_n,p_1,\cdots,p_n,

as differentiable functions defined on an open set of

\R^n

or on a differentiable manifold, and

\mathrm{d} f, \mathrm{d} x_i, \mathrm{d} p_i

their differentials (which are treated as cotangent vector fields in the context of differentiable manifolds). This definition is equivalent to the modern mathematicians' definition as long as

f

is differentiable and convex for the variables

x_1, x_2, \cdots, x_n.
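The partial transform on a single variable can be verified with finite differences. The sketch below assumes an illustrative convex function f(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2 (not from the text); the helper names are hypothetical. It checks that \varphi = f - x_1 p_1 satisfies \partial\varphi/\partial p_1 = -x_1 and \partial\varphi/\partial x_2 = p_2.

```python
def f(x1, x2):
    return x1 * x1 + x1 * x2 + x2 * x2


def x1_of(p1, x2):
    # Solves p1 = df/dx1 = 2*x1 + x2 for x1.
    return (p1 - x2) / 2.0


def phi(p1, x2):
    # phi(p1, x2) = f(x1, x2) - x1*p1, with x1 eliminated in favour of p1.
    x1 = x1_of(p1, x2)
    return f(x1, x2) - x1 * p1


def partial(fn, args, i, h=1e-6):
    """Central finite difference with respect to argument number i."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (fn(*up) - fn(*dn)) / (2.0 * h)


p1, x2 = 3.0, 1.0
x1 = x1_of(p1, x2)
p2 = x1 + 2.0 * x2  # p2 = df/dx2 at (x1, x2)

dphi_dp1 = partial(phi, (p1, x2), 0)
dphi_dx2 = partial(phi, (p1, x2), 1)

print(dphi_dp1, -x1)
print(dphi_dx2, p2)
```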

Properties

* The Legendre transform of a convex function whose second derivative is positive everywhere is also a convex function whose second derivative is positive everywhere. ''Proof.'' Let us show this for a twice-differentiable function

f(x)

with an everywhere-positive second derivative and with a bijective (invertible) derivative. For a fixed

p

, let

\bar{x}

maximize or make the function

px - f(x)

bounded over

x

. Then the Legendre transformation of

f

is

f^*(p) = p\bar{x} - f(\bar{x})

, thus,

f'(\bar{x}) = p

by the maximizing or bounding condition

\frac{d}{dx}(px - f(x)) = p - f'(x)= 0

. Note that

\bar{x}

depends on

p

. (This can be visually shown in the 1st figure of this page above.) Thus

\bar{x} = g(p)

where

g \equiv (f')^{-1}

, meaning that

g

is the inverse of

f'

that is the derivative of

f

(so

f'(g(p))= p

). Note that

g

is also differentiable with the following derivative (Inverse function rule),

\frac{dg(p)}{dp} = \frac{1}{f''(g(p))} ~.

Thus, the Legendre transformation

f^*(p) = pg(p) - f(g(p))

is the composition of differentiable functions, hence it is differentiable. Applying the

product rule

and the

chain rule

with the found equality

\bar{x} = g(p)

yields

\frac{d(f^*)}{dp} = g(p) + \left(p - f'(g(p))\right)\cdot \frac{dg(p)}{dp} = g(p),

giving

\frac{d^2(f^*)}{dp^2} = \frac{dg(p)}{dp} = \frac{1}{f''(g(p))} > 0,

so

f^*

is convex with an everywhere-positive second derivative. * The Legendre transformation is an

involution

, i.e.,

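f^{**} = f ~

. ''Proof.'' By using the above identities as

f'(\bar{x}) = p

,

\bar{x} = g(p)

,

f^*(p) = p\bar{x} - f(\bar{x})

and its derivative

(f^*)'(p) = g(p)

,

\begin{align}
f^{**}(y) & = \left(y\cdot \bar{p} - f^{*}(\bar{p})\right)|_{(f^{*})'(\bar{p}) = y} \\
& = g(\bar{p})\cdot \bar{p} - f^{*}(\bar{p}) \\
& = g(\bar{p})\cdot \bar{p} - (\bar{p} g(\bar{p})-f(g(\bar{p})))\\
& = f(g(\bar{p})) \\
& = f(y)~.
\end{align}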

Note that this derivation does not require the second derivative of the original function

f

to be positive everywhere.

Identities

As shown above, for a convex function

f(x)

, with

x = \bar{x}

maximizing or making

px - f(x)

bounded at each

p

to define the Legendre transform

f^*(p) = p\bar{x} - f(\bar{x})

and with

g \equiv (f')^{-1}

, the following identities hold. *

f'(\bar{x}) = p

, *

\bar{x} = g(p)

, *

(f^*)'(p) = g(p)
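These identities, together with the positivity of the second derivative of f^*, can be spot-checked numerically. The sketch below assumes f(x) = e^x, for which g(p) = \ln p and f^*(p) = p \ln p - p on p > 0; the finite-difference helpers `d1` and `d2` are hypothetical names introduced for illustration.

```python
import math


def f_star(p):
    return p * math.log(p) - p  # transform of f(x) = exp(x), valid for p > 0


def g(p):
    return math.log(p)  # inverse of f'(x) = exp(x)


def d1(fn, t, h=1e-5):
    """Central finite difference for the first derivative."""
    return (fn(t + h) - fn(t - h)) / (2.0 * h)


def d2(fn, t, h=1e-4):
    """Central finite difference for the second derivative."""
    return (fn(t + h) - 2.0 * fn(t) + fn(t - h)) / (h * h)


p = 2.0
x_bar = g(p)              # maximizer of p*x - exp(x)
first = d1(f_star, p)     # should be close to g(p)
second = d2(f_star, p)    # should be close to 1 / f''(g(p)) = 1 / p

print(math.exp(x_bar), p)  # f'(x_bar) = p
print(first, g(p))
print(second, 1.0 / p)
```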

Examples

Example 1

Consider the exponential function

f(x) = e^x,

which has the domain

I=\mathbb{R}

. From the definition, the Legendre transform is

f^*(x^*) = \sup_{x\in \mathbb{R}}(x^*x-e^x),\quad x^*\in I^*

where

I^*

remains to be determined. To evaluate the supremum, compute the derivative of

x^*x-e^x

with respect to

x

and set equal to zero:

\frac{d}{dx} (x^*x-e^x) = x^*-e^x = 0.

The second derivative

-e^x

is negative everywhere, so the maximal value is achieved at

x = \ln(x^*)

. Thus, the Legendre transform is

f^*(x^*) = x^*\ln(x^*)-e^{\ln(x^*)} = x^*(\ln(x^*) - 1)

and has domain

I^* = (0, \infty).

This illustrates that the domains of a function and its Legendre transform can be different. To find the Legendre transformation of the Legendre transformation of

f

, write

f^{**}(x) = \sup_{x^*\in I^*}(xx^*-x^*(\ln(x^*) - 1)),\quad x\in I,

where a variable

x

is intentionally used as the argument of the function

f^{**}

to show the involution property of the Legendre transform as

f^{**} = f

. We compute

\begin{align}
0 
&= \frac{d}{dx^*}\big( xx^*-x^*(\ln(x^*) - 1) \big)
= x - \ln(x^*)
\end{align}

thus the maximum occurs at

x^* = e^x

because the second derivative

\frac{d^2}{{dx^*}^2}\big( xx^*-x^*(\ln(x^*) - 1) \big) = - \frac{1}{x^*} < 0

over the domain of

f^*

,

I^* = (0, \infty).

As a result,

f^{**}

is found as

\begin{align}
f^{**}(x)
&= xe^x - e^x(\ln(e^x) - 1) 
= e^x,
\end{align}

thereby confirming that

f = f^{**},

as expected.
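The closed form of Example 1 can be compared with a brute-force supremum. This is a rough sketch: the grid helper `sup_on_grid` and the interval widths are arbitrary assumptions. It also illustrates why negative slopes are excluded from the domain, since the supremum over a bounded interval keeps growing as the interval widens.

```python
import math


def sup_on_grid(h, lo, hi, steps=20001):
    """Largest value of h on a uniform grid over [lo, hi]."""
    return max(h(lo + (hi - lo) * k / (steps - 1)) for k in range(steps))


def transform_of_exp(p, half_width=5.0):
    # Approximates sup over x in [-half_width, half_width] of p*x - exp(x).
    return sup_on_grid(lambda x: p * x - math.exp(x), -half_width, half_width)


p = 2.0
numeric = transform_of_exp(p)
closed_form = p * (math.log(p) - 1.0)

# For p < 0 the value over [-L, L] grows roughly like L, so no finite supremum exists.
growth = [transform_of_exp(-1.0, half_width=L) for L in (10.0, 100.0)]

print(numeric, closed_form)
print(growth)
```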

Example 2

Let f(x) = cx^2 defined on \R, where c > 0 is a fixed constant. For x^* fixed, the function of x, x^*x - f(x) = x^*x - cx^2, has the first derivative x^* - 2cx and second derivative -2c; there is one stationary point at x = x^*/(2c), which is always a maximum. Thus, I^* = \R and

f^*(x^*)=\frac{{x^*}^2}{4c} ~.

The first derivatives of f, 2cx, and of f^*, x^*/(2c), are inverse functions to each other. Clearly, furthermore,

f^{**}(x)=\frac{1}{4 (1/4c)}x^2=cx^2~,

namely f^{**} = f.

Example 3

Let f(x) = x^2 for x \in I = [2,3]. For x^* fixed, x^*x - f(x) is continuous on the compact interval I, hence it always takes a finite maximum on it; it follows that the domain of the Legendre transform of f is I^* = \R. The stationary point at x = x^*/2 (found by setting the first derivative of x^*x - f(x) with respect to x equal to zero) is in the domain [2,3] if and only if 4 \leq x^* \leq 6. Otherwise the maximum is taken either at x = 2 or at x = 3, because the second derivative of x^*x - f(x) with respect to x is negative, namely -2; for x^* < 4 the maximum that x^*x - f(x) can take with respect to x \in [2,3] is obtained at x = 2, while for x^* > 6 the maximum is obtained at x = 3. Thus, it follows that

f^*(x^*)=\begin{cases}
2x^*-4, & x^*<4\\
\frac{{x^*}^2}{4}, & 4\leq x^*\leq 6,\\
3x^*-9, & x^*>6.
\end{cases}
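The piecewise result can be checked against a direct grid search. The sketch assumes the function of this example is f(x) = x^2 on [2, 3], which is what the boundary branches 2x^* - 4 and 3x^* - 9 imply; the function names are hypothetical.

```python
def f_star_closed(p):
    # Piecewise formula: boundary maximum at x = 2, interior maximum, boundary maximum at x = 3.
    if p < 4.0:
        return 2.0 * p - 4.0
    if p <= 6.0:
        return p * p / 4.0
    return 3.0 * p - 9.0


def f_star_grid(p, lo=2.0, hi=3.0, steps=10001):
    # Brute-force maximum of p*x - x**2 over a uniform grid on [lo, hi].
    return max(p * x - x * x
               for x in (lo + (hi - lo) * k / (steps - 1) for k in range(steps)))


samples = [3.0, 5.0, 7.0]  # one slope from each branch
pairs = [(f_star_closed(p), f_star_grid(p)) for p in samples]
print(pairs)
```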

Example 4

The function f(x) = cx is convex, for every x (strict convexity is not required for the Legendre transformation to be well defined). Clearly x^*x - f(x) = (x^* - c)x is never bounded from above as a function of x, unless x^* - c = 0. Hence f^* is defined on I^* = \{c\} and f^*(c) = 0. (The definition of the Legendre transform requires the existence of the supremum, which requires upper bounds.) One may check involutivity: of course, x^*x - f^*(x^*) is always bounded as a function of x^* \in \{c\}, hence I^{**} = \R. Then, for all x one has

\sup_{x^*\in\{c\}}(xx^*-f^*(x^*))=xc,

and hence f^{**}(x) = cx = f(x).

Example 5

As an example of a convex continuous function that is not everywhere differentiable, consider

f(x)= |x|

. This gives

f^*(x^*) = \sup_{x\in\R}(xx^*-|x|)=\max\left(\sup_{x\geq 0} x(x^*-1), \,\sup_{x\leq 0} x(x^*+1)  \right),

and thus

f^*(x^*)=0

on its domain

I^*=[-1,1]

.
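A short numerical experiment illustrates both claims for f(x) = |x|: the transform vanishes for slopes in [-1, 1], and for slopes outside that interval the supremum over [-L, L] grows with L, so it is not finite. The helper `sup_abs` and the interval sizes are assumptions made for this sketch.

```python
def sup_abs(p, L, steps=2001):
    # Maximum of p*x - |x| over a uniform grid on [-L, L] (the grid contains x = 0).
    return max(p * x - abs(x)
               for x in (-L + 2.0 * L * k / (steps - 1) for k in range(steps)))


inside = [sup_abs(p, 10.0) for p in (-1.0, -0.5, 0.0, 0.5, 1.0)]
outside = [sup_abs(2.0, L) for L in (10.0, 100.0)]

print(inside)   # all zero: these slopes lie in [-1, 1]
print(outside)  # grows with L: slope 2 lies outside the domain
```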

Example 6: several variables

Let

f(x)=\langle x,Ax\rangle+c

be defined on X = \R^n, where A is a real, positive definite matrix. Then f is convex, and

\langle p,x\rangle-f(x)=\langle p,x \rangle-\langle x,Ax\rangle-c,

has gradient p - 2Ax and Hessian -2A, which is negative definite; hence the stationary point x = A^{-1}p/2 is a maximum. We have X^* = \R^n, and

f^*(p)=\frac{1}{4}\langle p,A^{-1}p\rangle-c.
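For a concrete instance, the stationary point and the value of the transform can be computed by hand-rolled 2x2 linear algebra. The matrix A, the constant c and the vector p below are made-up illustrative values, and A is assumed symmetric so that the gradient of \langle x,Ax\rangle is 2Ax.

```python
A = [[2.0, 1.0], [1.0, 3.0]]  # symmetric positive definite (hypothetical values)
c = 1.0
p = [1.0, 2.0]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def objective(x):
    # <p, x> - f(x) with f(x) = <x, A x> + c
    return dot(p, x) - dot(x, matvec(A, x)) - c


A_inv_p = matvec(inverse_2x2(A), p)
x_bar = [0.5 * t for t in A_inv_p]        # stationary point A^{-1} p / 2
closed_form = 0.25 * dot(p, A_inv_p) - c  # (1/4) <p, A^{-1} p> - c
at_stationary = objective(x_bar)

# Perturbing the stationary point should never increase the objective.
nearby = [objective([x_bar[0] + dx, x_bar[1] + dy])
          for dx in (-0.1, 0.0, 0.1) for dy in (-0.1, 0.0, 0.1)]

print(closed_form, at_stationary, max(nearby))
```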

Behavior of differentials under Legendre transforms

The Legendre transform is linked to

integration by parts

, p\,dx = d(px) - x\,dp. Let f(x,y) be a function of two independent variables x and y, with the differential

df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = p\,dx + v\,dy.

Assume that the function f is convex in x for all y, so that one may perform the Legendre transform on f in x, with p the variable conjugate to x (for information, there is a relation

\frac{\partial f}{\partial x} |_{\bar{x}} = p

where

\bar{x}

is a point in x maximizing or making

px - f(x,y)

bounded for given p and y). Since the new independent variable of the transform with respect to f is p, the differentials dx and dy in df devolve to dp and dy in the differential of the transform, i.e., we build another function with its differential expressed in terms of the new basis dp and dy. We thus consider the function g(p,y) = f - px so that

dg = df - p\,dx - x\,dp = -x\,dp + v\,dy

x = -\frac{\partial g}{\partial p}

v = \frac{\partial g}{\partial y}.

The function -g(p,y) is the Legendre transform of f(x,y), where only the independent variable x has been supplanted by p. This is widely used in

thermodynamics

, as illustrated below.

Applications

Analytical mechanics

A Legendre transform is used in classical mechanics to derive the Hamiltonian formulation from the Lagrangian formulation, and conversely. A typical Lagrangian has the form

L(v,q)=\tfrac{1}{2}\langle v,Mv\rangle-V(q),

where

(v,q)

are coordinates on \R^n \times \R^n, M is a positive definite real matrix, and

\langle x,y\rangle = \sum_j x_j y_j.

For every q fixed,

L(v, q)

is a convex function of

v

, while

V(q)

plays the role of a constant. Hence the Legendre transform of

L(v, q)

as a function of

v

is the Hamiltonian function,

H(p,q)=\tfrac{1}{2} \langle p,M^{-1}p\rangle+V(q).

In a more general setting,

(v, q)

are local coordinates on the

tangent bundle

T\mathcal M

of a manifold

\mathcal M

. For each q,

L(v, q)

is a convex function on the tangent space at q. The Legendre transform gives the Hamiltonian

H(p, q)

as a function of the coordinates of the

cotangent bundle

T^*\mathcal M

; the inner product used to define the Legendre transform is inherited from the pertinent canonical

symplectic structure

. In this abstract setting, the Legendre transformation corresponds to the tautological one-form.
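For the quadratic Lagrangian above, the Hamiltonian can be recovered numerically as a supremum over velocities. The sketch below is one-dimensional and assumes a harmonic potential V(q) = k q^2 / 2 with made-up values of the mass m and stiffness k; none of these choices come from the text.

```python
def lagrangian(v, q, m=2.0, k=3.0):
    # L(v, q) = (1/2) m v^2 - V(q), with the assumed potential V(q) = (1/2) k q^2
    return 0.5 * m * v * v - 0.5 * k * q * q


def hamiltonian_by_sup(p, q, v_max=10.0, steps=20001):
    # H(p, q) = sup over v of p*v - L(v, q), approximated on a uniform grid.
    return max(p * v - lagrangian(v, q)
               for v in (-v_max + 2.0 * v_max * i / (steps - 1) for i in range(steps)))


def hamiltonian_closed(p, q, m=2.0, k=3.0):
    # (1/2) p^2 / m + V(q), the one-dimensional case of (1/2) <p, M^{-1} p> + V(q)
    return p * p / (2.0 * m) + 0.5 * k * q * q


p, q = 4.0, 1.0
numeric = hamiltonian_by_sup(p, q)
exact = hamiltonian_closed(p, q)
print(numeric, exact)
```

The maximizing velocity is v = p/m, which is the usual relation between momentum and velocity for this Lagrangian.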

Thermodynamics

The strategy behind the use of Legendre transforms in thermodynamics is to shift from a function that depends on a variable to a new (conjugate) function that depends on a new variable, the conjugate of the original one. The new variable is the partial derivative of the original function with respect to the original variable. The new function is the difference between the original function and the product of the old and new variables. Typically, this transformation is useful because it shifts the dependence of, e.g., the energy from an extensive variable to its conjugate intensive variable, which can often be controlled more easily in a physical experiment. For example, the

internal energy

U

is an explicit function of the ''extensive variables''

entropy

S

,

volume

V

, and

chemical composition

N_i

(e.g.,

i = 1, 2, 3, \ldots

)

U = U \left (S,V,\{N_i\} \right ),

which has a total differential

dU = T\,dS - P\,dV + \sum \mu_i \,dN _i

where

T = \left. \frac{\partial U}{\partial S} \right \vert _{V, \{N_i\}}, P = \left. -\frac{\partial U}{\partial V} \right \vert _{S, \{N_i\}}, \mu_i = \left. \frac{\partial U}{\partial N_i} \right \vert _{S, V, \{N_{j \ne i}\}}

. (Subscripts are not necessary by the definition of partial derivatives but are left here to clarify which variables are held fixed.) Stipulating some common reference state, by using the (non-standard) Legendre transform of the internal energy U with respect to volume V, the

enthalpy H

may be obtained as the following. To get the (standard) Legendre transform

U^*

of the internal energy U with respect to volume V, the function

u\left( p,S,V,\{N_i\} \right)=pV-U

is defined first, then it shall be maximized or bounded by V. To do this, the condition

\frac{\partial u}{\partial V} = p - \frac{\partial U}{\partial V} = 0 \to p = \frac{\partial U}{\partial V}

needs to be satisfied, so

U^* = \frac{\partial U}{\partial V}V - U

is obtained. This approach is justified because U is a linear function with respect to V (so a convex function on V) by the definition of extensive variables. The non-standard Legendre transform here is obtained by negating the standard version, so

-U^* = H = U - \frac{\partial U}{\partial V}V = U + PV

. H is definitely a

state function

as it is obtained by adding PV (P and V as state variables) to a state function

U = U \left (S,V,\{N_i\} \right )

, so its differential is an

exact differential

. Because of

dH = T\,dS + V\,dP + \sum \mu_i \,dN _i

and the fact that it must be an exact differential,

H = H(S,P,\{N_i\})

. The enthalpy is suitable for description of processes in which the pressure is controlled from the surroundings. It is likewise possible to shift the dependence of the energy from the extensive variable of entropy, S, to the (often more convenient) intensive variable T, resulting in the Helmholtz and Gibbs free energies. The Helmholtz free energy A, and Gibbs energy G, are obtained by performing Legendre transforms of the internal energy and enthalpy, respectively,

A = U - TS ~,

G = H - TS = U + PV - TS ~.

The Helmholtz free energy is often the most useful thermodynamic potential when temperature and volume are controlled from the surroundings, while the Gibbs energy is often the most useful when temperature and pressure are controlled from the surroundings.
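The change of natural variables can be illustrated with a toy energy function. The choice U(S, V) = S^2 / V is purely illustrative and not a physical equation of state; it is only convenient because P = -\partial U/\partial V can be inverted by hand. The sketch checks that the enthalpy H = U + PV, written as a function of (S, P), satisfies \partial H/\partial P = V and \partial H/\partial S = T.

```python
import math


def U(S, V):
    return S * S / V  # toy internal energy, chosen only for easy algebra


def V_of(S, P):
    # Inverts P = -dU/dV = S**2 / V**2 for V > 0.
    return S / math.sqrt(P)


def H(S, P):
    # Enthalpy H = U + P*V expressed in its natural variables (S, P).
    V = V_of(S, P)
    return U(S, V) + P * V


def partial(fn, args, i, h=1e-6):
    """Central finite difference with respect to argument number i."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (fn(*up) - fn(*dn)) / (2.0 * h)


S, P = 2.0, 4.0
V = V_of(S, P)
T = 2.0 * S / V  # T = dU/dS at fixed V

dH_dP = partial(H, (S, P), 1)  # should be close to V
dH_dS = partial(H, (S, P), 0)  # should be close to T
print(dH_dP, V)
print(dH_dS, T)
```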

Variable capacitor

As another example from

physics

, consider a parallel conductive plate

capacitor

, in which the plates can move relative to one another. Such a capacitor would allow transfer of the electric energy which is stored in the capacitor into external mechanical work, done by the

force

acting on the plates. One may think of the electric charge as analogous to the "charge" of a gas in a cylinder, with the resulting mechanical force exerted on a

piston

. Compute the force on the plates as a function of \mathbf{x}, the distance which separates them. To find the force, compute the potential energy, and then apply the definition of force as the gradient of the potential energy function. The

electrostatic potential energy

stored in a capacitor of capacitance C(\mathbf{x}) with a positive electric charge +Q or negative charge -Q on each conductive plate is (using the definition of the capacitance as

C = \frac{Q}{V}

)

U (Q, \mathbf{x}) = \frac{1}{2} QV(Q,\mathbf{x}) = \frac{1}{2} \frac{Q^2}{C(\mathbf{x})},~

where the dependence on the area of the plates, the dielectric constant of the insulation material between the plates, and the separation \mathbf{x} are abstracted away as the capacitance C(\mathbf{x}). (For a parallel plate capacitor, this is proportional to the area of the plates and inversely proportional to the separation.) The force \mathbf{F} between the plates due to the electric field created by the charge separation is then

\mathbf{F}(\mathbf{x}) = -\frac{dU}{d\mathbf{x}} ~.

If the capacitor is not connected to any electric circuit, then the '' electric charges'' on the plates remain constant and the voltage varies when the plates move with respect to each other, and the force is the negative

gradient

of the

electrostatic

potential energy as

\mathbf{F}(\mathbf{x}) = \frac{1}{2} \frac{dC(\mathbf{x})}{d\mathbf{x}} \frac{Q^2}{{C(\mathbf{x})}^2} = \frac{1}{2} \frac{dC(\mathbf{x})}{d\mathbf{x}}V(\mathbf{x})^2

where

V(Q,\mathbf{x}) = V(\mathbf{x})

as the charge is fixed in this configuration. However, instead, suppose that the ''voltage'' between the plates V is maintained constant as the plate moves by connection to a battery, which is a reservoir for electric charges at a constant potential difference. Then the amount of ''charges''

Q

''is a variable'' instead of the voltage;

Q

and

V

are the Legendre conjugate to each other. To find the force, first compute the non-standard Legendre transform

U^*

with respect to

Q

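(also using

C = \frac{Q}{V}

),

U^* = U - \left.\frac{\partial U}{\partial Q} \right|_\mathbf{x} \cdot Q =U - \frac{1}{2C(\mathbf{x})} \left. \frac{\partial Q^2}{\partial Q} \right|_\mathbf{x} \cdot Q = U - QV = \frac{1}{2} QV - QV = -\frac{1}{2} QV= - \frac{1}{2} V^2 C(\mathbf{x}).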

This transformation is possible because U is now a linear function of Q and so is convex in it. The force now becomes the negative gradient of this Legendre transform, resulting in the same force obtained from the original function U,

\mathbf{F}(\mathbf{x}) = -\frac{dU^*}{d\mathbf{x}} = \frac{1}{2} \frac{dC(\mathbf{x})}{d\mathbf{x}} V^2 .

The two conjugate energies U and U^* happen to stand opposite to each other (their signs are opposite) only because of the linearity of the capacitance, except that now Q is no longer a constant. They reflect the two different pathways of storing energy into the capacitor, resulting in, for instance, the same "pull" between a capacitor's plates.
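The agreement of the two force expressions can be checked numerically. The following is a minimal sketch, assuming a parallel-plate model C(x) = k/x with an arbitrary constant `EPS0_AREA` (a hypothetical value, not taken from the text) and using central finite differences in place of the exact gradients.

```python
EPS0_AREA = 2.0  # hypothetical value of (permittivity * plate area), arbitrary units


def capacitance(x):
    # Parallel-plate model: proportional to area, inversely proportional to separation x
    return EPS0_AREA / x


def derivative(f, x, h=1e-6):
    # Central finite difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2.0 * h)


def force_fixed_charge(x, q):
    # F = -dU/dx with U(Q, x) = Q^2 / (2 C(x)), Q held constant
    return -derivative(lambda s: q * q / (2.0 * capacitance(s)), x)


def force_fixed_voltage(x, v):
    # F = -dU*/dx with U*(V, x) = -V^2 C(x) / 2, V held constant
    return -derivative(lambda s: -0.5 * v * v * capacitance(s), x)


x0, v0 = 0.5, 3.0
q0 = capacitance(x0) * v0  # charge corresponding to v0 at separation x0
print(force_fixed_charge(x0, q0), force_fixed_voltage(x0, v0))
```

Both values agree with \frac{1}{2} \frac{dC}{dx} V^2 evaluated at x0; the negative sign indicates a force that tends to reduce the separation.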

Probability theory

In large deviations theory, the ''rate function'' is defined as the Legendre transformation of the logarithm of the moment generating function of a random variable. An important application of the rate function is in the calculation of tail probabilities of sums of i.i.d. random variables, in particular in Cramér's theorem. If X_n are i.i.d. random variables, let S_n=X_1+\cdots+X_n be the associated random walk and M(\xi) the moment generating function of X_1. For \xi\in\mathbb R, E[e^{\xi S_n}] = M(\xi)^n. Hence, by Markov's inequality, one has for \xi\ge 0 and a\in\mathbb R

P(S_n/n > a) \le e^{-n\xi a}M(\xi)^n=\exp[-n(\xi a - \Lambda(\xi))]

where \Lambda(\xi)=\log M(\xi). Since the left-hand side is independent of \xi, we may take the infimum of the right-hand side over \xi\ge 0, which leads one to consider the supremum of \xi a - \Lambda(\xi), i.e., the Legendre transform of \Lambda, evaluated at x=a.
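As an illustration of this bound, the sketch below assumes the X_n are fair coin flips (Bernoulli with parameter 1/2), a choice made here only for the example. It approximates the supremum on a grid, compares it with the known closed form for this distribution, and checks the resulting bound against the exact binomial tail. The function names and grid parameters are hypothetical.

```python
import math


def log_mgf(xi):
    # Lambda(xi) = log M(xi) for a fair Bernoulli(1/2) variable
    return math.log(0.5 * (1.0 + math.exp(xi)))


def rate(a, xi_max=20.0, steps=20000):
    # Grid approximation of sup over xi >= 0 of (xi * a - Lambda(xi))
    return max(xi * a - log_mgf(xi)
               for xi in (xi_max * k / steps for k in range(steps + 1)))


def tail(n, a):
    # Exact P(S_n / n > a) for S_n ~ Binomial(n, 1/2)
    return sum(math.comb(n, k) for k in range(n + 1) if k > a * n) / 2 ** n


n, a = 100, 0.7
closed_form = a * math.log(2 * a) + (1 - a) * math.log(2 * (1 - a))
print(rate(a), closed_form)                  # numerical vs. closed-form rate
print(tail(n, a), math.exp(-n * rate(a)))    # exact tail vs. upper bound
```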

Microeconomics

Legendre transformation arises naturally in microeconomics in the process of finding the ''supply'' S(P) of some product given a fixed price P on the market, knowing the cost function C(Q), i.e. the cost for the producer to make, mine, etc. Q units of the given product. A simple theory explains the shape of the supply curve based solely on the cost function. Suppose the market price for one unit of the product is P. For a company selling this good, the best strategy is to adjust the production Q so that its profit is maximized. We can maximize the profit

\text{profit} = \text{revenue} - \text{costs} = PQ - C(Q)

by differentiating with respect to Q and solving

P - C'(Q_\text{opt}) = 0.

Q_\text{opt} represents the optimal quantity Q of goods that the producer is willing to supply, which is indeed the supply itself:

S(P) = Q_\text{opt}(P) = (C')^{-1}(P).

If we consider the maximal profit as a function of price, \text{profit}_\text{max}(P), we see that it is the Legendre transform of the cost function C(Q).
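A small numerical sketch of this argument, assuming a hypothetical quadratic cost function C(Q) = cQ^2 (not taken from the text). For this choice the supply is S(P) = P/(2c) and the Legendre transform of the cost is P^2/(4c), which the brute-force maximization below should reproduce.

```python
def cost(q, c=0.5):
    # Hypothetical convex cost function C(Q) = c * Q^2
    return c * q * q


def max_profit(p, q_max=50.0, steps=50000):
    # Brute-force sup over Q of (P * Q - C(Q)); returns (profit, optimal Q)
    return max((p * q - cost(q), q)
               for q in (q_max * k / steps for k in range(steps + 1)))


price = 4.0
profit, q_opt = max_profit(price)
print(q_opt, price / (2 * 0.5))         # supply S(P) vs. (C')^{-1}(P)
print(profit, price ** 2 / (4 * 0.5))   # maximal profit vs. C*(P)
```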

Geometric interpretation

For a strictly convex function, the Legendre transformation can be interpreted as a mapping between the graph of the function and the family of tangents of the graph. (For a function of one variable, the tangents are well-defined at all but at most countably many points, since a convex function is differentiable at all but at most countably many points.)

The equation of a line with slope p and y-intercept b is given by y = p x + b. For this line to be tangent to the graph of a function f at the point \left(x_0, f(x_0)\right) requires

f(x_0) = p x_0 + b

and

p = f'(x_0).

Being the derivative of a strictly convex function, the function f' is strictly monotone and thus injective. The second equation can be solved for x_0 = f'^{-1}(p), allowing elimination of x_0 from the first, and solving for the y-intercept b of the tangent as a function of its slope p,

b = f(x_0) - p x_0 = f\left(f'^{-1}(p)\right) - p \cdot f'^{-1}(p) = -f^\star(p)

where f^\star denotes the Legendre transform of f.

The family of tangent lines of the graph of f parameterized by the slope p is therefore given by

y = p x - f^\star(p),

or, written implicitly, by the solutions of the equation

F(x,y,p) = y + f^\star(p) - p x = 0~.

The graph of the original function can be reconstructed from this family of lines as the envelope of this family by demanding

\frac{\partial F(x,y,p)}{\partial p} = f^{\star\prime}(p) - x = 0.

Eliminating p from these two equations gives

y = x \cdot f^{\star\prime-1}(x) - f^\star\left(f^{\star\prime-1}(x)\right).

Identifying y with f(x) and recognizing the right side of the preceding equation as the Legendre transform of f^\star yields

f(x) = f^{\star\star}(x) ~.
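The envelope construction can be illustrated numerically. The sketch below assumes the standard pair f(x) = e^x and f^\star(p) = p \ln p - p (for p > 0), and recovers f as the upper envelope of the tangent lines y = p x - f^\star(p). The slope range and grid size are arbitrary choices for this example.

```python
import math


def f_star(p):
    # Legendre transform of f(x) = exp(x), valid for p > 0
    return p * math.log(p) - p


def envelope(x, p_min=0.01, p_max=10.0, steps=100000):
    # Upper envelope at abscissa x of the tangent lines y = p * x - f*(p)
    return max(p * x - f_star(p)
               for p in (p_min + (p_max - p_min) * k / steps
                         for k in range(steps + 1)))


for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, envelope(x), math.exp(x))  # envelope value vs. original f(x)
```

The slope range must contain p = f'(x) = e^x for each x tested, otherwise the tangent that touches the graph at x is missing from the family.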

Legendre transformation in more than one dimension

For a differentiable real-valued function f on an open convex subset U of \mathbb{R}^n, the Legendre conjugate of the pair (U, f) is defined to be the pair (V, g), where V is the image of U under the gradient mapping Df, and g is the function on V given by the formula

g(y) = \left\langle y, x \right\rangle - f(x), \qquad x = \left(Df\right)^{-1}(y)

where

\left\langle u,v\right\rangle = \sum_{k=1}^n u_k \cdot v_k

is the scalar product on \mathbb{R}^n. The multidimensional transform can be interpreted as an encoding of the convex hull of the function's epigraph in terms of its supporting hyperplanes. This can be seen as a consequence of the following two observations. On the one hand, the hyperplane tangent to the epigraph of f at some point (\mathbf x, f(\mathbf x))\in U\times \mathbb{R} has normal vector (\nabla f(\mathbf x),-1)\in\mathbb{R}^{n+1}. On the other hand, any closed convex set C\subseteq\mathbb{R}^m can be characterized via the set of its supporting hyperplanes by the equations \mathbf x\cdot\mathbf n = h_C(\mathbf n), where h_C(\mathbf n) is the support function of C. But the definition of the Legendre transform via the maximization matches precisely that of the support function, that is, f^*(\mathbf x)=h_{\operatorname{epi}(f)}(\mathbf x,-1). We thus conclude that the Legendre transform characterizes the epigraph in the sense that the tangent plane to the epigraph at any point (\mathbf x,f(\mathbf x)) is given explicitly by

\{\mathbf z\in\mathbb{R}^{n+1} : \mathbf z\cdot(\nabla f(\mathbf x),-1) = f^*(\nabla f(\mathbf x))\}.

Alternatively, if X is a vector space and Y is its dual vector space, then for each point x of X and y of Y, there is a natural identification of the cotangent spaces T^*X_x with Y and T^*Y_y with X. If f is a real differentiable function over X, then its exterior derivative, df, is a section of the cotangent bundle T^*X and as such, we can construct a map from X to Y. Similarly, if g is a real differentiable function over Y, then dg defines a map from Y to X. If both maps happen to be inverses of each other, we say we have a Legendre transform. The notion of the tautological one-form is commonly used in this setting.

When the function is not differentiable, the Legendre transform can still be extended, and is known as the Legendre-Fenchel transformation. In this more general setting, a few properties are lost: for example, the Legendre transform is no longer its own inverse (unless there are extra assumptions, like convexity).
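For the coordinate definition of g at the start of this section, a quadratic form gives a case that can be checked by hand. The sketch below assumes f(x) = \frac{1}{2} x^T A x on \mathbb{R}^2 with a hypothetical symmetric positive definite matrix A, so that Df(x) = A x and the conjugate is g(y) = \frac{1}{2} y^T A^{-1} y.

```python
A = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical symmetric positive definite matrix


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def inverse(m):
    # Inverse of a 2x2 matrix via the adjugate formula
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def f(x):
    return 0.5 * dot(x, matvec(A, x))


def g(y):
    x = matvec(inverse(A), y)  # x = (Df)^{-1}(y), since Df(x) = A x
    return dot(y, x) - f(x)    # g(y) = <y, x> - f(x)


y = [1.0, -2.0]
print(g(y), 0.5 * dot(y, matvec(inverse(A), y)))
```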

Legendre transformation on manifolds

Let M be a smooth manifold, let E and \pi : E\to M be a vector bundle on M and its associated bundle projection, respectively. Let L : E\to \R be a smooth function. We think of L as a Lagrangian by analogy with the classical case where M = \R, E = TM = \Reals \times \Reals and L(x,v) = \frac 1 2 m v^2 - V(x) for some positive number m\in \Reals and function V : M \to \Reals.

As usual, the dual of E is denoted by E^*. The fiber of \pi over x\in M is denoted E_x, and the restriction of L to E_x is denoted by L|_{E_x} : E_x\to \R. The ''Legendre transformation'' of L is the smooth morphism

\mathbf F L : E \to E^*

defined by \mathbf FL(v) = d(L|_{E_x})_v \in E_x^*, where x = \pi(v). Here we use the fact that since E_x is a vector space, T_v(E_x) can be identified with E_x. In other words, \mathbf FL(v)\in E_x^* is the covector that sends w\in E_x to the directional derivative \left.\frac{d}{dt}\right|_{t=0} L(v + tw)\in \R.

To describe the Legendre transformation locally, let U\subseteq M be a coordinate chart over which E is trivial. Picking a trivialization of E over U, we obtain charts E_U \cong U \times \R^r and E_U^* \cong U \times \R^r. In terms of these charts, we have \mathbf FL(x; v_1, \dotsc, v_r) = (x; p_1,\dotsc, p_r), where

p_i = \frac{\partial L}{\partial v_i}(x; v_1, \dotsc, v_r)

for all i = 1, \dots, r. If, as in the classical case, the restriction of L : E\to \mathbb R to each fiber E_x is strictly convex and bounded below by a positive definite quadratic form minus a constant, then the Legendre transform \mathbf FL : E\to E^* is a diffeomorphism (Ana Cannas da Silva, ''Lectures on Symplectic Geometry'', corrected 2nd printing, Springer-Verlag, 2008, pp. 147-148).

Suppose that \mathbf FL is a diffeomorphism and let H : E^* \to \R be the "Hamiltonian" function defined by

H(p) = p \cdot v - L(v),

where v = (\mathbf FL)^{-1}(p). Using the natural isomorphism E\cong E^{**}, we may view the Legendre transformation of H as a map \mathbf FH : E^* \to E. Then we have

(\mathbf FL)^{-1} = \mathbf FH.
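In the classical case M = \R this identity can be checked directly. The sketch below assumes a hypothetical mass `M` and harmonic potential (neither taken from the text), computes the fiber derivatives by central differences, and confirms that \mathbf FH undoes \mathbf FL.

```python
M = 2.0  # hypothetical mass


def potential(x):
    # Hypothetical potential V(x)
    return 0.5 * x * x


def lagrangian(x, v):
    return 0.5 * M * v * v - potential(x)


def fiber_derivative(func, x, u, h=1e-6):
    # Derivative of func along the fiber over x (x is held fixed)
    return (func(x, u + h) - func(x, u - h)) / (2.0 * h)


def FL(x, v):
    # Legendre transformation of L in the chart: p = dL/dv
    return fiber_derivative(lagrangian, x, v)


def hamiltonian(x, p):
    v = p / M  # (FL)^{-1}(p), available in closed form for this Lagrangian
    return p * v - lagrangian(x, v)


def FH(x, p):
    # Legendre transformation of H in the chart: v = dH/dp
    return fiber_derivative(hamiltonian, x, p)


x, v = 0.3, 1.7
p = FL(x, v)
print(p, M * v)       # momentum p = m v
print(FH(x, p), v)    # FH recovers the original velocity
```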

Further properties

Scaling properties

The Legendre transformation has the following scaling properties: For a > 0,

f(x) = a \cdot g(x) \Rightarrow f^\star(p) = a \cdot g^\star\left(\frac{p}{a}\right)

f(x) = g(a \cdot x) \Rightarrow f^\star(p) = g^\star\left(\frac{p}{a}\right).

It follows that if a function is homogeneous of degree r then its image under the Legendre transformation is a homogeneous function of degree s, where 1/r + 1/s = 1. (Since f(x) = x^r/r, with r > 1, implies f^\star(p) = p^s/s.) Thus, the only monomial whose degree is invariant under Legendre transform is the quadratic.
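These scaling rules can be spot-checked with a grid approximation of the supremum. The sketch below assumes g(x) = x^r / r restricted to x \ge 0 with r = 3, and arbitrary values of a and p; the helper `legendre` and its grid bounds are hypothetical choices for this example.

```python
def legendre(f, p, lo=0.0, hi=10.0, steps=100000):
    # Grid approximation of sup over x in [lo, hi] of (p * x - f(x))
    return max(p * x - f(x)
               for x in (lo + (hi - lo) * k / steps for k in range(steps + 1)))


r = 3.0
s = r / (r - 1.0)  # conjugate exponent, so that 1/r + 1/s = 1


def g(x):
    # Homogeneous of degree r on x >= 0
    return x ** r / r


a, p = 2.5, 4.0

# Homogeneity: g*(p) = p^s / s
print(legendre(g, p), p ** s / s)
# f = a * g  implies  f*(p) = a * g*(p / a)
print(legendre(lambda x: a * g(x), p), a * legendre(g, p / a))
# f = g(a * x)  implies  f*(p) = g*(p / a)
print(legendre(lambda x: g(a * x), p), legendre(g, p / a))
```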

Behavior under translation

f(x) = g(x) + b \Rightarrow f^\star(p) = g^\star(p) - b

f(x) = g(x + y) \Rightarrow f^\star(p) = g^\star(p) - p \cdot y

Behavior under inversion

f(x) = g^{-1}(x) \Rightarrow f^\star(p) = - p \cdot g^\star\left(\frac{1}{p} \right)

Behavior under linear transformations

Let A : \mathbb{R}^n \to \mathbb{R}^m be a linear transformation. For any convex function f on \mathbb{R}^n, one has

(A f)^\star = f^\star A^\star

where A^\star is the adjoint operator of A defined by

\left \langle Ax, y^\star \right \rangle = \left \langle x, A^\star y^\star \right \rangle,

and A f is the ''push-forward'' of f along A,

(A f)(y) = \inf\{ f(x) : x \in \mathbb{R}^n, A x = y \}.

A closed convex function f is symmetric with respect to a given set G of orthogonal linear transformations,

f(A x) = f(x), \; \forall x, \; \forall A \in G

if and only if f^\star is symmetric with respect to G.

Infimal convolution

The infimal convolution of two functions f and g is defined as

\left(f \star_\inf g\right)(x) = \inf \left \{ f(x-y) + g(y) \, | \, y \in \mathbb{R}^n \right \}.

Let f_1, \ldots, f_m be proper convex functions on \mathbb{R}^n. Then

\left( f_1 \star_\inf \cdots \star_\inf f_m \right)^\star = f_1^\star + \cdots + f_m^\star.
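A coarse numerical check of this identity for m = 2 in one dimension, assuming two hypothetical quadratics f_1(x) = x^2/2 and f_2(x) = x^2/6 (whose conjugates are p^2/2 and 3p^2/2). Both the infimum and the supremum are approximated on a fixed grid, so agreement is only expected up to the grid resolution.

```python
def grid(lo, hi, steps):
    return [lo + (hi - lo) * k / steps for k in range(steps + 1)]


XS = grid(-10.0, 10.0, 400)
YS = grid(-10.0, 10.0, 400)


def f1(x):
    return x * x / 2.0  # conjugate: p^2 / 2


def f2(x):
    return x * x / 6.0  # conjugate: 3 p^2 / 2


def inf_conv(x):
    # (f1 *_inf f2)(x) = inf over y of [f1(x - y) + f2(y)], on a grid
    return min(f1(x - y) + f2(y) for y in YS)


def conjugate(f, p):
    # f*(p) = sup over x of [p * x - f(x)], on a grid
    return max(p * x - f(x) for x in XS)


p = 1.0
lhs = conjugate(inf_conv, p)
rhs = conjugate(f1, p) + conjugate(f2, p)
print(lhs, rhs)  # both should be close to (1 + 3) * p^2 / 2
```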

Fenchel's inequality

For any function f and its convex conjugate f^\star, ''Fenchel's inequality'' (also known as the ''Fenchel–Young inequality'') holds for every x \in X and p \in X^\star, i.e., for ''independent'' pairs x, p:

\left\langle p,x \right\rangle \le f(x) + f^\star(p).
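The inequality can be spot-checked on a grid of independent pairs. The sketch below assumes the pair f(x) = x^4/4 and f^\star(p) = \frac{3}{4}|p|^{4/3}, chosen here only as an example; equality is expected exactly when p = f'(x) = x^3.

```python
def f(x):
    return x ** 4 / 4.0


def f_star(p):
    # Convex conjugate of x^4 / 4
    return 0.75 * abs(p) ** (4.0 / 3.0)


pts = [-3.0 + 0.25 * k for k in range(25)]  # grid on [-3, 3]

# Smallest value of f(x) + f*(p) - p*x over all independent pairs (x, p)
gap = min(f(x) + f_star(p) - p * x for x in pts for p in pts)
print(gap)

# Equality case: p chosen as f'(x0) = x0^3
x0 = 1.5
equality_gap = f(x0) + f_star(x0 ** 3) - x0 ** 3 * x0
print(equality_gap)
```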

References

* Fenchel, W. (1949). "On conjugate convex functions", ''Can. J. Math'' 1: 73-77.

External links

* Legendre transform with figures, at maze5.net
* Legendre and Legendre-Fenchel transforms in a step-by-step explanation, at onmyphd.com
