
In mathematics, the total derivative of a function ''f'' at a point is the best linear approximation near this point of the function with respect to its arguments. Unlike partial derivatives, the total derivative approximates the function with respect to all of its arguments, not just a single one. In many situations, this is the same as considering all partial derivatives simultaneously. The term "total derivative" is primarily used when ''f'' is a function of several variables, because when ''f'' is a function of a single variable, the total derivative is the same as the ordinary derivative of the function.


The total derivative as a linear map

Let U \subseteq \R^n be an open subset. Then a function f:U \to \R^m is said to be (totally) differentiable at a point a\in U if there exists a linear transformation df_a:\R^n \to \R^m such that
:\lim_{h \to 0} \frac{\lVert f(a+h) - f(a) - df_a(h)\rVert}{\lVert h\rVert} = 0.
The linear map df_a is called the (total) derivative or (total) differential of f at a. Other notations for the total derivative include D_a f and Df(a). A function is (totally) differentiable if its total derivative exists at every point in its domain.

Conceptually, the definition of the total derivative expresses the idea that df_a is the best linear approximation to f at the point a. This can be made precise by quantifying the error in the linear approximation determined by df_a. To do so, write
:f(a + h) = f(a) + df_a(h) + \varepsilon(h),
where \varepsilon(h) equals the error in the approximation. To say that the derivative of f at a is df_a is equivalent to the statement
:\varepsilon(h) = o(\lVert h\rVert),
where o is little-o notation and indicates that \varepsilon(h) is much smaller than \lVert h\rVert as h \to 0. The total derivative df_a is the ''unique'' linear transformation for which the error term is this small, and this is the sense in which it is the best linear approximation to f.

The function f is differentiable if and only if each of its components f_i \colon U \to \R is differentiable, so when studying total derivatives, it is often possible to work one coordinate at a time in the codomain. However, the same is not true of the coordinates in the domain. It is true that if f is differentiable at a, then each partial derivative \partial f/\partial x_i exists at a. The converse is false: it can happen that all of the partial derivatives of f at a exist, but f is not differentiable at a. This means that the function is very "rough" at a, to such an extreme that its behavior cannot be adequately described by its behavior in the coordinate directions. When f is not so rough, this cannot happen. More precisely, if all the partial derivatives of f at a exist and are continuous in a neighborhood of a, then f is differentiable at a. When this happens, the total derivative of f is the linear transformation corresponding to the Jacobian matrix of partial derivatives at that point.
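As a concrete illustration, the following sketch (assuming SymPy is available; the map and the base point are arbitrary illustrative choices) assembles the Jacobian matrix of a map f \colon \R^2 \to \R^2 and evaluates it at a point, which by the criterion above gives the total derivative there, since the partial derivatives of this particular map are continuous everywhere:

 import sympy as sp

 x, y = sp.symbols('x y', real=True)
 f = sp.Matrix([x**2 * y, sp.sin(x)])   # an illustrative map f : R^2 -> R^2
 J = f.jacobian([x, y])                 # matrix of first-order partial derivatives
 a = {x: 1, y: 2}                       # an arbitrary base point a = (1, 2)
 print(J.subs(a))                       # Matrix([[4, 1], [cos(1), 0]]) -- the matrix of df_a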


The total derivative as a differential form

When the function under consideration is real-valued, the total derivative can be recast using differential forms. For example, suppose that f \colon \R^n \to \R is a differentiable function of variables x_1, \ldots, x_n. The total derivative of f at a may be written in terms of its Jacobian matrix, which in this instance is a row matrix:
:D f_a = \begin{bmatrix} \frac{\partial f}{\partial x_1}(a) & \cdots & \frac{\partial f}{\partial x_n}(a) \end{bmatrix}.
The linear approximation property of the total derivative implies that if
:\Delta x = \begin{bmatrix} \Delta x_1 & \cdots & \Delta x_n \end{bmatrix}^\mathsf{T}
is a small vector (where \mathsf{T} denotes transpose, so that this vector is a column vector), then
:f(a + \Delta x) - f(a) \approx D f_a \cdot \Delta x = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(a) \cdot \Delta x_i.
Heuristically, this suggests that if dx_1, \ldots, dx_n are infinitesimal increments in the coordinate directions, then
:df_a = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(a) \cdot dx_i.
In fact, the notion of the infinitesimal, which is merely symbolic here, can be equipped with extensive mathematical structure. Techniques such as the theory of differential forms effectively give analytical and algebraic descriptions of objects like infinitesimal increments, dx_i. For instance, dx_i may be interpreted as a linear functional on the vector space \R^n. Evaluating dx_i at a vector h in \R^n measures how much h points in the ith coordinate direction. The total derivative df_a is a linear combination of linear functionals and hence is itself a linear functional. The evaluation df_a(h) measures how much h points in the direction determined by f at a, and this direction is the gradient. This point of view makes the total derivative an instance of the exterior derivative.

Suppose now that f is a vector-valued function, that is, f \colon \R^n \to \R^m. In this case, the components f_i of f are real-valued functions, so they have associated differential forms df_i. The total derivative df amalgamates these forms into a single object and is therefore an instance of a vector-valued differential form.
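Returning to the real-valued case for a small worked example (an illustrative choice), take f(x, y) = x^2 y. Then
:df = 2xy \, dx + x^2 \, dy,
and at the point a = (1, 3) the differential acts on a vector h = (h_1, h_2) \in \R^2 by
:df_a(h) = 6 h_1 + h_2,
which is the dot product of h with the gradient \nabla f(a) = (6, 1), in line with the description above.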


The chain rule for total derivatives

The chain rule has a particularly elegant statement in terms of total derivatives. It says that, for two functions f and g, the total derivative of the composite function g \circ f at a satisfies
:d(g \circ f)_a = dg_{f(a)} \cdot df_a.
If the total derivatives of f and g are identified with their Jacobian matrices, then the composite on the right-hand side is simply matrix multiplication. This is enormously useful in applications, as it makes it possible to account for essentially arbitrary dependencies among the arguments of a composite function.
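The matrix form of the chain rule can be checked symbolically. The following sketch (assuming SymPy; the maps f and g here are arbitrary illustrative choices) compares the Jacobian of g \circ f with the product of the individual Jacobians:

 import sympy as sp

 x, y, u, v = sp.symbols('x y u v', real=True)
 f = sp.Matrix([x + y**2, x*y])                 # f : R^2 -> R^2
 g = sp.Matrix([sp.exp(u) * v])                 # g : R^2 -> R

 composite = g.subs({u: f[0], v: f[1]})         # g(f(x, y)) computed directly
 J_direct = composite.jacobian([x, y])

 # dg_{f(a)} * df_a : Jacobian of g evaluated along f, times the Jacobian of f
 J_chain = g.jacobian([u, v]).subs({u: f[0], v: f[1]}) * f.jacobian([x, y])

 print(sp.simplify(J_direct - J_chain))         # zero matrix, as the chain rule asserts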


Example: Differentiation with direct dependencies

Suppose that ''f'' is a function of two variables, ''x'' and ''y''. If these two variables are independent, so that the domain of ''f'' is \R^2, then the behavior of ''f'' may be understood in terms of its partial derivatives in the ''x'' and ''y'' directions. However, in some situations, ''x'' and ''y'' may be dependent. For example, it might happen that ''f'' is constrained to a curve y = y(x). In this case, we are actually interested in the behavior of the composite function f(x, y(x)). The partial derivative of ''f'' with respect to ''x'' does not give the true rate of change of ''f'' with respect to changing ''x'', because changing ''x'' necessarily changes ''y''. However, the chain rule for the total derivative takes such dependencies into account. Write \gamma(x) = (x, y(x)). Then, the chain rule says
:d(f \circ \gamma)_{x_0} = df_{\gamma(x_0)} \cdot d\gamma_{x_0}.
By expressing the total derivative using Jacobian matrices, this becomes:
:\frac{d(f \circ \gamma)}{dx}(x_0) = \frac{\partial f}{\partial x}(x_0, y(x_0)) \cdot \frac{dx}{dx}(x_0) + \frac{\partial f}{\partial y}(x_0, y(x_0)) \cdot \frac{dy}{dx}(x_0).
Suppressing the evaluation at x_0 for legibility, we may also write this as
:\frac{df}{dx} = \frac{\partial f}{\partial x} \frac{dx}{dx} + \frac{\partial f}{\partial y} \frac{dy}{dx}.
This gives a straightforward formula for the derivative of f(x, y(x)) in terms of the partial derivatives of f and the derivative of y(x).

For example, suppose
:f(x,y)=xy.
The rate of change of ''f'' with respect to ''x'' is usually the partial derivative of ''f'' with respect to ''x''; in this case,
:\frac{\partial f}{\partial x} = y.
However, if ''y'' depends on ''x'', the partial derivative does not give the true rate of change of ''f'' as ''x'' changes, because the partial derivative assumes that ''y'' is fixed. Suppose we are constrained to the line
:y=x.
Then
:f(x,y) = f(x,x) = x^2,
and the total derivative of ''f'' with respect to ''x'' is
:\frac{df}{dx} = 2 x,
which we see is not equal to the partial derivative \partial f/\partial x. Instead of immediately substituting for ''y'' in terms of ''x'', however, we can also use the chain rule as above:
:\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = y + x \cdot 1 = x + y = 2x.
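A short symbolic check of this example (assuming SymPy) treats ''y'' first as an unspecified function of ''x'', so that differentiation applies the chain rule automatically, and then specializes to the constraint y = x:

 import sympy as sp

 x = sp.symbols('x', real=True)
 y = sp.Function('y')

 f = x * y(x)                        # f(x, y(x)) with y left unspecified
 total = sp.diff(f, x)               # total derivative: y(x) + x*Derivative(y(x), x)
 print(total)

 print(total.subs(y(x), x).doit())   # along y = x this reduces to 2*x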


Example: Differentiation with indirect dependencies

While one can often perform substitutions to eliminate indirect dependencies, the chain rule provides for a more efficient and general technique. Suppose L(t,x_1,\dots,x_n) is a function of time t and n variables x_i which themselves depend on time. Then, the time derivative of L is
:\frac{dL}{dt} = \frac{d}{dt} L \bigl(t, x_1(t), \ldots, x_n(t)\bigr).
The chain rule expresses this derivative in terms of the partial derivatives of L and the time derivatives of the functions x_i:
:\frac{dL}{dt} = \frac{\partial L}{\partial t} + \sum_{i=1}^n \frac{\partial L}{\partial x_i}\frac{dx_i}{dt} = \biggl(\frac{\partial}{\partial t} + \sum_{i=1}^n \frac{dx_i}{dt}\frac{\partial}{\partial x_i}\biggr)(L).
This expression is often used in physics for a gauge transformation of the Lagrangian, as two Lagrangians that differ only by the total time derivative of a function of time and the n generalized coordinates lead to the same equations of motion. An interesting example concerns the resolution of causality in the Wheeler–Feynman time-symmetric theory. The operator in brackets (in the final expression above) is also called the total derivative operator (with respect to t).

For example, the total derivative of f(x(t),y(t)) is
:\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.
Here there is no \partial f / \partial t term since f itself does not depend on the independent variable t directly.


Total differential equation

A ''total differential equation'' is a differential equation expressed in terms of total derivatives. Since the exterior derivative is coordinate-free, in a sense that can be given a technical meaning, such equations are intrinsic and ''geometric''.
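For instance, the total differential equation
:y \, dx + x \, dy = 0
can be rewritten as d(xy) = 0, so its solutions are the level curves xy = C, a statement about the function xy itself rather than about any particular choice of coordinates.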


Application to equation systems

In economics, it is common for the total derivative to arise in the context of a system of equations. For example, a simple supply-demand system might specify the quantity ''q'' of a product demanded as a function ''D'' of its price ''p'' and consumers' income ''I'', the latter being an exogenous variable, and might specify the quantity supplied by producers as a function ''S'' of its price and two exogenous resource cost variables ''r'' and ''w''. The resulting system of equations
:q=D(p, I),
:q=S(p, r, w),
determines the market equilibrium values of the variables ''p'' and ''q''. The total derivative dp/dr of ''p'' with respect to ''r'', for example, gives the sign and magnitude of the reaction of the market price to the exogenous variable ''r''. In the indicated system, there are a total of six possible total derivatives, also known in this context as comparative static derivatives: dp/dr, dq/dr, dp/dw, dq/dw, dp/dI, and dq/dI. The total derivatives are found by totally differentiating the system of equations, dividing through by, say, dr, treating dp/dr and dq/dr as the unknowns, setting dI = dw = 0, and solving the two totally differentiated equations simultaneously, typically by using Cramer's rule.
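As a sketch of this computation for the derivatives with respect to ''r'' (writing D_p = \partial D/\partial p, S_p = \partial S/\partial p, and S_r = \partial S/\partial r), totally differentiating the system with dI = dw = 0 gives
:dq = D_p \, dp, \qquad dq = S_p \, dp + S_r \, dr.
Dividing through by dr and solving the two equations for the unknowns dp/dr and dq/dr yields
:\frac{dp}{dr} = \frac{S_r}{D_p - S_p}, \qquad \frac{dq}{dr} = \frac{D_p S_r}{D_p - S_p},
provided that D_p \neq S_p.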





