In applied mathematics, Hessian automatic differentiation refers to techniques based on automatic differentiation (AD) that calculate the second derivative of an ''n''-dimensional function, known as the Hessian matrix.
When examining a function in a neighborhood of a point, one can discard many complicated global aspects of the function and accurately approximate it with simpler functions. The quadratic approximation is the best-fitting quadratic in the neighborhood of a point, and is frequently used in engineering and science. To calculate the quadratic approximation, one must first calculate the gradient and Hessian matrix of the function.
Let <math>f : \mathbb{R}^n \to \mathbb{R}</math> be twice continuously differentiable. For each <math>x \in \mathbb{R}^n</math>, the Hessian matrix <math>H(x) = \nabla^2 f(x) \in \mathbb{R}^{n \times n}</math> is the second-order derivative of <math>f</math> at <math>x</math> and is a symmetric matrix.
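In this notation, the quadratic approximation mentioned above is the second-order Taylor expansion of <math>f</math> around <math>x</math>:

<math display="block">f(x + \Delta x) \approx f(x) + \nabla f(x)^{T} \Delta x + \tfrac{1}{2}\, \Delta x^{T} H(x)\, \Delta x.</math>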
Reverse Hessian-vector products
For a given vector <math>v \in \mathbb{R}^n</math>, this method efficiently calculates the Hessian-vector product <math>H(x)v</math>. It can therefore be used to calculate the entire Hessian column by column, by computing <math>H(x)e_i</math> for each coordinate unit vector <math>e_i</math>, <math>i = 1, \ldots, n</math>.
The method works in two steps: it first uses forward AD to compute the directional derivative <math>g(x) = \nabla f(x)^{T} v</math>, and then calculates the gradient of <math>g</math> using reverse AD to yield <math>\nabla g(x) = H(x)v</math>. Each of these two steps comes at a time cost proportional to evaluating the function, thus the entire Hessian can be evaluated at a cost proportional to ''n'' evaluations of the function.
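As a concrete sketch of this reverse-over-forward composition (not tied to any particular published implementation), the two steps can be written in a few lines of JAX; the helper name <code>hvp</code> and the test function below are chosen purely for illustration:

```python
import jax
import jax.numpy as jnp

def hvp(f, x, v):
    # Forward AD: jax.jvp returns (f(x), directional derivative ∇f(x)·v).
    # Reverse AD: jax.grad of that scalar w.r.t. x yields H(x) v.
    return jax.grad(lambda y: jax.jvp(f, (y,), (v,))[1])(x)

# Illustrative function and evaluation point (hypothetical example).
f = lambda x: jnp.sum(x ** 3) + x[0] * x[1]
x = jnp.array([1.0, 2.0, 3.0])

# Assemble the full Hessian from n Hessian-vector products H(x) e_i.
H = jnp.stack([hvp(f, x, e_i) for e_i in jnp.eye(3)])
```

Each call to <code>hvp</code> returns one column <math>H(x)e_i</math>; stacking them as rows still yields the Hessian because the matrix is symmetric.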
Reverse Hessian: Edge_Pushing
An algorithm that calculates the entire Hessian with one forward and one reverse sweep of the computational graph is Edge_Pushing. Edge_Pushing is the result of applying the reverse gradient method to the computational graph of the gradient. Since this graph has ''n'' output nodes, one would in a sense have to apply the reverse gradient method to each output node; Edge_Pushing avoids this redundancy by taking into account the calculations that these sweeps have in common.
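Edge_Pushing itself operates directly on the computational graph and exploits the Hessian's symmetry, so a faithful implementation is beyond a short sketch. The baseline construction it improves on, applying reverse AD to the graph of the gradient once per output node, can however be expressed directly; in JAX this reverse-over-reverse composition (using the same illustrative function as above) is:

```python
import jax
import jax.numpy as jnp

# Reverse AD applied to the computational graph of the gradient:
# jax.grad builds the gradient, and jax.jacrev differentiates each of
# its n outputs with a reverse sweep. Unlike Edge_Pushing, the n sweeps
# share no work and the symmetry of the Hessian is not exploited.
naive_hessian = jax.jacrev(jax.grad(lambda x: jnp.sum(x ** 3) + x[0] * x[1]))

print(naive_hessian(jnp.array([1.0, 2.0, 3.0])))
```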