In applied mathematics, Hessian automatic differentiation refers to techniques based on automatic differentiation (AD) that calculate the second derivative of an n-dimensional function, known as the Hessian matrix.

When examining a function in a neighborhood of a point, one can discard many complicated global aspects of the function and accurately approximate it with simpler functions. The quadratic approximation is the best-fitting quadratic in the neighborhood of a point, and is frequently used in engineering and science. To calculate the quadratic approximation, one must first calculate its gradient and Hessian matrix. Let f: \mathbb{R}^n \rightarrow \mathbb{R}; for each x \in \mathbb{R}^n, the Hessian matrix H(x) \in \mathbb{R}^{n \times n} is the second-order derivative of f at x and is a symmetric matrix.
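The quadratic approximation built from the gradient and Hessian can be sketched concretely. In the sketch below, the example function and its hand-written gradient and Hessian are illustrative assumptions, not taken from the source:

```python
import numpy as np

# Quadratic (second-order Taylor) approximation of f around a point x0:
#   f(x0 + d) ~= f(x0) + grad_f(x0) . d + 0.5 * d^T H(x0) d
# Illustrative example function (an assumption for this sketch):
#   f(x, y) = x^2 * y + y^3
def f(p):
    x, y = p
    return x**2 * y + y**3

def grad_f(p):
    x, y = p
    return np.array([2 * x * y, x**2 + 3 * y**2])

def hess_f(p):
    x, y = p
    return np.array([[2 * y, 2 * x],
                     [2 * x, 6 * y]])

def quadratic_approx(x0, d):
    # Best-fitting quadratic in a neighborhood of x0, evaluated at x0 + d
    return f(x0) + grad_f(x0) @ d + 0.5 * d @ hess_f(x0) @ d
```

For small steps d the approximation error shrinks with the cube of the step size, which is why this local model is so widely used in optimization and the sciences.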


Reverse Hessian-vector products

For a given u \in \mathbb{R}^n, this method efficiently calculates the Hessian-vector product H(x)u. It can therefore be used to calculate the entire Hessian by computing H(x)e_i for i = 1, \ldots, n. The method first uses forward AD to compute the directional derivative x \mapsto u^T\nabla f(x); it then applies reverse AD to this scalar-valued function to obtain its gradient, \nabla\left(u^T \nabla f(x)\right) = u^T H(x) = (H(x)u)^T, where the last equality uses the symmetry of H(x). Each of the two steps costs a constant multiple of one evaluation of f, so the entire Hessian can be evaluated at a cost proportional to n evaluations of the function.
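A minimal self-contained sketch of such a Hessian-vector product is below. The Dual and Var classes and the example function are illustrative assumptions, not a standard library; the sketch also uses the mathematically equivalent forward-over-reverse ordering, seeding the inputs with tangent u so that a single reverse sweep over dual numbers yields both \nabla f(x) and H(x)u:

```python
class Dual:
    """Forward-mode AD scalar: a value plus a tangent (directional derivative)."""
    def __init__(self, val, tan=0.0):
        self.val, self.tan = val, tan
    def __add__(self, other):
        return Dual(self.val + other.val, self.tan + other.tan)
    def __mul__(self, other):
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)

class Var:
    """Reverse-mode tape node whose value and local partials are Duals."""
    def __init__(self, value, parents=()):
        self.value = value        # a Dual
        self.parents = parents    # sequence of (parent Var, local partial as Dual)
        self.adjoint = Dual(0.0)
    def __add__(self, other):
        return Var(self.value + other.value,
                   [(self, Dual(1.0)), (other, Dual(1.0))])
    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

def hvp(f, x, u):
    """Return (grad f(x), H(x) u) from one reverse sweep over dual numbers."""
    xs = [Var(Dual(xi, ui)) for xi, ui in zip(x, u)]
    out = f(xs)
    # Topological order of the tape, parents before children.
    order, seen = [], set()
    def visit(v):
        if id(v) not in seen:
            seen.add(id(v))
            for p, _ in v.parents:
                visit(p)
            order.append(v)
    visit(out)
    out.adjoint = Dual(1.0)
    for v in reversed(order):
        for p, d in v.parents:
            p.adjoint = p.adjoint + v.adjoint * d
    # Primal parts of the input adjoints give grad f(x);
    # tangent parts give the Hessian-vector product H(x) u.
    return [v.adjoint.val for v in xs], [v.adjoint.tan for v in xs]
```

For example, with f(x) = x_0^2 x_1 at x = (1, 2) and u = e_1, one sweep returns both \nabla f(x) = (4, 1) and H(x)u = (4, 2).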


Reverse Hessian: Edge_Pushing

Edge_Pushing is an algorithm that calculates the entire Hessian with one forward and one reverse sweep of the computational graph. It is the result of applying the reverse gradient method to the computational graph of the gradient. Naturally, this graph has ''n'' output nodes, so in a sense one would have to apply the reverse gradient method to each output node; Edge_Pushing avoids this by taking overlapping calculations into account.
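Edge_Pushing itself is too involved for a short sketch, but the object it computes can be illustrated: the gradient is itself a function with ''n'' outputs, and the Hessian is the Jacobian of that function. The sketch below is an illustrative assumption, not Edge_Pushing; central differences stand in for the per-output differentiation sweeps, making concrete the n-sweep baseline that Edge_Pushing fuses into a single reverse pass by sharing overlapping work:

```python
import numpy as np

# Illustrative example: f(x, y) = x^2 * y + y^3, with its gradient written
# by hand. The Hessian is the Jacobian of this n-output gradient function.
def grad_f(p):
    x, y = p
    return np.array([2 * x * y, x**2 + 3 * y**2])

def hessian_from_gradient(p, eps=1e-6):
    # Naive baseline: one differentiation sweep per direction, here done
    # with central differences as a stand-in for AD. Edge_Pushing instead
    # obtains the whole (symmetric) Hessian in a single reverse sweep.
    n = len(p)
    H = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        H[:, j] = (grad_f(p + e) - grad_f(p - e)) / (2 * eps)
    return 0.5 * (H + H.T)  # symmetrize to clean up rounding noise
```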