Descent Direction
In optimization, a descent direction is a vector \mathbf{p}\in\mathbb{R}^n that, in the sense below, moves us closer towards a local minimum \mathbf{x}^* of our objective function f:\mathbb{R}^n\to\mathbb{R}. Suppose we are computing \mathbf{x}^* by an iterative method, such as line search. We define a descent direction \mathbf{p}_k\in\mathbb{R}^n at the kth iterate to be any \mathbf{p}_k such that \langle\mathbf{p}_k,\nabla f(\mathbf{x}_k)\rangle < 0, where \langle\cdot,\cdot\rangle denotes the inner product. The motivation for such an approach is that small steps along \mathbf{p}_k guarantee that f is reduced, by Taylor's theorem. Using this definition, the negative of a non-zero gradient is always a descent direction, since \langle-\nabla f(\mathbf{x}_k),\nabla f(\mathbf{x}_k)\rangle = -\|\nabla f(\mathbf{x}_k)\|^2 < 0. Numerous methods exist to compute descent directions, each with differing merits, such as gradient descent or the conjugate gradient method. More generally, if P is a positive definite matrix, then \mathbf{p}_k = -P\,\nabla f(\mathbf{x}_k) is a descent direction at \mathbf{x}_k; this generality is used in preconditioned gradient descent methods.
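A minimal numerical check of this definition, assuming a made-up quadratic objective (the names grad_f and is_descent_direction below are illustrative, not from the article): the sign of \langle\mathbf{p},\nabla f(\mathbf{x})\rangle identifies descent directions, and the negative gradient always qualifies.

```python
import numpy as np

def grad_f(x):
    # Gradient of the hypothetical objective f(x) = x_1^2 + 10*x_2^2.
    return np.array([2.0 * x[0], 20.0 * x[1]])

def is_descent_direction(p, x):
    # p is a descent direction at x iff <p, grad f(x)> < 0.
    return np.dot(p, grad_f(x)) < 0

x_k = np.array([1.0, 1.0])
print(is_descent_direction(-grad_f(x_k), x_k))   # True: negative gradient
print(is_descent_direction(grad_f(x_k), x_k))    # False: ascent direction
```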
Optimization (mathematics)
Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems of sorts arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries. In the more general approach, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the value of the function. The generalization of optimization theory and techniques to other formulations constitutes a large area of applied mathematics. More generally, optimization includes finding "best available" values of some objective function given a ...
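A toy sketch of this formulation, with an assumed objective f(x) = (x - 3)^2 + 1 and the interval [0, 5] as the allowed set, both chosen only for illustration:

```python
import numpy as np

# Hypothetical problem: minimize f(x) = (x - 3)^2 + 1 over the allowed set [0, 5].
f = lambda x: (x - 3.0) ** 2 + 1.0
candidates = np.linspace(0.0, 5.0, 501)   # systematically chosen input values
best = min(candidates, key=f)             # compare the computed function values
print(best, f(best))                      # ~3.0, ~1.0
```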
Line Search
In optimization, the line search strategy is one of two basic iterative approaches to find a local minimum \mathbf{x}^* of an objective function f:\mathbb{R}^n\to\mathbb{R}. The other approach is trust region. The line search approach first finds a descent direction along which the objective function f will be reduced and then computes a step size that determines how far \mathbf{x} should move along that direction. The descent direction can be computed by various methods, such as gradient descent or a quasi-Newton method. The step size can be determined either exactly or inexactly.

Example use

Here is an example gradient method that uses a line search in step 4 (a runnable sketch follows the list):
1. Set iteration counter k=0, and make an initial guess \mathbf{x}_0 for the minimum
2. Repeat:
3. Compute a descent direction \mathbf{p}_k
4. Choose \alpha_k to 'loosely' minimize h(\alpha_k)=f(\mathbf{x}_k+\alpha_k\mathbf{p}_k) over \alpha_k\in\mathbb{R}_+
5. Update \mathbf{x}_{k+1}=\mathbf{x}_k+\alpha_k\mathbf{p}_k and set k=k+1
6. Until \|\nabla f(\mathbf{x}_k)\| falls below a tolerance
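A hedged Python sketch of the loop above, using the steepest-descent direction and a backtracking (Armijo) rule to 'loosely' minimize h; the test problem, constants, and function names are assumptions, not part of the article.

```python
import numpy as np

def gradient_descent_with_line_search(f, grad_f, x0, tol=1e-8, max_iter=500):
    """Steepest-descent direction plus a backtracking (Armijo) line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:      # step 6: gradient small enough
            break
        p = -g                           # step 3: descent direction
        alpha = 1.0
        # Step 4: shrink alpha until a sufficient-decrease condition holds.
        while f(x + alpha * p) > f(x) + 1e-4 * alpha * np.dot(g, p):
            alpha *= 0.5
        x = x + alpha * p                # step 5: update the iterate
    return x

# Hypothetical test problem: f(x) = x_1^2 + 10 x_2^2, minimum at the origin.
f = lambda x: x[0] ** 2 + 10.0 * x[1] ** 2
grad_f = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
print(gradient_descent_with_line_search(f, grad_f, [3.0, -2.0]))
```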
Inner Product
In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, often denoted with angle brackets such as in \langle a, b \rangle. Inner products allow formal definitions of intuitive geometric notions, such as lengths, angles, and orthogonality (zero inner product) of vectors. Inner product spaces generalize Euclidean vector spaces, in which the inner product is the dot product or ''scalar product'' of Cartesian coordinates. Inner product spaces of infinite dimension are widely used in functional analysis. Inner product spaces over the field of complex numbers are sometimes referred to as unitary spaces. The first usage of the concept of a vector space with an inner product is due to Giuseppe Peano, in 1898. An inner product naturally induces an associated norm (denoted |x| and |y|); ...
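A small sketch of the finite-dimensional case, where the inner product is the dot product, inducing the Euclidean norm and a notion of orthogonality; the example vectors are made up.

```python
import numpy as np

# In R^n the standard inner product is the dot product; it induces the
# Euclidean norm |x| = sqrt(<x, x>) and defines orthogonality (<x, y> = 0).
x = np.array([3.0, 4.0])
y = np.array([-4.0, 3.0])

inner = np.dot(x, y)              # <x, y> = 0, so x and y are orthogonal
norm_x = np.sqrt(np.dot(x, x))    # induced norm |x| = 5.0
print(inner, norm_x)
```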
Taylor's Theorem
In calculus, Taylor's theorem gives an approximation of a ''k''-times differentiable function around a given point by a polynomial of degree ''k'', called the ''k''th-order Taylor polynomial. For a smooth function, the Taylor polynomial is the truncation at the order ''k'' of the Taylor series of the function. The first-order Taylor polynomial is the linear approximation of the function, and the second-order Taylor polynomial is often referred to as the quadratic approximation. There are several versions of Taylor's theorem, some giving explicit estimates of the approximation error of the function by its Taylor polynomial. Taylor's theorem is named after the mathematician Brook Taylor, who stated a version of it in 1715, although an earlier version of the result was already mentioned in 1671 by James Gregory. Taylor's theorem is taught in introductory-level calculus courses and is one of the central elementary tools in mathematical analysis. It gives simple arithmetic formulas ...
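For instance, the first- and second-order Taylor polynomials of the exponential function about 0 can be compared with the exact value; the helper taylor_exp below is illustrative only.

```python
import math

# Taylor polynomials of exp around a = 0:
# linear approximation 1 + x, quadratic approximation 1 + x + x^2/2, etc.
def taylor_exp(x, order):
    return sum(x ** k / math.factorial(k) for k in range(order + 1))

x = 0.1
print(math.exp(x))        # 1.10517...  (exact value)
print(taylor_exp(x, 1))   # 1.1         (linear approximation)
print(taylor_exp(x, 2))   # 1.105       (quadratic approximation)
```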
Gradient Descent
In mathematics, gradient descent (also often called steepest descent) is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated steps in the opposite direction of the gradient (or approximate gradient) of the function at the current point, because this is the direction of steepest descent. Conversely, stepping in the direction of the gradient will lead to a local maximum of that function; the procedure is then known as gradient ascent. Gradient descent is generally attributed to Augustin-Louis Cauchy, who first suggested it in 1847. Jacques Hadamard independently proposed a similar method in 1907. Its convergence properties for non-linear optimization problems were first studied by Haskell Curry in 1944, with the method becoming increasingly well-studied and used in the following decades.

Description

Gradient descent is based on the observation that if the multi-variable function F(\mathbf{x}) is ...
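A brief sketch of the basic update x_{k+1} = x_k - \gamma\nabla F(x_k) with a fixed step size \gamma, on an assumed quadratic F whose minimizer is (1, -2); the function and constants are hypothetical.

```python
import numpy as np

# Plain gradient descent with a fixed step size gamma on a hypothetical
# differentiable function F(x) = (x_1 - 1)^2 + (x_2 + 2)^2.
grad_F = lambda x: np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)])

x = np.array([5.0, 5.0])        # starting point
gamma = 0.1                     # fixed step size
for _ in range(200):
    x = x - gamma * grad_F(x)   # step opposite to the gradient
print(x)                        # ~ [1, -2], the minimizer
```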
Conjugate Gradient Method
In mathematics, the conjugate gradient method is an algorithm for the numerical solution of particular systems of linear equations, namely those whose matrix is symmetric positive-definite. The conjugate gradient method is often implemented as an iterative algorithm, applicable to sparse systems that are too large to be handled by a direct implementation or other direct methods such as the Cholesky decomposition. Large sparse systems often arise when numerically solving partial differential equations or optimization problems. The conjugate gradient method can also be used to solve unconstrained optimization problems such as energy minimization. It is commonly attributed to Magnus Hestenes and Eduard Stiefel, who programmed it on the Z4, and extensively researched it. The biconjugate gradient method provides a generalization to non-symmetric matrices. Various nonlinear conjugate gradient methods seek minima of nonlinear optimization problems.

Description of the problem addressed ...
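A compact sketch of the linear conjugate gradient iteration for A x = b with A symmetric positive-definite; the 2x2 test system and the function name are invented for illustration.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Linear CG for A x = b with A symmetric positive-definite."""
    n = len(b)
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    r = b - A @ x                 # residual
    p = r.copy()                  # initial search direction
    rs_old = r @ r
    for _ in range(max_iter or n):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)         # exact step length along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p     # new A-conjugate direction
        rs_old = rs_new
    return x

# Hypothetical symmetric positive-definite test system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))   # matches np.linalg.solve(A, b)
```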
Positive Definite
In mathematics, positive definiteness is a property of any object to which a bilinear form or a sesquilinear form may be naturally associated, which is positive-definite. See, in particular:
* Positive-definite bilinear form
* Positive-definite function
* Positive-definite function on a group
* Positive-definite functional
* Positive-definite kernel
* Positive-definite matrix: in mathematics, a symmetric matrix M with real entries is positive-definite if the real number z^\textsf{T}Mz is positive for every nonzero real column vector z, where z^\textsf{T} is the transpose of z. More generally, a Hermitian matrix (that is, ...
* Positive-definite quadratic form
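A quick numerical check of the matrix case, using the equivalent eigenvalue criterion (a real symmetric M is positive-definite iff all of its eigenvalues are positive); the example matrices are made up.

```python
import numpy as np

def is_positive_definite(M):
    """True iff M is real symmetric with z^T M z > 0 for all nonzero z,
    equivalently iff all eigenvalues of M are positive."""
    M = np.asarray(M, dtype=float)
    if not np.allclose(M, M.T):
        return False
    return np.all(np.linalg.eigvalsh(M) > 0)

print(is_positive_definite([[2.0, -1.0], [-1.0, 2.0]]))   # True
print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))     # False (eigenvalue -1)
```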
Preconditioned Gradient Descent
In mathematics, preconditioning is the application of a transformation, called the preconditioner, that conditions a given problem into a form that is more suitable for numerical solving methods. Preconditioning is typically related to reducing a condition number of the problem. The preconditioned problem is then usually solved by an iterative method.

Preconditioning for linear systems

In linear algebra and numerical analysis, a preconditioner P of a matrix A is a matrix such that P^{-1}A has a smaller condition number than A. It is also common to call T=P^{-1} the preconditioner, rather than P, since P itself is rarely explicitly available. In modern preconditioning, the application of T=P^{-1}, i.e., multiplication of a column vector, or a block of column vectors, by T=P^{-1}, is commonly performed in a matrix-free fashion, i.e., where neither P, nor T=P^{-1} (and often not even A) are explicitly available in a matrix form. Preconditioners are useful in iterative methods to solve a linear ...
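A small illustration with a Jacobi (diagonal) preconditioner P = diag(A) on an assumed badly scaled matrix, showing the drop in condition number of P^{-1}A; the matrix is invented for the example.

```python
import numpy as np

# Jacobi (diagonal) preconditioning sketch: P = diag(A), so applying P^{-1}
# only requires the diagonal of A (matrix-free in practice).
A = np.diag([1.0, 100.0, 10000.0]) + 0.1 * np.ones((3, 3))
P_inv = np.diag(1.0 / np.diag(A))

print(np.linalg.cond(A))           # large condition number
print(np.linalg.cond(P_inv @ A))   # much closer to 1 after preconditioning
```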
Directional Derivative
In mathematics, the directional derivative of a multivariable differentiable (scalar) function along a given vector v at a given point x intuitively represents the instantaneous rate of change of the function, moving through x with a velocity specified by v. The directional derivative of a scalar function ''f'' with respect to a vector \mathbf{v} at a point (e.g., position) \mathbf{x} may be denoted by any of the following: \nabla_{\mathbf{v}} f(\mathbf{x}) = f'_{\mathbf{v}}(\mathbf{x}) = D_{\mathbf{v}} f(\mathbf{x}) = Df(\mathbf{x})(\mathbf{v}) = \partial_{\mathbf{v}} f(\mathbf{x}) = \mathbf{v}\cdot\nabla f(\mathbf{x}) = \mathbf{v}\cdot\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}. It therefore generalizes the notion of a partial derivative, in which the rate of change is taken along one of the curvilinear coordinate curves, all other coordinates being constant. The directional derivative is a special case of the Gateaux derivative.

Definition

The ''directional derivative'' of a scalar function f(\mathbf{x}) = f(x_1, x_2, \ldots, x_n) along a vector \mathbf{v} = (v_1, \ldots, v_n) is the function \nabla_{\mathbf{v}} f defined ...
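A short check of the identity \nabla_{\mathbf{v}} f(\mathbf{x}) = \mathbf{v}\cdot\nabla f(\mathbf{x}) against a finite-difference quotient, on an assumed function f(x, y) = x^2 y chosen only for illustration.

```python
import numpy as np

# Directional derivative of the hypothetical f(x, y) = x^2 * y at x0 along v,
# computed both as v . grad f(x0) and as a difference quotient.
f = lambda x: x[0] ** 2 * x[1]
grad_f = lambda x: np.array([2.0 * x[0] * x[1], x[0] ** 2])

x0 = np.array([1.0, 2.0])
v = np.array([3.0, 4.0])

analytic = np.dot(v, grad_f(x0))          # v . grad f(x0) = 3*4 + 4*1 = 16
h = 1e-6
numeric = (f(x0 + h * v) - f(x0)) / h     # approximates the defining limit
print(analytic, numeric)
```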