
Principal Components Analysis
Principal component analysis (PCA) is a linear dimensionality reduction technique with applications in exploratory data analysis, visualization and data preprocessing. The data is linearly transformed onto a new coordinate system such that the directions (principal components) capturing the largest variation in the data can be easily identified. The principal components of a collection of points in a real coordinate space are a sequence of p unit vectors, where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i-1 vectors. Here, a best-fitting line is defined as one that minimizes the average squared perpendicular distance from the points to the line. These directions (i.e., principal components) constitute an orthonormal basis in which different individual dimensions of the data are linearly uncorrelated. ...
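As a concrete illustration, here is a minimal PCA sketch in Python (NumPy assumed; the function name and toy data are illustrative, not from the source): the principal components are the orthonormal eigenvectors of the sample covariance matrix, ordered by decreasing variance.

```python
import numpy as np

def pca(X, k):
    """Project X (n_samples x n_features) onto its top-k principal components."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # sort directions by decreasing variance
    components = eigvecs[:, order[:k]]       # top-k orthonormal directions
    return Xc @ components, components

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
scores, components = pca(X, k=2)             # 200 points projected onto 2 components
```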

Linear Independence
In the theory of vector spaces, a set of vectors is said to be ''linearly independent'' if there exists no nontrivial linear combination of the vectors that equals the zero vector. If such a linear combination exists, then the vectors are said to be ''linearly dependent''. These concepts are central to the definition of dimension. A vector space can be of finite dimension or infinite dimension depending on the maximum number of linearly independent vectors. The definition of linear dependence and the ability to determine whether a subset of vectors in a vector space is linearly dependent are central to determining the dimension of a vector space. Definition A sequence of vectors \mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k from a vector space V is said to be ''linearly dependent'' if there exist scalars a_1, a_2, \dots, a_k, not all zero, such that :a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_k\mathbf{v}_k = \mathbf{0}, where \mathbf{0} denotes ...
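In finite dimensions this definition can be tested computationally: k vectors are linearly independent exactly when the matrix having them as columns has rank k. A small sketch (NumPy assumed; the helper name is illustrative):

```python
import numpy as np

def linearly_independent(vectors):
    # Independent iff the matrix whose columns are the vectors has full column rank.
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == A.shape[1]

v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])
print(linearly_independent([v1, v2]))           # True
print(linearly_independent([v1, v2, v1 + v2]))  # False: a_1 = a_2 = 1, a_3 = -1 works
```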


Harold Hotelling
Harold Hotelling (September 29, 1895 – December 26, 1973) was an American mathematical statistician and an influential economic theorist, known for Hotelling's law, Hotelling's lemma, and Hotelling's rule in economics, as well as Hotelling's T-squared distribution in statistics. He also developed and named the principal component analysis method widely used in finance, statistics and computer science. He was associate professor of mathematics at Stanford University from 1927 until 1931, a member of the faculty of Columbia University from 1931 until 1946, and a professor of Mathematical Statistics at the University of North Carolina at Chapel Hill from 1946 until his death. A street in Chapel Hill bears his name. In 1972, he received the North Carolina Award for contributions to science. Statistics Hotelling is known to statisticians because of Hotelling's T-squared distribution, which is a generalization of Student's t-distribution to the multivariate setting, and ...


Principal Axis Theorem
In geometry and linear algebra, a principal axis is a certain line in a Euclidean space associated with an ellipsoid or hyperboloid, generalizing the major and minor axes of an ellipse or hyperbola. The principal axis theorem states that the principal axes are perpendicular, and gives a constructive procedure for finding them. Mathematically, the principal axis theorem is a generalization of the method of completing the square from elementary algebra. In linear algebra and functional analysis, the principal axis theorem is a geometrical counterpart of the spectral theorem. It has applications to the statistics of principal components analysis and the singular value decomposition. In physics, the theorem is fundamental to the studies of angular momentum and birefringence. Motivation The equations in the Cartesian plane :\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \quad\text{and}\quad \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 define, respectively, an ellipse and a hyperbola. In each case, the x and y axes are the ...
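A brief sketch of the constructive procedure in the finite-dimensional case (NumPy assumed; the quadratic form is an illustrative choice): diagonalizing the symmetric matrix of a quadratic form yields perpendicular principal axes, which is the geometric content of the spectral theorem.

```python
import numpy as np

# Quadratic form q(x, y) = 5x^2 + 8xy + 5y^2, represented by a symmetric matrix.
A = np.array([[5.0, 4.0],
              [4.0, 5.0]])

# The spectral theorem gives orthonormal eigenvectors: the principal axes.
eigvals, eigvecs = np.linalg.eigh(A)
print(eigvals)                   # [1. 9.]: coefficients along the principal axes
print(eigvecs)                   # columns: perpendicular unit principal-axis directions
print(eigvecs.T @ A @ eigvecs)   # diagonal: the cross term 8xy has been eliminated
```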

Karl Pearson
Karl Pearson (born Carl Pearson; 27 March 1857 – 27 April 1936) was an English biostatistician and mathematician. He has been credited with establishing the discipline of mathematical statistics. He founded the world's first university statistics department at University College London in 1911, and contributed significantly to the fields of biometrics and meteorology. Pearson was also a proponent of Social Darwinism and eugenics, and his thought is an example of what is today described as scientific racism. Pearson was a protégé and biographer of Sir Francis Galton. He edited and completed both William Kingdon Clifford's ''Common Sense of the Exact Sciences'' (1885) and Isaac Todhunter's ''History of the Theory of Elasticity'', Vol. 1 (1886–1893) and Vol. 2 (1893), following their deaths. Early life and education Pearson was born in Islington, London, into a Quaker family. His father was William Pearson QC of the Inner Temple, and his mother Fanny (née Smit ...

Lp Space
In mathematics, the L^p spaces are function spaces defined using a natural generalization of the p-norm for finite-dimensional vector spaces. They are sometimes called Lebesgue spaces, named after Henri Lebesgue, although according to the Bourbaki group they were first introduced by Frigyes Riesz. L^p spaces form an important class of Banach spaces in functional analysis, and of topological vector spaces. Because of their key role in the mathematical analysis of measure and probability spaces, Lebesgue spaces are used also in the theoretical discussion of problems in physics, statistics, economics, finance, engineering, and other disciplines. Preliminaries The p-norm in finite dimensions The Euclidean length of a vector x = (x_1, x_2, \dots, x_n) in the n-dimensional real vector space \Reals^n is given by the Euclidean norm: \|x\|_2 = \left(x_1^2 + x_2^2 + \dotsb + x_n^2\right)^{1/2}. The Euclidean distance between two points x and y is the length \|x - y\|_2 of the straight line between the two points ...
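A short sketch of the finite-dimensional p-norm that these spaces generalize (NumPy assumed; the function name is illustrative):

```python
import numpy as np

def p_norm(x, p):
    """The p-norm ||x||_p = (sum |x_i|^p)^(1/p) that L^p spaces generalize."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0])
print(p_norm(x, 2))              # 5.0, the Euclidean length
print(p_norm(x, 1))              # 7.0, the taxicab norm
print(np.linalg.norm(x, ord=2))  # NumPy's built-in norm agrees
```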


Robust Principal Component Analysis
Robust Principal Component Analysis (RPCA) is a modification of the widely used statistical procedure of principal component analysis (PCA) which works well with respect to ''grossly'' corrupted observations. A number of different approaches exist for Robust PCA, including an idealized version of Robust PCA, which aims to recover a low-rank matrix L0 from highly corrupted measurements M = L0 + S0. This decomposition into low-rank and sparse matrices can be achieved by techniques such as the Principal Component Pursuit method (PCP), Stable PCP, Quantized PCP, Block based PCP, and Local PCP. Then, optimization methods are used, such as the Augmented Lagrange Multiplier Method (ALM), Alternating Direction Method (ADM), Fast Alternating Minimization (FAM), Iteratively Reweighted Least Squares (IRLS) or alternating projections (AP). Algorithms Non-convex method The 2014 guaranteed algorithm for the robust PCA problem (with the input matrix being M = L + S) is an alternating minimization type alg ...
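A heavily simplified sketch of the alternating-minimization idea (NumPy assumed; the function name, fixed threshold, and toy data are illustrative assumptions, not the published 2014 algorithm, which adapts its threshold across iterations): alternately fit a low-rank L by truncated SVD and a sparse S by hard-thresholding the residual.

```python
import numpy as np

def rpca_altmin(M, rank, thresh, iters=50):
    """Sketch of M ~ L + S: truncated SVD for L, hard-thresholding for S."""
    S = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # best rank-`rank` fit to M - S
        R = M - L
        S = np.where(np.abs(R) > thresh, R, 0.0)   # keep only gross corruptions
    return L, S

rng = np.random.default_rng(0)
L0 = rng.normal(size=(60, 5)) @ rng.normal(size=(5, 60))    # low-rank ground truth
S0 = np.zeros((60, 60))
mask = rng.random((60, 60)) < 0.05
S0[mask] = rng.normal(scale=10.0, size=mask.sum())          # sparse gross errors
L, S = rpca_altmin(L0 + S0, rank=5, thresh=3.0)
print(np.linalg.norm(L - L0) / np.linalg.norm(L0))  # relative error, small if separation succeeds
```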




Orthogonal Coordinate System
In mathematics, orthogonal coordinates are defined as a set of coordinates \mathbf q = (q^1, q^2, \dots, q^d) in which the coordinate hypersurfaces all meet at right angles (note that superscripts are indices, not exponents). A coordinate surface for a particular coordinate q^k is the curve, surface, or hypersurface on which q^k is a constant. For example, three-dimensional Cartesian coordinates (x, y, z) form an orthogonal coordinate system, since the coordinate surfaces x = constant, y = constant, and z = constant are planes that meet at right angles to one another, i.e., are perpendicular. Orthogonal coordinates are a special but extremely common case of curvilinear coordinates. Motivation While vector operations and physical laws are normally easiest to derive in Cartesian coordinates, non-Cartesian orthogonal coordinates are often used instead for the solution of various problems, especially boundary value problems, such as those arising in field theories of quantum mechanics, fluid f ...
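A quick numerical check of this orthogonality for spherical coordinates (NumPy assumed; the helper name and step size are illustrative): the tangent vectors along each coordinate direction have pairwise vanishing dot products.

```python
import numpy as np

def spherical_basis(r, theta, phi, h=1e-6):
    """Tangent basis vectors of spherical coordinates, by numerically
    differentiating the map (r, theta, phi) -> (x, y, z)."""
    def to_cartesian(r, theta, phi):
        return np.array([r * np.sin(theta) * np.cos(phi),
                         r * np.sin(theta) * np.sin(phi),
                         r * np.cos(theta)])
    p = to_cartesian(r, theta, phi)
    e_r     = (to_cartesian(r + h, theta, phi) - p) / h
    e_theta = (to_cartesian(r, theta + h, phi) - p) / h
    e_phi   = (to_cartesian(r, theta, phi + h) - p) / h
    return e_r, e_theta, e_phi

e_r, e_theta, e_phi = spherical_basis(2.0, 0.7, 1.2)
# Pairwise dot products are ~0 (up to the step h): the directions meet at right angles.
print(np.dot(e_r, e_theta), np.dot(e_r, e_phi), np.dot(e_theta, e_phi))
```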


Cross-covariance
In probability and statistics, given two stochastic processes \left\{X_t\right\} and \left\{Y_t\right\}, the cross-covariance is a function that gives the covariance of one process with the other at pairs of time points. With the usual notation \operatorname E for the expectation operator, if the processes have the mean functions \mu_X(t) = \operatorname E[X_t] and \mu_Y(t) = \operatorname E[Y_t], then the cross-covariance is given by :\operatorname K_{XY}(t_1,t_2) = \operatorname{cov}(X_{t_1}, Y_{t_2}) = \operatorname E[(X_{t_1} - \mu_X(t_1))(Y_{t_2} - \mu_Y(t_2))] = \operatorname E[X_{t_1} Y_{t_2}] - \mu_X(t_1) \mu_Y(t_2). Cross-covariance is related to the more commonly used cross-correlation of the processes in question. In the case of two random vectors \mathbf X = (X_1, X_2, \ldots, X_p)^{\mathrm T} and \mathbf Y = (Y_1, Y_2, \ldots, Y_q)^{\mathrm T}, the cross-covariance would be a p \times q matrix \operatorname K_{XY} (often denoted \operatorname{cov}(X,Y)) with entries \operatorname K_{XY}(j,k) = \operatorname{cov}(X_j, Y_k). Thus the term ''cross-covariance ...
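For the random-vector case, the p x q matrix of entries cov(X_j, Y_k) can be estimated from paired samples. A minimal sketch (NumPy assumed; the function name and toy data are illustrative):

```python
import numpy as np

def cross_covariance(X, Y):
    """Sample p x q cross-covariance matrix from paired samples X (n x p), Y (n x q)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return Xc.T @ Yc / (X.shape[0] - 1)    # unbiased sample estimate

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
Y = X[:, :2] + 0.1 * rng.normal(size=(1000, 2))  # Y tracks X's first two coordinates
print(cross_covariance(X, Y))   # approximately [[1, 0], [0, 1], [0, 0]]
```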


Canonical Correlation
In statistics, canonical-correlation analysis (CCA), also called canonical variates analysis, is a way of inferring information from cross-covariance matrices. If we have two vectors X = (X_1, \ldots, X_n) and Y = (Y_1, \ldots, Y_m) of random variables, and there are correlations among the variables, then canonical-correlation analysis will find linear combinations of X and Y that have a maximum correlation with each other. T. R. Knapp notes that "virtually all of the commonly encountered parametric tests of significance can be treated as special cases of canonical-correlation analysis, which is the general procedure for investigating the relationships between two sets of variables." The method was first introduced by Harold Hotelling in 1936, although in the context of angles between flats the mathematical concept was published by Camille Jordan in 1875. CCA is now a cornerstone of multivariate statistics ...
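A small sketch using scikit-learn's CCA estimator (the toy data, with one latent signal shared between the two vectors, is an illustrative assumption):

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))                   # signal shared by X and Y
X = np.hstack([latent, rng.normal(size=(500, 2))])
Y = np.hstack([latent + 0.2 * rng.normal(size=(500, 1)),
               rng.normal(size=(500, 1))])

cca = CCA(n_components=1)
U, V = cca.fit_transform(X, Y)   # canonical variates: maximally correlated combinations
print(np.corrcoef(U[:, 0], V[:, 0])[0, 1])  # close to 1 for the shared component
```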

Factor Analysis
Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it is possible that variations in six observed variables mainly reflect the variations in two unobserved (underlying) variables. Factor analysis searches for such joint variations in response to unobserved latent variables. The observed variables are modelled as linear combinations of the potential factors plus "error" terms, hence factor analysis can be thought of as a special case of errors-in-variables models. Simply put, the factor loading of a variable quantifies the extent to which the variable is related to a given factor. A common rationale behind factor analytic methods is that the information gained about the interdependencies between observed variables can be used later to reduce the set of variables in a dataset. Factor analysis is commonly used in psychometrics, pers ...
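A minimal sketch of fitting such a model (scikit-learn's FactorAnalysis assumed; the toy data, six observed variables generated as linear combinations of two latent factors plus noise, is illustrative):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
factors = rng.normal(size=(400, 2))                        # two latent factors
loadings = rng.normal(size=(2, 6))                         # how each variable loads on them
X = factors @ loadings + 0.3 * rng.normal(size=(400, 6))   # six observed variables

fa = FactorAnalysis(n_components=2)
scores = fa.fit_transform(X)     # estimated factor scores per observation
print(fa.components_.shape)      # (2, 6): estimated loading matrix
```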

Singular Value Decomposition
In linear algebra, the singular value decomposition (SVD) is a factorization of a real or complex matrix into a rotation, followed by a rescaling, followed by another rotation. It generalizes the eigendecomposition of a square normal matrix with an orthonormal eigenbasis to any m \times n matrix. It is related to the polar decomposition. Specifically, the singular value decomposition of an m \times n complex matrix \mathbf M is a factorization of the form \mathbf M = \mathbf{U \Sigma V}^*, where \mathbf U is an m \times m complex unitary matrix, \mathbf \Sigma is an m \times n rectangular diagonal matrix with non-negative real numbers on the diagonal, \mathbf V is an n \times n complex unitary matrix, and \mathbf V^* is the conjugate transpose of \mathbf V. Such a decomposition always exists for any complex matrix. If \mathbf M is real, then \mathbf U and \mathbf V can be guaranteed to be real orthogonal matrices; in such contexts, the SVD ...
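A short numerical illustration (NumPy assumed; the random matrix is an illustrative choice), verifying the factorization and the orthogonality of U and V in the real case:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 3))

U, s, Vt = np.linalg.svd(M, full_matrices=True)
Sigma = np.zeros((4, 3))
np.fill_diagonal(Sigma, s)                # rectangular diagonal matrix of singular values

print(np.allclose(M, U @ Sigma @ Vt))     # True: M = U Sigma V*
print(np.allclose(U.T @ U, np.eye(4)))    # U is orthogonal in the real case
print(np.allclose(Vt @ Vt.T, np.eye(3)))  # V is orthogonal in the real case
```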