
In statistics, canonical analysis (from grc. κανών 'bar, measuring rod, ruler') belongs to the family of regression methods for data analysis. Regression analysis quantifies a relationship between a predictor variable and a criterion variable by the coefficient of correlation ''r'', the coefficient of determination ''r''², and the standard regression coefficient ''β''. Multiple regression analysis expresses a relationship between a set of predictor variables and a single criterion variable by the multiple correlation ''R'', the multiple coefficient of determination ''R''², and a set of standard partial regression weights ''β''1, ''β''2, etc. Canonical variate analysis captures a relationship between a set of predictor variables and a set of criterion variables by the canonical correlations ''ρ''1, ''ρ''2, ..., and by the sets of canonical weights ''C'' and ''D''.


Canonical analysis

Canonical analysis belongs to a group of methods which involve solving the characteristic equation for its latent roots and vectors. It describes formal structures in hyperspace that are invariant with respect to the rotation of their coordinates. In this type of solution, rotation leaves many optimizing properties preserved, provided it takes place in certain ways and in a subspace of its corresponding hyperspace. This rotation from the maximum intervariate correlation structure into a different, simpler and more meaningful structure increases the interpretability of the canonical weights ''C'' and ''D''. In this, canonical analysis differs from Harold Hotelling's (1936) canonical variate analysis (also called canonical correlation analysis), which is designed to obtain maximum (canonical) correlations between the predictor and criterion canonical variates. The difference between canonical variate analysis and canonical analysis is analogous to the difference between principal components analysis and factor analysis, each with its characteristic set of commonalities, eigenvalues and eigenvectors.
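The rotation invariance of the latent roots can be checked numerically. The following sketch (an illustration, not from the source) extracts the latent roots of the CCA characteristic equation with NumPy and verifies that rotating the coordinates of one variable set leaves them unchanged:

```python
import numpy as np

# Synthetic data: Y depends linearly on X plus noise (hypothetical example)
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
Y = X @ rng.standard_normal((3, 2)) + rng.standard_normal((200, 2))

def cca_roots(X, Y):
    """Latent roots (squared canonical correlations) of the
    characteristic equation |Sxy Syy^-1 Syx - lambda Sxx| = 0,
    returned in descending order."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx, Syy, Sxy = Xc.T @ Xc, Yc.T @ Yc, Xc.T @ Yc
    M = np.linalg.solve(Sxx, Sxy @ np.linalg.solve(Syy, Sxy.T))
    return np.sort(np.linalg.eigvals(M).real)[::-1]

# Rotating the X coordinates by an orthogonal matrix Q turns M into a
# similar matrix Q^-1 M Q, so the latent roots do not change.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
print(np.allclose(cca_roots(X, Y), cca_roots(X @ Q, Y)))  # True
```

The roots themselves are what rotation preserves; the weight vectors do rotate, which is exactly the freedom canonical analysis exploits to reach a simpler, more interpretable structure.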


Canonical analysis (simple)

Canonical analysis is a multivariate technique concerned with determining the relationships between groups of variables in a data set. The data set is split into two groups, ''X'' and ''Y'', based on some common characteristics. The purpose of canonical analysis is then to find the relationship between ''X'' and ''Y'', i.e. whether some linear combination of the ''X'' variables can represent the ''Y'' variables. It works by finding the linear combination of the ''X'' variables (''X''1, ''X''2, etc.) and the linear combination of the ''Y'' variables (''Y''1, ''Y''2, etc.) that are most highly correlated. This pair of linear combinations, usually denoted ''U''1 and ''V''1, is known as the first pair of canonical variates, and the pair (''U''1, ''V''1) is called a canonical function. The next pair of canonical variates, ''U''2 and ''V''2, is restricted to be uncorrelated with ''U''1 and ''V''1. Each variate is scaled so that its variance equals 1. One can also construct relationships which are made to agree with constraint restrictions arising from theory or with common sense/intuition; these are called maximum correlation models (Tofallis, 1999). Mathematically, canonical analysis maximizes ''U''′''X''′''YV'' subject to ''U''′''X''′''XU'' = ''I'' and ''V''′''Y''′''YV'' = ''I'', where ''X'' and ''Y'' are the data matrices (one row per observation, one column per variable) and ''U'' and ''V'' are the corresponding matrices of canonical weights.
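The maximization above can be sketched numerically. In the QR + SVD formulation (a standard computational route, shown here as an illustrative example rather than a prescription from the source), the QR factorizations enforce the identity-variance constraints, and the singular values of ''Q''x′''Q''y are the canonical correlations:

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations rho_1 >= rho_2 >= ... between the
    column sets of X and Y.

    Writing Xc = Qx Rx and Yc = Qy Ry, the constraints U'X'XU = I and
    V'Y'YV = I reduce the problem to maximizing u'(Qx'Qy)v over unit
    vectors, whose optima are the singular values of Qx'Qy."""
    Xc = X - X.mean(axis=0)   # center each variable
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

# Hypothetical data: Y is a noisy linear function of X, so the leading
# canonical correlation should be close to 1.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
Y = X @ rng.standard_normal((3, 2)) + 0.5 * rng.standard_normal((100, 2))
print(canonical_correlations(X, Y))
```

Singular values come out in descending order and lie in [0, 1], matching the interpretation of ''ρ''1, ''ρ''2, ... as correlations between successive, mutually uncorrelated variate pairs.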


See also

* RV coefficient


References

*{{cite journal |last=Tofallis |first=C. |year=1999 |ssrn=1353202 |title=Model Building with Multiple Dependent Variables and Constraints |journal=J. R. Stat. Soc. D |volume=48 |issue=3 |pages=1–8 |doi=10.1111/1467-9884.00195 |arxiv=1109.0725}}