In statistics, the multivariate ''t''-distribution (or multivariate Student distribution) is a multivariate probability distribution. It is a generalization to random vectors of the Student's ''t''-distribution, which is a distribution applicable to univariate random variables. While the case of a random matrix could be treated within this structure, the matrix ''t''-distribution is distinct and makes particular use of the matrix structure.


Definition

One common method of construction of a multivariate ''t''-distribution, for the case of p dimensions, is based on the observation that if \mathbf y and u are independent and distributed as N(\mathbf{0},\boldsymbol\Sigma) and \chi^2_\nu (i.e. multivariate normal and chi-squared distributions) respectively, where \boldsymbol\Sigma is a ''p'' × ''p'' matrix and \boldsymbol\mu is a constant vector, then the random variable \mathbf{x} = \mathbf{y}/\sqrt{u/\nu} + \boldsymbol\mu has the density

: f(\mathbf{x}) = \frac{\Gamma[(\nu+p)/2]}{\Gamma(\nu/2)\,\nu^{p/2}\,\pi^{p/2}\,|\boldsymbol\Sigma|^{1/2}} \left[1+\frac{1}{\nu}(\mathbf{x}-\boldsymbol\mu)^T \boldsymbol\Sigma^{-1} (\mathbf{x}-\boldsymbol\mu)\right]^{-(\nu+p)/2}

and is said to be distributed as a multivariate ''t''-distribution with parameters \boldsymbol\Sigma, \boldsymbol\mu, \nu. Note that \boldsymbol\Sigma is not the covariance matrix, since the covariance is given by \nu/(\nu-2)\,\boldsymbol\Sigma (for \nu>2).

The constructive definition of a multivariate ''t''-distribution simultaneously serves as a sampling algorithm:
# Generate u \sim \chi^2_\nu and \mathbf{y} \sim N(\mathbf{0}, \boldsymbol\Sigma), independently.
# Compute \mathbf{x} \gets \mathbf{y}\sqrt{\nu/u} + \boldsymbol\mu.

This formulation gives rise to the hierarchical representation of a multivariate ''t''-distribution as a scale-mixture of normals: u \sim \mathrm{Ga}(\nu/2,\nu/2), where \mathrm{Ga}(a,b) indicates a gamma distribution with density proportional to x^{a-1}e^{-bx}, and \mathbf{x}\mid u conditionally follows N(\boldsymbol\mu, u^{-1}\boldsymbol\Sigma). In the special case \nu=1, the distribution is a multivariate Cauchy distribution.
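As a concrete illustration of the two-step construction above, here is a minimal Python sketch (the function name sample_multivariate_t is hypothetical; assumes NumPy):

    import numpy as np

    def sample_multivariate_t(mu, Sigma, nu, n, seed=None):
        """Draw n samples of x = y*sqrt(nu/u) + mu with y ~ N(0, Sigma)
        and u ~ chi^2_nu independent (the construction above)."""
        rng = np.random.default_rng(seed)
        p = len(mu)
        y = rng.multivariate_normal(np.zeros(p), Sigma, size=n)  # y ~ N(0, Sigma)
        u = rng.chisquare(nu, size=n)                            # u ~ chi^2_nu
        return mu + y * np.sqrt(nu / u)[:, None]

Drawing u once per sample and rescaling the whole normal vector by the same \sqrt{\nu/u} is what makes the components dependent even when \boldsymbol\Sigma is diagonal, a point taken up in the Derivation section below.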


Derivation

There are in fact many candidates for the multivariate generalization of Student's ''t''-distribution. An extensive survey of the field has been given by Kotz and Nadarajah (2004). The essential issue is to define a probability density function of several variables that is the appropriate generalization of the formula for the univariate case. In one dimension (p=1), with t=x-\mu and \Sigma=1, we have the probability density function

: f(t) = \frac{\Gamma[(\nu+1)/2]}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} (1+t^2/\nu)^{-(\nu+1)/2}

and one approach is to use a corresponding function of several variables. This is the basic idea of elliptical distribution theory, where one writes down a corresponding function of p variables t_i that replaces t^2 by a quadratic function of all the t_i. It is clear that this only makes sense when all the marginal distributions have the same degrees of freedom \nu. With \mathbf{A} = \boldsymbol\Sigma^{-1}, one has a simple choice of multivariate density function

: f(\mathbf t) = \frac{\Gamma[(\nu+p)/2]\,|\mathbf{A}|^{1/2}}{(\nu\pi)^{p/2}\,\Gamma(\nu/2)} \left(1+\sum_{i,j=1}^{p} A_{ij} t_i t_j/\nu\right)^{-(\nu+p)/2}

which is the standard but not the only choice. An important special case is the standard bivariate ''t''-distribution, ''p'' = 2:

: f(t_1,t_2) = \frac{\Gamma[(\nu+2)/2]\,|\mathbf{A}|^{1/2}}{\nu\pi\,\Gamma(\nu/2)} \left(1+\sum_{i,j=1}^{2} A_{ij} t_i t_j/\nu\right)^{-(\nu+2)/2}

Note that \frac{\Gamma[(\nu+2)/2]}{\Gamma(\nu/2)} = \frac{\nu}{2}, since \Gamma(z+1)=z\,\Gamma(z). Now, if \mathbf{A} is the identity matrix, the density is

: f(t_1,t_2) = \frac{1}{2\pi} \left(1+(t_1^2 + t_2^2)/\nu\right)^{-(\nu+2)/2}.

The difficulty with the standard representation is revealed by this formula, which does not factorize into the product of the marginal one-dimensional distributions. When \Sigma is diagonal the standard representation can be shown to have zero correlation, but the marginal distributions are not statistically independent. A notable spontaneous occurrence of the elliptical multivariate distribution is its formal mathematical appearance when least squares methods are applied to multivariate normal data, as in the classical Markowitz minimum-variance econometric solution for asset portfolios.
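A short sketch of this density in Python, cross-checked against SciPy's implementation (scipy.stats.multivariate_t, available since SciPy 1.6; the helper name mvt_logpdf is hypothetical):

    import numpy as np
    from scipy.special import gammaln
    from scipy.stats import multivariate_t

    def mvt_logpdf(x, mu, Sigma, nu):
        """Log of the standard multivariate t density given above."""
        p = len(mu)
        dev = x - mu
        maha = dev @ np.linalg.solve(Sigma, dev)   # (x-mu)^T Sigma^{-1} (x-mu)
        sign, logdet = np.linalg.slogdet(Sigma)
        return (gammaln((nu + p) / 2) - gammaln(nu / 2)
                - 0.5 * (p * np.log(nu * np.pi) + logdet)
                - 0.5 * (nu + p) * np.log1p(maha / nu))

    mu, Sigma, nu = np.zeros(2), np.array([[2.0, 0.5], [0.5, 1.0]]), 5.0
    x = np.array([0.3, -1.2])
    print(mvt_logpdf(x, mu, Sigma, nu))
    print(multivariate_t(mu, Sigma, df=nu).logpdf(x))  # should agree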


Cumulative distribution function

The definition of the cumulative distribution function (cdf) in one dimension can be extended to multiple dimensions by defining the following probability (here \mathbf{x} is a real vector):

: F(\mathbf{x}) = \mathbb{P}(\mathbf{X}\leq \mathbf{x}), \quad \textrm{where}\;\; \mathbf{X}\sim t_\nu(\boldsymbol\mu,\boldsymbol\Sigma).

There is no simple formula for F(\mathbf{x}), but it can be approximated numerically via Monte Carlo integration.
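A minimal Monte Carlo sketch (the helper name mvt_cdf_mc is hypothetical; it reuses the chi-squared scale-mixture sampler from the Definition section):

    import numpy as np

    def mvt_cdf_mc(x, mu, Sigma, nu, n=200_000, seed=None):
        """Estimate F(x) = P(X <= x componentwise) for X ~ t_nu(mu, Sigma)."""
        rng = np.random.default_rng(seed)
        p = len(mu)
        y = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
        u = rng.chisquare(nu, size=n)
        samples = mu + y * np.sqrt(nu / u)[:, None]
        return np.mean(np.all(samples <= x, axis=1))  # fraction of draws below x

The standard error of such an estimate shrinks at the usual Monte Carlo rate, O(n^{-1/2}).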


Conditional Distribution

This was developed by Muirhead and Cornish, and later derived using the simpler chi-squared ratio representation above by Roth and Ding. Let vector X follow a multivariate ''t''-distribution and partition it into two subvectors of p_1, p_2 elements:

: X_p = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim t_p \left( \mu_p, \Sigma_{pp}, \nu \right)

where p_1 + p_2 = p, the known mean vectors are \mu_p = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} and the scale matrix is \Sigma_{pp} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}.

Roth and Ding find the conditional distribution p(X_1 \mid X_2) to be a new ''t''-distribution with modified parameters:

: X_1 \mid X_2 \sim t_{p_1}\left( \mu_{1|2}, \frac{\nu + d_2}{\nu + p_2} \Sigma_{11|2}, \nu + p_2 \right)

An equivalent expression in Kotz et al. is somewhat less concise. Thus the conditional distribution is most easily represented as a two-step procedure. Form first the intermediate distribution X_1 \mid X_2 \sim t_{p_1}\left( \mu_{1|2}, \Psi, \tilde\nu \right) above; then, using the parameters below, the explicit conditional density becomes

: f(X_1 \mid X_2) = \frac{\Gamma[(\tilde\nu + p_1)/2]}{\Gamma(\tilde\nu/2)\,(\pi\tilde\nu)^{p_1/2}\,|\Psi|^{1/2}} \left[1+\frac{1}{\tilde\nu}(X_1 - \mu_{1|2})^T \Psi^{-1} (X_1 - \mu_{1|2})\right]^{-(\tilde\nu + p_1)/2}

where

: \tilde\nu = \nu + p_2 is the effective degrees of freedom: \nu is augmented by the number of disused variables p_2.
: \mu_{1|2} = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} \left(X_2 - \mu_2 \right) is the conditional mean of X_1.
: \Sigma_{11|2} = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} is the Schur complement of \Sigma_{22} \text{ in } \Sigma.
: d_2 = (X_2 - \mu_2)^T \Sigma_{22}^{-1} (X_2 - \mu_2) is the squared Mahalanobis distance of X_2 from \mu_2 with scale matrix \Sigma_{22}.
: \Psi = \frac{\nu + d_2}{\nu + p_2} \Sigma_{11|2} is the conditional scale matrix, for \tilde\nu > 2.
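Sketched in Python (the helper name mvt_conditional is hypothetical; NumPy only), the parameter updates above are a few lines:

    import numpy as np

    def mvt_conditional(mu, Sigma, nu, x2, p1):
        """Parameters (mean, scale, dof) of X1 | X2 = x2 for
        X = (X1, X2) ~ t_p(mu, Sigma, nu), per the formulas above."""
        mu1, mu2 = mu[:p1], mu[p1:]
        S11, S12 = Sigma[:p1, :p1], Sigma[:p1, p1:]
        S21, S22 = Sigma[p1:, :p1], Sigma[p1:, p1:]
        p2 = len(mu2)
        dev2 = x2 - mu2
        d2 = dev2 @ np.linalg.solve(S22, dev2)              # Mahalanobis^2
        mu_cond = mu1 + S12 @ np.linalg.solve(S22, dev2)    # conditional mean
        schur = S11 - S12 @ np.linalg.solve(S22, S21)       # Schur complement
        psi = (nu + d2) / (nu + p2) * schur                 # conditional scale
        return mu_cond, psi, nu + p2                        # dof = nu + p2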


Copulas based on the multivariate ''t''

The use of such distributions is enjoying renewed interest due to applications in mathematical finance, especially through the use of the Student's ''t'' copula.


Elliptical representation

Constructed as an elliptical distribution, take the simplest centralised case with spherical symmetry and no scaling, \Sigma = \operatorname{I}; then the multivariate ''t''-PDF takes the form

: f_X(X) = g(X^T X) = \frac{\Gamma[(\nu+p)/2]}{(\nu\pi)^{p/2}\,\Gamma(\nu/2)} \left( 1 + \nu^{-1} X^T X \right)^{-(\nu+p)/2}

where X = (x_1, \cdots, x_p)^T is a p-vector and \nu = degrees of freedom as defined in Muirhead section 1.5. The covariance of X is

: \operatorname{E}\left( XX^T \right) = \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f_X(x_1,\dots, x_p)\, XX^T \, dx_1 \dots dx_p = \frac{\nu}{\nu-2} \operatorname{I}

The aim is to convert the Cartesian PDF to a radial one. Kibria and Joarder define the radial measure r_2 = R^2 = \frac{X^T X}{p} and, noting that the density is dependent only on r_2, we get

: \operatorname{E}(r_2) = \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f_X(x_1,\dots, x_p) \,\frac{X^T X}{p} \, dx_1 \dots dx_p = \frac{\nu}{\nu-2}

which is equivalent to the variance of the p-element vector X treated as a univariate heavy-tail zero-mean random sequence with uncorrelated, yet statistically dependent, elements.
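A quick empirical check of \operatorname{E}(r_2) = \nu/(\nu-2) under the spherical assumption (a throwaway sketch; all parameter values are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    p, nu, n = 4, 7.0, 500_000
    y = rng.standard_normal((n, p))                 # N(0, I) draws
    u = rng.chisquare(nu, size=n)
    X = y * np.sqrt(nu / u)[:, None]                # spherical t_nu samples
    print(np.mean(np.sum(X**2, axis=1) / p))        # ~ nu/(nu-2) = 1.4
    print(nu / (nu - 2))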


Radial Distribution

r_2 = \frac{X^T X}{p} follows the Fisher-Snedecor or F distribution:

: r_2 \sim f_{F}(p,\nu) = B\left( \frac{p}{2}, \frac{\nu}{2} \right)^{-1} \left(\frac{p}{\nu}\right)^{p/2} r_2^{p/2-1} \left( 1 + \frac{p}{\nu} r_2 \right)^{-(p+\nu)/2}

having mean value \operatorname{E}(r_2) = \frac{\nu}{\nu-2}. F-distributions arise naturally in tests of sums of squares of sampled data after normalization by the sample standard deviation.

By a change of random variable to y = \frac{p}{\nu} r_2 = \frac{X^T X}{\nu} in the equation above, retaining the p-vector X, we have

: \operatorname{E}(y) = \int_{-\infty}^\infty \cdots \int_{-\infty}^\infty f_X(X) \frac{X^T X}{\nu} \, dx_1 \dots dx_p = \frac{p}{\nu-2}

and probability distribution

: \begin{aligned} f_Y(y \mid p,\nu) & = \left|\frac{dr_2}{dy}\right| f_F(r_2) = \frac{\nu}{p}\, B\left( \frac{p}{2}, \frac{\nu}{2} \right)^{-1} \left(\frac{p}{\nu}\right)^{p/2} \left(\frac{\nu}{p} y\right)^{p/2-1} \left( 1 + y \right)^{-(p+\nu)/2} \\ & = B\left( \frac{p}{2}, \frac{\nu}{2} \right)^{-1} y^{p/2-1}\,(1+y)^{-(\nu+p)/2} \end{aligned}

which is a regular Beta-prime distribution y \sim \beta'\left(y; \frac{p}{2}, \frac{\nu}{2}\right) having mean value \frac{p/2}{\nu/2-1} = \frac{p}{\nu-2}.
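This radial law is easy to verify by simulation (a sketch assuming SciPy; a large Kolmogorov-Smirnov p-value indicates agreement with F(p, \nu)):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    p, nu, n = 3, 6.0, 200_000
    y = rng.standard_normal((n, p))
    u = rng.chisquare(nu, size=n)
    r2 = np.sum((y * np.sqrt(nu / u)[:, None])**2, axis=1) / p  # X^T X / p
    print(stats.kstest(r2, stats.f(p, nu).cdf))  # r_2 ~ F(p, nu)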


Cumulative Radial Distribution

Given the Beta-prime distribution, the radial cumulative distribution function of y is known:

: F_Y(y) = I\left(\frac{y}{1+y}; \, \frac{p}{2}, \frac{\nu}{2} \right) B\left( \frac{p}{2}, \frac{\nu}{2} \right)^{-1}

where I is the incomplete Beta function; this applies with a spherical \Sigma assumption. In the scalar case, p = 1, the distribution is equivalent to Student-''t'' with the equivalence t^2 = y^2\sigma^{-1}, the variable ''t'' having double-sided tails for CDF purposes, i.e. the "two-tail ''t''-test".

The radial distribution can also be derived via a straightforward coordinate transformation from Cartesian to spherical. A constant-radius surface at R = (X^TX)^{1/2} with PDF

: p_X(X) \propto \left( 1 + \nu^{-1} R^2 \right)^{-(\nu+p)/2}

is an iso-density surface. Given this density value, the quantum of probability on a shell of surface area A_R and thickness \delta R at R is \delta P = p_X(R) \, A_R \,\delta R. The enclosed p-sphere of radius R has surface area A_R = \frac{2\pi^{p/2}R^{p-1}}{\Gamma(p/2)}. Substitution into \delta P shows that the shell has element of probability \delta P = p_X(R) \frac{2\pi^{p/2}R^{p-1}}{\Gamma(p/2)} \,\delta R, which is equivalent to the radial density function

: f_R(R) = \frac{\Gamma[(\nu+p)/2]}{\nu^{p/2}\pi^{p/2}\,\Gamma(\nu/2)} \frac{2\pi^{p/2}R^{p-1}}{\Gamma(p/2)} \left( 1 + \frac{R^2}{\nu} \right)^{-(\nu+p)/2}

which further simplifies to

: f_R(R) = \frac{2}{\nu^{1/2}\,B\left(\frac{p}{2}, \frac{\nu}{2}\right)} \left( \frac{R^2}{\nu} \right)^{(p-1)/2} \left( 1 + \frac{R^2}{\nu} \right)^{-(\nu+p)/2}

where B(\cdot,\cdot) is the Beta function. Changing the radial variable to y = R^2/\nu returns the previous Beta-prime distribution

: f_Y(y) = \frac{1}{B\left(\frac{p}{2}, \frac{\nu}{2}\right)} \, y^{p/2-1} \left( 1 + y \right)^{-(\nu+p)/2}

To scale the radial variables without changing the radial shape function, define the scale matrix \Sigma = \alpha \operatorname{I}, yielding a 3-parameter Cartesian density function, i.e. the probability \Delta_P in the volume element dx_1 \dots dx_p is

: \Delta_P \big(f_X(X \mid \alpha, p, \nu) \big) = \frac{\Gamma[(\nu+p)/2]}{(\nu\alpha)^{p/2}\pi^{p/2}\,\Gamma(\nu/2)} \left( 1 + \frac{X^T X}{\alpha\nu} \right)^{-(\nu+p)/2} \, dx_1 \dots dx_p

or, in terms of the scalar radial variable R,

: f_R(R \mid \alpha, p, \nu) = \frac{2}{(\alpha\nu)^{1/2}\,B\left(\frac{p}{2}, \frac{\nu}{2}\right)} \left( \frac{R^2}{\alpha\nu} \right)^{(p-1)/2} \left( 1 + \frac{R^2}{\alpha\nu} \right)^{-(\nu+p)/2}
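Numerically, the radial CDF is one call to the regularized incomplete beta function (scipy.special.betainc computes the regularized form, so the division by B(p/2, \nu/2) is already built in); scipy.stats.betaprime gives an independent cross-check:

    import numpy as np
    from scipy.special import betainc
    from scipy.stats import betaprime

    p, nu, y = 3, 6.0, 1.5
    print(betainc(p / 2, nu / 2, y / (1 + y)))  # F_Y(y) via the formula above
    print(betaprime(p / 2, nu / 2).cdf(y))      # SciPy cross-check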


Radial Moments

The moments of all the radial variables, with the spherical distribution assumption, can be derived from the Beta-prime distribution. If Z \sim \beta'(a,b) then \operatorname{E}(Z^m) = \frac{\Gamma(a+m)\,\Gamma(b-m)}{\Gamma(a)\,\Gamma(b)}, a known result. Thus, for variable y = \frac{R^2}{\nu} we have

: \operatorname{E}(y^m) = \frac{\Gamma\left(\frac{p}{2}+m\right)\Gamma\left(\frac{\nu}{2}-m\right)}{\Gamma\left(\frac{p}{2}\right)\Gamma\left(\frac{\nu}{2}\right)}, \; \nu/2 > m

The moments of r_2 = \nu \, y are

: \operatorname{E}(r_2^m) = \nu^m \operatorname{E}(y^m)

while introducing the scale matrix \alpha \operatorname{I} yields

: \operatorname{E}(r_2^m \mid \alpha) = \alpha^m \nu^m \operatorname{E}(y^m)

Moments relating to the radial variable R are found by setting R = (\alpha\nu y)^{1/2} and M = 2m, whereupon

: \operatorname{E}(R^M) = \operatorname{E}\big((\alpha \nu y)^{M/2}\big) = (\alpha \nu)^{M/2} \operatorname{E}(y^{M/2}) = (\alpha \nu)^{M/2} \frac{\Gamma\left(\frac{p}{2}+\frac{M}{2}\right)\Gamma\left(\frac{\nu}{2}-\frac{M}{2}\right)}{\Gamma\left(\frac{p}{2}\right)\Gamma\left(\frac{\nu}{2}\right)}
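The gamma-ratio moment formula is convenient to evaluate in log-space (a sketch; betaprime_moment is a hypothetical helper):

    import numpy as np
    from scipy.special import gammaln

    def betaprime_moment(a, b, m):
        """E[Z^m] for Z ~ beta'(a, b), valid for b > m."""
        return np.exp(gammaln(a + m) + gammaln(b - m) - gammaln(a) - gammaln(b))

    p, nu, m = 3, 9.0, 2
    print(betaprime_moment(p / 2, nu / 2, m))            # E[y^m]
    print(nu**m * betaprime_moment(p / 2, nu / 2, m))    # E[r_2^m] = nu^m E[y^m]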


Linear Combinations and Affine Transformation


Full Rank Transform

This closely relates to the multivariate normal method and is described in Kotz and Nadarajah, Kibria and Joarder, Roth, and Cornish. Starting from a somewhat simplified version of the central MV-''t'' pdf

: f_X(X) = \frac{\Kappa}{\left|\Sigma\right|^{1/2}} \left( 1+ \nu^{-1} X^T \Sigma^{-1} X \right)^{-(\nu+p)/2},

where \Kappa is a constant and \nu is arbitrary but fixed, let \Theta \in \mathbb{R}^{p \times p} be a full-rank matrix and form the vector Y = \Theta X. Then, by straightforward change of variables

: f_Y(Y) = \frac{\Kappa}{\left|\Sigma\right|^{1/2}} \left( 1+ \nu^{-1} Y^T \Theta^{-T} \Sigma^{-1} \Theta^{-1} Y \right)^{-(\nu+p)/2} \left|\frac{\partial Y}{\partial X}\right|^{-1}

The matrix of partial derivatives is \frac{\partial Y_i}{\partial X_j} = \Theta_{ij} and the Jacobian becomes \left|\frac{\partial Y}{\partial X}\right| = \left|\Theta\right|. Thus

: f_Y(Y) = \frac{\Kappa}{\left|\Sigma\right|^{1/2} \left|\Theta\right|} \left( 1 + \nu^{-1} Y^T \Theta^{-T} \Sigma^{-1} \Theta^{-1} Y \right)^{-(\nu+p)/2}

The denominator reduces to

: \left|\Sigma\right|^{1/2} \left|\Theta\right| = \left|\Sigma\right|^{1/2} \left|\Theta\right|^{1/2} \left|\Theta^T\right|^{1/2} = \left|\Theta \Sigma \Theta^T\right|^{1/2}

In full:

: f_Y(Y) = \frac{\Kappa}{\left|\Theta \Sigma \Theta^T\right|^{1/2}} \left( 1 + \nu^{-1} Y^T \left( \Theta \Sigma \Theta^T \right)^{-1} Y \right)^{-(\nu+p)/2}

which is a regular MV-''t'' distribution. In general, if X \sim t_p(\mu, \Sigma, \nu) and \Theta_{p \times p} has full rank p, then

: \Theta X + c \sim t_p( \Theta \mu + c, \Theta \Sigma \Theta^T, \nu )
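An empirical sanity check of this closure property, using the covariance identity \operatorname{Cov} = \frac{\nu}{\nu-2}\Sigma (a sketch; all parameter values are arbitrary):

    import numpy as np

    rng = np.random.default_rng(2)
    p, nu, n = 3, 8.0, 400_000
    mu = np.array([1.0, -1.0, 0.5])
    Sigma = np.array([[2.0, 0.3, 0.0], [0.3, 1.0, 0.2], [0.0, 0.2, 1.5]])
    Theta = np.array([[2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.5, 1.0]])
    c = np.ones(p)
    y = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    u = rng.chisquare(nu, size=n)
    Y = (mu + y * np.sqrt(nu / u)[:, None]) @ Theta.T + c   # Y = Theta X + c
    print(np.cov(Y.T) * (nu - 2) / nu)   # ~ Theta Sigma Theta^T
    print(Theta @ Sigma @ Theta.T)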


Marginal Distributions

This is a special case of the rank-reducing linear transform below. Kotz defines marginal distributions as follows. Partition X \sim t(p, \mu, \Sigma, \nu) into two subvectors of p_1, p_2 elements:

: X_p = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim t \left( p_1 + p_2, \mu_p, \Sigma_{pp}, \nu \right)

with p_1 + p_2 = p, means \mu_p = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, and scale matrix \Sigma_{pp} = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}; then X_1 \sim t \left( p_1, \mu_1, \Sigma_{11}, \nu \right), X_2 \sim t \left( p_2, \mu_2, \Sigma_{22}, \nu \right), such that

: f(X_1) = \frac{\Gamma[(\nu+p_1)/2]}{\Gamma(\nu/2)\,(\nu\pi)^{p_1/2}\,\left|\Sigma_{11}\right|^{1/2}} \left[1+\frac{1}{\nu}(X_1-\mu_1)^T \Sigma_{11}^{-1}(X_1-\mu_1)\right]^{-(\nu+p_1)/2}

: f(X_2) = \frac{\Gamma[(\nu+p_2)/2]}{\Gamma(\nu/2)\,(\nu\pi)^{p_2/2}\,\left|\Sigma_{22}\right|^{1/2}} \left[1+\frac{1}{\nu}(X_2-\mu_2)^T \Sigma_{22}^{-1}(X_2-\mu_2)\right]^{-(\nu+p_2)/2}

If a transformation is constructed in the form

: \Theta_{p_1 \times p} = \begin{bmatrix} 1 & \cdots & 0 & \cdots & 0 \\ 0 & \ddots & 0 & \cdots & 0 \\ 0 & \cdots & 1 & \cdots & 0 \end{bmatrix}

then the vector Y = \Theta X, as discussed below, has the same distribution as the marginal distribution of X_1.
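In code, marginalization is just sub-block selection with \nu carried over unchanged (a sketch; mvt_marginal is a hypothetical helper):

    import numpy as np

    def mvt_marginal(mu, Sigma, nu, idx):
        """Marginal of the components in idx: t(len(idx), mu[idx],
        Sigma[idx, idx], nu); the degrees of freedom do not change."""
        idx = np.asarray(idx)
        return mu[idx], Sigma[np.ix_(idx, idx)], nu

    mu, Sigma, nu = np.zeros(3), np.eye(3), 5.0
    print(mvt_marginal(mu, Sigma, nu, [0, 2]))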


Rank-Reducing Linear Transform

In the linear transform case, if \Theta is a rectangular matrix \Theta \in \mathbb{R}^{m \times p}, m < p, of rank m, the result is dimensionality reduction. Here, the Jacobian \left|\Theta\right| is seemingly rectangular but the value \left|\Theta \Sigma \Theta^T\right|^{1/2} in the denominator pdf is nevertheless correct. There is a discussion of rectangular matrix product determinants in Aitken. In general, if X \sim t(p, \mu, \Sigma, \nu) and \Theta_{m \times p} has full rank m, then

: Y = \Theta X + c \sim t( m, \Theta \mu + c, \Theta \Sigma \Theta^T, \nu )

: f_Y(Y) = \frac{\Gamma[(\nu+m)/2]}{\Gamma(\nu/2)\,(\nu\pi)^{m/2}\,\left|\Theta \Sigma \Theta^T\right|^{1/2}} \left[1+\frac{1}{\nu}( Y - c_1 )^T ( \Theta \Sigma \Theta^T )^{-1} (Y-c_1) \right]^{-(\nu+m)/2}, \; c_1 = \Theta \mu + c

''In extremis'', if m = 1 and \Theta becomes a row vector, then scalar Y follows a univariate double-sided Student-''t'' distribution defined by t^2 = Y^2 / \sigma^2 with the same \nu degrees of freedom (see the sketch after this list). Kibria et al. use the affine transformation to find the marginal distributions, which are also MV-''t''.

* During affine transformations of variables with elliptical distributions all vectors must ultimately derive from one initial isotropic spherical vector Z whose elements remain 'entangled' and are not statistically independent.
* A vector of independent Student-''t'' samples is not consistent with the multivariate ''t''-distribution.
* Adding two sample multivariate ''t'' vectors generated with independent chi-squared samples and different \nu values, \mathbf{y}_1/\sqrt{u_1/\nu_1} and \mathbf{y}_2/\sqrt{u_2/\nu_2}, will not produce internally consistent distributions, though they will yield a Behrens-Fisher problem.
* Taleb compares many examples of fat-tail elliptical ''vs'' non-elliptical multivariate distributions.
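For the m = 1 case flagged above, the projected scalar is an exact scaled Student-''t'', which a simulation confirms (a sketch assuming SciPy; parameter values are arbitrary):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    p, nu, n = 3, 5.0, 300_000
    Sigma = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]])
    theta = np.array([0.4, -1.0, 2.0])              # 1 x p row vector
    y = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    u = rng.chisquare(nu, size=n)
    s = (y @ theta) * np.sqrt(nu / u)               # scalar Y = theta X
    scale = np.sqrt(theta @ Sigma @ theta)          # sigma^2 = theta Sigma theta^T
    print(stats.kstest(s / scale, stats.t(nu).cdf)) # large p-value expected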


Related concepts

* In univariate statistics, the Student's ''t''-test makes use of Student's ''t''-distribution.
* The elliptical multivariate ''t''-distribution arises spontaneously in linearly constrained least squares solutions involving multivariate normal source data, for example the Markowitz global minimum variance solution in financial portfolio analysis, which addresses an ensemble of normal random vectors or a random matrix. It does not arise in ordinary least squares (OLS) or multiple regression with fixed dependent and independent variables, problems which tend to produce well-behaved normal error probabilities.
* Hotelling's ''T''-squared distribution is a distribution that arises in multivariate statistics.
* The matrix ''t''-distribution is a distribution for random variables arranged in a matrix structure.


See also

* Multivariate normal distribution, which is the limiting case of the multivariate Student's ''t''-distribution as \nu\uparrow\infty.
* Chi distribution, the pdf of the scaling factor in the construction of the Student's ''t''-distribution and also the 2-norm (or Euclidean norm) of a multivariate normally distributed vector (centered at zero).
** Rayleigh distribution#Student's t, random vector length of multivariate ''t''-distribution
* Mahalanobis distance


References




External links


Copula Methods vs Canonical Multivariate Distributions: the multivariate Student T distribution with general degrees of freedom