HOME

TheInfoList



OR:

In statistics, Mallows's ''Cp'', named for Colin Lingwood Mallows, is used to assess the fit of a regression model that has been estimated using
ordinary least squares In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model (with fixed level-one effects of a linear function of a set of explanatory variables) by the ...
. It is applied in the context of model selection, where a number of predictor variables are available for predicting some outcome, and the goal is to find the best model involving a subset of these predictors. A small value of Cp means that the model is relatively precise. Mallows's ''Cp'' has been shown to be equivalent to
Akaike information criterion The Akaike information criterion (AIC) is an estimator of prediction error and thereby relative quality of statistical models for a given set of data. Given a collection of models for the data, AIC estimates the quality of each model, relative to ...
in the special case of Gaussian
linear regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is ...
.


Definition and properties

Mallows's ''Cp'' addresses the issue of
overfitting mathematical modeling, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit to additional data or predict future observations reliably". An overfitt ...
, in which model selection statistics such as the residual sum of squares always get smaller as more variables are added to a model. Thus, if we aim to select the model giving the smallest residual sum of squares, the model including all variables would always be selected. Instead, the ''Cp'' statistic calculated on a sample of data estimates the sum squared prediction error (SSPE) as its
population Population typically refers to the number of people in a single area, whether it be a city or to