In
statistics, polynomial regression is a form of
regression analysis
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one ...
in which the relationship between the
independent variable
Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or deman ...
''x'' and the
dependent variable
Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or dema ...
''y'' is modelled as an ''n''th degree
polynomial
In mathematics, a polynomial is an expression consisting of indeterminates (also called variables) and coefficients, that involves only the operations of addition, subtraction, multiplication, and positive-integer powers of variables. An ex ...
in ''x''. Polynomial regression fits a nonlinear relationship between the value of ''x'' and the corresponding
conditional mean
In probability theory, the conditional expectation, conditional expected value, or conditional mean of a random variable is its expected value – the value it would take “on average” over an arbitrarily large number of occurrences – given ...
of ''y'', denoted E(''y'' , ''x''). Although ''polynomial regression'' fits a nonlinear model to the data, as a
statistical estimation
Estimation theory is a branch of statistics that deals with estimating the values of parameters based on measured empirical data that has a random component. The parameters describe an underlying physical setting in such a way that their valu ...
problem it is linear, in the sense that the regression function E(''y'' , ''x'') is linear in the unknown
parameter
A parameter (), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when ...
s that are estimated from the
data
In the pursuit of knowledge, data (; ) is a collection of discrete values that convey information, describing quantity, quality, fact, statistics, other basic units of meaning, or simply sequences of symbols that may be further interpret ...
. For this reason, polynomial regression is considered to be a special case of
multiple linear regression
In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is c ...
.
The explanatory (independent) variables resulting from the polynomial expansion of the "baseline" variables are known as higher-degree terms. Such variables are also used in
classification Classification is a process related to categorization, the process in which ideas and objects are recognized, differentiated and understood.
Classification is the grouping of related facts into classes.
It may also refer to:
Business, organizat ...
settings.
History
Polynomial regression models are usually fit using the method of
least squares
The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems (sets of equations in which there are more equations than unknowns) by minimizing the sum of the squares of the r ...
. The least-squares method minimizes the
variance
In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its population mean or sample mean. Variance is a measure of dispersion, meaning it is a measure of how far a set of number ...
of the
unbiased estimators
In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule (the estimator), the quantity of interest (the estimand) and its result (the estimate) are distinguished. For example, the ...
of the coefficients, under the conditions of the
Gauss–Markov theorem. The least-squares method was published in 1805 by
Legendre and in 1809 by
Gauss
Johann Carl Friedrich Gauss (; german: Gauß ; la, Carolus Fridericus Gauss; 30 April 177723 February 1855) was a German mathematician and physicist who made significant contributions to many fields in mathematics and science. Sometimes refer ...
. The first
design
A design is a plan or specification for the construction of an object or system or for the implementation of an activity or process or the result of that plan or specification in the form of a prototype, product, or process. The verb ''to design'' ...
of an
experiment
An experiment is a procedure carried out to support or refute a hypothesis, or determine the efficacy or likelihood of something previously untried. Experiments provide insight into cause-and-effect by demonstrating what outcome occurs wh ...
for polynomial regression appeared in an 1815 paper of
Gergonne
Joseph Diez Gergonne (19 June 1771 at Nancy, France – 4 May 1859 at Montpellier, France) was a French mathematician and logician.
Life
In 1791, Gergonne enlisted in the French army as a captain. That army was undergoing rapid expansion becau ...
. In the twentieth century, polynomial regression played an important role in the development of
regression analysis
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one ...
, with a greater emphasis on issues of
design
A design is a plan or specification for the construction of an object or system or for the implementation of an activity or process or the result of that plan or specification in the form of a prototype, product, or process. The verb ''to design'' ...
and
inference
Inferences are steps in reasoning, moving from premises to logical consequences; etymologically, the word '' infer'' means to "carry forward". Inference is theoretically traditionally divided into deduction and induction, a distinction that ...
. More recently, the use of polynomial models has been complemented by other methods, with non-polynomial models having advantages for some classes of problems.
Definition and example
The goal of regression analysis is to model the expected value of a dependent variable ''y'' in terms of the value of an independent variable (or vector of independent variables) ''x''. In simple linear regression, the model
:
is used, where ε is an unobserved random error with mean zero conditioned on a
scalar
Scalar may refer to:
*Scalar (mathematics), an element of a field, which is used to define a vector space, usually the field of real numbers
*Scalar (physics), a physical quantity that can be described by a single element of a number field such a ...
variable ''x''. In this model, for each unit increase in the value of ''x'', the conditional expectation of ''y'' increases by ''β''
1 units.
In many settings, such a linear relationship may not hold. For example, if we are modeling the yield of a chemical synthesis in terms of the temperature at which the synthesis takes place, we may find that the yield improves by increasing amounts for each unit increase in temperature. In this case, we might propose a quadratic model of the form
:
In this model, when the temperature is increased from ''x'' to ''x'' + 1 units, the expected yield changes by
(This can be seen by replacing ''x'' in this equation with ''x''+1 and subtracting the equation in ''x'' from the equation in ''x''+1.) For
infinitesimal changes in ''x'', the effect on ''y'' is given by the
total derivative
In mathematics, the total derivative of a function at a point is the best linear approximation near this point of the function with respect to its arguments. Unlike partial derivatives, the total derivative approximates the function with r ...
with respect to ''x'':
The fact that the change in yield depends on ''x'' is what makes the relationship between ''x'' and ''y'' nonlinear even though the model is linear in the parameters to be estimated.
In general, we can model the expected value of ''y'' as an ''n''th degree polynomial, yielding the general polynomial regression model
:
Conveniently, these models are all linear from the point of view of
estimation
Estimation (or estimating) is the process of finding an estimate or approximation, which is a value that is usable for some purpose even if input data may be incomplete, uncertain, or unstable. The value is nonetheless usable because it is de ...
, since the regression function is linear in terms of the unknown parameters ''β''
0, ''β''
1, .... Therefore, for
least squares
The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems (sets of equations in which there are more equations than unknowns) by minimizing the sum of the squares of the r ...
analysis, the computational and inferential problems of polynomial regression can be completely addressed using the techniques of
multiple regression
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one o ...
. This is done by treating ''x'', ''x''
2, ... as being distinct independent variables in a multiple regression model.
Matrix form and calculation of estimates
The polynomial regression model
:
can be expressed in matrix form in terms of a design matrix
, a response vector
, a parameter vector
, and a vector
of random errors. The ''i''-th row of
and
will contain the ''x'' and ''y'' value for the ''i''-th data sample. Then the model can be written as a system of linear equations:
:
which when using pure matrix notation is written as
:
The vector of estimated polynomial regression coefficients (using
ordinary least squares
In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model (with fixed level-one effects of a linear function of a set of explanatory variables) by the ...
estimation
Estimation (or estimating) is the process of finding an estimate or approximation, which is a value that is usable for some purpose even if input data may be incomplete, uncertain, or unstable. The value is nonetheless usable because it is de ...
) is
:
assuming ''m'' < ''n'' which is required for the matrix to be invertible; then since
is a
Vandermonde matrix
In linear algebra, a Vandermonde matrix, named after Alexandre-Théophile Vandermonde, is a matrix with the terms of a geometric progression in each row: an matrix
:V=\begin
1 & x_1 & x_1^2 & \dots & x_1^\\
1 & x_2 & x_2^2 & \dots & x_2^\\
1 & x ...
, the invertibility condition is guaranteed to hold if all the
values are distinct. This is the unique least-squares solution.
Interpretation
Although polynomial regression is technically a special case of multiple linear regression, the interpretation of a fitted polynomial regression model requires a somewhat different perspective. It is often difficult to interpret the individual coefficients in a polynomial regression fit, since the underlying monomials can be highly correlated. For example, ''x'' and ''x''
2 have correlation around 0.97 when x is
uniformly distributed on the interval (0, 1). Although the correlation can be reduced by using
orthogonal polynomials
In mathematics, an orthogonal polynomial sequence is a family of polynomials such that any two different polynomials in the sequence are orthogonal to each other under some inner product.
The most widely used orthogonal polynomials are the cl ...
, it is generally more informative to consider the fitted regression function as a whole. Point-wise or simultaneous
confidence band
A confidence band is used in statistical analysis to represent the uncertainty in an estimate of a curve or function based on limited or noisy data. Similarly, a prediction band is used to represent the uncertainty about the value of a new data-po ...
s can then be used to provide a sense of the uncertainty in the estimate of the regression function.
Alternative approaches
Polynomial regression is one example of regression analysis using
basis functions
In mathematics, a basis function is an element of a particular basis for a function space. Every function in the function space can be represented as a linear combination of basis functions, just as every vector in a vector space can be repre ...
to model a functional relationship between two quantities. More specifically, it replaces
in linear regression with polynomial basis
, e.g.