There are two main uses of the term calibration in statistics that denote special types of statistical inference problems. "Calibration" can mean:
* a reverse process to regression, where instead of a future dependent variable being predicted from known explanatory variables, a known observation of the dependent variables is used to predict a corresponding explanatory variable;
* procedures in statistical classification to determine class membership probabilities, which assess the uncertainty of a given new observation belonging to each of the already established classes.

In addition, "calibration" is used in statistics with the usual general meaning of calibration. For example, model calibration can also be used to refer to Bayesian inference about the value of a model's parameters, given some data set, or more generally to any type of fitting of a statistical model. As Philip Dawid puts it, "a forecaster is ''well calibrated'' if, for example, of those events to which he assigns a probability 30 percent, the long-run proportion that actually occurs turns out to be 30 percent".


In regression

The ''calibration problem'' in regression is the use of known data on the observed relationship between a dependent variable and an independent variable to make estimates of other values of the independent variable from new observations of the dependent variable. This can be known as "inverse regression"; see also sliced inverse regression. One example is that of dating objects, using observable evidence such as tree rings for dendrochronology or carbon-14 for radiometric dating. The observation is caused by the age of the object being dated, rather than the reverse, and the aim is to use the method for estimating dates based on new observations. The problem is whether the model used for relating known ages with observations should aim to minimise the error in the observation, or minimise the error in the date. The two approaches will produce different results, and the difference will increase if the model is then used for extrapolation at some distance from the known results.
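The two approaches can be compared on a toy data set. The following is a minimal sketch, not a reference implementation: the ages and readings are made-up numbers, and the function names `classical_estimate` and `inverse_estimate` are illustrative labels for "minimise the error in the observation" (regress the observation on age, then invert the fitted line) and "minimise the error in the date" (regress age on the observation directly).

```python
from statistics import mean

def ols(u, v):
    """Least-squares fit of v = intercept + slope * u; returns (intercept, slope)."""
    u_bar, v_bar = mean(u), mean(v)
    sxx = sum((ui - u_bar) ** 2 for ui in u)
    sxy = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    slope = sxy / sxx
    return v_bar - slope * u_bar, slope

def classical_estimate(x, y, y_new):
    """Fit y on x (error minimised in the observation), then invert the line."""
    intercept, slope = ols(x, y)
    return (y_new - intercept) / slope

def inverse_estimate(x, y, y_new):
    """Fit x on y (error minimised in the date) and predict directly."""
    intercept, slope = ols(y, x)
    return intercept + slope * y_new

# Hypothetical calibration data: known ages and the readings observed for them.
ages = [10, 20, 30, 40, 50]
readings = [12.1, 19.2, 31.5, 38.4, 51.3]

for y_new in (mean(readings), 60, 100):
    c = classical_estimate(ages, readings, y_new)
    i = inverse_estimate(ages, readings, y_new)
    print(f"reading={y_new:6.1f}  classical={c:7.2f}  inverse={i:7.2f}  gap={abs(c - i):.3f}")
```

Both fitted lines pass through the point of means, so the two estimates coincide at the mean reading and drift apart in proportion to the distance from it, which is why the disagreement matters most under extrapolation.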


In classification

Calibration in classification means transforming classifier scores into class membership probabilities. An overview of calibration methods for two-class and multi-class classification tasks is given by Gebel (2009).

The following univariate calibration methods exist for transforming classifier scores into class membership probabilities in the two-class case:
* Assignment value approach, see Garczarek (2002)
* Bayes approach, see Bennett (2002)
* Isotonic regression, see Zadrozny and Elkan (2002)
* Platt scaling (a form of logistic regression), see Lewis and Gale (1994) and Platt (1999)
* Bayesian Binning into Quantiles (BBQ) calibration, see Naeini, Cooper, Hauskrecht (2015)
* Beta calibration, see Kull, Filho, Flach (2017)

The following multivariate calibration methods exist for transforming classifier scores into class membership probabilities when there are more than two classes:
* Reduction to binary tasks and subsequent pairwise coupling, see Hastie and Tibshirani (1998): T. Hastie and R. Tibshirani, "Classification by pairwise coupling", in M. I. Jordan, M. J. Kearns and S. A. Solla (eds.), Advances in Neural Information Processing Systems, volume 10, Cambridge, MIT Press, 1998.
* Dirichlet calibration, see Gebel (2009)
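
As an illustration of the two-class case, Platt scaling fits a logistic curve that maps a raw classifier score to a probability. The sketch below is a simplified version under stated assumptions: it fits p = sigmoid(a * score + b) by plain gradient descent on the log loss, it omits the smoothed targets and the optimiser used in Platt's paper, and the scores, labels, learning rate and step count are made-up values. In practice the fit is usually done on data held out from classifier training.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the mean log loss."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def calibrate(score, a, b):
    """Map a raw classifier score to an estimated class membership probability."""
    return sigmoid(a * score + b)

# Hypothetical raw scores (e.g. signed distances to a decision boundary)
# and true labels; the classes overlap, so the fit has a finite optimum.
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
for s in (-3.0, 0.0, 3.0):
    print(f"score={s:5.1f}  calibrated probability={calibrate(s, a, b):.3f}")
```

Because the fitted map is monotone, it changes the probabilities attached to the scores but not their ranking.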


In prediction and forecasting

In prediction and forecasting, a Brier score is sometimes used to assess prediction accuracy of a set of predictions, specifically that the magnitude of the assigned probabilities tracks the relative frequency of the observed outcomes. Philip E. Tetlock employs the term "calibration" in this sense in his 2015 book ''Superforecasting''. This differs from accuracy and precision. For example, as expressed by Daniel Kahneman, "if you give all events that happen a probability of .6 and all the events that don’t happen a probability of .4, your discrimination is perfect but your calibration is miserable". Aggregative Contingent Estimation was a program of the Office of Incisive Analysis (OIA) at the Intelligence Advanced Research Projects Activity (IARPA) that sponsored research and forecasting tournaments in partnership with The Good Judgment Project, co-created by Philip E. Tetlock, Barbara Mellers, and Don Moore. In meteorology, in particular as concerns weather forecasting, a related mode of assessment is known as forecast skill.
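
Kahneman's example can be checked numerically. The sketch below is illustrative only: it computes the Brier score as the mean squared difference between each forecast probability and the 0/1 outcome, and groups forecasts by stated probability to compare them with observed frequencies. The six outcomes are made up, and the helper name `observed_frequencies` is not taken from any library.

```python
from collections import defaultdict

def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

def observed_frequencies(forecasts, outcomes):
    """Map each distinct forecast probability to how often the event occurred."""
    groups = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        groups[p].append(o)
    return {p: sum(os) / len(os) for p, os in groups.items()}

# Hypothetical outcomes; every event that happens is given 0.6,
# every event that does not happen is given 0.4.
outcomes = [1, 1, 1, 0, 0, 0]
forecasts = [0.6 if o else 0.4 for o in outcomes]

print("Brier score:", brier_score(forecasts, outcomes))
for p, freq in sorted(observed_frequencies(forecasts, outcomes).items()):
    print(f"forecast {p:.1f} -> observed frequency {freq:.1f}")
```

The forecasts separate the two kinds of event perfectly, yet events given 0.6 occur every time and events given 0.4 never occur, so the stated probabilities are off by 0.4 in both groups.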

