'cat', but instead a set like {'cat', 'dog'}. Depending on how good the underlying model is (how well it can discern between cats, dogs and other animals) and the specified significance level, these sets can be smaller or larger. For regression tasks, the output is a prediction interval: a smaller significance level (fewer allowed errors) produces wider, less specific intervals, while a larger significance level (more allowed errors) produces tighter prediction intervals.
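As a minimal, purely illustrative sketch of this trade-off (the class labels and p-values below are invented, not taken from any real model), thresholding the same per-class p-values at two different significance levels yields nested prediction sets:

<syntaxhighlight lang="python">
# Minimal sketch: how the significance level controls prediction-set size.
# The p-values are invented for illustration; in practice they come from a
# conformal predictor (see the classification algorithms below).
p_values = {"cat": 0.52, "dog": 0.11, "bird": 0.03}

def prediction_set(p_values, significance):
    """Include every class whose p-value exceeds the significance level."""
    return sorted(label for label, p in p_values.items() if p > significance)

print(prediction_set(p_values, significance=0.20))  # ['cat']        - more errors allowed, smaller set
print(prediction_set(p_values, significance=0.05))  # ['cat', 'dog'] - fewer errors allowed, larger set
</syntaxhighlight>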
History
Conformal prediction first arose in a collaboration between Gammerman, Vovk, and Vapnik in 1998; this initial version used e-values, though the version best known today uses p-values and was proposed a year later by Saunders et al. Vovk, Gammerman, and their students and collaborators, particularly Craig Saunders, Harris Papadopoulos, and Kostas Proedrou, continued to develop the ideas of conformal prediction; major developments include the proposal of inductive conformal prediction (a.k.a. split conformal prediction) in 2002. A book on the topic was written by Vovk and Shafer in 2005, and a tutorial was published in 2008.

Theory
The data has to conform to some standards, such as data being exchangeable (a slightly weaker assumption than the i.i.d. assumption made in standard machine learning).

Classification algorithms
The goal of standard classification algorithms is to predict a single class for each test object. A conformal classifier instead outputs a prediction set: every class whose p-value exceeds the chosen significance level is included in the set.

Inductive conformal prediction (ICP)
Inductive conformal prediction was first known as inductive confidence machines, but was later re-introduced as ICP. It has gained popularity in practical settings because the underlying model does not need to be retrained for every new test example. This makes it attractive for any model that is computationally expensive to train, such as a neural network.

Mondrian inductive conformal prediction (MICP)
In MICP, the alpha values are class-dependent (Mondrian) and the underlying model does not follow the original online setting introduced in 2005.

Training algorithm:
# Train a machine learning model (MLM)
# Run a calibration set through the MLM and save the output from the chosen stage
#* In deep learning, the softmax values are often used
# Use a nonconformity function to compute ''α''-values
#* A data point in the calibration set will result in an ''α''-value for its true class

Prediction algorithm (a minimal sketch follows this list):
# For a test data point, generate a new ''α''-value for each possible class
# Find a p-value for each class of the data point
# If the p-value is greater than the significance level, include the class in the output
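The following Python sketch follows the calibration and prediction steps listed above. The nonconformity function (one minus the predicted probability of a class), the use of scikit-learn's RandomForestClassifier, the synthetic data, and the function names micp_calibrate / micp_predict are illustrative assumptions, not part of the method's definition.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def micp_calibrate(model, X_cal, y_cal):
    """Class-conditional (Mondrian) alpha-values on the calibration set.
    Nonconformity: 1 - predicted probability of the true class (illustrative choice)."""
    proba = model.predict_proba(X_cal)
    alphas = {}
    for c in model.classes_:
        idx = np.where(y_cal == c)[0]                  # calibration examples of class c
        col = np.where(model.classes_ == c)[0][0]
        alphas[c] = 1.0 - proba[idx, col]              # one alpha per example of class c
    return alphas

def micp_predict(model, alphas, x, significance):
    """Prediction set for a single test object x at the given significance level."""
    proba = model.predict_proba(x.reshape(1, -1))[0]
    prediction_set = []
    for col, c in enumerate(model.classes_):
        alpha_test = 1.0 - proba[col]                  # alpha-value for the postulated class c
        cal = alphas[c]
        # p-value: fraction of calibration alphas of class c at least as large as the test alpha
        p = (np.sum(cal >= alpha_test) + 1) / (len(cal) + 1)
        if p > significance:
            prediction_set.append(c)
    return prediction_set

# Illustrative usage with synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=600) > 0).astype(int)
X_train, y_train = X[:400], y[:400]          # proper training set
X_cal, y_cal = X[400:], y[400:]              # calibration set

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
alphas = micp_calibrate(model, X_cal, y_cal)
print(micp_predict(model, alphas, rng.normal(size=4), significance=0.1))
</syntaxhighlight>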
Regression algorithms

Conformal prediction was initially formulated for the task of classification but was later extended to regression. Unlike classification, which outputs ''p''-values without requiring a significance level, regression requires a fixed significance level at prediction time in order to produce prediction intervals for a new test object. For classic conformal regression, there is no transductive algorithm, because it is impossible to postulate all possible labels for a new test object: the label space is continuous. The available algorithms are all formulated in the inductive setting, which computes a prediction rule once and applies it to all future predictions.

Inductive conformal prediction (ICP)
All inductive algorithms require splitting the available training examples into two disjoint sets: one set used for training the underlying model (the ''proper training set'') and one set for calibrating the prediction (the ''calibration set''). In ICP, this split is done once, thus training a single ML model. If the split is performed randomly and the data is exchangeable, the ICP model is proven to be automatically valid (i.e. the error rate corresponds to the required significance level).

Training algorithm:
# Split the training data into a ''proper training set'' and a ''calibration set''
# Train the underlying ML model using the ''proper training'' set
# Predict the examples from the ''calibration'' set using the derived ML model → ''ŷ''-values
# Optional: if using a ''normalized'' nonconformity function
## Train the normalization ML model
## Predict normalization scores → ''𝜺''-values
# Compute the nonconformity measures (''α''-values) for all calibration examples, using the ''ŷ''- and ''𝜺''-values
# Sort the nonconformity measures to generate the nonconformity scores
# Save the underlying ML model, the normalization ML model (if any) and the nonconformity scores

Prediction algorithm (a minimal sketch follows this list), with required input ''significance level'' (''s''):
# Predict the test object using the ML model → ''ŷ''<sub>''t''</sub>
# Optional: if using a normalized nonconformity function
## Predict the test object using the normalization model → ''𝜺''<sub>''t''</sub>
# Pick the nonconformity score from the list of scores produced by the calibration set in training, corresponding to the significance level ''s'' → ''α''<sub>''s''</sub>
# Compute the prediction interval half-width (''d'') by rearranging the nonconformity function with ''α''<sub>''s''</sub> (and optionally ''𝜺''<sub>''t''</sub>) as input → ''d''
# Output the prediction interval (''ŷ'' − ''d'', ''ŷ'' + ''d'') for the given significance level ''s''
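A minimal sketch of the non-normalized ICP regression procedure above, under the following illustrative assumptions: the absolute residual |''y'' − ''ŷ''| as the nonconformity measure, scikit-learn's RandomForestRegressor as the underlying model, synthetic data, and hypothetical helper names icp_fit / icp_predict.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def icp_fit(X, y, cal_fraction=0.25, random_state=0):
    """Split into proper training / calibration sets, train the model and
    compute sorted nonconformity scores (absolute residuals)."""
    rng = np.random.default_rng(random_state)
    idx = rng.permutation(len(X))
    n_cal = int(cal_fraction * len(X))
    cal_idx, train_idx = idx[:n_cal], idx[n_cal:]

    model = RandomForestRegressor(random_state=random_state)
    model.fit(X[train_idx], y[train_idx])

    y_hat_cal = model.predict(X[cal_idx])
    alphas = np.sort(np.abs(y[cal_idx] - y_hat_cal))    # sorted nonconformity scores
    return model, alphas

def icp_predict(model, alphas, X_test, significance):
    """Return (lower, upper) prediction intervals at the given significance level."""
    # Pick the calibration score corresponding to the significance level
    k = int(np.ceil((1 - significance) * (len(alphas) + 1))) - 1
    d = alphas[min(k, len(alphas) - 1)]                  # interval half-width
    y_hat = model.predict(X_test)
    return y_hat - d, y_hat + d

# Illustrative usage with synthetic data
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

model, alphas = icp_fit(X, y)
lower, upper = icp_predict(model, alphas, np.array([[0.5]]), significance=0.1)
print(lower, upper)
</syntaxhighlight>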
Split conformal prediction (SCP)

The SCP, often called the aggregated conformal predictor (ACP), can be considered an ensemble of ICPs. SCP usually improves the efficiency of predictions (that is, it creates smaller prediction intervals) compared to a single ICP, but loses the automatic validity of the generated predictions. A common type of SCP is the cross-conformal predictor (CCP), which splits the training data into ''proper training'' and ''calibration'' sets multiple times, in a strategy similar to ''k''-fold cross-validation. Regardless of the splitting technique, the algorithm performs ''n'' splits and trains an ICP for each split. When predicting a new test object, it uses the median ''ŷ'' and ''d'' from the ''n'' ICPs to create the final prediction interval as (''ŷ''<sub>median</sub> − ''d''<sub>median</sub>, ''ŷ''<sub>median</sub> + ''d''<sub>median</sub>). A sketch of this aggregation follows.
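The sketch below illustrates the cross-conformal variant described above, with one ICP per ''k''-fold split and median aggregation of the point prediction and half-width. The model choice, nonconformity measure, synthetic data, and function names ccp_fit / ccp_predict are illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def ccp_fit(X, y, n_splits=5, random_state=0):
    """Cross-conformal predictor: one ICP per k-fold split.
    Each fold's held-out part acts as that ICP's calibration set."""
    icps = []
    for train_idx, cal_idx in KFold(n_splits, shuffle=True, random_state=random_state).split(X):
        model = RandomForestRegressor(random_state=random_state).fit(X[train_idx], y[train_idx])
        alphas = np.sort(np.abs(y[cal_idx] - model.predict(X[cal_idx])))  # nonconformity scores
        icps.append((model, alphas))
    return icps

def ccp_predict(icps, X_test, significance):
    """Aggregate the n ICPs with the median point prediction and median half-width."""
    y_hats, ds = [], []
    for model, alphas in icps:
        k = int(np.ceil((1 - significance) * (len(alphas) + 1))) - 1
        ds.append(alphas[min(k, len(alphas) - 1)])
        y_hats.append(model.predict(X_test))
    y_med = np.median(np.vstack(y_hats), axis=0)
    d_med = np.median(ds)
    return y_med - d_med, y_med + d_med

# Illustrative usage with synthetic data
rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
lower, upper = ccp_predict(ccp_fit(X, y), np.array([[0.5]]), significance=0.1)
print(lower, upper)
</syntaxhighlight>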
Applications

Types of learning models
Several types of machine learning models can be used in conjunction with conformal prediction.

Data used
Conformal prediction is used in a variety of fields and is an active area of research.

Conferences
Conformal prediction is one of the main subjects discussed during the annual COPA conference, where both the theory and applications of conformal prediction are presented by leaders of the field. The conference has been held since 2012{{Cite web|title=10th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2021)|url=https://cml.rhul.ac.uk/copa2021/#nav-past|access-date=2021-09-15|website=cml.rhul.ac.uk}} and has been hosted in several European countries, including Greece, Great Britain, Italy and Sweden.

See also
References