In the statistical analysis of observational data, propensity score matching (PSM) is a statistical matching technique that attempts to estimate the effect of a treatment, policy, or other intervention by accounting for the covariates that predict receiving the treatment. PSM attempts to reduce the bias due to confounding variables that could be found in an estimate of the treatment effect obtained from simply comparing outcomes among units that received the treatment versus those that did not. Paul R. Rosenbaum and Donald Rubin introduced the technique in 1983.
The possibility of bias arises because a difference in the treatment outcome (such as the average treatment effect) between treated and untreated groups may be caused by a factor that predicts treatment rather than the treatment itself. In randomized experiments, the randomization enables unbiased estimation of treatment effects; for each covariate, randomization implies that treatment groups will be balanced on average, by the law of large numbers. Unfortunately, for observational studies, the assignment of treatments to research subjects is typically not random.
Matching attempts to reduce the treatment assignment bias, and mimic randomization, by creating a sample of units that received the treatment that is comparable on all observed covariates to a sample of units that did not receive the treatment.
The "propensity" describes how likely a unit is to have been treated, given its covariate values. The stronger the confounding of treatment and covariates, and hence the stronger the bias in the analysis of the naive treatment effect, the better the covariates predict whether a unit is treated or not. By having units with similar propensity scores in both treatment and control, such confounding is reduced.
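The contrast between randomized and non-random assignment can be illustrated with a small simulation. This is a minimal sketch using only the Python standard library; the single covariate, the logistic assignment rule, and the function name `covariate_gap` are assumptions made for illustration, not part of the method itself.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def covariate_gap(assign, n=20000, seed=0):
    """Difference in the mean covariate between treated and untreated units.

    `assign` maps a covariate value x to the probability of being treated.
    """
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)        # one observed covariate
        z = rng.random() < assign(x)   # treatment indicator
        (treated if z else untreated).append(x)
    return mean(treated) - mean(untreated)


# Randomized experiment: the treatment probability ignores the covariate.
gap_randomized = covariate_gap(lambda x: 0.5)

# Observational setting (hypothetical mechanism): larger x makes treatment more likely.
gap_confounded = covariate_gap(lambda x: 1.0 / (1.0 + math.exp(-1.5 * x)))

print(f"covariate gap, randomized assignment: {gap_randomized:+.3f}")
print(f"covariate gap, covariate-driven assignment: {gap_confounded:+.3f}")
```

Under randomization the gap should be close to zero, while under covariate-driven assignment the two groups differ systematically on the covariate, which is the imbalance that matching tries to remove.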
For example, one may be interested to know the consequences of smoking. An observational study is required since it is unethical to randomly assign people to the treatment 'smoking'. The treatment effect estimated by simply comparing those who smoked to those who did not smoke would be biased by any factors that predict smoking (e.g., gender and age). PSM attempts to control for these biases by making the groups receiving treatment and no treatment comparable with respect to the control variables.
Overview
PSM is for cases of causal inference and simple selection bias in non-experimental settings in which: (i) few units in the non-treatment comparison group are comparable to the treatment units; and (ii) selecting a subset of comparison units similar to the treatment unit is difficult because units must be compared across a high-dimensional set of pretreatment characteristics.
In normal matching, single characteristics that distinguish treatment and control groups are matched in an attempt to make the groups more alike. But if the two groups do not have substantial overlap, then substantial error may be introduced. For example, if only the worst cases from the untreated "comparison" group are compared to only the best cases from the treatment group, the result may be regression toward the mean, which may make the comparison group look better or worse than reality.
PSM employs a predicted probability of group membership (e.g., treatment versus control group) based on observed predictors, usually obtained from logistic regression, to create a counterfactual group. Propensity scores may be used for matching or as covariates, alone or with other matching variables or covariates.
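As a rough illustration of how a predicted probability of group membership can be obtained, the sketch below fits a logistic regression by plain gradient ascent on simulated data. Everything here is an assumption for illustration: the two covariates, the coefficients in `TRUE_BETA`, and the helper names `fit_logistic` and `propensity`. In practice a statistics package would normally do the fitting.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def fit_logistic(X, z, steps=200, lr=1.0):
    """Fit P(z = 1 | x) = sigmoid(b0 + b . x) by batch gradient ascent
    on the mean log-likelihood. Returns [b0, b1, ..., bk]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)  # beta[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for row, zi in zip(X, z):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            err = zi - p
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity(beta, row):
    """Predicted probability of treatment for one unit's covariates."""
    return sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))


# Simulated observational data with a hypothetical assignment mechanism.
rng = random.Random(7)
TRUE_BETA = [-0.5, 1.0, -0.8]
X, z = [], []
for _ in range(1500):
    row = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    p_true = sigmoid(TRUE_BETA[0] + TRUE_BETA[1] * row[0] + TRUE_BETA[2] * row[1])
    X.append(row)
    z.append(1 if rng.random() < p_true else 0)

beta = fit_logistic(X, z)
scores = [propensity(beta, row) for row in X]
print("estimated coefficients:", [round(b, 2) for b in beta])
```

Each entry of `scores` is an estimated propensity score; units from the two groups with similar scores are the candidates for matching.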
General procedure
1. Estimate propensity scores, e.g. with logistic regression:
*Dependent variable: ''Z'' = 1, if unit participated (i.e. is member of the treatment group); ''Z'' = 0, if unit did not participate (i.e. is member of the control group).
*Choose appropriate confounders (variables hypothesized to be associated with both treatment and outcome)
*Obtain an estimate of the propensity score: predicted probability ''p'' or the log odds, log[''p''/(1 − ''p'')].
2. Match each participant to one or more nonparticipants on propensity score, using one of these methods:
*Nearest neighbor matching
*Optimal full matching: match each participant to unique non-participant(s) so as to minimize the total distance in propensity scores between participants and their matched non-participants. This method can be combined with other matching techniques.
*Caliper matching: comparison units within a certain width of the propensity score of the treated units get matched, where the width is generally a fraction of the standard deviation of the propensity score
*Mahalanobis metric matching in conjunction with PSM
*Stratification matching
*Difference-in-differences matching (kernel and local linear weights)
*Exact matching
3. Check that covariates are balanced across treatment and comparison groups within strata of the propensity score.
* Use standardized differences or graphs to examine distributions
* If covariates are not balanced, return to step 1 or 2 and modify the procedure
4. Estimate effects based on new sample
*Typically: a weighted mean of within-match average differences in outcomes between participants and non-participants.
*Use analyses appropriate for non-independent matched samples if more than one nonparticipant is matched to each participant
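The four steps above can be sketched end to end on simulated data. This is a minimal illustration under stated assumptions, not a reference implementation: the data-generating process, the constant `TRUE_EFFECT`, the single covariate `x`, the greedy 1:1 nearest-neighbor rule without replacement, and the caliper of 0.2 standard deviations of the log odds are all choices made here for the example.

```python
import math
import random
import statistics

rng = random.Random(42)


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


# Hypothetical data: x raises both the chance of treatment and the outcome.
TRUE_EFFECT = 2.0
units = []
for _ in range(3000):
    x = rng.gauss(0.0, 1.0)
    z = 1 if rng.random() < sigmoid(-1.0 + x) else 0
    y = TRUE_EFFECT * z + 3.0 * x + rng.gauss(0.0, 1.0)
    units.append({"x": x, "z": z, "y": y})

# Step 1: one-covariate logistic regression fitted by gradient ascent.
b0 = b1 = 0.0
for _ in range(300):
    g0 = g1 = 0.0
    for u in units:
        err = u["z"] - sigmoid(b0 + b1 * u["x"])
        g0 += err
        g1 += err * u["x"]
    b0 += g0 / len(units)
    b1 += g1 / len(units)
for u in units:
    u["logit"] = b0 + b1 * u["x"]  # log odds of the estimated propensity score

# Step 2: greedy 1:1 nearest-neighbor matching within a caliper.
treated = [u for u in units if u["z"] == 1]
controls = [u for u in units if u["z"] == 0]
caliper = 0.2 * statistics.pstdev([u["logit"] for u in units])
used = set()  # indices of controls that are already matched
pairs = []
for t in treated:
    free = [(abs(c["logit"] - t["logit"]), j)
            for j, c in enumerate(controls) if j not in used]
    if not free:
        break
    dist, j = min(free)
    if dist <= caliper:  # treated units with no close control stay unmatched
        used.add(j)
        pairs.append((t, controls[j]))


# Step 3: balance check with the standardized difference in means.
def std_diff(a, b):
    pooled = math.sqrt((statistics.pvariance(a) + statistics.pvariance(b)) / 2.0)
    return (statistics.mean(a) - statistics.mean(b)) / pooled


smd_before = std_diff([u["x"] for u in treated], [u["x"] for u in controls])
smd_after = std_diff([t["x"] for t, _ in pairs], [c["x"] for _, c in pairs])

# Step 4: mean within-pair outcome difference versus the naive comparison.
naive = (statistics.mean([u["y"] for u in treated])
         - statistics.mean([u["y"] for u in controls]))
matched_effect = statistics.mean([t["y"] - c["y"] for t, c in pairs])

print(f"pairs: {len(pairs)}, SMD before: {smd_before:.2f}, after: {smd_after:.2f}")
print(f"naive difference: {naive:.2f}, matched estimate: {matched_effect:.2f}")
```

Because unmatched treated units are dropped, the matched estimate describes the effect among the treated units that found a match; in this simulation the effect is constant, so that distinction does not change the target value.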
Formal definitions
Basic settings
The basic case is of two treatments (numbered 1 and 0), with ''N'' independent and identically distributed
subjects. Each subject ''i'' would respond to the treatment with $r_{1i}$ and to the control with $r_{0i}$. The quantity to be estimated is the average treatment effect: $E[r_1] - E[r_0]$.
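The difference between the average treatment effect and a naive comparison of observed groups can be made concrete with a simulation in which both potential responses of every subject are known. This is a sketch under assumptions: the list names `r1`, `r0`, and `z`, the outcome model, and the assignment probabilities are illustrative choices, and in real data only one of the two responses is ever observed for each subject.

```python
import random

rng = random.Random(1)
N = 20000

r1, r0, z = [], [], []
for _ in range(N):
    x = rng.gauss(0.0, 1.0)                # covariate that drives treatment
    y0 = x + rng.gauss(0.0, 1.0)           # response under control
    y1 = y0 + 1.0 + 0.5 * x                # response under treatment
    r0.append(y0)
    r1.append(y1)
    p_treat = 0.8 if x > 0 else 0.2        # non-random assignment (hypothetical)
    z.append(1 if rng.random() < p_treat else 0)

# Average treatment effect: mean difference of the two potential responses.
ate = sum(a - b for a, b in zip(r1, r0)) / N

# Naive estimate: only one response per subject is observed.
obs_treated = [a for a, zi in zip(r1, z) if zi == 1]
obs_control = [b for b, zi in zip(r0, z) if zi == 0]
naive = sum(obs_treated) / len(obs_treated) - sum(obs_control) / len(obs_control)

print(f"average treatment effect: {ate:.2f}")
print(f"naive observed difference: {naive:.2f}")
```

The naive difference overstates the effect here because subjects with larger covariate values are both more likely to be treated and more likely to have high responses.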