Meta-regression is a

meta-analysis Meta-analysis is a method of synthesis of quantitative data from multiple independent studies addressing a common research question. An important part of this method involves computing a combined effect size across all of the studies. As such, th ...

that uses regression analysis to combine, compare, and synthesize research findings from multiple studies while adjusting for the effects of available covariates on a response variable. A meta-regression analysis aims to reconcile conflicting studies or corroborate consistent ones; a meta-regression analysis is therefore characterized by the collated studies and their corresponding data sets—whether the response variable is study-level (or equivalently ''aggregate'') data or individual participant data (or individual patient data in medicine). A data set is ''aggregate'' when it consists of summary statistics such as the

sample mean The sample mean (sample average) or empirical mean (empirical average), and the sample covariance or empirical covariance are statistics computed from a sample of data on one or more random variables. The sample mean is the average value (or me ...

effect size In statistics, an effect size is a value measuring the strength of the relationship between two variables in a population, or a sample-based estimate of that quantity. It can refer to the value of a statistic calculated from a sample of data, the ...

, or

odds ratio An odds ratio (OR) is a statistic that quantifies the strength of the association between two events, A and B. The odds ratio is defined as the ratio of the odds of event A taking place in the presence of B, and the odds of A in the absence of B ...

. On the other hand, individual participant data are in a sense ''raw'' in that all observations are reported with no abridgment and therefore no information loss. Aggregate data are easily compiled through internet search engines and therefore not expensive. However, individual participant data are usually confidential and are only accessible within the group or organization that performed the studies. Although

for observational data is also under extensive research, the literature largely centers around combining randomized controlled trials (RCTs). In RCTs, a study typically includes a trial that consists of ''arms''. An ''arm'' refers to a group of participants who received the same therapy, intervention, or treatment. A meta-analysis with some or all studies having more than two arms is called ''network meta-analysis'', ''indirect meta-analysis'', or a ''multiple treatment comparison''. Despite also being an umbrella term, ''meta-analysis'' sometimes implies that all included studies have strictly two arms each—same two treatments in all trials—to distinguish itself from network meta-analysis. A meta-regression can be classified in the same way—meta-regression and network meta-regression—depending on the number of distinct treatments in the regression analysis. Meta-analysis (and meta-regression) is often placed at the top of the evidence hierarchy provided that the analysis consists of individual participant data of randomized controlled clinical trials. Meta-regression plays a critical role in accounting for covariate effects, especially in the presence of categorical variables that can be used for subgroup analysis.

Meta-regression models

Meta-regression covers a large class of models which can differ depending on the characterization of the data at one's disposal. There is generally no one-size-fits-all description for meta-regression models. Individual participant data, in particular, allow flexible modeling that reflects different types of response variable(s): continuous, count, proportion, and correlation. However, aggregate data are generally modeled as a normal linear regression ''y''_''tk'' = x_''tk''''β'' + ''ε''_''tk'' using the central limit theorem and variable transformation, where the subscript ''k'' indicates the ''k''th study or trial, ''t'' denotes the ''t''th treatment, ''y''_''tk'' indicates the response endpoint for the ''k''th study's ''t''th arm, x_''tk'' is the arm-level covariate vector, ''ε''_''tk'' is the error term that is independently and identically distributed as a normal distribution. For example, a sample proportion ''p̂''_''tk'' can be logit-transformed or arcsine-transformed prior to meta-regression modeling, i.e., ''y''_''tk'' logit(''p̂''_''tk'') or ''y''_''tk'' arcsin(''p̂''_''tk''). Likewise, Fisher's ''z''-transformation can be used for sample correlations, i.e., ''y''_''tk'' arctanh(''r''_''tk''). The most common summary statistic reported in a study is the sample mean and the sample standard deviation, in which case no transformation is needed. It is also possible to derive an aggregate-data model from an underlying individual-participant-data model. For example, if ''y''_''itk'' is the binary response either zero or one where the additional subscript ''i'' indicates the ''i''th participant, the sample proportion ''p̂''_''tk'' as the sample average of ''y''_''itk'' for ''i'' = 1, 2, ..., ''n''_''tk'' may not require any transformation if

de Moivre–Laplace theorem In probability theory, the de Moivre–Laplace theorem, which is a special case of the central limit theorem, states that the normal distribution may be used as an approximation to the binomial distribution under certain conditions. In particul ...

is assumed to be at play. Note that if a meta-regression is study-level, as opposed to arm-level, there is no subscript ''t'' indicating the treatment assigned for the corresponding arm. One of the most important distinctions in meta-analysis models is whether to assume ''heterogeneity'' between studies. If a researcher assumes that studies are not ''heterogeneous'', it implies that the studies are only different due to sampling error with no material difference between studies, in which case no other source of variation would enter the model. On the other hand, if studies are heterogeneous, the additional source(s) of variation—aside from the sampling error represented by ''ε''_''tk''—must be addressed. This ultimately translates to a choice between fixed-effect meta-regression and random-effect (rigorously speaking, mixed-effect) meta-regression.

Fixed-effect meta-regression

Fixed-effect meta-regression reflects the belief that the studies involved lack substantial difference. An arm-level fixed-effect meta-regression is written as ''y''_''tk'' x_''tk''''β'' + ''ɛ''_''tk''. If only study-level summary statistics are available, the subscript ''t'' for treatment assignment can be dropped, yielding ''y''_''k'' x_''k''''β'' + ''ɛ''_''k''. The error term involves a variance term ''σ''_''tk''² (or ''σ''_''k''²) which is not estimable unless the sample variance ''s''_''tk''² (or ''s''_''k''²) is reported as well as ''y''_''tk'' (or ''y''_''k''). Most commonly, the model variance is assumed to be equal across arms and studies, in which case all subscripts are dropped, i.e., ''σ''². If the between-study variation is nonnegligible, the parameter estimates will be biased, and the corresponding statistical inference cannot be generalized.

Mixed-effect meta-regression

The terms ''random-effect meta-regression'' and ''mixed-effect meta-regression'' are equivalent. Although calling one a ''random-effect model'' signals the absence of fixed effects, which would technically disqualify it from being a regression model, one could argue that the modifier ''random-effect'' only adds to, not takes away from, what any regression model should include: fixed effects. Google Trends indicates that both terms enjoy similar levels of acceptance in publications as of July 24, 2021. Mixed-effect meta-regression includes a random-effect term in addition to the fixed effects, suggesting that the studies are heterogeneous. The random effects, denoted by w_''tk''''γ''_''k'', capture between-trial variability. The full model then becomes ''y''_''tk'' x_''tk''''β'' + w_''tk''''γ''_''k'' + ''ε''_''tk''. Random effects in meta-regression are intended to reflect the noisy treatment effects—unless assumed and modeled otherwise—which implies that the length of the corresponding coefficient vector ''γ''_''k'' should be the same as the number of treatments included in the study. This implies that treatments themselves are assumed to be a source of variation in the outcome variable—e.g., the group receiving a placebo will not have the same level of variability in cholesterol level as another that receives a cholesterol-lowering drug. Restricting our attention to the narrow definition of meta-analysis including two treatments, ''γ''_''k'' is two-dimensional, i.e., ''γ''_''k'' (''γ''_''1k'', ''γ''_''2k''), for which the model is recast as ''y''_''tk'' x_''tk''''β'' + ''γ''_''tk'' + ''ε''_''tk''. The advantage of writing the model in a matrix-vector notation is that the correlation between the treatments, Corr(''γ''_''1k'', ''γ''_''2k''), can be investigated. The random coefficient vector ''γ''_''k'' is then a noisy realization of the real treatment effect denoted by ''γ''. The distribution of ''γ''_''k'' is commonly assumed to be one in the location-scale family, most notably, a

multivariate normal distribution In probability theory and statistics, the multivariate normal distribution, multivariate Gaussian distribution, or joint normal distribution is a generalization of the one-dimensional ( univariate) normal distribution to higher dimensions. One d ...

, i.e., ''γ''_''k'' ~ ''N''(''γ'', Ω).

Which model to choose

Meta-regression has been employed as a technique to derive improved parameter estimates that are of direct use to policymakers. Meta-regression provides a framework for replication and offers a sensitivity analysis for model specification.T.D. Stanley and Stephen B. Jarrell, (1989). Meta-regression analysis: A quantitative method of literature surveys. ''Journal of Economic Surveys'', 19(3) 299-308. There are a number of strategies for identifying and coding empirical observational data. Meta-regression models can be extended for modeling within-study dependence, excess heterogeneity and publication selection. The fixed-effect regression model does not allow for within-study variation. The mixed effects model allows for within-study variation and between-study variation and is therefore taken as the most flexible model to choose in many applications. Although the heterogeneity assumption can be statistically tested and it is a widespread practice in many fields, if this test is followed by another set of regression analysis, the corresponding statistical inference is subject to what is called ''selective inference''. These heterogeneity tests also do not conclude that there is no heterogeneity even when they come out insignificant, and some researchers advise to opt for mixed-effect meta-regression at any rate.

Applications

Meta-regression is a statistically rigorous approach to

systematic review A systematic review is a scholarly synthesis of the evidence on a clearly presented topic using critical methods to identify, define and assess research on the topic. A systematic review extracts and interprets data from published studies on ...

s. Recent applications include quantitative reviews of the empirical literature in economics, business, energy and water policy.T.D. Stanley and Hristos Doucouliagos (2009). ''Meta-regression Analysis in Economics and Business'', New York: Routledge. Meta-regression analyses have been seen in studies of price and income elasticities of various commodities and taxes, productivity spillovers on multinational companies,H. Gorg and Eric Strobl (2001). Multinational companies and productivity spillovers: A meta-analysis. ''The Economic Journal'', 111(475) 723-739. and calculations on the value of a statistical life (VSL).F. Bellavance, Georges Dionne, and Martin Lebeau (2009). The value of a statistical life: A meta-analysis with a mixed effects regression model, ''Journal of Health Economics'', 28(2) 444-464. Other recent meta-regression analyses have focused on qualifying elasticities derived from demand functions. Examples include own price elasticities for alcohol, tobacco, water and energy. In energy conservation, meta-regression analysis has been used to evaluate behavioral information strategies in the residential electricity sector.M.A. Delmas, Miriam Fischlein and Omar I. Asensio (2013). Information strategies and energy conservation behavior: A meta-analysis of experimental studies 1975-2012. ''Energy Policy'', 61, 729-739. In water policy analysis, meta-regression has been used to evaluate cost savings estimates due to privatization of local government services for water distribution and solid waste collection.G. Bel, Xavier Fageda and Mildred E. Warner (2010). Is private production of public services cheaper than public production? A meta-regression analysis of solid waste and water services. ''Journal of Policy Analysis and Management.'' 29(3), 553-577. Meta-regression is an increasingly popular tool to evaluate the available evidence in cost-benefit analysis studies of a policy or program spread across multiple studies.

Meta-regression models

Fixed-effect meta-regression

Mixed-effect meta-regression

Which model to choose

Applications

See also

References

Further reading