Pseudoreplication (sometimes called unit of analysis error) has many definitions. It was originally defined in 1984 by Stuart H. Hurlbert as the use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated (though samples may be) or replicates are not statistically independent. Subsequently, Millar and Anderson identified it as a special case of inadequate specification of random factors where both random and fixed factors are present. It is sometimes narrowly interpreted as an inflation of the number of samples or replicates which are not statistically independent. This definition omits the confounding of unit and treatment effects in a misspecified F-ratio. In practice, incorrect F-ratios for statistical tests of fixed effects often arise from a default F-ratio that is formed over the error term rather than the mixed term. Lazic defined pseudoreplication as a problem of correlated samples (e.g. from longitudinal studies) where correlation is not taken into account when computing the confidence interval for the sample mean. For the effect of serial or temporal correlation, see also the Markov chain central limit theorem.

The problem of inadequate specification arises when treatments are assigned to units that are subsampled and the treatment F-ratio in an analysis of variance (ANOVA) table is formed with respect to the residual mean square rather than with respect to the among unit mean square. The F-ratio relative to the within unit mean square is vulnerable to the confounding of treatment and unit effects, especially when the number of experimental units is small (e.g. four tank units, two tanks treated, two not treated, several subsamples per tank). The problem is eliminated by forming the F-ratio relative to the correct mean square in the ANOVA table (tank by treatment MS in the example above), where this is possible. The problem is addressed by the use of mixed models.

Hurlbert reported "pseudoreplication" in 48% of the studies he examined that used inferential statistics. Several studies examining scientific papers published up to 2016 similarly found that about half of the papers were suspected of pseudoreplication.

When time and resources limit the number of experimental units, and unit effects cannot be eliminated statistically by testing over the unit variance, it is important to use other sources of information to evaluate the degree to which an F-ratio is confounded by unit effects.
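The difference between the two F-ratios in the tank example can be illustrated with a short calculation. The following is a minimal sketch, not a reference implementation: it assumes a balanced design, the function name `nested_f_ratios` is hypothetical, and the tank measurements are invented purely for illustration.

```python
from statistics import mean


def nested_f_ratios(data):
    """Return both treatment F-ratios for a balanced nested design.

    data maps treatment -> unit -> list of subsample values.
    Each result is (F, numerator df, denominator df).
    """
    all_values = [y for units in data.values() for ys in units.values() for y in ys]
    grand_mean = mean(all_values)

    ss_treatment = ss_among_units = ss_within_units = 0.0
    n_units = 0
    for units in data.values():
        treatment_values = [y for ys in units.values() for y in ys]
        treatment_mean = mean(treatment_values)
        ss_treatment += len(treatment_values) * (treatment_mean - grand_mean) ** 2
        for ys in units.values():
            unit_mean = mean(ys)
            # Variation of unit means around their treatment mean
            ss_among_units += len(ys) * (unit_mean - treatment_mean) ** 2
            # Variation of subsamples around their own unit mean
            ss_within_units += sum((y - unit_mean) ** 2 for y in ys)
            n_units += 1

    df_treatment = len(data) - 1
    df_among_units = n_units - len(data)
    df_within_units = len(all_values) - n_units

    ms_treatment = ss_treatment / df_treatment
    ms_among_units = ss_among_units / df_among_units
    ms_within_units = ss_within_units / df_within_units

    return {
        # Treatment tested over the residual (within unit) mean square
        "over_residual": (ms_treatment / ms_within_units, df_treatment, df_within_units),
        # Treatment tested over the among unit mean square
        "over_units": (ms_treatment / ms_among_units, df_treatment, df_among_units),
    }


# Invented data: four tanks, two treated, two not treated, three subsamples each.
tanks = {
    "treated": {"tank1": [10.0, 11.0, 12.0], "tank2": [14.0, 15.0, 16.0]},
    "control": {"tank3": [8.0, 9.0, 10.0], "tank4": [12.0, 13.0, 14.0]},
}
result = nested_f_ratios(tanks)
print(result["over_residual"])
print(result["over_units"])
```

With these invented values the F-ratio formed over the residual mean square is 12.0 on 1 and 8 degrees of freedom, while the F-ratio formed over the among unit mean square is 0.5 on 1 and 2 degrees of freedom. The apparent treatment effect in the first ratio comes largely from differences among tanks.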

Replication

Replication increases the precision of an estimate, while randomization addresses the broader applicability of a sample to a population. Replication must be appropriate: replication at the experimental unit level must be considered, in addition to replication within units.

Hypothesis testing

Statistical tests (e.g. the t-test and the related ANOVA family of tests) rely on appropriate replication to estimate statistical significance. Tests based on the t and F distributions assume homogeneous, normal, and independent errors. Correlated errors can lead to false precision and p-values that are too small.
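One way to see how correlated errors produce false precision is to compare the standard error of a mean computed as if all samples were independent with one adjusted for within-unit correlation. The sketch below uses the standard design-effect approximation, 1 + (m - 1)ρ, which is an assumption brought in for illustration rather than something taken from the sources above. It further assumes equal numbers of samples per unit and a single common intraclass correlation; the function name and the input values are hypothetical.

```python
import math


def naive_and_adjusted_se(sd, n_units, samples_per_unit, icc):
    """Compare a naive standard error of the mean with a correlation-adjusted one.

    Assumes equal samples per unit and a common intraclass correlation (icc).
    Returns (naive_se, adjusted_se, effective_sample_size).
    """
    n_total = n_units * samples_per_unit
    # Design effect: factor by which correlation inflates the variance of the mean
    design_effect = 1 + (samples_per_unit - 1) * icc
    n_effective = n_total / design_effect
    naive_se = sd / math.sqrt(n_total)          # treats every sample as independent
    adjusted_se = sd / math.sqrt(n_effective)   # accounts for within-unit correlation
    return naive_se, adjusted_se, n_effective


# Invented values: 4 units, 10 samples each, moderate within-unit correlation.
naive_se, adjusted_se, n_eff = naive_and_adjusted_se(
    sd=2.0, n_units=4, samples_per_unit=10, icc=0.5
)
print(naive_se, adjusted_se, n_eff)
```

Under these assumptions the 40 correlated samples carry the information of roughly 7 independent ones, so the naive standard error is too small. When the intraclass correlation is 1, the effective sample size falls to the number of units.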

Types

Hurlbert (1984) defined four types of pseudoreplication.
* Simple pseudoreplication (Figure 5a in Hurlbert 1984) occurs when there is one experimental unit per treatment. Inferential statistics cannot separate variability due to treatment from variability due to experimental units when there is only one measurement per unit.
* Temporal pseudoreplication (Figure 5c in Hurlbert 1984) occurs when experimental units differ enough in time that temporal effects among units are likely, and treatment effects are correlated with temporal effects. Inferential statistics cannot separate variability due to treatment from variability due to experimental units when there is only one measurement per unit.
* Sacrificial pseudoreplication (Figure 5b in Hurlbert 1984) occurs when means within a treatment are used in an analysis, and these means are tested over the within unit variance. In Figure 5b, the erroneous F-ratio will have 1 df in the numerator (treatment) mean square and 4 df in the denominator mean square (2-1 = 1 df for each experimental unit). The correct F-ratio will have 1 df in the numerator (treatment) and 2 df in the denominator (2-1 = 1 df for each treatment). The correct F-ratio controls for effects of experimental units, but with 2 df in the denominator it will have little power to detect treatment differences.
* Implicit pseudoreplication occurs when standard errors (or confidence limits) are estimated within experimental units. As with other sources of pseudoreplication, treatment effects cannot be statistically separated from effects due to variation among experimental units.
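The degrees of freedom quoted for the sacrificial case can be reproduced with a small helper. This is a sketch under an assumed layout of two treatments, two experimental units per treatment, and two samples per unit, which is inferred from the df values given in the text rather than read from Hurlbert's figure. The function name is hypothetical.

```python
def f_ratio_denominator_df(n_treatments, units_per_treatment, samples_per_unit):
    """Return (within_unit_df, among_unit_df) for a balanced nested design."""
    n_units = n_treatments * units_per_treatment
    # Erroneous denominator: samples pooled within each experimental unit
    within_unit_df = n_units * (samples_per_unit - 1)
    # Correct denominator: experimental units within each treatment
    among_unit_df = n_treatments * (units_per_treatment - 1)
    return within_unit_df, among_unit_df


erroneous_df, correct_df = f_ratio_denominator_df(
    n_treatments=2, units_per_treatment=2, samples_per_unit=2
)
print(erroneous_df, correct_df)
```

For the assumed layout this gives 4 df for the erroneous denominator and 2 df for the correct one, matching the values in the text. Adding more samples per unit increases only the erroneous df; the correct df grows only when experimental units are added.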
