Margin Of Error

The margin of error is a statistic expressing the amount of random sampling error in the results of a survey. The larger the margin of error, the less confidence one should have that a poll result would reflect the result of a simultaneous census of the entire population. The margin of error will be positive whenever a population is incompletely sampled and the outcome measure has positive variance, which is to say, whenever the measure ''varies''. The term ''margin of error'' is often used in non-survey contexts to indicate observational error in reporting measured quantities.


Concept

Consider a simple ''yes/no'' poll P as a sample of n respondents drawn from a population N\,(n \ll N) reporting the percentage p of ''yes'' responses. We would like to know how close p is to the true result of a survey of the entire population N, without having to conduct one. If, hypothetically, we were to conduct poll P over subsequent samples of n respondents (newly drawn from N), we would expect those subsequent results p_1,p_2,\ldots to be normally distributed about \overline{p}, the true but unknown percentage of the population. The ''margin of error'' describes the distance within which a specified percentage of these results is expected to vary from \overline{p}. By the central limit theorem, the distribution of sample means (or, here, percentages of ''yes'' responses) approximates a normal distribution as the sample size increases. This presumes the sampling is unbiased; it says nothing about the inherent distribution of the underlying data. According to the 68-95-99.7 rule, we would expect that 95% of the results p_1,p_2,\ldots will fall within ''about'' two standard deviations (\pm 2\sigma_{\overline{p}}) either side of the true mean \overline{p}. This interval is called the confidence interval, and the ''radius'' (half the interval) is called the ''margin of error'', corresponding to a 95% ''confidence level''. Generally, at a confidence level \gamma, a sample of size n from a population having expected standard deviation \sigma has a margin of error

:MOE_\gamma = z_\gamma \times \sqrt{\frac{\sigma^2}{n}}

where z_\gamma denotes the ''quantile'' (also, commonly, a ''z-score''), and \sqrt{\frac{\sigma^2}{n}} is the standard error.
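The general formula can be sketched in Python. The helper name margin_of_error is an illustrative choice, not part of the original text; the standard library's NormalDist supplies the quantile z_\gamma:

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p, n, confidence=0.95):
    """Margin of error for a proportion p observed in a sample of size n."""
    # Two-tailed quantile: for a 95% confidence level, z ≈ 1.96
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return z * sqrt(p * (1 - p) / n)

# Example: 50% "yes" responses from 1013 respondents
print(round(margin_of_error(0.5, 1013), 3))  # → 0.031 (±3.1%)
```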


Standard deviation and standard error

We would expect the average of normally distributed values p_1,p_2,\ldots to have a standard deviation which somehow varies with n. The smaller n, the wider the margin. This is called the standard error \sigma_{\overline{p}}. For the single result from our survey, we ''assume'' that p = \overline{p}, and that ''all'' subsequent results p_1,p_2,\ldots together would have a variance \sigma_p^2 = p(1-p).

:\text{Standard error} = \sigma_{\overline{p}} \approx \sqrt{\frac{\sigma_p^2}{n}} \approx \sqrt{\frac{p(1-p)}{n}}

Note that p(1-p) corresponds to the variance of a Bernoulli distribution.
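The Bernoulli identity can be checked numerically: the population variance of a 0/1 sample with proportion p of ones equals p(1-p). The sample below (71 ones, 29 zeros) is an invented example:

```python
from statistics import pvariance

p = 0.71
# A Bernoulli sample: 71 ones and 29 zeros, so the observed proportion is 0.71
data = [1] * 71 + [0] * 29

print(pvariance(data))        # population variance of the 0/1 data
print(round(p * (1 - p), 4))  # p(1-p) = 0.2059 — the same value
```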


Maximum margin of error at different confidence levels

For a confidence ''level'' \gamma, there is a corresponding confidence ''interval'' about the mean \mu \pm z_\gamma\sigma, that is, the interval [\mu - z_\gamma\sigma,\, \mu + z_\gamma\sigma] within which values of P should fall with probability \gamma. Precise values of z_\gamma are given by the quantile function of the normal distribution (which the 68-95-99.7 rule approximates). Note that z_\gamma is undefined for |\gamma| \ge 1. Since \max \sigma_P^2 = \max P(1-P) = 0.25 at P = 0.5, we can arbitrarily set p = \overline{p} = 0.5, calculate \sigma_P, \sigma_{\overline{p}}, and z_\gamma\sigma_{\overline{p}} to obtain the ''maximum'' margin of error for P at a given confidence level \gamma and sample size n, even before having actual results. With p = 0.5, n = 1013:

:MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx z_{0.95}\sqrt{\frac{0.5^2}{n}} = 1.96\sqrt{\frac{0.25}{n}} = 0.98/\sqrt{n} = \pm 3.1\%
:MOE_{99}(0.5) = z_{0.99}\sigma_{\overline{p}} \approx z_{0.99}\sqrt{\frac{0.5^2}{n}} = 2.58\sqrt{\frac{0.25}{n}} = 1.29/\sqrt{n} = \pm 4.1\%

Also, usefully, for any reported MOE_{95}:

:MOE_{99} = \frac{z_{0.99}}{z_{0.95}} MOE_{95} \approx 1.3 \times MOE_{95}
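The worst-case calculation above can be reproduced in a few lines of Python (a sketch; the exact quantiles give slightly different rounding than the text's two-decimal values 1.96 and 2.58):

```python
from math import sqrt
from statistics import NormalDist

n = 1013
p = 0.5  # worst case: the variance p(1-p) is maximized at p = 0.5
se = sqrt(p * (1 - p) / n)

z95 = NormalDist().inv_cdf(0.975)  # ≈ 1.96
z99 = NormalDist().inv_cdf(0.995)  # ≈ 2.58
moe95 = z95 * se
moe99 = z99 * se

print(f"MOE at 95%: ±{moe95:.2%}")            # ≈ ±3.08%
print(f"MOE at 99%: ±{moe99:.2%}")            # ≈ ±4.05%
print(f"ratio: {moe99 / moe95:.2f}")          # ≈ 1.31, the text's 1.3 rule of thumb
```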


Specific margins of error

If a poll has multiple percentage results (for example, a poll measuring a single multiple-choice preference), the result closest to 50% will have the highest margin of error. Typically, it is this number that is reported as the margin of error for the entire poll. Imagine poll P reports p_a, p_b, p_c as 71%, 27%, 2%, with n = 1013:

:MOE_{95}(P_a) = z_{0.95}\sigma_{\overline{p_a}} \approx 1.96\sqrt{\frac{0.71(1-0.71)}{n}} = 0.89/\sqrt{n} = \pm 2.8\% (as in the figure above)
:MOE_{95}(P_b) = z_{0.95}\sigma_{\overline{p_b}} \approx 1.96\sqrt{\frac{0.27(1-0.27)}{n}} = 0.87/\sqrt{n} = \pm 2.7\%
:MOE_{95}(P_c) = z_{0.95}\sigma_{\overline{p_c}} \approx 1.96\sqrt{\frac{0.02(1-0.02)}{n}} = 0.27/\sqrt{n} = \pm 0.8\%

As a given percentage approaches the extremes of 0% or 100%, its margin of error approaches ±0%.
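A quick Python sketch confirms the ordering — the result nearest 50% carries the largest margin of error:

```python
from math import sqrt
from statistics import NormalDist

n = 1013
z95 = NormalDist().inv_cdf(0.975)  # ≈ 1.96

# Margin of error for each reported percentage of the poll
moes = {p: z95 * sqrt(p * (1 - p) / n) for p in (0.71, 0.27, 0.02)}
for p, moe in moes.items():
    print(f"p = {p:.0%}: ±{moe:.2%}")

# 71% (closest to 50%) has the widest margin; 2% the narrowest
```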


Comparing percentages

Imagine multiple-choice poll P reports p_a, p_b, p_c as 46%, 42%, 12%, with n = 1013. As described above, the margin of error reported for the poll would typically be MOE_{95}(P_a), as p_a is closest to 50%. The popular notion of a ''statistical tie'' or ''statistical dead heat'', however, concerns itself not with the accuracy of the individual results, but with that of the ''ranking'' of the results. Which is in first? If, hypothetically, we were to conduct poll P over subsequent samples of n respondents (newly drawn from N), and report the result p_d = p_a - p_b, we could use the ''standard error of difference'' to understand how p_{d_1}, p_{d_2}, p_{d_3},\ldots would be expected to fall about \overline{p_d}. For this, we need to apply the ''sum of variances'' to obtain a new variance \sigma_{P_d}^2:

: \sigma_{P_d}^2 = \sigma_{P_a - P_b}^2 = \sigma_{P_a}^2 + \sigma_{P_b}^2 - 2\sigma_{P_a,P_b} = p_a(1-p_a) + p_b(1-p_b) + 2p_a p_b

where \sigma_{P_a,P_b} = -p_a p_b is the covariance of P_a and P_b. Thus (after simplifying),

: \text{Standard error of difference} = \sigma_{\overline{p_d}} \approx \sqrt{\frac{\sigma_{P_d}^2}{n}} = \sqrt{\frac{0.8784}{1013}} \approx 0.029, \quad P_d = P_a - P_b
: MOE_{95}(P_d) = z_{0.95}\sigma_{\overline{p_d}} \approx \pm 5.8\%
: MOE_{99}(P_d) = z_{0.99}\sigma_{\overline{p_d}} \approx \pm 7.6\%

Note that this assumes that P_c is close to constant, that is, respondents choosing either A or B would almost never choose C (making P_a and P_b close to ''perfectly negatively correlated''). With three or more choices in closer contention, choosing a correct formula for \sigma_{P_d}^2 becomes more complicated.
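The standard-error-of-difference calculation can be sketched directly from the formula above, with the covariance term folded in as 2·p_a·p_b:

```python
from math import sqrt
from statistics import NormalDist

n = 1013
pa, pb = 0.46, 0.42  # the two leading results of the poll

# Sum of variances with covariance -pa*pb (mutually exclusive choices)
var_d = pa * (1 - pa) + pb * (1 - pb) + 2 * pa * pb
se_d = sqrt(var_d / n)  # ≈ 0.029, the standard error of the difference

z95 = NormalDist().inv_cdf(0.975)
print(f"MOE of the 4-point lead: ±{z95 * se_d:.1%}")  # ±5.8%
```

Since the lead (4 points) is smaller than this margin (±5.8%), the ranking of A over B is not statistically settled, even though each individual result has a margin of only about ±3%.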


Effect of finite population size

The formulae above for the margin of error assume that there is an infinitely large population and thus do not depend on the size of population N, but only on the sample size n. According to sampling theory, this assumption is reasonable when the sampling fraction is small. The margin of error for a particular sampling method is essentially the same regardless of whether the population of interest is the size of a school, city, state, or country, as long as the sampling ''fraction'' is small. In cases where the sampling fraction is larger (in practice, greater than 5%), analysts might adjust the margin of error using a finite population correction (FPC) to account for the added precision gained by sampling a much larger percentage of the population. The FPC can be calculated using the formula

:\operatorname{FPC} = \sqrt{\frac{N-n}{N-1}}

...and so, if poll P were conducted over 24% of, say, an electorate of 300,000 voters (n = 72{,}000),

:MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx \frac{0.98}{\sqrt{72{,}000}} = \pm 0.4\%
:MOE_{95,\text{FPC}}(0.5) = z_{0.95}\sigma_{\overline{p}}\sqrt{\frac{N-n}{N-1}} \approx \frac{0.98}{\sqrt{72{,}000}}\sqrt{\frac{300{,}000-72{,}000}{300{,}000-1}} = \pm 0.3\%

Intuitively, for appropriately large N,

:\lim_{n \to 0} \sqrt{\frac{N-n}{N-1}} \approx 1
:\lim_{n \to N} \sqrt{\frac{N-n}{N-1}} = 0

In the former case, n is so small as to require no correction. In the latter case, the poll effectively becomes a census and sampling error becomes moot.
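The finite population correction for the electorate example can be reproduced as follows (a sketch using the same figures as the text):

```python
from math import sqrt
from statistics import NormalDist

N = 300_000          # electorate size
n = int(0.24 * N)    # 72,000 respondents: a 24% sampling fraction
p = 0.5              # worst-case proportion
z95 = NormalDist().inv_cdf(0.975)

moe = z95 * sqrt(p * (1 - p) / n)
fpc = sqrt((N - n) / (N - 1))  # finite population correction factor

print(f"uncorrected: ±{moe:.1%}")        # ±0.4%
print(f"with FPC:    ±{moe * fpc:.1%}")  # ±0.3%
```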


See also

* Engineering tolerance
* Key relevance
* Measurement uncertainty
* Random error


Sources

* Sudman, Seymour and Bradburn, Norman (1982). ''Asking Questions: A Practical Guide to Questionnaire Design''. San Francisco: Jossey Bass.

External links

* {{mathworld |urlname=MarginofError |title=Margin of Error}}