Margin Of Error

The margin of error is a statistic expressing the amount of random sampling error in the results of a survey. The larger the margin of error, the less confidence one should have that a poll result would reflect the result of a simultaneous census of the entire population. The margin of error will be positive whenever a population is incompletely sampled and the outcome measure has positive variance, which is to say, whenever the measure ''varies''. The term ''margin of error'' is often used in non-survey contexts to indicate observational error in reporting measured quantities.


Concept

Consider a simple ''yes/no'' poll P as a sample of n respondents drawn from a population N\,(n \ll N) reporting the percentage p of ''yes'' responses. We would like to know how close p is to the true result of a survey of the entire population N, without having to conduct one. If, hypothetically, we were to conduct poll P over subsequent samples of n respondents (newly drawn from N), we would expect those subsequent results p_1,p_2,\ldots to be normally distributed about \overline{p}, the true but unknown percentage of the population. The ''margin of error'' describes the distance within which a specified percentage of these results is expected to vary from \overline{p}. By the central limit theorem, the distribution of sample means (or, here, percentages of ''yes'' responses) approximates a normal distribution as the sample size increases. This presumes the sampling is unbiased; it says nothing about the inherent distribution of the underlying data. According to the 68-95-99.7 rule, we would expect that 95% of the results p_1,p_2,\ldots will fall within ''about'' two standard deviations (\pm 2\sigma_{\overline{p}}) either side of the true mean \overline{p}. This interval is called the confidence interval, and the ''radius'' (half the interval) is called the ''margin of error'', corresponding to a 95% ''confidence level''. Generally, at a confidence level \gamma, a sample of size n from a population having expected standard deviation \sigma has a margin of error

:MOE_\gamma = z_\gamma \times \sqrt{\frac{\sigma^2}{n}}

where z_\gamma denotes the ''quantile'' (also, commonly, a ''z-score''), and \sqrt{\frac{\sigma^2}{n}} is the standard error.
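The general formula can be sketched in Python. The helper name margin_of_error is an illustrative choice, not part of the original text; the standard library's NormalDist supplies the quantile z_\gamma:

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p, n, confidence=0.95):
    """Margin of error for a proportion p observed in a sample of size n."""
    # Two-tailed quantile: for a 95% confidence level, z ≈ 1.96
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return z * sqrt(p * (1 - p) / n)

# Example: 50% "yes" responses from 1013 respondents
print(round(margin_of_error(0.5, 1013), 3))  # → 0.031 (±3.1%)
```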


Standard deviation and standard error

We would expect the average of normally distributed values p_1,p_2,\ldots to have a standard deviation which somehow varies with n. The smaller n, the wider the margin. This is called the standard error \sigma_{\overline{p}}. For the single result from our survey, we ''assume'' that p = \overline{p}, and that ''all'' subsequent results p_1,p_2,\ldots together would have a variance \sigma_p^2 = p(1-p).

:\text{Standard error} = \sigma_{\overline{p}} \approx \sqrt{\frac{\sigma_p^2}{n}} \approx \sqrt{\frac{p(1-p)}{n}}

Note that p(1-p) corresponds to the variance of a Bernoulli distribution.
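The Bernoulli identity can be checked numerically: the population variance of a 0/1 sample with proportion p of ones equals p(1-p). The sample below (71 ones, 29 zeros) is an invented example:

```python
from statistics import pvariance

p = 0.71
# A Bernoulli sample: 71 ones and 29 zeros, so the observed proportion is 0.71
data = [1] * 71 + [0] * 29

print(pvariance(data))        # population variance of the 0/1 data
print(round(p * (1 - p), 4))  # p(1-p) = 0.2059 — the same value
```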


Maximum margin of error at different confidence levels

For a confidence ''level'' \gamma, there is a corresponding confidence ''interval'' about the mean \mu \pm z_\gamma\sigma, that is, the interval [\mu - z_\gamma\sigma,\, \mu + z_\gamma\sigma] within which values of P should fall with probability \gamma. Precise values of z_\gamma are given by the quantile function of the normal distribution (which the 68-95-99.7 rule approximates). Note that z_\gamma is undefined for |\gamma| \ge 1. Since \max \sigma_P^2 = \max P(1-P) = 0.25 at P = 0.5, we can arbitrarily set p = \overline{p} = 0.5, calculate \sigma_P, \sigma_{\overline{p}}, and z_\gamma\sigma_{\overline{p}} to obtain the ''maximum'' margin of error for P at a given confidence level \gamma and sample size n, even before having actual results. With p = 0.5, n = 1013:

:MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx z_{0.95}\sqrt{\frac{0.5^2}{n}} = 1.96\sqrt{\frac{0.25}{n}} = 0.98/\sqrt{n} = \pm 3.1\%
:MOE_{99}(0.5) = z_{0.99}\sigma_{\overline{p}} \approx z_{0.99}\sqrt{\frac{0.5^2}{n}} = 2.58\sqrt{\frac{0.25}{n}} = 1.29/\sqrt{n} = \pm 4.1\%

Also, usefully, for any reported MOE_{95}:

:MOE_{99} = \frac{z_{0.99}}{z_{0.95}} MOE_{95} \approx 1.3 \times MOE_{95}
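The worst-case calculation above can be reproduced in a few lines of Python (a sketch; the exact quantiles give slightly different rounding than the text's two-decimal values 1.96 and 2.58):

```python
from math import sqrt
from statistics import NormalDist

n = 1013
p = 0.5  # worst case: the variance p(1-p) is maximized at p = 0.5
se = sqrt(p * (1 - p) / n)

z95 = NormalDist().inv_cdf(0.975)  # ≈ 1.96
z99 = NormalDist().inv_cdf(0.995)  # ≈ 2.58
moe95 = z95 * se
moe99 = z99 * se

print(f"MOE at 95%: ±{moe95:.2%}")            # ≈ ±3.08%
print(f"MOE at 99%: ±{moe99:.2%}")            # ≈ ±4.05%
print(f"ratio: {moe99 / moe95:.2f}")          # ≈ 1.31, the text's 1.3 rule of thumb
```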


Specific margins of error

If a poll has multiple percentage results (for example, a poll measuring a single multiple-choice preference), the result closest to 50% will have the highest margin of error. Typically, it is this number that is reported as the margin of error for the entire poll. Imagine poll P reports p_a, p_b, p_c as 71%, 27%, 2%, with n = 1013:

:MOE_{95}(P_a) = z_{0.95}\sigma_{\overline{p_a}} \approx 1.96\sqrt{\frac{0.71(1-0.71)}{n}} = 0.89/\sqrt{n} = \pm 2.8\% (as in the figure above)
:MOE_{95}(P_b) = z_{0.95}\sigma_{\overline{p_b}} \approx 1.96\sqrt{\frac{0.27(1-0.27)}{n}} = 0.87/\sqrt{n} = \pm 2.7\%
:MOE_{95}(P_c) = z_{0.95}\sigma_{\overline{p_c}} \approx 1.96\sqrt{\frac{0.02(1-0.02)}{n}} = 0.27/\sqrt{n} = \pm 0.8\%

As a given percentage approaches the extremes of 0% or 100%, its margin of error approaches ±0%.
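A quick Python sketch confirms the ordering — the result nearest 50% carries the largest margin of error:

```python
from math import sqrt
from statistics import NormalDist

n = 1013
z95 = NormalDist().inv_cdf(0.975)  # ≈ 1.96

# Margin of error for each reported percentage of the poll
moes = {p: z95 * sqrt(p * (1 - p) / n) for p in (0.71, 0.27, 0.02)}
for p, moe in moes.items():
    print(f"p = {p:.0%}: ±{moe:.2%}")

# 71% (closest to 50%) has the widest margin; 2% the narrowest
```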


Comparing percentages

Imagine multiple-choice poll P reports p_a, p_b, p_c as 46%, 42%, 12%, with n = 1013. As described above, the margin of error reported for the poll would typically be MOE_{95}(P_a), as p_a is closest to 50%. The popular notion of a ''statistical tie'' or ''statistical dead heat'', however, concerns itself not with the accuracy of the individual results, but with that of the ''ranking'' of the results. Which is in first? If, hypothetically, we were to conduct poll P over subsequent samples of n respondents (newly drawn from N), and report the result p_d = p_a - p_b, we could use the ''standard error of difference'' to understand how p_{d_1}, p_{d_2}, p_{d_3},\ldots would be expected to fall about \overline{p_d}. For this, we need to apply the ''sum of variances'' to obtain a new variance \sigma_{P_d}^2:

: \sigma_{P_d}^2 = \sigma_{P_a - P_b}^2 = \sigma_{P_a}^2 + \sigma_{P_b}^2 - 2\sigma_{P_a,P_b} = p_a(1-p_a) + p_b(1-p_b) + 2p_a p_b

where \sigma_{P_a,P_b} = -p_a p_b is the covariance of P_a and P_b. Thus (after simplifying),

: \text{Standard error of difference} = \sigma_{\overline{p_d}} \approx \sqrt{\frac{\sigma_{P_d}^2}{n}} = \sqrt{\frac{0.8784}{1013}} \approx 0.029, \quad P_d = P_a - P_b
: MOE_{95}(P_d) = z_{0.95}\sigma_{\overline{p_d}} \approx \pm 5.8\%
: MOE_{99}(P_d) = z_{0.99}\sigma_{\overline{p_d}} \approx \pm 7.6\%

Note that this assumes that P_c is close to constant, that is, respondents choosing either A or B would almost never choose C (making P_a and P_b close to ''perfectly negatively correlated''). With three or more choices in closer contention, choosing a correct formula for \sigma_{P_d}^2 becomes more complicated.
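The standard-error-of-difference calculation can be sketched directly from the formula above, with the covariance term folded in as 2·p_a·p_b:

```python
from math import sqrt
from statistics import NormalDist

n = 1013
pa, pb = 0.46, 0.42  # the two leading results of the poll

# Sum of variances with covariance -pa*pb (mutually exclusive choices)
var_d = pa * (1 - pa) + pb * (1 - pb) + 2 * pa * pb
se_d = sqrt(var_d / n)  # ≈ 0.029, the standard error of the difference

z95 = NormalDist().inv_cdf(0.975)
print(f"MOE of the 4-point lead: ±{z95 * se_d:.1%}")  # ±5.8%
```

Since the lead (4 points) is smaller than this margin (±5.8%), the ranking of A over B is not statistically settled, even though each individual result has a margin of only about ±3%.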


Effect of finite population size

The formulae above for the margin of error assume that there is an infinitely large population and thus do not depend on the size of population N, but only on the sample size n. According to sampling theory, this assumption is reasonable when the sampling fraction is small. The margin of error for a particular sampling method is essentially the same regardless of whether the population of interest is the size of a school, city, state, or country, as long as the sampling ''fraction'' is small. In cases where the sampling fraction is larger (in practice, greater than 5%), analysts might adjust the margin of error using a finite population correction (FPC) to account for the added precision gained by sampling a much larger percentage of the population. The FPC can be calculated using the formula

:\operatorname{FPC} = \sqrt{\frac{N-n}{N-1}}

...and so, if poll P were conducted over 24% of, say, an electorate of 300,000 voters (n = 72{,}000),

:MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx \frac{0.98}{\sqrt{72{,}000}} = \pm 0.4\%
:MOE_{95,\text{FPC}}(0.5) = z_{0.95}\sigma_{\overline{p}}\sqrt{\frac{N-n}{N-1}} \approx \frac{0.98}{\sqrt{72{,}000}}\sqrt{\frac{300{,}000-72{,}000}{300{,}000-1}} = \pm 0.3\%

Intuitively, for appropriately large N,

:\lim_{n \to 0} \sqrt{\frac{N-n}{N-1}} \approx 1
:\lim_{n \to N} \sqrt{\frac{N-n}{N-1}} = 0

In the former case, n is so small as to require no correction. In the latter case, the poll effectively becomes a census and sampling error becomes moot.
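The finite population correction for the electorate example can be reproduced as follows (a sketch using the same figures as the text):

```python
from math import sqrt
from statistics import NormalDist

N = 300_000          # electorate size
n = int(0.24 * N)    # 72,000 respondents: a 24% sampling fraction
p = 0.5              # worst-case proportion
z95 = NormalDist().inv_cdf(0.975)

moe = z95 * sqrt(p * (1 - p) / n)
fpc = sqrt((N - n) / (N - 1))  # finite population correction factor

print(f"uncorrected: ±{moe:.1%}")        # ±0.4%
print(f"with FPC:    ±{moe * fpc:.1%}")  # ±0.3%
```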


See also

* Engineering tolerance
* Key relevance
* Measurement uncertainty
* Random error


Sources

* Sudman, Seymour and Bradburn, Norman (1982). ''Asking Questions: A Practical Guide to Questionnaire Design''. San Francisco: Jossey Bass.

External links

* {{mathworld |urlname=MarginofError |title=Margin of Error}}