Binomial Proportion Confidence Interval

In statistics, a binomial proportion confidence interval is a confidence interval for the probability of success calculated from the outcome of a series of success–failure experiments (Bernoulli trials). In other words, a binomial proportion confidence interval is an interval estimate of a success probability \ p\ when only the number of experiments \ n\ and the number of successes \ n_\mathsf{S}\ are known. There are several formulas for a binomial confidence interval, but all of them rely on the assumption of a binomial distribution. In general, a binomial distribution applies when an experiment is repeated a fixed number of times, each trial of the experiment has two possible outcomes (success and failure), the probability of success is the same for each trial, and the trials are statistically independent. Because the binomial distribution is a discrete probability distribution (i.e., not continuous) and difficult to calculate for large numbers of trials, a variety of approximations are used to calculate this confidence interval, all with their own tradeoffs in accuracy and computational intensity.

A simple example of a binomial distribution is the set of various possible outcomes, and their probabilities, for the number of heads observed when a coin is flipped ten times. The observed binomial proportion is the fraction of the flips that turn out to be heads. Given this observed proportion, the confidence interval for the true probability of the coin landing on heads is a range of possible proportions, which may or may not contain the true proportion. A 95% confidence interval for the proportion, for instance, will contain the true proportion 95% of the times that the procedure for constructing the confidence interval is employed.


Problems with using a normal approximation or "Wald interval"

A commonly used formula for a binomial confidence interval relies on approximating the distribution of error about a binomially-distributed observation, \hat p, with a normal distribution. The normal approximation depends on the de Moivre–Laplace theorem (the original, binomial-only version of the central limit theorem) and becomes unreliable when it violates the theorem's premises, as the sample size becomes small or the success probability grows close to either 0 or 1.

Using the normal approximation, the success probability \ p\ is estimated by

: \ p ~~ \approx ~~ \hat p \pm \frac{z_\alpha}{\sqrt{n\ }}\ \sqrt{\hat p\ (1 - \hat p)\ }\ ,

where \ \hat p \equiv \frac{n_\mathsf{S}}{n}\ is the proportion of successes in a Bernoulli trial process and an estimator for \ p\ in the underlying Bernoulli distribution. The equivalent formula in terms of observation counts is

: \ p ~~ \approx ~~ \frac{n_\mathsf{S}}{n} \pm \frac{z_\alpha}{n} \sqrt{\frac{n_\mathsf{S}\ n_\mathsf{F}}{n}\ }\ ,

where the data are the results of \ n\ trials that yielded \ n_\mathsf{S}\ successes and \ n_\mathsf{F} = n - n_\mathsf{S}\ failures. The distribution function argument \ z_\alpha\ is the \ 1 - \tfrac{\alpha}{2}\ quantile of a standard normal distribution (i.e., the probit) corresponding to the target error rate \ \alpha ~. For a 95% confidence level, the error \ \alpha = 1 - 0.95 = 0.05\ , so \ 1 - \tfrac{\alpha}{2} = 0.975\ and \ z_{0.05} = 1.96 ~.

When using the Wald formula to estimate \ p\ , or just considering the possible outcomes of this calculation, two problems immediately become apparent:

* First, for \ \hat p\ approaching either 0 or 1, the interval narrows to zero width (falsely implying certainty).
* Second, for values of \ \hat p < \frac{z^2_\alpha}{\ n + z^2_\alpha\ }\ (probability too low / too close to 0), the interval boundaries exceed \ [0, 1]\ (''overshoot'').

(Another version of the second, overshoot problem arises when instead \ 1 - \hat p\ falls below the same upper bound: probability too high / too close to 1.)

An important theoretical derivation of this confidence interval involves the inversion of a hypothesis test. Under this formulation, the confidence interval represents those values of the population parameter that would have large ''p''-values if they were tested as a hypothesized population proportion. The collection of values, \ \theta\ , for which the normal approximation is valid can be represented as

: \left\{\ \theta ~~ \Bigg| ~~ y_{\frac{\alpha}{2}} \le \frac{\ \hat p - \theta\ }{\sqrt{\frac{\ \hat p\ (1 - \hat p)\ }{n}\ }} \le z_{\frac{\alpha}{2}}\ \right\}\ ,

where \ y_{\frac{\alpha}{2}}\ is the lower \ \tfrac{\alpha}{2}\ quantile of a standard normal distribution, vs. \ z_{\frac{\alpha}{2}}\ , which is the ''upper'' (i.e., \ 1 - \tfrac{\alpha}{2}\ ) quantile. Since the test in the middle of the inequality is a Wald test, the normal approximation interval is sometimes called the Wald interval or Wald method, after Abraham Wald, but it was first described by Laplace (1812).
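
For concreteness, a minimal Python sketch of the Wald computation follows, assuming scipy for the probit (the function name wald_interval is ours, not part of the source); it also exhibits the two problems just described:

from scipy.stats import norm
import numpy as np

def wald_interval(n_s, n, alpha=0.05):
    # Normal-approximation (Wald) interval: p_hat +/- z * sqrt(p_hat (1 - p_hat) / n)
    p_hat = n_s / n
    z = norm.ppf(1 - alpha / 2)  # upper 1 - alpha/2 quantile (probit)
    half_width = z * np.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# The bounds are not clipped to [0, 1]: wald_interval(1, 10) overshoots below 0,
# and wald_interval(0, 10) collapses to zero width.
print(wald_interval(20, 400))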


Bracketing the confidence interval

Extending the normal approximation and Wald–Laplace interval concepts, Michael Short has shown that inequalities on the approximation error between the binomial distribution and the normal distribution can be used to accurately bracket the estimate of the confidence interval around \ p\ between a lower and an upper bounding ratio. Here \ p\ is again the (unknown) proportion of successes in a Bernoulli trial process (as opposed to \ \hat p \equiv \tfrac{k}{n}\ that estimates it) measured with \ n\ trials yielding \ k\ successes, \ z_\alpha\ is the \ 1 - \tfrac{\alpha}{2}\ quantile of a standard normal distribution (i.e., the probit) corresponding to the target error rate \ \alpha\ , and the eight constants \ C_\mathsf{1}, \ldots, C_\mathsf{8}\ appearing in the two bounding ratios are simple algebraic functions of \ z_\alpha ~. For a fixed \ \alpha\ (and hence \ z_\alpha\ ), these inequalities give easily computed one- or two-sided intervals which bracket the exact binomial upper and lower confidence limits corresponding to the error rate \ \alpha ~.


Standard error of a proportion estimation when using weighted data

Let there be a simple random sample \ X_1,\ \ldots,\ X_n\ where each \ X_i\ is i.i.d. from a Bernoulli(\ p\ ) distribution and weight \ w_i\ is the weight for each observation, with the (positive) weights \ w_i\ normalized so they sum to 1. The weighted sample proportion is: \ \hat p = \sum_{i=1}^n\ w_i\ X_i ~. Since each of the \ X_i\ is independent from all the others, and each one has variance \ \operatorname{Var}\{X_i\} = p\ (1 - p)\ for every \ i = 1,\ \ldots,\ n\ , the sampling variance of the proportion therefore is:

:\ \operatorname{Var}\{\hat p\} = \sum_{i=1}^n \operatorname{Var}\{w_i\ X_i\} = p\ ( 1 - p )\ \sum_{i=1}^n w_i^2 ~.

The standard error of \ \hat p\ is the square root of this quantity. Because we do not know \ p\ (1 - p)\ , we have to estimate it. Although there are many possible estimators, a conventional one is to use \ \hat p\ , the sample mean, and plug this into the formula. That gives:

:\ \operatorname{SE}\{\hat p\} \approx \sqrt{\ \hat p\ (1 - \hat p)\ \sum_{i=1}^n w_i^2\ }\

For otherwise unweighted data, the effective weights are uniform \ w_i = \tfrac{1}{n}\ , giving \ \sum_{i=1}^n w_i^2 = \tfrac{1}{n} ~. The \ \operatorname{SE}\{\hat p\}\ becomes \ \sqrt{\tfrac{\ \hat p\ (1 - \hat p)\ }{n}\ }\ , leading to the familiar formulas, showing that the calculation for weighted data is a direct generalization of them.
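
As a concrete illustration, here is a short Python sketch of the weighted proportion and its estimated standard error (the helper name weighted_proportion_se is ours):

import numpy as np

def weighted_proportion_se(x, w):
    # x: 0/1 outcomes; w: positive weights, normalized here to sum to 1.
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    p_hat = np.sum(w * x)                             # weighted sample proportion
    se = np.sqrt(p_hat * (1 - p_hat) * np.sum(w**2))  # plug-in standard error
    return p_hat, se

# With uniform weights this reduces to the familiar sqrt(p_hat (1 - p_hat) / n):
x = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
print(weighted_proportion_se(x, np.ones(len(x))))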


Wilson score interval

The Wilson score interval was developed by E.B. Wilson (1927). It is an improvement over the normal approximation interval in multiple respects: Unlike the symmetric normal approximation interval (above), the Wilson score interval is ''asymmetric'', and it doesn't suffer from problems of ''overshoot'' and ''zero-width intervals'' that afflict the normal interval. It can be safely employed with small samples and skewed observations. The observed coverage probability is consistently closer to the nominal value, \ 1 - \alpha ~. Like the normal interval, the interval can be computed directly from a formula.

Wilson started with the normal approximation to the binomial:

:\ z_\alpha \approx \frac{\ \hat p - p\ }{\sigma_n}\

where \ z_\alpha\ is the standard normal interval half-width corresponding to the desired confidence \ 1 - \alpha ~. The analytic formula for a binomial sample standard deviation is \ \sigma_n = \sqrt{\tfrac{\ p\ (1 - p)\ }{n}\ } ~. Combining the two, and squaring out the radical, gives an equation that is quadratic in \ p\ :

: \left(\ p - \hat p\ \right)^2 = \frac{z^2_\alpha}{n}\ p\ \left(\ 1 - p\ \right)\ \qquad or \qquad p^2 - 2\ p\ \hat p + {\hat p}^2 = p\ \frac{z^2_\alpha}{n} - p^2\ \frac{z^2_\alpha}{n} ~.

Transforming the relation into a standard-form quadratic equation for \ p\ , treating \ \hat p\ and \ n\ as known values from the sample (see prior section), and using the value of \ z_\alpha\ that corresponds to the desired confidence \ 1 - \alpha\ for the estimate of \ p\ gives this:

: \left(\ 1 + \frac{z^2_\alpha}{n}\ \right)\ p^2 - \left(\ 2\ \hat p + \frac{z^2_\alpha}{n}\ \right)\ p + \biggl(\ {\hat p}^2\ \biggr) = 0 ~,

where all of the values bracketed by parentheses are known quantities. The solution for \ p\ estimates the upper and lower limits of the confidence interval for \ p ~. Hence the probability of success \ p\ is estimated by \ \hat p\ and with \ 1 - \alpha\ confidence bracketed in the interval

: p \quad \underset{\alpha}{\in} \quad \bigl(\ w^- ,\ w^+\ \bigr) ~~ = ~~ \frac{1}{\ 1 + \frac{z^2_\alpha}{n}\ } \Biggl(\ \hat p + \frac{z^2_\alpha}{2 n} ~~ \pm ~~ \frac{z_\alpha}{2 n}\ \sqrt{\ 4 n\ \hat p\ (1 - \hat p) + z^2_\alpha\ } ~\Biggr)\

where \ \underset{\alpha}{\in}\ is an abbreviation for

:\ \operatorname{Pr}\Bigl\{\ w^- \le p \le w^+\ \Bigr\} = 1 - \alpha ~.

An equivalent expression using the observation counts \ n_\mathsf{S}\ and \ n_\mathsf{F}\ is

: p \quad \underset{\alpha}{\in} \quad \frac{\ n_\mathsf{S} + \tfrac{1}{2} z^2_\alpha\ }{n + z^2_\alpha} ~ \pm ~ \frac{z_\alpha}{n + z^2_\alpha} \sqrt{\ \frac{\ n_\mathsf{S}\ n_\mathsf{F}\ }{n} + \frac{z^2_\alpha}{4}\ }\ ,

with the counts as above: \ n_\mathsf{S} \equiv\ the count of observed "successes", \ n_\mathsf{F} \equiv\ the count of observed "failures", and their sum is the total number of observations \ n = n_\mathsf{S} + n_\mathsf{F} ~.

In practical tests of the formula's results, users find that this interval has good properties even for a small number of trials and / or the extremes of the probability estimate, \ \hat p \equiv \frac{n_\mathsf{S}}{n} ~. Intuitively, the center value of this interval is the weighted average of \ \hat p\ and \ \tfrac{1}{2}\ , with \ \hat p\ receiving greater weight as the sample size increases. Formally, the center value corresponds to using a pseudocount of \ \tfrac{1}{2} z_\alpha^2\ , the number of standard deviations of the confidence interval: Add this number to both the count of successes and of failures to yield the estimate of the ratio. For the common two standard deviations in each direction interval (approximately 95% coverage, which itself is approximately 1.96 standard deviations), this yields the estimate \ \frac{\ n_\mathsf{S} + 2\ }{n + 4}\ , which is known as the "plus four rule".

Although the quadratic can be solved explicitly, in most cases Wilson's equations can also be solved numerically using the fixed-point iteration

: p_{k+1} = \hat p \pm z_\alpha\ \sqrt{\frac{\ p_k\ (1 - p_k)\ }{n}\ }\

with \ p_0 = \hat p ~.

The Wilson interval can also be derived from the single sample z-test or Pearson's chi-squared test with two categories. The resulting interval,

: \left\{\ \theta ~~ \Bigg| ~~ y_\alpha \le \frac{\ \hat p - \theta\ }{\sqrt{\frac{\ \theta\ ( 1 - \theta )\ }{n}\ }} \le z_\alpha\ \right\}\ ,

(with \ y_\alpha\ the lower \ \tfrac{\alpha}{2}\ quantile) can then be solved for \ \theta\ to produce the Wilson score interval. The test in the middle of the inequality is a score test.
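
For illustration, a minimal Python sketch of the closed-form Wilson bounds, assuming scipy for the probit (the function name wilson_interval is ours):

from scipy.stats import norm
import numpy as np

def wilson_interval(n_s, n, alpha=0.05):
    # Wilson score interval (w-, w+): the explicit solution of the quadratic above.
    p_hat = n_s / n
    z = norm.ppf(1 - alpha / 2)
    center = (p_hat + z**2 / (2 * n)) / (1 + z**2 / n)
    half = (z / (2 * n)) * np.sqrt(4 * n * p_hat * (1 - p_hat) + z**2) / (1 + z**2 / n)
    return center - half, center + half

# Unlike the Wald interval, the bounds stay inside [0, 1] even for n_s = 0:
print(wilson_interval(0, 10))
print(wilson_interval(20, 400))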


The interval equality principle

Since the interval is derived by solving from the normal approximation to the binomial, the Wilson score interval \ \bigl(\ w^-\ ,\ w^+\ \bigr)\ has the property of being guaranteed to obtain the same result as the equivalent z-test or chi-squared test. This property can be visualised by plotting the probability density function for the Wilson score interval (''see'' Wallis), and then plotting a normal distribution across each bound. The tail areas of the resulting Wilson and normal distributions, which represent the chance of a significant result in that direction, must be equal. The continuity-corrected Wilson score interval and the Clopper–Pearson interval are also compliant with this property. The practical import is that these intervals may be employed as significance tests, with identical results to the source test, and new tests may be derived by geometry.


Wilson score interval with continuity correction

The Wilson interval may be modified by employing a continuity correction, in order to align the minimum coverage probability, rather than the average coverage probability, with the nominal value, \ 1 - \alpha ~. Just as the Wilson interval mirrors Pearson's chi-squared test, the Wilson interval with continuity correction mirrors the equivalent Yates' chi-squared test. The following formulae for the lower and upper bounds of the Wilson score interval with continuity correction \ \left( w_\mathsf{cc}^- , w_\mathsf{cc}^+ \right)\ are derived from Newcombe:

:\begin{align} w_\mathsf{cc}^- &= \max \left\{\ 0,\ \frac{\ 2 n \hat p + z^2_\alpha - \left[\ z_\alpha \sqrt{\ z^2_\alpha - \frac{1}{n} + 4 n \hat p\ ( 1 - \hat p ) + ( 4 \hat p - 2 )\ } + 1\ \right]\ }{ 2\ ( n + z^2_\alpha ) }\ \right\}\ ,\\ w_\mathsf{cc}^+ &= \min \left\{\ 1,\ \frac{\ 2 n \hat p + z^2_\alpha + \left[\ z_\alpha \sqrt{\ z^2_\alpha - \frac{1}{n} + 4 n \hat p\ ( 1 - \hat p ) - ( 4 \hat p - 2 )\ } + 1\ \right]\ }{ 2\ ( n + z^2_\alpha ) }\ \right\}\ , \end{align}

for \ \hat p \ne 0\ and \ \hat p \ne 1 ~. If \ \hat p = 0\ , then \ w_\mathsf{cc}^-\ must instead be set to \ 0\ ; if \ \hat p = 1\ , then \ w_\mathsf{cc}^+\ must instead be set to \ 1 ~.

Wallis (2021) identifies a simpler method for computing continuity-corrected Wilson intervals that employs a special function based on Wilson's lower-bound formula: In Wallis' notation, for the lower bound, let

:~ \mathsf{WilsonLower}\left(\ \hat p,\ n,\ \tfrac{\alpha}{2}\ \right)\ \equiv\ w^-\ =\ \frac{1}{\ 1 + \frac{z^2_\alpha}{n}\ } \Biggl(\ \hat p + \frac{z^2_\alpha}{2 n} ~~ - ~~ \frac{z_\alpha}{2 n}\ \sqrt{\ 4 n\ \hat p\ (1 - \hat p) + z^2_\alpha\ } ~\Biggr)\ ,

where \ \alpha\ is the selected tolerable error level for \ z_\alpha ~. Then

:\ w_\mathsf{cc}^- = \mathsf{WilsonLower}\left(\ \max \left\{\ \hat p - \tfrac{1}{2n},\ 0\ \right\},\ n,\ \tfrac{\alpha}{2}\ \right) ~.

This method has the advantage of being further decomposable.
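
A direct transcription of Newcombe's bounds above into Python might look like the following sketch (the function name is ours; the \ \hat p = 0\ and \ \hat p = 1\ special cases are set per the text):

from scipy.stats import norm
import numpy as np

def wilson_cc_interval(n_s, n, alpha=0.05):
    # Continuity-corrected Wilson interval, following Newcombe's formulae.
    p = n_s / n
    z = norm.ppf(1 - alpha / 2)
    denom = 2 * (n + z**2)
    if n_s == 0:
        lower = 0.0
    else:
        lower = max(0.0, (2 * n * p + z**2
                          - (z * np.sqrt(z**2 - 1 / n + 4 * n * p * (1 - p) + (4 * p - 2)) + 1)) / denom)
    if n_s == n:
        upper = 1.0
    else:
        upper = min(1.0, (2 * n * p + z**2
                          + (z * np.sqrt(z**2 - 1 / n + 4 * n * p * (1 - p) - (4 * p - 2)) + 1)) / denom)
    return lower, upper

print(wilson_cc_interval(20, 400))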


Jeffreys interval

The ''Jeffreys interval'' has a Bayesian derivation, but good frequentist properties (outperforming most frequentist constructions). In particular, it has coverage properties that are similar to those of the Wilson interval, but it is one of the few intervals with the advantage of being ''equal-tailed'' (e.g., for a 95% confidence interval, the probabilities of the interval lying above or below the true value are both close to 2.5%). In contrast, the Wilson interval has a systematic bias such that it is centred too close to \ p = 0.5 ~.

The Jeffreys interval is the Bayesian credible interval obtained when using the non-informative Jeffreys prior for the binomial proportion \ p ~. The Jeffreys prior for this problem is a Beta distribution with parameters \ \left( \tfrac{1}{2}, \tfrac{1}{2} \right)\ , a conjugate prior. After observing \ x\ successes in \ n\ trials, the posterior distribution for \ p\ is a Beta distribution with parameters \ \left( x + \tfrac{1}{2}, n - x + \tfrac{1}{2} \right) ~. When \ x \ne 0\ and \ x \ne n\ , the Jeffreys interval is taken to be the \ 100\ \left( 1 - \alpha \right)\ \%\ equal-tailed posterior probability interval, i.e., the \ \tfrac{\alpha}{2}\ and \ 1 - \tfrac{\alpha}{2}\ quantiles of a Beta distribution with parameters \ \left(\ x + \tfrac{1}{2},\ n - x + \tfrac{1}{2}\ \right) ~. In order to avoid the coverage probability tending to zero when \ p \to 0\ or 1, when \ x = 0\ the upper limit is calculated as before but the lower limit is set to 0, and when \ x = n\ the lower limit is calculated as before but the upper limit is set to 1. Jeffreys' interval can also be thought of as a frequentist interval based on inverting the p-value from the G-test after applying the Yates correction to avoid a potentially-infinite value for the test statistic.
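
As a concrete sketch, the equal-tailed Jeffreys interval can be computed from the Beta posterior quantiles with scipy (the function name jeffreys_interval is illustrative):

from scipy.stats import beta

def jeffreys_interval(x, n, alpha=0.05):
    # Equal-tailed interval from the Beta(x + 1/2, n - x + 1/2) posterior,
    # with the boundary cases x = 0 and x = n handled as described above.
    lower = 0.0 if x == 0 else beta.ppf(alpha / 2, x + 0.5, n - x + 0.5)
    upper = 1.0 if x == n else beta.ppf(1 - alpha / 2, x + 0.5, n - x + 0.5)
    return lower, upper

print(jeffreys_interval(20, 400))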


Clopper–Pearson interval

The Clopper–Pearson interval is an early and very common method for calculating binomial confidence intervals. This is often called an 'exact' method, as it attains the nominal coverage level in an exact sense, meaning that the coverage level is never less than the nominal \ 1 - \alpha ~. The Clopper–Pearson interval can be written as

:\ S_{\le} \cap S_{\ge}\

or equivalently,

:\ \left( \inf S_{\ge}\ ,\ \sup S_{\le} \right)\

with

: S_{\le} ~ \equiv ~ \left\{\ \theta ~ \Big| ~ \operatorname{P}\left[\ \mathsf{Bin}\left( n; \theta \right) \le x\ \right] > \tfrac{\alpha}{2}\ \right\} ~ and

:~ S_{\ge} ~ \equiv ~ \left\{\ \theta ~ \Big| ~ \operatorname{P}\left[\ \mathsf{Bin}\left( n; \theta \right) \ge x\ \right] > \tfrac{\alpha}{2}\ \right\}\ ,

where \ 0 \le x \le n\ is the number of successes observed in the sample and \ \mathsf{Bin}\left( n; p \right)\ is a binomial random variable with \ n\ trials and probability of success \ p ~.

Equivalently we can say that the Clopper–Pearson interval is \ \left(\ \frac{x}{n} - \varepsilon_1,\ \frac{x}{n} + \varepsilon_2\ \right)\ with confidence level \ 1 - \alpha\ if each \ \varepsilon_i\ is the infimum of those values such that the following tests of hypothesis succeed with significance \ \frac{\alpha}{2}\ :

# H0: \ p = \frac{x}{n} - \varepsilon_1\ with HA: \ p > \frac{x}{n} - \varepsilon_1\
# H0: \ p = \frac{x}{n} + \varepsilon_2\ with HA: \ p < \frac{x}{n} + \varepsilon_2 ~.

Because of a relationship between the binomial distribution and the beta distribution, the Clopper–Pearson interval is sometimes presented in an alternate format that uses quantiles from the beta distribution:

:\ B\!\left(\ \tfrac{\alpha}{2}\ ;\ x\ ,\ n - x + 1 \right) ~ < ~ p ~ < ~ B\!\left(\ 1 - \tfrac{\alpha}{2}\ ;\ x + 1\ ,\ n - x\ \right)\

where \ x\ is the number of successes, \ n\ is the number of trials, and \ B\!\left(\ p\ ;\ v\ ,\ w\ \right)\ is the \ p\ th quantile from a beta distribution with shape parameters \ v\ and \ w ~. Thus, \ p_\mathsf{min} < p < p_\mathsf{max}\ , where:

: \tfrac{n!}{\ (x-1)!\ (n-x)!\ }\ \int_0^{p_\mathsf{min}}\ t^{x-1}\ (1-t)^{n-x}\ \mathrm{d}t ~~ = ~~ \tfrac{\alpha}{2}\ ,

: \tfrac{n!}{\ x!\ (n-x-1)!\ }\ \int_0^{p_\mathsf{max}}\ t^{x}\ (1-t)^{n-x-1}\ \mathrm{d}t ~ = ~ 1 - \tfrac{\alpha}{2} ~.

The binomial proportion confidence interval is then \ \left(\ p_\mathsf{min}\ ,\ p_\mathsf{max} \right)\ , as follows from the relation between the binomial distribution cumulative distribution function and the regularized incomplete beta function.

When \ x\ is either \ 0\ or \ n\ , closed-form expressions for the interval bounds are available: when \ x = 0\ the interval is

:\ \left(\ 0\ ,\ 1 - \left(\ \tfrac{\alpha}{2}\ \right)^{\frac{1}{n}} \right)\

and when \ x = n\ it is

:\ \left(\ \left(\ \tfrac{\alpha}{2}\ \right)^{\frac{1}{n}}\ ,\ 1\ \right) ~.

The beta distribution is, in turn, related to the F-distribution, so a third formulation of the Clopper–Pearson interval can be written using F quantiles:

: \left(\ 1 + \frac{n - x + 1}{\ x\ F\!\left[\ \tfrac{\alpha}{2}\ ;\ 2 x\ ,\ 2\ (n - x + 1)\ \right]\ }\ \right)^{-1} ~~ < ~~ p ~~ < ~~ \left(\ 1 + \frac{n - x}{\ (x + 1)\ F\!\left[\ 1 - \tfrac{\alpha}{2}\ ;\ 2\ (x + 1)\ ,\ 2\ (n - x)\ \right]\ }\ \right)^{-1}

where \ x\ is the number of successes, \ n\ is the number of trials, and \ F\!\left(\ c\ ;\ d_1\ , d_2\ \right)\ is the \ c\ quantile from an F-distribution with \ d_1\ and \ d_2\ degrees of freedom.

The Clopper–Pearson interval is an 'exact' interval, since it is based directly on the binomial distribution rather than any approximation to the binomial distribution. This interval never has less than the nominal coverage for any population proportion, but that means that it is usually conservative. For example, the true coverage rate of a 95% Clopper–Pearson interval may be well above 95%, depending on \ n\ and \ p ~. Thus the interval may be wider than it needs to be to achieve 95% confidence, and wider than other intervals. In contrast, it is worth noting that other confidence intervals may have coverage levels that are lower than the nominal \ 1 - \alpha\ : e.g., the normal approximation (or "standard") interval, Wilson interval, Agresti–Coull interval, etc., with a nominal coverage of 95% may in fact cover less than 95%, even for large sample sizes.

The definition of the Clopper–Pearson interval can also be modified to obtain exact confidence intervals for different distributions. For instance, it can also be applied to the case where the samples are drawn without replacement from a population of a known size, instead of repeated draws of a binomial distribution. In this case, the underlying distribution would be the hypergeometric distribution.

The interval boundaries can be computed with numerical beta-quantile functions (e.g., qbeta in R or scipy.stats.beta.ppf in Python), as in the following Python snippet:

from scipy.stats import beta
import numpy as np

k = 20        # number of successes
n = 400       # number of trials
alpha = 0.05  # target error rate (95% confidence)

# Lower bound: the alpha/2 quantile of Beta(k, n - k + 1);
# upper bound: the 1 - alpha/2 quantile of Beta(k + 1, n - k).
p_u, p_o = beta.ppf([alpha / 2, 1 - alpha / 2], [k, k + 1], [n - k + 1, n - k])

# beta.ppf returns nan when a shape parameter is zero (k = 0 or k = n),
# where the closed-form bounds are 0 and 1 respectively.
if np.isnan(p_o):
    p_o = 1
if np.isnan(p_u):
    p_u = 0


Agresti–Coull interval

The Agresti–Coull interval is another approximate binomial confidence interval. Given \ n_\mathsf{S}\ successes in \ n\ trials, define

:\ \tilde n \equiv n + z^2_\alpha\

and

:\ \tilde p = \frac{1}{\tilde n}\left(\!\ n_\mathsf{S} + \tfrac{\ z^2_\alpha\ }{2}\!\ \right)\

Then, a confidence interval for \ p\ is given by

:\ p ~~ \approx ~~ \tilde p ~ \pm ~ z_\alpha\ \sqrt{\frac{\ \tilde p\ ( 1 - \tilde p )\ }{\tilde n}\ }\

where \ z_\alpha = \operatorname{probit}\!\left(\ 1 - \tfrac{\alpha}{2}\ \right)\ is the quantile of a standard normal distribution, as before (for example, a 95% confidence interval requires \ \alpha = 0.05\ , thereby producing \ z_{0.05} = 1.96\ ). According to Brown, Cai, & DasGupta (2001), taking \ z = 2\ instead of 1.96 produces the "add 2 successes and 2 failures" interval previously described by Agresti & Coull. This interval can be summarised as employing the centre-point adjustment, \ \tilde p\ , of the Wilson score interval, and then applying the normal approximation to this point:

:\ \tilde p = \frac{\ n_\mathsf{S} + \tfrac{z^2_\alpha}{2}\ }{\ n + z^2_\alpha\ }\
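
A minimal Python sketch of the Agresti–Coull computation, assuming scipy for the probit (the function name is illustrative):

from scipy.stats import norm
import numpy as np

def agresti_coull_interval(n_s, n, alpha=0.05):
    # Wald formula applied to the adjusted counts (n + z^2, n_s + z^2 / 2).
    z = norm.ppf(1 - alpha / 2)
    n_tilde = n + z**2
    p_tilde = (n_s + z**2 / 2) / n_tilde
    half = z * np.sqrt(p_tilde * (1 - p_tilde) / n_tilde)
    return p_tilde - half, p_tilde + half

# With z = 2 this is the "add 2 successes and 2 failures" rule:
print(agresti_coull_interval(20, 400))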


Arcsine transformation

The arcsine transformation has the effect of pulling out the ends of the distribution. While it can stabilize the variance (and thus confidence intervals) of proportion data, its use has been criticized in several contexts. Let \ X\ be the number of successes in \ n\ trials and let \ p = \tfrac{1}{n} X ~. The variance of \ p\ is

: \operatorname{Var}\{p\} = \tfrac{1}{n}\ p\ (1 - p) ~.

Using the arc sine transform, the variance of the arcsine of \ \sqrt{p\ }\ is

:\ \operatorname{Var} \left\{ \arcsin \sqrt{p\ } \right\} ~ \approx ~ \frac{\operatorname{Var}\{p\}}{\ 4\ p\ (1 - p)\ } = \frac{\ p\ (1 - p)\ }{\ 4\ n\ p\ (1 - p)\ } = \frac{1}{4 n} ~.

So, the confidence interval itself has the form

:\ \sin^2 \left(\ -\ \frac{z_\alpha}{2 \sqrt{n\ }} + \arcsin \sqrt{p\ } ~\right) ~ < ~ \theta ~ < ~ \sin^2 \left(\ +\ \frac{z_\alpha}{2 \sqrt{n\ }} + \arcsin \sqrt{p\ } ~\right)\ ,

where \ z_\alpha\ is the \ 1 - \tfrac{\alpha}{2}\ quantile of a standard normal distribution. This method may be used to estimate the variance of \ p\ but its use is problematic when \ p\ is close to 0 or 1.
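
A small Python sketch of this transform-based interval, assuming scipy for the probit (the function name arcsine_interval is ours):

import numpy as np
from scipy.stats import norm

def arcsine_interval(x, n, alpha=0.05):
    # Variance-stabilizing interval: the transformed estimate has
    # approximately constant standard deviation 1 / (2 sqrt(n)).
    p = x / n
    z = norm.ppf(1 - alpha / 2)
    half = z / (2 * np.sqrt(n))
    center = np.arcsin(np.sqrt(p))
    return np.sin(center - half) ** 2, np.sin(center + half) ** 2

print(arcsine_interval(20, 400))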


''t_a'' transform

Let \ p\ be the proportion of successes. For \ 0 \le a \le 2\ ,

:\ t_a\ =\ \log\left(\ \frac{p^a}{\ (1 - p)^{2-a}\ }\ \right)\ =\ a\ \log p - (2 - a)\ \log(\ 1 - p\ )\

This family is a generalisation of the logit transform, which is a special case with \ a = 1\ , and can be used to transform a proportional data distribution to an approximately normal distribution. The parameter \ a\ has to be estimated for the data set.
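
For illustration, a minimal sketch of the transform itself (the helper name t_a is ours; estimating \ a\ from the data set is not shown):

import numpy as np

def t_a(p, a):
    # t_a transform of a proportion p in (0, 1); a = 1 recovers the logit.
    return a * np.log(p) - (2 - a) * np.log(1 - p)

p = 0.05
print(t_a(p, 1.0), np.log(p / (1 - p)))  # identical for a = 1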


Rule of three — for when no successes are observed

The rule of three is used to provide a simple way of stating an approximate 95% confidence interval for \ p\ , in the special case that no successes (\ \hat p = 0\ ) have been observed. The interval is \ \left(\ 0,\ \tfrac{3}{n}\ \right) ~. It follows from solving \ (1 - p)^n = 0.05\ for \ p\ : since \ -\ln 0.05 \approx 3\ , the solution \ p = 1 - 0.05^{1/n}\ is approximately \ \tfrac{3}{n} ~. By symmetry, in the case of only successes (\ \hat p = 1\ ), the interval is \ \left(\ 1 - \tfrac{3}{n},\ 1\ \right) ~.


Comparison and discussion

There are several research papers that compare these and other confidence intervals for the binomial proportion. Both Ross (2003) and Agresti & Coull (1998) point out that exact methods such as the Clopper–Pearson interval may not work as well as some approximations. The normal approximation interval and its presentation in textbooks has been heavily criticised, with many statisticians advocating that it not be used. The principal problems are ''overshoot'' (bounds exceed \ \left[\ 0,\ 1\ \right]\ ), ''zero-width intervals'' at \ \hat p = 0\ or 1 (falsely implying certainty), and overall inconsistency with significance testing. Of the approximations listed above, Wilson score interval methods (with or without continuity correction) have been shown to be the most accurate and the most robust, though some prefer Agresti & Coull's approach for larger sample sizes. Wilson and Clopper–Pearson methods obtain consistent results with source significance tests, and this property is decisive for many researchers. Many of these intervals can be calculated in R using dedicated packages.


See also

* Binomial distribution
* Estimation theory
* Pseudocount
* CDF-based_nonparametric_confidence_interval#Pointwise_band
* Z-test#Comparing the Proportions of Two Binomials

