A ''Z''-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. A ''Z''-test tests the mean of a distribution. For each significance level in the confidence interval, the ''Z''-test has a single critical value (for example, 1.96 for 5% two-tailed), which makes it more convenient than the Student's ''t''-test, whose critical values are defined by the sample size (through the corresponding degrees of freedom). Both the ''Z''-test and Student's ''t''-test help determine the significance of a set of data. However, the ''Z''-test is rarely used in practice because the population standard deviation is difficult to determine.


Applicability

Because of the central limit theorem, many test statistics are approximately normally distributed for large samples. Therefore, many statistical tests can be conveniently performed as approximate ''Z''-tests if the sample size is large or the population variance is known. If the population variance is unknown (and therefore has to be estimated from the sample itself) and the sample size is not large (''n'' < 30), the Student's ''t''-test may be more appropriate (in some cases, ''n'' < 50, as described below).


Procedure

How to perform a ''Z''-test when ''T'' is a statistic that is approximately normally distributed under the null hypothesis is as follows:

First, estimate the expected value μ of ''T'' under the null hypothesis and obtain an estimate ''s'' of the standard deviation of ''T''.

Second, determine the properties of ''T'': one-tailed or two-tailed. For null hypothesis ''H''0: ''μ'' ≥ ''μ''0 vs alternative hypothesis ''H''1: ''μ'' < ''μ''0, it is lower/left-tailed (one-tailed). For null hypothesis ''H''0: ''μ'' ≤ ''μ''0 vs alternative hypothesis ''H''1: ''μ'' > ''μ''0, it is upper/right-tailed (one-tailed). For null hypothesis ''H''0: ''μ'' = ''μ''0 vs alternative hypothesis ''H''1: ''μ'' ≠ ''μ''0, it is two-tailed.

Third, calculate the standard score

:Z = \frac{T - \mu}{s},

from which one-tailed and two-tailed ''p''-values can be calculated as Φ(''Z'') (for lower/left-tailed tests), Φ(−''Z'') (for upper/right-tailed tests) and 2Φ(−|''Z''|) (for two-tailed tests), where Φ is the standard normal cumulative distribution function.
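As a minimal illustration of these three steps, the following Python sketch (assuming SciPy is available; the function and argument names are chosen for this example, not taken from any particular library) computes the standard score and the corresponding one- or two-tailed ''p''-value:

<syntaxhighlight lang="python">
from scipy.stats import norm

def z_test(t, mu, s, tail="two-sided"):
    """Standard score and p-value for a statistic t that is approximately
    normal with mean mu and standard deviation s under the null hypothesis."""
    z = (t - mu) / s
    if tail == "left":          # H1: mean of T < mu_0
        p = norm.cdf(z)
    elif tail == "right":       # H1: mean of T > mu_0
        p = norm.cdf(-z)
    else:                       # H1: mean of T != mu_0
        p = 2 * norm.cdf(-abs(z))
    return z, p

# Example with the numbers from the worked example below:
# observed mean 96, null mean 100, standard error 1.62
print(z_test(96, 100, 1.62, tail="left"))
</syntaxhighlight>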


Use in location testing

# The term "''Z''-test" is often used to refer specifically to the one-sample location test comparing the mean of a set of measurements to a given constant when the sample variance is known. For example, if the observed data ''X''1, ..., ''X''n are (i) independent, (ii) have a common mean μ, and (iii) have a common variance σ2, then the sample average \bar{X} has mean μ and variance \frac{\sigma^2}{n}.
# The null hypothesis is that the mean value of ''X'' is a given number μ0. We can use \bar{X} as a test statistic, rejecting the null hypothesis if \bar{X} − μ0 is large.
# To calculate the standardized statistic Z = \frac{\bar{X} - \mu_0}{s}, we need to either know or have an approximate value for σ2, from which we can calculate s^2 = \frac{\sigma^2}{n}. In some applications, σ2 is known, but this is uncommon.
# If the sample size is moderate or large, we can substitute the sample variance for σ2, giving a ''plug-in'' test (see the sketch after this list). The resulting test will not be an exact ''Z''-test since the uncertainty in the sample variance is not accounted for; however, it will be a good approximation unless the sample size is small.
# A ''t''-test can be used to account for the uncertainty in the sample variance when the data are exactly normal.
# Difference between ''Z''-test and ''t''-test: the ''Z''-test is used when the sample size is large (''n'' > 50) or the population variance is known; the ''t''-test is used when the sample size is small (''n'' < 50) and the population variance is unknown.
# There is no universal constant at which the sample size is generally considered large enough to justify use of the plug-in test. A typical rule of thumb is that the sample size should be 50 observations or more.
# For large sample sizes, the ''t''-test procedure gives almost identical ''p''-values as the ''Z''-test procedure.
# Other location tests that can be performed as ''Z''-tests are the two-sample location test and the paired difference test.
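A minimal sketch of the one-sample (plug-in) location ''Z''-test described above, in Python and assuming SciPy is available (the function name and signature are illustrative, not a library API):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

def one_sample_z_test(x, mu0, sigma=None, alternative="two-sided"):
    """One-sample location Z-test of H0: mean == mu0.

    If sigma (the population standard deviation) is None, the sample
    standard deviation is plugged in; this is only a good approximation
    for moderate-to-large samples."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    sd = sigma if sigma is not None else x.std(ddof=1)
    se = sd / np.sqrt(n)                 # standard error of the mean
    z = (x.mean() - mu0) / se
    if alternative == "less":
        p = norm.cdf(z)
    elif alternative == "greater":
        p = norm.sf(z)
    else:
        p = 2 * norm.sf(abs(z))
    return z, p
</syntaxhighlight>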


Conditions

For the ''Z''-test to be applicable, certain conditions must be met.

* Nuisance parameters should be known, or estimated with high accuracy (an example of a nuisance parameter would be the standard deviation in a one-sample location test). ''Z''-tests focus on a single parameter, and treat all other unknown parameters as being fixed at their true values. In practice, due to Slutsky's theorem, "plugging in" consistent estimates of nuisance parameters can be justified. However, if the sample size is not large enough for these estimates to be reasonably accurate, the ''Z''-test may not perform well.
* The test statistic should follow a normal distribution. Generally, one appeals to the central limit theorem to justify assuming that a test statistic varies normally. There is a great deal of statistical research on the question of when a test statistic varies approximately normally. If the variation of the test statistic is strongly non-normal, a ''Z''-test should not be used.

If estimates of nuisance parameters are plugged in as discussed above, it is important to use estimates appropriate for the way the data were sampled. In the special case of ''Z''-tests for the one- or two-sample location problem, the usual sample standard deviation is only appropriate if the data were collected as an independent sample. In some situations, it is possible to devise a test that properly accounts for the variation in plug-in estimates of nuisance parameters. In the case of one- and two-sample location problems, a ''t''-test does this; the simulation sketch below illustrates the difference for small samples.
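A brief simulation sketch of this point in Python (assuming NumPy and SciPy; the sample sizes and trial count are arbitrary choices for illustration): it compares the type I error rate of the plug-in ''Z''-test with that of the ''t''-test for normal data at several sample sizes.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm, ttest_1samp

rng = np.random.default_rng(0)

def rejection_rates(n, trials=20_000, alpha=0.05):
    """Empirical type I error of the plug-in Z-test and the t-test
    for H0: mean == 0 on standard normal samples of size n."""
    z_rej = t_rej = 0
    for _ in range(trials):
        x = rng.standard_normal(n)
        z = x.mean() / (x.std(ddof=1) / np.sqrt(n))
        z_rej += 2 * norm.sf(abs(z)) < alpha
        t_rej += ttest_1samp(x, 0).pvalue < alpha
    return z_rej / trials, t_rej / trials

for n in (5, 15, 50):
    print(n, rejection_rates(n))   # plug-in Z over-rejects for small n
</syntaxhighlight>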


Example

Suppose that in a particular geographic region, the mean and standard deviation of scores on a reading test are 100 points and 12 points, respectively. Our interest is in the scores of 55 students in a particular school who received a mean score of 96. We can ask whether this mean score is significantly lower than the regional mean; that is, are the students in this school comparable to a simple random sample of 55 students from the region as a whole, or are their scores surprisingly low?

First calculate the standard error of the mean:

:\mathrm{SE} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{55}} = \frac{12}{7.42} = 1.62

where \sigma is the population standard deviation.

Next calculate the ''z''-score, which is the distance from the sample mean to the population mean in units of the standard error:

:z = \frac{M - \mu}{\mathrm{SE}} = \frac{96 - 100}{1.62} = -2.47

In this example, we treat the population mean and variance as known, which would be appropriate if all students in the region were tested. When population parameters are unknown, a Student's ''t''-test should be conducted instead.

The classroom mean score is 96, which is −2.47 standard error units from the population mean of 100. Looking up the ''z''-score in a table of the standard normal distribution cumulative probability, we find that the probability of observing a standard normal value below −2.47 is approximately 0.5 − 0.4932 = 0.0068. This is the one-sided ''p''-value for the null hypothesis that the 55 students are comparable to a simple random sample from the population of all test-takers. The two-sided ''p''-value is approximately 0.014 (twice the one-sided ''p''-value).

Another way of stating things is that with probability 1 − 0.014 = 0.986, a simple random sample of 55 students would have a mean test score within 4 units of the population mean. We could also say that with 98.6% confidence we reject the null hypothesis that the 55 test takers are comparable to a simple random sample from the population of test-takers.

The ''Z''-test tells us that the 55 students of interest have an unusually low mean test score compared to most simple random samples of similar size from the population of test-takers. A deficiency of this analysis is that it does not consider whether the effect size of 4 points is meaningful. If instead of a classroom, we considered a subregion containing 900 students whose mean score was 99, nearly the same ''z''-score and ''p''-value would be observed. This shows that if the sample size is large enough, very small differences from the null value can be highly statistically significant. See statistical hypothesis testing for further discussion of this issue.
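The calculation above can be reproduced in a few lines of Python (a sketch assuming SciPy is available; the numbers are taken from the example):

<syntaxhighlight lang="python">
from math import sqrt
from scipy.stats import norm

mu, sigma = 100, 12        # regional mean and standard deviation
n, sample_mean = 55, 96    # school sample

se = sigma / sqrt(n)                   # standard error, about 1.62
z = (sample_mean - mu) / se            # about -2.47
p_one_sided = norm.cdf(z)              # about 0.0068
p_two_sided = 2 * norm.cdf(-abs(z))    # about 0.014
print(z, p_one_sided, p_two_sided)
</syntaxhighlight>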


Occurrence and applications


For maximum likelihood estimation of a parameter

Location tests are the most familiar ''Z''-tests. Another class of ''Z''-tests arises in maximum likelihood estimation of the parameters in a parametric statistical model. Maximum likelihood estimates are approximately normal under certain conditions, and their asymptotic variance can be calculated in terms of the Fisher information. The maximum likelihood estimate divided by its standard error can be used as a test statistic for the null hypothesis that the population value of the parameter equals zero. More generally, if \hat{\theta} is the maximum likelihood estimate of a parameter θ, and θ0 is the value of θ under the null hypothesis,

:\frac{\hat{\theta} - \theta_0}{\mathrm{SE}(\hat{\theta})}

can be used as a ''Z''-test statistic.

When using a ''Z''-test for maximum likelihood estimates, it is important to be aware that the normal approximation may be poor if the sample size is not sufficiently large. Although there is no simple, universal rule stating how large the sample size must be to use a ''Z''-test, simulation can give a good idea as to whether a ''Z''-test is appropriate in a given situation.

''Z''-tests are employed whenever it can be argued that a test statistic follows a normal distribution under the null hypothesis of interest. Many non-parametric test statistics, such as U statistics, are approximately normal for large enough sample sizes, and hence are often performed as ''Z''-tests.
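As a concrete sketch of this construction, consider a Poisson model (a choice made for this illustration only): the maximum likelihood estimate of the rate λ is the sample mean, and the Fisher information per observation is 1/λ, so the estimated standard error of the estimate is √(λ̂/''n''). Assuming NumPy and SciPy:

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

def poisson_rate_z_test(counts, lam0):
    """Wald-type Z-test of H0: lambda == lam0 for i.i.d. Poisson counts.

    The MLE of lambda is the sample mean; its asymptotic variance is
    lambda / n (the inverse of the total Fisher information), estimated
    here by plugging in the MLE."""
    counts = np.asarray(counts, dtype=float)
    n = len(counts)
    lam_hat = counts.mean()            # maximum likelihood estimate
    se = np.sqrt(lam_hat / n)          # estimated standard error
    z = (lam_hat - lam0) / se
    p = 2 * norm.sf(abs(z))            # two-sided p-value
    return lam_hat, z, p
</syntaxhighlight>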


Comparing the proportions of two binomials

The ''Z''-test for comparing two proportions is a statistical method used to evaluate whether the proportion of a certain characteristic differs significantly between two independent samples. This test leverages the property that the sample proportions (which are averages of observations coming from a Bernoulli distribution) are asymptotically normal under the central limit theorem, enabling the construction of a ''Z''-test.

The ''z''-statistic for comparing two proportions is computed using

:z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}

where:
* \hat{p}_1 = sample proportion in the first sample
* \hat{p}_2 = sample proportion in the second sample
* n_1 = size of the first sample
* n_2 = size of the second sample
* \hat{p} = pooled proportion, calculated as \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}, where x_1 and x_2 are the counts of successes in the two samples.

The confidence interval for the difference between two proportions, based on the definitions above, is

:(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}

where:
* z_{\alpha/2} is the critical value of the standard normal distribution (e.g., 1.96 for a 95% confidence level).

The minimum detectable effect (MDE) when using the (two-sided) ''Z''-test formula for comparing two proportions, incorporating the critical values for \alpha and 1-\beta and the standard errors of the proportions, is (Chow S-C, Shao J, Wang H, Lokhnygina Y (2018): ''Sample size calculations in clinical research''. 3rd ed. CRC Press.)

:\text{MDE} = |p_1 - p_2| = z_{1-\alpha/2} \sqrt{\frac{p_0(1 - p_0)}{n_1} + \frac{p_0(1 - p_0)}{n_2}} + z_{1-\beta} \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}

where:
* z_{1-\alpha/2}: critical value for the significance level.
* z_{1-\beta}: quantile for the desired power.
* p_0 = p_1 = p_2: the common proportion when assuming the null hypothesis is correct.
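A small Python sketch of the two-proportion ''Z''-test and confidence interval described above (assuming NumPy and SciPy; the function name and return values are illustrative):

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import norm

def two_proportion_z_test(x1, n1, x2, n2, alpha=0.05):
    """Pooled two-proportion Z-test with an (unpooled) confidence interval.

    x1, x2 are success counts; n1, n2 are sample sizes."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                       # pooled proportion
    se_pool = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_pool
    p_value = 2 * norm.sf(abs(z))                        # two-sided
    se_diff = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    crit = norm.ppf(1 - alpha / 2)                       # e.g. 1.96 for 95%
    ci = ((p1 - p2) - crit * se_diff, (p1 - p2) + crit * se_diff)
    return z, p_value, ci

# Illustrative counts: 45 of 200 successes versus 30 of 220 successes
print(two_proportion_z_test(45, 200, 30, 220))
</syntaxhighlight>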


See also

* Normal distribution
* Standard normal table
* Standard score
* Student's ''t''-test


References


Further reading

* Casella, G.; Berger, R. L. (2002). ''Statistical Inference''. Duxbury Press.
* Montgomery, Douglas C.; Runger, George C. (2014). ''Applied Statistics and Probability for Engineers'' (6th ed.). John Wiley & Sons.