In statistics, a population is a set of similar items or events which is of interest for some question or experiment. A statistical population can be a group of existing objects (e.g. the set of all stars within the Milky Way galaxy) or a hypothetical and potentially infinite group of objects conceived as a generalization from experience (e.g. the set of all possible hands in a game of poker).
A population with finitely many values $N$ in the support of the population distribution is a finite population with population size $N$. A population with infinitely many values in the support is called an infinite population.
A common aim of statistical analysis is to produce information about some chosen population.
In statistical inference, a subset of the population (a statistical ''sample'') is chosen to represent the population in a statistical analysis. Moreover, the statistical sample must be unbiased and accurately model the population. The ratio of the size of this statistical sample to the size of the population is called a ''sampling fraction''. It is then possible to estimate the ''population parameters'' using the appropriate sample statistics.
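The terms in the paragraph above can be illustrated with a minimal sketch. It assumes a hypothetical finite population of 1,000 made-up measurements and simple random sampling without replacement; the data, the seed, and the helper name `sampling_fraction` are illustrative choices, not taken from the text.

```python
import random
import statistics

def sampling_fraction(sample_size: int, population_size: int) -> float:
    """Ratio of the sample size to the population size."""
    return sample_size / population_size

# Hypothetical population: 1,000 made-up measurements (illustrative only).
rng = random.Random(0)
population = [rng.gauss(170.0, 10.0) for _ in range(1000)]

# Simple random sample of 50 members, drawn without replacement.
sample = rng.sample(population, 50)

fraction = sampling_fraction(len(sample), len(population))
population_mean = statistics.fmean(population)  # population parameter
sample_mean = statistics.fmean(sample)          # sample statistic that estimates it

print(f"sampling fraction: {fraction:.2f}")
print(f"population mean: {population_mean:.2f}, sample mean: {sample_mean:.2f}")
```

Here the sampling fraction is 50/1000 = 0.05, and the sample mean serves as an estimate of the population mean, which in practice would be unknown.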
For finite populations, sampling typically removes the sampled value from the population, because samples are drawn without replacement. This violates the usual assumption that observations are independent and identically distributed, so sampling from finite populations requires "finite population corrections" (which can be derived from the hypergeometric distribution). As a rough rule of thumb, if the sampling fraction is below 10%, finite population corrections can be neglected to a good approximation.
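The text does not state the correction itself. One commonly used form, given here as an assumption, multiplies the standard error of the sample mean by the factor sqrt((N - n) / (N - 1)), where N is the population size and n the sample size. The sketch below shows how this factor behaves as the sampling fraction grows; the function name and the population size of 10,000 are illustrative.

```python
import math

def finite_population_correction(population_size: int, sample_size: int) -> float:
    """Commonly used FPC factor sqrt((N - n) / (N - 1)) for sampling without replacement."""
    if not 1 <= sample_size <= population_size or population_size < 2:
        raise ValueError("need 1 <= n <= N and N >= 2")
    return math.sqrt((population_size - sample_size) / (population_size - 1))

N = 10_000  # illustrative population size
for n in (100, 1_000, 5_000, 10_000):
    fpc = finite_population_correction(N, n)
    print(f"sampling fraction {n / N:.0%}: correction factor {fpc:.3f}")
```

At a sampling fraction of 10% the factor is about 0.95, so ignoring it overstates the standard error by only about 5%, which is consistent with the rule of thumb. When the whole population is sampled (n = N) the factor is 0, reflecting that nothing is left to estimate.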
Mean
The population mean, or population expected value, is a measure of the central tendency either of a probability distribution or of a random variable characterized by that distribution. In a discrete probability distribution of a random variable $X$, the mean is equal to the sum over every possible value weighted by the probability of that value; that is, it is computed by taking the product of each possible value $x$ of $X$ and its probability $p(x)$, and then adding all these products together, giving $\mu = \sum x \, p(x)$.
An analogous formula applies to the case of a continuous probability distribution. Not every probability distribution has a defined mean (see the Cauchy distribution for an example). Moreover, the mean can be infinite for some distributions.
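For the discrete case, the probability-weighted sum can be worked through in a few lines. This is a minimal sketch using a fair six-sided die as an illustrative distribution (the die and the helper name `discrete_mean` are assumptions, not from the text); exact fractions avoid floating-point rounding.

```python
from fractions import Fraction

def discrete_mean(distribution: dict) -> Fraction:
    """Mean of a discrete distribution given as {value: probability}."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    # Sum of each possible value times its probability.
    return sum(x * p for x, p in distribution.items())

# Fair six-sided die: each face 1..6 has probability 1/6 (illustrative example).
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(discrete_mean(die))         # 7/2
print(float(discrete_mean(die)))  # 3.5
```

The mean of 3.5 is not itself a possible outcome of the die, which illustrates that the mean describes the distribution as a whole and not any single value.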
For a finite population, the population mean of a property is equal to the arithmetic mean of that property taken over every member of the population. For example, the population mean height is equal to the sum of the heights of all individuals divided by the total number of individuals. The ''sample mean'' may differ from the population mean, especially for small samples. The law of large numbers states that the larger the size of the sample, the more likely it is that the sample mean will be close to the population mean (Seymour Lipschutz and Marc Lipson, ''Schaum's Outline of Theory and Problems of Probability'', p. 141).
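A small simulation can make the relationship between sample mean and population mean concrete. The sketch below assumes a hypothetical population consisting of the integers 0 to 999 and draws with replacement so that the draws are independent; the seed and sample sizes are arbitrary illustrative choices.

```python
import random
import statistics

# Hypothetical finite population: the integers 0..999, so the population mean is 499.5.
population = list(range(1000))
population_mean = statistics.fmean(population)

rng = random.Random(42)  # fixed seed so the run is repeatable

def sample_mean(n: int) -> float:
    """Mean of n independent draws (with replacement) from the population."""
    return statistics.fmean(rng.choices(population, k=n))

# Absolute error of the sample mean for increasing sample sizes.
errors = {n: abs(sample_mean(n) - population_mean) for n in (10, 1_000, 100_000)}
for n, err in errors.items():
    print(f"n = {n:>7}: |sample mean - population mean| = {err:.3f}")
```

The error is random and need not shrink at every step, but it tends to become smaller as the sample size grows, which is the behavior the law of large numbers describes.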
See also
* Data collection system
* Horvitz–Thompson estimator
* Sample (statistics)
* Sampling (statistics)
* Stratum (statistics)
* Bootstrap world
References
External links
Statistical Terms Made Simple