
In statistics, a sampling frame is the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled, and may include individuals, households or institutions. The importance of the sampling frame is stressed by Jessen and by Salant and Dillman (Salant, Priscilla, and Don A. Dillman, "How to Conduct Your Own Survey: Leading Professionals Give You Proven Techniques for Getting Reliable Results", 1995).


Obtaining and organizing a sampling frame

In the most straightforward cases, such as when dealing with a batch of material from a production run, or using a census, it is possible to identify and measure every single item in the population and to include any one of them in the sample; this is known as ''direct element sampling''. In many other cases this is not possible, either because it is cost-prohibitive (reaching every citizen of a country) or impossible (reaching all humans alive).

Having established the frame, there are a number of ways of organizing it to improve efficiency and effectiveness. It is at this stage that the researcher should decide whether the sample is in fact to be the whole population, in which case it would be a census. The frame should also facilitate access to the selected sampling units.

A frame may also provide additional 'auxiliary information' about its elements; when this information is related to variables or groups of interest, it may be used to improve survey design. While not necessary for simple sampling, a sampling frame used for more advanced sampling techniques, such as stratified sampling, may contain additional information (such as demographic information). For instance, an electoral register might include name and sex; this information can be used to ensure that a sample taken from that frame covers all demographic categories of interest. (Sometimes the auxiliary information is less explicit; for instance, a telephone number may provide some information about location.)
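
As a rough illustration of how auxiliary information can drive sample design, the sketch below groups a small frame by one auxiliary field and draws the same number of units from each group. The frame records, the field names (`id`, `name`, `sex`) and the function `stratified_sample` are all hypothetical and chosen only for this example; a real design would also have to decide how many units to allocate to each stratum.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_key, n_per_stratum, seed=0):
    """Draw up to n_per_stratum units at random from each stratum of the frame."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group the frame's units by the value of the auxiliary field.
    strata = defaultdict(list)
    for unit in frame:
        strata[unit[stratum_key]].append(unit)

    # Simple random sample without replacement inside every stratum.
    sample = []
    for label in sorted(strata):
        units = strata[label]
        k = min(n_per_stratum, len(units))
        sample.extend(rng.sample(units, k))
    return sample


# Hypothetical electoral-register frame; 'sex' is the auxiliary information.
frame = [
    {"id": 1, "name": "A. Jones", "sex": "F"},
    {"id": 2, "name": "B. Smith", "sex": "M"},
    {"id": 3, "name": "C. Patel", "sex": "F"},
    {"id": 4, "name": "D. Brown", "sex": "M"},
    {"id": 5, "name": "E. Clark", "sex": "F"},
    {"id": 6, "name": "F. Evans", "sex": "M"},
]

sample = stratified_sample(frame, "sex", n_per_stratum=2)
```

Because every stratum contributes units, no category recorded in the frame can be left out of the sample by chance, which is the guarantee the paragraph above describes.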


Sampling frame qualities

An ideal sampling frame will have the following qualities:
* all units have a logical, numerical identifier
* all units can be found: their contact information, map location or other relevant information is present
* the frame is organized in a logical, systematic fashion
* the frame has additional information about the units that allows the use of more advanced sampling techniques
* every element of the population of interest is present in the frame
* every element of the population is present ''only once'' in the frame
* no elements from outside the population of interest are present in the frame
* the data is 'up-to-date'
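
Some of these qualities can be checked mechanically before sampling begins. The following is a minimal sketch, assuming each frame record is a dictionary with hypothetical fields `id`, `address` and `updated`, and assuming that a record older than one year counts as out of date; both the field names and the one-year threshold are illustrative choices, not part of any standard.

```python
from datetime import date


def audit_frame(frame, as_of, max_age_days=365):
    """Return the positions of records that fail basic frame-quality checks."""
    issues = {"bad_identifier": [], "no_contact": [], "stale": []}
    for position, unit in enumerate(frame):
        unit_id = unit.get("id")
        # Quality: every unit has a numerical identifier.
        if not isinstance(unit_id, int) or isinstance(unit_id, bool):
            issues["bad_identifier"].append(position)
        # Quality: every unit can be found (contact information present).
        if not unit.get("address"):
            issues["no_contact"].append(position)
        # Quality: the data is up to date.
        if (as_of - unit["updated"]).days > max_age_days:
            issues["stale"].append(position)
    return issues


# Hypothetical frame records used only for illustration.
frame = [
    {"id": 101, "address": "1 High St", "updated": date(2024, 1, 10)},
    {"id": None, "address": "2 High St", "updated": date(2024, 2, 1)},
    {"id": 103, "address": "", "updated": date(2021, 6, 30)},
]

issues = audit_frame(frame, as_of=date(2024, 6, 1))
# issues == {"bad_identifier": [1], "no_contact": [2], "stale": [2]}
```

Checks of coverage (whether every population element is present, and present only once) need an outside source to compare against and cannot be done from the frame alone.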


Types of sampling frames

The most straightforward type of frame is a list of elements of the population (preferably the entire population) with appropriate contact information. For example, in an opinion poll, possible sampling frames include an electoral register or a telephone directory. Other sampling frames can include employment records, school class lists, patient files in a hospital, organizations listed in a thematic database, and so on. On a more practical level, sampling frames take the form of computer files.

Not all frames explicitly list population elements; some list only 'clusters'. For example, a street map can be used as a frame for a door-to-door survey; although it doesn't show individual houses, we can select streets from the map and then select houses on those streets. This offers some advantages: such a frame would include people who have recently moved and are not yet on the list frames discussed above, and it may be easier to use because it doesn't require storing data for every unit in the population, only for a smaller number of clusters.
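
The street-map procedure can be sketched as a two-stage selection: streets are drawn from the frame first, and houses are listed only for the streets that were drawn. In this sketch the street names, the house numbers and the `list_houses` callback are hypothetical; the callback stands in for the fieldwork of walking a selected street and recording its houses.

```python
import random


def two_stage_sample(street_frame, list_houses, n_streets, n_houses, seed=0):
    """Select streets from a cluster frame, then houses within each selected street."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Stage 1: the frame lists only clusters (streets), not individual houses.
    chosen_streets = rng.sample(street_frame, n_streets)

    # Stage 2: houses are enumerated only for the streets that were selected.
    sample = []
    for street in chosen_streets:
        houses = list_houses(street)
        k = min(n_houses, len(houses))
        sample.extend((street, house) for house in rng.sample(houses, k))
    return sample


# Hypothetical cluster frame: street names read off a map.
street_frame = ["Elm Street", "Oak Avenue", "Mill Lane", "Church Road"]

# Hypothetical field listing, consulted only for selected streets.
field_listing = {
    "Elm Street": [1, 2, 3, 4, 5],
    "Oak Avenue": [10, 12, 14],
    "Mill Lane": [2, 4, 6, 8],
    "Church Road": [1, 3],
}

sample = two_stage_sample(
    street_frame, field_listing.__getitem__, n_streets=2, n_houses=2
)
```

Only two of the four streets ever need to be enumerated here, which reflects the storage advantage described above. Note that with equal numbers of streets and houses drawn, houses on short streets are more likely to be selected than houses on long ones; a real design would correct for this with weights or unequal selection probabilities.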


Sampling frame problems

The sampling frame must be representative of the population; whether it is so is a question outside the scope of statistical theory, demanding the judgment of experts in the particular subject matter being studied. All the above frames omit some people who will vote at the next election and contain some people who will not; some frames will contain multiple records for the same person. People not in the frame have no prospect of being sampled. Because a cluster-based frame contains less information about the population, it may place constraints on the sample design, possibly requiring the use of less efficient sampling methods and/or making it harder to interpret the resulting data.

Statistical theory tells us about the uncertainties in extrapolating from a sample to the frame. It should be expected that sampling frames will always contain some mistakes. In some cases, this may lead to sampling bias. Such bias should be identified and minimized, although avoiding it completely in the real world is nearly impossible. One should also not assume that sources which claim to be unbiased and representative are such.

In defining the frame, practical, economic, ethical, and technical issues need to be addressed. The need to obtain timely results may prevent extending the frame far into the future. The difficulties can be extreme when the population and frame are disjoint. This is a particular problem in forecasting, where inferences about the future are made from historical data. In fact, in 1703, when Jacob Bernoulli proposed to Gottfried Leibniz the possibility of using historical mortality data to predict the probability of early death of a living man, Leibniz recognized the problem in his reply.

Leslie Kish posited four basic problems of sampling frames:
# Missing elements: Some members of the population are not included in the frame.
# Foreign elements: Non-members of the population are included in the frame.
# Duplicate entries: A member of the population appears more than once in the frame and so may be surveyed more than once.
# Groups or clusters: The frame lists clusters instead of individuals.
Problems like those listed can be identified by the use of pre-survey tests and pilot studies.
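
The first three of these problems can be illustrated with set operations, under the strong assumption that a complete enumeration of the population is available for comparison, for example in a small pilot area. The identifiers and the function `frame_problems` below are hypothetical; the fourth problem, cluster listings, concerns the structure of the frame and is not covered by this sketch.

```python
from collections import Counter


def frame_problems(frame_ids, population_ids):
    """Compare a frame against a known population enumeration."""
    population = set(population_ids)
    counts = Counter(frame_ids)  # how many times each identifier is listed
    listed = set(counts)
    return {
        # Missing elements: in the population but not in the frame.
        "missing": sorted(population - listed),
        # Foreign elements: in the frame but not in the population.
        "foreign": sorted(listed - population),
        # Duplicate entries: listed more than once in the frame.
        "duplicates": sorted(i for i, c in counts.items() if c > 1),
    }


# Hypothetical pilot-area data used only for illustration.
population_ids = [1, 2, 3, 4, 5]
frame_ids = [1, 2, 2, 4, 9]

problems = frame_problems(frame_ids, population_ids)
# problems == {"missing": [3, 5], "foreign": [9], "duplicates": [2]}
```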

