Aggregate data

Aggregate data are high-level data acquired by combining individual-level data. For instance, the output of an industry is an aggregate of the individual outputs of the firms within that industry. Aggregate data are applied in statistics, data warehouses, and economics. Aggregate data differ from individual data: aggregate data are individual data that have been averaged by geographic area, by year, by service agency, or by other means, whereas individual data are disaggregated individual results and are used to estimate subgroup differences. Aggregate data are mainly used by researchers and analysts, policymakers, banks and administrators. They are used to evaluate policies, recognise trends and patterns in processes, gain relevant insights, and assess current measures for strategic planning. Aggregate data collected from various sources are used in fields of study such as comparative political analysis and meta-analyses based on aggregate patient data (APD). Aggregate data are also used for medical and educational purposes. Although aggregate data are widely used, they have limitations, including the risk of drawing inaccurate inferences and false conclusions, which is termed the ‘ecological fallacy’. The ecological fallacy means that it is invalid to draw conclusions at the individual level from an ecological (group-level) relationship between two quantitative variables.


Applications

In statistics, aggregate data are data combined from several measurements. When data are aggregated, groups of observations are replaced with summary statistics based on those observations. In a data warehouse, the use of aggregate data dramatically reduces the time needed to query large sets of data. Developers pre-summarise queries that are run regularly, such as weekly sales across several dimensions, for example by item hierarchy or geographical hierarchy. In economics, aggregate data or data aggregates are high-level data composed from a multitude or combination of more individual data, such as:
* in macroeconomics, data such as the overall price level or overall inflation rate; and
* in microeconomics, data on an entire sector of an economy composed of many firms, or on all households in a city or region.
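
As an illustration of how groups of observations are replaced with summary statistics, the following Python sketch pre-summarises a small set of sales records by week and region, and by week and item. The records, field layout, and the function name `aggregate` are hypothetical and chosen only for this example; a real data warehouse would typically do this with its own summary tables.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual-level records: (week, region, item, sales amount).
sales = [
    ("2024-W01", "North", "tea", 120.0),
    ("2024-W01", "North", "coffee", 80.0),
    ("2024-W01", "South", "tea", 50.0),
    ("2024-W02", "North", "tea", 100.0),
    ("2024-W02", "South", "coffee", 150.0),
]


def aggregate(records, key_fn):
    """Replace each group of observations with summary statistics."""
    groups = defaultdict(list)
    for record in records:
        groups[key_fn(record)].append(record[3])  # collect the sales amounts
    return {
        key: {"count": len(values), "total": sum(values), "mean": mean(values)}
        for key, values in groups.items()
    }


# Pre-summarised weekly sales along two different dimensions.
weekly_by_region = aggregate(sales, lambda r: (r[0], r[1]))
weekly_by_item = aggregate(sales, lambda r: (r[0], r[2]))

for key, summary in sorted(weekly_by_region.items()):
    print(key, summary)
```

A query for weekly sales by region can then read the small summary table instead of scanning every individual record, at the cost of no longer being able to recover the individual amounts from the summary.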


Major users


Researchers and analysts

Researchers use aggregate data to understand the prevalent ethos, evaluate the essence of social realities and of a social organisation, identify primary issues of concern in research, and supply projections about the nature of social issues. Aggregate data are useful when researchers want to investigate the relationship between two distinct variables at the aggregate level, or the connection between an aggregate variable and a characteristic at the individual level. Researchers have also used aggregate data to critically evaluate the policies, practices and precepts of systems, and to investigate their relevance and efficacy.


Policymakers

Governments use aggregate data to develop more effective policies, because such data indicate how well a government is aware of the demands and needs of its citizens and how effectively it maintains social order. For example, governments around the world have used aggregate mobile location data for analysis in response to COVID-19. Aggregate mobile location data can provide insights into the effectiveness of social distancing measures launched by governments. Governments also use aggregate data to identify possible “hot spots” and the potential for transmission. As well as projecting the effectiveness of government policies, aggregate data analyses are undertaken to evaluate the nature, assess the extent, recognise the trend and study the pattern of a specific phenomenon or process, with the aim of devising strategies, preparing short- or long-term policies, and taking efficacious and relevant measures for control or prevention. Policymakers also use financial aggregates data to evaluate the economic and financial activities of companies and households, because these data help to identify risks to financial stability. Policymakers can employ aggregate data to better understand developments in a country’s economic and financial conditions.


Banks

Banks aggregate data collected from a significant number of customers and anonymise the data by removing personal information. The main reasons for banks to use aggregate data are to estimate economic trends and to gain insights into customer clusters. Banks are not permitted to share customers’ personal data, but aggregate data can be shared with banks’ business customers and can be accessed by other partners who use the same platform. In Australia, the Commonwealth Bank provides its business clients with anonymised data about their customers, derived from card transactions. ANZ also provides its business customers with anonymised data gathered from millions of merchant terminal transactions and ANZ card transactions.

In the UK, the Integrated Urgent Care Aggregate Data Collection (IUC ADC) provides comprehensive information about IUC activity, performance, and service demand. Its data are sourced from the lead data providers responsible for offering integrated urgent care services in England. The National Health Service (NHS), under the Department of Health and Social Care (DHSC) in England, stated that this collection of aggregate data will replace the NHS 111 minimum dataset. It will also be used as a formal source for IUC statistics and to oversee the Key Performance Indicators (KPIs) of the IUC ADC.


Administrators

Empirical data available at the national or regional level are used as sources of reference by administrators and intellectuals, as well as by people concerned about the welfare of a region or a society. In particular, administrators use aggregate data to assess the current political, religious, social, or other atmosphere of a nation, to track gaps in social responses across time and space, and to set priorities for action. These assessments help administrators to evaluate current measures, which is useful in future strategic planning, and provide indicators of effective corrective measures.


Sources and collection methods

Aggregate data can be compiled from various types of writings and records, including biography, autobiography, descriptive accounts and correspondence. For example, a researcher collects, collates, or compiles aggregate data using multiple instruments of social research, including an inventory, an interview, an opinionnaire, and a questionnaire or schedule. Official or non-official agencies also collect and compile aggregate data on an ongoing basis, using the infrastructure available within a department at the field level. Sources of aggregate data can also be regarded as tools for discovering data. In the US, some aggregate data are presented in the form of tables. Examples of sources for US aggregate data include the United States Census Bureau, the Statistical Abstract of the United States, and Social Explorer. International Monetary Fund data, World DataBank, and Penn World Table are examples of transactional and international aggregate data sources.


Use of aggregate data


Comparative political analysis

Aggregate data are used in comparative political analysis because analysts do not focus only on the behaviour of individuals. They also focus on the behaviour of areal units, including electoral constituencies and nations. In analyses of political activity, significant data such as those related to industrialisation, urbanisation, and mass communication networks are not readily expressed at the individual level. They are expressed in per capita terms in order to control for variations in the population size of the areal units. Aggregate data are widely available because demographic, socio-economic, and political data are collected and published by nations. This helps researchers and analysts to carry out longer trend studies and to bring changes and developments into sharper focus.
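
The per capita adjustment described above can be sketched in a few lines of Python. The two areal units, their populations, and the circulation figures are invented for illustration, and `per_capita` is a hypothetical helper name.

```python
# Hypothetical areal units: name -> (total newspaper circulation, population).
units = {
    "Unit A": (500_000, 10_000_000),
    "Unit B": (200_000, 1_000_000),
}


def per_capita(total, population):
    """Express an aggregate total relative to population size."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total / population


rates = {name: per_capita(total, pop) for name, (total, pop) in units.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.3f} copies per person")
```

In this made-up case Unit A has the larger raw total, but Unit B has the higher per capita rate, which is the kind of difference that comparing raw totals across units of unequal population would hide.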


APD scientific meta-analyses

Factors including the need for time, considerable resources and wide international cooperation have impeded the use of individual patient data (IPD) meta-analysis, with the result that most published meta-analyses rely upon aggregate patient data (APD). To acquire data on all patients in all trials, aggregate patient data are collected from completed studies that have been presented at professional meetings, published in the medical literature, or supplied directly by individual investigators. Aggregate patient data are used by the Cochrane Collaboration, the United States Preventive Services Task Force, and multiple professional societies to support clinical practice guidelines. Aggregate patient data are also used in meta-analyses of time-to-event studies, as the results can inform investors about whether it is worthwhile to proceed to further meta-analyses based on resource-intensive individual patient data.


Other uses


Health care

In a health information system, aggregate data are the integration of data concerning numerous patients. A particular patient cannot be traced on the basis of aggregate data. These aggregated data are only counts, such as the number of cases of tuberculosis, malaria, or other diseases. Health facilities use aggregated statistics of this type to generate reports and indicators, and to undertake strategic planning in their health systems. In contrast, patient data are individual data related to a single patient, including the patient’s name, age, diagnosis and medical history. Patient-based data are mainly used to track a patient’s progress over time, such as how the patient responds to a particular treatment. The COVID-19 Data Archive, also called COVID-ARC, aggregates data from studies around the globe. Researchers can access the discoveries of international colleagues and forge collaborations that support efforts against the disease. More generally, aggregated healthcare data allow health care providers to unlock actionable clinical insights, for instance when comprehensive views of clinical data or continuous patient records become possible.


Education

Aggregate data such as school-level demographic data and school-level achievement data are used in experimental analysis to assess the relationship between student achievement and school-level interventions. Aggregate data can also be used in non-experimental analyses such as regression discontinuity analysis and interrupted time-series analysis, which do not require individual-level data. For example, interrupted time-series analysis estimates the impact of a school-level program by comparing a school’s achievement before and after the program is launched.
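
A deliberately simplified sketch of the interrupted time-series idea follows, using only school-level yearly averages. It fits a straight line to the pre-program years, projects that trend forward, and reports the average gap between observed and projected scores. The scores, the start year, and the function names are all hypothetical, and a real analysis would use a fuller segmented regression model with uncertainty estimates.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def its_effect(scores, start):
    """Average gap between observed scores and the projected pre-program trend."""
    if start < 2 or start >= len(scores):
        raise ValueError("need at least two pre-program and one post-program point")
    slope, intercept = fit_line(list(range(start)), scores[:start])
    gaps = [scores[t] - (intercept + slope * t) for t in range(start, len(scores))]
    return sum(gaps) / len(gaps)


# Hypothetical school-level average scores, one per year; program starts in year 4.
school_scores = [60.0, 61.0, 62.0, 63.0, 67.0, 68.0, 69.0]
program_start = 4

effect = its_effect(school_scores, program_start)
print(f"Estimated program effect: {effect:.1f} points")
```

Only one aggregate figure per year enters the calculation, which is why individual student records are not needed for this kind of estimate.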


Limitations

When units within a cluster or a country are averaged, information is lost, which increases the probability of drawing inaccurate inferences. Information loss occurs because aggregation ignores individual variation, treating it as if it were only a type of statistical noise or measurement error. Inferences also differ depending on whether individual firm data or aggregated data are used for analysis. For instance, the calculation of country averages does not account for firm-specific variables such as firm size, firm age, or concentration of firm ownership, whereas the calculation of individual averages does. Results generated from aggregate data can therefore differ from those generated from individual data. There is also the problem of the ‘ecological fallacy’, a concept introduced by Robinson (1950). The term means that the variability around the individual-level means is significantly different from the variability around the aggregate means. Aggregate measures express something other than the individual equivalents of the aggregate data, which means that individual-level conclusions cannot be drawn from them. Although aggregate data have wider applicability than individual-level data, it is more challenging for researchers to analyse subgroup results when aggregate data are used. Eventually, individual information may also be required. Growth modelling and longitudinal modelling based on aggregate data are also difficult because variables can vary over time.
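
The ecological fallacy can be illustrated with a small constructed example. In the Python sketch below, the data are invented so that the correlation between group means is strongly positive while the correlation between the same two variables among individuals inside every group is strongly negative. The group names and the helper `pearson` are assumptions made for this illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical individual-level (x, y) observations in three areal units.
groups = {
    "Area 1": [(1, 3), (2, 2), (3, 1)],
    "Area 2": [(4, 6), (5, 5), (6, 4)],
    "Area 3": [(7, 9), (8, 8), (9, 7)],
}

# Aggregate level: one (mean x, mean y) pair per area.
mean_xs = [sum(x for x, _ in obs) / len(obs) for obs in groups.values()]
mean_ys = [sum(y for _, y in obs) / len(obs) for obs in groups.values()]
aggregate_r = pearson(mean_xs, mean_ys)

# Individual level: correlation among the individuals inside each area.
within_r = {
    name: pearson([x for x, _ in obs], [y for _, y in obs])
    for name, obs in groups.items()
}

print("correlation of area means:", round(aggregate_r, 2))
for name, r in within_r.items():
    print(f"correlation within {name}:", round(r, 2))
```

Someone who saw only the area means would conclude that higher x goes with higher y, yet for the individuals in each area the opposite holds. The aggregate relationship is real at the level of areas; the error lies in transferring it to individuals.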


Other types of aggregate data


Financial aggregates data

Financial aggregates data are a type of aggregate data about credit and the money supply in Australia, used by policymakers to evaluate the economic and financial activities of both households and companies.


Credit aggregates

Credit aggregates measure the borrowings of households and businesses from financial intermediaries. They also measure the funds borrowed by businesses for purposes such as project investment, asset purchases, or cash flow management.


Monetary aggregates

Monetary aggregates measure the money or ‘money-like’ instruments that the banking system owes to businesses and households. An example of a ‘money-like’ instrument is a deposit in a bank account.


Census aggregate data

In the UK, census aggregate data are data generated as outputs from the United Kingdom censuses. They provide information about the socio-economic and demographic characteristics of the country’s population. They are a compilation of aggregated, or summarised, counts of the number of individuals, household residents, or families in particular geographic areas with specific characteristics, or combinations of characteristics, drawn from the themes of people and places, populations, families, health, ethnicity and religion, housing and work. Aggregate data form part of the outputs of the UK censuses and are obtained by analysing the information given in the census returns. Census aggregate data are used to compare and describe population characteristics across locations in the UK because they provide comparable information at a range of geographical levels over the entire UK. Census aggregate data are also used in the academic sector for teaching and research, and in the private sector for site location and marketing.

