Latent And Observable Variables

In statistics, latent variables (from Latin: present participle of ''lateo'', "lie hidden") are variables that can only be inferred indirectly, through a mathematical model, from other observable variables that can be directly observed or measured. Such ''latent variable models'' are used in many disciplines, including engineering, medicine, ecology, physics, machine learning/artificial intelligence, natural language processing, bioinformatics, chemometrics, demography, economics, management, political science, psychology and the social sciences.

Latent variables may correspond to aspects of physical reality. These could in principle be measured, but may not be for practical reasons. Among the earliest expressions of this idea is Francis Bacon's polemic the ''Novum Organum'', itself a challenge to the more traditional logic expressed in Aristotle's ''Organon''. In this situation, the term ''hidden variables'' is commonly used, reflecting the fact that the variables are meaningful, but not observable.

Other latent variables correspond to abstract concepts, like categories, behavioral or mental states, or data structures. The terms ''hypothetical variables'' or ''hypothetical constructs'' may be used in these situations.

The use of latent variables can serve to reduce the dimensionality of data. Many observable variables can be aggregated in a model to represent an underlying concept, making it easier to understand the data. In this sense, they serve a function similar to that of scientific theories. At the same time, latent variables link observable "sub-symbolic" data in the real world to symbolic data in the modeled world.
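
One common way to make "inferred indirectly through a mathematical model" precise is probabilistic. As a sketch only, assume a latent variable z with prior density p(z) and an observable variable x with conditional density p(x | z); these symbols are notation introduced here for illustration and do not come from any particular model named in this article. The observable distribution is then obtained by integrating the latent variable out, and inference about z given x follows from Bayes' theorem:

```latex
p(x) = \int p(x \mid z)\, p(z)\, dz,
\qquad
p(z \mid x) = \frac{p(x \mid z)\, p(z)}{p(x)}
```

For a discrete latent variable, such as a category, the integral becomes a sum over its possible values.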


Examples


Psychology

Latent variables, as created by factor analytic methods, generally represent "shared" variance, or the degree to which variables "move" together. Variables that have no correlation cannot result in a latent construct based on the common factor model.
* The "Big Five personality traits" have been inferred using factor analysis.
* extraversion
* spatial ability
* wisdom: “Two of the more predominant means of assessing wisdom include wisdom-related performance and latent variable measures.”
* Spearman's g, or the general intelligence factor in psychometrics
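
The idea of "shared" variance can be illustrated with a small simulation. The following is a minimal sketch, assuming a single common factor and made-up loadings; the function names and numbers are hypothetical and are not taken from any study. Two indicators that load on the factor end up correlated at roughly the product of their loadings, while an indicator with zero loading stays uncorrelated with them.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def simulate_one_factor(loadings, n, seed=0):
    """Draw n cases from x_j = loading_j * f + noise_j, each x_j with unit variance."""
    rng = random.Random(seed)
    columns = [[] for _ in loadings]
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # latent factor score, never observed directly
        for column, lam in zip(columns, loadings):
            noise_sd = math.sqrt(1.0 - lam ** 2)  # unique (unshared) variance
            column.append(lam * f + rng.gauss(0.0, noise_sd))
    return columns


# Hypothetical loadings: the third indicator shares nothing with the factor.
loadings = [0.8, 0.7, 0.0]
x1, x2, x3 = simulate_one_factor(loadings, n=20000)
r12 = pearson(x1, x2)  # should be close to 0.8 * 0.7 = 0.56
r13 = pearson(x1, x3)  # should be close to 0
```

Factor analysis runs this logic in reverse: it starts from the observed correlations and looks for loadings that could have produced them.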


Economics

Examples of latent variables from the field of economics include quality of life, business confidence, morale, happiness and conservatism: none of these can be measured directly. However, by linking these latent variables to other, observable variables, the values of the latent variables can be inferred from measurements of the observable variables. Quality of life, for example, is inferred from observable variables such as wealth, employment, environment, physical and mental health, education, recreation and leisure time, and social belonging.
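
As a rough illustration of how a latent value might be inferred from observable ones, the sketch below assumes a linear measurement model in which each standardized indicator equals a loading times the latent score plus noise of equal variance. Under that assumption the least-squares estimate of the score is a loading-weighted average. The loadings, the indicator names and the data are invented for the example; real quality-of-life indices are built in many different ways.

```python
def least_squares_score(indicators, loadings):
    """Least-squares estimate of a latent score from standardized indicators.

    Assumes x_j = loading_j * score + noise_j with equal noise variance.
    """
    numerator = sum(lam * x for lam, x in zip(loadings, indicators))
    denominator = sum(lam ** 2 for lam in loadings)
    return numerator / denominator


# Hypothetical loadings for three standardized indicators
# (say wealth, health, and leisure time); the numbers are made up.
loadings = [0.5, 1.0, 0.5]
person = [1.0, 2.0, 1.0]  # exactly 2.0 times the loadings, i.e. no noise
score = least_squares_score(person, loadings)
```

Because the example data contain no noise, the estimate recovers the underlying score of 2.0 exactly; with noisy indicators it would only approximate it.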


Medicine

Latent-variable methodology is used in many branches of medicine. One class of problems that naturally lends itself to latent-variable approaches is longitudinal studies in which the time scale (e.g. age of participant or time since study baseline) is not synchronized with the trait being studied. For such studies, an unobserved time scale that is synchronized with the trait being studied can be modeled as a transformation of the observed time scale using latent variables. Examples of this include disease progression modeling and modeling of growth (see box).
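
The simplest transformation of the observed time scale is a subject-specific shift. The sketch below is a toy version of that idea under strong assumptions: the common trajectory is a logistic curve that is taken as known, each subject has one latent time shift, and the shift is found by grid search. The curve, the subject labels and all numbers are hypothetical; practical disease progression models usually estimate the trajectory and the shifts jointly.

```python
import math
import random


def progression(latent_time):
    """Hypothetical common trajectory: a logistic curve on the latent time scale."""
    return 1.0 / (1.0 + math.exp(-latent_time))


def estimate_shift(times, values, grid):
    """Return the time shift in `grid` that minimises squared error for one subject."""
    def sse(shift):
        return sum((v - progression(t + shift)) ** 2 for t, v in zip(times, values))
    return min(grid, key=sse)


rng = random.Random(1)
visit_times = [0.0, 1.0, 2.0, 3.0]             # observed time since baseline
true_shifts = {"A": -2.0, "B": 0.5, "C": 2.0}  # latent, unknown in practice
grid = [k / 10 for k in range(-50, 51)]        # candidate shifts from -5.0 to 5.0

estimated = {}
for subject, shift in true_shifts.items():
    # Simulated measurements: common curve at shifted time plus small noise.
    observed = [progression(t + shift) + rng.gauss(0.0, 0.01) for t in visit_times]
    estimated[subject] = estimate_shift(visit_times, observed, grid)
```

All three subjects are observed at the same visit times, yet they sit at different points of the common curve; the estimated shifts place them on a shared latent time scale.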


Inferring latent variables

A range of model classes and methods make use of latent variables and allow inference in their presence.

Models include:
* linear mixed-effects models and nonlinear mixed-effects models
* Hidden Markov models
* Factor analysis
* Item response theory

Analysis and inference methods include:
* Principal component analysis
* Instrumented principal component analysis (Kelly, Bryan T.; Pruitt, Seth; Su, Yinan. "Instrumented Principal Component Analysis", December 17, 2020. Available at SSRN: https://ssrn.com/abstract=2983919 or http://dx.doi.org/10.2139/ssrn.2983919)
* Partial least squares regression
* Latent semantic analysis and probabilistic latent semantic analysis
* EM algorithms
* Metropolis–Hastings algorithm
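
As one illustration of the EM algorithms listed above, the following sketch fits a two-component, one-dimensional Gaussian mixture, where the latent variable is the unobserved component that generated each data point. It is a minimal example under stated assumptions (two components, a crude initialisation at the data extremes, a fixed number of iterations, simulated data), not a general-purpose implementation.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture by expectation-maximisation."""
    # Crude initialisation: place the component means at the data extremes.
    means = [min(data), max(data)]
    sds = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability that each point came from component 0.
        resp0 = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            resp0.append(p0 / (p0 + p1))
        # M-step: re-estimate parameters from responsibility-weighted data.
        for k in (0, 1):
            resp = resp0 if k == 0 else [1.0 - r for r in resp0]
            total = sum(resp)
            means[k] = sum(r * x for r, x in zip(resp, data)) / total
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, data)) / total
            sds[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsing component
            weights[k] = total / len(data)
    return weights, means, sds


# Simulated data: 40% from N(-2, 1) and 60% from N(3, 1); labels are discarded.
rng = random.Random(42)
data = [rng.gauss(-2.0, 1.0) if rng.random() < 0.4 else rng.gauss(3.0, 1.0)
        for _ in range(1000)]
weights, means, sds = em_two_gaussians(data)
```

The E-step computes a soft assignment of each point to the latent components, and the M-step treats those assignments as weights when updating the parameters.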


Bayesian algorithms and methods

Bayesian statistics is often used for inferring latent variables.
* Latent Dirichlet allocation
* The Chinese restaurant process is often used to provide a prior distribution over assignments of objects to latent categories.
* The Indian buffet process is often used to provide a prior distribution over assignments of latent binary features to objects.
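
To show what a prior over assignments of objects to latent categories looks like in practice, here is a minimal sketch that draws one sample from a Chinese restaurant process. The function name, the concentration parameter value and the seed are illustrative assumptions. Each object joins an existing category with probability proportional to that category's current size, or starts a new category with probability proportional to the concentration parameter `alpha`.

```python
import random


def chinese_restaurant_process(n, alpha, seed=0):
    """Sample a partition of n objects; returns one table (category) index per object."""
    rng = random.Random(seed)
    assignments = []
    table_sizes = []  # table_sizes[k] = number of objects already at table k
    for i in range(n):
        # Object i joins table k with probability size_k / (i + alpha)
        # and opens a new table with probability alpha / (i + alpha).
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)
        else:
            table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes


assignments, table_sizes = chinese_restaurant_process(n=50, alpha=1.0)
```

The number of categories is not fixed in advance; it grows with the data, which is why this process is used as a prior when the number of latent categories is unknown.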


See also

* Confounding
* Dependent and independent variables
* Errors-in-variables models
* Evidence lower bound
* Factor analysis
* Intervening variable
* Latent variable model
* Item response theory
* Partial least squares path modeling
* Partial least squares regression
* Proxy (statistics)
* Rasch model
* Structural equation modeling

