
In statistics, a latent class model (LCM) relates a set of observed (usually discrete) multivariate variables to a set of latent variables. It is a type of latent variable model. It is called a latent class model because the latent variable is discrete. A class is characterized by a pattern of conditional probabilities that indicate the chance that variables take on certain values.

Latent class analysis (LCA) is a subset of structural equation modeling, used to find groups or subtypes of cases in multivariate categorical data. These subtypes are called "latent classes" (Lazarsfeld, P.F. and Henry, N.W. (1968). ''Latent structure analysis''. Boston: Houghton Mifflin; Formann, A. K. (1984). ''Latent Class Analyse: Einführung in die Theorie und Anwendung'' [Latent class analysis: Introduction to theory and application]. Weinheim: Beltz).

A researcher might choose LCA in a situation like the following. Imagine that symptoms a-d have been measured in a range of patients with diseases X, Y, and Z, and that disease X is associated with the presence of symptoms a, b, and c, disease Y with symptoms b, c, and d, and disease Z with symptoms a, c, and d. The LCA will attempt to detect the presence of latent classes (the disease entities), which create patterns of association in the symptoms. As in factor analysis, the LCA can also be used to classify cases according to their maximum likelihood class membership.

The criterion for solving the LCA is to achieve latent classes within which there is no longer any association of one symptom with another, because the class is the disease that causes their association. Since the set of diseases a patient has (or the class a case is a member of) causes the symptom association, the symptoms will be "conditionally independent": conditional on class membership, they are no longer related.
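The classification step in the symptom example can be sketched in a few lines of Python. This is only an illustration: the class sizes and the symptom probabilities (0.9 for a symptom tied to the disease, 0.1 otherwise) are made-up values, not estimates from any data, and the function names are hypothetical. Each case is assigned to the class with the highest posterior probability, which coincides with the maximum likelihood class here because the assumed class sizes are equal.

```python
from math import prod

# Hypothetical parameters for the disease/symptom illustration (not estimated from data).
CLASS_PRIOR = {"X": 1 / 3, "Y": 1 / 3, "Z": 1 / 3}
# P(symptom present | class): 0.9 for symptoms tied to the disease, 0.1 otherwise.
SYMPTOM_PROB = {
    "X": {"a": 0.9, "b": 0.9, "c": 0.9, "d": 0.1},
    "Y": {"a": 0.1, "b": 0.9, "c": 0.9, "d": 0.9},
    "Z": {"a": 0.9, "b": 0.1, "c": 0.9, "d": 0.9},
}

def pattern_likelihood(pattern, cls):
    """P(pattern | class): symptoms are conditionally independent within a class."""
    probs = SYMPTOM_PROB[cls]
    return prod(probs[s] if present else 1.0 - probs[s] for s, present in pattern.items())

def posterior(pattern):
    """Posterior probability of each class given an observed symptom pattern."""
    joint = {c: CLASS_PRIOR[c] * pattern_likelihood(pattern, c) for c in CLASS_PRIOR}
    total = sum(joint.values())
    return {c: value / total for c, value in joint.items()}

def classify(pattern):
    """Assign the case to the class with the highest posterior probability."""
    post = posterior(pattern)
    return max(post, key=post.get)

# A patient showing symptoms a, b and c but not d.
patient = {"a": 1, "b": 1, "c": 1, "d": 0}
print(classify(patient), posterior(patient))
```

In a real analysis the class sizes and conditional probabilities would be estimated from the data rather than fixed in advance.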


Model

Within each latent class, the observed variables are statistically independent. This is an important aspect. Usually the observed variables are statistically dependent. By introducing the latent variable, independence is restored in the sense that within classes variables are independent (local independence). We then say that the association between the observed variables is explained by the classes of the latent variable (McCutcheon, 1987).

In one form, the latent class model is written as
: p_{i_1, i_2, \ldots, i_N} \approx \sum_{t=1}^{T} p_t \, \prod_{n=1}^{N} p^n_{i_n, t},
where T is the number of latent classes, N is the number of observed variables, and p_t are the so-called recruitment or unconditional probabilities, which should sum to one. p^n_{i_n, t} are the marginal or conditional probabilities.

For a two-way latent class model, the form is
: p_{ij} \approx \sum_{t=1}^{T} p_t \, p_{it} \, p_{jt}.
This two-way model is related to probabilistic latent semantic analysis and non-negative matrix factorization.
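To make the model formula concrete, the following Python sketch evaluates the joint probability of a cell as a sum over classes of the class probability times the product of the within-class conditional probabilities. The two-class, two-variable parameter values are invented for illustration, and `joint_probability` is a hypothetical helper name. The sketch also shows that the two variables are dependent in the pooled table even though they are independent within each class.

```python
from itertools import product
from math import prod

def joint_probability(class_probs, cond_probs, cell):
    """Sum over classes t of p_t times the product over variables n of p^n_{i_n,t}.

    class_probs[t] is p_t; cond_probs[n][t][i] is the probability that
    variable n takes category i within class t.
    """
    return sum(
        p_t * prod(cond_probs[n][t][i_n] for n, i_n in enumerate(cell))
        for t, p_t in enumerate(class_probs)
    )

# Hypothetical two-way model: two classes, two binary variables.
class_probs = [0.4, 0.6]
cond_probs = [
    [[0.8, 0.2], [0.3, 0.7]],  # variable 1: one row per class, one column per category
    [[0.9, 0.1], [0.2, 0.8]],  # variable 2
]

# Model-implied probability of every cell of the 2 x 2 table.
table = {
    cell: joint_probability(class_probs, cond_probs, cell)
    for cell in product(range(2), repeat=2)
}

# Marginal probabilities of category 0 for each variable in the pooled table.
row0 = table[(0, 0)] + table[(0, 1)]
col0 = table[(0, 0)] + table[(1, 0)]

print(table)
# If the variables were independent overall, the two printed numbers would agree.
print(table[(0, 0)], row0 * col0)
```

The same function works for any number of variables, since `cell` may be a tuple of any length that matches `cond_probs`.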


Related methods

There are a number of methods with distinct names and uses that share a common relationship. Cluster analysis is, like LCA, used to discover taxon-like groups of cases in data. Multivariate mixture estimation (MME) is applicable to continuous data, and assumes that such data arise from a mixture of distributions: imagine a set of heights arising from a mixture of men and women. If a multivariate mixture estimation is constrained so that measures must be uncorrelated within each distribution, it is termed latent profile analysis. Modified to handle discrete data, this constrained analysis is known as LCA. Discrete latent trait models further constrain the classes to form from segments of a single dimension, essentially allocating members to classes on that dimension; an example would be assigning cases to social classes on a dimension of ability or merit.

As a practical instance, the variables could be multiple choice items of a political questionnaire. The data in this case consist of an N-way contingency table with answers to the items for a number of respondents. In this example, the latent variable refers to political opinion and the latent classes to political groups. Given group membership, the conditional probabilities specify the chance that certain answers are chosen.
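For the questionnaire setting, one common way to estimate the class sizes and conditional probabilities is the expectation-maximization (EM) algorithm. The text above does not prescribe an estimation method, so the Python sketch below is an assumption for illustration: it handles only binary (yes/no) items, the "true" parameters used to simulate respondents are invented, and `simulate` and `fit_lca` are hypothetical names.

```python
import random
from math import log, prod

def simulate(n, class_probs, item_probs, seed=1):
    """Draw n binary response patterns from a known latent class model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = rng.choices(range(len(class_probs)), weights=class_probs)[0]
        rows.append(tuple(int(rng.random() < p) for p in item_probs[t]))
    return rows

def fit_lca(data, n_classes, n_iter=100, seed=0):
    """EM sketch for binary items; item_probs[t][n] = P(item n = 1 | class t)."""
    rng = random.Random(seed)
    n_items = len(data[0])
    class_probs = [1.0 / n_classes] * n_classes
    item_probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
                  for _ in range(n_classes)]
    history = []
    for _ in range(n_iter):
        # E-step: posterior class membership probabilities for each respondent.
        resp, loglik = [], 0.0
        for row in data:
            joint = [
                class_probs[t]
                * prod(p if x else 1.0 - p for x, p in zip(row, item_probs[t]))
                for t in range(n_classes)
            ]
            total = sum(joint)
            loglik += log(total)
            resp.append([j / total for j in joint])
        history.append(loglik)
        # M-step: re-estimate class sizes and conditional probabilities.
        for t in range(n_classes):
            weight = sum(r[t] for r in resp)
            class_probs[t] = weight / len(data)
            for n in range(n_items):
                p = sum(r[t] * row[n] for r, row in zip(resp, data)) / weight
                # Keep estimates strictly inside (0, 1) to avoid log(0).
                item_probs[t][n] = min(max(p, 1e-6), 1.0 - 1e-6)
    return class_probs, item_probs, history

# Invented parameters: two political groups answering four yes/no items.
true_class_probs = [0.4, 0.6]
true_item_probs = [[0.9, 0.8, 0.9, 0.2], [0.2, 0.1, 0.3, 0.8]]
data = simulate(500, true_class_probs, true_item_probs)

est_class_probs, est_item_probs, history = fit_lca(data, n_classes=2)
print(est_class_probs)
print(est_item_probs)
```

The order of the estimated classes is arbitrary, so the first estimated class need not correspond to the first simulated one. EM can also converge to a local optimum, which is why several random starts are often tried in practice.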


Application

LCA may be used in many fields, such as collaborative filtering, behavior genetics, and the evaluation of diagnostic tests.




External links

* Statistical Innovations, Home Page, 2016. Website with latent class software (Latent GOLD 5.1), free demonstrations, tutorials, user guides, and publications for download. Also included: online courses, FAQs, and other related software.
* The Methodology Center, Latent Class Analysis, a research center at Penn State, free software, FAQ.
* John Uebersax, Latent Class Analysis, 2006. A website with bibliography, software, links and FAQ for latent class analysis.