A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on
random
In common usage, randomness is the apparent or actual lack of pattern or predictability in events. A random sequence of events, symbols or steps often has no order and does not follow an intelligible pattern or combination. Individual rando ...
events.
It is a mapping or a function from possible
outcomes (e.g., the possible upper sides of a flipped coin such as heads
and tails
) in a
sample space
In probability theory, the sample space (also called sample description space, possibility space, or outcome space) of an experiment or random trial is the set of all possible outcomes or results of that experiment. A sample space is usually de ...
(e.g., the set
) to a
measurable space
In mathematics, a measurable space or Borel space is a basic object in measure theory. It consists of a set and a σ-algebra, which defines the subsets that will be measured.
Definition
Consider a set X and a σ-algebra \mathcal A on X. Then ...
, often the real numbers (e.g.,
in which 1 corresponding to
and -1 corresponding to
).

Informally, randomness typically represents some fundamental element of chance, such as in the roll of a
dice
Dice (singular die or dice) are small, throwable objects with marked sides that can rest in multiple positions. They are used for generating random values, commonly as part of tabletop games, including dice games, board games, role-playing ...
; it may also represent uncertainty, such as
measurement error
Observational error (or measurement error) is the difference between a measured value of a quantity and its true value.Dodge, Y. (2003) ''The Oxford Dictionary of Statistical Terms'', OUP. In statistics, an error is not necessarily a "mistak ...
.
However, the
interpretation of probability
The word probability has been used in a variety of ways since it was first applied to the mathematical study of games of chance. Does probability measure the real, physical, tendency of something to occur, or is it a measure of how strongly one be ...
is philosophically complicated, and even in specific cases is not always straightforward. The purely mathematical analysis of random variables is independent of such interpretational difficulties, and can be based upon a rigorous axiomatic setup.
In the formal mathematical language of
measure theory, a random variable is defined as a
measurable function
In mathematics and in particular measure theory, a measurable function is a function between the underlying sets of two measurable spaces that preserves the structure of the spaces: the preimage of any measurable set is measurable. This is i ...
from a
probability measure space (called the ''sample space'') to a
measurable space
In mathematics, a measurable space or Borel space is a basic object in measure theory. It consists of a set and a σ-algebra, which defines the subsets that will be measured.
Definition
Consider a set X and a σ-algebra \mathcal A on X. Then ...
. This allows consideration of the
pushforward measure In measure theory, a pushforward measure (also known as push forward, push-forward or image measure) is obtained by transferring ("pushing forward") a measure from one measurable space to another using a measurable function.
Definition
Given mea ...
, which is called the ''distribution'' of the random variable; the distribution is thus a
probability measure
In mathematics, a probability measure is a real-valued function defined on a set of events in a probability space that satisfies measure properties such as ''countable additivity''. The difference between a probability measure and the more g ...
on the set of all possible values of the random variable. It is possible for two random variables to have identical distributions but to differ in significant ways; for instance, they may be
independent
Independent or Independents may refer to:
Arts, entertainment, and media Artist groups
* Independents (artist group), a group of modernist painters based in the New Hope, Pennsylvania, area of the United States during the early 1930s
* Independe ...
.
It is common to consider the special cases of
discrete random variable
A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. It is a mapping or a function from possible outcomes (e.g., the po ...
s and
absolutely continuous random variable
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon ...
s, corresponding to whether a random variable is valued in a discrete set (such as a finite set) or in an interval of
real number
In mathematics, a real number is a number that can be used to measurement, measure a ''continuous'' one-dimensional quantity such as a distance, time, duration or temperature. Here, ''continuous'' means that values can have arbitrarily small var ...
s. There are other important possibilities, especially in the theory of
stochastic processes, wherein it is natural to consider
random sequence The concept of a random sequence is essential in probability theory and statistics. The concept generally relies on the notion of a sequence of random variables and many statistical discussions begin with the words "let ''X''1,...,''Xn'' be indepen ...
s or
random function
In probability theory and related fields, a stochastic () or random process is a mathematical object usually defined as a family of random variables. Stochastic processes are widely used as mathematical models of systems and phenomena that appea ...
s. Sometimes a ''random variable'' is taken to be automatically valued in the real numbers, with more general random quantities instead being called ''
random element In probability theory, random element is a generalization of the concept of random variable to more complicated spaces than the simple real line. The concept was introduced by who commented that the “development of probability theory and expansio ...
s''.
According to
George Mackey
George Whitelaw Mackey (February 1, 1916 – March 15, 2006) was an American mathematician known for his contributions to quantum logic, representation theory, and noncommutative geometry.
Career
Mackey earned his bachelor of arts at Rice Unive ...
,
Pafnuty Chebyshev
Pafnuty Lvovich Chebyshev ( rus, Пафну́тий Льво́вич Чебышёв, p=pɐfˈnutʲɪj ˈlʲvovʲɪtɕ tɕɪbɨˈʂof) ( – ) was a Russian mathematician and considered to be the founding father of Russian mathematics.
Chebysh ...
was the first person "to think systematically in terms of random variables".
Definition
A random variable
is a
measurable function
In mathematics and in particular measure theory, a measurable function is a function between the underlying sets of two measurable spaces that preserves the structure of the spaces: the preimage of any measurable set is measurable. This is i ...
from a sample space
as a set of possible
outcome
Outcome may refer to:
* Outcome (probability), the result of an experiment in probability theory
* Outcome (game theory), the result of players' decisions in game theory
* ''The Outcome'', a 2005 Spanish film
* An outcome measure (or endpoint) ...
s to a
measurable space
In mathematics, a measurable space or Borel space is a basic object in measure theory. It consists of a set and a σ-algebra, which defines the subsets that will be measured.
Definition
Consider a set X and a σ-algebra \mathcal A on X. Then ...
. The technical axiomatic definition requires the sample space
to be a sample space of a
probability triple (see the
measure-theoretic definition). A random variable is often denoted by capital
roman letters
The Latin script, also known as Roman script, is an alphabetic writing system based on the letters of the classical Latin alphabet, derived from a form of the Greek alphabet which was in use in the ancient Greek city of Cumae, in southern Italy ...
such as
,
,
,
.
The probability that
takes on a value in a measurable set
is written as
:
Standard case
In many cases,
is
real-valued
In mathematics, value may refer to several, strongly related notions.
In general, a mathematical value may be any definite mathematical object. In elementary mathematics, this is most often a number – for example, a real number such as or an ...
, i.e.
. In some contexts, the term
random element In probability theory, random element is a generalization of the concept of random variable to more complicated spaces than the simple real line. The concept was introduced by who commented that the “development of probability theory and expansio ...
(see
extensions
Extension, extend or extended may refer to:
Mathematics
Logic or set theory
* Axiom of extensionality
* Extensible cardinal
* Extension (model theory)
* Extension (predicate logic), the set of tuples of values that satisfy the predicate
* Ext ...
) is used to denote a random variable not of this form.
When the
image
An image is a visual representation of something. It can be two-dimensional, three-dimensional, or somehow otherwise feed into the visual system to convey information. An image can be an artifact, such as a photograph or other two-dimensio ...
(or range) of
is
countable
In mathematics, a set is countable if either it is finite or it can be made in one to one correspondence with the set of natural numbers. Equivalently, a set is ''countable'' if there exists an injective function from it into the natural number ...
, the random variable is called a discrete random variable
and its distribution is a
discrete probability distribution
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon i ...
, i.e. can be described by a
probability mass function
In probability and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as the discrete density function. The probability mass ...
that assigns a probability to each value in the image of
. If the image is uncountably infinite (usually an
interval) then
is called a continuous random variable. In the special case that it is
absolutely continuous
In calculus, absolute continuity is a smoothness property of functions that is stronger than continuity and uniform continuity. The notion of absolute continuity allows one to obtain generalizations of the relationship between the two central ope ...
, its distribution can be described by a
probability density function
In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) c ...
, which assigns probabilities to intervals; in particular, each individual point must necessarily have probability zero for an absolutely continuous random variable. Not all continuous random variables are absolutely continuous, a
mixture distribution
In probability and statistics, a mixture distribution is the probability distribution of a random variable that is derived from a collection of other random variables as follows: first, a random variable is selected by chance from the collection ...
is one such counterexample; such random variables cannot be described by a probability density or a probability mass function.
Any random variable can be described by its
cumulative distribution function
In probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable X, or just distribution function of X, evaluated at x, is the probability that X will take a value less than or equal to x.
Ev ...
, which describes the probability that the random variable will be less than or equal to a certain value.
Extensions
The term "random variable" in statistics is traditionally limited to the
real-valued
In mathematics, value may refer to several, strongly related notions.
In general, a mathematical value may be any definite mathematical object. In elementary mathematics, this is most often a number – for example, a real number such as or an ...
case (
). In this case, the structure of the real numbers makes it possible to define quantities such as the
expected value
In probability theory, the expected value (also called expectation, expectancy, mathematical expectation, mean, average, or first moment) is a generalization of the weighted average. Informally, the expected value is the arithmetic mean of a ...
and
variance
In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its population mean or sample mean. Variance is a measure of dispersion, meaning it is a measure of how far a set of number ...
of a random variable, its
cumulative distribution function
In probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable X, or just distribution function of X, evaluated at x, is the probability that X will take a value less than or equal to x.
Ev ...
, and the
moments of its distribution.
However, the definition above is valid for any
measurable space
In mathematics, a measurable space or Borel space is a basic object in measure theory. It consists of a set and a σ-algebra, which defines the subsets that will be measured.
Definition
Consider a set X and a σ-algebra \mathcal A on X. Then ...
of values. Thus one can consider random elements of other sets
, such as random
boolean value
In mathematics and mathematical logic, Boolean algebra is a branch of algebra. It differs from elementary algebra in two ways. First, the values of the variables are the truth values ''true'' and ''false'', usually denoted 1 and 0, whereas in ...
s,
categorical values,
complex numbers
In mathematics, a complex number is an element of a number system that extends the real numbers with a specific element denoted , called the imaginary unit and satisfying the equation i^= -1; every complex number can be expressed in the for ...
,
vector
Vector most often refers to:
*Euclidean vector, a quantity with a magnitude and a direction
*Vector (epidemiology), an agent that carries and transmits an infectious pathogen into another living organism
Vector may also refer to:
Mathematic ...
s,
matrices
Matrix most commonly refers to:
* ''The Matrix'' (franchise), an American media franchise
** ''The Matrix'', a 1999 science-fiction action film
** "The Matrix", a fictional setting, a virtual reality environment, within ''The Matrix'' (franchis ...
,
sequence
In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is called ...
s,
tree
In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
s,
sets,
shape
A shape or figure is a graphical representation of an object or its external boundary, outline, or external surface, as opposed to other properties such as color, texture, or material type.
A plane shape or plane figure is constrained to lie on ...
s,
manifold
In mathematics, a manifold is a topological space that locally resembles Euclidean space near each point. More precisely, an n-dimensional manifold, or ''n-manifold'' for short, is a topological space with the property that each point has a ...
s, and
function
Function or functionality may refer to:
Computing
* Function key, a type of key on computer keyboards
* Function model, a structured representation of processes in a system
* Function object or functor or functionoid, a concept of object-orie ...
s. One may then specifically refer to a ''random variable of
type
Type may refer to:
Science and technology Computing
* Typing, producing text via a keyboard, typewriter, etc.
* Data type, collection of values used for computations.
* File type
* TYPE (DOS command), a command to display contents of a file.
* Ty ...
'', or an ''
-valued random variable''.
This more general concept of a
random element In probability theory, random element is a generalization of the concept of random variable to more complicated spaces than the simple real line. The concept was introduced by who commented that the “development of probability theory and expansio ...
is particularly useful in disciplines such as
graph theory
In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
,
machine learning
Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence.
Machine ...
,
natural language processing
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to proc ...
, and other fields in
discrete mathematics
Discrete mathematics is the study of mathematical structures that can be considered "discrete" (in a way analogous to discrete variables, having a bijection with the set of natural numbers) rather than "continuous" (analogously to continu ...
and
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...
, where one is often interested in modeling the random variation of non-numerical
data structure
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the rel ...
s. In some cases, it is nonetheless convenient to represent each element of
, using one or more real numbers. In this case, a random element may optionally be represented as a
vector of real-valued random variables (all defined on the same underlying probability space
, which allows the different random variables to
covary
In probability theory and statistics, covariance is a measure of the joint variability of two random variables. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the les ...
). For example:
*A random word may be represented as a random integer that serves as an index into the vocabulary of possible words. Alternatively, it can be represented as a random indicator vector, whose length equals the size of the vocabulary, where the only values of positive probability are
,
,
and the position of the 1 indicates the word.
*A random sentence of given length
may be represented as a vector of
random words.
*A
random graph
In mathematics, random graph is the general term to refer to probability distributions over graphs. Random graphs may be described simply by a probability distribution, or by a random process which generates them. The theory of random graphs ...
on
given vertices may be represented as a
matrix of random variables, whose values specify the
adjacency matrix
In graph theory and computer science, an adjacency matrix is a square matrix used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph.
In the special case of a finite simple ...
of the random graph.
*A
random function
In probability theory and related fields, a stochastic () or random process is a mathematical object usually defined as a family of random variables. Stochastic processes are widely used as mathematical models of systems and phenomena that appea ...
may be represented as a collection of random variables
, giving the function's values at the various points
in the function's domain. The
are ordinary real-valued random variables provided that the function is real-valued. For example, a
stochastic process is a random function of time, a
random vector
In probability, and statistics, a multivariate random variable or random vector is a list of mathematical variables each of whose value is unknown, either because the value has not yet occurred or because there is imperfect knowledge of its valu ...
is a random function of some index set such as
, and
random field In physics and mathematics, a random field is a random function over an arbitrary domain (usually a multi-dimensional space such as \mathbb^n). That is, it is a function f(x) that takes on a random value at each point x \in \mathbb^n(or some other ...
is a random function on any set (typically time, space, or a discrete set).
Distribution functions
If a random variable
defined on the probability space
is given, we can ask questions like "How likely is it that the value of
is equal to 2?". This is the same as the probability of the event
which is often written as
or
for short.
Recording all these probabilities of outputs of a random variable
yields the
probability distribution
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomeno ...
of
. The probability distribution "forgets" about the particular probability space used to define
and only records the probabilities of various output values of
. Such a probability distribution, if
is real-valued, can always be captured by its
cumulative distribution function
In probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable X, or just distribution function of X, evaluated at x, is the probability that X will take a value less than or equal to x.
Ev ...
:
and sometimes also using a
probability density function
In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) c ...
,
. In
measure-theoretic terms, we use the random variable
to "push-forward" the measure
on
to a measure
on
. The measure
is called the "(probability) distribution of
" or the "law of
".
The density
, the
Radon–Nikodym derivative of
with respect to some reference measure
on
(often, this reference measure is the
Lebesgue measure
In measure theory, a branch of mathematics, the Lebesgue measure, named after French mathematician Henri Lebesgue, is the standard way of assigning a measure to subsets of ''n''-dimensional Euclidean space. For ''n'' = 1, 2, or 3, it coincides ...
in the case of continuous random variables, or the
counting measure In mathematics, specifically measure theory, the counting measure is an intuitive way to put a measure on any set – the "size" of a subset is taken to be the number of elements in the subset if the subset has finitely many elements, and infin ...
in the case of discrete random variables).
The underlying probability space
is a technical device used to guarantee the existence of random variables, sometimes to construct them, and to define notions such as
correlation and dependence
In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data. Although in the broadest sense, "correlation" may indicate any type of association, in statistic ...
or
independence
Independence is a condition of a person, nation, country, or state in which residents and population, or some portion thereof, exercise self-government, and usually sovereignty, over its territory. The opposite of independence is the s ...
based on a
joint distribution
Given two random variables that are defined on the same probability space, the joint probability distribution is the corresponding probability distribution on all possible pairs of outputs. The joint distribution can just as well be considered ...
of two or more random variables on the same probability space. In practice, one often disposes of the space
altogether and just puts a measure on
that assigns measure 1 to the whole real line, i.e., one works with probability distributions instead of random variables. See the article on
quantile function
In probability and statistics, the quantile function, associated with a probability distribution of a random variable, specifies the value of the random variable such that the probability of the variable being less than or equal to that value e ...
s for fuller development.
Examples
Discrete random variable
In an experiment a person may be chosen at random, and one random variable may be the person's height. Mathematically, the random variable is interpreted as a function which maps the person to the person's height. Associated with the random variable is a probability distribution that allows the computation of the probability that the height is in any subset of possible values, such as the probability that the height is between 180 and 190 cm, or the probability that the height is either less than 150 or more than 200 cm.
Another random variable may be the person's number of children; this is a discrete random variable with non-negative integer values. It allows the computation of probabilities for individual integer values – the probability mass function (PMF) – or for sets of values, including infinite sets. For example, the event of interest may be "an even number of children". For both finite and infinite event sets, their probabilities can be found by adding up the PMFs of the elements; that is, the probability of an even number of children is the infinite sum
.
In examples such as these, the
sample space
In probability theory, the sample space (also called sample description space, possibility space, or outcome space) of an experiment or random trial is the set of all possible outcomes or results of that experiment. A sample space is usually de ...
is often suppressed, since it is mathematically hard to describe, and the possible values of the random variables are then treated as a sample space. But when two random variables are measured on the same sample space of outcomes, such as the height and number of children being computed on the same random persons, it is easier to track their relationship if it is acknowledged that both height and number of children come from the same random person, for example so that questions of whether such random variables are correlated or not can be posed.
If
are countable sets of real numbers,
and
, then
is a discrete distribution function. Here
for
,
for
. Taking for instance an enumeration of all rational numbers as
, one gets a discrete function that is not necessarily a step function (piecewise constant).
Coin toss
The possible outcomes for one coin toss can be described by the sample space
. We can introduce a real-valued random variable
that models a $1 payoff for a successful bet on heads as follows:
If the coin is a
fair coin
In probability theory and statistics, a sequence of independent Bernoulli trials with probability 1/2 of success on each trial is metaphorically called a fair coin. One for which the probability is not 1/2 is called a biased or unfair coin. In th ...
, ''Y'' has a
probability mass function
In probability and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as the discrete density function. The probability mass ...
given by:
Dice roll

A random variable can also be used to describe the process of rolling dice and the possible outcomes. The most obvious representation for the two-dice case is to take the set of pairs of numbers ''n''
1 and ''n''
2 from (representing the numbers on the two dice) as the sample space. The total number rolled (the sum of the numbers in each pair) is then a random variable ''X'' given by the function that maps the pair to the sum:
and (if the dice are
fair) has a probability mass function ''f''
''X'' given by:
Continuous random variable
Formally, a continuous random variable is a random variable whose
cumulative distribution function
In probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable X, or just distribution function of X, evaluated at x, is the probability that X will take a value less than or equal to x.
Ev ...
is
continuous
Continuity or continuous may refer to:
Mathematics
* Continuity (mathematics), the opposing concept to discreteness; common examples include
** Continuous probability distribution or random variable in probability and statistics
** Continuous g ...
everywhere.
There are no "
gaps", which would correspond to numbers which have a finite probability of
occurring. Instead, continuous random variables
almost never take an exact prescribed value ''c'' (formally,
) but there is a positive probability that its value will lie in particular
intervals
Interval may refer to:
Mathematics and physics
* Interval (mathematics), a range of numbers
** Partially ordered set#Intervals, its generalization from numbers to arbitrary partially ordered sets
* A statistical level of measurement
* Interval est ...
which can be
arbitrarily small In mathematics, the phrases arbitrarily large, arbitrarily small and arbitrarily long are used in statements to make clear of the fact that an object is large, small and long with little limitation or restraint, respectively. The use of "arbitraril ...
. Continuous random variables usually admit
probability density function
In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) c ...
s (PDF), which characterize their CDF and
probability measure
In mathematics, a probability measure is a real-valued function defined on a set of events in a probability space that satisfies measure properties such as ''countable additivity''. The difference between a probability measure and the more g ...
s;
such distributions are also called
absolutely continuous
In calculus, absolute continuity is a smoothness property of functions that is stronger than continuity and uniform continuity. The notion of absolute continuity allows one to obtain generalizations of the relationship between the two central ope ...
; but some continuous distributions are
singular
Singular may refer to:
* Singular, the grammatical number that denotes a unit quantity, as opposed to the plural and other forms
* Singular homology
* SINGULAR, an open source Computer Algebra System (CAS)
* Singular or sounder, a group of boar ...
, or mixes of an absolutely continuous part and a singular part.
An example of a continuous random variable would be one based on a spinner that can choose a horizontal direction. Then the values taken by the random variable are directions. We could represent these directions by North, West, East, South, Southeast, etc. However, it is commonly more convenient to map the sample space to a random variable which takes values which are real numbers. This can be done, for example, by mapping a direction to a bearing in degrees clockwise from North. The random variable then takes values which are real numbers from the interval [0, 360), with all parts of the range being "equally likely". In this case, ''X'' = the angle spun. Any real number has probability zero of being selected, but a positive probability can be assigned to any ''range'' of values. For example, the probability of choosing a number in [0, 180] is . Instead of speaking of a probability mass function, we say that the probability ''density'' of ''X'' is 1/360. The probability of a subset of [0, 360) can be calculated by multiplying the measure of the set by 1/360. In general, the probability of a set for a given continuous random variable can be calculated by integrating the density over the given set.
More formally, given any
interval , a random variable