In probability and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as the discrete density function. The probability mass function is often the primary means of defining a discrete probability distribution, and such functions exist for either scalar or multivariate random variables whose domain is discrete.

A probability mass function differs from a probability density function (PDF) in that the latter is associated with continuous rather than discrete random variables. A PDF must be integrated over an interval to yield a probability.

The value of the random variable having the largest probability mass is called the mode.

Formal definition

The probability mass function is the probability distribution of a discrete random variable, and provides the possible values and their associated probabilities. It is the function

p_X \colon \mathbb{R} \to [0,1]

defined by

p_X(x) = P(X = x)

for -\infty < x < \infty, where P is a probability measure.

p_X(x)

can also be simplified as

p(x)

. The probabilities associated with all (hypothetical) values must be non-negative and sum up to 1,

\sum_x p_X(x) = 1

and

p_X(x)\geq 0.

Thinking of probability as mass helps to avoid mistakes, since physical mass is conserved, as is the total probability over all hypothetical outcomes x.
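The two conditions above (non-negativity and unit total mass) can be checked mechanically for a distribution with finitely many values. The following is a minimal sketch, not a definitive implementation: the dictionary representation, the helper names `is_valid_pmf`, `p` and `mode`, and the loaded four-sided die are all assumptions made for illustration. Exact fractions are used so that the sum can be compared to 1 without rounding error.

```python
from fractions import Fraction


def is_valid_pmf(pmf):
    """Check the two defining conditions: p(x) >= 0 for every x, and sum_x p(x) == 1."""
    non_negative = all(mass >= 0 for mass in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return non_negative and sums_to_one


def p(pmf, x):
    """Evaluate p_X(x); any value not listed in the dictionary has probability 0."""
    return pmf.get(x, Fraction(0))


def mode(pmf):
    """Return a value with the largest probability mass (the first one found if tied)."""
    return max(pmf, key=pmf.get)


# Hypothetical example: a loaded four-sided die.
loaded_die = {
    1: Fraction(1, 2),
    2: Fraction(1, 4),
    3: Fraction(1, 8),
    4: Fraction(1, 8),
}
```

With this representation, `is_valid_pmf(loaded_die)` holds, `p(loaded_die, 7)` is zero because 7 is not a possible value, and `mode(loaded_die)` returns the face carrying the most mass.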

Measure theoretic formulation

A probability mass function of a discrete random variable

X

can be seen as a special case of two more general measure theoretic constructions: the distribution of

X

and the probability density function of

X

with respect to the counting measure. We make this more precise below. Suppose that

(A, \mathcal A, P)

is a probability space and that

(B, \mathcal B)

is a measurable space whose underlying σ-algebra is discrete, so in particular contains singleton sets of

B

. In this setting, a random variable

X \colon A \to B

is discrete provided its image is countable. The pushforward measure

X_{*}(P)

—called the distribution of

X

in this context—is a probability measure on

B

whose restriction to singleton sets induces the probability mass function (as mentioned in the previous section)

f_X \colon B \to \mathbb R

since

f_X(b) = P(X^{-1}(b)) = P(X = b)

for each

b \in B

. Now suppose that

(B, \mathcal B, \mu)

is a measure space equipped with the counting measure μ. The probability density function

f

of

X

with respect to the counting measure, if it exists, is the Radon–Nikodym derivative of the pushforward measure of

X

(with respect to the counting measure), so

f = d X_*P / d \mu

and

f

is a function from

B

to the non-negative reals. As a consequence, for any

b \in B

we have

P(X=b) = P(X^{-1}(b)) = X_{*}(P)(b) = \int_{b} f \, d\mu = f(b),

demonstrating that

f

is in fact a probability mass function. When there is a natural order among the potential outcomes

x

, it may be convenient to assign numerical values to them (or n-tuples in case of a discrete multivariate random variable) and to consider also values not in the image of

X

. That is,

f_X

may be defined for all real numbers and

f_X(x)=0

for all

x \notin X(S)

as shown in the figure. The image of

X

has a countable subset on which the values of the probability mass function

f_X(x)

sum to one. Consequently, the probability mass function is zero for all but a countable number of values of

x

. The discontinuity of probability mass functions is related to the fact that the cumulative distribution function of a discrete random variable is also discontinuous. If

X

is a discrete random variable, then

P(X = x) = 1

means that the event

(X = x)

is certain (it is true in 100% of the occurrences); conversely,

P(X = x) = 0

means that the event

(X = x)

is always impossible. This statement is not true for a continuous random variable

X

, for which

P(X = x) = 0

for any possible

x

. Discretization is the process of converting a continuous random variable into a discrete one.
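The pushforward construction above can be made concrete on a finite probability space. The sketch below is only an illustration under stated assumptions: the sample space of two fair dice, the random variable `X` (their sum), and the function names `pmf` and `cdf` are hypothetical choices, not part of the text. It computes f_X(b) as the probability of the preimage of b, returns zero for values outside the image of X, and builds the cumulative distribution function, whose jump at each b equals f_X(b).

```python
from fractions import Fraction
from itertools import product

# Hypothetical probability space: two fair dice, each outcome has probability 1/36.
sample_space = list(product(range(1, 7), repeat=2))
P = {omega: Fraction(1, 36) for omega in sample_space}


def X(omega):
    """Random variable: the sum of the two dice."""
    return omega[0] + omega[1]


def pmf(b):
    """f_X(b) = P(X^{-1}(b)): total probability of the preimage of b under X."""
    return sum((P[omega] for omega in sample_space if X(omega) == b), Fraction(0))


def cdf(x):
    """F_X(x) = P(X <= x): a step function that jumps by f_X(b) at each b in the image."""
    return sum((P[omega] for omega in sample_space if X(omega) <= x), Fraction(0))
```

Here `pmf(7)` is 1/6, while `pmf(1)` and `pmf(2.5)` are zero because neither value lies in the image of `X`. The difference `cdf(7) - cdf(6.999)` equals `pmf(7)`, which is the discontinuity described above.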

Examples

Finite

There are three major associated distributions: the Bernoulli distribution, the binomial distribution and the geometric distribution.

* Bernoulli distribution: ber(p) is used to model an experiment with only two possible outcomes. The two outcomes are often encoded as 1 and 0.

p_X(x) = \begin{cases}
p, & \text{if } x \text{ is 1} \\
1-p, & \text{if } x \text{ is 0}
\end{cases}

An example of the Bernoulli distribution is tossing a coin. Suppose that

S

is the sample space of all outcomes of a single toss of a fair coin, and

X

is the random variable defined on

S

assigning 0 to the category "tails" and 1 to the category "heads". Since the coin is fair, the probability mass function is

p_X(x) = \begin{cases}
\frac{1}{2}, & x \in \{0, 1\}, \\
0, & x \notin \{0, 1\}.
\end{cases}

* Binomial distribution: models the number of successes when someone draws n times with replacement. Each draw or experiment is independent, with two possible outcomes. The associated probability mass function is

\binom{n}{k} p^k (1-p)^{n-k}

. An example of the binomial distribution is the probability of getting exactly one 6 when someone rolls a fair die three times.

* Geometric distribution: describes the number of trials needed to get one success. Its probability mass function is

p_X(k) = (1-p)^{k-1} p

. An example is tossing a coin until the first "heads" appears.

p

denotes the probability of the outcome "heads", and

k

denotes the number of necessary coin tosses.

Other distributions that can be modeled using a probability mass function are the categorical distribution (also known as the generalized Bernoulli distribution) and the multinomial distribution.

* If the discrete distribution has two or more categories, one of which may occur, whether or not these categories have a natural ordering, and there is only a single trial (draw), this is a categorical distribution.
* An example of a multivariate discrete distribution, and of its probability mass function, is provided by the multinomial distribution. Here the multiple random variables are the numbers of successes in each of the categories after a given number of trials, and each non-zero probability mass gives the probability of a certain combination of numbers of successes in the various categories.
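The Bernoulli, binomial and geometric mass functions given earlier in this section can be written down directly. The following is a minimal sketch using only the Python standard library; the function names and argument order are assumptions made for illustration, and each function returns 0 outside the support of the corresponding distribution.

```python
from fractions import Fraction
from math import comb


def bernoulli_pmf(x, p):
    """p_X(x) = p if x is 1, 1 - p if x is 0, and 0 otherwise."""
    if x == 1:
        return p
    if x == 0:
        return 1 - p
    return 0


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials: C(n, k) p^k (1-p)^(n-k)."""
    if not (isinstance(k, int) and 0 <= k <= n):
        return 0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k: (1-p)^(k-1) p, for k = 1, 2, ..."""
    if not (isinstance(k, int) and k >= 1):
        return 0
    return (1 - p) ** (k - 1) * p


# Exactly one 6 in three rolls of a fair die: 3 * (1/6) * (5/6)^2 = 25/72.
one_six_in_three_rolls = binomial_pmf(1, 3, Fraction(1, 6))

# First "heads" of a fair coin on the third toss: (1/2)^2 * (1/2) = 1/8.
first_heads_on_third_toss = geometric_pmf(3, Fraction(1, 2))
```

Passing `Fraction` values for `p` keeps the results exact; floating-point values work as well, at the cost of rounding.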

Infinite

The following exponentially declining distribution is an example of a distribution with an infinite number of possible outcomes—all the positive integers:

\text{Pr}(X=i) = \frac{1}{2^i} \qquad \text{for } i = 1, 2, 3, \dots

Despite the infinite number of possible outcomes, the total probability mass is 1/2 + 1/4 + 1/8 + ⋯ = 1, satisfying the unit total probability requirement for a probability distribution.
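The convergence of the total mass to 1 can be observed through partial sums. The sketch below is an illustration only; the names `pmf` and `partial_sum` are hypothetical. It relies on the geometric series identity that the sum of the first n terms is 1 - 1/2^n, so the mass still unaccounted for after n terms is exactly 1/2^n.

```python
from fractions import Fraction


def pmf(i):
    """Pr(X = i) = 1 / 2^i for i = 1, 2, 3, ...; zero for any other value."""
    if isinstance(i, int) and i >= 1:
        return Fraction(1, 2**i)
    return Fraction(0)


def partial_sum(n):
    """Total probability mass carried by the first n positive integers."""
    return sum((pmf(i) for i in range(1, n + 1)), Fraction(0))
```

For instance, `partial_sum(3)` is 7/8 and `1 - partial_sum(10)` is 1/1024; the remaining mass halves with each additional term, so the partial sums approach 1.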

Multivariate case

Two or more discrete random variables have a joint probability mass function, which gives the probability of each possible combination of realizations for the random variables.
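A joint probability mass function over a small number of combinations can be stored as a table keyed by tuples. The following sketch assumes a hypothetical experiment of two fair coin tosses, with X the result of the first toss (1 for "heads") and Y the total number of heads; the names `joint` and `marginal` are illustrative. Summing the joint masses over one coordinate recovers the marginal probability mass function of the other.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint PMF of (X, Y): X is the first toss, Y is the number of heads in two fair tosses.
joint = defaultdict(Fraction)
for first, second in product((0, 1), repeat=2):
    joint[(first, first + second)] += Fraction(1, 4)


def marginal(joint_pmf, axis):
    """Sum the joint masses over every coordinate except `axis`."""
    result = defaultdict(Fraction)
    for values, mass in joint_pmf.items():
        result[values[axis]] += mass
    return dict(result)


pmf_x = marginal(joint, 0)  # PMF of the first toss
pmf_y = marginal(joint, 1)  # PMF of the number of heads
```

Combinations that cannot occur, such as X = 0 together with Y = 2, are simply absent from the table and have probability zero.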