In
probability theory
Probability theory is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set ...
, a probability density function (PDF), or density of a
continuous random variable
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon ...
, is a
function whose value at any given sample (or point) in the
sample space (the set of possible values taken by the random variable) can be interpreted as providing a ''relative likelihood'' that the value of the random variable would be close to that sample. Probability density is the probability per unit length, in other words, while the ''absolute likelihood'' for a continuous random variable to take on any particular value is 0 (since there is an infinite set of possible values to begin with), the value of the PDF at two different samples can be used to infer, in any particular draw of the random variable, how much more likely it is that the random variable would be close to one sample compared to the other sample.
In a more precise sense, the PDF is used to specify the probability of the
random variable falling ''within a particular range of values'', as opposed to taking on any one value. This probability is given by the
integral
In mathematics, an integral assigns numbers to functions in a way that describes displacement, area, volume, and other concepts that arise by combining infinitesimal data. The process of finding integrals is called integration. Along with ...
of this variable's PDF over that range—that is, it is given by the area under the density function but above the horizontal axis and between the lowest and greatest values of the range. The probability density function is nonnegative everywhere, and the area under the entire curve is equal to 1.
The terms "''probability distribution function''" and "''probability function''" have also sometimes been used to denote the probability density function. However, this use is not standard among probabilists and statisticians. In other sources, "probability distribution function" may be used when the
probability distribution is defined as a function over general sets of values or it may refer to the
cumulative distribution function, or it may be a
probability mass function (PMF) rather than the density. "Density function" itself is also used for the probability mass function, leading to further confusion. In general though, the PMF is used in the context of
discrete random variables (random variables that take values on a countable set), while the PDF is used in the context of continuous random variables.
Example
Suppose bacteria of a certain species typically live 4 to 6 hours. The probability that a bacterium lives 5 hours is equal to zero. A lot of bacteria live for approximately 5 hours, but there is no chance that any given bacterium dies at exactly 5.00... hours. However, the probability that the bacterium dies between 5 hours and 5.01 hours is quantifiable. Suppose the answer is 0.02 (i.e., 2%). Then, the probability that the bacterium dies between 5 hours and 5.001 hours should be about 0.002, since this time interval is one-tenth as long as the previous. The probability that the bacterium dies between 5 hours and 5.0001 hours should be about 0.0002, and so on.
In this example, the ratio (probability of dying during an interval) / (duration of the interval) is approximately constant, and equal to 2 per hour (or 2 hour
−1). For example, there is 0.02 probability of dying in the 0.01-hour interval between 5 and 5.01 hours, and (0.02 probability / 0.01 hours) = 2 hour
−1. This quantity 2 hour
−1 is called the probability density for dying at around 5 hours. Therefore, the probability that the bacterium dies at 5 hours can be written as (2 hour
−1) ''dt''. This is the probability that the bacterium dies within an infinitesimal window of time around 5 hours, where ''dt'' is the duration of this window. For example, the probability that it lives longer than 5 hours, but shorter than (5 hours + 1 nanosecond), is (2 hour
−1)×(1 nanosecond) ≈ (using the
unit conversion nanoseconds = 1 hour).
There is a probability density function ''f'' with ''f''(5 hours) = 2 hour
−1. The
integral
In mathematics, an integral assigns numbers to functions in a way that describes displacement, area, volume, and other concepts that arise by combining infinitesimal data. The process of finding integrals is called integration. Along with ...
of ''f'' over any window of time (not only infinitesimal windows but also large windows) is the probability that the bacterium dies in that window.
Absolutely continuous univariate distributions
A probability density function is most commonly associated with
absolutely continuous univariate distributions. A
random variable has density
, where
is a non-negative
Lebesgue-integrable
In mathematics, the integral of a non-negative function of a single variable can be regarded, in the simplest case, as the area between the graph of that function and the -axis. The Lebesgue integral, named after French mathematician Henri Le ...
function, if:
Hence, if
is the
cumulative distribution function of
, then:
and (if
is continuous at
)
Intuitively, one can think of
as being the probability of
falling within the infinitesimal
interval