In probability theory, a Pitman–Yor process, denoted PY(''d'', ''θ'', ''G''0), is a stochastic process whose sample path is a probability distribution. A random sample from this process is an infinite discrete probability distribution, consisting of an infinite set of atoms drawn from ''G''0, with weights drawn from a two-parameter Poisson–Dirichlet distribution. The process is named after Jim Pitman and Marc Yor.

The parameters governing the Pitman–Yor process are a discount parameter 0 ≤ ''d'' < 1, a strength parameter ''θ'' > −''d'', and a base distribution ''G''0 over a probability space ''X''. When ''d'' = 0, it becomes the Dirichlet process. The discount parameter gives the Pitman–Yor process more flexibility over tail behavior than the Dirichlet process, which has exponential tails. This makes the Pitman–Yor process useful for modeling data with power-law tails (e.g., word frequencies in natural language). The exchangeable random partition induced by the Pitman–Yor process is an example of a Poisson–Kingman partition, and of a Gibbs-type random partition.
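
As an illustration of how the discount and strength parameters shape the weights, the following is a minimal sketch of the standard stick-breaking construction, in which the ''k''-th stick fraction is drawn from Beta(1 − ''d'', ''θ'' + ''kd''). It is not taken from the sources cited here. The function name `pitman_yor_stick_breaking`, the `base_sampler` callback standing in for ''G''0, and the truncation level `num_atoms` are all assumptions made for the example. Truncation means the result only approximates a draw from the process, and the weights come out in size-biased order rather than sorted in decreasing order.

```python
import random


def pitman_yor_stick_breaking(d, theta, base_sampler, num_atoms, rng=None):
    """Truncated stick-breaking sketch of a draw from PY(d, theta, G0).

    base_sampler is a hypothetical callback: it takes a random.Random
    instance and returns one draw from the base distribution G0.
    Returns (atoms, weights); the weights sum to slightly less than 1
    because the infinite sequence is cut off after num_atoms terms.
    """
    if not (0 <= d < 1):
        raise ValueError("discount d must satisfy 0 <= d < 1")
    if theta <= -d:
        raise ValueError("strength theta must satisfy theta > -d")
    rng = rng or random.Random()

    atoms, weights = [], []
    remaining = 1.0  # length of the stick not yet broken off
    for k in range(1, num_atoms + 1):
        # k-th stick fraction: V_k ~ Beta(1 - d, theta + k * d)
        v = rng.betavariate(1.0 - d, theta + k * d)
        weights.append(remaining * v)
        remaining *= 1.0 - v
        atoms.append(base_sampler(rng))
    return atoms, weights


# Example usage with a standard normal base distribution (an assumption).
rng = random.Random(0)
atoms, weights = pitman_yor_stick_breaking(
    d=0.5, theta=1.0, base_sampler=lambda r: r.gauss(0.0, 1.0),
    num_atoms=1000, rng=rng,
)
```

With ''d'' = 0 the stick fractions reduce to Beta(1, ''θ''), which matches the Dirichlet process case mentioned above. Larger values of ''d'' leave more mass in the unbroken remainder at any fixed truncation level, which reflects the heavier tail of the weights.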


Naming conventions

The name "Pitman–Yor process" was coined by Ishwaran and James after Pitman and Yor's review on the subject. However, the process was originally studied by Perman et al. It is also sometimes referred to as the two-parameter Poisson–Dirichlet process, after the two-parameter generalization of the Poisson–Dirichlet distribution, which describes the joint distribution of the sizes of the atoms in the random measure, sorted in strictly decreasing order.


See also

* Chinese restaurant process
* Dirichlet distribution
* Latent Dirichlet allocation

