Bhattacharyya Distance

In statistics, the Bhattacharyya distance is a quantity which represents a notion of similarity between two probability distributions. It is closely related to the Bhattacharyya coefficient, which is a measure of the amount of overlap between two statistical samples or populations. It is not a metric, despite being named a "distance", since it does not obey the triangle inequality.


History

Both the Bhattacharyya distance and the Bhattacharyya coefficient are named after Anil Kumar Bhattacharyya, a statistician who worked in the 1930s at the Indian Statistical Institute. He developed the concept through a series of papers. He first devised a method to measure the distance between two non-normal distributions, illustrating it with the classical multinomial populations; although this work was submitted for publication in 1941, it appeared almost five years later in Sankhya. Bhattacharyya then worked toward a distance measure for probability distributions that are absolutely continuous with respect to the Lebesgue measure, publishing his progress in 1942 in the Proceedings of the Indian Science Congress; the final work appeared in 1943 in the Bulletin of the Calcutta Mathematical Society.


Definition

For probability distributions P and Q on the same discrete domain \mathcal{X}, the Bhattacharyya distance is defined as

:D_B(P,Q) = -\ln \left( BC(P,Q) \right)

where

:BC(P,Q) = \sum_{x\in\mathcal{X}} \sqrt{P(x)\,Q(x)}

is the Bhattacharyya coefficient for discrete probability distributions.

For continuous probability distributions, with P(dx) = p(x)\,dx and Q(dx) = q(x)\,dx where p(x) and q(x) are the probability density functions, the Bhattacharyya coefficient is defined as

:BC(P,Q) = \int_{\mathcal{X}} \sqrt{p(x)\,q(x)}\, dx.

More generally, given two probability measures P, Q on a measurable space (\mathcal X, \mathcal B), let \lambda be a (\sigma-finite) measure such that P and Q are absolutely continuous with respect to \lambda, i.e. such that P(dx) = p(x)\,\lambda(dx) and Q(dx) = q(x)\,\lambda(dx) for probability density functions p, q with respect to \lambda defined \lambda-almost everywhere. Such a measure, even such a probability measure, always exists, e.g. \lambda = \tfrac12(P + Q). Then define the Bhattacharyya measure on (\mathcal X, \mathcal B) by

:bc(dx \mid P,Q) = \sqrt{p(x)\,q(x)}\, \lambda(dx).

It does not depend on the measure \lambda, for if we choose a measure \mu such that \lambda and another choice \lambda' are absolutely continuous with respect to \mu, i.e. \lambda = l(x)\,\mu and \lambda' = l'(x)\,\mu, then

:P(dx) = p(x)\,\lambda(dx) = p'(x)\,\lambda'(dx) = p(x)\,l(x)\,\mu(dx) = p'(x)\,l'(x)\,\mu(dx),

and similarly for Q. We then have

:bc(dx \mid P,Q) = \sqrt{p(x)\,q(x)}\, \lambda(dx) = \sqrt{p(x)\,q(x)}\, l(x)\,\mu(dx) = \sqrt{p(x)l(x)\,q(x)l(x)}\, \mu(dx) = \sqrt{p'(x)l'(x)\,q'(x)l'(x)}\, \mu(dx) = \sqrt{p'(x)\,q'(x)}\, \lambda'(dx).

We finally define the Bhattacharyya coefficient

:BC(P,Q) = \int_{\mathcal X} bc(dx \mid P,Q) = \int_{\mathcal X} \sqrt{p(x)\,q(x)}\, \lambda(dx).

By the above, the quantity BC(P,Q) does not depend on \lambda, and by the Cauchy–Schwarz inequality 0 \le BC(P,Q) \le 1. Using Q(dx) = q(x)\,\lambda(dx),

:BC(P,Q) = \int_{\mathcal X} \sqrt{\frac{p(x)}{q(x)}}\, Q(dx) = E_Q\!\left[\sqrt{\frac{p(X)}{q(X)}}\right].
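As a quick illustration, the discrete definitions above can be computed directly. The following is a minimal sketch in Python (the function names are our own, not from any library):

```python
import math

def bhattacharyya_coefficient(p, q):
    """BC(P, Q) = sum over x of sqrt(P(x) * Q(x)), for discrete
    distributions given as probability sequences over a shared support."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

def bhattacharyya_distance(p, q):
    """D_B(P, Q) = -ln BC(P, Q); infinite when the supports are disjoint."""
    bc = bhattacharyya_coefficient(p, q)
    return math.inf if bc == 0.0 else -math.log(bc)
```

For identical distributions BC = 1 and D_B = 0; for distributions with disjoint supports BC = 0 and D_B is infinite.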


Gaussian case

Let p\sim\mathcal{N}(\mu_p,\sigma_p^2) and q\sim\mathcal{N}(\mu_q,\sigma_q^2), where \mathcal{N}(\mu,\sigma^2) is the normal distribution with mean \mu and variance \sigma^2; then

:D_B(p,q) = \frac{1}{4} \frac{(\mu_p-\mu_q)^2}{\sigma_p^2+\sigma_q^2} + \frac{1}{2} \ln\left(\frac{\sigma_p^2+\sigma_q^2}{2\sigma_p\sigma_q}\right).

And in general, given two multivariate normal distributions p_i=\mathcal{N}(\boldsymbol\mu_i,\,\boldsymbol\Sigma_i),

:D_B(p_1, p_2) = \frac{1}{8}(\boldsymbol\mu_1-\boldsymbol\mu_2)^T \boldsymbol\Sigma^{-1}(\boldsymbol\mu_1-\boldsymbol\mu_2) + \frac{1}{2}\ln\left(\frac{\det\boldsymbol\Sigma}{\sqrt{\det\boldsymbol\Sigma_1\,\det\boldsymbol\Sigma_2}}\right),

where \boldsymbol\Sigma = \frac{\boldsymbol\Sigma_1+\boldsymbol\Sigma_2}{2}. Note that the first term is a squared Mahalanobis distance.
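The univariate closed form is easy to evaluate numerically; here is a small sketch (a helper of our own, which takes variances rather than standard deviations):

```python
import math

def bhattacharyya_gaussian(mu_p, var_p, mu_q, var_q):
    """Closed-form D_B between N(mu_p, var_p) and N(mu_q, var_q)."""
    mean_term = 0.25 * (mu_p - mu_q) ** 2 / (var_p + var_q)
    var_term = 0.5 * math.log((var_p + var_q) / (2.0 * math.sqrt(var_p * var_q)))
    return mean_term + var_term
```

The distance vanishes only when the two Gaussians coincide, and it is symmetric in its arguments, unlike e.g. the Kullback–Leibler divergence.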


Properties

0 \le BC \le 1 and 0 \le D_B \le \infty. D_B does not obey the triangle inequality, though the Hellinger distance \sqrt{1 - BC(P,Q)} does.
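The failure of the triangle inequality can already be seen on a two-point support; a sketch with point masses chosen purely for illustration:

```python
import math

def bc(p, q):
    # Bhattacharyya coefficient for discrete distributions.
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

P, Q, R = [1.0, 0.0], [0.5, 0.5], [0.0, 1.0]

# P and R have disjoint supports, so BC(P, R) = 0 and D_B(P, R) is infinite,
# while D_B(P, Q) + D_B(Q, R) is finite (ln 2, about 0.693): the triangle
# inequality fails for D_B.
d_pq = -math.log(bc(P, Q))
d_qr = -math.log(bc(Q, R))

# The Hellinger distance sqrt(1 - BC) does satisfy it on the same triple.
def hellinger(p, q):
    return math.sqrt(1.0 - bc(p, q))
```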


Bounds on Bayes error

The Bhattacharyya distance can be used to upper and lower bound the Bayes error rate:

:\frac{1}{2} - \frac{1}{2}\sqrt{1 - 4\rho^2} \leq L^* \leq \rho,

where \rho = \mathbb{E}\left[\sqrt{\eta(X)(1-\eta(X))}\right] and \eta(X) = \mathbb{P}(Y=1 \mid X) is the posterior probability.
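A small worked check of these bounds, on a hypothetical two-class problem with a binary feature (all numbers are illustrative choices, not from the literature):

```python
import math

# Equal priors; class-conditional probabilities of X in {0, 1} (illustrative).
prior = 0.5
p_x_given_y0 = [0.8, 0.2]
p_x_given_y1 = [0.2, 0.8]

# Marginal P(X = x) and posterior eta(x) = P(Y = 1 | X = x).
p_x = [prior * a + prior * b for a, b in zip(p_x_given_y0, p_x_given_y1)]
eta = [prior * b / px for b, px in zip(p_x_given_y1, p_x)]

rho = sum(px * math.sqrt(e * (1 - e)) for px, e in zip(p_x, eta))
bayes_error = sum(px * min(e, 1 - e) for px, e in zip(p_x, eta))
lower = 0.5 - 0.5 * math.sqrt(1 - 4 * rho ** 2)
```

Here \rho = 0.4 and the Bayes error is 0.2; because the posterior \eta(x) takes the same distance from 1/2 at every x, the lower bound is tight at 0.2, while the upper bound is \rho = 0.4.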


Applications

The Bhattacharyya coefficient quantifies the "closeness" of two random statistical samples. Given two sequences of samples from distributions P and Q, bin them into n buckets, let the frequency of samples from P in bucket i be p_i, and similarly for q_i; then the sample Bhattacharyya coefficient is

:BC(\mathbf{p},\mathbf{q}) = \sum_{i=1}^n \sqrt{p_i q_i},

which is an estimator of BC(P, Q). The quality of the estimate depends on the choice of buckets: too few buckets overestimate BC(P, Q), while too many underestimate it.

A common task in classification is estimating the separability of classes. Up to a multiplicative factor, the squared Mahalanobis distance is a special case of the Bhattacharyya distance when the two classes are normally distributed with the same variances. When two classes have similar means but significantly different variances, the Mahalanobis distance is close to zero, while the Bhattacharyya distance is not.

The Bhattacharyya coefficient is used in the construction of polar codes. The Bhattacharyya distance is used in feature extraction and selection, image processing (François Goudail, Philippe Réfrégier, Guillaume Delyon, "Bhattacharyya distance as a contrast parameter for statistical processing of noisy optical images", ''JOSA A'', Vol. 21, Issue 7, pp. 1231−1240, 2004), speaker recognition (Chang Huai You, "An SVM Kernel With GMM-Supervector Based on the Bhattacharyya Distance for Speaker Recognition", ''IEEE Signal Processing Letters'', Vol. 16, Issue 1, pp. 49−52), phone clustering (Mak, B., "Phone clustering using the Bhattacharyya distance", Proceedings of the Fourth International Conference on Spoken Language Processing (ICSLP 96), Vol. 4, pp. 2005−2008, 3−6 Oct 1996), and in genetics.
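The binned estimator described above can be sketched as follows (a minimal helper of our own; the bin count is the tuning choice noted above):

```python
import math
import random

def sample_bc(xs, ys, n_bins=20):
    """Histogram estimate of the Bhattacharyya coefficient from two
    one-dimensional samples, using shared equal-width bins."""
    lo = min(min(xs), min(ys))
    hi = max(max(xs), max(ys))
    width = (hi - lo) / n_bins or 1.0  # guard against a degenerate range

    def freqs(zs):
        counts = [0] * n_bins
        for z in zs:
            counts[min(int((z - lo) / width), n_bins - 1)] += 1
        return [c / len(zs) for c in counts]

    return sum(math.sqrt(p * q) for p, q in zip(freqs(xs), freqs(ys)))

random.seed(0)
near = [random.gauss(0.0, 1.0) for _ in range(5000)]
far = [random.gauss(3.0, 1.0) for _ in range(5000)]
# sample_bc(near, near) is exactly 1; sample_bc(near, far) is well below 1,
# reflecting the small overlap between N(0, 1) and N(3, 1).
```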


See also

* Bhattacharyya angle
* Kullback–Leibler divergence
* Hellinger distance
* Mahalanobis distance
* Chernoff bound
* Rényi entropy
* F-divergence
* Fidelity of quantum states


References


External links

* Statistical Intuition of Bhattacharyya's distance


* Nielsen, F.; Boltz, S. (2010). "The Burbea–Rao and Bhattacharyya centroids". ''IEEE Transactions on Information Theory''. 57 (8): 5455–5466.
* Kailath, T. (1967). "The Divergence and Bhattacharyya Distance Measures in Signal Selection". ''IEEE Transactions on Communication Technology''. 15 (1): 52–60.
* Djouadi, A.; Snorrason, O.; Garber, F. (1990). "The quality of Training-Sample estimates of the Bhattacharyya coefficient". ''IEEE Transactions on Pattern Analysis and Machine Intelligence''. 12 (1): 92–97.