In statistics, Tschuprow's ''T'' is a measure of association between two nominal variables, giving a value between 0 and 1 (inclusive). It is closely related to Cramér's V, coinciding with it for square contingency tables.
It was published by Alexander Tschuprow (alternative spelling: Chuprov) in 1939.[Tschuprow, A. A. (1939) ''Principles of the Mathematical Theory of Correlation''; translated by M. Kantorowitsch. W. Hodge & Co.]
Definition
For an ''r'' × ''c'' contingency table with ''r'' rows and ''c'' columns, let $\pi_{ij}$ be the proportion of the population in cell $(i,j)$ and let

:$\pi_{i+} = \sum_{j=1}^{c} \pi_{ij}$ and $\pi_{+j} = \sum_{i=1}^{r} \pi_{ij}.$

Then the mean square contingency is given as

:$\phi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(\pi_{ij} - \pi_{i+}\pi_{+j})^2}{\pi_{i+}\pi_{+j}},$

and Tschuprow's ''T'' as

:$T = \sqrt{\frac{\phi^2}{\sqrt{(r-1)(c-1)}}}.$
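To make the definition concrete, here is a minimal Python sketch (assuming NumPy is available; the 2 × 3 table of proportions is invented purely for illustration) that computes $\phi^2$ and ''T'' directly from a matrix of cell proportions:

import numpy as np

# Hypothetical 2 x 3 table of population cell proportions pi_ij (entries sum to 1)
pi = np.array([[0.2, 0.1, 0.2],
               [0.1, 0.3, 0.1]])

pi_row = pi.sum(axis=1, keepdims=True)   # row marginals pi_{i+}
pi_col = pi.sum(axis=0, keepdims=True)   # column marginals pi_{+j}
expected = pi_row @ pi_col               # pi_{i+} * pi_{+j}: cell proportions under independence

phi2 = ((pi - expected) ** 2 / expected).sum()   # mean square contingency
r, c = pi.shape
T = np.sqrt(phi2 / np.sqrt((r - 1) * (c - 1)))   # Tschuprow's T

print(f"phi^2 = {phi2:.4f}, T = {T:.4f}")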
Properties
''T'' equals zero if and only if independence holds in the table, i.e., if and only if $\pi_{ij} = \pi_{i+}\pi_{+j}$ for all ''i'' and ''j''. ''T'' equals one if and only if there is perfect dependence in the table, i.e., if and only if for each ''i'' there is only one ''j'' such that $\pi_{ij} > 0$ and vice versa. Hence, it can only equal 1 for square tables. In this it differs from Cramér's V, which can be equal to 1 for any rectangular table.
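To illustrate the difference with a made-up example, consider the 2 × 3 table of proportions

:$\pi = \begin{pmatrix} 1/3 & 1/3 & 0 \\ 0 & 0 & 1/3 \end{pmatrix}.$

Each column determines its row uniquely, and a direct computation gives $\phi^2 = 1$, so Cramér's V $= \sqrt{\phi^2 / \min(r-1,\,c-1)} = 1$, while $T = \sqrt{1/\sqrt{2}} \approx 0.84$. Because the table is not square, the rows cannot also determine the columns, and ''T'' stays strictly below 1.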
Estimation
If we have a multinomial sample of size ''n'', the usual way to estimate ''T'' from the data is via the formula

:$\hat{T} = \sqrt{\frac{\sum_{i,j} \frac{(p_{ij} - p_{i+}p_{+j})^2}{p_{i+}p_{+j}}}{\sqrt{(r-1)(c-1)}}},$

where $p_{ij} = n_{ij}/n$ is the proportion of the sample in cell $(i,j)$. This is the empirical value of ''T''. With $\chi^2$ the Pearson chi-square statistic, this formula can also be written as

:$\hat{T} = \sqrt{\frac{\chi^2/n}{\sqrt{(r-1)(c-1)}}}.$
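In practice the chi-square form is the most convenient. A minimal sketch using SciPy (the table of observed counts below is invented; scipy.stats.chi2_contingency returns the Pearson statistic among its outputs):

import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2 x 3 table of observed counts n_ij
counts = np.array([[30, 15, 5],
                   [10, 25, 15]])

n = counts.sum()
chi2, _, _, _ = chi2_contingency(counts, correction=False)  # Pearson chi-square

r, c = counts.shape
T_hat = np.sqrt((chi2 / n) / np.sqrt((r - 1) * (c - 1)))    # empirical Tschuprow's T
print(f"T_hat = {T_hat:.4f}")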
See also
Other measures of correlation for nominal data:
* Cramér's V
* Phi coefficient
* Uncertainty coefficient
* Lambda coefficient
Other related articles:
* Effect size
References
* Liebetrau, A. (1983). ''Measures of Association'' (Quantitative Applications in the Social Sciences). Sage Publications.