In general terms, twisting properties are properties of samples, identified with suitable statistics, that can be exchanged with properties of the parameters of the distribution that generated the sample.

Description

Starting with a sample

\{x_1,\ldots,x_m\}

observed from a random variable ''X'' having a given distribution law with an unknown parameter, a parametric inference problem consists of computing suitable values (call them estimates) of this parameter on the basis of the sample alone. An estimate is suitable if substituting it for the unknown parameter does not cause major damage in subsequent computations. In algorithmic inference, the suitability of an estimate is read in terms of compatibility with the observed sample. In turn, parameter compatibility is a probability measure that we derive from the probability distribution of the random variable to which the parameter refers. In this way we identify a random parameter Θ compatible with an observed sample. Given a sampling mechanism

M_X=(g_\theta,Z)

, the rationale of this operation lies in using the ''Z'' seed distribution law to determine both the ''X'' distribution law for the given θ and the Θ distribution law given an ''X'' sample. Hence, we may derive the latter distribution directly from the former if we are able to relate domains of the sample space to subsets of the Θ support. In more abstract terms, we speak of twisting properties of samples with properties of parameters, and we identify the former with statistics that are suitable for this exchange, that is, statistics that are well behaved with respect to the unknown parameters. The operational goal is to write the analytic expression of the cumulative distribution function

F_\Theta(\theta)

, in light of the observed value ''s'' of a statistic ''S'', as a function of the ''S'' distribution law when the ''X'' parameter is exactly θ.

Method

Given a sampling mechanism

M_X=(g_\theta,Z)

for the random variable ''X'', we model

\boldsymbol X=\{X_1,\ldots,X_m\}

to be equal to

\{g_\theta(Z_1),\ldots,g_\theta(Z_m)\}

. Focusing on a relevant statistic

S=h_1(X_1,\ldots,X_m)

for the parameter ''θ'', the master equation reads :

s= h_1(g_\theta(z_1),\ldots, g_\theta(z_m))= \rho(\theta;z_1,\ldots,z_m).
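As an illustration of the master equation, the following minimal sketch assumes an exponential random variable with rate parameter `lam`, uniform seeds, and the sum of the sample as the statistic. None of these choices comes from the text, and the names `g`, `rho` and `solve_master_equation` are hypothetical. It shows that, for fixed seeds, the statistic moves monotonically with the parameter, and that the equation can be inverted to find the parameter value compatible with an observed ''s''.

```python
import math
import random


def g(lam, z):
    """Explaining function: turn a seed z in (0, 1] into an exponential draw with rate lam."""
    return -math.log(z) / lam


def rho(lam, seeds):
    """Right-hand side of the master equation for the statistic s = x_1 + ... + x_m."""
    return sum(g(lam, z) for z in seeds)


def solve_master_equation(s, seeds):
    """Value of lam that maps the given seeds onto the observed statistic s."""
    return sum(-math.log(z) for z in seeds) / s


rng = random.Random(0)
seeds = [1.0 - rng.random() for _ in range(10)]  # uniform on (0, 1], avoids log(0)

# For fixed seeds the statistic decreases as lam grows.
values = [rho(lam, seeds) for lam in (0.5, 1.0, 2.0, 4.0)]
print(values)

# Inverting the master equation gives the parameter compatible with s for these seeds.
s_observed = 5.0  # assumed observed value, chosen only for illustration
lam_compatible = solve_master_equation(s_observed, seeds)
print(lam_compatible, rho(lam_compatible, seeds))
```

Repeating the inversion over many independent seed vectors would give an empirical picture of the distribution of the random parameter for the fixed observed statistic.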

When ''s'' is a well-behaved statistic with respect to the parameter, we are sure that a monotone relation exists for each

\boldsymbol z=\{z_1,\ldots,z_m\}

between ''s'' and θ. We are also assured that Θ, as a function of

\boldsymbol Z

for given ''s'', is a random variable, since the master equation provides solutions that are feasible and independent of other (hidden) parameters. (By default, capital letters such as ''U'' and ''X'' denote random variables, and small letters such as ''u'' and ''x'' their corresponding realizations.) The direction of the monotonicity determines for any

\boldsymbol z

a relation between events of the type

s\geq s'\leftrightarrow \theta\geq \theta'

or ''vice versa''

s\geq s'\leftrightarrow \theta\leq \theta'

, where

s'

is computed by the master equation with

\theta'

. In the case that ''s'' assumes discrete values the first relation changes into

s\geq s'+\ell\rightarrow \theta\geq \theta'\rightarrow s\geq s'

where

\ell>0

is the size of the ''s'' discretization grain, and likewise for the opposite monotonicity. Summarizing these relations over all seeds, for ''s'' continuous we have either :

F_{\Theta\mid S=s}(\theta)= F_{S\mid \Theta=\theta}(s)

or :

F_{\Theta\mid S=s}(\theta)= 1-F_{S\mid \Theta=\theta}(s)

For ''s'' discrete we have an interval where

F_{\Theta\mid S=s}(\theta)

lies, because of

\ell>0

. The whole logical contrivance is called a twisting argument. A procedure implementing it is as follows.

Algorithm
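The following minimal sketch is one way to turn a twisting argument into code. It is not the article's own listing. It assumes that the cumulative distribution function of ''S'' for a given θ is available as a function, that the direction of the monotonicity is known, and that a discrete statistic has a constant grain. The names `twisted_cdf_bounds` and `binomial_cdf` are hypothetical. For a continuous statistic the two bounds coincide; for a discrete one they delimit the interval in which the parameter's distribution function lies.

```python
import math


def twisted_cdf_bounds(cdf_s, s, nondecreasing, grain=None):
    """Bounds (low, high) on F_Theta(theta), computed from the CDF of S at this theta.

    cdf_s         -- function t -> P(S <= t) when the parameter equals theta
    s             -- observed value of the statistic
    nondecreasing -- True if s does not decrease with theta, False if it does not increase
    grain         -- discretization grain of s, or None when s is continuous
    """
    at_s = cdf_s(s)
    below_s = at_s if grain is None else cdf_s(s - grain)
    if nondecreasing:
        return 1.0 - at_s, 1.0 - below_s
    return below_s, at_s


def binomial_cdf(t, m, p):
    """P(S <= t) for S counting the successes in m Bernoulli(p) trials."""
    if t < 0:
        return 0.0
    upper = min(int(t), m)
    return sum(math.comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(upper + 1))


# Assumed example: 3 successes observed in 10 Bernoulli trials, bounds evaluated at p = 0.3.
m, s = 10, 3
low, high = twisted_cdf_bounds(lambda t: binomial_cdf(t, m, 0.3), s,
                               nondecreasing=True, grain=1)
print(low, high)
```

Evaluating the bounds on a grid of parameter values traces the two curves that enclose the distribution function of the parameter.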

Remark

The rationale behind twisting arguments does not change when parameters are vectors, though some complication arises from the management of joint inequalities. By contrast, the difficulty of dealing with a vector of parameters proved to be the Achilles heel of Fisher's approach to the fiducial distribution of parameters. Fraser's constructive probabilities, devised for the same purpose, do not treat this point completely either.

Example

For

\boldsymbol x

drawn from a gamma distribution, whose specification requires values for the parameters λ and ''k'', a twisting argument may be stated by following the procedure below. Given the meaning of these parameters we know that :

(k\leq k')\leftrightarrow(s_k \leq s_{k'}) \text{ for fixed } \lambda,

(\lambda\leq\lambda')\leftrightarrow(s_{\lambda'}\leq s_\lambda) \text{ for fixed } k,

where

s_k=\prod_{i=1}^m x_i

and

s_\lambda=\sum_{i=1}^m x_i

. This leads to a joint cumulative distribution function :

F_{\Lambda,K}(\lambda,k)=F_{\Lambda\mid k}(\lambda) F_K(k) = F_{K\mid\lambda}(k) F_\Lambda(\lambda).

Using the first factorization and replacing

s_k

with

r_k=\frac{s_k}{s_\lambda^m}

in order to have a distribution of

K

that is independent of

\Lambda

, we have :

F_{\Lambda\mid k}(\lambda)=1 - \frac{\Gamma(k m, \lambda s_\Lambda)}{\Gamma(k m)}

F_K(k)=1-F_{R_k}(r_K)

where ''m'' denotes the sample size,

s_\Lambda

and

r_K

are the observed statistics (hence with indices denoted by capital letters),

\Gamma(a,b)

is the incomplete gamma function and

F_{R_k}(r_K)

is the Fox's H function, which can again be approximated by a gamma distribution with suitable parameters (for instance, estimated through the method of moments) as a function of ''k'' and ''m''. With a sample size

m=30, s_\Lambda=72.82

and

r_K=

4.5\times 10^

, you may find the joint p.d.f. of the Gamma parameters ''K'' and

\Lambda

on the left. The marginal distribution of ''K'' is reported in the picture on the right.

Notes

References

* Apolloni, B.; Malchiodi, D.; Gaito, S. (2006). ''Algorithmic Inference in Machine Learning''. International Series on Advanced Intelligence. Vol. 5 (2nd ed.). Adelaide: Magill. Advanced Knowledge International.