Unsolved Problems In Statistics

There are many longstanding unsolved problems in mathematics for which a solution has not yet been found. The notable unsolved problems in statistics are generally of a different flavor; according to John Tukey, "difficulties in identifying problems have delayed statistics far more than difficulties in solving problems." A list of "one or two open problems" (in fact 22 of them) was given by David Cox.


Inference and testing

* How to detect and correct for systematic errors, especially in sciences where random errors are large (a situation Tukey termed uncomfortable science).
* The Graybill–Deal estimator is often used to estimate the common mean of two normal populations with unknown and possibly unequal variances. Though this estimator is generally unbiased, its admissibility remains to be shown.
* Meta-analysis: Though independent p-values can be combined using Fisher's method, techniques are still being developed to handle the case of dependent p-values.
* Behrens–Fisher problem: Yuri Linnik showed in 1966 that there is no uniformly most powerful test for the difference of two means when the variances are unknown and possibly unequal. That is, there is no exact test (one that, if the means are in fact equal, rejects the null hypothesis with probability exactly α) that is also the most powerful for all values of the variances (which are thus nuisance parameters). Though there are many approximate solutions (such as Welch's t-test), the problem continues to attract attention as one of the classic problems in statistics.
* Multiple comparisons: There are various ways to adjust p-values to compensate for the simultaneous or sequential testing of hypotheses. Of particular interest is how to simultaneously control the overall error rate, preserve statistical power, and incorporate the dependence between tests into the adjustment. These issues are especially relevant when the number of simultaneous tests can be very large, as is increasingly the case in the analysis of data from DNA microarrays.
* Bayesian statistics: A list of open problems in Bayesian statistics has been proposed.
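
The meta-analysis item above mentions Fisher's method for combining independent p-values. As a minimal sketch (the function name `fisher_combined_p` is illustrative, not taken from the article): the statistic X = -2 Σ ln p_i follows a chi-squared distribution with 2k degrees of freedom when all k null hypotheses hold and the tests are independent. Because the degrees of freedom are even, the tail probability has a closed form and needs only the standard library. The sketch does nothing for the dependent case, which is the open problem.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (illustrative helper)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); we keep X / 2 for convenience.
    half_x = -sum(math.log(p) for p in p_values)
    # For 2k degrees of freedom the chi-squared survival function is
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


if __name__ == "__main__":
    # Two independent tests, each with p = 0.05: combined p is about 0.0175.
    print(fisher_combined_p([0.05, 0.05]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the formula.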


Experimental design

* Because the theory of Latin squares is a cornerstone of the design of experiments, solving open problems about Latin squares could have immediate applicability to experimental design.
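
For readers unfamiliar with the object in question: a Latin square of order n is an n × n array filled with n symbols so that each symbol occurs exactly once in every row and every column. The sketch below is only an illustration of that definition; the helper names `is_latin_square` and `cyclic_latin_square` are hypothetical, and the cyclic construction is just one simple way to obtain a Latin square, not a design recommended by the article.

```python
def is_latin_square(square):
    """Return True if `square` (a list of rows) is a Latin square."""
    n = len(square)
    if n == 0 or any(len(row) != n for row in square):
        return False
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    # Every row and every column must contain each symbol exactly once.
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def cyclic_latin_square(n):
    """Build the order-n Latin square whose (r, c) entry is (r + c) mod n."""
    return [[(r + c) % n for c in range(n)] for r in range(n)]


if __name__ == "__main__":
    for row in cyclic_latin_square(4):
        print(row)
```

In an experiment, rows and columns would typically stand for two blocking factors and the symbols for treatments, so each treatment appears once per level of each blocking factor.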


Problems of a more philosophical nature

* Sampling of species problem: How is a probability updated when there is unanticipated new data?
* Doomsday argument: How valid is the probabilistic argument that claims to predict the future lifetime of the human race given only an estimate of the total number of humans born so far?
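
The arithmetic behind the Doomsday argument can be sketched as follows. It assumes the contested premise that a person's birth rank n, as a fraction of the total number N of humans who will ever be born, is uniformly distributed on (0, 1]. Under that premise, n/N ≥ 1 − c holds with probability c, so N ≤ n / (1 − c) with probability c. The function name and the birth count used below are illustrative assumptions, not figures from the article; the open question is whether the premise is valid, not the calculation.

```python
def doomsday_upper_bound(births_so_far, confidence=0.95):
    """Upper bound on total births N under the uniform-rank assumption."""
    if births_so_far <= 0:
        raise ValueError("births_so_far must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # P(n / N >= 1 - c) = c  implies  N <= n / (1 - c) with probability c.
    return births_so_far / (1.0 - confidence)


if __name__ == "__main__":
    # Hypothetical input: 60 billion births so far (an assumed round number).
    print(doomsday_upper_bound(60e9))
```

At 95% confidence the bound is simply 20 times the number of births so far, whatever that number is taken to be.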

