Midhinge

Midhinge: In statistics, the midhinge (MH) is the average of the first and third quartiles and is thus a measure of location. Equivalently, it is the 25% trimmed mid-range or 25% midsummary; it is an L-estimator. The midhinge is defined as
\begin{align} \operatorname{MH}(X) &= \overline{Q_{1,3}(X)} \\ &= \frac{Q_1(X) + Q_3(X)}{2} \\ &= \frac{P_{25}(X) + P_{75}(X)}{2} \\ &= M_{25}(X). \end{align}
The midhinge is related to the interquartile range (IQR), the difference of the third and first quartiles (i.e. IQR = Q_3 − Q_1), which is a measure of statistical dispersion. The two are complementary in the sense that if one knows the midhinge and the IQR, one can find the first and third quartiles. The use of the term ''hinge'' for the lower or upper quartiles derives from John Tukey's work on exploratory data analysis in the late 1970s (Tukey, J. W. (1977), ''Exploratory Data Analysis'', Addison-Wesley), and ''midhinge'' is a fairly modern term dating from around that time. The midhinge is slightly simpler to calculate than the trimean (TM), which originated in the same context and equals the avera ...
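
The complementarity of the midhinge and the IQR described above can be sketched in a few lines of Python. This is only an illustration: the helper names (`hinges`, `midhinge`, `interquartile_range`, `hinges_from`) and the sample data are made up for the example, and the "inclusive" interpolation rule of `statistics.quantiles` is an assumed quartile convention; other conventions can give slightly different hinges.

```python
import statistics


def hinges(data):
    """Return (Q1, Q3); the 'inclusive' quartile convention is an assumption."""
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q1, q3


def midhinge(data):
    """Average of the first and third quartiles."""
    q1, q3 = hinges(data)
    return (q1 + q3) / 2


def interquartile_range(data):
    """Difference of the third and first quartiles."""
    q1, q3 = hinges(data)
    return q3 - q1


def hinges_from(mh, iqr):
    """Recover (Q1, Q3) from the midhinge and the IQR."""
    return mh - iqr / 2, mh + iqr / 2


sample = [1, 2, 4, 8, 16, 32, 64, 128, 256]  # hypothetical data
mh = midhinge(sample)
iqr = interquartile_range(sample)
print(mh, iqr, hinges_from(mh, iqr))
```

Knowing only `mh` and `iqr`, `hinges_from` returns the same pair that `hinges(sample)` computes directly, which is the sense in which the two summaries are complementary.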
L-estimator: In statistics, an L-estimator (or L-statistic) is an estimator which is a linear combination of order statistics of the measurements. This can be as little as a single point, as in the median (of an odd number of values), or as many as all points, as in the mean. The main benefits of L-estimators are that they are often extremely simple, and often robust statistics: assuming sorted data, they are very easy to calculate and interpret, and are often resistant to outliers. They thus are useful in robust statistics, as descriptive statistics, in statistics education, and when computation is difficult. However, they are inefficient, and in modern robust statistics M-estimators are preferred, although these are much more difficult computationally. In many circumstances L-estimators are reasonably efficient, and thus adequate for initial estimation. Examples: A basic example is the median. Given ''n'' values x_1, \ldots, x_n, if n=2k+1 is odd, the median equals x_{(k+1)}, the (n+1)/2 ...
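
To make "linear combination of order statistics" concrete, the following minimal Python sketch writes the median, the mean and the mid-range as weight vectors applied to the sorted data. The function names and the sample are hypothetical and chosen only for illustration.

```python
def l_estimate(data, weights):
    """Apply one weight per order statistic (i.e. per sorted value)."""
    ordered = sorted(data)
    if len(weights) != len(ordered):
        raise ValueError("need exactly one weight per data point")
    return sum(w * x for w, x in zip(weights, ordered))


def median_weights(n):
    """All weight on the middle order statistic (or the two middle ones)."""
    weights = [0.0] * n
    if n % 2:
        weights[n // 2] = 1.0
    else:
        weights[n // 2 - 1] = weights[n // 2] = 0.5
    return weights


def mean_weights(n):
    """Equal weight on every order statistic."""
    return [1.0 / n] * n


def mid_range_weights(n):
    """Half the weight on the minimum, half on the maximum."""
    weights = [0.0] * n
    weights[0] += 0.5
    weights[-1] += 0.5
    return weights


sample = [7, 1, 100, 3, 5]  # hypothetical data with one large value
n = len(sample)
print(l_estimate(sample, median_weights(n)))
print(l_estimate(sample, mean_weights(n)))
print(l_estimate(sample, mid_range_weights(n)))
```

The median puts all of its weight on a single point, while the mean and the mid-range give weight to the extreme value 100, which is why they react to it.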
Trimean: In statistics, the trimean (TM), or Tukey's trimean, is a measure of a probability distribution's location defined as a weighted average of the distribution's median and its two quartiles: TM = \frac{Q_1 + 2Q_2 + Q_3}{4}. This is equivalent to the arithmetic mean of the median and the midhinge: TM = \frac{1}{2}\left(Q_2 + \frac{Q_1 + Q_3}{2}\right). The foundations of the trimean were part of Arthur Bowley's teachings, and it was later popularized by statistician John Tukey in his 1977 book, which has given its name to a set of techniques called exploratory data analysis. Like the median and the midhinge, but unlike the sample mean, it is a statistically resistant L-estimator with a breakdown point of 25%. Efficiency: Despite its simplicity, the trimean is a remarkably efficient estimator of the population mean. More precisely, for a large data set (over 100 points) from a symmetric population, the average of the 18th, 50th, and 82nd percentiles is the most efficient 3-p ...
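
The two equivalent forms of the trimean can be checked numerically. The sketch below is an illustration under assumptions: the function names and data are invented for the example, and the quartiles come from `statistics.quantiles` with the "inclusive" method, which is only one of several quartile conventions.

```python
import statistics


def trimean(data):
    """Weighted average (Q1 + 2*Q2 + Q3) / 4."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


def trimean_via_midhinge(data):
    """Arithmetic mean of the median and the midhinge."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    midhinge = (q1 + q3) / 2
    return (q2 + midhinge) / 2


sample = [1, 2, 4, 8, 16, 32, 64, 128, 256]  # hypothetical data
print(trimean(sample), trimean_via_midhinge(sample))
```

For this sample the quartiles are 4, 16 and 64, so both forms give (4 + 32 + 64) / 4 = 25.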
Mid-range: In statistics, the mid-range or mid-extreme is a measure of central tendency of a sample defined as the arithmetic mean of the maximum and minimum values of the data set: M = \frac{\max x + \min x}{2}. The mid-range is closely related to the range, a measure of statistical dispersion defined as the difference between maximum and minimum values. The two measures are complementary in the sense that if one knows the mid-range and the range, one can find the sample maximum and minimum values. The mid-range is rarely used in practical statistical analysis, as it lacks efficiency as an estimator for most distributions of interest, because it ignores all intermediate points, and lacks robustness, as outliers change it significantly. Indeed, for many distributions it is one of the least efficient and least robust statistics. However, it finds some use in special cases: it is the maximally efficient estimator for the center of a uniform distribution, trimmed mid-ranges address robustness, and as an L-estima ...
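
The lack of robustness mentioned above is easy to see on a small example. The following Python sketch uses hypothetical helper names and made-up data; it computes the mid-range and range, recovers the extremes from them, and then shows how a single outlier moves the mid-range far more than the median.

```python
import statistics


def mid_range(data):
    """Arithmetic mean of the sample maximum and minimum."""
    return (max(data) + min(data)) / 2


def sample_range(data):
    """Difference between the sample maximum and minimum."""
    return max(data) - min(data)


def extremes_from(mid, spread):
    """Recover (minimum, maximum) from the mid-range and the range."""
    return mid - spread / 2, mid + spread / 2


clean = [2, 4, 6, 8, 10]       # hypothetical data
contaminated = clean + [1000]  # the same data plus one outlier

print(mid_range(clean), sample_range(clean))
print(extremes_from(mid_range(clean), sample_range(clean)))
print(mid_range(contaminated), statistics.median(contaminated))
```

Adding the outlier shifts the mid-range from 6 to 501, whereas the median only moves from 6 to 7.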
Quartile: In statistics, quartiles are a type of quantile that divides the data points into four parts, or ''quarters'', of more-or-less equal size. The data must be ordered from smallest to largest to compute quartiles; as such, quartiles are a form of order statistic. The three quartiles, resulting in four data divisions, are as follows:
* The first quartile (''Q''1) is defined as the 25th percentile, below which the lowest 25% of the data lies. It is also known as the ''lower'' quartile.
* The second quartile (''Q''2) is the median of a data set; thus 50% of the data lies below this point.
* The third quartile (''Q''3) is the 75th percentile, below which the lowest 75% of the data lies. It is also known as the ''upper'' quartile.
Along with the minimum and maximum of the data (which are also quartiles), the three quartiles described above provide a five-number summary of the data. This summary is important in statistics because it provides infor ...
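
As a small illustration of the five-number summary mentioned above, the sketch below computes it in Python. The function name and data are hypothetical, and the quartiles use the "inclusive" interpolation method of `statistics.quantiles`, which is one convention among several; other methods can return different Q1 and Q3 for the same data.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Quartiles use the 'inclusive' interpolation rule, an assumed convention.
    """
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)


sample = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]  # hypothetical data
print(five_number_summary(sample))
```

`statistics.quantiles` sorts its input internally, so the data does not need to be ordered beforehand.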
Interquartile Range: In descriptive statistics, the interquartile range (IQR) is a measure of statistical dispersion, which is the spread of the data. The IQR may also be called the midspread, middle 50%, fourth spread, or H‑spread. It is defined as the difference between the 75th and 25th percentiles of the data. To calculate the IQR, the data set is divided into quartiles, or four rank-ordered even parts via linear interpolation. These quartiles are denoted by ''Q''1 (also called the lower quartile), ''Q''2 (the median), and ''Q''3 (also called the upper quartile). The lower quartile corresponds with the 25th percentile and the upper quartile corresponds with the 75th percentile, so IQR = ''Q''3 − ''Q''1. The IQR is an example of a trimmed estimator, defined as the 25% trimmed range, which enhances the accuracy of dataset statistics by dropping lower-contribution, outlying points. It is also used as a robust measure of scale. It can be clearly visualized by the box on a box plot. Use ...
Statistics: Statistics (from German: ''Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments. When census data (comprising every member of the target population) cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample ...
Location Parameter: In statistics, a location parameter of a probability distribution is a scalar- or vector-valued parameter x_0, which determines the "location" or shift of the distribution. In the literature on location parameter estimation, probability distributions with such a parameter are formally defined in one of the following equivalent ways:
* either as having a probability density function or probability mass function f(x - x_0); or
* having a cumulative distribution function F(x - x_0); or
* being defined as resulting from the random variable transformation x_0 + X, where X is a random variable with a certain, possibly unknown, distribution.
A direct example of a location parameter is the parameter \mu of the normal distribution. To see this, note that the probability density function f(x \mid \mu, \sigma) of a normal distribution \mathcal{N}(\mu,\sigma^2) can have the parameter \mu factored out and be written as: g(x' = x - \mu \mid \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\f ...
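
The normal-distribution example above can be checked numerically: evaluating the density of N(\mu, \sigma^2) at x should agree with evaluating the zero-location density at x - \mu. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `density_via_shift` and the chosen values of `mu`, `sigma` and `x` are arbitrary and only for illustration.

```python
from statistics import NormalDist


def density_via_shift(x, mu, sigma):
    """Evaluate the N(mu, sigma^2) density as g(x - mu | sigma),
    where g is the density of the zero-location normal N(0, sigma^2)."""
    return NormalDist(0.0, sigma).pdf(x - mu)


mu, sigma = 3.0, 2.0  # arbitrary illustrative parameters
for x in (-1.0, 3.0, 4.5):
    direct = NormalDist(mu, sigma).pdf(x)
    shifted = density_via_shift(x, mu, sigma)
    print(x, direct, shifted)
```

Each printed pair should agree up to floating-point rounding, which is what it means for \mu to act purely as a shift.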
Trimmed Estimator: In statistics, a trimmed estimator is an estimator derived from another estimator by excluding some of the extreme values, a process called truncation. This is generally done to obtain a more robust statistic, and the extreme values are considered outliers. Trimmed estimators also often have higher efficiency for mixture distributions and heavy-tailed distributions than the corresponding untrimmed estimator, at the cost of lower efficiency for other distributions, such as the normal distribution. Given an estimator, the x% trimmed version is obtained by discarding the x% lowest or highest observations, or both: it is a statistic on the ''middle'' of the data. For instance, the 5% trimmed mean is obtained by taking the mean of the 5% to 95% range. In some cases a trimmed estimator discards a fixed number of points (such as maximum and minimum) instead of a percentage. Examples: The median is the most trimmed statistic (nominally 50%), as it discards all but the most centra ...
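
A minimal Python sketch of the 5% trimmed mean described above follows. The function name, the rounding rule (the number of points dropped from each end is rounded down) and the sample data are assumptions made for the example; other definitions handle fractional counts differently.

```python
def trimmed_mean(data, percent):
    """Mean of the data after discarding `percent` % of the points
    from each end (count rounded down, an assumed convention)."""
    if not 0 <= percent < 50:
        raise ValueError("percent must be in [0, 50)")
    ordered = sorted(data)
    k = len(ordered) * percent // 100  # points dropped from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


sample = list(range(1, 20)) + [1000]  # 19 ordinary values and one outlier
print(trimmed_mean(sample, 0))  # plain mean, pulled up by the outlier
print(trimmed_mean(sample, 5))  # drops the smallest and the largest value
```

With 20 points, 5% trimming removes one value from each end, so the outlier no longer affects the result.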
Statistical Dispersion: In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed. Common examples of measures of statistical dispersion are the variance, standard deviation, and interquartile range. For instance, when the variance of data in a set is large, the data is widely scattered. On the other hand, when the variance is small, the data in the set is clustered. Dispersion is contrasted with location or central tendency, and together they are the most used properties of distributions. Measures of statistical dispersion: A measure of statistical dispersion is a nonnegative real number that is zero if all the data are the same and increases as the data become more diverse. Most measures of dispersion have the same units as the quantity being measured. In other words, if the measurements are in metres or seconds, so is the measure of dispersion. Examples of dispersion measures include:
* Standard deviation
* Interquartile ...
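
The two properties stated above (zero for identical data, same units as the measurements) can be illustrated with a short Python sketch. The helper `iqr`, the choice of the population standard deviation `statistics.pstdev`, the "inclusive" quartile method and the data are all assumptions made for the example.

```python
import statistics


def iqr(data):
    """Interquartile range under the 'inclusive' quartile convention (assumed)."""
    q1, _q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q3 - q1


constant = [4, 4, 4, 4]               # all values identical
metres = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical lengths in metres
centimetres = [100 * x for x in metres]

for name, measure in (("standard deviation", statistics.pstdev), ("IQR", iqr)):
    # Each measure is zero on constant data and scales with the unit change.
    print(name, measure(constant), measure(metres), measure(centimetres))
```

Converting metres to centimetres multiplies both measures by 100, which is what "same units as the quantity being measured" implies.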
John Tukey: John Wilder Tukey (June 16, 1915 – July 26, 2000) was an American mathematician and statistician, best known for the development of the fast Fourier transform (FFT) algorithm and the box plot. The Tukey range test, the Tukey lambda distribution, the Tukey test of additivity, and the Teichmüller–Tukey lemma all bear his name. He is also credited with coining the term ''bit'' and the first published use of the word ''software''. Biography: Tukey was born in New Bedford, Massachusetts, in 1915, to a father who was a Latin teacher and a mother who was a private tutor. He was mainly taught by his mother and attended regular classes only for certain subjects like French. Tukey obtained a B.A. in 1936 and M.S. in 1937 in chemistry, from Brown University, before moving to Princeton University, where in 1939 he received a PhD in mathematics after completing a doctoral dissertation titled "On denumerability in topology". During World War II, Tukey worked at the Fire Control Research Office and coll ...
Exploratory Data Analysis: In statistics, exploratory data analysis (EDA) is an approach of analyzing data sets to summarize their main characteristics, often using statistical graphics and other data visualization methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell beyond the formal modeling and thereby contrasts with traditional hypothesis testing, in which a model is supposed to be selected before the data is seen. Exploratory data analysis has been promoted by John Tukey since 1970 to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments. EDA is different from initial data analysis (IDA), which focuses more narrowly on checking assumptions required for model fitting and hypothesis testing, and handling missing values and making transformations of variables as needed. EDA encompasses IDA. Overview: Tukey defined data analysi ...