Robust Regression
In robust statistics, robust regression seeks to overcome some limitations of traditional regression analysis. A regression analysis models the relationship between one or more independent variables and a dependent variable. Standard types of regression, such as ordinary least squares, have favourable properties if their underlying assumptions are true, but can give misleading results otherwise (i.e. are not robust to assumption violations). Robust regression methods are designed to limit the effect that violations of assumptions by the underlying data-generating process have on regression estimates. For example, least squares estimates for regression models are highly sensitive to outliers: an outlier with twice the error magnitude of a typical observation contributes four (two squared) times as much to the squared error loss, and therefore has more leverage over the regression estimates. The Huber loss function is a robust alternative to standard squared error loss that r ...
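To make the quadratic-versus-linear contrast concrete, here is a minimal sketch of the Huber loss in Python (the threshold delta is an assumed tuning constant; 1.345 is a common choice for near-normal errors, but any positive value illustrates the idea):

import numpy as np

def huber_loss(residuals, delta=1.0):
    # Quadratic for |r| <= delta, linear beyond it; the two pieces are
    # matched so the loss and its derivative are continuous at |r| = delta.
    r = np.abs(np.asarray(residuals, dtype=float))
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))

Beyond delta the loss grows linearly rather than quadratically in the residual, which is what keeps a gross outlier from dominating the fit.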


Robust Statistics
Robust statistics are statistics that maintain their properties even if the underlying distributional assumptions are incorrect. Robust statistical methods have been developed for many common problems, such as estimating location, scale, and regression parameters. One motivation is to produce statistical methods that are not unduly affected by outliers. Another motivation is to provide methods with good performance when there are small departures from a parametric distribution. For example, robust methods work well for mixtures of two normal distributions with different standard deviations; under this model, non-robust methods like a t-test work poorly. Robust statistics seek to provide methods that emulate popular statistical methods, but are not unduly affected by outliers or other small departures from model assumptions. In statistics, classical e ...
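The mixture example is easy to check by simulation. The sketch below (the 0.9/0.1 contamination weights, the standard deviations 1 and 10, and the sample size are assumed values chosen only for illustration) compares the sampling variability of the non-robust sample mean with that of the sample median, a simple robust alternative, under such a mixture:

import numpy as np

rng = np.random.default_rng(0)
reps, n = 5000, 50

# Contaminated normal: most observations from N(0, 1), a few from N(0, 10).
is_outlier = rng.random((reps, n)) < 0.1
data = np.where(is_outlier,
                rng.normal(0.0, 10.0, (reps, n)),
                rng.normal(0.0, 1.0, (reps, n)))

print("std of sample means:  ", data.mean(axis=1).std())
print("std of sample medians:", np.median(data, axis=1).std())

Under this contamination the sample mean is markedly more variable than the median, which is the kind of efficiency loss that makes mean-based procedures such as the t-test perform poorly here.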


Stromberg2004
Stromberg, Strömberg, Strømberg, or Stroemberg may refer to:

Places (Germany)
* Stromberg, Oelde, a town in Oelde
* Stromberg (landscape), a region in Baden-Württemberg
** Stromberg-Heuchelberg Nature Park
* Stromberg (Siebengebirge), a mountain peak now better known as Petersberg
* Stromberg (Verbandsgemeinde), a collective municipality in the district of Bad Kreuznach in Rheinland-Pfalz
** Stromberg (Hunsrück), a town and the seat of the collective municipality

People
* Stromberg (surname), people with the surname of any spelling variant

Corporate
* Stromberg, a carburettor brand name used by the Zenith Carburettor Company (British) and the Bendix Corporation
* Strömberg (company), a Finnish manufacturer of electronic products
* Stromberg Guitars, an American company producing guitars, mainly for jazz musicians, between 1906 and 1955
* Stromberg-Carlson, an American manufacturer of telephone equipment, radios and television
* Stromberg-Voisinet, manufacturer of musica ...


Mixture Distribution
In probability and statistics, a mixture distribution is the probability distribution of a random variable that is derived from a collection of other random variables as follows: first, a random variable is selected by chance from the collection according to given probabilities of selection, and then the value of the selected random variable is realized. The underlying random variables may be random real numbers, or they may be random vectors (each having the same dimension), in which case the mixture distribution is a multivariate distribution. In cases where each of the underlying random variables is continuous, the outcome variable will also be continuous and its probability density function is sometimes referred to as a mixture density. The cumulative distribution function (and the probability density function if it exists) can be expressed as a convex combination (i.e. a weighted sum, with non-negative weights that sum to 1) of other distribution functions and density functi ...
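The two-stage construction described above maps directly onto code. In the sketch below the component weights, means, and standard deviations are arbitrary values assumed for illustration; sampling first picks a component index according to the selection probabilities and then realizes the chosen normal component, while the mixture density is the corresponding convex combination of component densities:

import numpy as np

rng = np.random.default_rng(0)

# Assumed example mixture: two normal components with selection probabilities 0.7 and 0.3.
weights = np.array([0.7, 0.3])
means = np.array([0.0, 5.0])
sds = np.array([1.0, 2.0])

# Step 1: choose a component at random; step 2: realize the chosen component.
n = 10_000
component = rng.choice(len(weights), size=n, p=weights)
samples = rng.normal(means[component], sds[component])

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture_pdf(x):
    # Weighted sum of component densities, with non-negative weights summing to 1.
    x = np.asarray(x, dtype=float)[..., None]
    return (weights * normal_pdf(x, means, sds)).sum(axis=-1)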



Student's T-distribution
In probability theory and statistics, Student's t distribution (or simply the t distribution) t_\nu is a continuous probability distribution that generalizes the standard normal distribution. Like the latter, it is symmetric around zero and bell-shaped. However, t_\nu has heavier tails, and the amount of probability mass in the tails is controlled by the parameter \nu. For \nu = 1 the Student's distribution t_\nu becomes the standard Cauchy distribution, which has very "fat" tails; whereas for \nu \to \infty it becomes the standard normal distribution \mathcal{N}(0, 1), which has very "thin" tails. The name "Student" is a pseudonym used by William Sealy Gosset in his scientific paper publications during his work at the Guinness Brewery in Dublin, Ireland. The Student's t distribution plays a role in a number of widely used statistical analyses, including Student's t- ...
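The tail behaviour described above is easy to inspect numerically. A small sketch using SciPy (the evaluation point 3 and the degrees-of-freedom values are arbitrary choices made only for illustration):

from scipy import stats

# Tail mass P(T > 3) shrinks as the degrees-of-freedom parameter nu grows,
# moving from Cauchy-like "fat" tails toward normal "thin" tails.
for nu in (1, 3, 10, 30):
    print(nu, stats.t.sf(3, df=nu))

print("cauchy", stats.cauchy.sf(3))  # agrees with nu = 1
print("normal", stats.norm.sf(3))    # the nu -> infinity limit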



Theil–Sen Estimator
In non-parametric statistics, the Theil–Sen estimator is a method for robustly fitting a line to sample points in the plane (simple linear regression) by choosing the median of the slopes of all lines through pairs of points. It has also been called Sen's slope estimator, slope selection, the single median method, the Kendall robust line-fit method, and the Kendall–Theil robust line. It is named after Henri Theil and Pranab K. Sen, who published papers on this method in 1950 and 1968 respectively, and after Maurice Kendall because of its relation to the Kendall tau rank correlation coefficient. Theil–Sen regression has several advantages over ordinary least squares regression. It is insensitive to outliers. It can be used for significance tests even when residuals are not normally distributed. It can be significantly more accurate than non-robust simple linear regression (least squares) for skewed and heteroskedastic data, and competes well against least squares ...
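A minimal sketch of the pairwise-slope construction follows; it uses the naive O(n^2) enumeration of pairs, whereas practical implementations use faster randomized or deterministic slope-selection algorithms, and the intercept rule below (the median of the per-point offsets y_i - m x_i) is one common but assumed convention:

import numpy as np
from itertools import combinations

def theil_sen(x, y):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Median of the slopes of all lines through pairs of points with distinct x.
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[j] != x[i]]
    slope = np.median(slopes)
    intercept = np.median(y - slope * x)
    return slope, intercept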


Least Trimmed Squares
Least trimmed squares (LTS), or least trimmed sum of squares, is a robust statistical method that fits a function to a set of data whilst not being unduly affected by the presence of outliers. It is one of a number of methods for robust regression. Instead of the standard least squares method, which minimises the sum of squared residuals over n points, the LTS method attempts to minimise the sum of squared residuals over a subset, k, of those points. The unused n - k points do not influence the fit. In a standard least squares problem, the estimated parameter values β are defined to be those values that minimise the objective function S(β) of squared residuals: S(\beta) = \sum_{i=1}^n r_i(\beta)^2, where th ...
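A naive sketch of the trimmed objective follows: it uses random starting subsets and repeated "concentration" steps that refit on the k points with the smallest squared residuals. This is only an illustration of the idea, not Rousseeuw and Van Driessen's FAST-LTS algorithm, and the numbers of starts and steps are arbitrary assumptions:

import numpy as np

def lts_fit(X, y, k, n_starts=50, n_csteps=20, seed=None):
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])  # add intercept column
    y = np.asarray(y, dtype=float)
    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        subset = rng.choice(len(y), size=X.shape[1] + 1, replace=False)
        for _ in range(n_csteps):
            beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
            r2 = (y - X @ beta) ** 2
            subset = np.argsort(r2)[:k]      # keep the k best-fitting points
        obj = np.sort(r2)[:k].sum()          # trimmed sum of squared residuals
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta, best_obj

Choosing k around half of n gives the greatest resistance to outliers, at the cost of statistical efficiency when the data are actually clean.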


Peter Rousseeuw
Peter J. Rousseeuw (born 13 October 1956) is a Belgian statistician known for his work on robust statistics and cluster analysis. He obtained his PhD in 1981 at the Vrije Universiteit Brussel, following research carried out at the ETH in Zurich, which led to a book on influence functions. Later he was professor at the Delft University of Technology, The Netherlands, at the University of Fribourg, Switzerland, and at the University of Antwerp, Belgium. Next he was a senior researcher at Renaissance Technologies. He then returned to Belgium as professor at KU Leuven, until becoming emeritus in 2022. His former PhD students include Annick Leroy, Hendrik Lopuhaä, Geert Molenberghs, Christophe Croux, Mia Hubert, Stefan Van Aelst, Tim Verdonck and Jakob Raymaekers. Research Rousseeuw has constructed and published many useful techniques. He proposed the Least Trimmed Squares method and S-estimators for robust regression, which can resist outliers in the data. He also introduced ...


Leverage (statistics)
In statistics and in particular in regression analysis, leverage is a measure of how far away the independent variable values of an observation are from those of the other observations. High-leverage points, if any, are outliers with respect to the independent variables. That is, high-leverage points have no neighboring points in \mathbb{R}^p space, where p is the number of independent variables in a regression model. This makes the fitted model likely to pass close to a high-leverage observation. Hence high-leverage points have the potential to cause large changes in the parameter estimates when they are deleted, i.e., to be influential points. Although an influential point will typically have high leverage, a high-leverage point is not necessarily an influential point. The leverage is typically defined as the diagonal elements of the hat matrix. Consider the linear regression model y_i = \boldsymbol{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i, \ i = 1, 2, \ldots, n. That is ...
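Because the leverages are the diagonal entries of the hat matrix H = X(X^T X)^{-1} X^T, they can be read off a thin QR factorization of the design matrix. A minimal sketch, assuming X already includes an intercept column and has full column rank:

import numpy as np

def leverage(X):
    # With X = QR (thin QR) and full column rank, H = Q Q^T,
    # so the leverages are the row-wise squared norms of Q.
    Q, _ = np.linalg.qr(np.asarray(X, dtype=float))
    return np.sum(Q**2, axis=1)

Each leverage lies between 0 and 1 and the values sum to the number of model parameters, so observations whose leverage is well above the average p/n are commonly flagged as high-leverage points.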


M-estimator
In statistics, M-estimators are a broad class of extremum estimators for which the objective function is a sample average. Both non-linear least squares and maximum likelihood estimation are special cases of M-estimators. The definition of M-estimators was motivated by robust statistics, which contributed new types of M-estimators. However, M-estimators are not inherently robust, as is clear from the fact that they include maximum likelihood estimators, which are in general not robust. The statistical procedure of evaluating an M-estimator on a data set is called M-estimation. The "M" initial stands for "maximum likelihood-type". More generally, an M-estimator may be defined to be a zero of an estimating function. This estimating function is often the derivative of another statistical function. For example, a maximum-likelihood estimate is the point where the derivative of the likelihood function with respect to the parameter is zero; thus, a maximum-likelihood estimator is a cr ...
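As one concrete example of M-estimation in the robust-statistics sense, the sketch below computes a Huber M-estimate of location by iteratively reweighted means; the tuning constant 1.345 and the MAD-based scale are conventional but assumed choices, and this is an illustration rather than a full regression M-estimator:

import numpy as np

def huber_location(x, c=1.345, tol=1e-8, max_iter=100):
    x = np.asarray(x, dtype=float)
    scale = np.median(np.abs(x - np.median(x))) / 0.6745  # MAD-based scale estimate
    mu = np.median(x)                                     # robust starting value
    for _ in range(max_iter):
        r = (x - mu) / scale
        # Huber weights: 1 for small residuals, c/|r| for large ones.
        w = np.minimum(1.0, c / np.maximum(np.abs(r), 1e-12))
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol * scale:
            break
        mu = mu_new
    return mu

Each iteration approximately solves the estimating equation sum psi((x_i - mu)/s) = 0 by replacing psi(r)/r with a weight, which is the standard iteratively reweighted least squares view of M-estimation.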



Least Absolute Deviations
Least absolute deviations (LAD), also known as least absolute errors (LAE), least absolute residuals (LAR), or least absolute values (LAV), is a statistical optimality criterion and a statistical optimization technique based on minimizing the sum of absolute deviations (also sum of absolute residuals or sum of absolute errors), or the L1 norm of such values. It is analogous to the least squares technique, except that it is based on absolute values instead of squared values. It attempts to find a function which closely approximates a set of data by minimizing residuals between points generated by the function and corresponding data points. The LAD estimate also arises as the maximum likelihood estimate if the errors have a Laplace distribution. It was introduced in 1757 by Roger Joseph Boscovich. Suppose that the data set consists of the points (x_i, y_i) with i = 1, 2, ..., n. We want to find a function f such that f(x_i) ...
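A minimal LAD sketch for the straight-line case follows: it minimizes the sum of absolute residuals directly with a general-purpose optimizer, starting from the least-squares fit (the use of Nelder–Mead and the starting point are assumed choices; dedicated LAD solvers typically use linear programming or iteratively reweighted least squares instead):

import numpy as np
from scipy.optimize import minimize

def lad_line(x, y):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def sum_abs_dev(params):
        intercept, slope = params
        return np.abs(y - (intercept + slope * x)).sum()

    start = np.polyfit(x, y, 1)[::-1]                  # least-squares (intercept, slope) start
    result = minimize(sum_abs_dev, start, method="Nelder-Mead")
    return result.x                                    # fitted (intercept, slope)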



S-PLUS
S-PLUS is a commercial implementation of the S programming language sold by TIBCO Software Inc. It features object-oriented programming capabilities and advanced analytical algorithms. Its statistical analysis capabilities are commonly used by econometricians. The S-PLUS FinMetrics software package was developed for econometric time series analysis. Due to the increasing popularity of the open-source S successor R, TIBCO Software released the TIBCO Enterprise Runtime for R (TERR) as an alternative R interpreter. It is available on Windows and UNIX operating systems. In 1988, S-PLUS was first developed and released by a Seattle-based start-up company called Statistical Sciences, Inc. The company's founder and sole owner is R. Douglas Martin, professor of statistics at the University of Washington, Seattle. Martin originally learned S while working at Bell Laboratories, where the S language was or ...