Additive Model

	Additive Model In statistics, an additive model (AM) is a nonparametric regression method. It was suggested by Jerome H. Friedman and Werner Stuetzle (1981) and is an essential part of the ACE algorithm. The AM uses a one-dimensional smoother to build a restricted class of nonparametric regression models. Because of this, it is less affected by the curse of dimensionality than, for example, a p-dimensional smoother. Furthermore, the AM is more flexible than a standard linear model and more interpretable than a general regression surface, at the cost of approximation error. Like many other machine learning methods, the AM faces problems of model selection, overfitting, and multicollinearity. Description Given a data set \{y_i, x_{i1}, \ldots, x_{ip}\}_{i=1}^n of n statistical units, where \{x_{i1}, \ldots, x_{ip}\}_{i=1}^n represent predictors and y_i is the outcome, the additive model takes the form : \mathrm{E}[y_i \mid x_{i1}, \ldots, x_{ip}] = \beta_0 + \sum_{j=1}^p f_j(x_{ij}) or : Y = \beta_0 + \sum_{j=1}^p f_j(X_j) + \varepsilon where \mathrm{E}[\varepsilon] = 0, ...
	Statistics Statistics (from German: Statistik, "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments (Dodge, Y., 2006, The Oxford Dictionary of Statistical Terms, Oxford University Press). When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An ...
	Smooth Function In mathematical analysis, the smoothness of a function is a property measured by the number of continuous derivatives it has over some domain, called its differentiability class. At the very minimum, a function could be considered smooth if it is differentiable everywhere (hence continuous). At the other end, it might also possess derivatives of all orders in its domain, in which case it is said to be infinitely differentiable and referred to as a C-infinity function (or C^\infty function). Differentiability classes Differentiability class is a classification of functions according to the properties of their derivatives. It is a measure of the highest order of derivative that exists and is continuous for a function. Consider an open set U on the real line and a function f defined on U with real values. Let k be a non-negative integer. The function f is said to be of differentiability class C^k if the derivatives f', f'', \dots, f^{(k)} exist and are continuous on U. If f is k-dif ...
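As a small worked illustration of the C^k definition above (an example chosen here, not taken from the source), the function f(x) = x|x| on U = ℝ is of class C^1 but not C^2: its first derivative exists and is continuous everywhere, while its second derivative fails to exist at x = 0.

```latex
f(x) = x\,|x|, \qquad
f'(x) = 2\,|x|, \qquad
f''(x) =
\begin{cases}
  2,  & x > 0, \\
  -2, & x < 0,
\end{cases}
\quad \text{with } f''(0) \text{ undefined.}
```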
	Journal Of The American Statistical Association The Journal of the American Statistical Association (JASA) is the primary journal published by the American Statistical Association, the main professional body for statisticians in the United States. It is published four times a year in March, June, September and December by Taylor & Francis, Ltd on behalf of the American Statistical Association. As a statistics journal it publishes articles primarily focused on the application of statistics, statistical theory and methods in economic, social, physical, engineering, and health sciences. The journal also includes reviews of academic books which are important to the advancement of the field. It had an impact factor of 2.063 in 2010, tenth highest in the "Statistics and Probability" category of Journal Citation Reports. In a 2003 survey of statisticians, the Journal of the American Statistical Association was ranked first, among all journals, for "Applications of Statistics" and second (after Annals of Statistics ...
	Projection Pursuit Projection pursuit (PP) is a type of statistical technique which involves finding the most "interesting" possible projections in multidimensional data. Often, projections which deviate more from a normal distribution are considered to be more interesting. As each projection is found, the data are reduced by removing the component along that projection, and the process is repeated to find new projections; this is the "pursuit" aspect that motivated the technique known as matching pursuit. The idea of projection pursuit is to locate the projection or projections from high-dimensional space to low-dimensional space that reveal the most details about the structure of the data set. Once an interesting set of projections has been found, existing structures (clusters, surfaces, etc.) can be extracted and analyzed separately. Projection pursuit has been widely used for blind source separation, so it is very important in independent component analysis. Projection pursuit seeks one proje ...
	Median Polish The median polish is a simple and robust exploratory data analysis procedure proposed by the statistician John Tukey. The purpose of median polish is to find an additively-fit model for data in a two-way layout table (usually, results from a factorial experiment) of the form row effect + column effect + overall median. Median polish utilizes the medians obtained from the rows and the columns of a two-way table to iteratively calculate the row effect and column effect on the data. The results are not meant to be sensitive to the outliers, as the iterative procedure uses the medians rather than the means. Model for a two-way table Suppose an experiment observes the variable Y under the influence of two variables. We can arrange the data in a two-way table in which one variable is constant along the rows and the other variable constant along the columns. Let i and j denote the position of rows and columns (e.g. y_{ij} denotes the value of y at the ith row and the jth ...
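The row-and-column median sweeps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `median_polish`, the parameters `max_iter` and `tol`, and the stopping rule (stop once a full sweep removes only medians of size at most `tol`) are all choices made here for the example.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Fit overall + row effect + column effect to a two-way table.

    `table` is a list of equal-length rows. Returns
    (overall, row_effects, col_effects, residuals).
    All names here are illustrative assumptions.
    """
    resid = [[float(v) for v in row] for row in table]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        # Row sweep: move each row's median out of the residuals.
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
        # Keep column effects centred; their median goes to the overall term.
        shift = median(col_eff)
        overall += shift
        col_eff = [c - shift for c in col_eff]

        # Column sweep: move each column's median out of the residuals.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        # Keep row effects centred in the same way.
        shift = median(row_eff)
        overall += shift
        row_eff = [r - shift for r in row_eff]

        # Stop once a full sweep no longer changes anything noticeably.
        if max(abs(m) for m in row_med + col_med) <= tol:
            break

    return overall, row_eff, col_eff, resid
```

At every step the decomposition overall + row effect + column effect + residual reproduces the original cell, so the sweeps only redistribute the data between the four parts. Because medians are used instead of means, a single extreme cell tends to end up in the residuals and does not distort the effects.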
	Generalized Additive Model For Location, Scale, And Shape The Generalized Additive Model for Location, Scale and Shape (GAMLSS) is an approach to statistical modelling and learning. GAMLSS is a modern distribution-based approach to (semiparametric) regression. A parametric distribution is assumed for the response (target) variable but the parameters of this distribution can vary according to explanatory variables using linear, nonlinear or smooth functions. In machine learning parlance, GAMLSS is a form of supervised machine learning. In particular, the GAMLSS statistical framework enables flexible regression and smoothing models to be fitted to the data. The GAMLSS model assumes the response variable has any parametric distribution which might be heavy or light-tailed, and positively or negatively skewed. In addition, all the parameters of the distribution, that is, location (e.g., mean), scale (e.g., variance) and shape (skewness and kurtosis), can be modeled as linear, nonlinear or smooth functions of explanatory variables. Overview of the mod ...
	Projection Pursuit Regression In statistics, projection pursuit regression (PPR) is a statistical model developed by Jerome H. Friedman and Werner Stuetzle which is an extension of additive models. This model adapts the additive models in that it first projects the data matrix of explanatory variables in the optimal direction before applying smoothing functions to these explanatory variables. Model overview The model consists of linear combinations of ridge functions: non-linear transformations of linear combinations of the explanatory variables. The basic model takes the form : y_i = \beta_0 + \sum_{j=1}^r f_j(\beta_j^{\mathrm{T}} x_i) + \varepsilon_i , where x_i is a 1 × p row of the design matrix containing the explanatory variables for example i, y_i is a 1 × 1 prediction, \{\beta_j\} is a collection of r vectors (each a unit vector of length p) which contain the unknown parameters, \{f_j\} is a collection of r initially unknown smooth functions that map from ℝ → ℝ, and r is a hyperparameter. Good v ...
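To make the model form above concrete, the following sketch evaluates the deterministic part of a PPR model, \beta_0 + \sum_j f_j(\beta_j^T x), for one observation. It covers prediction only, with the directions and ridge functions taken as already known; estimating them is the hard part and is not attempted here. The function name `ppr_predict` and its parameters are hypothetical names chosen for this example.

```python
def ppr_predict(x, beta0, directions, ridge_functions):
    """Evaluate beta0 + sum_j f_j(beta_j . x) for a single observation x.

    x               : sequence of p explanatory values
    beta0           : intercept
    directions      : r projection vectors beta_j, each of length p
    ridge_functions : r callables f_j mapping a float to a float
    (All names are illustrative assumptions.)
    """
    total = beta0
    for beta_j, f_j in zip(directions, ridge_functions):
        # Project the observation onto direction beta_j ...
        projection = sum(b * v for b, v in zip(beta_j, x))
        # ... then apply the one-dimensional ridge function.
        total += f_j(projection)
    return total
```

With r = p and the directions set to the coordinate axes, each ridge function sees a single explanatory variable, which is the sense in which PPR extends the additive model.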
	Backfitting Algorithm In statistics, the backfitting algorithm is a simple iterative procedure used to fit a generalized additive model. It was introduced in 1985 by Leo Breiman and Jerome Friedman along with generalized additive models. In most cases, the backfitting algorithm is equivalent to the Gauss–Seidel method, an algorithm used for solving a certain linear system of equations. Algorithm Additive models are a class of non-parametric regression models of the form: : Y_i = \alpha + \sum_{j=1}^p f_j(X_{ij}) + \epsilon_i where each X_1, X_2, \ldots, X_p is a variable in our p-dimensional predictor X, and Y is our outcome variable. \epsilon represents our inherent error, which is assumed to have mean zero. The f_j represent unspecified smooth functions of a single X_j. Given the flexibility in the f_j, we typically do not have a unique solution: \alpha is left unidentifiable as one can add any constants to any of the f_j and subtract this value from \alpha. It is common to rectify this by constr ...
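The iteration described above can be sketched in a few lines. This is a minimal illustration under explicit assumptions, not the published algorithm in full: the smoother is a plain running mean over neighbouring ranks (swapped in here for simplicity), \alpha is fixed at the mean of the outcome, and each f_j is re-centred to mean zero after every update as one common way of handling the identifiability issue. The names `running_mean_smoother`, `backfit`, `half_window` and `n_iter` are hypothetical.

```python
def running_mean_smoother(x, r, half_window=10):
    """Smooth r against x by averaging r over neighbouring ranks of x.

    A deliberately simple stand-in for a real one-dimensional smoother.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    fitted = [0.0] * n
    for pos, i in enumerate(order):
        lo = max(0, pos - half_window)
        hi = min(n, pos + half_window + 1)
        fitted[i] = sum(r[order[q]] for q in range(lo, hi)) / (hi - lo)
    return fitted


def backfit(columns, y, n_iter=20, half_window=10):
    """Fit y ~ alpha + sum_j f_j(X_j) by cycling over the predictors.

    columns : list of p predictor columns, each of length n
    y       : outcome values, length n
    Returns (alpha, f) where f[j][i] estimates f_j at observation i.
    """
    n, p = len(y), len(columns)
    alpha = sum(y) / n                      # intercept: mean of the outcome
    f = [[0.0] * n for _ in range(p)]       # start with all f_j = 0

    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove alpha and every other component.
            partial = [
                y[i] - alpha - sum(f[k][i] for k in range(p) if k != j)
                for i in range(n)
            ]
            # Re-estimate f_j by smoothing the partial residual on X_j.
            fj = running_mean_smoother(columns[j], partial, half_window)
            # Centre f_j so that the constant stays in alpha.
            mean_fj = sum(fj) / n
            f[j] = [v - mean_fj for v in fj]
    return alpha, f
```

Each inner step updates one component while holding the others at their latest values, which is the same update pattern as a Gauss–Seidel sweep. A fixed number of passes is used here for brevity; a practical version would stop once the components change by less than some tolerance.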
	Generalized Additive Model In statistics, a generalized additive model (GAM) is a generalized linear model in which the linear predictor depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions. GAMs were originally developed by Trevor Hastie and Robert Tibshirani to blend properties of generalized linear models with additive models. They can be interpreted as the discriminative generalization of the naive Bayes generative model. The model relates a univariate response variable, Y, to some predictor variables, x_i. An exponential family distribution is specified for Y (for example normal, binomial or Poisson distributions) along with a link function g (for example the identity or log functions) relating the expected value of Y to the predictor variables via a structure such as : g(\operatorname{E}(Y)) = \beta_0 + f_1(x_1) + f_2(x_2) + \cdots + f_m(x_m). The functions f_i may be functions with ...
	Robert Tibshirani Robert Tibshirani (born July 10, 1956) is a professor in the Departments of Statistics and Biomedical Data Science at Stanford University. He was a professor at the University of Toronto from 1985 to 1998. In his work, he develops statistical tools for the analysis of complex datasets, most recently in genomics and proteomics. His most well-known contributions are the Lasso method, which proposed the use of L1 penalization in regression and related problems, and Significance Analysis of Microarrays. Education and early life Tibshirani was born on 10 July 1956 in Niagara Falls, Ontario, Canada. He received his B. Math. in statistics and computer science from the University of Waterloo in 1979 and a Master's degree in Statistics from University of Toronto in 1980. Tibshirani joined the doctoral program at Stanford University in 1981 and received his Ph.D. in 1984 under the supervision of Bradley Efron. His dissertation was entitled "Local likelihood estimation". Honors ...
	Trevor Hastie Trevor John Hastie (born 27 June 1953) is an American statistician and computer scientist. He is currently serving as the John A. Overdeck Professor of Mathematical Sciences and Professor of Statistics at Stanford University. Hastie is known for his contributions to applied statistics, especially in the fields of machine learning, data mining, and bioinformatics. He has authored several popular books on statistical learning, including The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Hastie has been listed as an ISI Highly Cited Author in Mathematics by the ISI Web of Knowledge. Education and career Hastie was born on 27 June 1953 in South Africa. He received his B.S. in statistics from Rhodes University in 1976 and a master's degree from the University of Cape Town in 1979. Hastie joined the doctoral program at Stanford University in 1980 and received his Ph.D. in 1984 under the supervision of Werner Stuetzle. His dissertation was "Principal Curves ...