Over-dispersion

	Over-dispersion In statistics, overdispersion is the presence of greater variability (statistical dispersion) in a data set than would be expected based on a given statistical model. A common task in applied statistics is choosing a parametric model to fit a given set of empirical observations. This necessitates an assessment of the fit of the chosen model. It is usually possible to choose the model parameters in such a way that the theoretical population mean of the model is approximately equal to the sample mean. However, especially for simple models with few parameters, theoretical predictions may not match empirical observations for higher moments. When the observed variance is higher than the variance of a theoretical model, overdispersion has occurred. Conversely, underdispersion means that there was less variation in the data than predicted. Overdispersion is a very common feature in applied data analysis because in practice, populations are frequently heterogeneous (non-uniform) contrary ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Statistics Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of statistical survey, surveys and experimental design, experiments. When census data (comprising every member of the target population) cannot be collected, statisticians collect data by developing specific experiment designs and survey sample (statistics), samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Trivers–Willard Hypothesis In evolutionary biology and evolutionary psychology, the Trivers–Willard hypothesis, formally proposed by Robert Trivers and Dan Willard in 1973, suggests that female mammals adjust the sex ratio of offspring in response to maternal condition, so as to maximize their reproductive success ( fitness). For example, it may predict greater parental investment in males by parents in "good conditions" and greater investment in females by parents in "poor conditions" (relative to parents in good conditions). The reasoning for this prediction is as follows: Assume that parents have information on the sex of their offspring and can influence their survival differentially. While selection pressures exist to maintain a 1:1 sex ratio, evolution will favor local deviations from this if one sex has a likely greater reproductive payoff than is usual. Trivers and Willard also identified a circumstance in which reproducing individuals might experience deviations from expected offspring reproduct ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Sampling Error In statistics, sampling errors are incurred when the statistical characteristics of a population are estimated from a subset, or sample, of that population. Since the sample does not include all members of the population, statistics of the sample (often known as estimators), such as means and quartiles, generally differ from the statistics of the entire population (known as parameters). The difference between the sample statistic and population parameter is considered the sampling error.Sarndal, Swenson, and Wretman (1992), Model Assisted Survey Sampling, Springer-Verlag, For example, if one measures the height of a thousand individuals from a population of one million, the average height of the thousand is typically not the same as the average height of all one million people in the country. Since sampling is almost always done to estimate population parameters that are unknown, by definition exact measurement of the sampling errors will not be possible; however they can often ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Margin Of Error The margin of error is a statistic expressing the amount of random sampling error in the results of a Statistical survey, survey. The larger the margin of error, the less confidence one should have that a poll result would reflect the result of a simultaneous census of the entire Statistical population, population. The margin of error will be positive whenever a population is incompletely sampled and the outcome measure has positive variance, which is to say, whenever the measure ''varies''. The term ''margin of error'' is often used in non-survey contexts to indicate observational error in reporting measured quantities. Concept Consider a simple ''yes/no'' poll P as a sample of n respondents drawn from a population N \text(n \ll N) reporting the percentage p of ''yes'' responses. We would like to know how close p is to the true result of a survey of the entire population N, without having to conduct one. If, hypothetically, we were to conduct a poll P over subsequent sample ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Statistical Survey Survey methodology is "the study of survey methods". As a field of applied statistics concentrating on human-research surveys, survey methodology studies the sampling of individual units from a population and associated techniques of survey data collection, such as questionnaire construction and methods for improving the number and accuracy of responses to surveys. Survey methodology targets instruments or procedures that ask one or more questions that may or may not be answered. Researchers carry out statistical surveys with a view towards making statistical inferences about the population being studied; such inferences depend strongly on the survey questions used. Polls about public opinion, public-health surveys, market-research surveys, government surveys and censuses all exemplify quantitative research that uses survey methodology to answer questions about a population. Although censuses do not include a "sample", they do include other aspects of survey methodolog ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Kurtosis In probability theory and statistics, kurtosis (from , ''kyrtos'' or ''kurtos'', meaning "curved, arching") refers to the degree of “tailedness” in the probability distribution of a real-valued random variable. Similar to skewness, kurtosis provides insight into specific characteristics of a distribution. Various methods exist for quantifying kurtosis in theoretical distributions, and corresponding techniques allow estimation based on sample data from a population. It’s important to note that different measures of kurtosis can yield varying interpretations. The standard measure of a distribution's kurtosis, originating with Karl Pearson, is a scaled version of the fourth moment of the distribution. This number is related to the tails of the distribution, not its peak; hence, the sometimes-seen characterization of kurtosis as " peakedness" is incorrect. For this measure, higher kurtosis corresponds to greater extremity of deviations (or outliers), and not the configur ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Skewness In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive, zero, negative, or undefined. For a unimodal distribution (a distribution with a single peak), negative skew commonly indicates that the ''tail'' is on the left side of the distribution, and positive skew indicates that the tail is on the right. In cases where one tail is long but the other tail is fat, skewness does not obey a simple rule. For example, a zero value in skewness means that the tails on both sides of the mean balance out overall; this is the case for a symmetric distribution but can also be true for an asymmetric distribution where one tail is long and thin, and the other is short but fat. Thus, the judgement on the symmetry of a given distribution by using only its skewness is risky; the distribution shape must be taken into account. Introduction Consider the two d ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Multilevel Model Multilevel models are statistical models of parameters that vary at more than one level. An example could be a model of student performance that contains measures for individual students as well as measures for classrooms within which the students are grouped. These models can be seen as generalizations of linear models (in particular, linear regression), although they can also extend to non-linear models. These models became much more popular after sufficient computing power and software became available. Multilevel models are particularly appropriate for research designs where data for participants are organized at more than one level (i.e., nested data). The units of analysis are usually individuals (at a lower level) who are nested within contextual/aggregate units (at a higher level). While the lowest level of data in multilevel models is usually an individual, repeated measurements of individuals may also be examined. As such, multilevel models provide an alternative type ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Logistic Regression In statistics, a logistic model (or logit model) is a statistical model that models the logit, log-odds of an event as a linear function (calculus), linear combination of one or more independent variables. In regression analysis, logistic regression (or logit regression) estimation theory, estimates the parameters of a logistic model (the coefficients in the linear or non linear combinations). In binary logistic regression there is a single binary variable, binary dependent variable, coded by an indicator variable, where the two values are labeled "0" and "1", while the independent variables can each be a binary variable (two classes, coded by an indicator variable) or a continuous variable (any real value). The corresponding probability of the value labeled "1" can vary between 0 (certainly the value "0") and 1 (certainly the value "1"), hence the labeling; the function that converts log-odds to probability is the logistic function, hence the name. The unit of measurement for the ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Normal Distribution In probability theory and statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is f(x) = \frac e^\,. The parameter is the mean or expectation of the distribution (and also its median and mode), while the parameter \sigma^2 is the variance. The standard deviation of the distribution is (sigma). A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known. Their importance is partly due to the central limit theorem. It states that, under some conditions, the average of many samples (observations) of a random variable with finite mean and variance is itself a random variable—whose distribution c ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Bernoulli Distribution In probability theory and statistics, the Bernoulli distribution, named after Swiss mathematician Jacob Bernoulli, is the discrete probability distribution of a random variable which takes the value 1 with probability p and the value 0 with probability q = 1-p. Less formally, it can be thought of as a model for the set of possible outcomes of any single experiment that asks a yes–no question. Such questions lead to outcome (probability), outcomes that are Boolean-valued function, Boolean-valued: a single bit whose value is success/yes and no, yes/Truth value, true/Binary code, one with probability ''p'' and failure/no/false (logic), false/Binary code, zero with probability ''q''. It can be used to represent a (possibly biased) coin toss where 1 and 0 would represent "heads" and "tails", respectively, and ''p'' would be the probability of the coin landing on heads (or vice versa where 1 would represent tails and ''p'' would be the probability of tails). In particular, unfair co ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Compound Distribution In probability and statistics, a compound probability distribution (also known as a mixture distribution or contagious distribution) is the probability distribution that results from assuming that a random variable is distributed according to some parametrized distribution, with (some of) the parameters of that distribution themselves being random variables. If the parameter is a scale parameter, the resulting mixture is also called a scale mixture. The compound distribution ("unconditional distribution") is the result of marginalizing (integrating) over the ''latent'' random variable(s) representing the parameter(s) of the parametrized distribution ("conditional distribution"). Definition A compound probability distribution is the probability distribution that results from assuming that a random variable X is distributed according to some parametrized distribution F with an unknown parameter \theta that is again distributed according to some other distribution G. The resulting d ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]