The Z-factor is a measure of statistical effect size. It has been proposed for use in high-throughput screening (where it is also known as Z-prime, commonly written as Z') to judge whether the response in a particular assay is large enough to warrant further attention.


Background

In high-throughput screens, experimenters often compare a large number (hundreds of thousands to tens of millions) of single measurements of unknown samples to positive and negative control samples. The particular choice of experimental conditions and measurements is called an assay. Large screens are expensive in time and resources. Therefore, prior to starting a large screen, smaller test (or pilot) screens are used to assess the quality of an assay, in an attempt to predict if it would be useful in a high-throughput setting. The Z-factor is an attempt to quantify the suitability of a particular assay for use in a full-scale, high-throughput screen.


Definition

The Z-factor is defined in terms of four parameters: the means (\mu) and standard deviations (\sigma) of both the positive (p) and negative (n) controls (\mu_p, \sigma_p, and \mu_n, \sigma_n). Given these values, the Z-factor is defined as:
:\text{Z-factor} = 1 - \frac{3(\sigma_p + \sigma_n)}{|\mu_p - \mu_n|}
In practice, the Z-factor is estimated from the sample means and sample standard deviations:
:\text{Estimated Z-factor} = 1 - \frac{3(\hat{\sigma}_p + \hat{\sigma}_n)}{|\hat{\mu}_p - \hat{\mu}_n|}
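As an illustration of the estimate described above, the following minimal sketch computes the Z-factor from two lists of control readings. The function name `z_factor` and the example readings are hypothetical and chosen only for demonstration; the sketch assumes the sample standard deviation with the usual n - 1 denominator.

```python
from statistics import mean, stdev


def z_factor(positive, negative):
    """Estimate the Z-factor from positive and negative control readings."""
    mu_p, mu_n = mean(positive), mean(negative)
    # statistics.stdev is the sample standard deviation (n - 1 denominator).
    sd_p, sd_n = stdev(positive), stdev(negative)
    separation = abs(mu_p - mu_n)
    if separation == 0:
        raise ValueError("control means are identical; Z-factor is undefined")
    return 1 - 3 * (sd_p + sd_n) / separation


# Hypothetical control signals, made up for illustration only.
pos = [98.0, 102.0, 100.0, 101.0, 99.0]
neg = [10.0, 12.0, 11.0, 9.0, 13.0]
print(round(z_factor(pos, neg), 3))
```

Because the denominator is an absolute difference, the function gives the same value whichever control has the larger mean.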


Interpretation

The following interpretations for the Z-factor are taken from:
Note that by the standards of many types of experiments, a zero Z-factor would suggest a large effect size, rather than a borderline useless result as suggested above. For example, if σp = σn = 1, then μp = 6 and μn = 0 gives a zero Z-factor. But for normally distributed data with these parameters, the probability that a positive control value would be less than a negative control value is only about 1 in 10^5. Extreme conservatism is used in high-throughput screening because of the large number of tests performed.
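The numerical example in this section can be checked with a short calculation. This is a sketch under one added assumption that the text does not state: the positive and negative control values are independent, so that their difference is normally distributed with variance σp² + σn².

```python
from math import sqrt
from statistics import NormalDist

# Parameters from the example in the text.
mu_p, mu_n = 6.0, 0.0
sigma_p = sigma_n = 1.0

# Z-factor from the population parameters.
z = 1 - 3 * (sigma_p + sigma_n) / abs(mu_p - mu_n)

# Assumption: independent controls, so (positive - negative) is normal
# with mean mu_p - mu_n and variance sigma_p**2 + sigma_n**2.
difference = NormalDist(mu_p - mu_n, sqrt(sigma_p**2 + sigma_n**2))

# Probability that a positive value falls below a negative value.
p_overlap = difference.cdf(0.0)

print(z)          # zero Z-factor
print(p_overlap)  # on the order of 1e-5
```

A Z-factor of zero therefore corresponds to controls whose means are six standard deviations apart, which is why the threshold is considered conservative.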


Limitations

The constant factor 3 in the definition of the Z-factor is motivated by the normal distribution, for which more than 99% of values occur within 3 standard deviations of the mean. If the data follow a strongly non-normal distribution, the reference points (e.g. the meaning of a negative value) may be misleading. Another issue is that the usual estimates of the mean and standard deviation are not robust; accordingly, many users in the high-throughput screening community prefer the "Robust Z-prime", which substitutes the median for the mean and the median absolute deviation for the standard deviation. Extreme values (outliers) in either the positive or negative controls can adversely affect the Z-factor, potentially leading to an apparently unfavorable Z-factor even when the assay would perform well in actual screening. In addition, applying a single Z-factor-based criterion to two or more positive controls with different strengths in the same assay will lead to misleading results. The absolute value in the Z-factor makes it inconvenient to derive statistical inference for the Z-factor mathematically. A more recently proposed statistical parameter, the strictly standardized mean difference (SSMD), can address these issues. One estimate of SSMD is robust to outliers.
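The effect of an outlier on the classical estimate, and the median-based substitution described above, can be sketched as follows. The function names, the `scale` parameter, and the readings are hypothetical. With `scale=1.0` the median absolute deviation is used as is, matching the plain substitution in the text; some practitioners instead multiply it by about 1.4826 so that it is comparable to a standard deviation for normal data, which is an assumption not taken from this article.

```python
from statistics import mean, median, stdev


def z_factor(positive, negative):
    """Classical estimate using sample means and standard deviations."""
    separation = abs(mean(positive) - mean(negative))
    return 1 - 3 * (stdev(positive) + stdev(negative)) / separation


def mad(values, scale=1.0):
    """Median absolute deviation from the median, optionally rescaled."""
    center = median(values)
    return scale * median([abs(v - center) for v in values])


def robust_z_prime(positive, negative, scale=1.0):
    """Median/MAD analogue of the Z-factor (hypothetical helper)."""
    separation = abs(median(positive) - median(negative))
    if separation == 0:
        raise ValueError("control medians are identical")
    spread = mad(positive, scale) + mad(negative, scale)
    return 1 - 3 * spread / separation


# Hypothetical readings; the value 40.0 is a deliberate outlier.
pos = [98.0, 102.0, 100.0, 101.0, 99.0, 40.0]
neg = [10.0, 12.0, 11.0, 9.0, 13.0]

print(round(z_factor(pos, neg), 3))        # pulled close to zero by the outlier
print(round(robust_z_prime(pos, neg), 3))  # largely unaffected
```

In this made-up data a single aberrant positive control well is enough to move the classical estimate from the "excellent" range to near zero, while the median-based version barely changes.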


See also

* High-throughput screening
* SSMD
* Z-score or Standard score


Further reading

* Kraybill, B. (2005) "Quantitative Assay Evaluation and Optimization" (unpublished note)
* Zhang XHD (2011) "Optimal High-Throughput Screening: Practical Experimental Design and Data Analysis for Genome-scale RNAi Research", Cambridge University Press