
The ratio average (RA) plot is an integer-based version of an MA plot for visualizing two-condition count data. Its distinctive arrow-like shape derives from the way it incorporates condition-unique (0,''n'') or (''n'',0) points into the plot via an epsilon factor.


Definition

An RA plot, like its cousin the MA plot, is a re-scaled and (45-degree) rotated version of a simple two-dimensional scatter plot of ''a'' versus ''b'', where ''a'' and ''b'' are equal-length vectors of positive measurements. This rescaling and rotation gives better visibility and emphasis to important outlier points that vary between the two measurement conditions. Essentially, it is a plot of the log ratio versus the average log of each pairing of the elements of ''a'' and ''b''. Unlike an MA plot, however, because the RA plot takes non-negative integer counts as input, it must employ work-arounds to include mathematically invisible points (such as points where one or both elements of the pair are zero). If we modify our original ''a'' (or ''b'') vector via:

: a = \begin{cases} a + \varepsilon, & \text{if } a = 0 \\ a, & \text{if } a > 0 \end{cases}

where

: 0 < \varepsilon < 0.5

then ''R'' and ''A'' can be defined as:

: R = \log_2 (a / b)
: A = \frac{1}{2} \times (\log_2 a + \log_2 b)

''R'', like ''M'', is plotted on the ''y''-axis and represents a log (fold change) ratio between ''a'' and ''b''. ''A'' is plotted on the ''x''-axis and represents the average abundance for a coordinate pair. The RA plot provides a quick overview of the distribution and size of a dataset consisting of non-zero counts.
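The definitions above can be sketched in a few lines of base R. The helper name `ra_coords` and the default `epsilon = 0.25` are assumptions made for illustration; they are not taken from any package, and the zero replacement is applied here to both vectors.

```r
# Hypothetical helper (not part of any package): computes R and A
# from two equal-length vectors of non-negative counts.
ra_coords <- function(a, b, epsilon = 0.25) {
  stopifnot(length(a) == length(b), epsilon > 0, epsilon < 0.5)
  # Replace zeros with epsilon so that the logarithms are finite
  a <- ifelse(a == 0, epsilon, a)
  b <- ifelse(b == 0, epsilon, b)
  data.frame(
    R = log2(a / b),                 # log2 ratio (y-axis)
    A = 0.5 * (log2(a) + log2(b))    # average log2 abundance (x-axis)
  )
}

# Four pairs: one shared, two condition-unique, one empty in both
counts_a <- c(8, 0, 4, 0)
counts_b <- c(2, 4, 0, 0)
res <- ra_coords(counts_a, counts_b, epsilon = 0.25)
print(res)
```

With these inputs the shared pair (8, 2) lands at ''R'' = 2, ''A'' = 2, while the two condition-unique pairs land at ''R'' = -4 and ''R'' = 4 with ''A'' = 0.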


Etymology

The acronym prefix "R.A." is sometimes pronounced as the one-syllable word "ray" because of the plot's strong resemblance to a geometric ray. This characteristic arrow-like shape derives from two key features: on the right, at the vector origin, a long asymptotic tail, and on the left (forming the arrow head) two (often dense) patches of condition-unique points.


Work-arounds for point visibility and inclusion


Condition unique points

Because a large portion of the pairs of ''a'' and ''b'' contain zeros in one or both conditions, these pairs cannot be plotted as-is on a log scale. Other MA plotting functions artificially include these condition-unique points in the plot by spreading them vertically as a "smear" on the left or horizontally as a rug at the very top and bottom of the plot. In an RA plot, by contrast, the uniques are included via the addition of a small epsilon factor (between 0.1 and 0.5), which places them in a more statistically appropriate location in the plot.
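Where the epsilon factor places the condition-unique points can be worked out directly. The short derivation below assumes the definitions ''R'' = log2(''a''/''b'') and ''A'' = ½(log2 ''a'' + log2 ''b'') with each zero replaced by ε; it is a sketch of the geometry, not a description of any particular implementation.

```latex
% Condition-unique point (n, 0): the zero in b is replaced by \varepsilon
\begin{align*}
R &= \log_2 n - \log_2 \varepsilon, \\
A &= \tfrac{1}{2}\left(\log_2 n + \log_2 \varepsilon\right)
  \;\Longrightarrow\; \log_2 n = 2A - \log_2 \varepsilon, \\
R &= 2A - 2\log_2 \varepsilon .
\end{align*}
% By symmetry, a point (0, n) satisfies R = -(2A - 2\log_2 \varepsilon).
```

Under these assumptions the (''n'',0) points fall on a straight line of slope 2 and the (0,''n'') points on its mirror image of slope -2. Extended back, the two lines meet at ''A'' = log2 ε, ''R'' = 0, which is where a (0,0) pair would sit, and together they form the arrowhead on the left of the plot.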


Overplotting

Another problem with plotting this (or any) type of count data is overplotting, which is solved in the RA plot by jittering the points away from each other, but not so far as to merge with other coordinates. The result of this feature is a patchwork-like appearance to the plot that fades away as ''A'' increases.
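The idea of bounded jitter can be illustrated with a minimal base R sketch. It assumes uniform noise added in count space before the log transform, with the noise kept below 0.5 so that no point crosses into a neighbouring integer cell. The helper `jitter_counts` and the `amount` parameter are hypothetical, and the sketch makes no claim about how the caroline package performs its jittering.

```r
set.seed(1)  # make the random jitter reproducible

# Identical integer pairs would be drawn exactly on top of each other
a <- c(3, 3, 3, 5, 5)
b <- c(2, 2, 2, 5, 5)

# Hypothetical helper: uniform noise strictly smaller than half a count
jitter_counts <- function(x, amount = 0.3) {
  stopifnot(amount > 0, amount < 0.5)
  x + runif(length(x), min = -amount, max = amount)
}

a_j <- jitter_counts(a)
b_j <- jitter_counts(b)

# RA coordinates of the jittered counts
R <- log2(a_j / b_j)
A <- 0.5 * (log2(a_j) + log2(b_j))

# Each jittered value still rounds back to its original count
stopifnot(all(round(a_j) == a), all(round(b_j) == b))
```

Because adjacent integers sit closer together on a log scale as counts grow, patches produced this way shrink and blend together at higher ''A'', consistent with the fading patchwork described above.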


Packages

The caroline CRAN R package contains the only known implementation of an RA plot. However, the meta-transcriptomic R package Microbial Assemblage Normalized Transcript Analysis provides a wrapper around this RA plot implementation and is used for assessing fold change in the transcription of genes (the points) while simultaneously visualizing each gene's taxonomic distribution as individual pie chart points (Schruth, D. & Marchetti, A. (2011). Microbial Assemblage Normalized Transcript Analysis. R package version 0.9.5).


Examples


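library(caroline)  # provides raPlot()

# Simulate two conditions of negative binomial counts (zeros included)
a <- rnbinom(n=10000, mu=5, size=2)
b <- rnbinom(n=10000, mu=5, size=2)

# Draw the RA plot of condition a versus condition b
raPlot(a, b)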


See also

* MA plot
* DNA microarray
* Bland–Altman plot