The redundancy principle in biology expresses the need for many copies of the same entity (cells, molecules, ions) to fulfill a biological function. Examples are numerous: the disproportionate number of spermatozoa during fertilization compared to a single egg, the large number of neurotransmitters released during neuronal communication compared to the number of receptors, the large number of calcium ions released during transients in cells, and many more in molecular and cellular transduction, gene activation and cell signaling. This redundancy is particularly relevant when the sites of activation are physically separated from the initial position of the molecular messengers. The redundancy is often generated to resolve the time constraint of fast-activating pathways. It can be expressed in terms of the theory of extreme statistics to determine its laws and quantify how the shortest paths are selected. The main goal is to estimate these large numbers from physical principles and mathematical derivations.
When a large distance separates the source and the target (a small activation site), the redundancy principle explains that this geometrical gap can be compensated by large numbers. Had nature used fewer copies than normal, activation would have taken much longer, as finding a small target by chance is a rare event and falls into the class of narrow escape problems.
Molecular rate
The time for the fastest particles to reach a target in the context of redundancy depends on the number of particles and the local geometry of the target. In most cases, this time defines the rate of activation. This rate should be used instead of the classical Smoluchowski rate, which describes the mean arrival time but not the fastest. The statistics of the minimal time to activation set kinetic laws in biology, which can be quite different from the ones associated with average times.
Physical models
Stochastic process
The motion of a particle located at position $X_t$ can be described by the Smoluchowski limit of the Langevin equation:

$$dX_t = \frac{1}{\gamma} F(X_t)\,dt + \sqrt{2D}\,dB_t,$$

where $D$ is the diffusion coefficient of the particle, $\gamma$ is the friction coefficient per unit of mass, $F(x)$ the force per unit of mass, and $B_t$ is a Brownian motion. This model is classically used in molecular dynamics simulations.
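As a rough illustration, the overdamped dynamics can be discretized with an Euler-Maruyama step. The sketch below is one-dimensional and assumes the form $dX = (F(X)/\gamma)\,dt + \sqrt{2D}\,dB$; the function name `simulate_overdamped`, the linear restoring force and the parameter values are illustrative assumptions, not part of the model description above.

```python
import math
import random


def simulate_overdamped(x0, force, D, gamma, dt, n_steps, rng):
    """Euler-Maruyama discretization of dX = F(X)/gamma dt + sqrt(2 D) dB (1D sketch)."""
    x = x0
    path = [x]
    noise_scale = math.sqrt(2.0 * D * dt)  # standard deviation of one Brownian increment
    for _ in range(n_steps):
        drift = force(x) / gamma * dt
        x = x + drift + noise_scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


# Hypothetical usage: a particle in a linear restoring force F(x) = -x.
rng = random.Random(0)
path = simulate_overdamped(x0=1.0, force=lambda x: -x, D=1.0, gamma=1.0,
                           dt=1e-3, n_steps=1000, rng=rng)
```

Setting `D=0.0` removes the noise term and leaves the deterministic relaxation, which is a convenient sanity check for the drift part of the step.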
Jump processes
$$x_{n+1} = \begin{cases} x_n - a, & \text{with probability } l(x_n), \\ x_n + b, & \text{with probability } r(x_n), \end{cases}$$

which is, for example, a model of telomere length dynamics. Here $r(x) = \frac{1}{1 + \beta x}$, with $r(x) + l(x) = 1$.
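A minimal simulation sketch of such a jump process follows. It assumes a state-dependent probability of jumping up of the form $r(x) = 1/(1 + \beta x)$ with $l(x) = 1 - r(x)$; the function names, the clamping at zero and the parameter values are hypothetical choices made only for illustration.

```python
import random


def jump_step(x, a, b, beta, rng):
    """One step: jump up by b with probability r(x) = 1/(1 + beta*x), else down by a."""
    r = 1.0 / (1.0 + beta * x)
    if rng.random() < r:
        return x + b
    return max(x - a, 0.0)  # assumption: the length is kept non-negative


def simulate_jumps(x0, a, b, beta, n_steps, rng):
    """Return the list of successive states x_0, x_1, ..., x_n."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(jump_step(xs[-1], a, b, beta, rng))
    return xs


# Hypothetical parameters: small frequent losses, rarer large gains.
xs = simulate_jumps(x0=10.0, a=1.0, b=5.0, beta=0.5, n_steps=200, rng=random.Random(0))
```

Because the probability of a gain decreases with $x$, long states are more likely to shrink and short states are more likely to grow, so the trajectory fluctuates around an intermediate length.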
Directed motion process
$$\dot{X} = v_0\,\mathbf{u},$$

where $\mathbf{u}$ is a unit vector chosen from a uniform distribution. Upon hitting an obstacle at a boundary point $X_0 \in \partial\Omega$, the velocity changes to $\dot{X} = v_0\,\mathbf{v}$, where $\mathbf{v}$ is chosen on the unit sphere in the supporting half space at $X_0$ from a uniform distribution, independently of $\mathbf{u}$. This rectilinear motion with constant velocity is a simplified model of spermatozoon motion in a bounded domain $\Omega$. Other models include diffusion on a graph and active motion on a graph.
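The directed motion with random reflection can be sketched in a simple geometry. The example below assumes a two-dimensional disk of radius `R` as the bounded domain, so the "unit sphere in the supporting half space" becomes a half circle of directions pointing into the disk; the function names and the choice of a disk are assumptions made for illustration.

```python
import math
import random


def next_hit(p, u, R):
    """Distance to, and location of, the point where the ray p + s*u leaves the disk of radius R."""
    pu = p[0] * u[0] + p[1] * u[1]
    pp = p[0] * p[0] + p[1] * p[1]
    disc = max(pu * pu - (pp - R * R), 0.0)  # guard against round-off at the boundary
    s = -pu + math.sqrt(disc)
    return s, (p[0] + s * u[0], p[1] + s * u[1])


def random_inward_direction(x0, rng):
    """Unit vector drawn uniformly from the half-plane pointing into the disk at x0."""
    inward = math.atan2(-x0[1], -x0[0])  # angle of the inward normal at x0
    theta = inward + rng.uniform(-math.pi / 2, math.pi / 2)
    return (math.cos(theta), math.sin(theta))


def simulate_directed(p0, v0, R, n_bounces, rng):
    """Return the list of (time, boundary point) for successive wall hits."""
    phi = rng.uniform(0.0, 2.0 * math.pi)
    u = (math.cos(phi), math.sin(phi))
    p, t, hits = p0, 0.0, []
    for _ in range(n_bounces):
        s, p = next_hit(p, u, R)
        t += s / v0  # constant speed between two hits
        hits.append((t, p))
        u = random_inward_direction(p, rng)
    return hits


hits = simulate_directed(p0=(0.0, 0.0), v0=1.0, R=1.0, n_bounces=50, rng=random.Random(0))
```

A search time to a small target could be obtained from such a simulation by stopping at the first hit that falls inside a chosen boundary arc.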
Mathematical formulation: Computing the rate of arrival time for the fastest
The mathematical analysis of large numbers of molecules, which are obviously redundant in the traditional activation theory, is used to compute the in vivo time scale of stochastic chemical reactions. The computation relies on asymptotics or probabilistic approaches to estimate the mean time of the fastest to reach a small target in various geometries.
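With $N$ non-interacting i.i.d. Brownian trajectories (ions) in a bounded domain $\Omega$ that bind at a site, the shortest arrival time is by definition

$$\tau^1 = \min(t_1, \ldots, t_N),$$

where $t_i$ are the independent arrival times of the $N$ ions in the medium. The survival distribution of the arrival time of the fastest, $\Pr\{\tau^1 > t\}$, is expressed in terms of a single particle: $\Pr\{\tau^1 > t\} = \Pr^N\{t_1 > t\}$. Here $\Pr\{t_1 > t\}$ is the survival probability of a single particle prior to binding at the target. This probability is computed from the solution of the diffusion equation in a domain $\Omega$:

$$\begin{aligned}
\frac{\partial p(x,t)}{\partial t} &= D \Delta p(x,t) && \text{for } x \in \Omega,\ t > 0, \\
p(x,0) &= p_0(x) && \text{for } x \in \Omega, \\
\frac{\partial p(x,t)}{\partial n} &= 0 && \text{for } x \in \partial\Omega_r, \\
p(x,t) &= 0 && \text{for } x \in \partial\Omega_a,
\end{aligned}$$

where the boundary $\partial\Omega$ contains $N_R$ binding sites $\partial\Omega_i \subset \partial\Omega$ ($\partial\Omega_a = \bigcup_{i=1}^{N_R} \partial\Omega_i$, $\partial\Omega_r = \partial\Omega \setminus \partial\Omega_a$). The single particle survival probability is

$$\Pr\{t_1 > t\} = \int_\Omega p(x,t)\,dx,$$

so that

$$\Pr\{\tau^1 = t\} = \frac{d}{dt}\Pr\{\tau^1 < t\} = N \left(\Pr\{t_1 > t\}\right)^{N-1} \Pr\{t_1 = t\},$$

where, with $n$ the outward unit normal, $\Pr\{t_1 = t\} = -D \oint_{\partial\Omega_a} \frac{\partial p(x,t)}{\partial n}\,dS_x$, which reduces to $\Pr\{t_1 = t\} = -D N_R \oint_{\partial\Omega_1} \frac{\partial p(x,t)}{\partial n}\,dS_x$ when the $N_R$ binding sites contribute equally.

The probability density function (pdf) of the arrival time is

$$\Pr\{\tau^1 = t\} = -D N N_R \left[\int_\Omega p(x,t)\,dx\right]^{N-1} \oint_{\partial\Omega_1} \frac{\partial p(x,t)}{\partial n}\,dS_x,$$

which gives the MFPT

$$\bar\tau^1 = \int_0^\infty \Pr\{\tau^1 > t\}\,dt = \int_0^\infty \left[\Pr\{t_1 > t\}\right]^N dt.$$

The probability $\Pr\{t_1 > t\}$ can be computed using short-time asymptotics of the diffusion equation as shown in the next sections.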
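The relation between the single-particle survival probability and the mean time of the fastest can be checked numerically on a toy case. The sketch below assumes an exponential single-particle survival $e^{-\lambda t}$, which is not the diffusion case treated here but has the known answer $1/(N\lambda)$ for the minimum of $N$ arrival times; the helper name `mfpt_fastest` and the rate value are hypothetical.

```python
import math


def mfpt_fastest(survival, N, t_max, n_grid):
    """Trapezoidal estimate of the integral of survival(t)**N over [0, t_max]."""
    dt = t_max / n_grid
    total = 0.0
    for k in range(n_grid + 1):
        weight = 0.5 if k in (0, n_grid) else 1.0
        total += weight * survival(k * dt) ** N
    return total * dt


rate = 1.0  # hypothetical single-particle binding rate (toy assumption)


def single(t):
    """Toy single-particle survival probability exp(-rate * t)."""
    return math.exp(-rate * t)


tau_10 = mfpt_fastest(single, N=10, t_max=5.0, n_grid=50000)  # close to 1 / (10 * rate)
tau_20 = mfpt_fastest(single, N=20, t_max=5.0, n_grid=50000)  # close to 1 / (20 * rate)
```

The same routine accepts any survival function, so a diffusion-based $\Pr\{t_1 > t\}$ can be substituted once it is available in closed form or from simulation.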
Explicit computation in dimension 1
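The short-time asymptotics of the diffusion equation are based on the ray method approximation. For the half-line $[0, \infty)$, the survival pdf is the solution of

$$\begin{aligned}
\frac{\partial p(x,t)}{\partial t} &= D \frac{\partial^2 p(x,t)}{\partial x^2} && \text{for } x > 0,\ t > 0, \\
p(x,0) &= \delta(x - a) && \text{for } x > 0, \\
p(0,t) &= 0 && \text{for } t > 0,
\end{aligned}$$

that is

$$p(x,t) = \frac{1}{\sqrt{4\pi D t}} \left[ \exp\left(-\frac{(x-a)^2}{4Dt}\right) - \exp\left(-\frac{(x+a)^2}{4Dt}\right) \right].$$

The survival probability with $D = 1$ is

$$\Pr\{t_1 > t\} = \int_0^\infty p(x,t)\,dx = 1 - \frac{2}{\sqrt{\pi}} \int_{a/\sqrt{4t}}^\infty e^{-u^2}\,du.$$

To compute the MFPT, we expand the complementary error function

$$\frac{2}{\sqrt{\pi}} \int_x^\infty e^{-u^2}\,du = \frac{e^{-x^2}}{x\sqrt{\pi}} \left(1 - \frac{1}{2x^2} + O(x^{-4})\right) \quad \text{for } x \gg 1,$$

which gives

$$\bar\tau^1 = \int_0^\infty \left[\Pr\{t_1 > t\}\right]^N dt \approx \int_0^\infty \exp\left\{ N \ln\left(1 - \frac{e^{-(a/\sqrt{4t})^2}}{(a/\sqrt{4t})\sqrt{\pi}}\right)\right\} dt \approx \frac{a^2}{4} \int_0^\infty \exp\left\{-\frac{N\sqrt{u}\,e^{-1/u}}{\sqrt{\pi}}\right\} du,$$

leading (the main contribution of the integral is near 0) to

$$\bar\tau^1 \approx \frac{a^2}{4D \ln\left(\frac{N}{\sqrt{\pi}}\right)} \quad \text{for } N \gg 1.$$

This result is reminiscent of Gumbel's law. Similarly, escape from the interval $[0, a]$ is computed from the infinite sum

$$p(x,t \mid y) = \frac{1}{\sqrt{4\pi D t}} \sum_{n=-\infty}^{\infty} \left[ \exp\left(-\frac{(x - y + 2na)^2}{4Dt}\right) - \exp\left(-\frac{(x + y + 2na)^2}{4Dt}\right) \right].$$

The conditional survival probability is approximated by

$$\Pr\{t_1 > t \mid y\} = \int_0^a p(x,t \mid y)\,dx \sim 1 - \max \frac{\sqrt{4Dt}}{\delta\sqrt{\pi}} \exp\left(-\frac{\delta^2}{4Dt}\right) \quad \text{as } t \to 0,$$

where the maximum occurs at $\delta = \min[y, a - y]$ for $0 < y < a$ (the shortest ray from $y$ to the boundary). All other integrals can be computed explicitly, leading to

$$\bar\tau^1 = \int_0^\infty \left[\Pr\{t_1 > t\}\right]^N dt \approx \int_0^\infty \exp\left\{ N \ln\left(1 - \frac{\sqrt{4Dt}}{\delta\sqrt{\pi}}\, e^{-\delta^2/(4Dt)}\right)\right\} dt \approx \frac{\delta^2}{4D \ln\left(\frac{N}{\sqrt{\pi}}\right)} \quad \text{for } N \gg 1.$$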
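The half-line result can be compared with a direct numerical integration. The sketch below assumes the single-particle survival probability $\operatorname{erf}(a/\sqrt{4Dt})$ for a particle starting at distance $a$ from an absorbing end point, and the leading-order estimate $a^2 / (4D \ln(N/\sqrt{\pi}))$; the function names, grid size and cutoff `t_max` are hypothetical choices. Since the asymptotic formula is a leading-order logarithmic estimate, only rough agreement should be expected at moderate $N$.

```python
import math


def survival_half_line(t, a, D):
    """Single-particle survival probability erf(a / sqrt(4 D t)) on the half-line."""
    if t <= 0.0:
        return 1.0
    return math.erf(a / math.sqrt(4.0 * D * t))


def mean_fastest_time(N, a, D, t_max, n_grid):
    """Trapezoidal estimate of the integral of survival**N over [0, t_max]."""
    dt = t_max / n_grid
    total = 0.0
    for k in range(n_grid + 1):
        weight = 0.5 if k in (0, n_grid) else 1.0
        total += weight * survival_half_line(k * dt, a, D) ** N
    return total * dt


def asymptotic_fastest_time(N, a, D):
    """Leading-order estimate a^2 / (4 D ln(N / sqrt(pi))), meaningful for N >> 1."""
    return a * a / (4.0 * D * math.log(N / math.sqrt(math.pi)))


# Print N, the numerical mean and the asymptotic estimate side by side.
for N in (100, 1000, 10000):
    numeric = mean_fastest_time(N, a=1.0, D=1.0, t_max=2.0, n_grid=40000)
    print(N, numeric, asymptotic_fastest_time(N, a=1.0, D=1.0))
```

Both columns decrease slowly with $N$, which reflects the logarithmic dependence: multiplying the number of particles by ten shortens the fastest arrival time only modestly.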
Arrival times of the fastest in higher dimensions
The arrival times of the fastest among many Brownian motions are expressed in terms of the shortest distance from the source $S$ to the absorbing window $A$, measured by the distance $\delta_{min} = d(S, A)$, where $d$ is the associated Euclidean distance. Interestingly, the trajectories followed by the fastest particles are as close as possible to the optimal trajectories. In technical language, the trajectories of the fastest among $N$ concentrate near the optimal trajectory (shortest path) as the number $N$ of particles increases. For a diffusion coefficient $D$ and a window of size $a$, the expected first arrival times of $N$ independent, identically distributed Brownian particles initially positioned at the source $S$ are expressed in the following asymptotic formulas:

$$\bar\tau^{d1} \approx \frac{\delta_{min}^2}{4D \ln\left(\frac{N}{\sqrt{\pi}}\right)} \quad \text{in dimension 1},$$

$$\bar\tau^{d2} \approx \frac{\delta_{min}^2}{4D \log\left(\frac{\pi\sqrt{2}\,N}{8 \log\left(\frac{1}{a}\right)}\right)} \quad \text{in dimension 2},$$

$$\bar\tau^{d3} \approx \frac{\delta_{min}^2}{2D \sqrt{\log\left(N \frac{4a^2}{\pi^{1/2} \delta_{min}^2}\right)}} \quad \text{in dimension 3}.$$

These formulas show that the expected arrival time of the fastest particle is, in dimensions 1 and 2, $O(1/\log(N))$. They should be used instead of the classical forward rate in models of activation in biochemical reactions. The method to derive these formulas is based on short-time asymptotics and the Green's function representation of the Helmholtz equation. Note that other distributions could lead to other decays with respect to $N$.
Optimal Paths
The optimal path for large N
The optimal paths for the fastest can be found using the Wentzell-Freidlin functional in large-deviation theory. These paths correspond to the short-time asymptotics of the diffusion equation from a source to a target. In general, the exact solution is hard to find, especially for a space containing various distributions of obstacles.
The Wiener integral representation of the pdf for a pure Brownian motion is obtained for zero drift and a constant diffusion tensor $D$ in dimension $d$, so that it is given by the probability of a sampled path until it exits at the small window $\partial\Omega_a$ at the random time $T$:

$$\Pr\{x(t_{1,M}) \in \Omega, \ldots, x(t_{M-1,M}) \in \Omega,\ x(t) = x,\ t \le T \le t + \Delta t \mid x(0) = y\} = \int_\Omega \cdots \int_\Omega \prod_{j=1}^{M} \frac{dy_j}{(4\pi D \Delta t)^{d/2}} \exp\left\{-\frac{|y_j - y_{j-1}|^2}{4D\Delta t}\right\},$$

where $\Delta t = t/M$, $t_{j,M} = j\Delta t$ and $y_0 = y$ in the product, and $T$ is the exit time in the narrow absorbing window $\partial\Omega_a$.

Let $S_n(y)$ be the ensemble of shortest paths selected among $n$ Brownian trajectories, starting at point $y$ and exiting between time $t$ and $t + dt$ from the domain $\Omega$. The probability $\Pr\{\text{Path } \sigma \in S_n(y)\}$ is used to show that the empirical stochastic trajectories of $S_n(y)$ concentrate near the shortest paths starting from $y$ and ending at the small absorbing window $\partial\Omega_a$, under the condition that $\varepsilon = |\partial\Omega_a| / |\partial\Omega| \ll 1$. The paths of $S_n(y)$ can be approximated using discrete broken lines among a finite number of points, and we denote the associated ensemble by $\tilde S_n(y)$. Bayes' rule leads to

$$\Pr\{\text{Path } \sigma \in \tilde S_n(y) \mid t < \tau_\sigma < t + dt\} = \sum_{m=0}^{\infty} \Pr\{\text{Path } \sigma \in \tilde S_n(y) \mid m,\ t < \tau_\sigma < t + dt\}\, \Pr\{m \text{ steps}\},$$

where $\Pr\{m \text{ steps}\}$ is the probability that a path of $\tilde S_n(y)$ exits in $m$ discrete time steps. A path made of broken lines (random walk with a time step $\Delta t$) can be expressed using the Wiener path integral. The probability of a Brownian path $x(s)$ can be expressed in the limit of a path integral with the functional:

$$\Pr\{x(s) \mid s \in [0, t]\} \approx \exp\left(-\frac{1}{4D}\int_0^t |\dot x(s)|^2\,ds\right).$$
The survival probability conditioned on starting at $y$ is given by the Wiener representation:

$$S(t \mid y) = \int_{x \in \Omega} dx \int_{x(0)=y}^{x(t)=x} \mathcal{D}(x) \exp\left(-\frac{1}{4D}\int_0^t |\dot x(s)|^2\,ds\right),$$

where $\mathcal{D}(x)$ is the limit Wiener measure: the exterior integral is taken over all end points $x$ and the path integral is over all paths starting from $x(0)$. When we consider $n$ independent paths $(\sigma_1, \ldots, \sigma_n)$ (made of points with a time step $\Delta t$) that exit in $m$ steps, the probability of such an event is

$$\Pr\{\sigma_1, \ldots, \sigma_n \in \tilde S_n(y) \mid m,\ \tau_\sigma = m\Delta t\} \approx \left( \int_{x \in \partial\Omega_a} dx \int_{x(0)=y}^{x(m\Delta t)=x} \mathcal{D}(x) \exp\left\{-\frac{1}{4D}\int_0^{m\Delta t} |\dot x(s)|^2\,ds\right\} \right)^n.$$

Indeed, when there are $n$ paths of $m$ steps and the fastest one escapes in $m$ steps, they should all exit in $m$ steps. Using the limit of the path integral, we get heuristically the representation

$$\Pr\{\text{Path } \sigma \in S_n(y) \mid m,\ \tau_\sigma = m\Delta t\} \approx \int_{x \in \partial\Omega_a} dx \int_{x(0)=y}^{x(m\Delta t)=x} \mathcal{D}(x) \exp\left\{-\frac{n}{4D}\int_0^{m\Delta t} |\dot x(s)|^2\,ds\right\},$$

where the integral is taken over all paths starting at $y$ and exiting at time $m\Delta t$. This formula suggests that when $n$ is large, the paths that contribute the most are the ones that minimize the exponent, which allows selecting the paths for which the energy functional is minimal, that is

$$E = \min_{X \in \mathcal{P}_T} \int_0^T |\dot X(s)|^2\,ds,$$

where the minimization is taken over the ensemble $\mathcal{P}_T$ of regular paths inside $\Omega$ starting at $y$ and exiting in $\partial\Omega_a$, defined as

$$\mathcal{P}_T = \{P(0) = y,\ P(T) \in \partial\Omega_a \text{ and } P(s) \in \Omega \text{ for } 0 \le s \le T\}.$$

This formal argument shows that the random paths associated with the fastest exit time are concentrated near the shortest paths. Indeed, the Euler-Lagrange equations for the extremal problem are those of the classical geodesics between $y$ and a point in the narrow window $\partial\Omega_a$.
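The minimization argument can be illustrated on broken-line paths. The sketch below assumes a discretized energy $\sum_j |y_j - y_{j-1}|^2 / \Delta t$ as a stand-in for the integral of the squared velocity, a fixed end point in place of a window, and no obstacles; the function names and the perturbation amplitude are hypothetical. It checks that random perturbations of the straight broken line always increase the energy.

```python
import random


def path_energy(points, dt):
    """Discrete version of the integral of |dx/ds|^2: sum of |y_j - y_{j-1}|^2 / dt."""
    energy = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        energy += ((x1 - x0) ** 2 + (y1 - y0) ** 2) / dt
    return energy


def straight_path(start, end, m):
    """Broken line with m equal segments from start to end."""
    return [(start[0] + (end[0] - start[0]) * j / m,
             start[1] + (end[1] - start[1]) * j / m) for j in range(m + 1)]


def perturbed_path(path, amplitude, rng):
    """Same end points, interior points shifted by uniform random offsets."""
    inner = [(x + rng.uniform(-amplitude, amplitude),
              y + rng.uniform(-amplitude, amplitude)) for x, y in path[1:-1]]
    return [path[0]] + inner + [path[-1]]


m, T = 10, 1.0  # number of steps and total exit time (illustrative values)
line = straight_path((0.0, 0.0), (1.0, 0.0), m)
rng = random.Random(0)
e_line = path_energy(line, T / m)
e_rand = [path_energy(perturbed_path(line, 0.05, rng), T / m) for _ in range(100)]
```

In the unobstructed case the minimizer is the straight segment traversed at constant speed; with obstacles, the same energy would have to be minimized over paths that avoid them.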
Fastest escape from a cusp in two dimensions
The formula for the fastest escape can be generalized to the case where the absorbing window is located in a funnel cusp and the initial particles are distributed outside the cusp. The cusp has an opening of size $\varepsilon$ and a curvature $R$. The diffusion coefficient is $D$. The shortest arrival time, valid for large $n$, is given by an asymptotic expression in which $c$ is a constant that depends on the diameter of the domain. The time taken by the first arrivers is proportional to the reciprocal of the size of the narrow target $\varepsilon$. This formula is derived for fixed geometry and large $n$, and not in the opposite limit of large $n$ and small $\varepsilon$.
Concluding remarks
How nature sets these disproportionate numbers of particles remains unclear, but the numbers can be estimated using the theory of diffusion. One example is the number of neurotransmitters, around 2000 to 3000, released during synaptic transmission, which is set to compensate for the low copy number of receptors so that the probability of activation is restored to one.
In natural processes these large numbers should not be considered wasteful: they are necessary for generating the fastest possible response and for making possible rare events that otherwise would never happen. This property is universal, ranging from the molecular scale to the population level.
Nature's strategy for optimizing the response time is not necessarily defined by the physics of the motion of an individual particle, but rather by the extreme statistics that select the shortest paths. In addition, the search for a small activation site selects the particle that arrives first: although these trajectories are rare, they are the ones that set the time scale. We may need to reconsider how we estimate these numbers when probing nature, in agreement with the redundancy principle, which quantifies the number of copies required to achieve a biological function.