In the statistical area of survival analysis, an accelerated failure time model (AFT model) is a parametric model that provides an alternative to the commonly used proportional hazards models. Whereas a proportional hazards model assumes that the effect of a covariate is to multiply the hazard by some constant, an AFT model assumes that the effect of a covariate is to accelerate or decelerate the life course of a disease by some constant factor. This is especially appealing in a technical context where the 'disease' is the result of some mechanical process with a known sequence of intermediary stages.
Model specification
In full generality, the accelerated failure time model can be specified as
:: $\lambda(t \mid \theta) = \theta \lambda_0(\theta t)$
where $\lambda_0$ is the baseline hazard function and $\theta$ denotes the joint effect of covariates, typically $\theta = \exp(-[\beta_1 X_1 + \cdots + \beta_p X_p])$. (Specifying the regression coefficients with a negative sign implies that high values of the covariates ''increase'' the survival time, but this is merely a sign convention; without a negative sign, they increase the hazard.)
This is satisfied if the probability density function of the event is taken to be $f(t \mid \theta) = \theta f_0(\theta t)$; it then follows for the survival function that $S(t \mid \theta) = S_0(\theta t)$. From this it is easy to see that the moderated life time $T$ is distributed such that $T\theta$ and the unmoderated life time $T_0$ have the same distribution. Consequently, $\log(T)$ can be written as
:: $\log(T) = -\log(\theta) + \log(T\theta) := -\log(\theta) + \epsilon$
where the last term is distributed as $\log(T_0)$, i.e., independently of $\theta$. This reduces the accelerated failure time model to regression analysis (typically a linear model) where $-\log(\theta)$ represents the fixed effects, and $\epsilon$ represents the noise. Different distributions of $\epsilon$ imply different distributions of $T_0$, i.e., different baseline distributions of the survival time.
Typically, in survival-analytic contexts, many of the observations are censored: we only know that $T_i > t_i$, not $T_i = t_i$. The former case is a right-censored observation (the individual was still event-free at time $t_i$), while the latter case is an event, such as death, observed during the follow-up. These right-censored observations can pose technical challenges for estimating the model if the distribution of $T_0$ is unusual.
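The log-linear form of the model can be checked numerically. The following is a minimal sketch rather than a fitting routine: it assumes a single binary covariate, a standard log-logistic baseline (so the noise term is standard logistic), no censoring, and an illustrative coefficient value. The helper names `simulate_log_times` and `ols_slope` are hypothetical. It simulates log(T) = beta * x + eps and recovers beta by ordinary least squares.

```python
import math
import random


def simulate_log_times(xs, beta, rng):
    """Simulate log(T) = beta * x + eps with eps standard logistic.

    With theta = exp(-beta * x), this is log(T) = -log(theta) + eps, where
    eps = log(T_0) and T_0 follows a standard log-logistic distribution.
    """
    log_times = []
    for x in xs:
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
            u = rng.random()
        eps = math.log(u / (1.0 - u))  # inverse CDF of the standard logistic
        log_times.append(beta * x + eps)
    return log_times


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (single covariate plus intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)
xs = [i % 2 for i in range(20000)]  # binary covariate, balanced groups
true_beta = 0.7                     # illustrative value, not from the text
log_times = simulate_log_times(xs, true_beta, rng)
beta_hat = ols_slope(xs, log_times)
print(f"estimated beta: {beta_hat:.3f} (true value {true_beta})")
```

Least squares on log(T) is only appropriate here because every event time is observed; with censored observations a censored likelihood is needed instead.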
The interpretation of $\theta$ in accelerated failure time models is straightforward: $\theta = 2$ means that everything in the relevant life history of an individual happens twice as fast. For example, if the model concerns the development of a tumor, it means that all of the pre-stages progress twice as fast as for the unexposed individual, implying that the expected time until a clinical disease is 0.5 of the baseline time. However, this does not mean that the hazard function $\lambda(t \mid \theta)$ is always twice as high; that would be the proportional hazards model.
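The distinction between halving the time scale and doubling the hazard can be illustrated with a small sketch. It assumes a standard log-logistic baseline with shape 2 and an acceleration factor of 2; both values and all function names are illustrative choices, not part of the model definition.

```python
def baseline_survival(t, shape):
    """Standard log-logistic survival function S_0(t) = 1 / (1 + t**shape)."""
    return 1.0 / (1.0 + t ** shape)


def baseline_hazard(t, shape):
    """Standard log-logistic hazard: shape * t**(shape - 1) / (1 + t**shape)."""
    return shape * t ** (shape - 1) / (1.0 + t ** shape)


def aft_survival(t, theta, shape):
    """AFT survival function S(t | theta) = S_0(theta * t)."""
    return baseline_survival(theta * t, shape)


def aft_hazard(t, theta, shape):
    """AFT hazard lambda(t | theta) = theta * lambda_0(theta * t)."""
    return theta * baseline_hazard(theta * t, shape)


theta, shape = 2.0, 2.0  # illustrative values

# The baseline median is 1 because S_0(1) = 0.5; with theta = 2 the same
# survival probability is reached at t = 0.5, i.e. in half the time.
print(baseline_survival(1.0, shape), aft_survival(0.5, theta, shape))

# The hazard ratio against baseline changes with t, so the hazards are
# not proportional.
for t in (0.1, 1.0, 10.0):
    ratio = aft_hazard(t, theta, shape) / baseline_hazard(t, shape)
    print(f"t = {t:5.1f}  hazard ratio = {ratio:.3f}")
```

Under these assumptions the hazard ratio is largest at early times and approaches 1 at late times, whereas every quantile of the survival time is divided by exactly 2.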
Statistical issues
Unlike proportional hazards models, in which Cox's semi-parametric proportional hazards model is more widely used than parametric models, AFT models are predominantly fully parametric, i.e. a probability distribution is specified for $\log(T_0)$. (Buckley and James proposed a semi-parametric AFT model, but its use is relatively uncommon in applied research; in a 1992 paper, Wei pointed out that the Buckley–James model has no theoretical justification and lacks robustness, and reviewed alternatives.) This can be a problem if a degree of realistic detail is required for modelling the distribution of a baseline lifetime. Hence, technical developments in this direction would be highly desirable.
Unlike proportional hazards models, the regression parameter estimates from AFT models are robust to omitted covariates. They are also less affected by the choice of probability distribution.
The results of AFT models are easily interpreted. For example, the results of a clinical trial with mortality as the endpoint could be interpreted as a certain percentage increase in future life expectancy on the new treatment compared to the control. So a patient could be informed that he would be expected to live (say) 15% longer if he took the new treatment. Hazard ratios can prove harder to explain in layman's terms.
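The percentage interpretation follows from the log-linear form of the model. As a minimal sketch, assuming the convention log(T) = beta * x + eps (so that a one-unit increase in x multiplies the survival time by exp(beta)), a coefficient can be converted to a percentage change as follows. The function name and the coefficient value are hypothetical.

```python
import math


def percent_change_in_survival_time(beta):
    """Percent change in survival time per unit increase in a covariate.

    Assumes the log-linear form log(T) = beta * x + eps, so a one-unit
    increase in x multiplies T by the time ratio exp(beta).
    """
    return 100.0 * (math.exp(beta) - 1.0)


beta_treatment = math.log(1.15)  # hypothetical fitted treatment coefficient
print(f"{percent_change_in_survival_time(beta_treatment):.1f}% longer")
```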
Distributions used in AFT models
The log-logistic distribution provides the most commonly used AFT model. Unlike the Weibull distribution, it can exhibit a non-monotonic hazard function which increases at early times and decreases at later times. It is somewhat similar in shape to the log-normal distribution but it has heavier tails. The log-logistic cumulative distribution function has a simple closed form, which becomes important computationally when fitting data with censoring. For the censored observations one needs the survival function, which is the complement of the cumulative distribution function, i.e. one needs to be able to evaluate $S(t) = 1 - F(t)$.
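To make the role of the closed form concrete, the following sketch evaluates a right-censored log-likelihood for the log-logistic distribution. It assumes the usual scale and shape parameterisation, with S(t) = 1 / (1 + (t / scale)**shape); the function names and the toy data are made up for illustration, and no optimiser is included.

```python
import math


def loglogistic_survival(t, scale, shape):
    """S(t) = 1 - F(t) = 1 / (1 + (t / scale)**shape)."""
    return 1.0 / (1.0 + (t / scale) ** shape)


def loglogistic_density(t, scale, shape):
    """f(t) = (shape / scale) * z**(shape - 1) / (1 + z**shape)**2, z = t / scale."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1) / (1.0 + z ** shape) ** 2


def censored_log_likelihood(times, observed, scale, shape):
    """Log-likelihood for right-censored data.

    An observed event contributes log f(t); a censored observation
    contributes log S(t), since only T > t is known.
    """
    total = 0.0
    for t, is_event in zip(times, observed):
        if is_event:
            total += math.log(loglogistic_density(t, scale, shape))
        else:
            total += math.log(loglogistic_survival(t, scale, shape))
    return total


# Toy data (made up for illustration): follow-up times and event indicators.
times = [0.5, 1.0, 1.0, 2.5]
observed = [True, True, False, False]
print(censored_log_likelihood(times, observed, scale=1.0, shape=2.0))
```

Because both terms are elementary functions, each likelihood evaluation needs no numerical integration, which is not the case for distributions whose cumulative distribution function lacks a closed form.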
The Weibull distribution (including the exponential distribution as a special case) can be parameterised as either a proportional hazards model or an AFT model, and is the only family of distributions to have this property. The results of fitting a Weibull model can therefore be interpreted in either framework. However, the biological applicability of this model may be limited by the fact that the hazard function is monotonic, i.e. either decreasing or increasing.
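The dual interpretation can be verified directly. The derivation below is a sketch assuming a unit-scale Weibull baseline with shape parameter $k$ (a symbol introduced here for illustration) and the AFT specification $\lambda(t \mid \theta) = \theta \lambda_0(\theta t)$:

```latex
\begin{aligned}
\lambda_0(t) &= k\, t^{k-1}, \\
\lambda(t \mid \theta) &= \theta\, \lambda_0(\theta t)
  = \theta\, k\, (\theta t)^{k-1}
  = \theta^{k}\, k\, t^{k-1}
  = \theta^{k}\, \lambda_0(t).
\end{aligned}
```

Accelerating time by $\theta$ therefore multiplies the hazard by the constant $\theta^{k}$, which is a proportional hazards model. With $\theta = \exp(-\beta x)$ the hazard ratio is $\exp(-k \beta x)$, so the proportional hazards coefficient equals $-k$ times the AFT coefficient; in the exponential case $k = 1$ the hazard ratio is simply $\theta$.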
Other distributions suitable for AFT models include the log-normal, gamma and inverse Gaussian distributions, although they are less popular than the log-logistic, partly as their cumulative distribution functions do not have a closed form. Finally, the generalized gamma distribution is a three-parameter distribution that includes the Weibull, log-normal and gamma distributions as special cases.
Further reading
* Martinussen, Torben; Scheike, Thomas (2006), Dynamic Regression Models for Survival Data, Springer
* Bagdonavicius, Vilijandas; Nikulin, Mikhail (2002), Accelerated Life Models: Modeling and Statistical Analysis, Chapman & Hall/CRC