In the statistical area of survival analysis, an accelerated failure time model (AFT model) is a parametric model that provides an alternative to the commonly used proportional hazards models. Whereas a proportional hazards model assumes that the effect of a covariate is to multiply the hazard by some constant, an AFT model assumes that the effect of a covariate is to accelerate or decelerate the life course of a disease by some constant. This is especially appealing in a technical context where the 'disease' is a result of some mechanical process with a known sequence of intermediary stages.


Model specification

In full generality, the accelerated failure time model can be specified as
:: \lambda(t, \theta)=\theta\lambda_0(\theta t)
where \theta denotes the joint effect of covariates, typically \theta=\exp(-[\beta_1X_1 + \cdots + \beta_pX_p]). (Specifying the regression coefficients with a negative sign implies that high values of the covariates ''increase'' the survival time, but this is merely a sign convention; without a negative sign, they increase the hazard.)

This is satisfied if the probability density function of the event is taken to be f(t, \theta)=\theta f_0(\theta t); it then follows for the survival function that S(t, \theta)=S_0(\theta t). From this it is easy to see that the moderated life time T is distributed such that T\theta and the unmoderated life time T_0 have the same distribution. Consequently, \log(T) can be written as
:: \log(T)=-\log(\theta)+\log(T\theta):=-\log(\theta)+\epsilon
where the last term is distributed as \log(T_0), i.e., independently of \theta. This reduces the accelerated failure time model to regression analysis (typically a linear model) where -\log(\theta) represents the fixed effects, and \epsilon represents the noise. Different distributions of \epsilon imply different distributions of T_0, i.e., different baseline distributions of the survival time.

Typically, in survival-analytic contexts, many of the observations are censored: we only know that T_i>t_i, not T_i=t_i. The former case represents survival beyond the observed time (a right-censored observation), while the latter case represents an event, such as death, observed during the follow-up. Right-censored observations can pose technical challenges for estimating the model if the distribution of T_0 is unusual.

The interpretation of \theta in accelerated failure time models is straightforward: \theta=2 means that everything in the relevant life history of an individual happens twice as fast. For example, if the model concerns the development of a tumor, it means that all of the pre-stages progress twice as fast as for the unexposed individual, implying that the expected time until a clinical disease is 0.5 of the baseline time. However, this does not mean that the hazard function \lambda(t, \theta) is always twice as high; that would be the proportional hazards model.
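
The relations above can be checked numerically. The following is a minimal sketch, assuming a log-logistic baseline with scale `ALPHA` and shape `K`; the baseline and all function names are illustrative choices, not part of the model definition. It evaluates S(t, \theta)=S_0(\theta t) and \lambda(t, \theta)=\theta\lambda_0(\theta t), and compares the accelerated hazard with the baseline hazard at a few time points.

```python
import math

# Hypothetical baseline: log-logistic with scale ALPHA and shape K.
# This choice is an illustrative assumption, not prescribed by the model.
ALPHA, K = 1.0, 2.0


def baseline_survival(t):
    """S_0(t) for the assumed log-logistic baseline."""
    return 1.0 / (1.0 + (t / ALPHA) ** K)


def baseline_hazard(t):
    """lambda_0(t) = f_0(t) / S_0(t) for the same baseline."""
    z = (t / ALPHA) ** K
    return (K / ALPHA) * (t / ALPHA) ** (K - 1) / (1.0 + z)


def aft_survival(t, theta):
    """S(t, theta) = S_0(theta * t)."""
    return baseline_survival(theta * t)


def aft_hazard(t, theta):
    """lambda(t, theta) = theta * lambda_0(theta * t)."""
    return theta * baseline_hazard(theta * t)


def numeric_hazard(t, theta, h=1e-6):
    """Central-difference estimate of -d/dt log S(t, theta), used as a cross-check."""
    upper = math.log(aft_survival(t + h, theta))
    lower = math.log(aft_survival(t - h, theta))
    return -(upper - lower) / (2.0 * h)


theta = 2.0
# The baseline median is ALPHA, so the accelerated median is ALPHA / theta.
print("baseline median:", ALPHA, "accelerated median:", ALPHA / theta)
for t in (0.1, 1.0, 3.0):
    ratio = aft_hazard(t, theta) / baseline_hazard(t)
    print(f"t = {t}: hazard ratio = {ratio:.3f}")
```

Under these assumptions, \theta=2 halves the median survival time, while the ratio of accelerated to baseline hazard is not constant: it falls from about 3.9 at t=0.1 to 1.6 at t=1 and about 1.1 at t=3. A proportional hazards model would instead give the same ratio at every t.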


Statistical issues

Among proportional hazards models, Cox's semi-parametric model is more widely used than parametric models. AFT models, by contrast, are predominantly fully parametric, i.e. a probability distribution is specified for \log(T_0). (Buckley and James proposed a semi-parametric AFT but its use is relatively uncommon in applied research; in a 1992 paper, Wei pointed out that the Buckley–James model has no theoretical justification and lacks robustness, and reviewed alternatives.) This can be a problem if a degree of realistic detail is required for modelling the distribution of a baseline lifetime. Hence, technical developments in this direction would be highly desirable.

Unlike proportional hazards models, the regression parameter estimates from AFT models are robust to omitted covariates. They are also less affected by the choice of probability distribution.

The results of AFT models are easily interpreted. For example, the results of a clinical trial with mortality as the endpoint could be interpreted as a certain percentage increase in future life expectancy on the new treatment compared to the control. So a patient could be informed that they would be expected to live (say) 15% longer if they took the new treatment. Hazard ratios can prove harder to explain in layman's terms.
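
The percentage interpretation can be made concrete with a small sketch. It assumes a single covariate x with \theta=\exp(-\beta x), following the sign convention used in the model specification. The coefficient value, the baseline median and the function names are hypothetical, chosen only to reproduce the 15% figure.

```python
import math


def acceleration_factor(beta, x):
    """theta = exp(-beta * x) for a single covariate x."""
    return math.exp(-beta * x)


def time_ratio(beta, x_new, x_ref=0.0):
    """Factor applied to every survival-time quantile when moving from x_ref to x_new.

    Since T = T_0 / theta, the ratio is theta(x_ref) / theta(x_new).
    """
    return acceleration_factor(beta, x_ref) / acceleration_factor(beta, x_new)


def percent_change(beta, x_new, x_ref=0.0):
    """Percentage change in survival time implied by the time ratio."""
    return 100.0 * (time_ratio(beta, x_new, x_ref) - 1.0)


# Hypothetical fitted coefficient for a treatment indicator (x = 1 treated, x = 0 control).
beta_treatment = math.log(1.15)
# Hypothetical median survival under control, in years.
control_median_years = 10.0

treated_median_years = control_median_years * time_ratio(beta_treatment, 1.0)
print("percent change in survival time:", round(percent_change(beta_treatment, 1.0), 1))
print("treated median (years):", round(treated_median_years, 2))
```

Because the same factor multiplies the whole survival-time distribution, the median, the mean and every other quantile change by the same percentage in this sketch.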


Distributions used in AFT models

The log-logistic distribution provides the most commonly used AFT model. Unlike the Weibull distribution, it can exhibit a non-monotonic hazard function which increases at early times and decreases at later times. It is somewhat similar in shape to the log-normal distribution but it has heavier tails. The log-logistic cumulative distribution function has a simple closed form, which becomes important computationally when fitting data with censoring. For the censored observations one needs the survival function, which is the complement of the cumulative distribution function, i.e. one needs to be able to evaluate S(t, \theta)=1-F(t, \theta).

The Weibull distribution (including the exponential distribution as a special case) can be parameterised as either a proportional hazards model or an AFT model, and is the only family of distributions to have this property. (For a Weibull baseline with shape k, \lambda_0(t) is proportional to t^{k-1}, so \theta\lambda_0(\theta t)=\theta^k\lambda_0(t), which is a constant multiple of the baseline hazard.) The results of fitting a Weibull model can therefore be interpreted in either framework. However, the biological applicability of this model may be limited by the fact that the hazard function is monotonic, i.e. either decreasing or increasing.

Other distributions suitable for AFT models include the log-normal, gamma and inverse Gaussian distributions, although they are less popular than the log-logistic, partly as their cumulative distribution functions do not have a closed form. Finally, the generalized gamma distribution is a three-parameter distribution that includes the Weibull, log-normal and gamma distributions as special cases.
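
The role of the closed-form survival function in fitting censored data can be sketched as follows. This is a minimal illustration, assuming a log-logistic baseline with scale `alpha` and shape `k` and a single covariate with \theta=\exp(-\beta x); the function names and the toy data are hypothetical. Observed events contribute \log f(t, \theta)=\log\theta+\log f_0(\theta t) to the log-likelihood, and right-censored observations contribute \log S_0(\theta t).

```python
import math


def log_f0(u, alpha, k):
    """Log density of the assumed log-logistic baseline at u > 0."""
    z = (u / alpha) ** k
    return math.log(k / alpha) + (k - 1.0) * math.log(u / alpha) - 2.0 * math.log1p(z)


def log_S0(u, alpha, k):
    """Log survival function of the baseline: S_0(u) = 1 / (1 + (u / alpha)^k)."""
    return -math.log1p((u / alpha) ** k)


def censored_loglik(times, events, covariates, beta, alpha, k):
    """Log-likelihood of a log-logistic AFT model with right censoring.

    events[i] is 1 if the event was observed at times[i], 0 if censored there.
    """
    total = 0.0
    for t, event, x in zip(times, events, covariates):
        log_theta = -beta * x            # sign convention: theta = exp(-beta * x)
        u = math.exp(log_theta) * t      # time on the baseline scale
        if event:
            total += log_theta + log_f0(u, alpha, k)   # density contribution
        else:
            total += log_S0(u, alpha, k)               # survival contribution
    return total


# Toy data (hypothetical): follow-up times, event indicators, one binary covariate.
times = [0.4, 1.2, 2.5, 0.9, 3.0]
events = [1, 1, 0, 1, 0]
covariates = [0.0, 1.0, 1.0, 0.0, 1.0]

for beta in (0.0, 0.5, 1.0):
    value = censored_loglik(times, events, covariates, beta, alpha=1.0, k=2.0)
    print(f"beta = {beta}: log-likelihood = {value:.4f}")
```

In practice this function would be passed to a numerical optimiser to estimate \beta, alpha and k jointly. The sketch only evaluates it on a small grid of \beta values, and no step requires numerical integration because S_0 is available in closed form.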


Further reading

* Martinussen, Torben; Scheike, Thomas (2006), Dynamic Regression Models for Survival Data, Springer
* Bagdonavicius, Vilijandas; Nikulin, Mikhail (2002), Accelerated Life Models. Modeling and Statistical Analysis, Chapman & Hall/CRC