
The Lee–Carter model is a numerical algorithm used in mortality forecasting and life expectancy forecasting. The input to the model is a matrix of age-specific mortality rates ordered monotonically by time, usually with ages in columns and years in rows. The output is a forecasted matrix of mortality rates in the same format as the input.

The model uses singular value decomposition (SVD) to find:
* A univariate time series vector \mathbf{k}_t that captures 80–90% of the mortality trend (here the subscript t refers to time),
* A vector \mathbf{b}_x that describes the relative mortality at each age (here the subscript x refers to age), and
* A scaling constant (referred to here as s_1 but unnamed in the literature).

\mathbf{k}_t is usually linear, implying that gains to life expectancy are fairly constant year after year in most populations. Prior to computing the SVD, age-specific mortality rates are first transformed into \mathbf{A}_{x,t} by taking their logarithms and then centering them by subtracting their age-specific means over time. The age-specific mean over time is denoted by \mathbf{a}_x. The subscript x,t refers to the fact that \mathbf{A}_{x,t} spans both age and time.

Many researchers adjust the \mathbf{k}_t vector by fitting it to empirical life expectancies for each year, using the \mathbf{a}_x and \mathbf{b}_x generated with SVD. When adjusted using this approach, changes to \mathbf{k}_t are usually small.

To forecast mortality, \mathbf{k}_t (either adjusted or not) is projected n years into the future using an ARIMA model. The corresponding forecasted \mathbf{A}_{x,t+n} is recovered by multiplying \mathbf{k}_{t+n} by \mathbf{b}_x and the first diagonal element of \mathbf{S} (where \mathbf{U} \mathbf{S} \mathbf{V^*} = \text{svd}(\mathbf{A}_{x,t})). Because of the linearity of \mathbf{k}_t, it is generally modeled as a random walk with trend. Life expectancy and other life table measures can be calculated from this forecasted matrix after adding back the means \mathbf{a}_x and taking exponentials to yield regular mortality rates.

In most implementations, confidence intervals for the forecasts are generated by simulating multiple mortality forecasts using Monte Carlo methods. A band of mortality between the 5th and 95th percentiles of the simulated results is considered to be a valid forecast. These simulations are done by extending \mathbf{k}_t into the future using randomization based on the standard error of \mathbf{k}_t derived from the input data.
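
The Monte Carlo procedure described above can be illustrated with a minimal sketch. It assumes the time index (written k_t in the usual Lee–Carter notation) follows a random walk with drift, and that the random shocks are normal with a standard deviation estimated from the year-to-year changes in k_t. Both are simplifying assumptions, and the sketch ignores uncertainty in the estimated drift. The function name `simulate_k_paths` and the synthetic input series are hypothetical.

```python
import numpy as np

def simulate_k_paths(k_t, n, n_sims=2000, seed=0):
    """Simulate random-walk-with-drift extensions of k_t for n future years.

    Assumption: shocks are i.i.d. normal, with mean and standard deviation
    estimated from the first differences of the observed k_t.
    """
    rng = np.random.default_rng(seed)
    diffs = np.diff(k_t)
    drift = diffs.mean()            # average yearly change in k_t
    sigma = diffs.std(ddof=1)       # spread of the yearly changes
    steps = rng.normal(drift, sigma, size=(n_sims, n))
    # Each row is one simulated future path, starting from the last observed value.
    return k_t[-1] + np.cumsum(steps, axis=1)

# Hypothetical observed index: 30 years of a noisy downward trend.
data_rng = np.random.default_rng(42)
k_t = np.cumsum(data_rng.normal(-1.0, 0.5, size=30))

paths = simulate_k_paths(k_t, n=10)
# 5th, 50th and 95th percentiles of the simulated index for each future year.
lower, median, upper = np.percentile(paths, [5, 50, 95], axis=0)
```

Each simulated path of k_t can then be turned into mortality rates in the same way as the point forecast, so the percentile band carries over to the rates and to life table measures derived from them.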


Algorithm

The algorithm seeks to find the least squares solution to the equation:

: \ln(\mathbf{m}_{x,t}) = \mathbf{a}_x + \mathbf{b}_x \mathbf{k}_t + \epsilon_{x,t}

where \mathbf{m}_{x,t} is a matrix of mortality rates for each age x in each year t.

# Compute \mathbf{a}_x, the average over time of \ln(\mathbf{m}_{x,t}) for each age, where T is the number of years in the input:
#;: \mathbf{a}_x = \frac{1}{T} \sum_{t=1}^{T} \ln(\mathbf{m}_{x,t})
# Compute \mathbf{A}_{x,t}, which will be used in the SVD:
#;: \mathbf{A}_{x,t} = \ln(\mathbf{m}_{x,t}) - \mathbf{a}_x
# Compute the singular value decomposition of \mathbf{A}_{x,t}:
#;: \mathbf{U} \mathbf{S} \mathbf{V^*} = \text{svd}(\mathbf{A}_{x,t})
# Derive \mathbf{b}_x, s_1 (the scaling constant, which is the first singular value, i.e. the first diagonal element of \mathbf{S}), and \mathbf{k}_t from the first column of \mathbf{U}, from \mathbf{S}, and from the first row of \mathbf{V^*}:
#;: \mathbf{b}_x = (u_{1,1}, u_{2,1}, ..., u_{x,1})
#;: \mathbf{k}_t = (v_{1,1}, v_{1,2}, ..., v_{1,t})
# Forecast \mathbf{k}_t using a standard univariate ARIMA model to n additional years:
#;: \mathbf{k}_{t+n} = \text{ARIMA}(\mathbf{k}_t, n)
# Use the forecasted \mathbf{k}_{t+n}, with the original \mathbf{b}_x and \mathbf{a}_x, to calculate the forecasted mortality rate for each age:
#;: \mathbf{m}_{x,t+n} = \exp(\mathbf{a}_x + s_1 \mathbf{k}_{t+n} \mathbf{b}_x)
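
The steps above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not a reference implementation: the input is taken to have ages in rows and years in columns (matching the x,t subscripts), the ARIMA step is replaced by the simplest random walk with drift point forecast, and no life expectancy adjustment of the time index is made. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_lee_carter(m):
    """Fit the rank-1 Lee-Carter structure to m, an (ages x years) array of rates."""
    log_m = np.log(m)
    a_x = log_m.mean(axis=1)                  # step 1: age-specific mean over time
    A = log_m - a_x[:, None]                  # step 2: centered log rates
    U, S, Vt = np.linalg.svd(A, full_matrices=False)   # step 3
    b_x, s_1, k_t = U[:, 0], S[0], Vt[0, :]   # step 4: first singular triplet
    return a_x, b_x, s_1, k_t

def forecast_k(k_t, n):
    """Step 5, simplified: random walk with drift instead of a general ARIMA fit."""
    drift = np.diff(k_t).mean()
    return k_t[-1] + drift * np.arange(1, n + 1)

def rates_from_index(a_x, b_x, s_1, k):
    """Step 6: exp(a_x + s_1 * k * b_x), returned as an (ages x len(k)) array."""
    return np.exp(a_x[:, None] + s_1 * np.outer(b_x, k))

# Synthetic input with an exact rank-1 structure (hypothetical values).
ages, years = 5, 20
a_true = np.linspace(-6.0, -2.0, ages)
b_true = np.linspace(0.3, 0.1, ages)
k_true = np.linspace(10.0, -10.0, years)      # linear decline with zero mean
m = np.exp(a_true[:, None] + np.outer(b_true, k_true))

a_x, b_x, s_1, k_t = fit_lee_carter(m)
fitted = rates_from_index(a_x, b_x, s_1, k_t)            # reproduces the input
m_future = rates_from_index(a_x, b_x, s_1, forecast_k(k_t, n=5))
```

The signs of the singular vectors returned by an SVD routine are arbitrary, but the product of s_1, b_x and k_t is not, so the fitted and forecasted rates are unaffected.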


Discussion

Without applying SVD or some other method of dimension reduction, the table of mortality data is a highly correlated multivariate data series, and the complexity of these multidimensional time series makes them difficult to forecast. SVD has become widely used as a method of dimension reduction in many different fields, including by Google in their PageRank algorithm.

The Lee–Carter model was introduced by Ronald D. Lee and Lawrence Carter in 1992 with the article "Modeling and Forecasting U.S. Mortality". The model grew out of their work in the late 1980s and early 1990s attempting to use inverse projection to infer rates in historical demography. The model has been used by the United States Social Security Administration, the US Census Bureau, and the United Nations. It has become the most widely used mortality forecasting technique in the world today.

There have been extensions to the Lee–Carter model, most notably to account for missing years, correlated male and female populations, and large-scale coherency in populations that share a mortality regime (western Europe, for example). Many related papers can be found on Professor Ronald Lee's website.
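
The dimension-reduction argument can be checked numerically: the squared singular values of the centered log-rate matrix show how much of its variation the first component alone accounts for. The sketch below is only illustrative. It assumes ages in rows and years in columns, and it uses synthetic data with hypothetical parameter values and noise level rather than real mortality rates.

```python
import numpy as np

def first_component_share(m):
    """Share of the variation in centered log rates captured by the first
    singular component (squared first singular value over the total)."""
    log_m = np.log(m)
    A = log_m - log_m.mean(axis=1, keepdims=True)   # center each age over time
    S = np.linalg.svd(A, compute_uv=False)          # singular values only
    return S[0] ** 2 / np.sum(S ** 2)

# Synthetic rank-1 log rates (hypothetical values).
ages, years = 5, 20
a = np.linspace(-6.0, -2.0, ages)
b = np.linspace(0.3, 0.1, ages)
k = np.linspace(10.0, -10.0, years)
m_clean = np.exp(a[:, None] + np.outer(b, k))

# The same rates with multiplicative noise added.
rng = np.random.default_rng(0)
m_noisy = m_clean * np.exp(rng.normal(0.0, 0.2, size=m_clean.shape))

share_clean = first_component_share(m_clean)
share_noisy = first_component_share(m_noisy)
```

For exactly rank-1 data the share is essentially 1, and noise lowers it. On real data the share indicates how well a single time index summarizes the whole age-by-year table.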


Implementations

There are a few software packages for forecasting with the Lee–Carter model.
* LCFIT is a web-based package with interactive forms.
* Professor Rob J. Hyndman provides an R package for demography that includes routines for creating and forecasting a Lee–Carter model.
* Alternatives in R include the StMoMo package of Villegas, Millossovich and Kaishev (2015).
* Professor German Rodriguez provides an implementation using Stata.
* Using Matlab, Professor Eric Jondeau and Professor Michael Rockinger have put together the Longevity Toolbox for parameter estimation.

