
The Lee–Carter model is a numerical algorithm used in mortality forecasting and life expectancy forecasting. The input to the model is a matrix of age-specific mortality rates ordered monotonically by time, usually with ages in columns and years in rows. The output is a forecasted matrix of mortality rates in the same format as the input.

The model uses singular value decomposition (SVD) to find:
* A univariate time series vector \mathbf{k}_t that captures 80–90% of the mortality trend (here the subscript t refers to time),
* A vector \mathbf{b}_x that describes the relative mortality at each age (here the subscript x refers to age), and
* A scaling constant (referred to here as s_1 but unnamed in the literature).

Surprisingly, \mathbf{k}_t is usually close to linear in time, implying that gains to life expectancy are fairly constant year after year in most populations.

Prior to computing the SVD, age-specific mortality rates are first transformed into \mathbf{A}_{x,t} by taking their logarithms and then centering them by subtracting their age-specific means over time. The age-specific mean over time is denoted by \mathbf{a}_x. The subscript x,t refers to the fact that \mathbf{A}_{x,t} spans both age and time.

Many researchers adjust the \mathbf{k}_t vector by fitting it to empirical life expectancies for each year, using the \mathbf{a}_x and \mathbf{b}_x generated with SVD. When adjusted using this approach, changes to \mathbf{k}_t are usually small.

To forecast mortality, \mathbf{k}_t (either adjusted or not) is projected into n future years using an ARIMA model. The corresponding forecasted \mathbf{A}_{x,t+n} is recovered by multiplying \mathbf{k}_{t+n} by \mathbf{b}_x and the first diagonal element of \mathbf{S} (when \mathbf{U} \mathbf{S} \mathbf{V}^T = \text{svd}(\mathbf{A}_{x,t})). The actual mortality rates are recovered by adding back the means \mathbf{a}_x and taking exponentials. Because of the linearity of \mathbf{k}_t, it is generally modeled as a random walk with drift (trend). Life expectancy and other life table measures can be calculated from the resulting forecasted mortality rates.

In most implementations, confidence intervals for the forecasts are generated by simulating multiple mortality forecasts using Monte Carlo methods. A band of mortality between the 5th and 95th percentiles of the simulated results is considered to be a valid forecast. These simulations are done by extending \mathbf{k}_t into the future using randomization based on the standard error of \mathbf{k}_t derived from the input data.
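
The Monte Carlo band described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the procedure of any particular implementation: the function name `simulate_kt_band` is hypothetical, the yearly changes in the time index are assumed to be independent Gaussian draws, their spread is taken to be the sample standard deviation of the historical differences, and the historical series `k_hist` is synthetic.

```python
import numpy as np


def simulate_kt_band(k_t, n_ahead, n_sims=2000, seed=0):
    """Simulate random-walk-with-drift paths for k_t and return percentile bands."""
    k_t = np.asarray(k_t, dtype=float)
    steps = np.diff(k_t)           # year-over-year changes in k_t
    drift = steps.mean()           # estimated drift (average change per year)
    sigma = steps.std(ddof=1)      # spread of the yearly changes (assumed Gaussian)

    rng = np.random.default_rng(seed)
    shocks = rng.normal(drift, sigma, size=(n_sims, n_ahead))
    paths = k_t[-1] + np.cumsum(shocks, axis=1)   # each row is one simulated future

    # 5th, 50th and 95th percentiles across simulations, for every future year
    lower, median, upper = np.percentile(paths, [5, 50, 95], axis=0)
    return lower, median, upper


# Hypothetical k_t: a declining linear trend plus noise, for illustration only.
rng = np.random.default_rng(42)
k_hist = -0.5 * np.arange(40) + rng.normal(0.0, 0.3, size=40)
lower, median, upper = simulate_kt_band(k_hist, n_ahead=20)
```

The band widens with the forecast horizon because the simulated shocks accumulate. This sketch ignores uncertainty in the estimated drift itself, which a fuller treatment would also simulate.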


Algorithm

The algorithm seeks to find the least squares solution to the equation:
: \ln(\mathbf{m}_{x,t}) = \mathbf{a}_x + \mathbf{k}_t \mathbf{b}_x + \epsilon_{x,t}
where \mathbf{m}_{x,t} is a matrix of mortality rates for each age x in each year t.
# Compute \mathbf{a}_x, which is the average over time of \ln(\mathbf{m}_{x,t}) for each age, where T is the number of years in the input:
#: \mathbf{a}_x = \frac{\sum_{t=1}^{T} \ln(\mathbf{m}_{x,t})}{T}
# Compute \mathbf{A}_{x,t}, which will be used in the SVD:
#: \mathbf{A}_{x,t} = \ln(\mathbf{m}_{x,t}) - \mathbf{a}_x
# Compute the singular value decomposition of \mathbf{A}_{x,t}:
#: \mathbf{U} \mathbf{S} \mathbf{V}^T = \text{svd}(\mathbf{A}_{x,t})
# Derive \mathbf{k}_t, s_1 (the scaling constant, which is the first singular value on the diagonal of \mathbf{S}, not an eigenvalue), and \mathbf{b}_x from \mathbf{U}, \mathbf{S}, and \mathbf{V}^T. With years in rows and ages in columns, \mathbf{k}_t is the first column of \mathbf{U} and \mathbf{b}_x is the first row of \mathbf{V}^T (whose entries are written v below):
#: \mathbf{k}_t = (u_{1,1}, u_{2,1}, ..., u_{t,1})
#: \mathbf{b}_x = (v_{1,1}, v_{1,2}, ..., v_{1,x})
# Forecast \mathbf{k}_t using a standard univariate ARIMA model to n additional years:
#: \mathbf{k}_{t+n} = \text{ARIMA}(\mathbf{k}_t, n)
# Use the forecasted \mathbf{k}_{t+n}, with the original \mathbf{b}_x and \mathbf{a}_x, to calculate the forecasted mortality rate for each age:
#: \mathbf{m}_{x,t+n} = \exp(\mathbf{a}_x + s_1 \mathbf{k}_{t+n} \mathbf{b}_x)
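
The steps above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: the function names `fit_lee_carter` and `forecast_lee_carter` are hypothetical, the input matrix has years in rows and ages in columns, the ARIMA step is replaced by the simplest common choice (a random walk with drift, i.e. ARIMA(0,1,0) with a constant, point forecast only), and the mortality data are synthetic.

```python
import numpy as np


def fit_lee_carter(m):
    """Fit a_x, b_x, k_t and s_1 from a (years x ages) matrix of mortality rates."""
    log_m = np.log(m)
    a_x = log_m.mean(axis=0)                            # step 1: age-specific mean over time
    A = log_m - a_x                                     # step 2: centered log rates
    U, S, Vt = np.linalg.svd(A, full_matrices=False)    # step 3: SVD
    k_t = U[:, 0]                                       # step 4: first column of U (time index)
    b_x = Vt[0, :]                                      #         first row of V^T (age pattern)
    s_1 = S[0]                                          #         leading singular value
    return a_x, b_x, k_t, s_1


def forecast_lee_carter(a_x, b_x, k_t, s_1, n):
    """Project k_t n years ahead as a random walk with drift; return forecast rates."""
    drift = (k_t[-1] - k_t[0]) / (len(k_t) - 1)         # mean yearly change in k_t
    k_future = k_t[-1] + drift * np.arange(1, n + 1)    # step 5: point forecast of k_t
    # step 6: exp(a_x + s_1 * k_{t+n} * b_x), one row per future year
    return np.exp(a_x + s_1 * np.outer(k_future, b_x))


# Hypothetical data: 30 years x 5 age groups with steadily declining mortality.
rng = np.random.default_rng(0)
years, ages = 30, 5
base = np.array([0.002, 0.001, 0.005, 0.02, 0.08])
decline = np.array([0.03, 0.02, 0.015, 0.01, 0.008])
log_rates = np.log(base) - np.outer(np.arange(years), decline)
m = np.exp(log_rates + rng.normal(0.0, 0.01, size=(years, ages)))

a_x, b_x, k_t, s_1 = fit_lee_carter(m)
m_future = forecast_lee_carter(a_x, b_x, k_t, s_1, n=10)
```

The signs of the singular vectors returned by an SVD routine are arbitrary, but the product s_1 \mathbf{k}_t \mathbf{b}_x is unaffected, so the forecasted rates do not depend on that choice.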


Discussion

Without applying SVD or some other method of dimension reduction, the table of mortality data is a highly correlated multivariate data series, and the complexity of these multidimensional time series makes them difficult to forecast. SVD has become widely used as a method of dimension reduction in many different fields, including by Google in its PageRank algorithm.

The Lee–Carter model was introduced by Ronald D. Lee and Lawrence Carter in 1992 with the article "Modeling and Forecasting the Time Series of U.S. Mortality" (Journal of the American Statistical Association 87 (September): 659–671). The model grew out of their work in the late 1980s and early 1990s attempting to use inverse projection to infer rates in historical demography. The model has been used by the United States Social Security Administration, the US Census Bureau, and the United Nations. It has become the most widely used mortality forecasting technique in the world today.

There have been extensions to the Lee–Carter model, most notably to account for missing years, correlated male and female populations, and large-scale coherency in populations that share a mortality regime (western Europe, for example). Many related papers can be found on Professor Ronald Lee's website.
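
As a small numerical illustration of the dimension-reduction point made at the start of this section, the share of variation in the centered log mortality matrix captured by the first SVD component can be computed from the singular values. This is a sketch under assumptions: the function name `first_component_share` is hypothetical, the input has years in rows and ages in columns, and the data are synthetic, so the resulting share says nothing about real populations.

```python
import numpy as np


def first_component_share(m):
    """Share of total squared variation explained by the first SVD component."""
    log_m = np.log(m)
    A = log_m - log_m.mean(axis=0)             # center each age column over time
    s = np.linalg.svd(A, compute_uv=False)     # singular values only
    return s[0] ** 2 / np.sum(s ** 2)


# Hypothetical data: 30 years x 5 age groups with a shared downward trend.
rng = np.random.default_rng(1)
trend = np.outer(np.arange(30), np.array([0.03, 0.02, 0.015, 0.01, 0.008]))
base = np.log(np.array([0.002, 0.001, 0.005, 0.02, 0.08]))
m = np.exp(base - trend + rng.normal(0.0, 0.01, size=(30, 5)))

share = first_component_share(m)
```

When a single shared trend dominates, as in this synthetic example, the share is close to 1, which is why a single time index can stand in for the whole table.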


Implementations

There are surprisingly few software packages for forecasting with the Lee–Carter model.
* LCFIT is a web-based package with interactive forms.
* Professor Rob J. Hyndman provides an R package for demography that includes routines for creating and forecasting a Lee–Carter model.
* Alternatives in R include the StMoMo package of Villegas, Millossovich and Kaishev (2015).
* Professor German Rodriguez provides code for the Lee–Carter model using Stata.
* Using MATLAB, Professor Eric Jondeau and Professor Michael Rockinger have put together the Longevity Toolbox for parameter estimation.

