Dynamic discrete choice (DDC) models, also known as discrete choice models of dynamic programming, model an agent's choices over discrete options that have future implications. Rather than assuming that observed choices are the result of static utility maximization, observed choices in DDC models are assumed to result from an agent's maximization of the present value of utility, generalizing the utility theory upon which discrete choice models are based.
The goal of DDC methods is to estimate the
structural parameters of the agent's decision process. Once these parameters are known, the researcher can then use the estimates to simulate how the agent would behave in a counterfactual state of the world. (For example, how a prospective college student's enrollment decision would change in response to a tuition increase.)
Mathematical representation
Agent <math>n</math>'s maximization problem can be written mathematically as follows:
:<math>\max_{\{d_{njt}\}} E\left[\sum_{t=0}^{T}\sum_{j=1}^{J}\beta^{t}d_{njt}\,U_{njt}\left(x_{nt},\varepsilon_{njt}\right)\right]</math>
where
* <math>x_{nt}</math> are state variables, with <math>x_{n0}</math> the agent's initial condition
* <math>d_{njt}</math> represents agent <math>n</math>'s decision from among <math>J</math> discrete alternatives, with <math>d_{njt}=1</math> if alternative <math>j</math> is chosen and <math>0</math> otherwise
* <math>\beta\in(0,1)</math> is the discount factor
* <math>U_{njt}</math> is the flow utility <math>n</math> receives from choosing alternative <math>j</math> in period <math>t</math>, and depends on both the state <math>x_{nt}</math> and unobserved factors <math>\varepsilon_{njt}</math>
* <math>T</math> is the time horizon
* The expectation <math>E</math> is taken over both the <math>x_{nt}</math>'s and <math>\varepsilon_{njt}</math>'s in <math>U_{njt}</math>. That is, the agent is uncertain about future transitions in the states, and is also uncertain about future realizations of unobserved factors.
Simplifying assumptions and notation
It is standard to impose the following simplifying assumptions and notation of the dynamic decision problem:
;1. Flow utility is additively separable and linear in parameters
The flow utility can be written as an additive sum, consisting of deterministic and stochastic elements. The deterministic component can be written as a linear function of the structural parameters:
:<math>U_{njt}\left(x_{nt},\varepsilon_{njt}\right)=u_{njt}\left(x_{nt};\theta\right)+\varepsilon_{njt},\qquad u_{njt}\left(x_{nt};\theta\right)=x_{nt}'\theta_{j}</math>
;2. The optimization problem can be written as a Bellman equation
Define by <math>V_{nt}\left(x_{nt}\right)</math> the ''ex ante'' value function for individual <math>n</math> in period <math>t</math> just before <math>\varepsilon_{nt}</math> is revealed:
:<math>V_{nt}\left(x_{nt}\right)=E_{\varepsilon}\left[\max_{j}\left\{u_{njt}\left(x_{nt};\theta\right)+\varepsilon_{njt}+\beta\int V_{n,t+1}\left(x_{n,t+1}\right)dF\left(x_{n,t+1}\mid x_{nt},d_{nt}=j\right)\right\}\right]</math>
where the expectation operator <math>E_{\varepsilon}</math> is over the <math>\varepsilon_{njt}</math>'s, and where <math>F\left(x_{n,t+1}\mid x_{nt},d_{nt}=j\right)</math> represents the probability distribution over <math>x_{n,t+1}</math> conditional on <math>x_{nt}</math> and the choice <math>d_{nt}=j</math>. The expectation over state transitions is accomplished by taking the integral over this probability distribution.
It is possible to decompose <math>V_{nt}\left(x_{nt}\right)</math> into deterministic and stochastic components:
:<math>V_{nt}\left(x_{nt}\right)=E_{\varepsilon}\left[\max_{j}\left\{v_{njt}\left(x_{nt}\right)+\varepsilon_{njt}\right\}\right]</math>
where <math>v_{njt}\left(x_{nt}\right)</math> is the value to choosing alternative <math>j</math> at time <math>t</math> and is written as
:<math>v_{njt}\left(x_{nt}\right)=u_{njt}\left(x_{nt};\theta\right)+\beta\int V_{n,t+1}\left(x_{n,t+1}\right)dF\left(x_{n,t+1}\mid x_{nt},d_{nt}=j\right)</math>
where now the expectation <math>E_{\varepsilon}</math> is taken over the <math>\varepsilon_{njt}</math>.
;3. The optimization problem follows a Markov decision process
The states <math>x_{nt}</math> follow a Markov chain. That is, attainment of state <math>x_{n,t+1}</math> depends only on the state <math>x_{nt}</math> and not on <math>x_{n,t-1}</math> or any prior state.
Conditional value functions and choice probabilities
The value function in the previous section is called the conditional value function, because it is the value function conditional on choosing alternative <math>j</math> in period <math>t</math>. Writing the conditional value function in this way is useful in constructing formulas for the choice probabilities.
To write down the choice probabilities, the researcher must make an assumption about the distribution of the <math>\varepsilon_{njt}</math>'s. As in static discrete choice models, this distribution can be assumed to be iid Type I extreme value, generalized extreme value, multinomial probit, or mixed logit.
For the case where <math>\varepsilon_{njt}</math> is multinomial logit (i.e. drawn iid from the Type I extreme value distribution), the formulas for the choice probabilities would be:
:<math>P_{njt}=\frac{\exp\left(v_{njt}\left(x_{nt}\right)\right)}{\sum_{k=1}^{J}\exp\left(v_{nkt}\left(x_{nt}\right)\right)}</math>
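The backward recursion behind the conditional value functions, and the logit choice probabilities built from them, can be sketched numerically. The model below (utilities, transition matrices, horizon) is a hypothetical toy example for illustration only:

```python
import math

beta = 0.9        # discount factor
T = 3             # time horizon
n_states, n_alts = 3, 2
# u[j][x]: deterministic flow utility of alternative j in state x (toy numbers)
u = [[1.0, 0.5, 0.0],
     [0.2, 0.8, 1.2]]
# F[j][x][y]: probability of moving from state x to state y when choosing j
F = [[[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.1, 0.2, 0.7]],
     [[0.3, 0.4, 0.3], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4]]]

def logsumexp(vals):
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))

# Backward recursion: with iid Type I extreme value shocks, the ex ante
# value function is the log-sum-exp of the conditional value functions.
V = [0.0] * n_states                      # terminal continuation value
for t in range(T, 0, -1):
    v = [[u[j][x] + beta * sum(F[j][x][y] * V[y] for y in range(n_states))
          for x in range(n_states)] for j in range(n_alts)]
    V = [logsumexp([v[j][x] for j in range(n_alts)]) for x in range(n_states)]

# Multinomial logit choice probabilities in the first period: P[x][j]
P = [[math.exp(v[j][x] - V[x]) for j in range(n_alts)] for x in range(n_states)]
```

Since the ex ante value is exactly the log-sum-exp of the conditional values, `exp(v - V)` reproduces the logit formula without a separate normalization step.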
Estimation
Estimation of dynamic discrete choice models is particularly challenging, because the researcher must solve the backward recursion problem for each guess of the structural parameters.
The most common methods used to estimate the structural parameters are maximum likelihood estimation and the method of simulated moments.
Aside from estimation methods, there are also solution methods; depending on the complexity of the problem, different solution methods can be employed. These can be divided into full-solution methods and non-solution methods.
Full-solution methods
The foremost example of a full-solution method is the nested fixed point (NFXP) algorithm developed by John Rust in 1987. The NFXP algorithm is described in great detail in its documentation manual.
A recent work by Che-Lin Su and Kenneth Judd in 2012 implements another approach (dismissed as intractable by Rust in 1987), which uses constrained optimization of the likelihood function, a special case of mathematical programming with equilibrium constraints (MPEC). Specifically, the likelihood function is maximized subject to the constraints imposed by the model, and expressed in terms of the additional variables that describe the model's structure. This approach requires powerful optimization software such as Artelys Knitro because of the high dimensionality of the optimization problem. Once it is solved, both the structural parameters that maximize the likelihood and the solution of the model are found.
In a later article, Rust and coauthors show that the speed advantage of MPEC compared to NFXP is not significant. Yet, because the computations required by MPEC do not rely on the structure of the model, its implementation is much less labor intensive.
Despite numerous contenders, the NFXP maximum likelihood estimator remains the leading estimation method
for Markov decision models.
Non-solution methods
An alternative to full-solution methods is non-solution methods. In this case, the researcher can estimate the structural parameters without having to fully solve the backwards recursion problem for each parameter guess. Non-solution methods are typically faster while requiring more assumptions, but the additional assumptions are in many cases realistic.
The leading non-solution method is conditional choice probabilities, developed by V. Joseph Hotz and Robert A. Miller.
Examples
Bus engine replacement model
The bus engine replacement model developed in John Rust's seminal 1987 paper is one of the first dynamic stochastic models of discrete choice estimated using real data, and continues to serve as a classical example of problems of this type.
The model is a simple regenerative optimal stopping stochastic dynamic problem faced by the decision maker, Harold Zurcher, superintendent of maintenance at the Madison Metropolitan Bus Company in Madison, Wisconsin. For every bus in operation, in each time period Harold Zurcher has to decide whether to replace the engine and bear the associated replacement cost, or to continue operating the bus at an ever rising cost of operation, which includes insurance and the cost of lost ridership in the case of a breakdown.
Let <math>x_{t}</math> denote the odometer reading (mileage) at period <math>t</math>, <math>c\left(x_{t},\theta\right)</math> the cost of operating the bus, which depends on the vector of parameters <math>\theta</math>, <math>RC</math> the cost of replacing the engine, and <math>\beta</math> the discount factor. Then the per-period utility is given by
:<math>u\left(x_{t},d,\theta\right)+\varepsilon_{dt}=\begin{cases}-c\left(x_{t},\theta\right)+\varepsilon_{0t}, & \text{if }d=0,\\-RC-c\left(0,\theta\right)+\varepsilon_{1t}, & \text{if }d=1,\end{cases}</math>
where <math>d\in\{0,1\}</math> denotes the decision (keep or replace) and <math>\varepsilon_{0t}</math> and <math>\varepsilon_{1t}</math> represent the components of the utility observed by Harold Zurcher, but not John Rust. It is assumed that <math>\varepsilon_{0t}</math> and <math>\varepsilon_{1t}</math> are independent and identically distributed with the Type I extreme value distribution, and that <math>\left(\varepsilon_{0t},\varepsilon_{1t}\right)</math> are independent of <math>\left(\varepsilon_{0,t-1},\varepsilon_{1,t-1}\right)</math> conditional on <math>x_{t}</math>.
Then the optimal decisions satisfy the Bellman equation
:<math>V\left(x,\varepsilon\right)=\max_{d\in\{0,1\}}\left\{u\left(x,d,\theta\right)+\varepsilon_{d}+\beta\iint V\left(x',\varepsilon'\right)p\left(x'\mid x,d\right)q\left(\varepsilon'\mid x'\right)dx'\,d\varepsilon'\right\}</math>
where <math>p\left(x'\mid x,d\right)</math> and <math>q\left(\varepsilon'\mid x'\right)</math> are respectively transition densities for the observed and unobserved state variables. Time indices in the Bellman equation are dropped because the model is formulated in an infinite horizon setting; the unknown optimal policy is stationary, i.e. independent of time.
Given the distributional assumption on <math>\varepsilon_{dt}</math>, the probability of particular choice <math>d</math> is given by
:<math>P\left(d\mid x\right)=\frac{\exp\left\{u\left(x,d,\theta\right)+\beta EV\left(x,d\right)\right\}}{\sum_{d'\in\{0,1\}}\exp\left\{u\left(x,d',\theta\right)+\beta EV\left(x,d'\right)\right\}}</math>
where <math>EV\left(x,d\right)</math> is a unique solution to the functional equation:
:<math>EV\left(x,d\right)=\int\log\left(\sum_{d'\in\{0,1\}}\exp\left\{u\left(x',d',\theta\right)+\beta EV\left(x',d'\right)\right\}\right)p\left(dx'\mid x,d\right)</math>
It can be shown that the latter functional equation defines a contraction mapping if the state space <math>x</math> is bounded, so there will be a unique solution <math>EV\left(x,d\right)</math> for any <math>\theta</math>, and further the implicit function theorem holds, so <math>EV\left(x,d\right)</math> is also a smooth function of <math>\theta</math> for each <math>\left(x,d\right)</math>.
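On a discretized mileage grid, the contraction can be solved by successive approximations. The specification below (linear operating cost, deterministic mileage transitions, illustrative parameter values) is a deliberately simplified stand-in for Rust's stochastic mileage process:

```python
import math

# Toy discretization: mileage states 0..K-1, linear operating cost
# c(x, theta) = theta1 * x (so c(0, theta) = 0), replacement cost RC.
# All parameter values are illustrative, not Rust's estimates.
K, beta, theta1, RC = 10, 0.95, 0.2, 3.0

def u(x, d):
    # Deterministic part of per-period utility: keep (d=0) or replace (d=1)
    return -RC if d == 1 else -theta1 * x

def next_state(x, d):
    # Deterministic transition: replacement resets the engine; otherwise
    # mileage advances one state, capped at the top of the grid.
    return 1 if d == 1 else min(x + 1, K - 1)

# Successive approximations on EV(x, d): a contraction with modulus beta.
EV = {(x, d): 0.0 for x in range(K) for d in (0, 1)}
for _ in range(1000):
    new = {}
    for (x, d) in EV:
        y = next_state(x, d)
        vals = [u(y, dp) + beta * EV[(y, dp)] for dp in (0, 1)]
        m = max(vals)
        new[(x, d)] = m + math.log(sum(math.exp(v - m) for v in vals))
    if max(abs(new[k] - EV[k]) for k in EV) < 1e-12:
        EV = new
        break
    EV = new

def P_replace(x):
    # Logit probability of replacement implied by the fixed point
    v = [u(x, d) + beta * EV[(x, d)] for d in (0, 1)]
    m = max(v)
    return math.exp(v[1] - m) / sum(math.exp(t - m) for t in v)
```

As expected, the implied replacement probability rises with mileage, since the operating cost grows while the replacement value stays constant.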
Estimation with nested fixed point algorithm
The contraction mapping above can be solved numerically for the fixed point <math>EV\left(x,d\right)</math> that yields choice probabilities <math>P\left(d\mid x\right)</math> for any given value of <math>\theta</math>. The log-likelihood function can then be formulated as
:<math>L\left(\theta\right)=\sum_{i=1}^{N}\sum_{t=1}^{T}\left(\log P\left(d_{it}\mid x_{it}\right)+\log p\left(x_{it}\mid x_{i,t-1},d_{i,t-1}\right)\right)</math>
where <math>x_{it}</math> and <math>d_{it}</math> represent data on state variables (odometer readings) and decisions (keep or replace) for <math>N</math> individual buses, each in <math>T</math> periods.
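Assembling the choice part of this log-likelihood is then mechanical. The sketch below uses a hypothetical dataset and a reduced-form binary logit in place of the structural <math>P\left(d\mid x\right)</math>, and omits the transition-density term:

```python
import math

# Hypothetical (state, decision) observations: odometer state and the
# keep (0) / replace (1) decision. Purely illustrative numbers.
data = [(0, 0), (3, 0), (6, 1), (2, 0), (8, 1)]

def P(d, x, theta):
    # Stand-in choice probability: a binary logit with replacement
    # probability rising in mileage (not the structural fixed point).
    p1 = 1.0 / (1.0 + math.exp(-(theta[0] + theta[1] * x)))
    return p1 if d == 1 else 1.0 - p1

def log_likelihood(theta):
    # Sum of log choice probabilities over all bus-period observations
    return sum(math.log(P(d, x, theta)) for (x, d) in data)
```

A parameter vector that makes replacement likely only at high mileage fits these observations better than one that treats every choice as a coin flip.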
The joint algorithm for solving the fixed point problem given a particular value of parameter <math>\theta</math> and maximizing the log-likelihood <math>L\left(\theta\right)</math> with respect to <math>\theta</math> was named by John Rust the ''nested fixed point algorithm'' (NFXP). Rust's implementation of the nested fixed point algorithm is highly optimized for this problem, using Newton–Kantorovich iterations to calculate <math>EV\left(x,d\right)</math> and quasi-Newton methods, such as the Berndt–Hall–Hall–Hausman (BHHH) algorithm, for likelihood maximization.
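The nesting of the two loops can be sketched abstractly: an outer search over <math>\theta</math>, each evaluation of which calls an inner fixed-point solver. In this hypothetical sketch a grid search stands in for the quasi-Newton outer loop, and the two callables are placeholders for the model-specific pieces:

```python
def nfxp_estimate(theta_grid, solve_fixed_point, log_likelihood):
    """Sketch of the NFXP structure: for each candidate theta, the inner
    loop solves the fixed point EV(theta); the outer loop keeps the theta
    with the highest likelihood. (Rust's implementation uses quasi-Newton
    steps, not a grid, but the nesting is the same.)"""
    best_ll, best_theta = float("-inf"), None
    for theta in theta_grid:
        EV = solve_fixed_point(theta)    # inner loop: contraction iterations
        ll = log_likelihood(theta, EV)   # outer objective at this theta
        if ll > best_ll:
            best_ll, best_theta = ll, theta
    return best_theta

# Toy usage: a likelihood peaked at theta = 1 recovers 1.0 from the grid.
theta_hat = nfxp_estimate(
    [0.0, 0.5, 1.0, 1.5],
    solve_fixed_point=lambda th: None,           # placeholder inner solver
    log_likelihood=lambda th, EV: -(th - 1.0) ** 2,
)
```

The point of the structure is that the expensive inner solve is repeated at every outer iteration, which is exactly what MPEC avoids.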
Estimation with MPEC
In the nested fixed point algorithm, <math>EV\left(x,d\right)</math> is recalculated for each guess of the parameters <math>\theta</math>. The MPEC method instead solves the constrained optimization problem:
:<math>\max_{\theta,\,EV}\;L\left(\theta,EV\right)\quad\text{subject to}\quad EV\left(x,d\right)=\int\log\left(\sum_{d'\in\{0,1\}}\exp\left\{u\left(x',d',\theta\right)+\beta EV\left(x',d'\right)\right\}\right)p\left(dx'\mid x,d\right)</math>
This method is faster to compute than non-optimized implementations of the nested fixed point algorithm, and takes about as long as highly optimized implementations.
Estimation with non-solution methods
The conditional choice probabilities method of Hotz and Miller can be applied in this setting. Hotz, Miller, Sanders, and Smith proposed a computationally simpler version of the method, and tested it on a study of the bus engine replacement problem. The method works by estimating conditional choice probabilities using simulation, then backing out the implied differences in value functions.
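With iid Type I extreme value shocks, the inversion step behind this is a one-liner: differences in conditional value functions equal differences in log choice probabilities. A numerical check with made-up values:

```python
import math

# Made-up "true" conditional values (unknown to the econometrician).
v = [1.3, 0.4, -0.2]
m = max(v)
denom = sum(math.exp(t - m) for t in v)
P = [math.exp(t - m) / denom for t in v]   # the logit CCPs these values imply

# Hotz-Miller inversion under the logit assumption:
# v_j - v_0 = log P_j - log P_0, recovered from the CCPs alone,
# with no backward recursion or fixed-point computation.
v_diff = [math.log(P[j]) - math.log(P[0]) for j in range(len(P))]
```

In practice the CCPs would be estimated from data rather than computed from known values, but the inversion is the same.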
See also
* Inverse reinforcement learning
References
Further reading
* {{cite book , last=Rust , first=John , title=Handbook of Econometrics , volume=4 , pages=3081–3143 , chapter=Chapter 51 Structural estimation of markov decision processes , publisher=Elsevier , year=1994 , isbn=978-0-444-88766-5 , issn=1573-4412 , doi=10.1016/s1573-4412(05)80020-0}}
Choice modelling
Economics models
Mathematical and quantitative methods (economics)
Dynamic programming