
Identification of hyperbolic discount factor in dynamic discrete choice model with multiple terminating actions

Chao Wang

April 25, 2021

Abstract

This paper studies the identification of quasi-hyperbolic discounting in dynamic discrete choice models, in both finite and infinite horizons, exploiting the unique features of multiple terminating actions. Under economically meaningful exclusion restrictions, the identification of the discount factors is characterized by polynomial moment conditions. The presence of multiple terminating actions greatly reduces the complexity of the identification and also helps relax the restrictions imposed on the flow utility function. This paper also examines the impact of estimating an underlying hyperbolic discounting model as the prevalent exponential discounting model. I find that such misspecification can lead to misleading policy implications.

Keywords: identification, hyperbolic discount factor, dynamic discrete choice model


1 Introduction

Time inconsistency has received extensive attention from economists. The prevalent approach to modeling time inconsistency is (quasi-)hyperbolic discounting, a framework that is convenient for studying problems like addiction and self-control (e.g. O'Donoghue and Rabin, 1999 [1]). In traditional economics, agents are assumed to discount future values at an exponential rate, similar to a compound interest rate. However, there is growing experimental evidence that people do not always evaluate the future in this way: in lab experiments, a considerable share of people exhibit present-biased preferences. For example, when asked "would you like 1 dollar tomorrow or 2 dollars two days later", most people choose 1 dollar tomorrow; yet when asked "would you like 1 dollar one month later, or 2 dollars one month plus one day later", some people choose the 2 dollars. In the present-biased model, the intertemporal optimization problem can be regarded as consisting of many independent selves, one in each period. Each self has a different time preference, and the selves are inconsistent with each other. That is, the choices made by the current self today might not be preferred by the future self, even when both have the same information. This kind of inconsistency generates commitment and self-control problems, and also leaves room for policy intervention.

There are two types of present-biased agents: naive and sophisticated. Naive agents care about the future in a present-biased way but perceive that tomorrow they will discount the future at the exponential rate. That is, naive agents are present-biased in the current period but think that they will not be biased in the future (although when tomorrow actually comes, they are still present-biased). Sophisticated agents are present-biased in the current period and know that they will also be present-biased in the future.

One big problem in adopting the hyperbolic discounting model is the joint identification of the present-biased discount factor and the exponential discount factor. In the standard dynamic discrete choice (DDC) model, the discount factor is nonparametrically underidentified (see e.g. Rust, 1994 [2]; Magnac and Thesmar, 2002 [3]). Fang and Wang (2015) [4] provide identification conditions for the partially naive case. Abbring and Daljord (2020) [5] show that the proof in Fang and Wang (2015) [4] is incorrect. Abbring, Daljord and Iskhakov (2019) [6] give a set identification result for the discount factors of sophisticated agents in the finite horizon case. Mahajan, Michel and Tarozzi (2020) [7] give an identification result for present-biased time preferences with finite horizon. Chan (2017) [8] estimates a DDC model with hyperbolic discounting to analyze welfare dependence, using the identification argument of Fang and Wang (2015) [4].

Most of the research above focuses on the finite horizon problem. One big advantage of the finite horizon setting is that it is easy to characterize optimal behavior via backward induction. However, identification in this setting usually requires access to data in the final period, a requirement that is sometimes restrictive. On the other hand, there are plenty of cases where agents make decisions over a long time span, such as housing mortgages or pension saving, in which it is hard to observe final period data. When the decision spans a long time, one can expect behavior far from the terminal period to resemble that of the infinite horizon framework. Moreover, in practice there are many examples where agents do not have a clear recognition of the terminal period, i.e., random stopping problems. In such circumstances an infinite horizon model may be more appropriate. Due to the difficulty of identification, the infinite horizon DDC model with hyperbolic discounting has not been sufficiently explored.

In this paper, I provide conditions for the identification of the hyperbolic discount factors in DDC models, in both the finite horizon and the infinite horizon scenarios. The identification extends existing results, i.e., Rust (1987) [9] and Hotz and Miller (1993) [10]. The contribution of this paper is to explore the unique features of multiple terminating actions, motivated by a borrower's decisions after taking up a reverse mortgage, i.e., Blevins et al. (2020) [11]. Terminating actions are actions that end the decision process once adopted. The introduction of multiple terminating actions helps reduce the normalization needed in the flow utility function in the finite horizon exponential case [11]. Taking a housing mortgage as an example: in each period the agent can choose to make an interest payment and continue the mortgage, but she can also choose to default on or refinance the mortgage, either of which terminates the mortgage contract. In this case there are two kinds of terminating actions, default and refinance, so the housing mortgage can be regarded as a model with multiple terminating actions. I show that multiple terminating actions also help reduce the normalization requirement in the infinite and finite horizon hyperbolic discounting cases. After normalizing the reference flow utility for at least two states, the present-biased discount factor, the exponential discount factor, and the flow utility can be identified.

This paper also contributes to the existing literature in that I use Monte Carlo simulations to understand the difference between the hyperbolic discounting model and the exponential discounting model. Specifically, I first generate the agents' actions according to the optimal behavior in the hyperbolic discounting model. I then estimate the model primitives for both the hyperbolic discounting model and the exponential discounting model. I find that the exponential discounting model tends to underestimate the exponential discount factor, especially when the true present-biased discount factor is small. In this case, misspecification can also bias the estimates of the flow utility. I then simulate the agents' actions under different policy changes using both sets of estimates to understand the risk of misspecifying a hyperbolic discounting model as an exponential discounting model.¹ I find that misspecification can result in misleading counterfactual outcomes.

The rest of the paper is organized as follows: section 2 provides the DDC model setup in both the infinite horizon and the finite horizon cases; section 3 discusses the conditions for identification of the discount factors; section 4 provides simulation results in a simple case to show the effect of misspecification; section 5 concludes.

¹There is research on how the normalization of the reference utility affects counterfactual analysis, e.g. Norets and Tang (2014) [12], Aguirregabiria and Suzuki (2014) [13], Kalouptsidi et al. (2015) [14].


2 A General Model of Hyperbolic Discounting

Consider a dynamic discrete choice model in which an individual decides her action in periods t = 1, ..., T, where T can be finite or infinite. Let u_t denote the individual's utility in period t. The lifetime utility of an individual with present-biased time preferences at period t can be represented as

U_t(u_t, u_{t+1}, \cdots, u_T) \equiv u_t + \beta\delta \sum_{t'=t+1}^{T} \delta^{t'-t-1} E u_{t'}, \qquad T \leq \infty,

where β captures the individual's present-biased time preference and δ is the long-term discount rate, the same as the discount rate in the traditional exponential discounting framework. This quasi-hyperbolic framework nests the exponential framework: β = 1 is equivalent to the agent having no present-biased time preference (Abbring, Daljord and Iskhakov, 2019 [6]).
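To make the nesting concrete, here is a minimal sketch (Python with numpy; the function name is mine) of the weight each framework puts on flow utility j periods ahead; β = 1 reproduces the exponential weights.

```python
import numpy as np

def discount_weights(beta, delta, horizon):
    """Weight on flow utility j periods ahead under quasi-hyperbolic
    discounting: 1 for j = 0 and beta * delta**j for j >= 1."""
    j = np.arange(horizon + 1)
    w = beta * delta ** j
    w[0] = 1.0  # the current period is never discounted
    return w

# beta = 1 recovers exponential discounting; beta < 1 adds present bias.
print(discount_weights(1.0, 0.8, 3))  # [1.    0.8   0.64  0.512]
print(discount_weights(0.5, 0.8, 3))  # [1.    0.4   0.32  0.256]
```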

The dynamic process can be summarized as follows. In each period t, the agent chooses an action k_t from a choice set D = {1, 2, ..., K}. Prior to making the choice, the agent observes state variables x_t ∈ X = {x_1, ..., x_J} and ε_t = {ε_{1,t}, ..., ε_{K,t}}, where x_t is observable to both the agent and the econometrician, while ε_t is observable only to the agent.

Following the existing literature, we model the action-specific private information as entering the agent's utility function additively separably. That is, the utility function in period t can be represented as

u_{k,t}(x, k_{-1}, \epsilon) = u_{k,t}(x, k_{-1}) + \epsilon_{k,t},   (1)

where k_{-1} is the action taken in the previous period. We allow the lagged action to enter the utility function for flexibility; in some applications, the lagged action does not affect the flow utility. The state variables x_t are assumed to have finite support X = {x_1, x_2, ..., x_J} and to follow a stationary Markov process controlled by k_t, with transition matrix Q_k(x'|x) if k ∈ D is chosen. The shocks ε_{k,t} are independent of x_t and of prior states and choices. The shocks are also independent over time and across choices, and have type-1 extreme value distributions for analytical simplicity.

The agent chooses her action in every period t to maximize the lifetime utility at time t, taking into account the future utility generated by her future selves, who are also utility maximizers. Present-biased agents come in two types depending on how the current self views the future selves: sophisticated and naive. The sophisticated agent knows that her future self is also present-biased and maximizes the lifetime utility characterized above, while the naive agent thinks that the future self is time-consistent and maximizes the lifetime utility with β = 1. Let σ_t : X × R^K → D be an arbitrary choice strategy in period t and σ^t ≡ {σ_τ}_{τ=t}^∞ be the lifetime strategy profile. In what follows, we characterize the optimal decisions of the sophisticated and the naive agent separately.

An intuitive example of the sophisticated agent's decision process: a smoker considers quitting every period. A sophisticated agent is aware that it is difficult to quit smoking now and that it will be difficult to quit in the future, but she knows that from the long-term perspective she should quit. There is a conflict between the current self and the self that focuses on the long-term benefit, and the agent is aware of this conflict. On the other hand, there is no conflict between the perceived future self and the actual future self.

An intuitive example of the naive agent's decision process: a smoker considers quitting every period. The smoker thinks that she will quit tomorrow, because the discount rate on the future benefit from quitting tomorrow is only δ while it is βδ today. However, when tomorrow comes, the smoker continues smoking and thinks that she will quit the next day. A naive agent tends to postpone unpleasant actions, thinking she will take them in the future, yet she continues to postpone them when tomorrow comes. There is a conflict between the perceived future self and the actual future self: the perceived future self is time-consistent and maximizes the long-term benefit, while the decision the actual future self makes is bound by the present-biased preference.

To incorporate the prevalent feature of terminating actions in dynamic discrete choice, we introduce a dummy variable I_{k,k_{-1}} indicating whether the decision process enters the next period or not. Specifically, I_{k,k_{-1}} = 0 indicates that the dynamic process ends, while I_{k,k_{-1}} = 1 means that it continues.

2.1 Infinite Horizon Framework

Infinite-horizon sophisticated agent optimization. In the infinite horizon model, the whole procedure is stationary, so I drop the subscript t from states, actions, utility functions and value functions. The agent's current choice-specific value function is then stationary and can be written as

w_k(x, k_{-1}; \tilde\sigma) = u_k(x, k_{-1}) + I_{k,k_{-1}}\,\beta\delta \sum_{x'} v(x', k; \tilde\sigma)\,Q_k(x'|x),   (2)

where v(x', k; \tilde\sigma) is the perceived long-run value function defined as

v(x, k_{-1}; \tilde\sigma) = E_\epsilon V(x, k_{-1}, \epsilon; \tilde\sigma) \equiv E_\epsilon\{u_{\tilde\sigma}(x, k_{-1}) + \epsilon(\tilde\sigma) + \delta E_{x',\epsilon'}[V(x', \tilde\sigma, \epsilon'; \tilde\sigma)\,|\,x, \tilde\sigma]\},   (3)

where V(x, k_{-1}, \epsilon; \tilde\sigma) is the stationary value function under the perceived future self's strategy profile \tilde\sigma \equiv \{\tilde\sigma_t\}_{t=1}^\infty. A forward-looking agent faces a tradeoff between current utility and future values discounted disproportionately by the factor βδ, while all future utilities are discounted geometrically by the factor δ.

Since the agent is sophisticated, her perceptions of the future strategies are consistent with her current strategy by stationarity. Thus, in equilibrium, her current selves use a perception-perfect strategy (O'Donoghue and Rabin, 1999), a strategy profile σ* such that each σ* is a best response to the perceived future strategy profile σ*:

\sigma^*(x, k_{-1}, \epsilon) = \arg\max_{k \in D}\,\{w_k(x, k_{-1}; \sigma^*) + \epsilon_k\}.   (4)

Following the existing literature, we further define the conditional choice probabilities (CCPs), the probability of each action being chosen given the current state. The equilibrium CCPs satisfy

p_k(x, k_{-1}) \equiv \Pr[\sigma^*(x, k_{-1}, \epsilon) = k] = \Pr[w_k(x, k_{-1}; \sigma^*) + \epsilon_k \geq w_{k'}(x, k_{-1}; \sigma^*) + \epsilon_{k'},\ \forall k' \neq k]

for all k ∈ D, x ∈ X. Consequently, we can characterize the agent's optimal decision via the equilibrium CCPs. Because the private shocks follow the type-1 extreme value distribution, the equilibrium CCPs take the logit form

p_k(x, k_{-1}) = \frac{\exp(w_k(x, k_{-1}; \sigma^*))}{\sum_{k'} \exp(w_{k'}(x, k_{-1}; \sigma^*))},   (5)

where the perceived long-run value function v(x, k_{-1}; σ*) can be further represented as

\begin{aligned}
v(x, k_{-1}; \sigma^*) &= E_\epsilon\Big[u_{\sigma^*(x,k_{-1},\epsilon)}(x, k_{-1}) + \epsilon(\sigma^*) + \delta\sum_{x'} v(x', \sigma^*; \sigma^*)\,Q_{\sigma^*(x,k_{-1},\epsilon)}(x'|x)\Big] \\
&= E_\epsilon\Big[w_{\sigma^*(x,k_{-1},\epsilon)}(x, k_{-1}; \sigma^*) + \epsilon(\sigma^*) + \delta(1-\beta)\sum_{x'} v(x', \sigma^*; \sigma^*)\,Q_{\sigma^*(x,k_{-1},\epsilon)}(x'|x)\Big] \\
&= E_\epsilon\max_k\,[w_k(x, k_{-1}; \sigma^*) + \epsilon_k] + \delta(1-\beta)\sum_k I_{k,k_{-1}}\sum_{x'} v(x', k; \sigma^*)\,Q_k(x'|x)\,p_k(x, k_{-1}) \\
&= \gamma - \log p_K(x, k_{-1}) + w_K(x, k_{-1}; \sigma^*) + \delta(1-\beta)\sum_k I_{k,k_{-1}}\sum_{x'} v(x', k; \sigma^*)\,Q_k(x'|x)\,p_k(x, k_{-1}),
\end{aligned}   (6)

The last equality uses the following property of the type-I extreme value distribution,

E_\epsilon \max_k\,[w_k(x, k_{-1}) - w_K(x, k_{-1}) + \epsilon_k] = \gamma - \log p_K(x, k_{-1}),

which gives a one-to-one mapping for the long-term value; γ is the mean of the type-I extreme value distribution.² Note that the long-term value is an aggregation over the choice-specific values associated with all possible actions being optimal. Consequently, the optimal behavior of the agent is characterized by Equations 2, 5, and 6. By a fixed point argument, the perception-perfect strategy exists; the existence of perception-perfect strategies for this kind of stochastic game has been shown by Peeters (2004) [15].

²Since there is a terminating action, the Euler constant γ does have an impact on the perceived long-run value function here and may not be casually normalized away.
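To see how Equations 2, 5 and 6 jointly pin down behavior, the following sketch (Python with numpy) iterates on them until the perceived long-run value converges. It is a minimal illustration under my own simplifying assumptions, not the paper's algorithm: the lagged action k_{-1} is folded into the state x, the continuation indicator I is a (J, K) array, the reference action K is stored last, and plain fixed-point iteration is assumed to converge.

```python
import numpy as np

GAMMA = 0.5772156649015329  # Euler's constant, the mean of the type-1 EV distribution

def solve_sophisticated(u, Q, I, beta, delta, tol=1e-10, max_iter=10_000):
    """Fixed-point iteration on Equations (2), (5) and (6) for a stationary
    sophisticated agent.  u: (J, K) flow utilities; Q: (K, J, J) transitions
    with rows Q[k][i, :] = Pr(x' | x = x_i, k); I: (J, K) 0/1 continuation
    indicators.  Returns CCPs p, choice-specific values w, long-run values v."""
    J, K = u.shape
    v = np.zeros(J)
    for _ in range(max_iter):
        # Equation (2): expected continuation value of each action
        EV = np.stack([Q[k] @ v for k in range(K)], axis=1)      # (J, K)
        w = u + beta * delta * I * EV
        # Equation (5): logit CCPs (max subtracted for numerical stability)
        e = np.exp(w - w.max(axis=1, keepdims=True))
        p = e / e.sum(axis=1, keepdims=True)
        # Equation (6): perceived long-run value, action K as the reference
        v_new = GAMMA - np.log(p[:, -1]) + w[:, -1] \
                + delta * (1 - beta) * (I * p * EV).sum(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            break
        v = v_new
    return p, w, v
```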


2.2 Finite Horizon Framework

Finite-horizon sophisticated agent optimization. For any type of agent, the current choice-specific value function at period T is simply

w_{k_T,T}(x_T, k_{T-1}) = u_{k_T,T}(x_T, k_{T-1}), \qquad k_T \in D.   (7)

For periods t ≠ T, the agent's current choice-specific value function is not necessarily stationary and can be written as

w_{k_t,t}(x_t, k_{t-1}; \tilde\sigma^{t+1}) = u_{k_t,t}(x_t, k_{t-1}) + I_{k_t,k_{t-1}}\,\beta\delta \sum_{x_{t+1}} v_{t+1}(x_{t+1}, k_t; \tilde\sigma^{t+1})\,Q_{k_t}(x_{t+1}|x_t),   (8)

where v_t(x_t, k_{t-1}; \tilde\sigma^t) is the perceived long-run value function defined as

v_t(x_t, k_{t-1}; \tilde\sigma^t) = E_{\epsilon_t} V_t(x_t, k_{t-1}, \epsilon_t; \tilde\sigma^t) \equiv E_{\epsilon_t}\{u_{\tilde\sigma_t,t}(x_t, k_{t-1}) + \epsilon(\tilde\sigma_t) + \delta E_{x_{t+1},\epsilon_{t+1}}[V_{t+1}(x_{t+1}, \tilde\sigma_t, \epsilon_{t+1}; \tilde\sigma^{t+1})\,|\,x_t, \tilde\sigma_t]\},   (9)

where V_t(x_t, k_{t-1}, \epsilon_t; \tilde\sigma^t) is the value function under the perceived future self's strategy profile \tilde\sigma^t \equiv \{\tilde\sigma_\tau\}_{\tau=t}^T. The perception-perfect strategy is defined as a strategy profile \sigma^* \equiv \{\sigma^*_t\}_{t=1}^T such that each \sigma^*_t is a best response to her perceived future strategy profile \sigma^{*,t+1}:

\sigma^*_t(x_t, k_{t-1}, \epsilon_t) = \arg\max_{k_t \in D}\,\{w_{k_t,t}(x_t, k_{t-1}; \sigma^{*,t+1}) + \epsilon_{k_t}\}.   (10)

Similarly, the equilibrium CCPs are defined via the following equation:

p_{k_t,t}(x_t, k_{t-1}) \equiv \Pr[\sigma^*_t(x_t, k_{t-1}, \epsilon_t) = k_t] = \Pr[w_{k_t,t}(x_t, k_{t-1}; \sigma^{*,t+1}) + \epsilon_{k_t} \geq w_{j,t}(x_t, k_{t-1}; \sigma^{*,t+1}) + \epsilon_j,\ \forall j \neq k_t].   (11)

Under the type-1 extreme value distribution, the equilibrium CCPs can be represented as

p_{k_t,t}(x_t, k_{t-1}) = \frac{\exp(w_{k_t,t}(x_t, k_{t-1}; \sigma^{*,t+1}))}{\sum_{j\in D}\exp(w_{j,t}(x_t, k_{t-1}; \sigma^{*,t+1}))},   (12)

where the perceived long-run value function v_t(x_t, k_{t-1}; σ^{*,t}) can be further represented as

\begin{aligned}
v_t(x_t, k_{t-1}; \sigma^{*,t}) &= \gamma - \log p_{K,t}(x_t, k_{t-1}) + w_{K,t}(x_t, k_{t-1}; \sigma^{*,t+1}) \\
&\quad + \delta(1-\beta)\sum_{k_t\in D} I_{k_t,k_{t-1}}\sum_{x_{t+1}\in X} v_{t+1}(x_{t+1}, k_t; \sigma^{*,t+1})\,Q_{k_t}(x_{t+1}|x_t)\,p_{k_t,t}(x_t, k_{t-1}), && t < T, \\
v_T(x_T, k_{T-1}) &= \gamma - \log p_{K,T}(x_T, k_{T-1}) + w_{K,T}(x_T, k_{T-1}), && t = T.
\end{aligned}   (13)


3 Identification

This section studies how the unknown primitives of the model, {Q_1, ..., Q_K; β; δ; u_k(x, k_{-1})}, are identified. The data record the state variables and the actions taken by the agent in each period. I focus on identifying the present bias β and the discount factor δ; given the observed states and decisions, the transition matrices Q_k can be identified directly.

Notice that the choice decisions depend on the primitives only through the value differences w_k(x, k_{-1}) − w_K(x, k_{-1}). Specifically, under the type-1 extreme value distribution, we have

\ln\left(\frac{p_k(x, k_{-1})}{p_K(x, k_{-1})}\right) = w_k(x, k_{-1}) - w_K(x, k_{-1}),   (14)

for all k ∈ D, x ∈ X. Thus, the choice-specific value contrasts can be uniquely recovered from the observed choice probabilities, and the question boils down to identifying the present-biased and (exponential) discount factors and the flow utility functions from the identified value contrasts and transition matrices Q_k, an approach pioneered by Hotz and Miller (1993) [10].

Before examining the identification of the underlying model primitives, we first introduce the specific features of the reverse mortgage on top of the general framework. In the reverse mortgage setting, the borrower in each period has three options, D = {0, 1, 2}: choice 0 means the agent continues (and makes a decision again next period); choice 2 means the agent terminates the project immediately; and for choice 1, the project is terminated if the agent already chose option 1 in the previous period. Thus choices 1 and 2 are both terminating actions; that is, I_{k=2,k_{-1}} = 0 and I_{k=1,k_{-1}=1} = 0.

The choice-specific values can then be written as follows:

w_k(x, k_{-1}=0) = \begin{cases} u_0(x,0) + \beta\delta\sum_{x'\in X} v(x',0)\,Q_0(x'|x), & k=0 \\ u_1(x,0) + \beta\delta\sum_{x'\in X} v(x',1)\,Q_1(x'|x), & k=1 \\ u_2(x,0), & k=2 \end{cases}   (15)

w_k(x, k_{-1}=1) = \begin{cases} u_0(x,1) + \beta\delta\sum_{x'\in X} v(x',0)\,Q_0(x'|x), & k=0 \\ u_1(x,1), & k=1 \\ u_2(x,1), & k=2 \end{cases}   (16)

We then exploit the special feature of multiple terminating actions to rewrite the values of actions 0 and 1 relative to the terminating action 2 as

w_k(x, k_{-1}) - w_2(x, k_{-1}) = \begin{cases} u_0(x,0)-u_2(x,0) + \beta\delta\sum_{x'} v(x',0)\,Q_0(x'|x), & k_{-1}=0,\ k=0 \\ u_0(x,1)-u_2(x,1) + \beta\delta\sum_{x'} v(x',0)\,Q_0(x'|x), & k_{-1}=1,\ k=0 \\ u_1(x,0)-u_2(x,0) + \beta\delta\sum_{x'} v(x',1)\,Q_1(x'|x), & k_{-1}=0,\ k=1 \\ u_1(x,1)-u_2(x,1), & k_{-1}=1,\ k=1 \end{cases}   (17)


Consequently, the difference in the log odds ratio of action 1 across lagged actions k_{-1} can be represented as

\begin{aligned}
\Delta\ln p_{12}(x) &\equiv \ln\left(\frac{p_1(x,0)}{p_2(x,0)}\right) - \ln\left(\frac{p_1(x,1)}{p_2(x,1)}\right) = [w_1(x,0)-w_2(x,0)] - [w_1(x,1)-w_2(x,1)] \\
&= [u_1(x,0)-u_2(x,0)] - [u_1(x,1)-u_2(x,1)] + \beta\delta\sum_{x'} v(x',1)\,Q_1(x'|x),
\end{aligned}   (18)

which provides a moment condition involving the product βδ of the present-biased and discount factors and the long-term value v(x', 1), while the log odds difference Δln p_{12}(x) is identified directly from the data. This quantity is determined by three components: the relative utilities [u_1(x,0) − u_2(x,0)] − [u_1(x,1) − u_2(x,1)], the discount factor βδ, and the perceived long-term value v(x', 1).

Because an individual in a DDC model makes decisions based on relative rather than absolute utilities, some restrictions on the utility functions are necessary. In general, such an exclusion restriction posits that some states, periods, or choices do not affect the flow utility directly but do affect the value function. Since the flow utility is time-invariant in the stationary infinite horizon case, I impose the exclusion restriction on the state k_{-1}:

Assumption 1 The flow utility satisfies the following restrictions:

u_0(x, 0) − u_0(x, 1) = h_0(x)   (the punishment for missing a payment in the previous year if continuing),
u_2(x, 0) − u_2(x, 1) = 0   (no punishment for missing a payment in the previous year if refinancing),
u_1(x, 1) − u_1(x, 0) = u_2(x)   (the extra cost due to termination by default).

Assumption 1 is a relaxed version of the restrictions in Blevins et al. (2020) [11], who assume that the previous action k_{-1} does not affect the flow utility of continuing (i.e. h_0(x) = 0) or of refinancing (immediate termination)³, and that the difference between the utility of defaulting after a missed payment in the previous period (u_1(x, 1)) and of defaulting without one (u_1(x, 0)) is exactly u_2(x) (because the utility from refinancing is not affected by the previous action, I omit k_{-1} there). The reason is that in a reverse mortgage, missing a payment for one period does not lead to foreclosure immediately; only if the borrower defaults for two consecutive periods does the mortgage go into foreclosure. Thus, the borrower can still enjoy the benefit from the mortgage if she defaults for one period, but when foreclosure is triggered by defaulting twice, the flow utility is the benefit from the mortgage in that period (u_1(x, 0)) plus the utility of terminating the mortgage u_2(x).⁴

³In the literature, most DDC models normalize the reference utility (here u_2(x, k_{-1})); this can also be done here with the conclusions still holding. However, because of the impact of normalization on counterfactual analysis [12][13][14], it is meaningful to keep the functional form of the reference utility. Instead, I put a restriction on the reference utility that can be economically meaningful. For example, the common way to terminate a reverse mortgage is to sell the house, and a reverse mortgage is "non-recourse", meaning that the borrower never owes more than the value of the house (any excess is insured by the Federal Housing Administration (FHA)). If the credit is used up and the borrower decides to sell the house, a default in the previous period may not affect her, because she only loses the house value. Similar restrictions are used in the literature, e.g. Mahajan, Michel and Tarozzi (2020) [7].


On the other hand, one can directly test the exclusion restriction by checking how the previous action affects the choice of continuing versus refinancing. That is, by the terminating-action feature, we have

\ln\left(\frac{p_0(x,0)}{p_2(x,0)}\right) - \ln\left(\frac{p_0(x,1)}{p_2(x,1)}\right) = [u_0(x,0)-u_2(x,0)] - [u_0(x,1)-u_2(x,1)] = h_0(x).   (19)

The left-hand side can be identified and estimated directly from the data, while the right-hand side equals h_0(x) if the exclusion restriction is satisfied.

If the restriction on the flow utility is satisfied, following the literature on identification in DDC models, we can stack the moment conditions in Equation 18 in matrix form:

\Delta\ln p_{12} = -u_2 + \beta\delta\,Q_1 v(1),   (20)

where Δln p_{12} ≡ [Δln p_{12}(x_1), ..., Δln p_{12}(x_J)]′ and v(k_{-1}) ≡ [v(x_1, k_{-1}), ..., v(x_J, k_{-1})]′ are J × 1 column vectors, and the transition matrix Q_1 is a J × J matrix. From this moment condition, if Q_1 is invertible, we can represent the perceived long-run value directly as a function of the observed CCPs, the present bias, and the discount factor.

To quantify the relationship between the long-term value and the CCPs, we express the perceived long-run value function v(x, k_{-1}) following Equation 6. Note that the perceived long-run value function varies with the lagged action k_{-1}:

\begin{aligned}
v(x,0) &= m(x,0) + u_2(x,0) + \delta(1-\beta)\,p_0(x,0)\sum_{x'\in X} v(x',0)\,Q_0(x'|x) + \delta(1-\beta)\,p_1(x,0)\sum_{x'\in X} v(x',1)\,Q_1(x'|x), \\
v(x,1) &= m(x,1) + u_2(x,1) + \delta(1-\beta)\,p_0(x,1)\sum_{x'\in X} v(x',0)\,Q_0(x'|x),
\end{aligned}   (21)

where m(x, k_{-1}) ≡ γ − log p_K(x, k_{-1}) is directly identified from the observed choice data. If the restriction on the flow utility is satisfied, i.e., u_2(x, 0) = u_2(x, 1), we can stack the above equations over all possible values of the state variable x and rewrite them in matrix form:

\begin{aligned}
v(0) &= m(0) + u_2 + \delta(1-\beta)\bar{Q}_{00}v(0) + \delta(1-\beta)\bar{Q}_{10}v(1), \\
v(1) &= m(1) + u_2 + \delta(1-\beta)\bar{Q}_{01}v(0),
\end{aligned}   (22)

where u_2 is the J × 1 vector of the flow utility of refinancing (immediate termination), and \bar{Q}_{k,k_{-1}}

⁴In the appendix of Blevins et al. (2020) [11], they adopt another restriction, u_1(x, 1) − u_1(x, 0) = 0, to show the identification result. It can easily be shown that the identification result here also holds under that restriction, or even under u_1(x, 1) − u_1(x, 0) = h_0(x) or u_2(x) + h_0(x).


is a mixture of the CCPs and the transition matrix and can be represented as

\bar{Q}_{k,k_{-1}} \equiv \begin{pmatrix} p_k(x_1,k_{-1})\,[Q_k(x_1|x_1), Q_k(x_2|x_1), \cdots, Q_k(x_J|x_1)] \\ p_k(x_2,k_{-1})\,[Q_k(x_1|x_2), Q_k(x_2|x_2), \cdots, Q_k(x_J|x_2)] \\ \vdots \\ p_k(x_J,k_{-1})\,[Q_k(x_1|x_J), Q_k(x_2|x_J), \cdots, Q_k(x_J|x_J)] \end{pmatrix},   (23)

which can be identified directly from the data.

Combining Equations 20 and 22 in matrix form:

\begin{pmatrix} \delta(1-\beta)\bar{Q}_{00} - I & \delta(1-\beta)\bar{Q}_{10} & I \\ \delta(1-\beta)\bar{Q}_{01} & -I & I \\ 0 & -\beta\delta Q_1 & I \end{pmatrix} \begin{pmatrix} v(0) \\ v(1) \\ u_2 \end{pmatrix} = \begin{pmatrix} -m(0) \\ -m(1) \\ -\Delta\ln p_{12} \end{pmatrix}.   (24)

The identification then proceeds in two steps. First, we show that the utility functions can be identified given the present bias β and the discount factor δ, which requires the following rank condition. Second, once the utility function is identified, the model generates polynomial moment conditions with the two discount factors as the only unknowns, allowing identification of both.

Assumption 2 The matrix A(β, δ) = \begin{pmatrix} \delta(1-\beta)\bar{Q}_{00} - I & \delta(1-\beta)\bar{Q}_{10} & I \\ \delta(1-\beta)\bar{Q}_{01} & -I & I \\ 0 & -\beta\delta Q_1 & I \end{pmatrix} is of full rank.

With the full rank condition in Assumption 2, we can summarize the first result in the following proposition:

Proposition 1 Under Assumptions 1 and 2, the flow utility function u_k(x, k_{-1}) can be identified given the discount factors β and δ.

Proof. By the moment conditions in matrix form 24, each element of the matrix A(β, δ) is known from the data given β and δ, and each element on the right-hand side of 24 is known from the data. Under Assumption 2, the value functions v(0), v(1) and the reference utility u_2 can then be expressed as linear transformations of the data given β and δ. Equation 17 then identifies the flow utilities of the other actions. □
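A numpy sketch of this first step (the function name and argument layout are mine; all inputs are the matrices and vectors already identified from the data):

```python
import numpy as np

def primitives_given_factors(beta, delta, Qbar00, Qbar10, Qbar01, Q1,
                             m0, m1, dlnp12):
    """Solve the stacked system (24) for v(0), v(1), u2 given (beta, delta).
    Qbar*/Q1 are J x J arrays built from CCPs and transitions; m0, m1 and
    dlnp12 are J-vectors identified from the data."""
    J = Q1.shape[0]
    I = np.eye(J)
    A = np.block([
        [delta * (1 - beta) * Qbar00 - I, delta * (1 - beta) * Qbar10, I],
        [delta * (1 - beta) * Qbar01,     -I,                          I],
        [np.zeros((J, J)),                -beta * delta * Q1,          I],
    ])
    rhs = -np.concatenate([m0, m1, dlnp12])
    sol = np.linalg.solve(A, rhs)   # requires Assumption 2 (A full rank)
    v0, v1, u2 = sol[:J], sol[J:2 * J], sol[2 * J:]
    return v0, v1, u2
```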

We have exhausted all the variation in the data in the above proposition: Equations 20 and 22 just identify u_2, v(1) and v(0), and Equation 17 just identifies the flow utilities of the other actions. To identify the discount factors β and δ, we need either a normalization of the utility or additional exclusion restrictions.

The simplest approach here is to normalize the reference utility: if the flow utility u_2(x) is normalized for two states x_1, x_2, then the discount factors can be identified given Equations 20 and 22 and the following rank condition.

Assumption 3 The matrices Q_1 and \bar{Q}_{01} are of full rank.


Then the reference utility u_2(x) can be expressed as follows:

\frac{1}{\beta\delta}\frac{1}{\delta(1-\beta)}\bar{Q}_{01}^{-1}Q_1^{-1}g - \frac{1}{\delta(1-\beta)}\bar{Q}_{01}^{-1}(m(1)+u_2) - \frac{1}{\beta\delta}\bar{Q}_{00}\bar{Q}_{01}^{-1}Q_1^{-1}g + \bar{Q}_{00}\bar{Q}_{01}^{-1}(m(1)+u_2) = m(0) + u_2 + \frac{\delta(1-\beta)}{\beta\delta}\bar{Q}_{10}Q_1^{-1}g,   (25)

where g = u_2 + Δln p_{12}. Expression 25 contains J equations and J + 2 unknowns (u_2, β and δ). If the flow utility u_2 is normalized for at least two states, then the discount factors β and δ can be identified given proper rank conditions. We summarize the result in the following proposition:

Proposition 2 Under Assumptions 1 and 3, and with u_2 normalized on at least two states x_1, x_2, the discount factors β and δ are identified.

One more thing to notice here is the data requirement. From the moment conditions 24 and the value expression 17, two periods of data are enough to identify the discount factors and the utility functions: two consecutive periods identify the transition matrices Q_k, and, since the previous action is one of the state variables, two periods of data are needed in any case.
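For the second step, a minimal profiling sketch (assuming a function like primitives_given_factors from the previous sketch): fix a candidate (β, δ), solve the linear system for u_2, and search for the pair that drives the normalized entries of u_2 to their normalized values (zero, say).

```python
import numpy as np
from itertools import product

def estimate_factors(solve_u2, grid_size=99):
    """Grid search for (beta, delta).  solve_u2(beta, delta) returns the
    u2 vector implied by the linear system (24); under the normalization
    u2(x1) = u2(x2) = 0, the identified pair drives u2[0], u2[1] to zero."""
    grid = np.linspace(0.01, 0.99, grid_size)
    best, argbest = np.inf, None
    for beta, delta in product(grid, grid):
        u2 = solve_u2(beta, delta)
        crit = u2[0] ** 2 + u2[1] ** 2   # moment violations at x1, x2
        if crit < best:
            best, argbest = crit, (beta, delta)
    return argbest
```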

3.1 Identification in finite horizon

I only analyze the sophisticated agent case. For periods t ≠ T, we exploit the special feature of multiple terminating actions to rewrite the values of actions 0 and 1 relative to the terminating action 2 as

w_{k,t}(x_t, k_{t-1}) - w_{2,t}(x_t, k_{t-1}) = \begin{cases} u_{0,t}(x_t,0)-u_{2,t}(x_t,0)+\beta\delta\sum_{x_{t+1}} v_{t+1}(x_{t+1},0)\,Q_0(x_{t+1}|x_t), & k_{t-1}=0,\ k_t=0 \\ u_{0,t}(x_t,1)-u_{2,t}(x_t,1)+\beta\delta\sum_{x_{t+1}} v_{t+1}(x_{t+1},0)\,Q_0(x_{t+1}|x_t), & k_{t-1}=1,\ k_t=0 \\ u_{1,t}(x_t,0)-u_{2,t}(x_t,0)+\beta\delta\sum_{x_{t+1}} v_{t+1}(x_{t+1},1)\,Q_1(x_{t+1}|x_t), & k_{t-1}=0,\ k_t=1 \\ u_{1,t}(x_t,1)-u_{2,t}(x_t,1), & k_{t-1}=1,\ k_t=1 \end{cases}   (26)

For t = T, the value contrast is simply the flow utility contrast between the respective actions. For a consistent agent the present-biased discount factor is β = 1, while for an inconsistent agent (whether sophisticated or naive) β ≠ 1.

Although more and more panel data are available, there are still plenty of cases where it is hard to obtain final period data, especially for long-term behavior such as mortgages. On the one hand, the time span is long (usually fifteen to thirty years in the mortgage market), so it is hard to find data for the last periods; on the other hand, there are selection problems because agents can leave the project due to death or termination. In this section, we analyze the identification of the discount factors in the finite horizon case when final period data are not available.

To simplify the problem, we make the following assumption.

Assumption 4 The reference utility function is invariant to the time period, i.e. u_{K,t}(x_t, k_{t-1}) = u_K(x_t, k_{t-1}).

Here the reference is action 2, so the reference utility function can be written as u_2(x_t, k_{t-1}). Similar to Equation 18 in the infinite horizon case, we have the condition

\begin{aligned}
\Delta\ln p^t_{12}(x_t) &\equiv \ln\left(\frac{p_{1,t}(x_t,0)}{p_{2,t}(x_t,0)}\right) - \ln\left(\frac{p_{1,t}(x_t,1)}{p_{2,t}(x_t,1)}\right) = [w_{1,t}(x_t,0)-w_{2,t}(x_t,0)] - [w_{1,t}(x_t,1)-w_{2,t}(x_t,1)] \\
&= [u_{1,t}(x_t,0)-u_{2,t}(x_t,0)] - [u_{1,t}(x_t,1)-u_{2,t}(x_t,1)] + \beta\delta\sum_{x_{t+1}} v_{t+1}(x_{t+1},1)\,Q_1(x_{t+1}|x_t).
\end{aligned}   (27)

I impose a restriction similar to the infinite horizon case:

Assumption 5 The flow utility satisfies the following restrictions:

u_{0,t}(x_t, 0) − u_{0,t}(x_t, 1) = h_t(x_t),
u_2(x_t, 0) − u_2(x_t, 1) = 0,
u_{1,t}(x_t, 1) − u_{1,t}(x_t, 0) = u_2(x_t).

Comparing Assumptions 5 and 1, the flow utility functions in the finite horizon case, except for the reference action, need not be stationary; that is, more flexibility is allowed in the finite horizon case. Although the reference utility functions are assumed stationary, there is still room to learn about the behavior of the reference action.

Under Assumption 5, Equation 27 can be rewritten (stacking over x_t) as

\Delta\ln p^t_{12} = -u_2 + \beta\delta\,Q_1 v_{t+1}(1).   (28)

Under Assumption 3, v_{t+1}(1) can still be expressed as a function of the discount factors and the reference utility:

v_{t+1}(1) = \beta^{-1}\delta^{-1} Q_1^{-1}(u_2 + \Delta\ln p^t_{12}).   (29)

The other key identification condition is analogous to the perceived long-run value function expression 21, but for the finite horizon case:

\begin{aligned}
v_t(x_t,0) &= m_t(x_t,0) + u_2(x_t,0) + \delta(1-\beta)\,p_{0,t}(x_t,0)\sum_{x_{t+1}\in X} v_{t+1}(x_{t+1},0)\,Q_0(x_{t+1}|x_t) \\
&\quad + \delta(1-\beta)\,p_{1,t}(x_t,0)\sum_{x_{t+1}\in X} v_{t+1}(x_{t+1},1)\,Q_1(x_{t+1}|x_t), \\
v_t(x_t,1) &= m_t(x_t,1) + u_2(x_t,1) + \delta(1-\beta)\,p_{0,t}(x_t,1)\sum_{x_{t+1}\in X} v_{t+1}(x_{t+1},0)\,Q_0(x_{t+1}|x_t).
\end{aligned}   (30)

Stacking over x_t, the perceived long-run value functions can be written as

\begin{aligned}
v_t(0) &= m_t(0) + u_2 + \delta(1-\beta)\bar{Q}_{t00}v_{t+1}(0) + \delta(1-\beta)\bar{Q}_{t10}v_{t+1}(1), \\
v_t(1) &= m_t(1) + u_2 + \delta(1-\beta)\bar{Q}_{t01}v_{t+1}(0),
\end{aligned}   (31)

where \bar{Q}_{t,k,k_{-1}} takes the same form as 23, except that it uses the finite horizon CCPs rather than the infinite horizon CCPs:

\bar{Q}_{t,k,k_{-1}} \equiv \begin{pmatrix} p_{k,t}(x_1,k_{-1})\,[Q_k(x_1|x_1), Q_k(x_2|x_1), \cdots, Q_k(x_J|x_1)] \\ p_{k,t}(x_2,k_{-1})\,[Q_k(x_1|x_2), Q_k(x_2|x_2), \cdots, Q_k(x_J|x_2)] \\ \vdots \\ p_{k,t}(x_J,k_{-1})\,[Q_k(x_1|x_J), Q_k(x_2|x_J), \cdots, Q_k(x_J|x_J)] \end{pmatrix}.   (32)

Since the finite horizon CCPs and the transition matrices can be identified directly from the data, \bar{Q}_{t,k,k_{-1}} is known.

The moment conditions 28 and 31 look similar to their counterparts in the infinite horizon case (Equations 20 and 22). However, since the finite horizon problem need not be stationary (i.e. v_t(0) and v_t(1) may vary with t), the moment conditions cannot be stacked in matrix form as in 24. Noting that the key identification result centers on u_2(x), the value functions v_t(0) and v_t(1) can still be expressed as functions of β, δ, and the reference utility u_2(x) if the following assumption holds.

Assumption 6 The matrices Q_1 and \bar{Q}_{t01} are of full rank.

If Assumption 6 holds, v_{t+1}(0) can be cancelled out, and v_t(0) can be expressed as

v_t(0) = m_t(0) + u_2 + \delta(1-\beta)\bar{Q}_{t10}v_{t+1}(1) + \bar{Q}_{t00}\bar{Q}_{t01}^{-1}(v_t(1) - m_t(1) - u_2).   (33)

Since Equation 29 holds for any time t, v_t(0) can also be expressed through β, δ and the reference utility. Plugging the expressions for v_t(0) and v_t(1) into the second equation of 31 yields

\begin{aligned}
\frac{1}{\beta\delta}Q_1^{-1}(u_2 + \Delta\ln p_{t-1,12}) &= m_t(1) + u_2 + \delta(1-\beta)\bar{Q}_{t01}\Big[m_{t+1}(0) + u_2 + \frac{\delta(1-\beta)}{\beta\delta}\bar{Q}_{t+1,10}Q_1^{-1}(u_2 + \Delta\ln p_{t+1,12}) \\
&\quad + \bar{Q}_{t+1,00}\bar{Q}_{t+1,01}^{-1}\Big(\frac{1}{\beta\delta}Q_1^{-1}(u_2 + \Delta\ln p_{t,12}) - m_{t+1}(1) - u_2\Big)\Big].
\end{aligned}   (34)

For a given t, this moment condition again contains J equations and J + 2 unknowns (β, δ and u_2). Given β and δ, the flow utility can be identified.

Proposition 3 Under Assumptions 4, 5 and 6, the flow utility functions u_{k,t}(x, k_{-1}) can be identified given the discount factors β and δ.

Proof. Given β and δ, Equation 34 is linear in u_2, so u_2 can be identified (assuming the proper rank condition holds). Equations 29 and 33 then identify the perceived long-run value functions v_t(0) and v_t(1). Finally, Equation 26 identifies the remaining flow utility functions u_{k,t}(x, k_{-1}). □

To identify the discount factors β and δ separately, we further need a normalization or exclusion restrictions. As in the infinite horizon case, if the flow utility u_2 is normalized for at least two states, then the discount factors β and δ can be identified given proper rank conditions. We summarize the result in the following proposition:

Proposition 4 Under Assumptions 4, 5 and 6, and with u_2 normalized on at least two states x_1, x_2, the discount factors β and δ are identified.

Finally, note the data requirement: since Equation 34 contains CCPs at t − 1, t, and t + 1, the identification result requires at least four periods of data (the action in the period before t − 1 is also needed, i.e. data at t − 2). Thus, finite horizon identification is more demanding in terms of data than the infinite horizon case. There is also an advantage, however: since we do not impose stationarity, more flexibility in the flow utilities u_{0,t}(x_t, k_{t-1}) and u_{1,t}(x_t, k_{t-1}) is allowed (compare Assumptions 5 and 1).

If we only care about the discount factors and would like to drop Assumption 4, the discount factors can still be identified with at least three periods of data counted back from the final period, but we need a stronger restriction:

Assumption 7 The flow utility satisfies the following restrictions:

u_{0,t}(x_t, 0) − u_{0,t}(x_t, 1) = h_t(x_t),
u_{1,t}(x_t, 1) − u_{1,t}(x_t, 0) = 0,
u_{2,t}(x_t, 0) − u_{2,t}(x_t, 1) = 0.

Comparing Assumptions 7 and 5, Assumption 7 makes a weaker assumption on the reference utility (it is not necessarily stationary) and imposes a different restriction on action 1 (here the difference between the utilities of action 1 under different previous actions is zero, but this can be relaxed to any known function). The second and third equalities in Assumption 7 also take a similar form in the appendix of Blevins et al. (2020) [11]. The identification result based on this assumption is summarized in the following proposition:

Proposition 5 Under Assumptions 7 and 6, and with u_{2,T-1} normalized on at least two states x_1, x_2, the discount factors β and δ can be identified if the data are observed from the final period T.

Proof. From Equation 13, the perceived long-run value function at period T can be written as

v_T(x_T, k_{T-1}) = \gamma - \log p_{2,T}(x_T, k_{T-1}) + u_{2,T}(x_T).   (35)

This shows that the final period value function is determined only by the CCPs and the reference utility (because there is no future period). From Equation 26, we now have the moment condition (stacking over x_{t-1})

\Delta\ln p_{t-1,12} = \beta\delta\,Q_1 v_t(1), \qquad t = 2, \cdots, T.   (36)

The value function v_T(1) can thus be expressed through the discount factors and the CCPs at period T − 1. Combining the above equations, the value function v_T(0) can also be expressed through the discount factors and the CCPs:

\Delta\ln p_{T-1,12} = \beta\delta\,Q_1\Big(v_T(0) + \ln\frac{p_{2,T}(0)}{p_{2,T}(1)}\Big).   (37)

From the perceived long-run value function at period T − 1, the analogous moment condition

v_{T-1}(1) = m_{T-1}(1) + u_{2,T-1} + \delta(1-\beta)\bar{Q}_{T-1,01}\,v_T(0)   (38)

still holds. Plugging in the expressions for the value functions, we obtain

\frac{1}{\beta\delta}Q_1^{-1}\Delta\ln p_{T-2,12} = m_{T-1}(1) + u_{2,T-1} + \delta(1-\beta)\bar{Q}_{T-1,01}\Big[\frac{1}{\beta\delta}Q_1^{-1}\Delta\ln p_{T-1,12} - \ln\frac{p_{2,T}(0)}{p_{2,T}(1)}\Big].   (39)

This moment condition contains J equations and J + 2 unknowns (β, δ, and u_{2,T-1}). If u_{2,T-1} is normalized in at least two states x_1, x_2, then the discount factors are identified. □

Notice that Proposition 5 only gives identification of the discount factors. To identify the rest of the non-stationary flow utility functions, more normalizations of the reference utility are needed.

4 Monte Carlo Simulations

This section simulates borrower behavior in a reverse mortgage. In each period, the borrower can choose to continue, default, or terminate the loan. In particular, default can last for two consecutive periods: only when the borrower defaults for two periods in a row does the loan go into foreclosure. The borrower can also terminate the loan when she moves or repays the loan.

1. action set: D = {0, 1, 2}. Action 0 means continuation, action 1 means default, and action 2 means termination. In particular, the loan terminates if the agent chooses default in two consecutive periods; in that case the agent's flow utility is the utility from the loan plus the termination utility, since the loan comes to an end.

2. state variable: Supp(X) = {1, 2, 3, 4, 5, 6}, standing for the credit left.

3. time horizon: in this simulation, I use T = 3 and draw N = 3000 agents.

4. discount factors: the exponential discount factor is δ = 0.8 and the sophisticated present-biased discount factor is β = 0.5.

5. flow utility: following the exclusion restriction in Assumption 1,

u_k(x, k_{-1}) = \begin{cases} \theta_1 + \theta_2 x, & k = 0,\ k_{-1} = 0 \\ \theta_3 + \theta_4 x, & k = 0,\ k_{-1} = 1 \\ \theta_5 + \theta_6 x, & k = 1,\ k_{-1} = 0 \\ (\theta_5 + \theta_7) + (\theta_6 + \theta_8)x, & k = 1,\ k_{-1} = 1 \\ \theta_7 + \theta_8 x, & k = 2 \end{cases}   (40)

In the simulation, I set θ_1 = 2, θ_2 = 1, θ_3 = 1.5, θ_4 = 0.8, θ_5 = 1.3, θ_6 = −1, θ_7 = 2.5, θ_8 = 0.6. Following the normalization mentioned before, u_2(x = 1) = u_2(x = 2) = 0.

6. transition matrix: for simplicity, set the transition matrices to

Q_0(x'|x) = Q_1(x'|x) = \begin{pmatrix} 0.41 & 0.20 & 0.14 & 0.10 & 0.08 & 0.07 \\ 0.18 & 0.36 & 0.18 & 0.12 & 0.09 & 0.07 \\ 0.11 & 0.17 & 0.34 & 0.17 & 0.11 & 0.09 \\ 0.09 & 0.11 & 0.17 & 0.34 & 0.17 & 0.11 \\ 0.07 & 0.09 & 0.12 & 0.18 & 0.36 & 0.18 \\ 0.07 & 0.08 & 0.10 & 0.14 & 0.20 & 0.41 \end{pmatrix}   (41)

4.1 Estimation

I generate the actions of individual agents under the hyperbolic discounting framework, then estimate the model primitives for both the hyperbolic discounting and the exponential discounting models. For comparison, I present both sets of estimates in Table 1 (the means and standard errors are based on 100 resampling replications).

From Table 1, it can be seen that misspecifying the model as exponential discounting underestimates δ, since the misspecified model has only one discount factor while the true model has two. The exponential discounting model also estimates some parameters of the flow utility incorrectly; for example,

17

Table 1: estimation results of parameters

| parameter | true value | sophisticated agent | s.e. | consistent agent | s.e. |
|-----------|-----------|---------------------|------|------------------|------|
| θ1        | 2.00      | 2.04                | 0.39 | 2.18             | 0.40 |
| θ2        | 1.00      | 1.00                | 0.27 | 0.88             | 0.26 |
| θ3        | 1.50      | 1.65                | 0.75 | 2.06             | 0.81 |
| θ4        | 0.80      | 0.77                | 0.37 | 0.64             | 0.38 |
| θ5        | 1.30      | 1.36                | 0.43 | 1.58             | 0.44 |
| θ6        | −1.00     | −1.03               | 0.27 | −1.10            | 0.28 |
| θ7        | 2.50      | 2.55                | 0.46 | 2.73             | 0.47 |
| θ8        | 0.60      | 0.59                | 0.27 | 0.46             | 0.27 |
| δ         | 0.80      | 0.80                | 0.09 | 0.45             | 0.04 |
| β         | 0.50      | 0.51                | 0.08 |                  |      |

it overestimates θ_3. From the table, one might be tempted to think that there is a relation between the present-biased model and the exponential model of the form β_p · δ_p = δ_e (here the subscript 'p' stands for the present-biased discounting model and 'e' for the exponential discounting model). This arises because the simulated model consists of only three periods, so the present bias may not be severe: the agents only look two periods ahead. If the time horizon were longer, this pattern would disappear.

Furthermore, I plot the estimates of β and δ in Figure 1, together with the curve βδ = 0.4. The estimates line up along this inversely proportional curve, a pattern also consistent with Abbring, Daljord and Iskhakov (2019) [6]: the unknown discount factors in moment conditions like 34 can be summarized by δ and βδ. However, whereas the separate estimation of the discount factors performs poorly in that paper, here the separate estimates of the discount factors are fairly good (the standard errors are fairly small).

I also generate the actions of individual agents under the exponential discounting model and estimate both models (hyperbolic and exponential) on these data, to show the general applicability of the hyperbolic discounting model. The true flow utility parameters are the same as in Table 1. The estimation results are summarized in Table 2. Under the sophisticated model, β is estimated to be 0.97 and δ to be 0.86, close to the true values (β = 1, δ = 0.8). The other flow utility parameters are also close to their true values. It appears that when the true model is exponential, estimating the sophisticated agent model still performs fairly well.

4.2 Counterfactual Analysis

In this section, I explore immediate versus deferred punishments (or subsidies). To be precise, I compare the case where a subsidy p = 0.65 is given immediately when the agent chooses default with the case where the subsidy p(1 + r) is given the period after the agent chooses default. To avoid income effects, I set the interest rate r = 1/(βδ) − 1 = 150%. The results are shown in Figures 2 and 3, where "sop" denotes the sophisticated agent model and "exp" the consistent agent model (with only an exponential discount factor).


Figure 1: β-δ graph

Table 2: estimation results (data from exponential agents)

| parameter | true value | sophisticated agent | s.e. | consistent agent | s.e. |
|-----------|-----------|---------------------|------|------------------|------|
| θ1        | 2.00      | 2.02                | 0.44 | 2.00             | 0.44 |
| θ2        | 1.00      | 1.00                | 0.30 | 1.01             | 0.30 |
| θ3        | 1.50      | 1.40                | 1.01 | 1.48             | 0.98 |
| θ4        | 0.80      | 0.82                | 0.46 | 0.82             | 0.46 |
| θ5        | 1.30      | 1.35                | 0.44 | 1.33             | 0.43 |
| θ6        | −1.00     | −1.05               | 0.31 | −1.03            | 0.30 |
| θ7        | 2.50      | 2.53                | 0.46 | 2.51             | 0.46 |
| θ8        | 0.60      | 0.59                | 0.30 | 0.61             | 0.30 |
| δ         | 0.80      | 0.86                | 0.11 | 0.82             | 0.10 |
| β         | 1.00      | 0.97                | 0.05 |                  |      |

The solid lines show the original models and the dashed lines the counterfactual outcomes with the subsidy. I only show the case x = 1; the cases for other values of x follow a similar pattern.

First consider the original models. The sophisticated agents tend to default earlier than the misspecified consistent agents if they did not miss a payment in the previous period (k_{-1} = 0), because the flow utility of default is high here. However, after missing a payment in the previous period (k_{-1} = 1), the sophisticated agents default much less than the misspecified consistent agents. This is related to the higher discount factors of the sophisticated agents (0.51 · 0.8^t) compared with the misspecified consistent case (0.45^t). After the subsidy is imposed on default, the default rate does increase under both models, but the patterns differ. For the sophisticated agents, the default rate increases fairly evenly across periods when they did not miss a payment in the previous period, but it increases only in the last few periods if they did. For the misspecified consistent agents, the default rates increase evenly across periods whether or not they missed a payment in the previous period. Thus, misspecification of sophisticated agents does have a substantial effect on counterfactual predictions of the default rate.

Comparing the two subsidy policies (immediate versus deferred) in Figures 2 and 3, the default rates differ slightly: under the misspecified consistent agent model, the default rate is slightly higher when the subsidy is deferred than when it is immediate. This, too, is due to the misspecified exponential discount factor.

Figure 2: default rate with immediate subsidy

5 Conclusion

In this paper, I discuss the identification of the hyperbolic discount factors in the widely used dynamic discrete choice model, with both finite and infinite horizons. Under economically meaningful exclusion restrictions, the identification of the discount factors is characterized by polynomial moment conditions. The presence of multiple terminating actions greatly reduces the complexity of the identification and also helps relax the restrictions imposed on the flow utility function.


Figure 3: default rate with deferred subsidy

In the infinite horizon case, the joint identification of the hyperbolic (present-biased) discount factor and the exponential discount factor depends on normalizing the reference utility at a minimum of two states.

I also examine the effect of model misspecification through Monte Carlo simulations. When the underlying model is hyperbolic, estimation with an exponential discounting model leads to incorrect estimates of the discount factor and of the flow utility parameters. I also find that misspecification can result in misleading counterfactual outcomes.


Appendix

In this appendix I briefly present the naive agent model and how the discount factors can potentially be identified. I follow the existing literature and introduce a factor β̃ measuring the level of naivety, so that the models can be nested in a unified framework. For the (partially) naive agent, the problem is more complicated. First define the future self's choice-specific value function:

z_{k_t}(x_t, k_{t-1}=0) = \begin{cases} u_0(x_t,0) + \tilde\beta\delta\sum_{x_{t+1}\in X} v_{t+1}(x_{t+1},0)\,Q_0(x_{t+1}|x_t), & k_t = 0 \\ u_1(x_t,0) + \tilde\beta\delta\sum_{x_{t+1}\in X} v_{t+1}(x_{t+1},1)\,Q_1(x_{t+1}|x_t), & k_t = 1 \\ u_2(x_t,0), & k_t = 2 \end{cases}   (42)

z_{k_t}(x_t, k_{t-1}=1) = \begin{cases} u_0(x_t,1) + \tilde\beta\delta\sum_{x_{t+1}\in X} v_{t+1}(x_{t+1},0)\,Q_0(x_{t+1}|x_t), & k_t = 0 \\ u_1(x_t,1), & k_t = 1 \\ u_2(x_t,1), & k_t = 2 \end{cases}   (43)

This differs from the current self's choice-specific value function in that it uses β̃ rather than β as the present-biased discount factor: at the current period, the partially naive agent thinks that she will use β̃ as the present-biased discount factor in the future (though she will not behave that way when the future actually arrives). Also define the perceived CCPs at period T − 1 as

\tilde{P}_{k_t}(x_t, k_{t-1}) = \frac{\exp(z_{k_t}(x_t, k_{t-1}))}{\sum_{j\in D}\exp(z_j(x_t, k_{t-1}))}.   (44)

Then the perceived long-run value function for period T − 1 is

\begin{aligned}
v_{T-1}(x_{T-1}, k_{T-2}=0) &= E_{\epsilon_{T-1}}\Big[u_{\tilde\sigma_{T-1}(x_{T-1},0,\epsilon_{T-1})}(x_{T-1},0) + \epsilon_{T-1}(\tilde\sigma_{T-1}) + \delta\sum_{x_T\in X} V_T(x_T,\tilde\sigma_{T-1})\,Q_{\tilde\sigma_{T-1}}(x_T|x_{T-1})\Big] \\
&= E_{\epsilon_{T-1}}\Big[z_{\tilde\sigma_{T-1}(x_{T-1},0,\epsilon_{T-1})}(x_{T-1},0) + \epsilon_{T-1}(\tilde\sigma_{T-1}) + \delta(1-\tilde\beta)\sum_{x_T\in X} v_T(x_T,\tilde\sigma_{T-1})\,Q_{\tilde\sigma_{T-1}}(x_T|x_{T-1})\Big] \\
&= \gamma - \ln\tilde{P}_2(x_{T-1},0) + z_2(x_{T-1},0) + \delta(1-\tilde\beta)\tilde{P}_0(x_{T-1},0)\sum_{x_T\in X} v_T(x_T,0)\,Q_0(x_T|x_{T-1}) \\
&\quad + \delta(1-\tilde\beta)\tilde{P}_1(x_{T-1},0)\sum_{x_T\in X} v_T(x_T,1)\,Q_1(x_T|x_{T-1}),
\end{aligned}   (45)


\begin{aligned}
v_{T-1}(x_{T-1}, k_{T-2}=1) &= E_{\epsilon_{T-1}}\Big[u_{\tilde\sigma_{T-1}(x_{T-1},1,\epsilon_{T-1})}(x_{T-1},1) + \epsilon_{T-1}(\tilde\sigma_{T-1}) + \delta\sum_{x_T\in X} V_T(x_T,\tilde\sigma_{T-1})\,Q_{\tilde\sigma_{T-1}}(x_T|x_{T-1})\Big] \\
&= E_{\epsilon_{T-1}}\Big[z_{\tilde\sigma_{T-1}(x_{T-1},1,\epsilon_{T-1})}(x_{T-1},1) + \epsilon_{T-1}(\tilde\sigma_{T-1}) + \delta(1-\tilde\beta)\sum_{x_T\in X} v_T(x_T,\tilde\sigma_{T-1})\,Q_{\tilde\sigma_{T-1}}(x_T|x_{T-1})\Big] \\
&= \gamma - \ln\tilde{P}_2(x_{T-1},1) + z_2(x_{T-1},1) + \delta(1-\tilde\beta)\tilde{P}_0(x_{T-1},1)\sum_{x_T\in X} v_T(x_T,0)\,Q_0(x_T|x_{T-1}).
\end{aligned}   (46)

These can then be plugged into the choice-specific value function, and the CCPs for period T − 2 can be derived. Longer horizons extend easily from this three-period construction.

Directly identifying the discount factors and the flow utility parameters is quite complex, because the CCPs (P) observed from the data do not enter the naive agent's continuation value directly; only the perceived CCPs (P̃) are relevant. One way to simplify the problem is to combine the results for sophisticated and naive agents: assuming the flow utilities (and the exponential and present-biased discount factors) are exactly the same for sophisticated and naive agents, the only remaining unknown is the naivety β̃, and it can be identified from the moment conditions 28, 45 and 46.
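In code, the only change relative to the sophisticated case is that the softmax in Equation 44 is applied to the future self's values z computed with β̃; a minimal numpy sketch (array layout mine):

```python
import numpy as np

def perceived_ccp(z):
    """Equation (44): perceived CCPs from the future self's choice-specific
    values z, a (J, K) array computed with beta-tilde instead of beta."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```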


References

[1] T. O'Donoghue and M. Rabin, "Doing it now or later," American Economic Review, vol. 89, no. 1, pp. 103–124, 1999.

[2] J. Rust, "Structural estimation of Markov decision processes," in Handbook of Econometrics, vol. 4, pp. 3081–3143, 1994.

[3] T. Magnac and D. Thesmar, "Identifying dynamic discrete decision processes," Econometrica, vol. 70, no. 2, pp. 801–816, 2002.

[4] H. Fang and Y. Wang, "Estimating dynamic discrete choice models with hyperbolic discounting, with an application to mammography decisions," International Economic Review, vol. 56, no. 2, pp. 565–596, 2015.

[5] J. H. Abbring and Ø. Daljord, "A comment on 'Estimating dynamic discrete choice models with hyperbolic discounting' by Hanming Fang and Yang Wang," International Economic Review, vol. 61, no. 2, pp. 565–571, 2020.

[6] J. H. Abbring, Ø. Daljord, and F. Iskhakov, "Identifying present-biased discount functions in dynamic discrete choice models," working paper, 2019.

[7] A. Mahajan, C. Michel, and A. Tarozzi, "Identification of time-inconsistent models: The case of insecticide treated nets," NBER Working Paper 27198, 2020.

[8] M. K. Chan, "Welfare dependence and self-control: An empirical analysis," Review of Economic Studies, vol. 84, no. 4, pp. 1379–1423, 2017.

[9] J. Rust, "Optimal replacement of GMC bus engines: An empirical model of Harold Zurcher," Econometrica, vol. 55, no. 5, pp. 999–1033, 1987.

[10] V. J. Hotz and R. A. Miller, "Conditional choice probabilities and the estimation of dynamic models," The Review of Economic Studies, vol. 60, no. 3, pp. 497–529, 1993.

[11] J. R. Blevins, W. Shi, D. R. Haurin, and S. Moulton, "A dynamic discrete choice model of reverse mortgage borrower behavior," International Economic Review, 2020.

[12] A. Norets and X. Tang, "Semiparametric inference in dynamic binary choice models," Review of Economic Studies, vol. 81, no. 3, pp. 1229–1262, 2014.

[13] V. Aguirregabiria and J. Suzuki, "Identification and counterfactuals in dynamic models of market entry and exit," Quantitative Marketing and Economics, vol. 12, pp. 267–304, 2014.

[14] M. Kalouptsidi, P. Scott, and E. Souza-Rodrigues, "Identification of counterfactuals in dynamic discrete choice models," NBER Working Paper 21527, 2015.

[15] R. Peeters, "Stochastic games with hyperbolic discounting," working paper, 2004. Available: https://core.ac.uk/download/pdf/231371786.pdf
