

Optimal Contracts with Strategic Shirking: Promotion, Suspension, Renegotiation, and Fixed Income in a Continuous-Time Principal-Agent Setting

John Y. Zhu [1]

May 30, 2009

Abstract

We derive explicitly the family of optimal dynamic contracts and the corresponding optimal value function in a continuous time principal-agent model. We show that in many situations, shirking is an integral part of the optimal contract, playing a diverse role as threat, reward, or the promise thereof. The rich interplay between high effort and shirking induces a panoply of optimal contractual forms, allowing us to exhibit, explicitly, contracts with endogenously determined periods of promotion and demotion; signing bonuses in the form of initial cash/private benefit packages; renegotiation clauses; and fixed salaries. In addition, we develop new constructive techniques to tackle the phase change problems inherent in contracting with shirking, providing tools to analyze models with more complex utility structures; and we perform comparative statics on the structure theorem which then help illuminate certain aspects of the change in incentives phenomena and put into context some of the optimality results of previous papers.

Keywords: Principal-agent problem, optimal contracts, shirking, continuous-time, promotion, suspension, renegotiation

JEL Numbers: C61, C63, D82, D86, M51, M52, M55

[1] U.C. Berkeley. The author would like to express his deepest gratitude to his advisor Professor Robert Anderson, for his guidance and the many hours he spent assisting in the revision of this paper. The author is indebted to Professors Christina Shannon and Adam Szeidl for their many helpful suggestions and criticisms. The author would also like to thank Derek Horstmeyer and Professor Kyna Fong for reading through and commenting on the rough drafts of this paper and also all the participants in the Berkeley Theory Seminar. Lastly, the author appreciates the discussions he has had with Professor Sourav Chatterjee on some of the technical aspects of this work.


1 Introduction

1.1 Motivation

In this paper we study the principal-agent problem in a dynamic setting where a project owned by a risk-neutral principal is contracted out to a risk-neutral agent to manage. The project’s revenue stream is stochastic, assumed to be Brownian motion with a variable drift. The drift is controlled by the agent, and is a function of the effort the agent puts into managing the project. The agent can change his effort level over time depending on the performance of the project, and the set of effort levels the agent can choose from is a closed and bounded interval. Effort is unobserved by the principal, and we assume the disutility it imposes on the agent is linear in the effort level. The principal’s utility function is an expected time-discounted integral over stochastic cash flows. The agent’s utility function is similar except that it also takes into account the cost of effort. We assume that the principal is at least as patient as the agent. The principal receives the entire revenue stream generated by the project. To properly motivate the agent, the principal writes a contract which stipulates a revenue-stream-contingent compensation plan for the agent and a termination clause. The agent, when presented with such a proposal, will then privately select an effort process that maximizes his own utility. The agent can quit at any time, but the principal must be able to fully commit to the contract. When the project is terminated or the agent prematurely quits, both parties exercise their outside options, which provide them with their reservation utilities. Our goal is to find the optimal dynamic contracts within this framework.
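The verbal description of the model above can be summarized in a short sketch. The notation is purely illustrative and is not taken from the paper: Y denotes cumulative revenue, a_t the hidden effort, \mu(\cdot) the effort-to-drift map, I cumulative compensation, c the marginal cost of effort, r and \gamma the discount rates of the principal and agent, \tau the termination time, and L and R the reservation utilities of the principal and agent.

```latex
% Illustrative notation only; every symbol here is an assumption, not the paper's.
\begin{align}
  % Revenue: Brownian motion whose drift is controlled by hidden effort
  dY_t &= \mu(a_t)\,dt + \sigma\,dZ_t, \qquad a_t \in [a_L, a_H], \\
  % Principal: expected discounted revenue net of compensation, plus outside option
  U_P &= \mathbb{E}\left[\int_0^{\tau} e^{-rt}\,(dY_t - dI_t) + e^{-r\tau} L\right], \\
  % Agent: expected discounted compensation net of a cost linear in effort
  U_A &= \mathbb{E}\left[\int_0^{\tau} e^{-\gamma t}\,(dI_t - c\,a_t\,dt) + e^{-\gamma\tau} R\right],
  \qquad r \le \gamma .
\end{align}
```

The condition r \le \gamma encodes the assumption that the principal is at least as patient as the agent.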

This paper is part of a small but growing literature on continuous-time optimal contracting. The model is related to those considered in Holmstrom and Milgrom (1987), Sannikov (2008), and DeMarzo and Sannikov (2006). In each of these papers, the optimal contract followed the resolution of a fundamental issue in incentive provision. Holmstrom and Milgrom (1987) tackled the basic problem of how to sustain the incentives for a prescribed level of effort over time. The authors’ pioneering work towards answering this theoretical issue led them to introduce many of the core stochastic techniques which now form the foundation of all work done in continuous-time optimal contracting. Focusing on a model with exponential utilities, they then derived one of the earliest examples of an optimal continuous-time contract. The contract aggregates intertemporal incentives through an elegant linear equity holding and is able to induce the agent to apply a fixed effort level throughout the duration of the project. Now equipped with a basic set of tools for incentive provision, Sannikov (2008) set out to find a “flexible method [for] analyzing optimal dynamic contracts” in a wide range of settings. The paper refined and expanded the theory initiated by Holmstrom and Milgrom (1987), introduced new ideas from ordinary differential equations (ODE), and demonstrated how the existence of optimal contracts in a class of general continuous-time principal-agent settings can be deduced from a certain Hamilton-Jacobi-Bellman (HJB) equation. Those methods have now been widely used, both by Sannikov in later work and by others. Building on these papers and the work of DeMarzo and Fishman (2007) and Biais, Mariotti, Plantin, and Rochet (2007), DeMarzo and Sannikov (2006), in a twist on the usual principal-agent story, considered a cash diversion model with savings. They studied the problem of inducing the agent to not steal any cash from a project where only the agent can observe the cash flow. They found that the incentive structure required for such a truthful revelation of the cash flow can also be used to induce highest effort in related effort-based principal-agent problems. The resultant contract can be compactly characterized by an HJB-ODE in the spirit of Sannikov (2008) and is called the credit-limit contract. It can be implemented with the basic securities of the financial markets, it can be reinterpreted as a baseline highest effort contract, and it will play an important role in our work as well.

There is a logical progression in the results of the three papers we have just discussed. Holmstrom and Milgrom (1987) showed how to provide incentives in continuous time. Then Sannikov (2008) showed that the instructions for the optimal provision of incentives are encoded in an HJB-equation. Then DeMarzo and Sannikov (2006) solved the HJB-equation when only truthful revelation / high effort was involved, arriving at an explicit optimal contract that could be analyzed in great detail. In the present paper we turn our attention to the next issue: explicitly characterizing and analyzing families of optimal contracts involving lower effort levels. The observation is that in many situations, the optimal contract, whatever it may be, often requires having the agent switch between high and low effort at least some of the time.

As we have mentioned, the credit limit contract of DeMarzo and Sannikov (2006) can be implemented in other effort-based models (including ours) as a baseline highest-effort contract. While the revelation principle, which holds in the cash diversion model, makes the credit limit contract optimal in DeMarzo and Sannikov (2006), this may not necessarily be the case in other models where the credit limit / highest effort contract, though not optimal, is still a legitimate incentive-compatible contract. Thus we are led to ask: What is the value of inducing changes in the effort level of a contract over time? In particular, what role can strategically induced shirking play in optimal contracting? It is a natural question to ask in light of the work of DeMarzo and Sannikov (2006), and indeed they broached the issue towards the end of their paper. In our efforts to understand why shirking occurs, we begin with the credit limit / highest effort contract in our model and then see if this contract can be improved by relaxing the incentives so that the agent is induced to apply lower effort sometimes. We then set out to find the optimal contracts. At this point, our theoretical work begins to diverge from what has come before. Recall that the optimal contracts are governed by the solution of an HJB-equation. Outside of a few exceptional cases, it is difficult to tackle the HJB-equation head on, and in our paper we proceed along a different route. Specifically, we introduce a versatile algorithm that can systematically improve a contract by repeatedly altering the incentive structure of the contract. By applying the algorithm repeatedly, we always arrive at the optimal contract provided one exists. Moreover, when there is no optimal contract, the algorithm will always lead us to arbitrarily close-to-optimal contracts. The HJB-equation, on the other hand, may provide little information on such non-optimal contracts. Also of critical importance is knowing when an optimal contract needs to switch between inducing high and low effort. These phase changes correspond to phase change points located on the solution of the HJB-equation, but again, they may be hard to compute.[2] The algorithm resolves the problem of computing such points, allowing us to find not only the phase change points but also the solution of the HJB-equation.

Our results show that there is a striking diversity of optimal contracts depending on the range of effort levels available to the agent. The nature of the optimal contracts can change drastically as the relative strength of the disutility to the principal and the utility to the agent of shirking is changed. Let us introduce a few of these optimal contracts. When the effects of shirking are terrible for the principal, then naturally a high effort contract is used. When shirking neither provides the agent with much gain nor hurts the principal too much, the optimal contract is a suspension type contract. Here shirking is induced during the suspension phase as a way for the principal to punish the agent for poor performance. It is viable because the principal can handle the relatively modest loss in revenue from such a phase. When the project’s performance is quite good even when the agent is shirking, then an optimal contract that may appear is the tenure-type contract, whereby the principal induces high effort from the agent initially through the promise of a permanent low-effort contract if the project’s revenue stream hits a certain predetermined high threshold. These optimal contracts and the ones that we will see later demonstrate the fundamental reason for potentially inducing lower effort levels. Strategically induced shirking can serve either as a form of reward complementing direct cash compensation, or as a form of punishment complementing contract termination. This added versatility allows the principal to write a contract with more nuanced incentives. The point is that pure cash compensation may be too rich a reward and that termination may be too harsh a punishment, but with a potentially more moderate alternative in shirking, the principal can achieve more by sometimes having the agent work less.

[2] In fact, some phase change points are singularities, where the second derivative and sometimes even the first derivative do not exist. The HJB-equation involves first and second derivatives and is therefore not well-defined at these exceptional points.

1.2 Related Literature and Further Discussion

The continuous-time optimal contracting literature goes as far back as the seminal paper by Holmstrom and Milgrom (1987). This work then paved the way for Sannikov (2008), DeMarzo and Sannikov (2006), and Sannikov (2007a), a far-reaching series of papers that have greatly clarified the intricate nature of incentive provision. The flexibility of the results along with their interplay with discrete theory and, in the case of the latter two papers, their simplicity of form and intuitive appeal serve to affirm the central importance of the continuous-time perspective in the study of incentives. The methodology has been powerfully applied in a beautiful paper on continuous-time games (Sannikov 2007b) and has found applications to issues ranging from debt covenants (Piskorski and Westerfield 2009) to healthcare (Fong 2008). The intertemporal moral hazard that is at the heart of our paper and much of the theory has been studied extensively in discrete time, including by Radner (1981, 1985) and Rogerson (1985a), and more recently in work by DeMarzo and Fishman (2007) and Biais et al. (2007). The recursive flavor that is prevalent throughout much of the analysis in our paper began with important insights by Abreu, Pearce and Stacchetti (1986), Spear and Srivastava (1987), and Green (1987).

There is a close connection between our model and that which is shared by DeMarzo and Fishman (2007), Biais et al. (2007) and, in particular, DeMarzo and Sannikov (2006). In those papers, an agent or entrepreneur owns a project for which he seeks a principal to finance. Just as in our model, the principal and agent in that setting are both risk-neutral, and the principal also bears the operating costs of the project. The contractual space is also the same: the set of revenue stream dependent payment plans plus a termination clause. The differences are that in their model, the effort level is assumed to be exogenous and fixed, but the agent can divert the cash flow through hidden actions, and can also supplement the revenue stream with his personal savings. In our model, the hidden, variable effort level is what drives the moral hazard problem, and cash diversion is not an option for the agent. Such economically significant distinctions seem to render these two models incomparable. However, the distortions in the cash flow reports that the agent can perform by stealing in their model can be mimicked to some extent by the distortions in the revenue stream that the agent can perform by changing effort levels in our model.[3] Thus one expects that at the very least, the optimal contracts in our model should exhibit some structural similarities to the optimal contract of their paper. Their optimal contract is the credit limit contract, where the agent is compensated by holding a fraction of the firm’s equity. A fixed credit line supports the firm’s operating expenses. Dividends are paid after the credit line is paid off, and termination occurs when the credit is exhausted. It turns out that this credit limit contract is the best high effort contract of our paper. In their papers, DeMarzo-Fishman and DeMarzo-Sannikov came to similar results when they studied discrete and continuous time versions of a binary effort model similar to our model. They also discovered that if the principal wants to always induce high effort from the agent, then the best way to do this is to write a contract which takes the form of the credit limit contract. More broadly, Biais et al. (2007) remark on the duality of models of cash diversion and effort:

In this model, an incentive problem arises because the entrepreneur can divert operating cash flows. An alternative approach would be to assume that the manager must exert unobservable effort to enhance cash flows, as in Innes (1990), Holmstrom and Tirole (1997), and in a previous draft of the present paper. While the two interpretations of the agency problem are slightly different and complementary, it turns out that when the effort choice is binary, the formal analysis of the two models is identical, except in one respect. In the cash-flow diversion model, the revelation principle implies that there is no loss of generality in requiring truthful revelation of cash-flow realizations. By contrast, in the unobservable effort model, one must impose additional restrictions to ensure that it is optimal to request [high] effort in all contingencies, which raises additional mathematical difficulties. To simplify the exposition, we have therefore chosen to focus on the cash-flow diversion scenario.

[3] DeMarzo and Fishman (2007), Biais et al. (2007), and DeMarzo and Sannikov (2006) all note the possibility of reinterpreting cash diversion as negative effort.


A consequence of our results is a complete resolution of the alternative unobservable effort problem posed by these authors. Moreover, as a final demonstration of how cash diversion and shirking are two sides of the same issue, we will essentially show at the end of this paper that the credit limit contract is the limiting contract of the optimal contracts of our model as we let the negative effects of low effort become arbitrarily large.

We can naturally import the DeMarzo-Sannikov credit limit contract into our setting. In our model, the corresponding contract is one where the payment plan is sufficiently sensitive to the project’s performance that any advantage the agent may derive from not applying highest effort is offset by the expected loss of payment, which is tied to the expected loss in the project’s profitability. Thus the contract induces the agent to apply highest effort throughout the duration of the project. When the cumulative revenue stream has reached a sufficiently low point, the project is terminated. This contract, though incentive compatible, is not usually optimal in our model. In fact, certain cases have no optimal contracts, although whenever this occurs, we can always provide a family of arbitrarily close to optimal contracts. As previous authors and we have already suggested, when the negative effects of shirking are great, this contract may be the principal’s best option because the principal cannot afford to let the agent shirk. But when lower effort levels produce acceptable, if not necessarily desirable, results, a variety of other contractual forms may outperform the highest effort contract. Indeed, depending on the relative size of the exogenous constants, there are over a dozen distinct types of incentive compatible contracts, each of which is specifically designed to take advantage of the unique set of circumstances under which it is optimal (or close to optimal).
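The claim that a sufficiently performance-sensitive payment plan sustains highest effort can be illustrated with a one-line comparison. This is only a sketch under assumed notation that does not come from the paper: the drift is taken to equal effort a in [a_L, a_H] with a_L < a_H, c is a constant marginal cost of effort (consistent with the linear disutility described in Section 1.1), and \beta_t is the sensitivity of the agent's continuation value to revenue.

```latex
% Hypothetical notation: a = effort (= drift), c = marginal effort cost,
% \beta_t = sensitivity of the agent's continuation value to revenue.
\begin{align}
  % Per unit of time the agent gains \beta_t a in expected value and pays c a in effort cost
  a_t^{*} &\in \arg\max_{a \in [a_L, a_H]} \; (\beta_t - c)\, a, \\
  % Highest effort is a best response exactly when sensitivity covers the marginal cost
  a_H \in \arg\max_{a \in [a_L, a_H]} (\beta_t - c)\, a
  &\iff \beta_t \ge c .
\end{align}
```

Under these assumptions, relaxing the sensitivity below c makes lowest effort the agent's unique best response, which is the sense in which shirking is easy to induce.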

To understand the surprising optimal contractual diversity in our model, let us discuss some of the optimal and close-to-optimal contracts which arise besides the credit limit contract. The salient property of most of the optimal contracts is that they arise as composites of smaller subcontracts. Such reductionist contracts have appeared before, notably in Fudenberg, Holmstrom, and Milgrom (1990). One manifestation of this is when the subcontracts which form the building blocks of the complete contract are similar but not identical in structure. The two major examples of this are the promotion and demotion type contracts. In the promotion contract, upon agreeing to manage the project, the agent receives an initial signing bonus package. Afterwards, the agent enters into the first of a sequence of increasingly desirable subcontracts which collectively comprise the promotion contract. After the project has performed sufficiently well for a sustained period of time, the agent is promoted to the next contract. During the promotion phase there is a deterministic time during which the agent is not held accountable for the project’s performance. Essentially, the agent is allowed to take a break from management, during which time he applies lowest effort. Each of the subcontracts has a termination clause so that even after a promotion, poor performance can still lead to the end of the project. The final subcontract is a credit limit contract and provides explicit incentives through equity compensation while the prior subcontracts provide more implicit incentives through the promise of promotion. This contract takes into account “career concerns,” and is related to the findings of Gibbons and Murphy (1992). The demotion/suspension contract is made up of a collection of highest effort subcontracts, each of which resembles the DeMarzo-Sannikov credit limit contract. During each subcontract, the agent is compensated for good performance. However, except for the last subcontract, poor performance does not lead to termination, but rather to a demotion to a less desirable subcontract. Succeeding subcontracts require greater sustained periods of good performance before paying the agent. During each demotion, there is also a fixed period of time in which the agent is suspended. Termination occurs when the project continues to perform poorly after the agent has already been demoted to the worst of the subcontracts.

Another variant of the reductionism that arises in optimality is when the subcontracts are all identical. This is exemplified by the renegotiable highest effort contract. In the credit limit contract of DeMarzo and Sannikov, when the credit line is almost exhausted, the principal and the agent would both be better off renegotiating a new, larger credit limit. Thus with respect to the event of imminent termination, the principal and agent have ex-post aligned interests in postponing termination. This temporary alignment of incentives is not uncommon (cf. Grossman and Hart (1983)). However, if the principal is not able to fully commit to the termination threat, then the incentive structure breaks down and the agent will not put in high effort even when the project is under duress. Thus the credit limit contract is not usually renegotiation-proof. Indeed, the requirement of renegotiation-proofness often leads to a strictly worse payoff (cf. Fudenberg and Tirole (1990)). We also assume full commitment from the principal, so our optimal contracts need not be renegotiation-proof. However, in certain cases of our model we get renegotiation-proofness for free in our optimal contracts. The incentive structure of a certain high effort contract is stable under renegotiation. In fact it is a contract with a built-in renegotiation clause occurring during a short shirking phase, so that the same high effort contract can be repeated over and over, termination is never invoked, and there is never a need to commit to a true long-term contract (cf. Malcomson and Spinnewyn (1988) for similar results). The last example of a reductionist optimal contract is the tenure type contract that comprises two completely different types of subcontracts. This type of contract begins as a high effort contract. The principal selects a lower and upper bound on the project’s performance. Depending on which one is reached first, the project may either be terminated, or enter into a permanent tenure phase during which the agent applies low effort and manages the project forever.
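The two-threshold structure of the tenure type contract can be illustrated with a small simulation. This is a minimal sketch under strong simplifying assumptions that are not from the paper: performance is tracked as cumulative revenue starting at zero, the high-effort phase has constant drift and volatility, and the function name `simulate_tenure_contract` and all of its parameters are hypothetical.

```python
import random


def simulate_tenure_contract(drift_high, sigma, lower, upper,
                             dt=0.01, max_steps=100_000, seed=0):
    """Simulate the high-effort phase of a stylized tenure-type contract.

    Assumption (not from the paper): performance is cumulative revenue,
    modeled as a discretized Brownian motion with drift `drift_high` and
    volatility `sigma`, started at 0 with lower < 0 < upper.

    Returns (outcome, elapsed_time), where outcome is "terminated" if the
    lower bound is reached first, "tenure" if the upper bound is reached
    first, and "ongoing" if neither is reached within `max_steps`.
    """
    rng = random.Random(seed)
    performance = 0.0
    for step in range(1, max_steps + 1):
        # Euler step: deterministic drift plus a Gaussian shock scaled by sqrt(dt).
        shock = rng.gauss(0.0, 1.0)
        performance += drift_high * dt + sigma * (dt ** 0.5) * shock
        if performance <= lower:
            return "terminated", step * dt
        if performance >= upper:
            return "tenure", step * dt
    return "ongoing", max_steps * dt


if __name__ == "__main__":
    outcome, elapsed = simulate_tenure_contract(0.5, 1.0, lower=-1.0, upper=2.0, seed=42)
    print(outcome, round(elapsed, 2))
```

The sketch only captures which bound is reached first; it does not model payments, discounting, or the agent's incentive constraints.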

That so many of these contracts may be optimal is due to the importance of strategically induced shirking. Strategic shirking is a versatile tool and without it, none of this richness is possible. For example, recall that in the binary effort model of DeMarzo-Fishman and DeMarzo-Sannikov, if the principal always wants to induce high effort, then no matter what the particular exogenous circumstances may be, the credit limit contract is always the best contract. Shirking has had a precarious place in the literature on incentive provision. In some cases, such as the highest effort contract considered above, shirking simply doesn’t appear. But this is largely a function of shirking being potentially so bad for the project in those models that the principal would like to avoid such a scenario at all costs. Most general models do have optimal contracts which involve lower effort levels, but the precise function of these phases and the corresponding phase changes are obscured by the overall abstract and non-constructive structure of the contracts themselves. Oftentimes optimality results are essentially existence results where the optimal contract falls out of a Bellman equation and the heavy machinery of stochastic differential equation (SDE) theory. Even in discrete models, computing and explicitly characterizing optimal contracts can be quite intensive and difficult (Phelan and Townsend 1991). Thus the current state of the literature endorses shirking in optimality, but there is no real theory of shirking.

When the principal considers an effort level to induce, there are two questions that must be asked: what is the purpose of such an effort level? And how does one induce said effort level? With respect to high effort, the first question has an obvious answer. After all, if the principal could costlessly control the effort level, he would always have the agent apply high effort. It is the second question that is hard to answer. Indeed, much of the theory of incentive provision, and in particular the technical breakthroughs in the continuous-time setting, are intimately related to this question. The situation with respect to low effort is reversed. From an incentives standpoint it is easy to induce the agent to shirk: just don’t provide him with any incentives. But such a perspective on shirking does not help to explain its importance, and may even lead one to assume that shirking is simply a vestigial, inefficient component of any contract that employs it. To properly understand the role of shirking, one needs to have some explicit examples of how shirking is utilized. As we have already alluded to, the constructive optimal contracts in this model posit a rich set of ideas on shirking’s import. Their explicit nature allows for a detailed analysis of when shirking is induced, and its contextual function within the larger framework of a whole contract. Ultimately, this investigation provides us with a more nuanced understanding of optimality.

There are a number of technical innovations and refinements in this paper. One of the first

things we do is gather together the various general results concerning incentive compatible

contracts that have appeared throughout the literature and carefully explain the standard

method of writing such contracts in continuous time. More generally, part of the purpose of

the first half of the paper is an attempt to further formalize and make more accessible the

developing set of principles and techniques common to the growing continuous-time litera-

ture. Along with the appendix, the second half of the paper is largely concerned with the

construction of the optimal contracts. Here we build on the existing theory and introduce

new algorithmic techniques to combine and improve contracts. An important insight is that

the optimal contract question can be recast as a question of expanding the set of attainable

contract payoff points. Building on this perspective, we introduce the concept of intrinsic

payoff sets and then develop methods to determine where and how to optimally paste two

intrinsic payoff sets together. Specifically, we introduce a heuristic algorithm that pieces to-

gether intrinsic payoff sets in a way that will build up to a unifying object called the optimal

value function. The optimal value function is similar to the concept of the value function

that appears throughout macroeconomics, specifically growth theory (cf. Stokey, Lucas, and

Prescott 1989), and is central to understanding optimal contracting. Indeed the main re-

sult of the paper, Theorem 14, characterizes the general form of the optimal contracts by

describing the structure of the optimal value function b. For each value x above the agent’s

reservation value, b(x) is the supremum of the set of payoffs to the principal of incentive

compatible contracts which deliver value x to the agent. The family of optimal contracts is

then the set of contracts parameterized by x which deliver value b(x) to the principal while

delivering x to the agent. We then show that the continuation value process for the principal

and the agent of any member contract of the family of optimal contracts must move along

the graph of the optimal value function. This phenomenon, some form of which has been

observed as early as in the work of Abreu, Pearce, and Stacchetti (1990), allows us to study

optimal contracting in a compact and manageable way by focusing on the structure of the

optimal value function rather than dealing directly with contracts. This method ultimately

allows us to explicitly solve the principal-agent problem in our model. Theorem 14 states

that, in general, the optimal value function is composed of four distinct pieces, the first three

of which imply certain high/low effort levels, and the last piece implies payment. The way

these pieces are put together to form the optimal value function then informs how a contract

should incorporate the effort levels and payment in an optimal, incentive-compatible way.

The formal manipulations of intrinsic payoff sets represent a significant departure from

the standard Bellman approach toward finding the optimal value function. Using the Bellman

method, the optimal value function is revealed through the solution of the Hamilton-Jacobi-

Bellman (HJB) equation. In principle, this HJB-equation contains as much information

about the optimal value function as is provided by the algorithm. However, it is often quite

difficult to extract this information from the HJB-equation. In particular, there is a Markov

law that governs the key state variables of the family of optimal contracts. This Markov law

is closely linked to the first and second derivatives of the optimal value function. Unfortunately,

it is often extremely difficult to get a reasonably precise idea of the values of these

derivatives from the HJB-equation, which then makes it difficult to pin down the important

Markov law. Another problem of the HJB-equation is that it may only be a suitable tool

to find the optimal value function on certain intervals. For example, the HJB-equation in

Sannikov (2008) produces a solution that is the optimal value function in a certain domain,

but is only an upper bound for it elsewhere. This problem may be symptomatic of the

common situation where there does not exist a complete family of optimal contracts. In our

model, it is not uncommon for the family of optimal contracts to be empty. Far from being

pathological, this can happen in very reasonable situations, which produce an open set of incentive-compatible

contracts where there are no true optima. In such cases, it would still be valuable to get a

sense of what the close-to-optimal contracts look like. However, the HJB-equation provides

little information on them, because it characterizes only the optimal value function, which

taken by itself says little about close-to-optimal contracts. The algorithmic

approach developed in this paper does not suffer from these deficiencies. The intrinsic payoff

sets used in the algorithm can be thought of as small, but well understood suboptimal value

functions. They are pieced together in simple, constructive ways, so that after any step

of the algorithm, we have a clear idea of the object we have constructed so far, including

the values of its differentials. Furthermore, the algorithm builds up to the optimal value

function everywhere, not just on a specific interval. Lastly, the algorithm captures more information

than just what is provided by the optimal value function, so that when there is no

true family of optimal contracts, and the optimal value function and the HJB-equation are

insufficient indicators of what the close-to-optimal contracts are, the algorithm can still give

us a very clear idea of what contracts are close-to-optimal. In general, the geometric theory

of intrinsic payoff sets, when translated back to the language of contracts, provides insights

into the optimal timing of a phase change between low and high effort levels and is central to

understanding the various functions shirking plays in optimal contract design. The methods

we have articulated are sufficiently broad in scope that they can be easily imported into the

search for optimal and close-to-optimal contracts in many other potential models, such as

those with non-affine cost of effort, or those which combine hidden effort and cash diversion.

2 The Model

A principal owns a project and seeks an agent to manage it. The

project produces a stochastic revenue stream. Over time, we assume that the cumulative

revenue stream behaves as Brownian Motion with a drift process determined by the effort

exerted by the contracted agent.

Formally, there is the probability space Ω ≡ C[0,∞) of all continuous functions on the

interval [0,∞). Each point f in this space is a continuous function that represents a possible

cumulative revenue stream of the project over time. On this probability space is defined

a collection of random variables Zt where Zt(f) = f(t). For a fixed t, Zt(f) represents the

cumulative revenue stream of the project up to time t under the realization f . There is a

base probability law P on Ω that governs the likelihood of Zt taking on any specific values.

This law P is assumed to be Brownian law with drift µ and Z0 = 0 a.s. Under this setup,

Zt, t ≥ 0, which evaluates the actual cumulative revenue stream of the project, behaves as

a Brownian motion starting at 0. The µdt drift corresponds to the default expected marginal

revenue which can be interpreted as the intrinsic expected profitability of the project, or the

expected profitability when the agent applies no effort, or the expected profitability when

the agent applies some standard base level of effort.

The agent’s effort strategy is defined to be a progressively measurable 4 bounded process

at ∈ [η−A, η], where η and A > 0 are constants. The values of this interval are all the effort

levels that the agent may choose from. There are no restrictions as to what values η and A

may take (except for A > 0). Indeed, η−A < 0 is even allowed, and one can interpret this as

the agent actively harming the project for personal gain, or applying an effort lower than the

base level. For example, DeMarzo and Sannikov model cash diversion with what amounts

to an “effort” interval of (−∞, 0], where the negative values correspond to different rates

of stealing from the project by the agent and zero represents honesty. In our model, effort

affects the revenue via the probability measure on Ω. Specifically, given an effort process at

which the agent applies on the project, the probability measure which governs Zt changes

from P to P^a, defined to be the measure on Ω which makes Zt a Brownian motion with drift

(at+µ)dt. Thus, as the effort level is increased, the cumulative revenue stream Zt, t ≥ 0 is

more likely to take on higher values. We represent the instantaneous cost of exerting x units

of effort by the function h(x), so that given an effort process at, the total expected cost to

the agent of such a commitment is $E^{P^a}\left[\int h(a_t)\,dt\right]$. Both the agent and the principal observe the

revenue stream Zt, t ≥ 0 which belongs to the principal. However, only the agent knows

his effort level. Formally, both principal and agent know Ft at time t but the principal

does not know the probability measure P^a. Without the ability to directly observe at, the

principal’s problem is to figure out a way to design an Ft-contingent contract that provides

the agent with incentives to apply costly effort. The primary way to provide incentives is of

course to pay the agent. We represent an Ft contingent payment plan as a nondecreasing

progressively measurable process It that keeps track of the total amount paid to the agent

by time t. The second way to induce the agent to apply effort is through the threat of

termination. Formally, the termination clause is represented by an Ft stopping time τ .

How well does Ft capture information on the agent’s behavior? We know that the sample

mean is a sufficient statistic for the drift, so that given a cumulative revenue stream up to time

t the principal can make an educated estimate of the effort process of the agent. However,

the confidence interval can be quite wide and the estimate may be of little use. Moreover,

the probability measures in the family P^a are mutually absolutely continuous, and Ft is invariant with

4 Let (Ω, Ft) be a filtered space and let B be the Borel σ-algebra. A process X : [0,∞) × Ω → R is progressively measurable if for each t, X restricted to [0, t] × Ω is B[0, t] ⊗ Ft measurable. This technical requirement is a standard one for stochastic processes, and is necessary for integration. Throughout the paper we employ a fair amount of probability theory. Durrett (2004), Karatzas and Shreve (2008), and Rogers and Williams (2000) are sufficient references.

respect to the choice of effort. Thus any event which occurs under one effort process can

also occur under any other effort process. In light of these remarks, one may be tempted to

think that Ft-contingent contracts provide insufficient incentives and are therefore unable to

combat the moral hazard inherent in the principal-agent relationship. But this is incorrect

once one realizes that the principal does not need to know, in a strong statistical sense, the

agent’s effort level. The agent is utility maximizing, and although Ft is invariant under a,

the measure on Ft is not, so certain cumulative revenue streams are more likely under some

effort processes than under others. Together, these facts mean

that the principal may be able to predict what a rational, utility-maximizing agent will do

when faced with an Ft-contingent contract. It is this probabilistic variation that both It and

the termination threat exploit to provide incentives for the rational agent.
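
To see how wide the confidence interval mentioned above can be, consider the following minimal Python sketch. It is an illustration rather than part of the formal argument, and it assumes a constant effort level, unit volatility (as in the Brownian law above), and a simple Euler discretization. The function names and the numerical values are hypothetical.

```python
import math
import random


def simulate_cumulative_revenue(mu, effort, horizon, n_steps, rng):
    """Simulate Z_T when Z has constant drift (mu + effort) and unit volatility."""
    dt = horizon / n_steps
    z = 0.0
    for _ in range(n_steps):
        # Increment: drift * dt plus a Gaussian shock with variance dt.
        z += (mu + effort) * dt + rng.gauss(0.0, math.sqrt(dt))
    return z


def drift_standard_error(horizon):
    """Standard deviation of the estimator Z_T / T of a constant drift."""
    return 1.0 / math.sqrt(horizon)
```

For instance, with a horizon of T = 4 the standard error of the drift estimate Z_T / T is 0.5. If the effort range A is also 0.5, a single revenue history of that length cannot reliably distinguish high effort from low effort, which is why the principal must rely on incentives rather than statistical inference.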

Definition 1. A proposal is a pair (It, τ).

Suppose the agent accepts a proposal (It, τ), runs the project, and employs an effort strategy

at until termination. We assume that once the project is terminated, the agent may pursue

some other venture with payoff K, and similarly the principal receives payoff L. The agent

can also refuse a contract and immediately pursue his outside option with payoff K. Let us

also assume that the agent’s inter-temporal discount factor is γ and that of the principal is

r. Then the value of the project for the agent and the principal are respectively,

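$$E^{P^a}\left[\int_0^\tau e^{-\gamma s}\left(dI_s - h(a_s)\,ds\right) + e^{-\gamma\tau}K\right] \qquad \text{(Agent)}$$

and

$$E^{P^a}\left[\int_0^\tau e^{-rs}\left(dZ_s - dI_s\right) + e^{-r\tau}L\right] \qquad \text{(Principal)}$$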

Definition 2. A contract is a proposal together with an effort strategy that the principal recommends

to the agent. A contract (It, τ, at) is called incentive compatible if the recommended

strategy at maximizes the agent’s payoff given the proposal (It, τ).

Definition 3. An absolute optimal contract is an incentive compatible contract that

maximizes the principal’s payoff over the set of all incentive compatible contracts.

Definition 4. The family of optimal contracts is a set of incentive compatible contracts

parameterized by a value W0 ≥ K. For each W0, the elements of this family with parameter

W0 are those incentive compatible contracts which maximize the principal’s payoff subject to

the condition that the agent receives payoff W0.

The definition of the family of optimal contracts is a more general notion of optimality and

provides us with a better understanding of optimal contract design. Let C be an absolute

optimal contract, and let W^o be the payoff of C to the agent. Then C is simply an element of

the family of optimal contracts with parameter W^o. Knowing the family of optimal contracts

instead of just a single absolute optimal contract also provides the principal with greater

flexibility. Suppose there were an initial bargaining stage in which the agent had some power

to negotiate a high payoff W^h > W^o for himself. Then the principal could simply refer to

the family of optimal contracts to find the best contract which delivers W^h to the agent.

Moreover, any particular optimal contract, including the absolute one, shares a great deal

of structural similarity with any other, so that it is not much more work to find the entire

family of optimal contracts. We shall express this idea rigorously in Section 4.

Remark 1. From now on, an optimal contract will refer to an element of the family of

optimal contracts, and when we speak of optimality, it is with the notion of the family of

optimal contracts in mind.

The principal’s problem is to find this family of optimal contracts. If for some parameters,

there are no optimal contracts then the principal’s problem is to find a class of contracts

that can be made arbitrarily close to optimal.

We make the following assumptions: i) The cost of effort is affine: h(x) = C + cx where

c > 0, ii) The principal is at least as patient as the agent: r ≤ γ, iii) The project is

potentially better than the outside option for the principal (i.e. first-best is better than the

outside option for the principal): $L < \frac{\mu+\eta}{r}$, iv) Working for free at high effort is worse than

the outside option for the agent: $K > \frac{-h(\eta)}{\gamma}$.
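
As a quick sanity check on a candidate parameterization, the four assumptions can be encoded directly. The sketch below is illustrative only: the function and key names are hypothetical, and it additionally assumes that the discount rates r and γ are strictly positive so that the divisions are well defined.

```python
def affine_cost(x, C, c):
    """Cost of effort h(x) = C + c*x from assumption (i)."""
    return C + c * x


def check_assumptions(C, c, r, gamma, L, K, mu, eta):
    """Report which of assumptions (i)-(iv) hold for the given constants.

    Assumes r > 0 and gamma > 0 (not checked here).
    """
    return {
        "i_cost_slope_positive": c > 0,
        "ii_principal_at_least_as_patient": r <= gamma,
        "iii_project_beats_principal_outside_option": L < (mu + eta) / r,
        "iv_free_high_effort_worse_for_agent": K > -affine_cost(eta, C, c) / gamma,
    }
```

Calling `check_assumptions(C=0.0, c=1.0, r=0.05, gamma=0.1, L=5.0, K=0.0, mu=1.0, eta=1.0)` returns a dictionary in which every entry is `True`, since (µ + η)/r = 40 exceeds L = 5 and −h(η)/γ = −10 lies below K = 0.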

Remark 2. Though the previous four conditions are the only ones that we shall impose

on the exogenous constants of the model, there is another condition on K that we could

consider. It seems reasonable to impose the following stronger version of the last condition:

$K \ge \frac{-h(\eta-A)}{\gamma}$. That is, if the agent applies the minimal effort on the project forever and the

principal never pays him anything then the agent is still worse off than pursuing his outside

option. We do not impose this condition for a number of reasons. First, the extra generality is

not trivial; the optimal contract and optimal value function still exhibit interesting structure

when $\frac{-h(\eta)}{\gamma} \le K < \frac{-h(\eta-A)}{\gamma}$. Second, this aforementioned case gives a clearer picture of the

optimality structure theorems for this model and indeed informs the optimality theorems for

the $K \ge \frac{-h(\eta-A)}{\gamma}$ case; we will make this precise in the following sections and the appendix.

And lastly, there are legitimate economic interpretations of the $\frac{-h(\eta)}{\gamma} \le K < \frac{-h(\eta-A)}{\gamma}$ case.

We can rearrange and reinterpret some of the values of the model to embed a “minimum

wage” component. Specifically, we introduce a new stream mdt which we may assume that

the principal is compelled to provide to the agent for the duration of the contract. We

can embed this new stream in the mechanics of the model by assuming that the intrinsic

drift of the project is actually (µ + m)dt, and h(x) + m represents the cost of effort. Now

$K < \frac{-h(\eta-A)}{\gamma} = \frac{m-(h(\eta-A)+m)}{\gamma}$ does not mean that working for free is better than K, but rather that

the fixed wage minus the cost of low effort is better than the outside option. Such a scenario

can represent paid vacation, retirement, a form of tenure, or simply a lucrative and steady

paying job. Another interpretation is that low effort produces private benefit. In this case

low effort is not just putting in little effort, but rather a “negative” effort that amounts to the

agent actively harming the project for personal gain, in which case, remaining in the project

even without payments from the principal may still be more rewarding than the outside option

K. Stealing project funds is an example of this type of private benefit.
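
The bookkeeping behind the minimum wage embedding can be written out as follows. This is a sketch under our reading of the remark: the tildes, which do not appear in the text, mark the underlying drift $\tilde\mu = \mu + m$ and the underlying cost $\tilde h(x) = h(x) + m$, and the wage $m\,dt$ is assumed to be paid on top of $dI_s$.

```latex
\begin{align*}
\text{Agent's flow payoff:}\quad
  & dI_s + m\,ds - \tilde h(a_s)\,ds = dI_s - h(a_s)\,ds,\\
\text{Principal's flow payoff:}\quad
  & d\tilde Z_s - m\,ds - dI_s = dZ_s - dI_s,
  \qquad d\tilde Z_s = dZ_s + m\,ds,\\
\text{Lowest effort forever, wage only:}\quad
  & \int_0^\infty e^{-\gamma s}\bigl(m - \tilde h(\eta - A)\bigr)\,ds
  = \frac{m - (h(\eta - A) + m)}{\gamma}
  = \frac{-h(\eta - A)}{\gamma}.
\end{align*}
```

Under these assumptions both flow payoffs coincide with those of the original model, so the wage changes only the interpretation of the constants and not the mechanics.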

3 Provision of Incentives: The Standard Method

In this section we prove a fundamental lemma on incentive-compatibility and use the result

to explain a common way of writing incentive-compatible contracts in continuous time. A

key tool is a continuous-time analogue of the first order approach justified by Rogerson

(1985b) and Jewitt (1988) and explored in detail in continuous-time by Williams (2008).

3.1 Incentive-Compatibility

In the search for an optimal contract, the first task for the principal is to find a convenient

characterization of incentive compatibility. In order to recommend an effort strategy that he

is confident the agent will follow, the principal needs to determine, given a proposal, what

the optimal response from the agent will be. In this subsection, we develop the basic tools

that allow the principal to answer this question, culminating in the incentive-compatibility

criterion.

Let us fix a contract (Is, τ, as) throughout this subsection. Recall that P^a is the law of the

revenue stream Zs under the recommended effort as. The key state variable at time t is the

agent’s expected future value if he were to act in accordance with the contract after time t

given a revenue history up to time t:

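$$W_t(I_s,\tau,a_s) = E^{P^a}\left[\int_t^\tau e^{-\gamma(s-t)}\left(dI_s - h(a_s)\,ds\right) + e^{-\gamma(\tau-t)}K \,\middle|\, \mathcal{F}_t\right], \qquad t \le \tau \tag{1}$$

$$W_t = K, \qquad t \ge \tau$$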

Thus for a fixed time t, Wt is a random variable dependent on the history of the revenue

stream up to time t. We also define a companion variable to Wt, which we call Gt, the

principal’s expected future value:

$$G_t(I_s,\tau,a_s) = E^{P^a}\left[\int_t^\tau e^{-r(s-t)}\left(dZ_s - dI_s\right) + e^{-r(\tau-t)}L \,\middle|\, \mathcal{F}_t\right], \qquad t \le \tau$$

$$G_t = L, \qquad t \ge \tau$$

Definition 5. With respect to a contract (Is, τ, as), Wt is called the agent’s payoff process,

and Gt is called the principal’s payoff process. Together, the stochastic process pair (Wt, Gt)

is called the contract’s payoff process. W0 is the total payoff to the agent, and we refer to

this contract as one with parameter W0 or payoff point (W0, G0).

Now that Wt is well-defined (see Remark in Appendix B), we can apply the Martingale

Representation Theorem to get a differential equation that governs the motion of Wt.

Lemma 1. Let $Z^a_t = Z_t - \int_0^t (\mu + a_s)\,ds$ denote the standard Brownian motion under P^a.

Then there exists a progressively measurable process βt such that for all t < τ

$$dW_t = \gamma W_t\,dt - dI_t + h(a_t)\,dt + \beta_t\,dZ^a_t \tag{2}$$

We call βt the (Is, τ, as) contract’s β process. A little work then gives us the incentive-

compatibility criterion:

Lemma 2. (Is, τ, as) is incentive compatible if and only if

i) at = η ⇒ βt ≥ c ii) at = η − A⇒ βt ≤ c iii) at ∈ (η − A, η)⇒ βt = c

βt represents how sensitive the principal’s payment plan is to the performance of the project.

The criterion says that in order to induce high effort, the principal must set the sensitivity

parameter high. When βt = c, the benefits of applying lower effort and the benefits of extra

expected payment from higher effort balance perfectly, and the agent is indifferent between

all effort levels. For any lower βt, incentives are weak and can only induce lowest effort.
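
The criterion can be checked pointwise in time. The sketch below is a hedged illustration, not the paper's proof: it reads Lemma 2 as saying that, with affine cost, the agent in effect picks the effort in [η − A, η] that maximizes (β − c)·a. That reading is our gloss, chosen because it reproduces conditions (i)-(iii) exactly; the function names are hypothetical.

```python
def best_response_efforts(beta, c, eta, A):
    """Effort levels in [eta - A, eta] maximizing (beta - c) * a.

    Returns a (low, high) tuple describing the interval of maximizers.
    """
    if beta > c:
        return (eta, eta)          # strong incentives: only high effort
    if beta < c:
        return (eta - A, eta - A)  # weak incentives: only lowest effort
    return (eta - A, eta)          # beta == c: the agent is indifferent


def is_incentive_compatible_pair(a, beta, c, eta, A):
    """Check conditions (i)-(iii) of Lemma 2 at a single point in time."""
    if not (eta - A <= a <= eta):
        return False               # effort outside the feasible interval
    if a == eta:
        return beta >= c           # condition (i)
    if a == eta - A:
        return beta <= c           # condition (ii)
    return beta == c               # condition (iii): interior effort
```

For example, with c = 1, η = 1 and A = 0.5, the pair (a, β) = (1, 1.5) passes the check while (0.75, 1.2) does not, because an interior effort level is only sustainable when β equals c exactly.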

Definition 6. We call a pair of stochastic processes (xt, yt) an incentive-compatible pair if

they satisfy the relationship in the previous lemma where at is replaced with xt and βt is

replaced with yt.

3.2 How to Write an Incentive-Compatible Contract

The incentive-compatibility criterion allows us, in theory, to determine if a contract is

incentive compatible. However, given an arbitrary contract (It, τ, at), it is usually difficult

to determine the β-process needed to apply the criterion because the Martingale Representation

Theorem, which is needed to affirm the existence of β, is not a constructive result.

Fortunately, we do not need to know if any random contract is incentive-compatible. Rather,

we need to be able to write incentive-compatible contracts ourselves. And it is to this end,

that the incentive-compatibility criterion will serve a constructive purpose.

Instead of starting out with an Is, τ , and as, then computing the β processes, we begin

with an incentive compatible pair of stochastic processes (as, βs). At this point, the pair has

no contractual meaning as we have not defined a contract yet; it is simply a pair of abstract

processes. We call βs the sensitivity parameter. The as process determines the measure P^a

on Ω and, in particular, a standard Brownian motion $Z^a_t \equiv Z_t - \int_0^t (\mu + a_s)\,ds$. Next we pick an

increasing process Is which presently has no contractual meaning either; again, it is simply

an abstract increasing process. Our choice of Is is subject to a local continuity condition

that we will make precise later. We now define a shadow value Wt as follows:

$$dW_t = \gamma W_t\,dt - dI_t + h(a_t)\,dt + \beta_t\,dZ^a_t \tag{3}$$

Lastly, we define a stopping time τ: $\tau = \inf\{t \ge 0 : W_t < K\}$.

The local continuity condition on It is that the process must be continuous when Wt = K,

so that Wτ = K. We set Wt = K for t > τ and we have now produced a contract (Is, τ, as).

Lemma 3. If the shadow value Wt and the sensitivity parameter βt are bounded then Wt =

Wt(Is, τ, as) and βt = βt(Is, τ, as). Thus the contract (Is, τ, as) is incentive-compatible.

One should realize that this lemma does not make a trivial statement. To be clear, Wt is

a process that is defined by the processes as, βs, and Is through SDE (3). Wt(Is, τ, as) is

a process that is computed from as, τ, and Is via equation (1). The sensitivity parameter

βt is a process that we simply defined, whereas βt(Is, τ, as) is the β-process of the contract

(Is, τ, as) and is computed via the Martingale Representation Theorem. Thus the shadow

value and the sensitivity parameter utilized in the construction of an incentive compatible

contract are not a priori the same as the agent’s payoff process and the β-process of said

contract. Indeed, the purpose of the previous lemma is to prove that they coincide.

The Family of Optimal High-Effort Contracts

As an application, let us write the best contracts that always provide the incentive for

high effort. This family of contracts is equivalent to the credit-limit contract of DeMarzo-

Sannikov. The proof of its optimality may be found in their paper. Since this is a high effort

contract, as ≡ η. Generally, high effort contracts have βs ≥ c, but in this case βs ≡ c. Thus

the incentive compatible pair with which we begin the construction of the optimal high effort

contracts is (η, c). The next step is to introduce the shadow value. We are free to set W0

to be any value ≥ K, and the members of the family of optimal high effort contracts are

parameterized by this W0. The termination time τ is the first time Wt hits K. There is a

value W^S ≥ K, such that if Wt > W^S then I^+_t − It = Wt − W^S; if Wt < W^S then dIt = 0;

and when Wt = W^S then there is an infinitesimal increase of It akin to the increases of the

running maximum of Brownian motion. The implied motion of Wt is the following: when

W0 ∈ [K, W^S) the process Wt stays in this interval. It moves around in a Brownian-type

motion; if and when it hits W^S, It increases, and Wt gets reflected back into [K, W^S); when

it hits K, the contract is terminated. If W0 ∈ (W^S, ∞), then I^+_0 = W0 − W^S and Wt

is instantaneously pushed down into the [K, W^S) region, where Wt behaves as previously

described. We have now defined a contract (Is, τ, η).
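
The construction just described can be simulated with a simple Euler scheme. The following Python sketch is illustrative only and makes several assumptions that are not in the text: time is discretized, the payment threshold W^S is supplied by the user rather than derived, the path is simulated under P^a (so the increments of Z^a are standard Gaussian shocks), and a step that overshoots K is clamped to K. All names are hypothetical.

```python
import math
import random


def simulate_high_effort_contract(W0, K, W_S, gamma, C, c, eta,
                                  horizon, n_steps, seed=0):
    """Euler scheme for dW = (gamma*W + h(eta)) dt - dI + c dZ^a.

    Payments reflect W at W_S; the contract terminates when W reaches K.
    Returns (W_path, I_path, terminated).
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    h_eta = C + c * eta            # affine cost of high effort
    W, I = W0, 0.0
    if W > W_S:                    # initial lump-sum payment down to W_S
        I += W - W_S
        W = W_S
    W_path, I_path = [W], [I]
    for _ in range(n_steps):
        if W <= K:                 # already at the termination boundary
            return W_path, I_path, True
        dZ = rng.gauss(0.0, math.sqrt(dt))
        W += (gamma * W + h_eta) * dt + c * dZ
        if W > W_S:                # pay out the excess over the threshold
            I += W - W_S
            W = W_S
        if W <= K:                 # discretization may overshoot K; clamp
            W = K
        W_path.append(W)
        I_path.append(I)
    return W_path, I_path, W <= K
```

Starting from W0 above W_S, the first recorded payment equals W0 − W_S, after which the simulated shadow value stays in [K, W_S] and cumulative payments never decrease. Setting W_S equal to K reproduces the degenerate case of immediate termination with a single lump-sum payment.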

The intuition for this contract is that the principal sets a low and high threshold for

performance. When Wt, which is a measure of how well the agent has performed, reaches

the high threshold, the agent is paid, and when Wt reaches the low threshold, the project

is terminated. By applying some functionals on the processes of the contract, DeMarzo and

Sannikov were able to interpret this contract as a credit limit contract. Basically, the agent

begins with some debt represented by a function of W0. Over time, as Wt increases the debt

decreases and as Wt decreases the debt increases. When Wt hits W^S, the company is out of

debt and all the incoming revenue is paid out in dividends until the company is back in debt

again. When Wt hits K, the credit line for debt is exhausted and the project is liquidated.

When W^S = K, I^+_0 instantaneously pushes W0 down to K and the contract is immediately

terminated. This happens when the principal finds the project to be unprofitable (at

least when compelled to induce high effort) and so the best way to deliver value W0 to the

agent is to simply let him exercise his outside option and pay him the difference.

When r = γ, W^S = ∞. Specifically, the higher the payment threshold W^S is, the better

the contract is for the principal. Unfortunately, if the principal actually sets W^S = ∞, the

corresponding contract is not incentive compatible because the agent would never get paid!

In fact, there is no optimal high effort contract, and the best one can do is to simply set

W^S to be a large number, thereby obtaining an almost optimal high effort contract. From a

credit limit perspective, a close to optimal contract is where the agent is given a very large

credit line, but also begins with a large debt.

4 The Optimal Value Function

In this section we introduce the key concept of an intrinsic payoff set. An intrinsic payoff set

is an object which captures important information pertaining to the family of contracts to

which it is attached. It implies a uniformity of structure possessed by the member contracts,

and to a varying extent, the very structure of the contracts themselves. The family of optimal

contracts is shown to possess an intrinsic payoff set called the optimal value function. We

explain how the optimal value function implies the structure of all the optimal contracts

and then demonstrate how to construct the optimal value function by piecing together lesser

intrinsic payoff sets. We end the section with the main theorem of the paper, the structure

theorem for the optimal value function. This result then leads us to the classification of the

family of optimal contracts in the next section.

From now on, all contracts will be assumed to be incentive-compatible.

4.1 Intrinsic Payoff Sets

Definition 7. Fix a contract C = (Is, τ, as) with payoff process (Wt, Gt). Let σ be a stopping

time. Then we say G is a value set of (C, σ) if G is any measurable set of points in the plane

such that for each t, (Wt∧σ(ω), Gt∧σ(ω)) ∈ G for almost all ω ∈ Ω. If σ ≥ τ then we

simply say G is a value set of C.

Definition 8. Let V be the set of payoff points of a set of contracts H. Then V is called the

payoff set of H.

Definition 9. Let V be the payoff set of a set of contracts H. Let f be a function from

H to the set of stopping times. Suppose for each contract C ∈ H, V is a value set for

(C, f(C)). Then we say V is intrinsic to H up to the family of stopping times f(H) or H has an intrinsic payoff set up to f(H). If for each C, f(C) is greater than or equal to the

termination time of C, then we say V is intrinsic to H or H has an intrinsic payoff set. A

set of points P is called intrinsic, if there is a family of contracts and a corresponding family

of stopping times for which it is intrinsic.

Let V be the set of payoff points of a set of contracts H and let 0 denote the function that

maps all contracts to the 0 stopping time. It is always the case that V is intrinsic to H up

to 0(H). Or if V is the payoff set of the set I of all incentive-compatible contracts, then V

is intrinsic to I. But these cases are not very interesting because the stopping times in the

former case are too restrictive and the size of V in the latter case is too large. However, as

we let the stopping times increase, so that the payoff processes begin to move stochastically

in space, while keeping the size of the family H of contracts small so that V is not too large,

then for all of these payoff processes to continue to stay within the confines of V (i.e. for V

to remain intrinsic to H as we increase the stopping times), the set of contracts must possess

some sort of intrinsic uniformity of structure.

Specifically, suppose the payoff set V of some family of contracts H is a C2 curve and is

intrinsic. By definition, the payoff processes of the member contracts must remain on the

curve (until the family of stopping times). These payoff processes driven by the underlying

Brownian revenue stream usually follow Brownian paths in the plane. Only under very rigid

circumstances would such a motion respect the strict boundaries of the curve V and not stray

outside. Indeed, the intrinsic property of such a payoff set implies a Markov law governing all

the shadow values of all the member contracts of H up to the family of stopping times. Thus

modulo the starting value W0, all contracts of H share the same shadow value law up to the

family of stopping times. And since the law of the shadow value is the defining characteristic

of any contract (see the previous section), that means these contracts exhibit a great deal

of structural uniformity. When V is intrinsic to H without regard to stopping times, all

payoff processes of all contracts stay within V at all times, and the family of contracts have

the same shadow value law (again modulo the initial value) throughout the durations of the

contracts. Thus the portion of any contract of H that starts when the shadow value is W is

identical to the contract of H with initial shadow value W .

To see an example of this structural uniformity, let us consider the family of credit limit

contracts. Let C be the credit limit contract where the agent starts with debt D0. At any time

t, the agent’s debt is Dt, and the contract C after this point is identical to the credit limit

contract where the agent starts with debt Dt. Intrinsic payoff sets whose corresponding

families of contracts exhibit the uniformity and Bellman-type recursive structure seen in this

example abound. They are an effective way of packaging many contracts together into a

single manageable object and their importance lies in their applicability to the study of the

family of optimal contracts, whose payoff set is also intrinsic.

Definition 10. For W ≥ K, define CW to be the set of contracts with parameter W . Then

we may define the function

$$b(W) = \sup_{(I_s, \tau, a_s) \in C_W} G_0(I_s, \tau, a_s)$$

We call this function b the optimal value function.

Proposition 4. The optimal value function b is intrinsic to the family of optimal contracts.

This crucial connection allows us to shift our focus to the search for the optimal value

function. There are numerous reasons why it is much better to find the optimal value

function first rather than directly characterize the optimal contracts. The family of optimal

contracts is an infinite set of contracts, whereas the optimal value function is a single object.

As we shall see, once we find the optimal value function, which will be piecewise C2, the

structures of all members of the family of optimal contracts will be implied by the structure

of the single optimal value function via the Markov law governing all the shadow values

as argued previously. Furthermore, for some values W > K the optimal contracts with

parameter W may be empty. In fact there are cases when the entire family of optimal

contracts is empty. The optimal value function on the other hand is always defined because

it is a supremum. And for those situations when there may not exist an optimal contract,

the optimal value function helps us determine what types of contracts are close to optimal.

4.2 Some Properties of the Optimal Value Function

In this subsection we discuss some of the key properties of the optimal value function b.

In particular, we introduce the two fundamental ordinary differential equations which will


govern the shape of the optimal value function and by extension the family of optimal

contracts. We know, a posteriori, that the optimal value function is both piecewise C2 and

concave, and we will assume these two properties for now.

Lemma 5. For all W ≥ K, b′(W ) ≥ −1. Furthermore, if there is a value W ∗ such that

b′(W ∗) = −1, then b′(W ) = −1 for all W ≥ W ∗. We denote by W l the infimum of all such

W ∗ and set W l =∞ if none exist.

Definition 11. W l is called the payment threshold of the family of optimal contracts.

Lemma 6. On the open set where the optimal value function b is C2, it satisfies the following

three families of ordinary differential inequalities, each indexed by a value β

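$$
\begin{aligned}
ry &\ge \mu + \eta + (\gamma x + h(\eta))\,y' + \frac{\beta^2}{2}\,y'' && \text{for } \beta \ge c \\
ry &\ge \mu + \eta - A + (\gamma x + h(\eta - A))\,y' + \frac{\beta^2}{2}\,y'' && \text{for } \beta \le c \\
ry &\ge \mu + a + (\gamma x + h(a))\,y' + \frac{c^2}{2}\,y'' && \text{for } a \in (\eta - A, \eta)
\end{aligned}
$$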

Definition 12. In the lemma, the first family of inequalities is called the high effort

inequalities, the second family is called the low effort inequalities, and the third family is called

the indifference inequalities. For an incentive compatible pair (a, β), we define the

(a, β)-inequality to be the a-effort inequality with sensitivity β, which is an element of one of the

three families of inequalities. We call the (η, c)-inequality and the (η−A, 0)-inequality the (I)-

and (II)-inequality respectively. Replacing the inequalities with equalities, we get three

families of ODEs, comprising the (a, β)-ODEs, and we may define the (I)-ODE and (II)-ODE

similarly. Segments of solutions to the (I)- and (II)-ODEs are called (I)- and (II)-curves.
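As an illustration of what a (I)-curve looks like in practice, the (I)-ODE can be rearranged to y′′ = (2/c²)(ry − µ − η − (γx + h(η))y′) and integrated from a point-slope initial condition. The following is only a minimal numerical sketch, not a method used in the paper: the function name `trace_I_curve`, the treatment of h(η) as a single number `h_eta`, and the choice of a fixed-step RK4 scheme are all assumptions made for illustration.

```python
from typing import List, Tuple


def trace_I_curve(
    x0: float, y0: float, slope0: float, x_end: float, *,
    r: float, gamma: float, mu: float, eta: float,
    h_eta: float, c: float, steps: int = 2000,
) -> List[Tuple[float, float, float]]:
    """Integrate the (I)-ODE from (x0, y0) with initial slope slope0.

    The ODE  r*y = mu + eta + (gamma*x + h_eta)*y' + (c**2/2)*y''  is
    rewritten as a first-order system in (y, y') and stepped with RK4.
    h_eta stands in for h(eta); it is a hypothetical scalar parameter.
    Returns a list of (x, y, y') triples along the curve.
    """
    def rhs(x: float, y: float, yp: float) -> Tuple[float, float]:
        ypp = (2.0 / c**2) * (r * y - mu - eta - (gamma * x + h_eta) * yp)
        return yp, ypp

    dx = (x_end - x0) / steps
    x, y, yp = x0, y0, slope0
    path = [(x, y, yp)]
    for _ in range(steps):
        k1y, k1p = rhs(x, y, yp)
        k2y, k2p = rhs(x + dx / 2, y + dx / 2 * k1y, yp + dx / 2 * k1p)
        k3y, k3p = rhs(x + dx / 2, y + dx / 2 * k2y, yp + dx / 2 * k2p)
        k4y, k4p = rhs(x + dx, y + dx * k3y, yp + dx * k3p)
        y += dx / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        yp += dx / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        x += dx
        path.append((x, y, yp))
    return path
```

A quick sanity check on the sketch: when r = γ, any straight line of slope −1 satisfying ry + γx = µ + η − h(η) solves the (I)-ODE with y′′ = 0, so the integrator should stay on that line.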

The intuition for these inequalities is as follows: Since b is the optimal value function, its

instructions for when to induce which effort level at what sensitivity are the best. Thus for

any effort level a and an incentive-compatible sensitivity parameter β, one cannot expect to

improve by locally switching the incentive structure to one that induces some effort a with

sensitivity parameter β. The (a, β)-inequality is the mathematical condition that captures

this idea. Now suppose b satisfies an (a, β)-ODE at some value W . This means the principal

can do as well as b by employing incentives that induce effort a with a sensitivity parameter

of β when the agent’s payment process equals W . But since b is the optimal value function

and the principal can do no better than b, that means an optimal contract ought to employ


(a, β) when the agent’s value process is W and b is a solution to the (a, β)-ODE at W . It

is not unreasonable to conjecture that for almost every value W in the domain, if b′ ≠ −1

then b satisfies a particular (a, β)-ODE at W . If this is indeed so, then we have a way to

extract from the optimal value function the family of optimal contracts: for each W , let

the (a(W ), β(W ))-ODE be the one that b satisfies at W . Then a contract is optimal if the

induced effort and β factor are (a(W ), β(W )) whenever the agent’s value process Wt = W .

Thus it is imperative we find which ODEs b satisfies.

Lemma 7. The only (a, β)-ODEs that b can satisfy are the (I)- and (II)-ODEs.

Proof. Without loss of generality, let us show the only high effort ODE that b can satisfy is

the (I)-ODE. Suppose b is locally a solution to some (η, β) ODE where β > c. Then since

b′′ < 0,

$$rb = \mu + \eta + (\gamma W + h(\eta))\,b' + \frac{\beta^2}{2}\,b'' < \mu + \eta + (\gamma W + h(\eta))\,b' + \frac{c^2}{2}\,b''$$

which means b does not satisfy the (I)-inequality, contradiction.
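The low effort case, which the proof leaves out by symmetry, can be written the same way. This is only a sketch under the same local assumption b′′ < 0: if b were locally a solution to some (η − A, β)-ODE with β ≠ 0, then

```latex
\[
rb = \mu + \eta - A + (\gamma W + h(\eta - A))\,b' + \frac{\beta^2}{2}\,b''
   < \mu + \eta - A + (\gamma W + h(\eta - A))\,b' + \frac{0^2}{2}\,b'' ,
\]
```

so b would violate the (II)-inequality, which is the low effort inequality with sensitivity 0.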

Remark 3 (A First Guess at Optimality). Lemma 5 and Lemma 7 tell us that the optimal

value function b ought to be a composite of (I)- and (II)-curves together with a line with slope

-1 starting at some W l. Thus we expect that the optimal contracts only induce high effort

with sensitivity c and low effort with sensitivity 0.

Given any point-slope in R2, there is a unique global solution to the (I)-ODE with that

initial condition. Given any point not on the vertical line going through

$\kappa = \left(-\frac{h(\eta - A)}{\gamma},\ \frac{\mu + \eta - A}{r}\right)$,

there is a unique solution to the (II)-ODE going through that point. In fact, the family of

(II)-curves can be described explicitly: The solutions to ry = µ + η − A + (γx + h(η − A))y′ are

$$y = \frac{\mu + \eta - A}{r} + D\left|x + \frac{h(\eta - A)}{\gamma}\right|^{r/\gamma}, \qquad D \in \mathbb{R} \tag{4}$$

All solutions begin at the point κ and emanate outwards. When r = γ,

these solutions are simply the non-vertical rays starting at κ. See Appendix A for a picture.
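The closed form (4) can be checked numerically against the (II)-ODE. The sketch below is illustrative only: the names `II_curve` and `II_residual` are hypothetical, h(η − A) is treated as a single number `h_low`, and the derivative is approximated by a central finite difference rather than computed analytically.

```python
def II_curve(x: float, D: float, *, r: float, gamma: float,
             mu: float, eta: float, A: float, h_low: float) -> float:
    """Evaluate the (II)-curve of Equation (4) with constant D at x.

    h_low stands in for h(eta - A); it is a hypothetical scalar parameter.
    """
    kappa_x = -h_low / gamma
    kappa_y = (mu + eta - A) / r
    return kappa_y + D * abs(x - kappa_x) ** (r / gamma)


def II_residual(x: float, D: float, *, r: float, gamma: float,
                mu: float, eta: float, A: float, h_low: float,
                eps: float = 1e-6) -> float:
    """Return r*y - (mu + eta - A) - (gamma*x + h_low)*y' along the curve.

    y' is a central finite difference, so the residual is only close to
    zero, and x must stay at least eps away from kappa_x.
    """
    kw = dict(r=r, gamma=gamma, mu=mu, eta=eta, A=A, h_low=h_low)
    y = II_curve(x, D, **kw)
    dy = (II_curve(x + eps, D, **kw) - II_curve(x - eps, D, **kw)) / (2 * eps)
    return r * y - (mu + eta - A) - (gamma * x + h_low) * dy
```

With r = γ the exponent r/γ equals 1, so `II_curve` reduces to a pair of rays out of κ, matching the remark in the text.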

Notation.

(I)p,q will refer to a (I)-curve beginning at p and ending at q. We may define (II)p,q similarly

if such a curve exists. We abbreviate (II)κ,p to (II)p.


4.3 Building Up To The Optimal Value Function

Definition 13. A point is attainable if it is the payoff point of some contract.

Definition 14. Given a set of points S ⊂ R2 bounded above, let I be the domain of S. We

then define a function g over this domain where for each x ∈ I, g(x) = sup{y | (x, y) ∈ S}. Then we define the upper shell of S to be the graph of g.
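For a finite set of points the supremum in Definition 14 is a maximum, and the upper shell can be computed directly. The sketch below covers only that finite case; the function name `upper_shell` and the representation of S as a list of (x, y) pairs are assumptions for illustration.

```python
from typing import Dict, Iterable, Tuple


def upper_shell(points: Iterable[Tuple[float, float]]) -> Dict[float, float]:
    """Return the upper shell of a finite point set as a map x -> g(x).

    For each x in the domain of the set, g(x) is the largest y such that
    (x, y) belongs to the set. Keys are returned in increasing x order.
    """
    shell: Dict[float, float] = {}
    for x, y in points:
        if x not in shell or y > shell[x]:
            shell[x] = y
    return dict(sorted(shell.items()))
```

For example, `upper_shell([(0, 1), (0, 3), (1, 2), (1, -1)])` keeps the higher of the two points above each x.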

The optimal value function is the upper shell of the set of attainable points. So in this

subsection we introduce the methods by which we can build up to the optimal value function

by constructing more and more attainable points from previously constructed attainable

points. The general idea is given a set of attainable points P corresponding to the payoff

points of some set of contracts CP , we will define a new set of points S(P) dependent on P ,

which we will argue is also attainable. In fact, we will show that S(P) is intrinsic. We begin

by introducing a family of contractual alterations/combinations A indexed by the points in

S(P). For every point q ∈ S(P), there will be an Aq ∈ A such that upon altering/combining

the contracts of CP via Aq, the resultant contract Aq(CP) will have parameter q. Then,

it will be the case that for each contract Aq(CP), there will be a stopping time σq such

that the payoff process stopped at σq will be a point in P . Define the set of contracts

A(P) = {Aq(CP) | q ∈ S(P)} and the map f(Aq(CP)) = σq. Then S(P) will be intrinsic to

A up to f(A).

Definition 15. Given a point p in the plane, let px and py denote the x- and y-coordinates of p.

In the following series of definitions and lemmas we will introduce the intrinsic payoff sets

that will be used in the construction of the optimal value function. For each such set we

also describe the corresponding family of contracts. In the appendix we will prove that the

last payoff set is indeed the intrinsic payoff set of the last family of contracts up to a certain

family of stopping times. This proof can be modified to verify the other cases as well.

Definition 16. Let p be a point, then define Lp to be the straight line of slope -1 emanating

downwards from p.

Lemma 8. Suppose p is an attainable point, then Lp is intrinsic.

Proof. Let m ∈ Lp, and let C be a contract that has payoff p. Then a contract that has payoff

m is one where the principal pays the agent mx − px initially, then immediately employs C. Call this family H. Then define S to be the family of stopping times determined by when

the contracts of H reach p. Then Lp is the intrinsic payoff set of (H,S).


Definition 17. (The DeMarzo-Sannikov Sp-Curve)

Fix a point p where $p_x \ge -\frac{h(\eta)}{\gamma}$ and $p_y \le \frac{\mu}{r}$. Let l denote the line of slope $-\frac{\gamma}{r}$ emanating

downwards from the point $\nu = \left(-\frac{h(\eta)}{\gamma},\ \frac{\mu + \eta}{r}\right)$. If $r p_y + \gamma p_x \ge \mu + \eta - h(\eta)$, then Sp is defined

to be Lp. When $r p_y + \gamma p_x < \mu + \eta - h(\eta)$, we divide the definition of Sp into these two cases:

r < γ: There is a unique (I)-curve going from p to the line l, such that at the point q

where the (I)-curve intersects l, the derivative is −1 and the concavity is 0. We define Sp to

be the union of (I)p,q and Lq.

r = γ: Sp is defined to be the unique (I)-curve starting from p which stays below l and

remains concave on the entire interval [px,∞).
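The threshold condition in Definition 17 is a statement about which side of l the point p lies on. Writing l in point-slope form through ν and clearing denominators gives

```latex
\[
y - \frac{\mu + \eta}{r} = -\frac{\gamma}{r}\left(x + \frac{h(\eta)}{\gamma}\right)
\quad\Longleftrightarrow\quad
r y + \gamma x = \mu + \eta - h(\eta),
\]
```

so the condition $r p_y + \gamma p_x \ge \mu + \eta - h(\eta)$ says that p lies on or above l, and the strict reverse inequality says that p lies strictly below l. The same equation also explains the tangency condition in the r < γ case: setting y′ = −1 and y′′ = 0 in the (I)-ODE gives ry = µ + η − (γx + h(η)), which is exactly the equation of l.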

Lemma 9. For an attainable point p, Sp is intrinsic except when r = γ and Sp ≠ Lp, in

which case Sp is still the upper shell of a set of attainable points.

Proof. Let m ∈ Sp and C be a contract with payoff p. We already know Sp is intrinsic if

it equals Lp. Otherwise, first suppose r < γ and let us describe the contract that produces

payoff m. Import the member of the optimal high effort contract with parameter W0 = mx.

Replace the K-termination threshold with px, and reset the payment threshold from W S to

qx. Now when the shadow value Wt reaches px, instead of termination, have the contract

switch to C. The resultant contract has payoff m. Sp is the intrinsic payoff set of this family

of contracts up to the family of stopping times when the contracts switch to C.

Suppose now r = γ. Recall, in this case there is no family of optimal high effort contracts.

Instead, pick a payment threshold W S > mx. Then import the corresponding close-to-

optimal high effort contract with parameter mx from the r = γ case and make the same

alterations as we did in the r < γ case. This contract has a payoff point m(W S) lying

directly beneath m. As we let W S →∞, m(W S) rises up to m.

Definition 18. Let m ∈ Sp. When r < γ we call the contract corresponding to m described

above the optimal high effort contract with inside option p and parameter mx, reflecting its

similarity to the optimal high effort contract with outside option (K,L) and parameter mx.

When r = γ, for sufficiently high W S, we call the corresponding contract a close-to-optimal

high effort contract with inside option p and parameter mx.

Lemma 10. Let p and q be two attainable points. The set of points u ∈(II)p with ux ≥ K is

intrinsic. If there is a (II)-curve going from p to q then (II)p,q is intrinsic. (I)p,q is intrinsic.


Proof. Let C be a contract with payoff point p. Let u satisfy the hypothesis of the lemma.

Begin the contract with incentive compatible pair (η − A, 0), and set It = 0 and W0 = ux.

This determines a motion for the shadow value: Wt moves deterministically toward px.

When Wt = px, simply enact C. This contract has payoff u.

Introduce another contract E which has payoff q, and let m ∈ (II)p,q. Without loss

of generality, assume the segment (II)p,q ⊂ (II)q. Since both p and q are attainable, mx is

automatically ≥ K. Then the contract that has payoff m is similar to the one described

above, except that when Wt inevitably hits qx (instead of px), enact the contract E .

These two families of contracts basically attach a deterministic waiting period before an

already existing contract. They are integral to the suspension and promotion type contracts

we will construct later.
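The length of the deterministic waiting period can be made concrete. The sketch below assumes that, with sensitivity 0 and no payments, the shadow value follows dWt = (γWt + h(η − A))dt, so that Wt − κx = (W0 − κx)e^{γt}. It also assumes that during the wait the principal collects the low effort flow µ + η − A discounted at rate r, which is what κy = (µ + η − A)/r suggests. The function names and the scalar `h_low` standing in for h(η − A) are hypothetical.

```python
import math


def waiting_time(u_x: float, p_x: float, *, gamma: float, h_low: float) -> float:
    """Time for W to drift from u_x to p_x under dW = (gamma*W + h_low) dt.

    Requires u_x and p_x on the same side of kappa_x, with p_x at least as
    far from kappa_x as u_x, since W moves away from kappa_x.
    """
    kappa_x = -h_low / gamma
    if u_x == kappa_x:
        raise ValueError("W never leaves kappa_x, so p_x is not reached")
    ratio = (p_x - kappa_x) / (u_x - kappa_x)
    if ratio < 1.0:
        raise ValueError("p_x is not reachable from u_x")
    return math.log(ratio) / gamma


def principal_payoff_after_wait(p_y: float, T: float, *, r: float,
                                mu: float, eta: float, A: float) -> float:
    """Principal's payoff from waiting T at low effort, then receiving p_y.

    Assumes a constant flow mu + eta - A discounted at rate r during the wait.
    """
    kappa_y = (mu + eta - A) / r
    discount = math.exp(-r * T)
    return kappa_y * (1.0 - discount) + discount * p_y
```

Under these assumptions the computed payoff at u agrees with the height of the (II)-curve through p at ux, because both the distance to κx and the gap to κy scale by matching exponential factors along the wait.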

Lastly, suppose m ∈ (I)p,q. First assume r < γ. Import the optimal high effort contract

with inside option p and parameter mx. Then replace the payment threshold with qx, and

when Wt reaches qx, instead of payment, enact E . This contract has payoff m.

If r = γ, then import any close-to-optimal high effort contract F with inside option p and

parameter mx subject to the condition that W S > qx. Then change the payment threshold

to an E threshold just like in the r < γ case. The resultant contract is invariant under the

choice of F , and the payoff is m.

All the relevant families of stopping times are clear.

Lemma 11. Let a, p1, p2, b be four points satisfying: i) ax < p1x < p2x < bx ≤ κx and

b = κ if bx = κx, ii) a, p1, p2 lie on the same concave (I)-curve, iii) p1, p2, b lie on the same

(II)-curve, iv) a and b are attainable. Then (I)a,p2 ∪ (II)p1,b is intrinsic.

Or, if a, p1, p2 are three points satisfying: i) κx ≤ ax < p1x < p2x and a = κ if ax = κx,

ii) a, p1, p2 lie on the same (II)-curve, iii) p1, p2 lie on the same S-curve, iv) a is attainable.

Then (II)a,p2 ∪ Sp1 is intrinsic.

Proof. Both cases are similar and we prove the first case. By assumption, there are incentive

compatible contracts which produce payoff points a and b. Now pick an m ∈ (I)a,p2 ∪ (II)p1,b

and we will describe the contract that has payoff point m.

If m ∈ {a, b} then we are done. For any m ∈ (II)p1,b, initially the contract is identical to

the one we associated to (II)p,q: it begins with a deterministic waiting period until Wt = p1x

at which point it enacts the p1 contract. The p1 ∈ (I)a,p2 contract is identical to the ones

we associated to (I)p,q where p is now a and q is now p2. If Wt reaches ax, then the a-contract

is enacted. If Wt reaches p2x then the p2 ∈ (II)p1,b contract is enacted which we just

characterized. The contract with payoff point m ∈ (I)a,p2 is similar.

Definition 19. In the above list of curves which we proved are intrinsic, we call the Sp-type

curves Type 1, the (II)-type curves Type 2, the (I)-type curves Type 3, and the last group of

curves mixing all types Type 4. See Appendix A for pictures.

Given a set of attainable points, we can select one of the above methods to find more attain-

able points. Moreover, all of the attainable curves of this subsection have been constructed

with (I)- and (II)-curves along with lines of slope -1, so they all satisfy the basic differential

properties that we argued b must satisfy. These attainable curves can be combined in such

a way that builds up to the optimal value function. We present a set of guidelines on how

to effectively do this, utilizing the tools at our disposal to find increasingly higher attainable

points as opposed to just aimlessly increasing the size of our attainable set. The product

is an algorithm that generates larger and larger attainable sets. Except for the single case

when the optimal value function corresponds to a family of optimal low effort contracts with

fixed salaries, if the optimal value function is attainable, the heuristic algorithm produces

it after a finite number of iterations. If the optimal value function is not attainable (or is

the previous exception), the heuristic algorithm runs indefinitely and the optimal value

function is the upper shell of the infinite union of the attainable sets generated after each

iteration. The algorithm utilizes Type 1 - Type 3 curves. The Type 4 curves are of a wholly

different nature, and they represent a second way of building up to the optimal value function,

which we will explain later.

A Heuristic Algorithm for Obtaining b.

We begin with the attainable set C = {(K,L)}. If $K < -\frac{h(\eta - A)}{\gamma}$ then we also include the

point κ in C. Let C be the attainable set you currently possess. In no particular order,

choose one of the following ways to increase the size of C:

1. Find a point u ∈ C and add Su.

2a. Find a point u ∈ C and add the portion of (II)u to the right of the vertical line x = K.

2b. Find two points u, v ∈ C lying on the same (II)-curve and add (II)u,v.

3. Find two points u, v ∈ C and add (I)u,v.

subject to two conditions:

i). All new points added to C have to lie on the upper shell of the new attainable set.


ii). As a rule of thumb, if an expansion can be made, then there is at least one

expansion type that admits one specific “maximal” expansion with the following property:

once it is made, then upon repeating the algorithm either only one of the other three

expansion types can be made or none at all. You must select one of these maximal expansions.

If you cannot make any of these expansions, then you are done; otherwise, repeat the algorithm.

Using the algorithm, we do a case by case analysis of the model in the appendix. In each

case, the algorithm delivers the optimal value function as promised, and moreover, the rule of

thumb mentioned above is applicable. Each iteration of the algorithm expands the attainable

set and increases the upper shell. Each such expansion of the attainable set also corresponds

to an incentive-compatible contractual amendment of the family of contracts that corresponded

to the previous attainable set, so that the new attainable set also corresponds to

a family of contracts. If the algorithm stops after finitely many steps, then the family of

contracts corresponding to the upper shell of the final C is the family of optimal contracts. If

the algorithm runs indefinitely, then for all sufficiently high n, the contracts corresponding

to the upper shell of C after the n-th iteration form a family of close-to-optimal contracts.

4.4 The Structure Theorem for The Optimal Value Function

Theorem 12. The optimal value function b is continuous and piecewise C2 and its domain

is [K,∞). Note that b(K) ≥ L, since it is always feasible and incentive-compatible for the

principal and agent to exercise their outside options. There is a value W l such that b is

linear with slope -1 on [W l,∞) and is strictly concave with slope > -1 on [K,W l). The value

W l may be infinite, in which case b lacks the linear branch. The graph of b is a composite

of (I)-curves, (II)-curves, and a line of slope -1. The exact composition varies depending on

the exogenous constants of the model. There are at most two distinct phase change points

on the graph of b, both before the point (W l, b(W l)). Each such point demarcates a shape

change of b between a (I)-curve and a (II)-curve and implies a shift in effort level. Although

the point (W l, b(W l)) also demarcates a shape change for b (between a (I)- or (II)-curve and

the line with slope -1), we do not consider it to be a phase change point as it does not imply

a change in effort. When the optimal value function has exactly two distinct phase change

points it is not attainable, and thus there are only close-to-optimal contracts. There are


also other sporadic cases where the optimal value function is not attainable. If the optimal

value function has two distinct phase change points neither of which is equal to (K, b(K))

or (W l, b(W l)) then the precise composition of b is as follows: from (K, b(K)) to the first

phase change point, b is a (I)-curve; between the two phase change points, b is a (II)-curve;

from the second phase change point to (W l, b(W l)), b is a (I)-curve again; and for W ≥ W l,

b′(W ) = −1. For all other types of optimal value functions, the phase change points exhibit

some type of degeneracy: the two points may coincide, one or both points may coincide

with (K, b(K)) or (W l, b(W l)), one or both points may disappear. These situations then

cause the optimal value function to lack one or both (I)-curve branches, or to lack the (II)-

curve branch, or to have some of the branches become compressed into a single point. When

the optimal value function is attainable, it determines the precise structure of the optimal

contracts. In particular, it determines the stochastic differential equation (SDE) that governs

the motion of Wt. When Wt is in the (I)-curve domains, the principal induces high effort

with sensitivity factor c. When Wt is in the (II)-curve domain, the principal induces low

effort with sensitivity factor 0. And when Wt ≥ W l, the principal pays the agent a lump sum

of Wt −W l and the payoff process immediately jumps back to (W l, b(W l)). The only time

when Wt > W l is when t = 0, though W0 does not have to be strictly greater than W l.⁵

This is the fundamental theorem of the paper. We present the complete case-by-case

description of b in the appendix and describe all the forms the optimal contracts may take

in the next section.
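The reflection at W l and the lump-sum payment at time 0 can be illustrated with a small simulation. This is a rough sketch and not the paper's construction: it assumes that in the high effort region the shadow value follows dWt = (γWt + h(η))dt − dIt + c dZt (inferred from the drift and volatility terms of the (I)-ODE), uses a plain Euler step with a hypothetical scalar `h_eta` in place of h(η), ignores any (II)-curve region, and stops the path when Wt falls to K.

```python
import math
import random
from typing import List, Tuple


def simulate_reflected_path(
    w0: float, *, K: float, W_l: float, gamma: float, h_eta: float,
    c: float, dt: float = 1e-3, n_steps: int = 5000, seed: int = 0,
) -> Tuple[List[float], List[float]]:
    """Euler simulation of W reflected at W_l and absorbed at K.

    Returns the path of W and the cumulative payments to the agent.
    Any excess of W over W_l is paid out immediately, which keeps
    W <= W_l and makes cumulative payments nondecreasing.
    """
    rng = random.Random(seed)
    w, paid = w0, 0.0
    if w > W_l:  # lump sum at t = 0 brings W down to the threshold
        paid += w - W_l
        w = W_l
    ws, payments = [w], [paid]
    for _ in range(n_steps):
        w += (gamma * w + h_eta) * dt + c * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if w > W_l:  # payment clause: reflect at the payment threshold
            paid += w - W_l
            w = W_l
        if w <= K:  # termination: the path is absorbed at K
            ws.append(K)
            payments.append(paid)
            break
        ws.append(w)
        payments.append(paid)
    return ws, payments
```

In such a path the cumulative payment only increases at steps where W is pushed back to W l, which is the discrete analogue of the running-maximum behaviour described in the footnote to the theorem.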

We now explain why an optimal value function b with two distinct phase change points

not equal to (K, b(K)) or (W l, b(W l)) is unattainable. Recall that (II)-curves emanate from

κ, and are either to the left or right of κ. Let D be the domain of the (II)-curve portion of

b. Then D is an interval either to the left or to the right of −h(η−A)γ

. The location implies

an orientation of the shadow value Wt of an optimal contract. In particular, if D is to the

left, then whenever Wt ∈ D it moves deterministically⁶ to the left until it exits D through

its left endpoint. Similarly, if D is to the right, then Wt moves deterministically rightward

5 Since if W0 > W l it immediately jumps down to W l and since the motion of Wt is continuous in [K, W l], it is always the case that $\lim_{t \downarrow 0} W_t \le W^l$. Once Wt is in the interval [K, W l], it never leaves since the moment it enters the interval [W l, ∞), the payment clause of the contract kicks in and pushes Wt down into [K, W l]. Formally, Wt is reflected by W l and Wt ≤ W l for all t > 0. This however does not mean the agent never gets paid after time 0, but rather the cumulative payment stream after time 0 is like the running maximum of Brownian motion, infinitesimally increasing every time Wt hits W l.

6 The sensitivity factor in this domain is 0, so the law of Wt described by Equation (3) is deterministic.


in D until it exits the right endpoint. Whichever this endpoint may be, call it x. By the

structure theorem, we know (x, b(x)) is a phase change point and borders the neighboring

(I)-curve. Thus x borders the neighboring (I)-curve domain where Wt is required to move in

a Brownian way with sensitivity factor c. Thus as Wt leaves D it immediately transitions to

Brownian motion which then causes it to immediately get pushed back into D. Intuitively,

this type of motion is impossible. Indeed the sensitivity factor βt of Wt is discontinuous

at x, with one side being c and the other side being 0 and singular. The SDE has no local

solution, and this is the reason for the unattainability of this type of optimal value function. Thus,

if an optimal value function were attainable, its structure would have to exhibit some

degeneracy. Here are the most common examples of such degeneracies:

• The two phase change points both equal κ so that D is κ, and it makes no sense to

assign an orientation to D.

• The neighboring (I)-curve may disappear, in which case Wt does not have to deal with

an impending Brownian requirement when it leaves D.

• The two phase change points may disappear, so that there is no D to begin with, and

the optimal value function is a single S-curve.

The nonattainability phenomenon is not a serious hindrance, since we can always find close-

to-optimal contracts in such cases. Indeed, nonattainability enriches the notion of optimality,

and provides us with numerous interesting contracts that may otherwise be overlooked if a

true optimal contract existed in that particular case of the model. We now present the

optimal and close-to-optimal contracts.

5 The Contracts

To be able to precisely describe an optimal contract we need to know the precise shape of

the optimal value function b. In Appendix A we give a classification of all forms that can

be taken by b. We depict when b takes on each form depending on the approximate relative

locations of κ and (K,L). The optimal value function is in bold and the long-dashed lines

are the (I)- and (II)-curves which have sections that contribute to b.

5.1 Contracts with Few Phase Changes

The High-Effort Contract with Outside Option

The contracts corresponding to Figures 8a, 8b, 9a, and 9b are the usual optimal high effort


contracts of DeMarzo-Sannikov. These cases correspond to when after making a Step 1

expansion one cannot make any more improvements via the heuristic algorithm. They are

optimal when the harm done by shirking is too great.

The Null Contracts

The contracts corresponding to Figures 8a, 9a, 8c, and 8d are called the null contracts. Each

such optimal value function attains its maximum at K, so that the principal would always

prefer to pursue his outside option (K,L) immediately, bypassing all other optimal contracts.

Thus all points on the optimal value function except for (K,L) will never be hit.

The High-Effort Contract with Inside Option

Recall κ is the payoff point of the contract where the agent applies low effort and the principal

pays the agent nothing via dIt forever. It is possible that the agent’s payoff under such a

contract, κx, is greater than his outside option K. Reasons given for this were the possible

existence of some form of private benefit, or a fixed wage received by the agent (see pp. 14-15)

that is not captured by dIt. Now under this scenario, if the principal’s payoff, κy, is sufficiently

higher than L, then the optimal contract is the optimal high effort contract with inside option

κ. These contracts are structurally identical to the optimal high effort contracts with

outside option, except the outside option of (K,L) is replaced by the incentive-compatible

and preferred inside option of κ. Such contracts correspond to the right halves of Figures

8h, 8i, and 8j; that is, the graphs (κ, Sκ). We have left out a description of contracts implied

by the left halves: the (I)(K,L),κ segments. When r < γ, then only Figures 8h, and 8i are

relevant, and the left half is unimportant because this entire segment has positive slope.

Any optimal contract with a parameter W belonging in the domain of this segment can be

strictly dominated, from both the principal’s and agent’s perspective, by an optimal contract

with a higher parameter W ∗ lying in the domain of (κ, Sκ). Thus both the principal and the

agent have aligned interests in moving away from this region and picking an optimal contract

from the right half. Furthermore, the value process (Wt, Gt) of any contract starting in the

right half of the optimal value function never strays into the left half and vice versa. This

is because κ is an absorbing point - whenever the shadow value Wt = κx, it stays there

because the differential equation at this point is dWt = (γWt + h(η − A))dt = 0. Thus we

don’t even need to know the behavior of left half “optimal” contracts to describe the right

half optimal contracts. This is in direct contrast to the optimal high effort contracts, whose


payoff processes travel all around the (I)-curve portion of S(K,L), including the parts where

the slope is positive. When r = γ, sometimes Figure 8j applies, in which case the (I)(K,L),κ

portion of the above optimal value function contains the maximum, meaning the ((I)ι,κ, κ)

contracts are important. We describe these optimal contracts in the next part. It’s not that

one never sees such payoff sets when r < γ; rather, such payoff sets are never optimal when

r < γ. Indeed, whenever these payoff sets appear in the r < γ case, they can always be

improved by a Step 2b amendment, and eventually be subsumed by the payoff set of a family

of close-to-optimal promotion-type contracts, which we describe later.

A Contract with Tenure

These contracts correspond to the left half of the optimal value function of Figure 8j. There

is an inside option κ with κx lying at the right end of the domain of the contract’s shadow

value process. Contracts of this type are high effort contracts but with a “high” inside op-

tion κ. These contracts are optimal when both κx and κy are relatively high, suggesting a

situation where there is an embedded minimum wage that is quite high and even low effort

produces a desirable outcome for the principal. Such a κ can represent tenure, being made

partner, or any number of situations where after “putting in the time and effort”, an agent

may attain job stability and a high payoff. The contract begins as a high effort contract,

with incentives provided by the promise of tenure if the agent’s performance has reached

a satisfactory level. Performance is measured by Wt much like in the high effort contract

with outside option. When Wt = K, the contract is terminated due to poor performance

and when Wt = κx, the contract enters the tenure-type phase, κ. This is the only optimal

contract with the property that the termination time is both finite and infinite with positive

probabilities, corresponding to failure and success in the agent’s efforts to achieve tenure.

Another way to interpret these contracts is as retirement contracts. From this perspec-

tive, the principal would like to induce the agent to apply some amount of high effort, and

he uses the prospect of a lucrative retirement phase κ to achieve this. When the shadow

value Wt reaches κx, the agent is retired. This is a manifestation of the principle that often

times when the promised value is sufficiently high, the best thing to do is simply to retire

the agent (cf. Sannikov 2008).

A Low Effort Contract with Fixed Income

Let c ≥ 1 and K > κx. This last inequality means that, for the agent, working at low effort


with no dIt compensation from the principal is worse than exercising his outside option. We

assume that (K,L) lies below the line going through κ with slope −γr. This means that κy

is quite high - the productivity of the agent even under low effort is quite good. Thus the

principal would be pleased with a low effort contract, but the agent needs to be compensated

for him to be willing to accept. The low effort contract with fixed income addresses this

issue in the simplest way by providing the agent with a steady payment stream so that he

is willing to work at low effort indefinitely. All of the previous optimal contracts presented

have optimal value functions that require only a finite number of applications of the heuristic

algorithm. Despite its simplicity, this contract has an optimal value function which requires

an infinite number of steps.

First apply Step 1 so that now C is the S-curve corresponding to the usual high effort

contract. Next, we choose to make a Step 2a amendment. The (II)-curve selected for this

amendment will be the unique one that is tangent to C. The new C is the solid graph in

Figure 1a.

Fix a point u on the current C. If u is on the old part of C, then the contract with payoff

u is simply the usual high effort contract, and the value process moves stochastically on the

old part of C until it hits (K,L) and the contract ends. If u is on the new part of C then

the contract begins by having the agent work without dIt compensation for a deterministic

amount of time. During this phase, the value process moves deterministically down the

(II)-curve until it reaches the old part of C at point p and the contract enters the high effort

phase, becoming the usual high effort contract with payoff point p.

We then repeat the algorithm. Notice that we cannot make a Step 2 expansion anymore.

Thus, the amendment we had chosen was consistent with the rules of selection. Instead, we

choose to make a Step 1 amendment again. In accordance with the rule, we choose to attach

the S-curve at the very top of the just added (II)-curve, as in Figure 1b. We already know

what the contracts look like for all points on the previous version of C. For a parameter point

on the new S-curve, the contract initially behaves as a usual high effort contract, with the

value process moving stochastically on the new S-curve. Then when Wt hits K, instead of

termination, the principal and agent resort to the “inside option” - that being the contract

which moves along the rest of C. In other words, once Wt = K, the agent works at low

effort bringing the value process down the (II)-curve until it hits the original S-curve and

the contract takes the usual high effort form and eventually ends when (K,L) is reached.


(a) (b) (c)

Figure 1: Toward a Low Effort Contract with Salary - Heuristic Algorithm

We can continue layering (II)-curves and S-curves in the manner performed previously,

and the contracts that correspond to each additional layer are clear. Eventually all the

additional S-curves will simply be straight lines of slope -1, so that once such an S-curve

is reached in a contract, the agent receives a lump sum payment, instantaneously pushing

the value process up the S-curve until it reaches the (II)-curve immediately below and a low

effort phase begins again (see Figure 1b). We can iterate this layering process indefinitely

and as the number of iterations increases, so does the proportion of time the contract spends

in the low effort phases. The contract begins to look like one where the agent is always

working at low effort, with periodic payments from the principal. With the exception of the

initial payment, these payments are small and frequent to begin with, and become larger

and less frequent as time progresses. For sufficiently large numbers of iterations, most of the

payments will be small and frequent, with the larger, less frequent payments delayed until the

distant end of the contract. In the limit, the atomistic payments become a constant stream,

and the optimal contract is one where the principal makes an initial payment to the agent,

then pays him a stream K−h(η−A)γ

inducing the agent to work at low effort permanently. The

optimal value function is the limiting upper shell of the layered C we just produced, as in

Figure 1c, and it coincides with the optimal value function in Figure 9c.

It is not uncommon for the algorithm to run indefinitely, and we must take the limit of

the successive upper shells to get b. In the present example, taking the supremum of the ever

expanding payoff sets was paired with a nice physical interpretation of “taking the limit” of

a sequence of contracts. In general, it does not make sense to take the limit of a sequence of

contracts because:


• The limiting contract may not be incentive compatible - this problem appeared in the

r = γ case, where we argued that as the payment threshold of close-to-optimal high

effort contracts goes to infinity, the limiting contract is not incentive compatible because

the agent would never receive dIt compensation.

• The limiting behavior may not be well-defined - for example, the limiting behavior

of certain sequences of contracts may require the shadow value to immediately switch

between deterministic and Brownian motion at a single point, which we argued is

impossible in the discussion following the structure theorem.

We explore these issues and the implications for optimality in the next two subsections.

5.2 Contracts with Many Phase Changes: Promotion and Suspension

Since the situation that led to the low-effort contract with fixed income was quite special, we

cannot in general let the algorithm run indefinitely if our goal is to extract a good contract

from it. But since we know that the algorithm, if properly applied, will build up an upper

shell that gets closer and closer to the optimal value function, a natural way for us to pick out

a family of close-to-optimal contracts is to simply stop the heuristic algorithm after a large

number of steps and take as our family of close-to-optimal contracts those whose payoff

points lie on the current upper shell. On our way toward getting the low effort contract

with fixed income, we explained how, after every application of the heuristic algorithm, the

structure of the upper shell contracts could be deciphered through examining the series of

amendments made up to that point. Let us recap the main points:

• Each amendment is a (I)-curve, a (II)-curve, or an S-curve and corresponds to a high

effort or low effort phase.

• Each amendment is anchored to the old intrinsic payoff set at some point, and this

point is a phase change point.

• After an application of the heuristic algorithm, the current best contracts are

the ones whose payoff points lie on the current upper shell.

• The payoff processes of these top contracts start on the upper shell and progressively

move down the chain of amendments in the reverse order in which the amendments

were made, undergoing a phase change each time they move from an amendment to one

that was made prior to it.
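The recap above can be read as a simple data structure: an ordered chain of amendments, traversed from the newest to the oldest. The sketch below is only an illustration under simplifying assumptions. The class `Amendment`, the function `phase_sequence`, and the mapping of S-curves and (I)-curves to high effort and (II)-curves to low effort are our shorthand, and the sequence returned is the order in which phases can occur, since a contract may end before every earlier amendment is visited.

```python
from dataclasses import dataclass

# Effort induced on each curve type (a simplification of the recap above):
# S-curves and (I)-curves are high effort phases, (II)-curves are low effort phases.
EFFORT = {"S": "high", "I": "high", "II": "low"}


@dataclass(frozen=True)
class Amendment:
    step: str   # label of the algorithm step, e.g. "Step 1" or "Step 2a"
    curve: str  # curve type: "S", "I" or "II"


def phase_sequence(amendments, start=None):
    """Phases a contract can pass through when it starts on amendment `start`.

    `amendments` is listed in the order the amendments were made; the payoff
    process moves through them in reverse order. By default the contract
    starts on the newest amendment, i.e. on the current upper shell.
    """
    if start is None:
        start = len(amendments) - 1
    return [(a.step, EFFORT[a.curve]) for a in reversed(amendments[: start + 1])]


# Step 1, then Step 2a, then Step 1 again, as in the construction above.
chain = [Amendment("Step 1", "S"), Amendment("Step 2a", "II"), Amendment("Step 1", "S")]
```

For this chain, `phase_sequence(chain)` lists a high effort phase, then a low effort phase, then the original high effort phase, matching the order in which the payoff process descends from the upper shell.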


Figure 2: Close to Optimal Contract with Suspensions and the Optimal Value Function

Contracts with Suspensions

The optimal value functions of Figures 8e, 9e, 9f and 9d are not attainable but can be approximated

by the payoff points of families of close-to-optimal suspension-type contracts.

Let us focus on case 9d. The setting is the strictly non-private benefit case with c < 1. This

is similar to the strictly non-private benefit case with c ≥ 1 which led to the optimal low

effort contracts with fixed income. As in that situation, the agent’s payoff under a low effort

contract with no dIt compensation, κx, is not incentive compatible, and so the principal will

have to pay the agent somehow. The difference is that the principal was happy with a simple

low effort contract in that previous case because the productivity of low effort, κy, was high,

by virtue of c ≥ 1. In this situation, c < 1 and so κy is relatively low. Thus the principal will

want to induce high effort at least some of the time. The question is how much high effort

does the principal want to induce and when does he want to induce it. We use the heuristic

algorithm to help us answer these questions and guide us to the close-to-optimal contracts.

Since the principal knows he will want some high effort from the agent, a natural first

guess at optimality is to simply use the high effort contract with outside option (K,L) described

in section 5.2. This corresponds to applying Step 1 in the heuristic algorithm. Now

the principal wonders if he paid too much of a price by inducing high effort all the time from

the agent, and so he looks to the heuristic algorithm for any suggestions on improvements.

Sure enough, the heuristic algorithm tells the principal he can improve the current value

function by making a Step 2a amendment followed by another Step 1 amendment in exactly

the same manner as was initially done in the construction of the low effort contract with

fixed income. The upper shell is now the second Step 1 amendment and it lies above the

initial Step 1 curve. The new and improved contracts are the ones whose payoff points lie

on this second Step 1 curve. They begin as high effort contracts much like the original.


However, when the shadow value Wt first hits K, instead of exercising the outside option,

the principal punishes the agent by suspending him for a period of time,

and then employs the high effort contract with outside option. This results in the payoff

process first moving along the second Step 1 curve (instead of the first Step 1 curve), then

moving down the (II)-curve of Step 2a after Wt first hits K and the suspension phase is

enacted, and then finally dropping down to the original Step 1 curve when the suspension

is over and the high effort contract with outside option is finally utilized. This contract,

while clearly better than the first attempt, is still not optimal. Indeed, the principal can

continually layer (II)-curves and S-curves via the Step 2a / Step 1 amendments just like

before (see Figure 2) to get better and better contracts. Let us stop this layering process

after some finite number of iterations and describe the contracts whose payoff points are

on the upper shell. Such a family of contracts comprises a set of high effort contracts with

periods of suspension for poor performance. Fix a point on the upper shell. The contract

with this parameter begins as a standard high effort contract. However, when the project

has performed sufficiently poorly (when Wt first hits K), instead of terminating the contract,

the principal punishes the agent via a suspension of deterministic length. During this

time, the value process is sliding down the (II)-curve, and βt = 0, meaning the performance

of the project during the suspension is irrelevant to the remainder of the contract, and the

agent applies low effort, which we could interpret in this particular case as no effort. Once

the suspension is complete, the value process of the contract is deposited onto the next

S-curve and another high effort contract begins. Over time, the contract moves down the

S-curve ladder rung by rung with each suspension longer than the previous. Furthermore,

the value that the agent needs to get Wt to reach before getting paid increases with each

successive suspension, meaning on average, the agent needs to apply high effort for longer

periods of time to get paid as the number of times he gets suspended increases. Finally, once

the predetermined number of suspensions has been exhausted, if the project continues

to show poor performance, the principal terminates the contract.
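The length of a suspension can be made explicit under one assumption that is not stated in this section: that the agent's shadow value obeys the usual promise-keeping drift at his discount rate γ, and that κx, the value of working at low effort forever with no dIt compensation, corresponds to a flow of γκx. With βt = 0 and no payments, Wt then moves deterministically. Writing W^(k) for the value at which the k-th suspension deposits the agent onto the next S-curve (notation introduced here only for illustration), a hedged sketch of the calculation is:

```latex
\frac{dW_t}{dt} = \gamma\,(W_t - \kappa_x), \quad W_0 = K
\;\Longrightarrow\;
W_t = \kappa_x + (K - \kappa_x)\,e^{\gamma t},
\qquad
T_k = \frac{1}{\gamma}\,\ln\frac{W^{(k)} - \kappa_x}{K - \kappa_x}.
```

Since K > κx in the strictly non-private benefit case, Wt rises during the suspension, and under this sketch a longer suspension T_k corresponds to a re-entry value W^(k) further above K.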

As the number of Step 2a / Step 1 layers increases, the successive upper shells converge

to the optimal value function of Figure 9d as can be seen in Figure 2. Thus the suspension

type contracts are close-to-optimal provided the number of suspension periods is reasonably

high. However, there is no true optimal contract in this case. Unlike the previous case, there

is no limiting contract in this situation. There is a limiting behavior of sorts implied by the


sequence of improving contracts, which we call infinitesimal shirking. What happens is that

as the number of suspensions increases, the newer suspension / low effort periods become

shorter and shorter so that in the limit, the suspension periods are infinitesimal in duration.

This is impossible and is the reason why we cannot have an optimal contract. The truncation

of the iterations of the heuristic algorithm we performed in this subsection is a way to get

around this problem and provide some close-to-optimal contracts. Another way to resolve

the infinitesimal shirking problem is to substitute with carefully chosen periods of short-term

low effort phases. We essentially expand a shirking phase of infinitesimal duration into one

with a very small duration. The techniques by which to accomplish this and the resultant

new class of close-to-optimal contracts are discussed in the next subsection on contracts with

short term shirking.

Contracts with Promotions

We now consider contracts in the private benefit setting when the optimal value function is

Figure 8f or Figure 8g. Private benefit means that the payoff to the agent of working at low

effort without dIt compensation, κx, is better than the outside option K, so that shirking

can be used by the principal as a form of reward. The sequence of expansions that will lead

to these optimal value functions is first apply Step 1, then repeatedly apply Steps 2b, 3, 1

in that order. Once again, there is no realizable limiting contract, so we consider those close-to-optimal

contracts whose payoff points comprise the upper shell of C after a high number

of iterations of the aforementioned sequence. See Figure 3a, which shows only Steps 2b and

3 in a c = 1 case that will eventually build up to a Figure 8g optimal value function.

As in the suspension case, the payoff of low effort to the principal, κy, is relatively

low, so that he will want to induce high effort sometimes. Thus the first try is again the

usual high effort contract corresponding to an initial Step 1 amendment. While this may

produce a decent payoff for the principal, this contract will not be optimal because the agent

is getting too rich of a contract. Recall that the high effort contract is a credit limit contract

where the agent holds a fraction of the project’s equity stock. When the project does well

this stock then pays out dividends to the agent. Such stock driven compensation packages

are quite lucrative and are usually only given to managers with some reputation. Thus the

principal may be better off first signing the agent to a less lucrative contract with the option

of promoting him if the project does well. This intuition can be captured by the application


(a) Heuristic Algorithm (b) The Limit (c) Signing Bonus when c < 1

Figure 3: Contracts with Promotions

of the heuristic algorithm. After the Step 1 amendment, one will be able to make a Step

2b amendment followed by a Step 3 amendment as seen in Figures 3a and 3b. Consider a

contract whose payoff point lies on this Step 3 curve. Initially, the payoff process moves along

the Step 3 (I)-curve and high effort is induced. However, the compensation package does not

include equity stock, and is in fact purely based on the embedded fixed wage. The incentive

for high effort is created with the promise of a fixed period of shirking akin to a vacation

once a sustained period of high productivity has been achieved, a promotion thereafter, and

the threat of termination. Specifically, if Wt reaches the sufficiently high threshold where

the payoff process hits the (II)-curve of Step 2b, the initial high effort credit limit contract

with fixed wage is ended, and the principal rewards the agent with a deterministic period

of low effort, with the payoff process moving along the (II)-curve until it finally reaches the

initial Step 1 S-curve at which point the agent is promoted to the usual high effort credit

limit contract with equity based compensation.

We can perform the Step 2b, Step 3 layering indefinitely, creating contracts with more and

more promotion stages. Such contracts are similar to the single promotion contract described

before. They begin as high effort credit limit contracts with the embedded fixed wage. The

credit limit is small so that the principal can terminate the contract quickly if things don’t

work out. If the agent does well, then he receives the fixed period of low effort and then

is promoted to the next high effort credit limit contract with embedded fixed wage. While

there is still no equity based compensation, the new contract is a promotion - the credit

limit has been increased providing greater flexibility and the agent is now one promotion


closer to getting the final credit limit contract with equity compensation. With each successive

promotion, both the duration of the low effort phase and the size of the credit limit

are increased. The principal also expects more from the agent with each promotion - after

each promotion the performance threshold needed to reach the next promotion is increased.

Eventually, the last (I)-curve, an S-curve, is reached and the principal decides to let the

agent hold a fraction of the project’s equity, and the contract becomes the lucrative high

effort contract with equity based compensation.

The Step 2b / Step 3 layering above only builds the intrinsic payoff set up to the optimal

value function for small to moderate domain values. Though this domain contains the most

important portion of the optimal value function - the portion which includes the absolute

highest point of the optimal value function, it would still be interesting to get those close-to-optimal

contracts that have a high W0 parameter / payoff for the agent. So after stopping

the Step 2b / Step 3 iteration after some finite number of iterations, make another Step 2b

amendment followed by a Step 1 amendment. This pair of amendments raises the upper

shell at high values of the W0 parameter, and the pair corresponds to a signing bonus.

Specifically, a contract with a high parameter point lying on this portion of the upper shell

has an initial deterministic signing bonus in the form of a lump sum / low effort package,

before the promotion type contract just described is enacted. See Figure 3c.

5.3 Contracts with Short Term Shirking: Renegotiation and Super-Efficiency

We now introduce a new perspective on close-to-optimal contracting, addressing the infinitesimal

shirking problem posed at the end of the discussion on suspension contracts. When the

optimal value function is not attainable, then the principal has some freedom in selecting

a family of close-to-optimal contracts as a substitute. We demonstrated how the heuristic

algorithm, the basic tool used in the search for the optimal value function, can also be used

to find natural families of close-to-optimal promotion and demotion type contracts. We now

present another distinct class of close-to-optimal contracts - those with Type 4 payoff sets

(see section 4). Recall the fundamental reason for nonattainability was a discontinuity of the

volatility of the SDE determined by the optimal value function at a particular phase change

point, which we call p (see footnote 7).

[Footnote 7: Remember, not all discontinuities are bad, just the one at px, which is discussed in detail in Section 4.4. Indeed, there is a discontinuity every time there is a phase change, since high effort has volatility β = c and low effort has volatility β = 0.]

The SDE of the optimal value function dictates for what values


of W the principal should induce high and low efforts and if we were able to follow this

set of instructions faithfully, then we would get the family of optimal contracts. However,

the problem at px is that when switching from low effort to high effort there, the shadow

value W will immediately move back into the low effort domain infinitely many times in

an arbitrarily small time interval after the initial switch to high effort due to the nature

of Brownian motion, leading to the problematic infinitesimal shirking phenomenon. It is a

technical problem, and we can get around it by having the phase change performed on a

small interval instead of at the single value px. Contractually speaking, we replace infinitesimal

shirking with short term shirking. Specifically, suppose we are in a low effort phase as

W is moving up to px. Instead of switching to high effort when W = px, we pick a small

δ > 0, and have the switch occur at px + δ. Then once we are in the high effort domain,

we only switch back to low effort if W drops back down to px. In other words, whereas

before the switch between high and low effort all occurred at px, we now have the switch

from low to high be at px + δ and the switch from high to low at px. This motion is feasible

because there is now no pathological, infinitely frequent switching back and forth between

effort levels occurring in an arbitrarily small time frame. Thus a contract written in this

manner has a shadow value W that obeys the Markovian law of the optimal value function’s

SDE everywhere except around px where we tweak the motion in the manner just described.

The payoff sets of such contracts are Type 4 curves and they are close-to-optimal:

Lemma 13. For every ε > 0 there exists a δ > 0 such that if the phase change interval is

less than δ, then for all time t < τ , we have |b(Wt)−Gt| < ε. In particular, the payoff point

of the contract (W0, b(W0)) is close to the optimal value function.

This lemma states more than just that the contracts are close-to-optimal. It makes the stronger

point that the payoff process stays uniformly close to the optimal value function throughout

the duration of the contract (see Figure 4a). The close-to-optimal contracts derived from

the heuristic algorithm do not satisfy this property since the payoff process of such contracts

moves down to older and less close-to-optimal intrinsic payoff sets over time. The contracts

with short term shirking possess this uniformity property because the shadow value obeys the

SDE law dictated by the optimal value function at all values except for a small neighborhood

around the phase change point p. Thus these contracts are almost true Markovian contracts

except for around p, and it is this Markovian property that keeps the payoff processes

uniformly close-to-optimal.
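The switching rule described above (low to high at px + δ, high to low at px) can be simulated to see why it avoids infinitely frequent switching. The following is a minimal sketch under assumed dynamics that are not taken from this paper: during low effort, W moves deterministically with dW = γ(W − κx) dt, and during high effort dW = γW dt + vol dZ, with payments omitted. The function name, `vol`, and all numerical values are hypothetical, and the sketch requires px > κx so that W rises during the low effort phase.

```python
import math
import random


def simulate_short_term_shirking(p_x, delta, kappa_x, gamma, vol,
                                 dt=0.001, horizon=10.0, seed=0):
    """Return the list of (time, new_mode) effort switches under the delta-rule.

    Assumed dynamics (illustrative only, payments omitted):
      low effort : dW = gamma * (W - kappa_x) dt   (deterministic, beta = 0)
      high effort: dW = gamma * W dt + vol dZ
    Rule from the text: switch low -> high at p_x + delta, high -> low at p_x.
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    w, mode = p_x, "low"  # start at the beginning of a low effort phase
    switches = []
    for step in range(1, int(horizon / dt) + 1):
        if mode == "low":
            w += gamma * (w - kappa_x) * dt
            if w >= p_x + delta:
                mode = "high"
                switches.append((step * dt, mode))
        else:
            w += gamma * w * dt + vol * sqrt_dt * rng.gauss(0.0, 1.0)
            if w <= p_x:
                # The continuous path hits p_x exactly, so clamp the Euler overshoot.
                w, mode = p_x, "low"
                switches.append((step * dt, mode))
    return switches
```

Because every low effort phase in this sketch lasts the same strictly positive time, ln((px + δ − κx)/(px − κx))/γ, only finitely many switches fit into any finite horizon. As δ shrinks this duration goes to zero, which is the infinitesimal shirking limit the text rules out.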


(a) Uniformly Close to Optimal (b) Short-Term Punishment (c) Short-Term Reward

Figure 4: Type 4 Contracts

Consider the optimal value function which produced the suspension-type contracts under

the heuristic algorithm. It possesses the problematic phase change point, and so we may

apply the short-term shirking approach of this section to the optimal value function and

derive a class of close-to-optimal contracts. In this case, the problem point p is at the left

end of the optimal value function. After performing the procedure around p, the resultant

family of close-to-optimal contracts has an intrinsic payoff set that looks like Figure 4b. It

is disconnected from the usual high effort value function, and indeed the family of contracts

corresponding to the new payoff set cannot be “built up to” from the original high effort

contracts. The trajectory of the payoff process on the new payoff set moves stochastically

on the S-curve in the standard way, until it hits the left corner at which point it moves

deterministically to the right along the (II)-curve until it reaches the (I)-curve again and

the motion is repeated. The corresponding contract is one that begins as the high effort

contract (when the payoff process is on the S-curve) except termination is replaced by a

short punishment time (when the payoff process is on the (II)-curve), at the end of which,

the same high effort contract resumes. Thus the future of the contract after every punishment

phase looks exactly the same, and the horizon is infinite. However, after every punishment

phase, the shadow value is deposited very near the punishment threshold, so that the agent

is on a “short leash” and now even a small amount of poor performance may trigger

punishment again.

As with the suspension type contracts, the shirking phase here is used as a form of

punishment. However, there are significant differences in their function within the contract.

In the suspension type contracts, the shirking phase signaled an impending demotion to a


lesser high effort contract. Here the shirking phase serves as a way of preserving the incentive

structure of the high effort contract while allowing for renegotiation. Let us explain: In the

standard high-effort contract, the slope of the corresponding value function near (K,L) is

usually positive. This means that when the credit is almost exhausted both parties would be

better off renegotiating a larger credit limit. The problem is, if the agent sees this coming and

expects the principal will want to renegotiate, then the incentives to put in high effort are

destroyed and the principal will actually be worse off, hence the importance of termination.

But the potential loss in revenue due to termination may be great and so it would behoove

the principal to find a way to write a contract that provides incentives for high effort but also

allows for renegotiation. This is where the low effort punishment phases come in. Instead of

using the harsh termination clause, when Wt hits the low point K the principal implements

a low effort punishment period with no dIt compensation before renegotiating. This low

effort phase is sufficiently distasteful to the agent (recall we are in the strictly non-private

benefit case) that he will want to apply high effort throughout the high effort portion of

the contract even though he realizes that if the project performs poorly, the principal will

renegotiate the terms of the contract to allow the agent to continue managing the project.

The short term shirking approach can also be used to get a new type of close-to-optimal

contract for the optimal value functions that produced the promotion type contracts. Recall

the setting is where low effort provides private benefit to the agent. In these cases there

is a reason why the principal may want to replace dIt compensation with brief low effort

phases that reward the agent in the form of private benefit. dIt is an efficient payment

process. There is no loss in total utility (principal plus agent) when the principal makes an

instantaneous transfer via dIt. However, in certain cases the principal can do even better

than efficient by replacing the dIt cash transfer with private benefit. Whenever the low-

effort phase occurs, the effect on the principal’s and agent’s utility is similar to that of cash

transfer. The principal loses some utility due to lack of high-effort, and the agent gains

utility due to private benefit. However, the slope of the (II)-curve when the low-effort phase

kicks in may be greater than −1, so that for every unit of utility lost by the principal, the agent gains

strictly more than one unit. In this situation the low effort phase is akin to a super-efficient

cash transfer. In the gallery of optimal value functions, we have labeled the problem point p

in the relevant optimal value functions (Figures 8f and 8g). After performing the procedure

around p described in this subsection, the intrinsic payoff set of the family of close-to-optimal


contracts with parameter W0 ≤ px is shown in Figure 4c where the phase change point p

has now been relaxed into the small bump at the right end. The corresponding contract is

structurally similar to the usual high effort contract, except instead of a payment threshold,

there is now a different threshold which if reached rewards the agent with a short period of

shirking. The effect on Wt when it hits this threshold is that it gets pushed back, much like

the bouncing motion of Wt when it hits the W l threshold in the optimal high effort contract.

The slope at the threshold is greater than -1 so that we indeed have a super-efficient utility

transfer during the reward phase. For those contracts with a parameter W0 > px, we simply

add a signing bonus to the beginning of the contracts like the ones used in the promotion-type

contracts, and then use the aforementioned altered high effort contract.
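The super-efficiency argument of this subsection amounts to a one-line calculation. Writing m for the slope of the payoff curve at the point where the transfer takes place (a symbol introduced here only for illustration), a small move along the curve changes the agent's value by Δx > 0 and the principal's value by Δy = mΔx, so that the change in total utility is:

```latex
\Delta x + \Delta y = (1 + m)\,\Delta x
\quad
\begin{cases}
= 0, & m = -1 \quad \text{(cash transfer via } dI_t\text{)},\\[2pt]
> 0, & m > -1 \quad \text{(low effort reward phase)}.
\end{cases}
```

In the first case the transfer is efficient, with no loss in total utility. In the second case the agent gains more than the principal loses, which is the sense in which the reward phase is super-efficient.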

6 Application: DeMarzo-Sannikov as a Limiting Case

In this section we demonstrate the flexibility of our model and techniques by showing how

the results of DeMarzo-Sannikov can be deduced as a limiting case of our model. A major

difference between the DeMarzo-Sannikov model and ours is that in their model the agent

can manipulate the cash flows in any bounded variational way. This is quite a powerful

ability, and in our model we restrict that ability by only allowing the agent to change the

drift of the cash flow by an effort factor of a ∈ [η − A, η]. However, as we let A tend to

infinity, the effort interval becomes arbitrarily large and so do the manipulative powers of

the agent. In the limit, the agent in our model can manipulate just as well as the agent in

theirs. So for large A, the principal in our model faces an agent that is quite similar to the

agent in their model, and one would expect that the optimal contracts would be similar as

well. Indeed, we will show that the credit limit contract of DeMarzo-Sannikov is essentially

the limiting contract of our family of optimal contracts as we let A→∞.

In their model c ≤ 1 and r < γ, and so we restrict our attention to these cases. We fix

all the exogenous constants of the model except A and parameterize all the functionals and

objects in our model by A. S(K,L) is the intrinsic payoff set of the family of optimal high

effort contracts, and since these contracts don’t involve low effort, they along with S(K,L) are

invariant under A. These high effort contracts are our model’s equivalent of the credit limit

contracts of DeMarzo-Sannikov. Specifically we will show

$$\lim_{A \to \infty} b_A(W) = S_{(K,L)}(W)$$


and explain the implied contractual convergence between the optimal contracts corresponding to b and the high effort credit-limit contracts corresponding to S(K,L).

Proposition 14. For c < 1 and all sufficiently large values of A, the model is one of private

benefit and bA(W ) = S(K,L)(W ) for all W ≥ K. The contracts are the same as well.

In this case, the convergence happens in finite time. Intuitively, once A reaches a certain

threshold, the harm done by low effort is so great, that no matter what the situation, it is not

beneficial for the principal to let the agent shirk. Thus the principal only wants to induce

high effort, and so the family of optimal contracts and the family of optimal high-effort

contracts coincide.

Proposition 15. For c = 1, bA converges uniformly to S(K,L), with the phase change points pA, qA = (W^l_A, bA(W^l_A)) converging to the linearity point (W^l, S(K,L)(W^l)) of S(K,L).

The convergence this time does not happen in finite time. Moreover, for all A < ∞, no

matter how large, the family of optimal contracts is empty, so we have to be careful about

what we mean when we talk about contractual convergence.

For all A sufficiently large, one family of close-to-optimal contracts is the promotion type

contracts. Recall that a promotion type contract consists of promoting the agent through a

series of improving high effort subcontracts. As A becomes increasingly large, the addition of

the subcontracts that come before the final subcontract only marginally improve the payoff

of the principal. The result is that the principal can pretty much do just as well by immediately

promoting the agent to that last subcontract. This last subcontract is of course the high

effort / credit limit contract.

Another family of close-to-optimal contracts consists of those where short-term shirking replaces

the payment process dIt. Recall, from a utilities standpoint, the short-term shirking phase

amounted to a super-efficient cash transfer. As we let A tend to infinity, the slope of the

(II)-curve representing the shirking phase approaches -1, so that super-efficiency converges to

efficiency and furthermore, the duration of each shirking phase decreases to 0. Thus for large

A, the short term shirking phases of the close-to-optimal contracts have pretty much the same

effect on utilities as does the payment phase of the credit limit contracts, so that in the limit,

the close-to-optimal contracts are structurally identical to the credit limit contract except

that the payment threshold is replaced by a utility-equivalent infinitesimal-term shirking

threshold.


Appendix A: Pictures

The (II)-Curve

Figure 5: (II)-Curves

Types 1 - 4 Curves

(a) Sp-Curves (b) (II)p Curve (c) (I)p,q-Curve

Figure 6: Curves of Type 1, 2, 3

Figure 7: Curves of Type 4


The Gallery of Optimal Value Functions

Figure 8: Private Benefit Case: K < κx (panels (a) through (j))


Figure 9: Non-Private Benefit Case: K ≥ κx (panels (a) through (f))

Appendix B: Incentive Compatibility Results

Remark 4. The careful reader may have noticed what looks to be a discrepancy between the formula

for Wt and the qualitative definition that preceded it. The value of Wt as we intended should only

depend on the contract after time t and the revenue stream up to time t. In particular, we make no

assumption as to how the revenue history up to time t is arrived at. But the formulaic definition

of Wt is an expectation with respect to P^a, which is a measure defined by the entire effort stream {a_s}_{s≥0}. What if the agent employed some other effort strategy a*_s which is the same as a_s for s ≥ t

but potentially different before t? The following lemma resolves this issue.

Lemma 16. Fix a t ≥ 0 and let a*_s be an effort strategy such that a_s = a*_s for s ≥ t. Then

$$E_{P^{a}}[\,\cdot \mid \mathcal{F}_t] = E_{P^{a^*}}[\,\cdot \mid \mathcal{F}_t]$$

Proof. The two measures P^a and P^{a*} induce the same measure on Ft.


Corollary 17. Let (Js, σ, bs) be a contract such that for s ≥ t, Is = Js and as = bs. Suppose

further that τ ∨ t = σ ∨ t, then Wt(Is, τ, as) = Wt(Js, σ, bs).

Proof of Lemma 1 - The Law of Wt.

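Define the total value function,

$$V_t(I,\tau,a) = E_{P^{a}}\left[\int_0^{\tau} e^{-\gamma s}\big(dI_s - h(a_s)\,ds\big) + e^{-\gamma\tau}K \,\Big|\, \mathcal{F}_t\right] = \int_0^{t} e^{-\gamma s}\big(dI_s - h(a_s)\,ds\big) + e^{-\gamma t}W_t(I,\tau,a)$$

Vt is automatically a martingale and, applying the Martingale Representation Theorem, there exists a process βt(I, τ, a) such that

$$dV_t = \beta_t e^{-\gamma t}\,dZ^{a}_t, \qquad t \le \tau \tag{5}$$

But we also have that

$$dV_t = e^{-\gamma t}\big(dI_t - h(a_t)\,dt\big) - \gamma e^{-\gamma t}W_t\,dt + e^{-\gamma t}\,dW_t \tag{6}$$

Combining (5) and (6) and multiplying by e^{γt} we get the law of Wt.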
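The combination step can be written out explicitly. The following is a spelled-out sketch rather than a restatement of Lemma 1 as worded in the main text; it assumes the term γe^{−γt}Wt in (6) is read as carrying a dt, and it simply equates the right-hand sides of (5) and (6) before multiplying through by e^{γt}:

```latex
\begin{aligned}
\beta_t e^{-\gamma t}\,dZ^{a}_t
  &= e^{-\gamma t}\big(dI_t - h(a_t)\,dt\big) - \gamma e^{-\gamma t} W_t\,dt + e^{-\gamma t}\,dW_t \\
\Longrightarrow\quad
dW_t &= \gamma W_t\,dt - \big(dI_t - h(a_t)\,dt\big) + \beta_t\,dZ^{a}_t, \qquad t \le \tau .
\end{aligned}
```

With no payments (dI = 0) the drift of Wt is γWt + h(at), which is the coefficient multiplying b′(Wt) in the drift computation used in the proof of Lemma 6.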

Proof of Lemma 2 - The Incentive Compatibility Criterion.

Let a and a* be two effort strategies and define a*|_t a to be the strategy that is a* until time t and then a afterwards. Consider the process Vt(I, τ, a*|_t a), the total value for the agent at time t when applying the strategy a*|_t a. Note that as time changes, so does the effort strategy with respect to which we calculate the project’s value to the agent. The drift of this process with respect to P^{a*} is

$$e^{-\gamma t}\big[h(a_t) - h(a^*_t) + \beta_t(I,\tau,a)\,(a^*_t - a_t)\big]$$

If we let a*_t be governed by βt via the incentive compatibility criterion then Vt(I, τ, a*|_t a) always has nonnegative drift and so is a submartingale (with last element). Now by the optional sampling theorem

$$W_0(I_t,\tau,a_t) = V_0(I_t,\tau,a^*|_0 a) \le E_{P^{a^*}} V_\infty(I_t,\tau,a^*|_\infty a) = W_0(I_t,\tau,a^*_t)$$

Thus a* is a better response to the proposal (It, τ).
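The bracketed drift term in this proof is nonnegative whenever the deviation effort maximizes βa − h(a) over the feasible effort interval. A minimal sketch of that check follows, assuming a linear effort cost h(a) = c·a on [η − A, η]; the linear form and the names `best_response` and `drift_bracket` are illustrative assumptions, not definitions taken from the paper.

```python
def best_response(beta, c, eta, A):
    """Effort in [eta - A, eta] maximizing beta*a - h(a), with the ASSUMED linear cost h(a) = c*a."""
    # The objective is linear in a with slope (beta - c):
    # pick the upper end of the interval when the slope is nonnegative, the lower end otherwise.
    return eta if beta >= c else eta - A


def drift_bracket(a, a_star, beta, c):
    """Bracketed drift term h(a) - h(a*) + beta*(a* - a) under the assumed cost h(a) = c*a."""
    return c * a - c * a_star + beta * (a_star - a)
```

Under this assumption the bracket equals (β − c)(a* − a), so choosing a* by the sign of β − c makes it nonnegative for every alternative effort a in the interval.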

Proof of Lemma 3 - The Shadow Variable Wt.


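Fix a T. Integrating the law of Ws from T to t ∧ τ, we get

$$e^{-\gamma T}W_T - e^{-\gamma (t\wedge\tau)}W_{t\wedge\tau} = \int_T^{t\wedge\tau} e^{-\gamma s}\big(dI_s - h(a_s)\,ds\big) - \int_T^{t\wedge\tau} e^{-\gamma s}\beta_s\,dZ^{a}_s$$

The martingale term on the right side has a quadratic variation process that is uniformly bounded, and so taking the conditional expectation E[ · |F_T] and the limit as t → ∞

$$e^{-\gamma T}W_T = \lim_{t\to\infty} E\left[\int_T^{t\wedge\tau} e^{-\gamma s}\big(dI_s - h(a_s)\,ds\big) - \int_T^{t\wedge\tau} e^{-\gamma s}\beta_s\,dZ^{a}_s + e^{-\gamma (t\wedge\tau)}W_{t\wedge\tau} \,\Big|\, \mathcal{F}_T\right] = E\left[\int_T^{\tau} e^{-\gamma s}\big(dI_s - h(a_s)\,ds\big) + e^{-\gamma\tau}K \,\Big|\, \mathcal{F}_T\right] = e^{-\gamma T}W_T$$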

Proof of Proposition 4 - The Intrinsic Value Set of Optimal Contracts.

If b is not intrinsic to the family of optimal contracts, then there is a particular optimal contract

(Is, τ, as) for which it is not intrinsic. Let (Wt, Gt) be its payoff process. Then there is some time

t, such that the set of points ω where Gt∧τ (ω) < b(Wt∧τ (ω)) has positive measure. For each such ω select

a contract with parameter Wt∧τ (ω) and a higher principal payoff. Then at time t ∧ τ , for each

relevant ω, switch to the corresponding new contract. Then this altered contract has a higher

principal payoff than G0 = b(W0). Contradiction.

Proof of Lemma 5. Let W∗ be a point described in the lemma. That b′(W∗) = −1 and b is concave implies that for all W ≥ W∗, b(W) ≤ b(W∗) − (W − W∗). It suffices to show that for any W ≥ W∗ the principal can deliver value W to the agent in an incentive compatible way that guarantees himself a payoff arbitrarily close to b(W∗) − (W − W∗). To achieve this, the principal can simply give the agent W − W∗ immediately and then employ a contract with parameter W∗ which has a principal payoff arbitrarily close to b(W∗).

Proof of Lemma 6. Suppose that b does not satisfy some (a, β)-inequality for some domain value

W0. Then there is an open interval (W1,W2) containing W0 on which b does not satisfy that same

inequality. We now describe a contract with parameter W0 which has a principal payoff greater

than b(W0). Begin with an a-effort incentive compatible pair (a, β), and an identically zero Is.

Define the shadow value Wt with initial value W0. Since the Is factor is 0, the shadow value moves

continuously. Let σ be the first time Wt ∈ {W1, W2}. Define the following process

$$F_t(a,\beta) = F_t = \int_0^{t} e^{-rs}\,dZ_s + e^{-rt}\,b(W_t)$$


With respect to the measure P^a the drift is

$$e^{-rt}\Big(\mu + a + \big(\gamma W_t + h(a)\big)\,b'(W_t) + \tfrac{1}{2}\beta_t^2\, b''(W_t) - r\,b(W_t)\Big)\,dt$$

This drift is positive when t < σ and EFσ − EF0 > ε > 0.⁸ Now for each Wi, i = 1, 2, pick a contract Ci which has a principal payoff Gi which is ε-close to b(Wi). Now consider the contract C with parameter W0, which begins as an a-effort contract with sensitivity β, and no payments until the shadow value of the contract hits Wi, at which point the contract turns into Ci. Then

$$G_0(C) = E\left[\int_0^{\sigma} e^{-rs}\,dZ_s + e^{-r\sigma}G_i(W_\sigma)\right] \ge E\left[\int_0^{\sigma} e^{-rs}\,dZ_s + e^{-r\sigma}\big(b(W_\sigma) - \varepsilon\big)\right] \ge EF_\sigma - \varepsilon \ge F_0 = b(W_0)$$

Proof of Lemma 11 - The Intrinsic Value Set of Type 4 Contracts.

To show that the described contracts produce the claimed payoff points let us call B the “function”

corresponding to the candidate intrinsic value set (I)a,p2 ∪ (II)p1,b. |B| is the union of two continuous

functions on compact intervals, so it attains a maximum M . Let σ be the random time when

Wt = ax and define the process

$$F_t = \int_0^{t} e^{-rs}(dZ_s - dI_s) + e^{-rt}B(W_t), \qquad t \le \sigma$$

Then

$$EF_\sigma = E\left[\int_0^{\sigma} e^{-rs}(dZ_s - dI_s) + e^{-r\sigma}B(W_\sigma)\right] = \text{Principal's Payoff } G_0$$

Since B(W0) = F0, it suffices to show F0 = EFσ. The value, slope, and concavity of the function B(Wt), if ambiguous, will correspond to the branch of the curve (I)a,p2 ∪ (II)p1,b that Wt corresponds to. By synching the shadow value with the appropriate branch, the drift of Ft is 0. Thus F0 = EFt∧σ. Furthermore

$$E[F_\sigma] = E\left[F_{t\wedge\sigma} + 1_{t<\sigma}\left(\int_t^{\sigma} e^{-rs}(dZ_s - dI_s) + e^{-r\sigma}B(W_\sigma) - e^{-rt}B(W_t)\right)\right] \le G_0 + e^{-rt}\left(\frac{\mu+\eta}{r} + 2M\right) \to G_0$$

⁸ Given a submartingale Xt and a stopping time τ, one cannot in general conclude that EXτ ≥ EX0. Doob’s optional sampling theorem affirms this relation when τ is bounded, which our stopping time is not. It turns out this inequality is true for our case anyway, but for more subtle reasons. Since we are only using this result to motivate some differential equations, and it will not be used in the actual proof of the structure theorem later in the appendix, we skip the technical details.


That the drift of Ft is everywhere 0 also shows that the payoff process up to σ always stays on

(I)a,p2 ∪ (II)p1,b, proving the intrinsic property.

Appendix C: Some Technical Caveats

The Principal’s Budget

We have implicitly assumed that the principal’s budget is infinite. This is a standard assumption

which for the most part is harmless and helps streamline results on optimality, much like the

infinite horizon assumption. It is a stylized representation of the principal having a large budget

in comparison to the agent. What we want to make sure is that this assumption does not produce

results that do not properly reflect the dynamics of a realistic setting with a budget limit. When martingales are allowed to be unbounded, the optional sampling theorem does not hold in general. Thus in economic models where the budget is infinite, one must take precautions to prevent the double-or-nothing phenomenon from appearing and distorting results (cf. Harrison and Kreps). The

following two observations confirm that the results of the paper are right:

• With a budget limit, the optimal value function will of course be lower. However, as we let

the budget increase to infinity, the optimal value function with a budget limit will converge

to the optimal value function of this paper.

• Fix a stopping time τ , representing a termination time for some contract in an infinite budget

setting. With a budget limit, this termination time will be cut short whenever the underlying

Brownian motion representing the cumulative revenue stream hits the budget limit before τ .

Let A be this event. As we let the budget increase to infinity, the probability of A occurring

converges to 0.
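The second observation can be illustrated in a simplified setting. The sketch below is not the paper's model: it assumes a driftless standard Brownian motion and replaces the stopping time τ by a fixed horizon, so that the reflection principle gives the hitting probability in closed form. The function name `prob_hit_before` and the numerical values are illustrative assumptions.

```python
import math


def prob_hit_before(budget, horizon):
    """P(max over [0, horizon] of B_s >= budget) for a standard driftless Brownian motion B.

    Reflection principle: this equals 2 * (1 - Phi(budget / sqrt(horizon))),
    which is the same as erfc(budget / sqrt(2 * horizon)).
    """
    if budget <= 0:
        return 1.0
    return math.erfc(budget / math.sqrt(2.0 * horizon))


# Hitting probabilities for an increasing sequence of budget limits, fixed horizon.
probs = [prob_hit_before(m, horizon=10.0) for m in (1, 5, 10, 20, 40)]
```

In this toy version the probability of reaching the budget limit before the horizon falls monotonically toward 0 as the limit grows, which is the qualitative behavior the observation relies on.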

Since the infinite budget is a stylization, what we really want is to find the limit of the optimal value

function with a budget as the budget tends to infinity. The first observation then implies that the

optimal value function we derived is the “correct” one. We now argue that our contracts are also the correct ones. In our paper, we provided optimal contracts and close-to-optimal contracts

corresponding to this correct optimal value function. However, the fear is that while their payoff

points are the right ones, the contracts themselves may not be because they may not be feasible in

a budget setting. The second point then assures us that these fears are unfounded. Specifically, we

can import any of our contracts up to a slight modification into a budget setting. Take any such

contract C, and prematurely terminate the contract whenever the budget limit is reached. Now

because this termination is premature, Wt is not going to equal K. So as the principal, simply


pay the agent the difference at the premature termination time. This altered contract C′ is still

incentive compatible, and the payoff to the agent remains the same. Of course, the payoff to the

principal will now be less than the original, but the second point says that as the budget tends to

infinity, premature termination gets increasingly rare, and consequently, the payoff to the principal

under C′ converges to the original payoff of C.
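The optional sampling caveat behind the double-or-nothing concern can be seen in a discrete toy model. The sketch below is an illustration under stated assumptions, not part of the paper's continuous-time setting: it takes a symmetric ±1 random walk started at 0 and stops it at the first visit to −1 or at step n, whichever comes first, computing everything exactly. The name `stopped_walk` is hypothetical.

```python
from fractions import Fraction


def stopped_walk(n):
    """Symmetric +-1 random walk started at 0, stopped at the first visit to -1 or at step n.

    Returns (expected value at the stopping time, probability of having reached -1 by step n),
    both computed exactly with rational arithmetic.
    """
    half = Fraction(1, 2)
    alive = {0: Fraction(1)}  # distribution over positions of paths that have not yet stopped
    p_hit = Fraction(0)       # probability mass already absorbed at -1
    for _ in range(n):
        nxt = {}
        for pos, p in alive.items():
            for step in (-1, 1):
                new = pos + step
                if new == -1:
                    p_hit += p * half
                else:
                    nxt[new] = nxt.get(new, Fraction(0)) + p * half
        alive = nxt
    mean = -p_hit + sum(pos * p for pos, p in alive.items())
    return mean, p_hit
```

For every finite n the stopping time is bounded and the expected stopped value is exactly 0, as optional sampling requires. The unbounded stopping time "first visit to −1" is reached with probability approaching 1 as n grows, yet the value at that time is always −1, so the relation EXτ ≥ EX0 fails for this martingale once boundedness is dropped.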

A Word on the r = γ Case

Recall, when r = γ, any contract can be improved by deferring payment until later. Thus the

“optimal” way to induce high effort is to set the payment threshold W l at infinity, effectively

delaying payment forever. This of course is not incentive compatible and so there are only close-

to-optimal high effort contracts with high (instead of infinite) payment thresholds. Thus in all the

previous discussions about contracts, whenever we mentioned an optimal high effort contract, one

should actually substitute with a close-to-optimal high effort contract with high payment threshold

when dealing with the r = γ case. Similarly, whenever we dealt with the intrinsic payoff set

S of the optimal high effort contract, we should substitute with intrinsic payoff set S(W l) of a

close-to-optimal high effort contract with high payment threshold. This change could potentially

be troublesome, since the heuristic algorithm, which was used not only to find the optimal value

functions, but also to derive optimal and close-to-optimal contracts, uses the S-curve, which in the

r = γ case is the intrinsic payoff set of the non-incentive compatible infinite threshold high effort

contract. Thus technically speaking, in the r = γ case, a true heuristic algorithm would replace

all S-curve expansions with attainable approximations of the form S(W l) where W l is high. The

fear is that such a true heuristic algorithm might produce completely different results from the one

we used since as we have seen, the algorithm can run indefinitely and so the little changes may

add up. However, this is actually not the case. For all sufficiently high W l, the changes arising

from the replacement of S with S(W l) don’t accumulate to much, even when the algorithm runs

indefinitely and therefore from an optimal value function perspective, it makes no difference. Of

course, when speaking of contracts, one should remember that the heuristic algorithmic construction

is only an idealized representation of constructions using only attainable points, so that all the high

effort contracts used previously need to be interpreted as ones with high but not infinite payment

thresholds.

Another distinction is that the (II)-curves are linear. This affects the geometry of b, which in

turn affects the structure of the family of optimal contracts. The result is that many of the optimal

contractual forms from the r < γ case, like the promotion contract, are not optimal when r = γ,

and conversely, the tenure-type contract in the r = γ case is not optimal in the r < γ case.


Appendix D: Some Terminology and Preparatory Results

• ν = (−h(η)/γ, (µ+η)/r), κ = (−h(η−A)/γ, (µ+η−A)/r), ι = (K, L), i = (K, b(K))

• Given two points u, v, let uv denote the line segment from u to v

• Given a point u, ux and uy denote its x and y coordinates

• Given a differentiable curve C and p ∈ C, let dC|p denote the derivative of C at p; suppose

further C is twice differentiable, then we may also define d2C|p with the obvious meaning.

Alternatively, given a value x over which C is defined, we may also write dC(x) and d2C(x)

with the obvious meanings

• Q = {(x, y) | x ≥ −h(η)/γ, y < (µ+η)/r}

• For d > 0, define d⃗ to be the unit vector with slope −(γ/r)d starting from ν. We then define the line segment l_d = {r d⃗ | r > 0}.

• R_d is the set of points in Q strictly below l_d and R_d^co = Q − R_d

• v_κ is the vertical line through κ and v_κ^+ is the portion of v_κ above κ and including κ; l_κ is the line in Q parallel to l_1 and going through κ; we can similarly define an l_κ^+

• Fix a point u. Suppose it lies on l_d, then define e(u) = −1/d.

• L_u denotes the line with slope −1 starting at u and going down and to the right.

• Let C and K be two sets with the same domain D. We define

$$|C - K| = \sup_{x \in D}\ \inf_{y_1 \in C(x),\ y_2 \in K(x)} |y_1 - y_2|$$
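The distance just defined can be computed directly when the two sets are sampled on a finite grid. A minimal sketch, assuming each set is represented as a dictionary from domain points to a non-empty list of y-values (a finite stand-in for a possibly multi-valued curve); the representation and the name `set_distance` are illustrative assumptions.

```python
def set_distance(C, K):
    """Sup over shared domain points x of the inf over y1 in C[x], y2 in K[x] of |y1 - y2|.

    C and K are dicts mapping each domain point to a non-empty list of y-values.
    """
    if C.keys() != K.keys():
        raise ValueError("C and K must share the same domain")
    return max(
        min(abs(y1 - y2) for y1 in C[x] for y2 in K[x])
        for x in C
    )


# Example: two sets over the domain {0, 1}, one of them two-valued at each point.
C = {0: [1.0, 3.0], 1: [2.0]}
K = {0: [2.5], 1: [2.0, 5.0]}
distance = set_distance(C, K)
```

Taking the infimum at each x means that where a set is multi-valued, only its closest branch counts, which is how the definition treats unions of curves in the later lemmas.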

When d = 1, we may drop the subscript in l_d, l_d^+, R_d, and R_d^co. Whenever we fix a point, we will

assume that this point lies in Q unless otherwise stated.

We need to know some specifics about (I)- and (II)-curves for the forthcoming structure theorem.

Lemma 18. The regularity properties of the (I)-ODE imply that given any initial point and slope,

there is a unique global (I)-ODE solution with those initial conditions. So fix a point u and slope

α and let C be the corresponding (I)-ODE curve. Let α ≥ e(u), then C is concave on the interval

[−h(η)/γ, ux]. If α > e(u) then C is concave in a neighborhood of u. Let x∗ be the possibly infinite

supremum of all values x such that C is concave on [ux, x].


Proof. The proof of the existence of global solutions is classical. The rest of the results are implied

by similar results from DeMarzo and Sannikov.

Definition 20. Fix a point u and for α ≥ e(u). Let C be the global solution to the (I)-ODE with

initial condition (u, α), and if α > e(u), define x∗ as in Lemma 18. Now given Lemma 18, we may

define (I)u,α to be the concave segment of C whose domain is [−h(η)/γ, ux]; (I)u,α to be the concave segment of C whose domain is [ux, x∗]; and (I)uα to be the concave segment of C whose domain is [−h(η)/γ, x∗]. Lastly, given points u and v, we write (I)u,v to mean any concave (I)-curve going from

u to v; we will see later that (I)u,v is unique if it exists.

Lemma 19. Let u, v be two points (possibly the same) with ux = vx, uy ≤ vy. Fix β ≥ α > e(u)

(β > α if u = v). Then

d(I)u,α(x) < d(I)v,β(x)

for all x > ux over which both (I)-curves are defined. Symmetrically, for β ≥ α ≥ e(u) (β > α if

u = v)

d(I)u,β(x) > d(I)v,α(x)

for all x < ux over which both (I)-curves are defined. In particular, the curves don’t intersect except

possibly at ux.

Proof. Let us prove the first part. For all x > ux but sufficiently close to ux

d(I)u,α(x) < d(I)v,β(x)

either because d(I)u,α(ux) < d(I)v,β(vx) or, if not, then by assumption u ≠ v and the concavity at ux of (I)u,α is greater than that of (I)v,β. Now suppose there exist values x such that d(I)u,α(x) ≥ d(I)v,β(x). Then pick the smallest such value, x∗. Since by assumption d(I)u,α(a) < d(I)v,β(a) for all a ∈ (ux, x∗), we have (I)u,α(x∗) < (I)v,β(x∗). But then using a similar concavity argument as before,

we find that for all values x < x∗ but sufficiently close to x∗

d(I)u,α(x) > d(I)v,β(x)

Now by the mean value theorem, there must be a value x∗∗ ∈ (ux, x∗) satisfying

d(I)u,α(x∗∗) = d(I)v,β(x∗∗)

contradicting the minimality of x∗. The second part follows similarly.


Definition 21. Given Lemma 19, we may define u’s (I)-curves, (H(I)u, ≥), to be the partially ordered set

{(I)u,d | d ∈ (e(u), ∞] ∪ (−∞, 0)} ∪ {(I)u,d | d ∈ [e(u), ∞] ∪ (−∞, 0)}

where the ordering of two curves both in the first or the second subset is by which one lies above

the other.

Corollary 20. Fix a point u and a (I)-curve (I)u,d1 ∈ H(I)u . Pick a point p1 on this curve, and a

p2 strictly above this curve such that ux < p2x ≤ p1x. Then p2 lies on another (I)-curve (I)u,d2 and

d1 < d2 and d(I)u,d1 |p1 < d(I)u,d2 |p2

Symmetrically, fix a (I)-curve (I)u,d1 ∈ H(I)u . Pick a point p1 on this curve, and a p2 strictly above

this curve such that p1x ≤ p2x < ux. If p2 lies on another (I)-curve (I)u,d2 then

d1 > d2 and d(I)u,d1 |p1 > d(I)u,d2 |p2

Proof. Let us prove the first case. For sufficiently large d, u lies above (I)p2,d. Let α = d(I)u,d1(p2x),

then by Lemma 19, u lies below (I)p2,α. Thus there is a d∗ > α such that u ∈ (I)p2,d∗ or equivalently, there is a d2 such that p2 lies on (I)u,d2. Lemma 19 also implies d1 < d2 and d(I)u,d2|p2 > α. Since

(I)u,d1 is concave, α ≥ d(I)u,d1 |p1 and the second inequality is proved. The second part follows

similarly.

Definition 22. Fix a d and any point x ∈ ld. We call the curve ((I)x,−1/d, Lx) a full Sd-curve. In

addition, for any point p with px ≥ νx and py = νy, we call Lp a full Sd-curve.

Lemma 21. The full Sd-curves don’t intersect each other and cover all of Q.

Proof. Clearly, the full Sd-curves don’t intersect in R_d^co. So suppose two full Sd-curves, say S_d^1 and S_d^2, intersect at u ∈ Rd. Now any full Sd-curve that goes through Rd must cross ld. Let S_d^1 and S_d^2 intersect ld at p1 and p2. Then Corollary 20 implies dS_d^1|p1 ≠ dS_d^2|p2. Contradiction.

The second assertion is obvious.

Definition 23. We define (HSd , ≥) to be the ordered set of full Sd-curves, where ordering between

curves is determined by which curve lies above the other.


Definition 24. Fix a point u, then we write S_d^u for the portion of the unique Sd-curve through u lying to the right of u and containing u. We shall refer to this S_d^u as an Sd-curve. When d = 1,

we may drop the subscript.

Lemma 22. Let S_d^1, S_d^2 be two distinct elements of HSd with S_d^1 > S_d^2. Let x be a value for which (x, S_d^2(x)) lies in Rd. Then dS_d^1(x) < dS_d^2(x). Furthermore, for any x ≥ −h(η)/γ and y ≥ −1 there is an Sd-curve S_d^∗ defined over x such that dS_d^∗(x) = y.

Proof. Let u = (x, S_d^1(x)), v = (x, S_d^2(x)), p = S_d^u ∩ ld and q = S_d^v ∩ ld. If dS_d^u(x) ≥ dS_d^v(x) then by Lemma 19, dS_d^u|p < dS_d^v|q, contradiction.

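Lemma 23. Fix a κ and a line l containing κ with slope e. Let C be any (II)-curve which intersects l at a point p other than κ. Then dC|p = (r/γ)e.

Proof. Perform the translation/substitution X ≡ x − (−h(η−A)/γ) and Y ≡ y − (µ+η−A)/r. The (II)-ODE becomes

$$rY = \gamma X Y' \quad\text{or, rearranging,}\quad Y' = \frac{r}{\gamma}\,\frac{Y}{X}$$

In these coordinates κ is the origin, so Y/X = e at every point of l other than κ, and hence Y′ = (r/γ)e at p.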
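Lemma 23 can be checked numerically in the shifted coordinates. The sketch below assumes r ≠ γ and works in the quadrant X > 0, Y > 0, where Y = k·X^(r/γ) solves rY = γXY′; the function names and the constant k are illustrative assumptions.

```python
def ii_curve(X, k, r, gamma):
    """A solution Y = k * X**(r/gamma) of the shifted (II)-ODE r*Y = gamma*X*Y' on X > 0."""
    return k * X ** (r / gamma)


def slope_at_intersection(k, e, r, gamma, h=1e-6):
    """Central-difference slope of the (II)-curve at the point where it meets the line Y = e*X."""
    # Solve k * X**(r/gamma) = e * X for X > 0 (needs k/e > 0 and r != gamma).
    X0 = (k / e) ** (1.0 / (1.0 - r / gamma))
    return (ii_curve(X0 + h, k, r, gamma) - ii_curve(X0 - h, k, r, gamma)) / (2 * h)


# With r/gamma = 0.5 and a line of slope e = 0.5 through the origin,
# the lemma predicts a curve slope of (r/gamma) * e at the intersection.
numeric_slope = slope_at_intersection(k=2.0, e=0.5, r=0.05, gamma=0.1)
predicted_slope = (0.05 / 0.1) * 0.5
```

The slope depends only on the line through κ and not on which (II)-curve is chosen, which is why the lemma holds for any (II)-curve crossing l.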

Lemma 24. Let (x, y, a) be a tuple of numbers signifying a point (x, y) and a slope a at that point. For x ≠ κx let y′ = d(II)(x,y)|(x,y). Then the tuple (x, y, a) satisfies the (II)-inequality if

• x < κx ⇒ a ≥ y′

• x = κx ⇒ y ≥ κy

• x > κx ⇒ a ≤ y′

Proof. Again, it is easiest to work in (X,Y ) coordinates, with Y ′ ≡ y′. In the first case X < 0, so

aX ≤ Y ′X if a ≥ Y ′. In the second case, X = 0 so rY ≥ γXY ′ if Y ≥ 0. And in the last case

X > 0, so aX ≤ Y ′X if a ≤ Y ′.
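The three cases of Lemma 24 can be checked mechanically. The sketch below takes the (II)-inequality in shifted coordinates to be rY ≥ γXa, which is the form used in the proof above with the slope a in place of Y′; the function names and sample values are illustrative assumptions.

```python
def satisfies_ii(X, Y, a, r, gamma):
    """(II)-inequality in shifted coordinates: r*Y >= gamma*X*a, where a is the slope at (X, Y)."""
    return r * Y >= gamma * X * a


def ii_slope(X, Y, r, gamma):
    """Slope Y' = (r/gamma) * Y / X of the (II)-curve through (X, Y), for X != 0."""
    return (r / gamma) * Y / X


r, gamma = 0.05, 0.1
left = ii_slope(-2.0, 1.0, r, gamma)   # slope of the (II)-curve at a point left of kappa
right = ii_slope(2.0, 1.0, r, gamma)   # slope of the (II)-curve at a point right of kappa
```

To the left of κ a slope passes the test exactly when it is at least the (II)-slope, to the right exactly when it is at most the (II)-slope, and on the vertical line through κ the test reduces to Y ≥ 0.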

Lemma 25. Fix a point u ∈ v+κ . Any sub-curve of any differentiable downward sloping concave

curve C emanating from u heading either in the left or the right direction satisfies the (II)-inequality.

Proof. It suffices to prove the lemma when the sub-curve is the entire curve, and the curve is heading

right. We work in (X, Y) coordinates. In the (+,+) quadrant of the X-Y plane, Y′ > 0 but dC < 0. So by Lemma 24, the (II)-inequality is satisfied here. In the (+,−) quadrant, suppose at some point p, dC|p > d(II)p|p. In this quadrant, the (II)-curve is strictly convex, and C is concave, which means C(0) < (II)p(0) = 0, contradiction. Thus dC|p ≤ d(II)p|p and the (II)-inequality is satisfied everywhere.


Lemma/Definition 26. For each point p in the plane such that px ≠ κx we may associate to it a three-tuple of numbers (l, m, n)p

• l = d(II)p|p

• m = d²(II)p|p

• n = d²(I)p,l|p

The locus of points p where m = n, along with κ, comprises the set of points satisfying the equation

$$Y = \frac{\gamma}{rc}\cdot\frac{X^2}{\frac{c}{2A}\left(1 - \frac{r}{\gamma}\right) - X}$$

where Y ≡ y − (µ+η−A)/r and X ≡ x − (−h(η−A)/γ). When r < γ, there is a singularity at X = (c/(2A))(1 − r/γ). To the left of this singularity lies the component of points that includes κ. This component is a convex, U-shaped curve with a global minimum at κ. We call it the U-curve. If p lies in the U-curve or under the other downward facing parabola then m < n. Along the graph, m = n, and for all other p, m > n. Also, l_c^+ lies inside the U-curve.

When r = γ, there is no singularity and the equation’s graph is simply the straight line going

through κ and ν. If p lies above this line, m > n; if p lies on the line, m = n; and if p lies below

the line, m < n.

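Proof. The (I)-inequality can be rewritten as

$$rY + A = (\gamma X + cA)\,Y' + \frac{c^2}{2}\,Y''$$

At the point p = (X, Y)

$$l = \frac{r}{\gamma}\,\frac{Y}{X}, \qquad m = \frac{r}{\gamma}\,\frac{Y'X - Y}{X^2} = \frac{r}{\gamma}\left(\frac{r}{\gamma} - 1\right)\frac{Y}{X^2}$$

and

$$n = \frac{2}{c^2}\Big(rY + A - (\gamma X + cA)\,Y'\Big) = \frac{2A}{c^2}\left(1 - \frac{rc}{\gamma}\,\frac{Y}{X}\right)$$

Setting m = n

$$\frac{r}{\gamma}\left(\frac{r}{\gamma} - 1\right)\frac{Y}{X^2} = \frac{2A}{c^2}\left(1 - \frac{rc}{\gamma}\,\frac{Y}{X}\right) \;\Rightarrow\; \frac{r}{\gamma}\left(\frac{r}{\gamma} - 1\right)Y = \frac{2A}{c^2}\left(X^2 - \frac{rc}{\gamma}\,XY\right) \;\Rightarrow$$

$$\left[\frac{r}{\gamma}\left(\frac{r}{\gamma} - 1\right) + \frac{2Ar}{c\gamma}\,X\right]Y = \frac{2A}{c^2}\,X^2 \;\Rightarrow$$

$$Y = \frac{\frac{2A}{c^2}\,X^2}{\frac{r}{\gamma}\left(1 - \frac{r}{\gamma}\right) - \frac{2Ar}{c\gamma}\,X} = \frac{\gamma}{rc}\cdot\frac{X^2}{\frac{c}{2A}\left(1 - \frac{r}{\gamma}\right) - X}$$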

Corollary 27. Let p /∈ v+κ be a point inside the U-curve. Then (II)p satisfies the (I)-inequality.

Proof.

$$rY + A = (\gamma X + cA)\,l + \frac{c^2}{2}\,n \;\ge\; (\gamma X + cA)\,l + \frac{c^2}{2}\,m$$

Lemma 28. Let us split Q into five regions

• Lo - the region to the left of vκ and outside the U-curve

• Li - the region to the left of vκ and inside the U-curve

• Ri - the region to the right of vκ and inside the U-curve

• Ru - the region to the right of vκ, above the y = κy line, and outside the U-curve

• Rl - the region to the right of vκ and below the y = κy line.

We assume that the boundaries belong to both neighboring areas. Let p be a point in one of these

regions but not on vκ. Fix a value λ subject to

• e(p) < λ and d(II)p|p ≤ λ if p ∈ Lo ∪ Li

• e(p) < λ ≤ d(II)p|p if p ∈ Ri ∪ Ru

• λ ≤ d(II)p|p if p ∈ Rl

Next we fix a curve C

• C = (I)p,λ if p ∈ Lo ∪Ri

• C = (I)p,λ if p ∈ Li ∪Ru

• C is any concave C1 curve with initial point/slope (p, λ) if p ∈ Rl

Then for a point q ∈ C different from p but in the same region as p

• dC|q > d(II)q|q if q ∈ Lo ∪ Li

• dC|q < d(II)q|q if q ∈ Ri ∪Ru ∪Rl


Thus, in the region that p is in, C satisfies the (II)-inequality and doesn’t intersect (II)p.

Proof. Let us consider the Ri case, and assume that dC|p < d(II)p|p. Then for all sufficiently small ε > 0, dC(px − ε) < d(II)pε|pε where pε is the unique point on C such that pεx = px − ε. Now suppose there is a point q ∈ Ri lying on C such that dC|q ≥ d(II)q|q. Let q be the rightmost such point. Then since q lies inside the U-curve, d²C|q > d²(II)q|q. This implies that for all sufficiently small ε > 0, dC(qx + ε) > d(II)qε|qε where qε is the unique point on C such that qεx = qx + ε. But then by the mean value theorem, there is another point q′ ∈ C between q and p such that dC|q′ = d(II)q′|q′, contradicting the rightmost property of q. The case dC|p = d(II)p|p now follows as a limiting case. Similar arguments can be made for the Lo, Li, and Ru cases.

For the Rl case, suppose there is a q for which the result fails. Since C is concave and (II)p is convex, C < (II)p for all points except p. Since q ≠ p, q lies below (II)p. Now by assumption, dC|q ≥ d(II)q|q. Again, using the concave-convex argument, C lies below (II)q which is impossible since C intersects (II)p.

Corollary 29. Let Ca,b be a concave curve. Suppose a ∈ U -curve, dCa,b|a ≥ d(II)a|a and Ca,b crosses

v+κ with negative slope. Then Ca,b satisfies the (II)-inequality. Symmetrically, suppose b ∈ U -curve,

dCa,b|b ≤ d(II)b|b and Ca,b crosses v+κ with positive slope. Then Ca,b satisfies the (II)-inequality.

When r = γ, a straight line satisfies the (II)-inequality if and only if it cuts vκ at or above κ.

Proof. Let us consider the a ∈ U-curve case. That Ca,b satisfies the (II)-inequality in Li is affirmed

by the previous lemma. That Ca,b satisfies the (II)-inequality to the right of vκ is given by Lemma

25. The symmetric case is similar.

In the X-Y plane, the (II)-inequality is Y′ ≤ Y/X. The line corresponding to such a point slope triple (X, Y, Y′) clearly has a nonnegative Y-intercept.

Lemma 30. Consider two versions of this model where the only difference is in the principal’s

outside option, with one having payoff L1 and the other having payoff L2 > L1. Then bL2 ≥ bL1.

Proof. A contract (It, τ, at) is incentive compatible in the L1-model if and only if it is incentive

compatible in the L2-model. Since L1 < L2, $G_0^{L_1}(I_t, \tau, a_t) \le G_0^{L_2}(I_t, \tau, a_t)$.

Lemma 31. Fix points a, p such that ax < px < κx. Suppose (I)a,p exists and d(I)a,p|p = d(II)p|p.

Let C = ((I)a,p, (II)p). If (I)a,p is not a minimal element of H(I)a and d2(I)a,p|p < 0, then for

sufficiently small ε > 0 there exist curves (I)a,p1 and (II)p2 satisfying the following properties:


• p1 ≠ p2, p1 ∈ (II)p2, and p2 ∈ (I)a,p1

• (I)a,p1∪ (II)p2 ≤ C and |C− (I)a,p1∪ (II)p2 | < ε

Similarly, fix a point p such that px > κx and suppose d(I)p = dSp|p. Let C = ((II)p, Sp). Then for

sufficiently small ε > 0 there exist curves (II)p2 and Sp1 satisfying the following properties:

• p1 ≠ p2, p1x = px, p1 ∈ (II)p2, and p2 ∈ Sp1

• (II)p2 ∪ Sp1 ≤ C and |C− (II)p2 ∪ Sp1 | < ε

Proof. Since the only version of this lemma we will use involves p lying on or outside the U-curve,

we will only prove the result for this case. However, the proof can be easily adapted to p inside

the U-curve. We should also note that even more general versions of the lemma are true (e.g. one

doesn’t have to assume d2(I)a,p|p < 0 in the first case). The purpose of results of this type is to

affirm that attainable payoffs of Type 4 do exist; the two cases above are the only ones that are

needed in the forthcoming proof of the structure theorem.

px < κx

Let q be the unique point with the following two properties: qx = ax and p ∈ (II)q. Let α = qy−ay.

By Lemma 28, α > 0, and we assume ε < α. Let p1 be a point satisfying p1x = px and py − p1y = δ > 0.

Let b be the point satisfying bx = qx, and p1 ∈ (II)b. We assume that δ is so small that

• qy − by < α and therefore | (II)q− (II)b| < ε.

• (I)a,p1 exists and |(I)a,p− (I)a,p1 | < ε

Since p1 lies below p, by Lemma/Definition 19 and Lemma 23, d(I)a,p1 |p1 < d(II)b|p1 . Therefore

(II)b lies strictly below (I)a,p1 immediately to the left of p1. But by assumption, we know b lies

above a. So to the left of p1, (I)a,p1 and (II)b must cross. Lemma 28 implies that they cross only

once, and we define p2 to be this intersection.

px > κx

In this part, as an abuse of notation, we write (II)u to mean the entire (II)-curve that (II)u is a

part of. Pick a p1 lying directly below p. By Lemma 22 and Lemma 23, d(II)p1|p1 < dSp1|p1. Now by

Lemma 28 and the fact that the S-curve eventually becomes a straight line of slope -1, they must

intersect once more at a point p2.

Lemma 32. The first best function f(W ) is bounded above by a linear function of slope -1.


Proof. Since the principal is more patient than the agent, any contract can be improved (usually

in a non-incentive compatible way) by collapsing the process It to an initial lump sum payment that

has equivalent value to the agent. So let (It, τ, at) be such a contract. Since It is constant for all t > 0, $W_0^+ \le \max\{\kappa_x, K\}$, and clearly $G_0^+ \le \mu/r$. Thus

$$G_0 = G_0^+ - (W_0 - W_0^+) \le \frac{\mu}{r} - W_0 + \max\{\kappa_x, K\}$$

So

$$f(W) = \sup_{W_0 = W} G_0 \le \frac{\mu}{r} + \max\{\kappa_x, K\} - W$$

Lemma 33. Let B be a concave differentiable function defined on [K,∞). Suppose B satisfies the

(I)- and (II)-inequalities, B′ ≥ −1, and B(K) ≥ L, then B is an upper bound on the optimal value

function.

Proof. By Lemma 32,

$$f(W) - B(W) \le \frac{\mu}{r} + \max\{\kappa_x, K\} - W - B(K) + W - K \le \frac{\mu}{r} + \max\{\kappa_x, K\} - B(K) - K \equiv M,$$

a constant.
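The estimate on −B(W) used in this bound follows from the slope condition alone. A short sketch, assuming only the hypotheses of Lemma 33 (B differentiable on [K, ∞) with B′ ≥ −1) and W ≥ K:

```latex
% Integrate the slope bound B' \ge -1 from K to W:
B(W) - B(K) \;=\; \int_K^W B'(x)\,dx \;\ge\; -(W - K)
\quad\Longrightarrow\quad
-B(W) \;\le\; -B(K) + W - K .
% Add the bound of Lemma 32,  f(W) \le \mu/r + \max\{\kappa_x, K\} - W :
f(W) - B(W) \;\le\; \frac{\mu}{r} + \max\{\kappa_x, K\} - B(K) - K \;=\; M .
```

The terms in W cancel, which is why M does not depend on W.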

Given any incentive compatible contract, define the following process

$$F_t = \int_0^t e^{-rs}(dZ_s - dI_s) + e^{-rt} B(W_t)$$

The drift is

$$dF_t = e^{-rt}\Big[\Big(\mu + a_t + (\gamma W_t + h(a_t))B'(W_t) + \tfrac{1}{2}\beta_t^2 B''(W_t) - rB(W_t)\Big)\,dt - (1 + B'(W_t))\,dI_t\Big] \le 0$$

Thus for any bounded stopping time σ, EFσ ≤ EF0 = B(W0). Now for any time t

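$$\begin{aligned}
G_0 &= E\Big[\int_0^\tau e^{-rs}(dZ_s - dI_s) + e^{-r\tau}L\Big] \\
&= E\Big[F_{t\wedge\tau} + 1_{\tau>t}\,e^{-rt}\Big(\int_t^\tau e^{-r(s-t)}(dZ_s - dI_s) + e^{-r(\tau-t)}L - B(W_t)\Big)\Big] \\
&\le B(W_0) + e^{-rt}E\big[1_{\tau>t}\big(f(W_t) - B(W_t)\big)\big] \le B(W_0) + e^{-rt}M
\end{aligned}$$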

Letting t→∞, we get G0 ≤ B(W0).
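The middle step of this chain can be unpacked path by path. The sketch below assumes that the agent's continuation value at termination is W_τ = K; that convention is an assumption made here, since it is not restated in this appendix. The symbol Π for the realized discounted payoff is introduced only for this illustration.

```latex
% Realized payoff along a path:
\Pi \;:=\; \int_0^\tau e^{-rs}(dZ_s - dI_s) + e^{-r\tau}L .
% On \{\tau > t\}: split the integral at t and use the definition of F_t.
\Pi \;=\; F_t + e^{-rt}\Big(\int_t^\tau e^{-r(s-t)}(dZ_s - dI_s) + e^{-r(\tau-t)}L - B(W_t)\Big) .
% On \{\tau \le t\}: F_\tau = \int_0^\tau e^{-rs}(dZ_s - dI_s) + e^{-r\tau}B(W_\tau), and W_\tau = K (assumed), so
\Pi \;=\; F_\tau - e^{-r\tau}\big(B(K) - L\big) \;\le\; F_\tau
\qquad\text{since } B(K) \ge L .
% Take expectations, using E F_{t\wedge\tau} \le B(W_0) and that the
% conditional continuation payoff at time t is at most f(W_t):
G_0 \;\le\; B(W_0) + e^{-rt}\,E\big[\,1_{\tau>t}\,\big(f(W_t) - B(W_t)\big)\big] .
```

Read this way, the second relation in the chain is an inequality rather than an equality whenever B(K) > L, which leaves the conclusion G0 ≤ B(W0) unchanged.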


Appendix E: Structure Theorem for The Optimal Value Function b

Part I: K < −h(η−A)/γ

Lemma 34. ι = i

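Proof. It suffices to show b(K) ≤ L. For any t > 0

$$\int_0^t -e^{-\gamma s} h(\eta - A)\,ds + e^{-\gamma t} K > K$$

Let (It, τ, at) be any incentive compatible contract. If τ ≠ 0 a.s. then

$$\begin{aligned}
\text{Agent's Payoff} &= E^{P^a}\Big[\int_0^\tau e^{-\gamma s}\big(dI_s - h(\eta - a_s)\,ds\big) + e^{-\gamma\tau}K\Big] \\
&\ge E^{P^A}\Big[\int_0^\tau e^{-\gamma s}\big(dI_s - h(\eta - A)\,ds\big) + e^{-\gamma\tau}K\Big] \\
&\ge E^{P^A}\Big[\int_0^\tau -e^{-\gamma s}h(\eta - A)\,ds + e^{-\gamma\tau}K\Big] > K
\end{aligned}$$

Thus only null contracts can possibly deliver value K to the agent, and so b(K) ≤ L.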
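The strict inequality at the start of this proof is exactly the Part I assumption in disguise. Evaluating the integral, as a sketch assuming γ > 0:

```latex
\int_0^t -e^{-\gamma s}h(\eta - A)\,ds + e^{-\gamma t}K
  \;=\; \big(1 - e^{-\gamma t}\big)\Big(-\frac{h(\eta - A)}{\gamma}\Big) + e^{-\gamma t}K .
% This is a convex combination of -h(\eta - A)/\gamma and K with weight
% 1 - e^{-\gamma t} \in (0,1) for t > 0, so it exceeds K exactly when
K \;<\; -\frac{h(\eta - A)}{\gamma} .
```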

Case 1: r < γ and c ≥ 1

Proposition 35. If ι lies at or above lκ then the optimal value function is

(Lι)

Otherwise, if ι is still on or inside the U-curve then there is a unique point q ∈(II)ι such that

d(II)ι|q = −1 and the optimal value function is

((II)ι,q,Lq)

If ι lies strictly between the U-curve and (I)κ,0 then there is a unique point p on the U-curve such

that d(I)ι,p|p = d(II)p|p and there is a unique point q ∈(II)p such that d(II)p|q = −1 and the optimal

value function is

((I)ι,p, (II)p,q,Lq)

Finally, if ι lies on or below (I)κ,0 then the optimal value function is

((I)ι,κ, κ,Lκ)


Proof.

(Lι) - Upper Bound

Since ι lies in Rco, (Lι) = (Sι) satisfies the (I)-inequality. Since ι is at or above lκ, by Lemma

23, d(II)ι|ι ≤ −1. Lemma 28 and Lemma 25 imply (Lι) satisfy the (II)-inequality. By Lemma 33,

(Lι) ≥ b.

(Lι) - Lower Bound

Trivial. Thus b = (Lι).

((II)ι,q,Lq) - Upper Bound

Since ι lies strictly below lκ but on or inside the U-Curve, by Lemma 23, −1 < d(II)ι|ι < 0 and so

the point q exists. By Corollary 27, (II)ι,q satisfies the (I)-inequality. Since q ∈ Rco, Lq satisfies the

(I)-inequality, and by Corollary 29, Lq also satisfies the (II)-inequality. Now ((II)ι,q, Lq) is not C2

so we cannot apply Lemma 33. Instead, we will construct a sequence of C2 functions fi ↓ ((II)ι,q,

Lq), each satisfying the hypothesis of Lemma 33.

Fix an ε1 > 0, write ε = ε1, and let q1 be the unique point on (II)ι such that q1x = qx − ε. Let l1 = d(II)q|q1 ,

m1 = d2(II)q|q1 , and y1 = (II)q(q1x). Let f1 be a function on the interval [ιx,∞) defined as follows

• $f_1(x) = (II)^{\iota}(x)$ for x ∈ [ιx, qx − ε]

• $f_1(x) = y_1 + \int_0^{x-(q_x-\varepsilon)} \Big[\, l_1 + \int_0^{t} \big(1 - \tfrac{s}{\varepsilon}\big) m_1\,ds \Big]\,dt$ for x ∈ [qx − ε, qx]

• $f_1(x) = f_1(q_x) + \Big( l_1 + \int_0^{\varepsilon} \big(1 - \tfrac{s}{\varepsilon}\big) m_1\,ds \Big)(x - q_x)$ for x ∈ [qx,∞)

One can show that f1 satisfies the (I) and (II)-inequalities and is C2. Thus b ≤ f1. One can continue

picking a sequence of decreasing εi’s, and as εi → 0, b ≤ lim fi = ((II)ι,q, Lq).
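The smoothing piece on [qx − ε, qx] can be written in closed form, which makes the C2 pasting easy to check. The sketch below reads the inner integral as running from 0 to t, so that the second derivative decays linearly from m1 to 0. That reading is an assumption about the intended construction. The shorthand σ = x − (qx − ε) is introduced here for illustration.

```latex
% With \sigma = x - (q_x - \varepsilon) \in [0, \varepsilon]:
f_1''(x) = m_1\Big(1 - \frac{\sigma}{\varepsilon}\Big), \qquad
f_1'(x)  = l_1 + m_1\Big(\sigma - \frac{\sigma^2}{2\varepsilon}\Big), \qquad
f_1(x)   = y_1 + l_1\sigma + m_1\Big(\frac{\sigma^2}{2} - \frac{\sigma^3}{6\varepsilon}\Big).
% Left end (\sigma = 0):  f_1 = y_1,\; f_1' = l_1,\; f_1'' = m_1,
%   matching the (II)-curve at q_1 up to second order.
% Right end (\sigma = \varepsilon):
f_1(q_x) = y_1 + l_1\varepsilon + \frac{m_1\varepsilon^2}{3}, \qquad
f_1'(q_x) = l_1 + \frac{m_1\varepsilon}{2}, \qquad
f_1''(q_x) = 0 ,
%   matching the linear piece, whose slope is l_1 + m_1\varepsilon/2.
```

As ε → 0 the slope of the linear piece tends to l1 and q1 tends to q, which is consistent with the stated limit of the fi.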

((II)ι,q,Lq) - Lower Bound

This is the intrinsic value set of a family of contracts so b ≥ ((II)ι,q,Lq).

((I)ι,p, (II)p,q,Lq) - Upper Bound


1.) Existence of p when κ lies at or above Sιc

Define C to be (I)ι,κ. Since ι lies strictly above (I)κ,0, Lemma 20 implies dC|κ > 0. Thus C intersects

the left half of the U-curve at a point p′. Suppose dC|p′ ≥ d(II)p′|p′. Then by Lemma 28, C and

(II)p′ would not intersect on or inside the U-curve except at p′, and so κ ∉ C, a contradiction. So

dC|p′ < d(II)p′|p′. Let q be a point moving further up and left along the U-curve. Then by Lemma

23, d(II)q|q is decreasing. By Lemma 20, d(I)ι,q|q is increasing. So there is a unique point p where

they are equal.

2.) Existence of p when κ lies under Sιc

Define C to be Sιc. In this case, p′ clearly exists. Again, suppose dC|p′ ≥ d(II)p′|p′. By Lemma 26,

C hits l+c at a point q1, and by Lemma 23, dC|q1 = −c = d(II)q1|q1. This contradicts Lemma 28 and

so dC|p′ < d(II)p′|p′. The existence of p follows as above.

The existence of q is clear. By Lemma 28 both the (I)- and (II)-curves of our candidate b satisfy both inequalities. The exact same construction of the fi’s in the previous case works here too,

implying b ≤ ((I)ι,p, (II)p,q,Lq).

((I)ι,p, (II)p,q,Lq) - Lower Bound (Heuristic Algorithm Approach)

Let us begin with C. Let p1 be the point on C with the highest (II)-type and let q1 = (II)p1 ∩ lκ.

Let ρ1 be the point on (II)p1 with the highest (I)ι-type. Let p2 be the point on (I)ι,ρ1 with the

highest (II)-type and let q2 = (II)p2 ∩ lκ. Successively, we may define ρ2, p3, q3, ρ3, . . .

1.) ((I)ι,pn , (II)pn,qn ,Lqn) and ((I)ι,ρn , (II)ρn,qn ,Lqn) are attainable.

C is the value set of a family of contracts. The addition of the curve (II)p1 and then Lq1 correspond

to amendments of contracts so that C ∪ (II)p1 ∪ Lq1 is also the value set of a family of contracts,

and the upper shell of this value set, which by definition is attainable, is ((I)ι,p1 , (II)p1,q1 ,Lq1).

After the next addition, the value set becomes C ∪ (II)p1 ∪ Lq1 ∪ (I)ι,ρ1 with an upper shell of

((I)ι,ρ1 , (II)ρ1,q1 ,Lq1). Thus the result holds for n = 1. The proof for larger n follows successively

with the same argument.

2.) limn→∞ pn = limn→∞ ρn = p and limn→∞ qn = q

Each pn lies outside the U-curve and each ρn lies inside the U-curve. Let (I)ι,ρ∗ = limn→∞ (I)ι,ρn and (II)p∗ = limn→∞ (II)pn. Then ρ∗ ∈ (II)p∗ and d(I)ι,ρ∗|ρ∗ = d(II)p∗|ρ∗. If ρ∗ is not on the U-curve, then it lies inside the U-curve and by Lemma 28, (I)ι,ρ∗ lies above (II)p∗ near ρ∗, which is a contradiction, since every (I)ι,ρn is below (II)pn+1 near ρn. Thus ρ∗ is on the U-curve and therefore must be p.

Similarly, the fact that every (II)pn lies below (I)ι,ρn implies the same relation in the limit, which shows that p∗ must lie on the U-curve and therefore must be p as well. That qn → q follows immediately.

((I)ι,p, (II)p,q,Lq) - Lower Bound (Type 4 Approach)

As an application of Lemma 31, there exists a value set ((I)ι,p1∪ (II)p2) lying just below and ε-close

to ((I)ι,p, (II)p). On the (II)-curve branch of this value set there also exists a q2 where we can paste

a Lq2 in a C1 way, lying just below Lq. The entire set ((I)ι,p1∪ (II)p2 ,Lq2) is a value set and as

ε→ 0, it converges to ((I)ι,p, (II)p,q,Lq).

((I)ι,κ, κ,Lκ) - Upper Bound

Let d be the S-type of (I)κ,0 and let u be the intersection of (I)κ,0 and the line x = K. Then u lies

above ι so that uy ≥ L. Then Sud satisfies the conditions of Lemma 33, so b ≤ Sud. Since κ ∈ Sud and is attainable, κ is on b. By Lemma 25, (I)ι,κ satisfies the (II)-inequality so by Lemma 33,

(I)ι,κ ≥ b. Similarly, Lκ satisfies the (II)-inequality and so by Lemma 33 it is also ≥ b. Thus

b ≤ ((I)ι,κ, κ,Lκ).

((I)ι,κ, κ,Lκ) - Lower Bound

This is the value set of a family of contracts, so b ≥ ((I)ι,κ, κ,Lκ).

Case 2: r < γ and c < 1

Proposition 36. If ι ∈ Rco then the optimal value function is

(Lι)

Otherwise,

I. dSι(κx) < 0

• If Sι satisfies the (II)-inequality, then the optimal value function is

(Sι)

This occurs only when κ lies strictly under Sι.


• If Sι does not satisfy the (II)-inequality; and either κ lies at or below Sι, or if not, then

d(I)ι,κ|κ < 0; and ι lies outside the U-curve, then there exist points p and q to the left of κ

such that p is the unique point on the U-curve such that d2(I)ι,p = d2(II)p and q is the unique

point on (II)p such that d(II)p|q = dSq|q and the optimal value function is

((I)ι,p, (II)p,q, Sq)

• If Sι does not satisfy the (II)-inequality; and either κ lies at or below Sι, or if not, then

d(I)ι,κ|κ < 0; and ι lies inside the U-curve, then there exists a q to the left of κ where q is

the unique point on (II)ι such that d(II)ι|q = dSq|q and the optimal value function is

((II)ι,q, Sq)

• If Sι does not satisfy the (II)-inequality, d(I)ι,κ|κ ≥ 0, then the optimal value function is

((I)ι,κ, κ, Sκ)

II. dSι(κx) = 0

• If κ lies at or below Sι then the optimal value function is

(Sι)

• If κ lies at or above Sι then the optimal value function is

((I)ι,κ, κ, Sκ)

III. dSι(κx) > 0

• If Sι satisfies the (II)-inequality, then the optimal value function is

(Sι)

This occurs only when κ lies strictly under Sι.

• If Sι does not satisfy the (II)-inequality; and either κ lies at or below Sι, or if not, then

dSκ|κ > 0; then there exist points p and q to the right of κ such that p is the unique

point on the U-curve such that d2(II)p = d2Sp and q is the unique point on (II)p such that

d(II)p|q = d(I)ι,q|q and the optimal value function is

((I)ι,q, (II)q,p, Sp)

• If Sι does not satisfy the (II)-inequality; and dSκ|κ ≤ 0; then the optimal value function is

((I)ι,κ, κ, Sκ)

Proof.

(Lι)

This curve satisfies the (I)-inequality since it lies in Rco. By Lemma 29, it also satisfies the (II)-

inequality since ι lies above lκ.

I. (Sι)

Sι satisfies the (I)-inequality, is the value set of a contract, and by assumption satisfies the (II)-

inequality. So by Lemma 33, b = (Sι).

I. ((I)ι,p, (II)p,q, Sq) - Upper Bound

The existence of p follows exactly from the ((I)ι,p, (II)p,q,Lq) proof found in Proposition 35. Since

(I)ι,p lies above Sι, Lemma 20 implies d(II)p|p = d(I)ι,p|p > dSι|p. Of course, d(II)p|κ = −∞ <

dSκ|κ, so by the mean value theorem, there is a point q on (II)p where d(II)p|q = dSq|q. That this

point is unique follows from Lemma 28.

Now pick out q1, l1, m1, and y1 as in the ((I)ι,p, (II)p,q,Lq) case from the previous Proposition.

Also define f1(x) on the interval [qx − ε1, qx] as before. On this interval, f1 satisfies the (II)-

inequality. At the left endpoint, it satisfies the (I)-inequality strictly because the (II)-curve does.

However, at the right end point, f′′ = 0, f′ < 0, x < κx, and f(x) < (µ+η)/r, so that the (I)-inequality

is not satisfied. Thus there must be a leftmost value x∗ ∈ [qx − ε1, qx] such that the (I)-inequality

is satisfied with equality. Let q∗ = (x∗, f1(x∗)). Let l′ = f ′1(x∗) and consider (I)q∗,l′ . Since f1

satisfies the (I)-inequality on [qx − ε1, x∗], it lies above (I)q∗,l′ on this interval. Thus (I)q∗,l′

intersects the leftward extension of Sq at a point ρ. Let d be the S-type of (I)q∗,l′ . Then by Lemma


20, Sρd and Sρ do not intersect outside of ρ. Thus the C2 curve ((I)ι,q1 , $f_1^{q_1,q^*}$, Sq∗), which is an

upper bound by Lemma 33, is ≥ ((I)ι,p, (II)p,q, Sq). Picking successively smaller εi, we get that

b ≤ ((I)ι,p, (II)p,q, Sq).

I. ((I)ι,p, (II)p,q, Sq) - Lower Bound

Both the Heuristic Algorithm and the Type 4 Approach are very similar to those used in the previous Proposition.

I. ((II)ι,q, Sq) - Upper Bound

If dSι|ι ≥ d(II)ι|ι then by Lemma 29, Sι would satisfy the (II)-inequality, which contradicts our

assumption. Thus dSι|ι < d(II)ι|ι and just as in the previous case, the mean value theorem and

Lemma 28 imply the existence of a unique q. Lemma 29 now implies that Sq satisfies the (II)-inequality, and using the C2-approximation method from the previous case we get b ≤ ((II)ι,q, Sq).

I. ((II)ι,q, Sq) - Lower Bound

This curve is the value set of a family of contracts.

I. ((I)ι,κ, κ, Sκ)

In this case, κ lies above Sι, so Lemma 22 and the assumption of I. imply dSκ|κ < dSι(κx) < 0.

Now by Lemma 25, (I)ι,κ and Sκ satisfy the (II)-inequality. The rest of the proof is identical to the

((I)ι,κ, κ, Lκ) case in Proposition 35.

II. (Sι)

Sι is an increasing concave function to the left of vκ, Sι(κx) ≥ κx, and Sι is a decreasing concave

function to the right of vκ. Given Lemma 25 it is clear that such a curve satisfies the (II)-inequality.

Lemma 33 and the fact that (Sι) is a value set do the rest.

II. ((I)ι,κ, κ, Sκ)

The proof is almost identical to the previous I. ((I)ι,κ, κ, Sκ) case. The inequality now reads

dSκ|κ < dSι(κx) = 0.

III. (Sι)


Same reasoning as I. (Sι).

III. ((I)ι,q, (II)q,p, Sp) - Upper Bound

Existence and Uniqueness of p

If κ is under Sι then define C to be Sι, otherwise define C to be Sκ. Let p1 be the intersection

of C with the right half of the U-curve. If dC|p1 ≤ d(II)p1 |p1 , then Lemma 28 would imply that C

satisfied the (II)-inequality in the first case and that C(κx) > κy in the second case. Both contradict

assumptions. So dC|p1 > d(II)p1 |p1 , and the existence of a p follows the same logic as in Proposition

35.

Existence and Uniqueness of q

Since this p will lie above Sι, d(I)ι,p|p > dSp|p = d(II)p|p (need to reference a lemma). Now in the

first case let q1 be the intersection of Sι and (II)κ,p and in the second case let q1 = κ. In either

case, d(I)q1 |q1 < d(II)p|q1 . Thus by the mean value theorem there is a point q in between q1 and

p that satisfies d(I)κ,q|q = d(II)p|q. If there were two such points, then Lemma 20 would imply a

contradiction.

That (I)ι,q satisfies the (II)-inequality follows from Lemma 25 and Lemma 28. Using the same

type of argument employed in I. ((I)ι,p, (II)p,q, Sq) - Upper Bound, we get b ≤ ((I)ι,q, (II)q,p, Sp).

III. ((I)ι,q, (II)q,p, Sp) - Lower Bound

Similar to I. ((I)ι,p, (II)p,q, Sq) - Lower Bound.

((I)ι,κ, κ, Sκ)

The proof is the same as before.

Case 3: r = γ

Proposition 37. If c < 1 and ι ∈ Rco the optimal value function is

(Lι)

If c < 1 and ι ∈ R then either κ lies at or above Sι in which case the optimal value function is

((I)ι,κ, κ, Sκ)


or κ lies strictly below Sι and the optimal value function is

(Sι)

If c ≥ 1 and ι lies at or above lκ then the optimal value function is

(Lι)

If c ≥ 1 and ι lies at or above lc and at or below lκ then the optimal value function is

((II)ι, κ,Lκ)

If c ≥ 1 and ι lies at or below lc then the optimal value function is

((I)ι,κ, κ,Lκ)

Proof.

c < 1: (Lι)

By Lemma 29, (Lι) satisfies the (II)-inequality.

c < 1: ((I)ι,κ, κ, Sκ) - Upper Bound

Let u be the unique point with ux = K such that Su goes through κ. Since κ lies above Sι, so

u lies above ι. Also, by Lemma 25, Su satisfies the (II)-inequality. Lemma 33 now implies that

(Su) is an upper bound on b. Thus κ ∈ b. Applying Lemma 25 to both (I)ι,κ and Sκ, we see

both parts satisfy the (II)-inequality. In previous cases when b exhibited this form, it was always

the case that d(I)ι,κ|κ ≥ 0. This may not be the case now. However, Lemma 22 still implies

d(I)ι,κ|κ ≥ dSι(κx) > −1. We are now able to apply Lemma 33, and so b ≤ ((I)ι,κ, κ, Sκ).

c < 1: ((I)ι,κ, κ, Sκ) - Lower Bound

c < 1: (Sι)

Lemma 25 implies the upper bound property.

c ≥ 1: (Lι)


In this region, (Lι) satisfies the hypothesis of Lemma 29 and so satisfies the (II)-inequality.

c ≥ 1: ((II)ι, κ, Lκ)

In this region, by the r = γ version of Lemma 28, the (II)-curve satisfies the (I)-inequality. By

Lemma 29, Lq satisfies the (II)-inequality, and the two curves put together is C1. The rest of the

proof is the same as ((II)ι,q, Lq) in Proposition 35.

c ≥ 1: ((I)ι,κ, κ,Lκ)

lc is a solution to both the (I)- and (II)-ODE. It also lies above ι so b ≤ lc. Thus κ is on b. Now

(I)ι,κ is a concave curve with d(I)ι,κ|κ > −1/c ≥ −1. By Lemma 25, both (I)ι,κ and Lκ satisfy the

(II)-inequality. Now Lemma 33 does the rest.

Part II: K ≥ −h(η−A)/γ

Case 4: c < 1

Lemma 38. Suppose that dSκ|κ > 0 and r < γ. Then there is a unique point p on the right side

of the U-curve such that the curve ((II)p, Sp) is C2. Let v be the unique point on the vκ line such

that Sp ⊂ Sv. Now for r ≤ γ, for each a > κx there is a unique point u(a) with u(a)x = a such

that dSu(a)|u(a) = d(II)u(a)|u(a).

Proof. The proof of the existence of p in this lemma is similar to the proofs for the existence of p

found in the ((I)ι,p, (II)p,q, Sq) cases. For each a > κx let s(a) denote the unique point on lκ with

s(a)x = a, and let t(a) be the unique point on l with t(a)x = a. We have

−1 = d(II)s(a)|s(a) < dSs(a)|s(a) and d(II)t(a)|t(a) > dSt(a)|t(a) = −1

So by the intermediate value theorem, such a u(a) must exist.

Proposition 39. If ι ∈ Rco then the optimal value function is

(Lι)

Otherwise,

I. If r < γ, dSκ|κ > 0 and


• if ι lies at or above (I)v,p or the locus of points u(a) for a ≥ px then the optimal value function

is

(Sι)

• if ι lies strictly in between Sv and (II)κ,p then there exists a unique q ≠ p on (II)p such that

d(II)p|q = d(I)ι,q and the optimal value function is

((I)ι,q, (II)q,p, Sp)

• if ι lies at or below (II)κ,p then ι ≠ i. Now, i is the unique point on (II)p which lies above ι

and the optimal value function is

((II)i,p, Sp)

• if ι lies strictly below the locus of points u(a) for a ≥ px then i = (K,u(K)) and the optimal

value function is

(i, Si)

II. If r < γ and dSκ|κ ≤ 0; or r = γ; and

• if ι lies at or above the locus of points u(a) for a ≥ κx then the optimal value function is

(Sι)

• if ι lies strictly below the locus of points u(a) for a ≥ κx then i = (K,u(K)) and the optimal

value function is

(i, Si)

Proof.

(Lι)

Lι satisfies the (II)-inequality by Lemma 25 and is attainable. By Lemma 33, b = (Lι).

I. and II. (Sι)

If ι lies on or outside the U-curve then ιx ≥ px and by assumption, ιy ≥ u(ιx). Let u(ι) denote the

point (ιx, u(ιx)). By the concavity of S-curves and Lemma 22, dSι|ι ≤ dSu(ι)|u(ι). By Lemma 23,

d(II)u(ι)|u(ι) < d(II)ι|ι. Thus dSι|ι ≤ d(II)ι|ι. Now by Lemma 28, Sι satisfies the (II)-inequality.


Lemma 33 does the rest.

If ι lies inside the U-curve then let p′ = Sι ∩ U-curve. p′ lies further up along the convex U-curve than p, so d(II)p′|p′ > d(II)p|p. Since the U-curve is increasing from p to p′, by Lemma 22,

dSp|p > dSι|p′ . Thus d(II)p′ |p′ > dSι|p′ . Now Lemma 28 implies Sι satisfies the (II)-inequality and

Lemma 33 finishes the proof.

I. ((I)ι,q, (II)q,p, Sp)

This case is similar to Proposition 36: III. ((I)ι,q, (II)q,p, Sp).

I. ((II)i,p, Sp) - Upper Bound

By Lemma 30 and the previous case, b(K) ≤ i. The (II)-curve and S-curve satisfy the two inequalities by Lemma 28. Lemma 33 implies the result.

I. ((II)i,p, Sp) - Lower Bound

From the proof of ((I)ι,q, (II)q,p, Sp) in Proposition 36 we know that we can construct Type 4

contracts with value graphs approaching ((II)p, Sp). Any such contract with parameter W0 > κx

has a future value process Wt > min{W0, px} for all t > 0. Thus any Type 4 contract in that

model with parameter value greater than the ιx of this model remains incentive compatible in this

model. In particular, attainable payoffs with W0 ≥ ιx in that model are attainable here as well. So

b ≥ ((II)i,p, Sp).

I. and II. (i, Si) - Upper Bound

By Lemma 30 and the I and II. (Sι) case, b(K) ≤ i. The curve satisfies the (II)-inequality by

Lemma 25.

I. and II. (i, Si) - Lower Bound

This is an application of the second case of Lemma 31.

Case 5: c ≥ 1


Proposition 40. If ι lies at or above lκ then the optimal value function is

(Lι)

If ι lies at or below lκ then i is the point on lκ directly above ι and the optimal value function is

(i,Li)

Proof.

(Lι)

Lemma 29 implies the (II)-inequality holds. It is also a value graph.

(i,Li) - Upper Bound

By Lemma 30 and the previous case, b(K) ≤ i. By Lemma 29 the (II)-inequality holds.

(i,Li) - Lower Bound

This is the special case when the limit of an infinite sequence of improving contracts is a contract.

Indeed i is achieved by the principal paying a constant stream K−h(η−A)γ to the agent. With this

stream and working at low effort, the agent receives value γ(K−h(η−A)γ ) + h(η − A) = K from the

contract. The principal receives µ+η−Ar − r

γ (K − h(η − A)). The payoff pair from this contract

clearly equals i.

Applications

Proof of Proposition 14 - Optimality When c < 1 and A Large.

The linear portion of Sι satisfies the (II)-inequality as long as κ lies below Sι. To see that the

nonlinear portion satisfies the (II)-inequality for all large A, let us rewrite the (II)-inequality as

follows

ry ≥ µ+ η −A+ (γx+ h(η −A))y′ = µ+ η + (γx+ C + cη)y′ + (−cy′ − 1)A

The y and x values are bounded in the nonlinear portion, and y′ > −1 implies −cy′−1 < c−1 < 0.

Thus for all sufficiently large A, the inequality must hold.
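The regrouping in the displayed inequality can be spelled out term by term. The sketch below reads h(η − A) = C + c(η − A) off the display itself; that affine form is an inference from this identity, not a definition quoted from the paper. It also assumes 0 < c < 1.

```latex
\mu + \eta - A + \big(\gamma x + h(\eta - A)\big)\,y'
  \;=\; \mu + \eta - A + \big(\gamma x + C + c\eta - cA\big)\,y'
  \;=\; \mu + \eta + \big(\gamma x + C + c\eta\big)\,y' + (-c\,y' - 1)\,A .
% With y' > -1 and 0 < c < 1 (assumed):  -c\,y' < c, so
-c\,y' - 1 \;<\; c - 1 \;<\; 0 ,
% and the term (-c y' - 1)A is bounded above by (c - 1)A, which tends to
% -\infty as A grows, while x, y and the remaining terms stay bounded
% on the nonlinear portion.
```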


Proof of Proposition 15 - Optimality When c = 1 and A→∞.

For all large A the optimal value function is ((I)ι,p(A), (II)p(A),q(A),Lq(A)). Since this curve lies

above Sι, we know that d(I)ι,p(A)|p(A) > −1. We also know that limA→∞ d(II)p(A)|p(A) = −1. Thus

p(A) converges to the intersection of Sι and l, and therefore so does q(A).

References

[1] Abreu, D., Pearce, D., Stacchetti, E. (1986) “Optimal Cartel Equilibria with Imperfect Monitoring,” Journal of Economic Theory Vol. 39, pp. 251 - 269

[2] Abreu, D., Pearce, D., Stacchetti, E. (1990) “Toward a Theory of Discounted Repeated Games

with Imperfect Monitoring,” Econometrica Vol. 58, pp. 1041 - 1063

[3] Biais, B., Mariotti, T., Plantin, G., and Rochet, J.-C. (2007) “Dynamic Security Design:

Convergence to Continuous Time and Asset Pricing Implications,” Review of Economic Studies

Vol. 74, pp. 345 - 390

[4] DeMarzo, P. M. and Fishman, M. J. (2007) “Optimal Long-Term Financial Contracting,” The

Review of Financial Studies Vol. 20, pp. 2079 - 2128

[5] DeMarzo, P. M. and Sannikov, Y. (2006) “Optimal Security Design and Dynamic Capital

Structure in a Continuous-Time Agency Model,” Journal of Finance Vol. 61, pp. 2681 - 2724

[6] Durrett, R. Probability: Theory and Examples. Duxbury Press, Belmont, CA, 2008

[7] Fong, K. (2008) “Evaluating Skilled Experts: Optimal Scoring Rules for Surgeons,” working

paper, Stanford University

[8] Fudenberg, D., Holmstrom, B., and Milgrom, P. (1990) “Short-Term Contracts and Long-Term

Agency Relationships,” Journal of Economic Theory Vol. 51, pp. 1-31

[9] Fudenberg, D. and Tirole, J. (1990) “Moral Hazard and Renegotiation in Agency Contracts,”

Econometrica Vol. 58, pp. 1279 - 1319

[10] Gibbons, R. and Murphy, K. J. (1992) “Optimal Incentive Contracts in the Presence of Career

Concerns: Theory and Evidence,” Journal of Political Economy Vol. 100, pp. 468 - 505

[11] Green, E. (1987) “Lending and the Smoothing of Uninsurable Income,” in E. C. Prescott and

N. Wallace (eds.) Contractual Arrangements for Intertemporal Trade (Minneapolis: University

of Minnesota Press) pp. 3 - 25


[12] Grossman, S. J. and Hart, O. D. (1983) “An Analysis of the Principal-Agent Problem,” Econometrica Vol. 51, pp. 7 - 45

[13] Harrison, M. and Kreps, D. (1979) “Martingales and Arbitrage in Multiperiod Securities Markets,” Journal of Economic Theory Vol. 20, pp. 381 - 408

[14] Holmstrom, B. and Milgrom, P. (1987) “Aggregation and Linearity in the Provision of Intertemporal Incentives,” Econometrica Vol. 55, pp. 303 - 328

[15] Holmstrom, B. and Tirole, J. (1997) “Financial Intermediation, Loanable Funds, and the Real

Sector,” Quarterly Journal of Economics Vol. 112, pp. 663 - 691

[16] Innes, R. (1990) “Limited Liability and Incentive Contracting with Ex-Ante Action Choices,”

Journal of Economic Theory Vol. 52, pp. 45 - 67

[17] Jewitt, I. (1988) “Justifying the First-Order Approach to Principal-Agent Problems,” Econometrica Vol. 56, pp. 1177 - 1190

[18] Karatzas, I. and Shreve, S. Brownian Motion and Stochastic Calculus. Springer-Verlag, New

York, 1991

[19] Malcomson, J. M. and Spinnewyn, F. (1988) “The Multiperiod Principal-Agent Problem,”

Review of Economic Studies Vol. 55, pp. 391 - 407

[20] Phelan, C. and Townsend, R. (1991) “Computing Multi-Period, Information-Constrained Optima,” Review of Economic Studies Vol. 58, pp. 853 - 881

[21] Piskorski, T. and Westerfield, M. (2009) “Debt Covenants and Distressed Equity Issuance:

Optimal Contracting in the Presence of Monitoring,” working paper, Columbia Business School

and USC Finance

[22] Radner, R. (1981) “Monitoring Cooperative Agreements in a Repeated Principal-Agent Relationship,” Econometrica Vol. 49, pp. 1127 - 1148

[23] Radner, R. (1985) “Repeated Principal-Agent Games with Discounting,” Econometrica Vol.

53, pp. 1173 -1198

[24] Rogers, L. C. G. and Williams, D. Diffusions, Markov Processes, and Martingales Vol. I and

II Cambridge University Press, New York, 2000

[25] Rogerson, W. P. (1985a) “Repeated Moral Hazard,” Econometrica Vol. 53, pp. 69 - 76


[26] Rogerson, W. P. (1985b) “The First-Order Approach to Principal-Agent Problems,” Econometrica Vol. 53, pp. 1357 - 1367

[27] Sannikov, Y. (2007a) “Agency Problems, Screening, and Increasing Credit Lines,” working

paper, Princeton University

[28] Sannikov, Y. (2007b) “Games with Imperfectly Observable Actions in Continuous Time,”

Econometrica Vol. 75, pp. 1285 - 1329

[29] Sannikov, Y. (2008) “A Continuous-Time Version of the Principal-Agent Problem,” Review of

Economic Studies Vol. 75, pp. 957 - 984

[30] Spear, S. E. and S. Srivastava (1987) “On Repeated Moral Hazard with Discounting,” Review

of Economic Studies Vol. 54, pp. 599 - 617

[31] Stokey, N., Lucas, R., and Prescott, E. Recursive Methods in Economic Dynamics Harvard

University Press, Cambridge, 1989

[32] Williams, N. (2008) “On Dynamic Principal-Agent Problems in Continuous Time,” working

paper, University of Wisconsin
