Lecture Notes for Econ 101A

David Card∗

Dept. of Economics, UC Berkeley

∗ The manuscript was typeset by Daniel Nolan in LaTeX. The figures were created in Asymptote, Inkscape, R, and Excel (the majority in Inkscape). Please address comments/corrections to daniel [email protected], with “Card Lecture Notes” in the subject line.

Contents

1 Optimization
    1.1 Unconstrained Optimization
    1.2 Constrained Optimization
    1.3 Appendix
        1.3.1 Convexity
        1.3.2 SOC in Higher Dimensions

2 Consumer Choice
    2.1 Budget Constraint
    2.2 Consumer’s Objective
    2.3 Consumer’s Optimum
    2.4 Special Problems

3 Two Applications of Indifference Curve Analysis
    3.1 Analysis of a Subsidy
    3.2 The Consumer Price Index

4 Indirect Utility and the Expenditure Function
    4.1 Indirect Utility
    4.2 Expenditure Function

5 Comparative Statics of Consumer Choice
    5.1 Change in Demand with Respect to Income, Engel Curves
    5.2 Change in Demand with Respect to Price
    5.3 Graphical Decomposition of a Change in Demand
    5.4 Substitution Effect
    5.5 Income Effect

6 Slutsky’s Equation
    6.1 Review
    6.2 Slutsky Decomposition

7 Using Market Level Demand Curves
    7.1 An Increase in Income
    7.2 Tax Incidence

8 Labor Supply

9 Intertemporal Consumption

10 Production and Cost I
    10.1 One-Factor Production and Cost Functions
        10.1.1 Production Functions
        10.1.2 Cost Functions
        10.1.3 Connection between MC and MP
        10.1.4 Geometry of c, AC, and MC

11 Production and Cost II
    11.1 Derivation of the Cost Function
    11.2 Marginal Cost

12 Cost Functions and IRFs
    12.1 Shephard’s Lemma

13 Supply
    13.1 Supply Determination
    13.2 The Law of Supply
    13.3 Changes in Input Prices

14 Input Demand for a Competitive Firm

15 Industry Supply

16 Monopoly I
    16.1 Monopolist’s Objective
    16.2 Comparative Statics
    16.3 Monopoly in Two or More Markets

17 Monopoly II

18 Consumer’s Surplus

19 Duopoly
    19.1 Monopolization
    19.2 Duopoly Equilibrium
    19.3 Price Setting vs. Quantity Setting

20 Symmetric Cournot Equilibria
    20.1 n-Firm Symmetric Cournot Equilibria
    20.2 Alternatives to the Cournot Assumption

21 Game Theory I

22 Game Theory II
    22.1 Tree Diagrams
    22.2 Interpretation

23 Uncertainty I: Income Lotteries
    23.1 Review of Basic Statistical Concepts
    23.2 Choices Over Uncertain Incomes

24 Uncertainty II: Expected Utility
    24.1 Expected Utility
    24.2 The Demand for Insurance

25 Uncertainty III: Moral Hazard
    25.1 Solution with No Moral Hazard
    25.2 A Partial Solution

26 Uncertainty IV: The State-preference Approach and Adverse Selection
    26.1 Setup
    26.2 Adverse Selection

27 Auctions I: Types of Auctions
    27.1 Basic Types of Auction
    27.2 Important Results Concerning the Private Values Case
    27.3 Bidding in a First-price Auction

28 Auctions II: Winner’s Curse
    28.1 Appendix: Order Statistics
        28.1.1 Uniform Distribution
        28.1.2 Normal Distribution

29 Finance I: Capital Asset Pricing Model
    29.1 Assumptions
    29.2 Conclusion
    29.3 Summary

30 Finance II: Efficient Market Hypothesis
    30.1 Review
    30.2 Efficient Market Hypothesis

31 Public and Near-public Goods
    31.1 Optimal Provision of Goods with No-rivalry Characteristics
        31.1.1 Case 1: one consumer; $x = t_1/p$.
        31.1.2 Case 2: two consumers; $x = (t_1 + t_2)/p$.
        31.1.3 Case 3: $n$ consumers; $x = \tau/p$, where $\tau = \sum_{i=1}^{n} t_i$.
    31.2 Appendix: Social Optimum with Ordinary Goods

32 Externalities
    32.1 Consumption Externalities
        32.1.1 Market Equilibrium
        32.1.2 Social Optimum
        32.1.3 Market Equilibrium versus Social Optimum
        32.1.4 Other Examples
    32.2 Production Externalities

33 Empirical Methods in Microeconomics
    33.1 Experiments and Counterfactuals
        33.1.1 The Self Sufficiency Project (SSP)
    33.2 Research Designs Based on Natural Experiments
        33.2.1 The Mariel Boatlift
    33.3 Natural Experiments with Several Control Groups
        33.3.1 The New Jersey Minimum Wage
    33.4 The Discontinuity Research Design

Course Description

This is a course in intermediate microeconomics, emphasizing the applications of calculus and linear algebra to the problems of consumer choice, firm behavior, and market interactions. Students are presumed to be familiar with multivariate calculus (including e.g. limits, derivatives, integrals) and with basic statistics (random variables, moments, etc.). The course material will be presented in a fairly mathematical way and the problem sets and examinations will require you to apply models and derive results. Students who are concerned about their mathematical ability should consider Econ 100A.

The basic text is Microeconomic Theory: Basic Principles and Extensions, by Nicholson & Snyder, which should be available at the campus book store. An alternative, slightly more theoretical treatment of the same material is Varian’s Intermediate Microeconomics: A Modern Approach. Another, slightly more application-oriented alternative is Perloff’s Microeconomics: Theory and Applications with Calculus. Any of these is a good supplement to the lectures, but the lectures will be at a somewhat higher level, and will not follow the texts closely.

Problem sets and practice exams will be made available on the course website.

The GSIs will present some additional material in section (for which all students will be responsible) and also will review the solutions to problem sets, practice exams, problems from the lectures, etc.

Problem sets will be assigned most weeks throughout the course. Completed problem sets are due at the end of the last lecture each week. We will not accept late problem sets. Instead, we drop your two worst scores. Thus, you can miss up to two problem sets without any penalty. You are encouraged to work in groups but every student must hand in his or her own version of the solutions.

Course grades will be determined by a combination of weekly problem sets (20 percent), two midterm exams (15 percent each), and a final exam (50 percent). The midterm exams will be held in class.


Lecture Topics

1 Methods of Optimization

2 Consumer Choice

3 Applications of Indifference Curve Analysis, Expenditure Function

4 Comparative Statics, Slutsky’s Equation

5 Market Level Demand and Supply

6 Labor Supply

7 Intertemporal Consumption & Savings

8–9 Production & Cost, Shephard’s Lemma

10–11 Supply Determination

12 Monopoly and Price Discrimination

13 Consumer/Producer Surplus & Applications

14–15 Duopoly

16–17 Game Theory

18–21 Uncertainty and Insurance Markets

22–23 Auctions

24–25 Finance: CAPM and Efficient Markets

26–27 Public Goods, Externalities

28 Empirical Methods in Microeconomics


1 Optimization

1.1 Unconstrained Optimization

Consider a smooth function $y = f(x)$. How do we go about finding a point $x_0$ such that $y_0 = f(x_0) \ge f(x)$ for every $x$ in $[a, b]$?

Figure 1.1: In this picture $f(x_0) = \max_{a \le x \le b} f(x)$. (Read: “$f(x_0)$ is the maximum value of $f(x)$ when $x$ is selected from the interval $[a, b]$.”)

What can we say generally? If $x_0$ is a candidate for a maximizer in the interior of $[a, b]$, then it must be the case that we can’t move a small distance away from $x_0$, in either direction, and reach a higher value of $f$. But this means $f'(x_0) = 0$. Why? Let $0 < h \ll 1$.

If $f'(x) > 0$, then $f(x + h) \approx f(x) + h f'(x) > f(x)$.

If $f'(x) < 0$, then $f(x - h) \approx f(x) - h f'(x) > f(x)$.

In either case a small move raises the value of $f$, so $x$ cannot be a maximizer.

This leads us to Rule 1:

If $f(x_0) = \max_{a \le x \le b} f(x)$ and $a < x_0 < b$, then $f'(x_0) = 0$.

This is called the first order necessary condition (FONC) for an interior maximum.

Does $f'(x_0) = 0$ always mean that $x_0$ is a maximizer? Are there maximizers with $f'(x_0) \ne 0$? Consider the examples illustrated in Figure 1.3.

How can we be certain that we have located a maximum (not a minimum, nor an inflection point)? We examine the properties of $f'(x)$, which is itself a function of $x$. Take a look at Figure 1.4. As $x$ passes through $x_0$ from left to right, $f'$ goes from positive to negative, i.e. it’s decreasing. On the other hand, as $x$ passes through $x_1$ from left to right, $f'$ goes from negative to positive, i.e. it’s increasing. In general, at a local maximum $f'(x)$ has negative slope, or in other words $f''(x) < 0$, while at a local minimum $f'(x)$ has positive slope, that is $f''(x) > 0$.

These considerations lead us to Rule 2:

If $f'(x_0) = 0$ and $f''(x_0) < 0$, then $f(x_0)$ is a local maximum.

If $f'(x_0) = 0$ and $f''(x_0) > 0$, then $f(x_0)$ is a local minimum.
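Rules 1 and 2 can be tried out numerically. The following is a minimal Python sketch, not part of the original notes: it estimates $f'$ and $f''$ by finite differences and classifies a candidate point. The test function $x^3 - 3x$, the step sizes, and the tolerance are illustrative assumptions.

```python
def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def classify_critical_point(f, x0, tol=1e-6):
    """Apply Rule 1 (f' = 0) and then Rule 2 (sign of f'') at x0."""
    if abs(derivative(f, x0)) > tol:
        return "not a critical point"
    curvature = second_derivative(f, x0)
    if curvature < -tol:
        return "local maximum"
    if curvature > tol:
        return "local minimum"
    return "inconclusive"  # f'' = 0: Rule 2 is silent (e.g. an inflection point)


def example(x):
    # Hypothetical test function: f(x) = x^3 - 3x, so f'(x) = 3x^2 - 3.
    return x ** 3 - 3 * x


print(classify_critical_point(example, -1.0))           # f''(-1) = -6
print(classify_critical_point(example, 1.0))            # f''(1) = 6
print(classify_critical_point(lambda x: x ** 3, 0.0))   # f''(0) = 0
```

The last call corresponds to case (c) of Figure 1.3: the first derivative vanishes, but the second-order test cannot tell a maximum from an inflection point.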


Figure 1.2: Notice that Rule 1 also holds for a function of several variables.

Figure 1.3: Limits of Rule 1 and its converse: (a) $f(x) = x$. Thus $f(b) = \max_{a \le x \le b} f(x)$ even though $f'(b) = 1 \ne 0$. The maximum occurs on the boundary. (b) $f'(x) = 0$ has two solutions, $x'$ and $x''$, but neither one is a maximizer on $[a, b]$: $f(x')$ is only a local maximum while $f(x'')$ is a minimum. (c) $f(x) = x^3$. Solving $f'(x) = 0$ gives $x = 0$, which is an inflection point.

Figure 1.4: Properties of $f'(x)$: at a local max $f'$ is decreasing since the slopes of the tangent lines go from positive to negative. The reverse is true for a local min.

This generalizes to two or more dimensions.

How do we determine whether a local maximum is a global maximum? If $f''(x) < 0$ for all $x$ and $f'(x_0) = 0$, then $x_0$ is a global maximum. A function $f$ such that $f''(x) < 0$ for all $x$ is called concave (see Appendix 1.3).

Figure 1.5: A concave function always lies below any line tangent to its graph.

1.2 Constrained Optimization

Now we consider maximizing a function $f(x_1, x_2)$ subject to (“s.t.”) some constraint on $x_1$ and $x_2$, which we denote by $g(x_1, x_2) = g_0$. The two important examples of this in economics are:

• In the study of consumer behavior, maximize utility $u(x_1, x_2)$ s.t. the budget constraint $p_1 x_1 + p_2 x_2 = I$.

• In the study of firm behavior, maximize profit $py - wx$ s.t. the production function $y = f(x)$.

How do we go about a graphical analysis of the problem of maximizing $f(x_1, x_2)$ s.t. $g(x_1, x_2) = g_0$?

Figure 1.6: Illustration of the two-step approach described below.

A two-step approach:

1. Plot the contours of the function $g$. E.g. if $g(x_1, x_2) = x_1^2 + x_2^2$, then $g(x_1, x_2) = k$ is the equation of a circle with radius $\sqrt{k}$ and center $O = (0, 0)$.

2. Plot the contours of the function $f$. E.g. if $f(x_1, x_2) = x_1 x_2$, then $f(x_1, x_2) = m$ is the equation of a hyperbola.

The constrained maximum of the function $f$ occurs where a contour of $f$ is tangent to the contour of $g$ corresponding to $g_0$. Why? Suppose we add a small amount $dx_1$ to $x_1$ in such a way as to keep $g(x_1, x_2)$ constant. If so, then we must have a corresponding reduction in $x_2$ such that the total differential of $g$ is zero, i.e.

\[ dg = g_1(x_1, x_2)\,dx_1 + g_2(x_1, x_2)\,dx_2 = 0 \]

(where $g_i$ denotes $\partial g / \partial x_i$), which implies

\[ \frac{dx_2}{dx_1} = -\frac{g_1(x_1, x_2)}{g_2(x_1, x_2)} \]

If we increase $x_1$ by one unit, we must increase $x_2$ by $-g_1(x_1, x_2)/g_2(x_1, x_2)$ (or, equivalently, decrease $x_2$ by $g_1(x_1, x_2)/g_2(x_1, x_2)$) in order to keep the value of $g$ constant. The net effect of such a change in $x_1$ on the value of $f$ is

\begin{align*}
df &= f_1(x_1, x_2)\,dx_1 + f_2(x_1, x_2)\,dx_2 \\
   &= f_1(x_1, x_2)\,dx_1 + f_2(x_1, x_2) \times \frac{dx_2}{dx_1}\,dx_1 \\
   &= \left[ f_1(x_1, x_2) - f_2(x_1, x_2) \times \frac{g_1(x_1, x_2)}{g_2(x_1, x_2)} \right] dx_1
\end{align*}

Now in order for $(x_1^0, x_2^0)$ to be a constrained maximum, it must be the case that we cannot increase $f$ by adding or subtracting a small amount to $x_1$ while keeping the value of $g$ constant. But this means the bracketed term in the above expression is zero at $(x_1^0, x_2^0)$, so that $df = 0$ for all $dx_1$, or in other words

\[ \frac{f_1(x_1, x_2)}{f_2(x_1, x_2)} = \frac{g_1(x_1, x_2)}{g_2(x_1, x_2)} \]

But this expression says that at $(x_1^0, x_2^0)$, the contours of $f$ and $g$ are tangent, i.e. have the same slope. Note that this argument applies only if $(x_1^0, x_2^0)$ lies in the interior of the domain, for if $(x_1^0, x_2^0)$ lies on the boundary then at least one of $x_1$ or $x_2$ cannot be moved in both directions.

How do we convert a constrained maximization problem into an unconstrained one? A French mathematician named Lagrange noted that one gets the right answer by setting up an artificial, unconstrained maximization problem with an additional variable, $\lambda$:

\[ L(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda [g(x_1, x_2) - g_0] \]

The FONC for $L$ with respect to $x_1$, $x_2$, and $\lambda$ are:

\begin{align*}
L_1 &= f_1(x_1, x_2) - \lambda g_1(x_1, x_2) = 0 \\
L_2 &= f_2(x_1, x_2) - \lambda g_2(x_1, x_2) = 0 \\
L_\lambda &= -[g(x_1, x_2) - g_0] = 0
\end{align*}

Moving the $\lambda$ terms to the right-hand side and dividing the first of these by the second gives

\[ \frac{f_1(x_1, x_2)}{f_2(x_1, x_2)} = \frac{g_1(x_1, x_2)}{g_2(x_1, x_2)} \]

while the third simply restates the constraint! Thus by writing down the Lagrangian $L$ and setting its first derivatives equal to zero we get the necessary conditions for a constrained maximum.
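To see the tangency condition at work, here is a small Python sketch (an illustration, not part of the original notes) that uses the two example functions from the two-step approach, $f = x_1 x_2$ and $g = x_1^2 + x_2^2$. It searches along the constraint circle for the largest value of $f$ and then compares $f_1/f_2$ with $g_1/g_2$ at that point. The value $g_0 = 2$, the restriction to the positive quadrant, and the grid size are assumptions made for the example.

```python
import math


def f(x1, x2):
    return x1 * x2


def constrained_max_on_circle(g0, steps=20_000):
    """Brute-force search for max f on x1^2 + x2^2 = g0 with x1, x2 > 0."""
    radius = math.sqrt(g0)
    best = None
    for i in range(1, steps):
        theta = (math.pi / 2) * i / steps       # angle parametrizes the constraint
        x1, x2 = radius * math.cos(theta), radius * math.sin(theta)
        value = f(x1, x2)
        if best is None or value > best[0]:
            best = (value, x1, x2)
    return best


value, x1, x2 = constrained_max_on_circle(g0=2.0)

# Partial derivatives of the example functions, written out by hand.
f1, f2 = x2, x1            # f = x1 * x2
g1, g2 = 2 * x1, 2 * x2    # g = x1^2 + x2^2

print("maximizer:", (round(x1, 4), round(x2, 4)), "f =", round(value, 6))
print("f1/f2 =", round(f1 / f2, 4), " g1/g2 =", round(g1 / g2, 4))
print("lambda = f1/g1 =", round(f1 / g1, 4))
```

At the maximizer the two ratios agree, which is the tangency condition, and the implied multiplier is $\lambda = f_1/g_1 = f_2/g_2$.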

We also get a new variable, $\lambda$, called the Lagrange multiplier. How do we interpret $\lambda$? It turns out that the value of $\lambda$ tells us how much the maximum value of $f$ changes if we relax the constraint by a small amount. Specifically, suppose we are to maximize $f(x_1, x_2)$ s.t. the constraint $g(x_1, x_2) = g_0$. Call the solution $(x_1^0, x_2^0)$. Now suppose we relax the constraint and instead maximize $f(x_1, x_2)$ s.t. $g(x_1, x_2) = g_0 + dg_0$. How do we change our optimal choices of $x_1$ and $x_2$? Suppose we decide to use more $x_1$, enough to use up the added constraint. Since the total differential of $g$ is

\[ dg = g_1(x_1, x_2)\,dx_1 + g_2(x_1, x_2)\,dx_2 \]

if we change only $x_1$ (that is, if $dx_2 = 0$), the amount we can change $x_1$ while satisfying the new constraint is

\[ dx_1 = \frac{1}{g_1(x_1, x_2)}\,dg_0 \]

The increase in $f$ that accompanies this increase in $x_1$ is

\[ df = f_1(x_1, x_2)\,dx_1 = \frac{f_1(x_1, x_2)}{g_1(x_1, x_2)}\,dg_0 = \lambda\,dg_0 \]

where the last equality uses the first FONC, $f_1 = \lambda g_1$. In other words, $df/dg_0 = \lambda$. You are encouraged to check for yourself that if you were to use up the added constraint on $x_2$, $df$ would again be $\lambda\,dg_0$. This suggests another interpretation of the tangency condition: at a maximum, if we had a bit more constraint, then we would be indifferent as to whether to use it on $x_1$ or $x_2$.
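The interpretation of $\lambda$ can be checked on a concrete case. The Python sketch below (an illustration, not from the original notes) again takes $f = x_1 x_2$ and $g = x_1^2 + x_2^2$, for which the FONC give $x_1 = x_2 = \sqrt{g_0/2}$ in the positive quadrant. It compares $\lambda = f_1/g_1$ with the change in the maximized value of $f$ per unit of extra constraint. The numbers $g_0 = 2$ and $dg_0 = 10^{-6}$ are arbitrary choices.

```python
import math


def solve(g0):
    """Solution of max x1*x2 s.t. x1^2 + x2^2 = g0 (x1, x2 > 0), from the FONC."""
    x1 = x2 = math.sqrt(g0 / 2)
    lam = x2 / (2 * x1)   # lambda = f1/g1, with f1 = x2 and g1 = 2*x1
    return x1, x2, lam


def max_value(g0):
    """Maximized value of f as a function of the constraint level g0."""
    x1, x2, _ = solve(g0)
    return x1 * x2


g0, dg0 = 2.0, 1e-6
_, _, lam = solve(g0)
gain_per_unit = (max_value(g0 + dg0) - max_value(g0)) / dg0

print("lambda          =", lam)
print("d(max f) / dg0 ~=", round(gain_per_unit, 6))
```

Both numbers come out at one half in this example, in line with $df = \lambda\,dg_0$.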

As with unconstrained optimization, there are also second order conditions. These can be expressed algebraically; however, they amount to the condition that the objective function has contours that are “more convex” than the constraint (see Appendix 1.3).

Figure 1.7: (a) Contours of $f$ are more convex than $g(x_1, x_2) = g_0$: SOC satisfied. (b) Contours of $f$ are linear, less convex than $g(x_1, x_2) = g_0$: SOC not satisfied.

1.3 Appendix

1.3.1 Convexity

A set $S \subseteq \mathbb{R}^2$ is convex if, for every pair of points $u = (u_1, u_2)$ and $v = (v_1, v_2)$ in $S$,

\[ \alpha \in [0, 1] \implies \alpha u + (1 - \alpha) v \in S \]

i.e. the line segment joining $u$ and $v$ lies entirely in $S$. A set that is not convex is called non-convex (these notes also use the term concave).

A function $f : [a, b] \to \mathbb{R}$ is called convex if, for every $x_1$ and $x_2$ in $[a, b]$,

\[ \alpha \in [0, 1] \implies f(\alpha x_1 + (1 - \alpha) x_2) \le \alpha f(x_1) + (1 - \alpha) f(x_2) \]

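Or, equivalently, $f : [a, b] \to \mathbb{R}$ is convex if the set $S = \{(x, y) \in [a, b] \times \mathbb{R} : y \ge f(x)\}$ is convex. A function $g : [a, b] \to \mathbb{R}$ is called concave if $-g$ is convex. Let $f$ be twice differentiable. Then

\[ f \text{ is convex} \iff f''(x) \ge 0 \text{ for all } x \]

\[ f \text{ is concave} \iff f''(x) \le 0 \text{ for all } x \]

If the inequality is strict for all $x$, then $f$ is strictly convex (respectively, strictly concave); this is the case used in Section 1.1.

Throughout these notes, if $f''(x) > g''(x) > 0$ on some interval, then we shall think of $f$ as being “more convex” than $g$; likewise, if $f''(x) < g''(x) < 0$, we shall think of $f$ as being “more concave” (equivalently, “less convex”) than $g$.

A function $f : \mathbb{R}^2 \to \mathbb{R}$ is quasi-concave if $S_k = \{(x, y) \in \mathbb{R}^2 : f(x, y) \ge k\}$ is convex for all $k$. (The sets $S_k$ are called upper contour sets.)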

1.3.2 SOC in Higher Dimensions

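Let $f : \mathbb{R}^n \to \mathbb{R}$, i.e. let $z = f(x_1, \dots, x_n)$, and define the Hessian $H(f)$ to be the matrix

\[
H(f) =
\begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{pmatrix}
\]

Next, define $H_i(f)$ to be the $i$th leading principal submatrix of $H(f)$, i.e. the submatrix comprised of the first $i$ rows and the first $i$ columns of $H(f)$. Its determinant $|H_i(f)|$ is the $i$th leading principal minor. For example

\[
H_2(f) =
\begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2}
\end{pmatrix}
\]

If, at $z_0 = f(x_1^0, \dots, x_n^0)$, $|H_i(f)| > 0$ for all $i$, then $z_0$ satisfies the SOC for a local minimum. On the other hand, if $\operatorname{sgn}(|H_i(f)|) = (-1)^i$ for all $i$, then $z_0$ satisfies the SOC for a local maximum.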
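The sign test on the determinants $|H_i(f)|$ is mechanical enough to code up. Below is a minimal pure-Python sketch, offered as an illustration rather than as part of the notes; the example Hessians are hypothetical, and the determinant routine uses cofactor expansion, which is only sensible for small matrices.

```python
def determinant(matrix):
    """Determinant by cofactor expansion along the first row (small n only)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0.0
    for col in range(n):
        sub = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(sub)
    return total


def leading_principal_minors(hessian):
    """|H_1|, |H_2|, ..., |H_n|: determinants of the top-left i-by-i blocks."""
    n = len(hessian)
    return [determinant([row[:i] for row in hessian[:i]]) for i in range(1, n + 1)]


def second_order_condition(hessian):
    """Classify a critical point from the signs of the leading principal minors."""
    minors = leading_principal_minors(hessian)
    if all(m > 0 for m in minors):
        return "local minimum"
    if all((m < 0) if i % 2 == 1 else (m > 0) for i, m in enumerate(minors, start=1)):
        return "local maximum"   # sgn|H_i| = (-1)^i
    return "inconclusive"


# Hypothetical Hessians evaluated at a critical point.
print(second_order_condition([[2.0, 0.0], [0.0, 2.0]]))     # minors 2, 4
print(second_order_condition([[-2.0, 1.0], [1.0, -2.0]]))   # minors -2, 3
print(second_order_condition([[1.0, 0.0], [0.0, -1.0]]))    # minors 1, -1
```

The third matrix is the Hessian of a saddle such as $x_1^2/2 - x_2^2/2$, for which neither sign pattern holds.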


2 Consumer Choice

In this section we apply the methods of optimization of Section 1 to the analysis of consumer choice subject to a budget constraint. The problem has three elements:

1. Describe the budget constraint.

2. Describe the consumer’s objective, i.e. his or her utility.

3. Set up and solve the constrained optimization.

2.1 Budget Constraint

We assume that a consumer must choose among bundles $(x_1, \dots, x_n)$ of commodities 1 through $n$ that fall within his or her budget. In the case of just two goods $x_1$ and $x_2$, let their prices be $p_1$ and $p_2$, respectively. Let the consumer have income $I$. Then the bundle $(x_1, x_2)$ is affordable iff $p_1 x_1 + p_2 x_2 \le I$.

Figure 2.1: Graphically, the set of affordable bundles (the budget set) is the triangular region bounded by the coordinate axes and the line $x_2 = (-p_1/p_2) x_1 + I/p_2$.

Note the following:

• if all income is spent on $x_1$, the total amount available is $I/p_1$ (and likewise for $x_2$)

• we are implicitly assuming that you cannot buy negative amounts of $x_1$ or $x_2$

• the slope of the “budget line” (the outer boundary of the budget set) is $-p_1/p_2$
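The budget set is easy to encode. Here is a short Python sketch (illustrative only; the prices and income are made-up numbers) that tests affordability and reports the intercepts and slope of the budget line.

```python
def is_affordable(x1, x2, p1, p2, income):
    """True when quantities are non-negative and the bundle costs at most income."""
    return x1 >= 0 and x2 >= 0 and p1 * x1 + p2 * x2 <= income


def budget_line(p1, p2, income):
    """Intercepts and slope of the budget line x2 = -(p1/p2) * x1 + income/p2."""
    return {
        "x1_intercept": income / p1,   # all income spent on good 1
        "x2_intercept": income / p2,   # all income spent on good 2
        "slope": -p1 / p2,
    }


# Hypothetical prices and income.
P1, P2, INCOME = 2.0, 5.0, 100.0

print(is_affordable(10, 16, P1, P2, INCOME))   # cost 20 + 80 = 100, on the budget line
print(is_affordable(10, 17, P1, P2, INCOME))   # cost 105, outside the budget set
print(budget_line(P1, P2, INCOME))
```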

2.2 Consumer’s Objective

We seek a simple way of summarizing how the consumer evaluates alternative bundles, say $(x_1^0, x_2^0)$ and $(x_1^*, x_2^*)$.

Figure 2.2: If we give up one unit of $x_1$, we save $p_1$, which can be used to purchase $p_1/p_2$ units of $x_2$. The market trades $x_1$ for $x_2$ at the rate $p_1/p_2$. This ratio represents the relative price of $x_1$ in terms of $x_2$.

Graphically, the device we use is the indifference curve: a curve connecting bundles that are equally good. Consider the indifference curve through $(x_1^0, x_2^0)$, i.e. the set of bundles that are “as good” as $(x_1^0, x_2^0)$.

Now take a look at Figure 2.4. If both $x_1$ and $x_2$ are desirable, then bundles with more $x_1$ and more $x_2$ must be preferred to $(x_1^0, x_2^0)$. By the same token, $(x_1^0, x_2^0)$ must be preferred to bundles with less $x_1$ and less $x_2$. This means that indifference curves must have negative slope.

In more advanced treatments of economic theory, indifference curves are derived from a set of assumptions about how consumers evaluate alternative bundles. Some types of preferences cannot be represented by indifference curves. The classic example is “lexicographic preferences”: the consumer evaluates a bundle $(x_1, x_2)$ first by the amount of $x_1$, then by the amount of $x_2$. If $x_1^0 > x_1'$, then $(x_1^0, x_2^0)$ is strictly preferred to $(x_1', x_2')$ regardless of $x_2^0$ and $x_2'$. However, if $x_1^0 = x_1'$, then the consumer compares $x_2^0$ and $x_2'$. (This is the same way alphabetical order works.) As an exercise, try to graph the “indifference curves” of a consumer with lexicographic preferences.

Analytically, we represent preferences by a utility function $u(x_1, x_2)$ with domain equal to the set of possible consumption bundles. We construct $u$ such that higher values are preferred.

Examples:

• $u(x_1, x_2) = x_1 x_2$

• $u(x_1, x_2) = x_1 + x_2$

• $u(x_1, x_2) = \min\{x_1, x_2\}$

Facts:

• The contours of $u$ are the indifference curves.

• The bundles $(x_1^0, x_2^0)$ and $(x_1', x_2')$ lie on the same indifference curve iff $u(x_1^0, x_2^0) = u(x_1', x_2')$.

• Let $h > 0$. If more of $x_1$ is always preferred, then $u(x_1 + h, x_2) > u(x_1, x_2)$, which implies $u_1(x_1, x_2) > 0$ for every bundle $(x_1, x_2)$. (Likewise for $x_2$.) You are encouraged to check this for each of the above examples (note that $\min\{x_1, x_2\}$ is not differentiable where $x_1 = x_2$, and that with this utility function more of $x_1$ alone is not always strictly preferred).

• The slope of the indifference curve through $(x_1, x_2)$, at $(x_1, x_2)$, is $-u_1(x_1, x_2)/u_2(x_1, x_2)$. We call the absolute value of this ratio the marginal rate of substitution (MRS) because it is the amount of $x_2$ the consumer would need to compensate for the loss of one unit of $x_1$, or in other words the amount of $x_2$ needed, per unit of $x_1$ given up, in order to keep utility constant.

Figure 2.3: How does a consumer decide between $(x_1^0, x_2^0)$ and $(x_1^*, x_2^*)$?

Figure 2.4: If both $x_1$ and $x_2$ are desirable, then it follows that indifference curves are downward-sloping.

Figure 2.5: The slope of the indifference curve through $(x_1^0, x_2^0)$ is $-\mathrm{MRS} = -u_1(x_1^0, x_2^0)/u_2(x_1^0, x_2^0)$.

Examples:

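• $u(x_1, x_2) = x_1^{\alpha} x_2^{\beta}$ (Cobb-Douglas)

\[ u_1(x_1, x_2) = \alpha x_1^{\alpha - 1} x_2^{\beta} \]

\[ u_2(x_1, x_2) = \beta x_1^{\alpha} x_2^{\beta - 1} \]

\[ \mathrm{MRS} = \frac{u_1(x_1, x_2)}{u_2(x_1, x_2)} = \frac{\alpha}{\beta} \times \frac{x_2}{x_1} \]

• $u(x_1, x_2) = x_1 + x_2$

\[ \mathrm{MRS} = \frac{u_1}{u_2} = 1, \text{ a constant for every bundle } (x_1, x_2) \]

• $u(x_1, x_2) = 2 \log x_1 + x_2$

\[ \mathrm{MRS} = \frac{u_1}{u_2} = \frac{2/x_1}{1} = \frac{2}{x_1}, \text{ independent of } x_2 \]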

As an exercise, graph the indifference curves for these three examples.
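As a companion to the exercise, the three MRS formulas can be checked numerically. The Python sketch below is illustrative and not part of the notes: it estimates $u_1/u_2$ by finite differences and compares it with the closed forms. The exponents $\alpha = 0.3$, $\beta = 0.7$ and the bundle $(2, 3)$ are arbitrary assumptions.

```python
import math


def mrs(u, x1, x2, h=1e-6):
    """Marginal rate of substitution u1/u2, estimated by central differences."""
    u1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)
    u2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)
    return u1 / u2


ALPHA, BETA = 0.3, 0.7   # hypothetical Cobb-Douglas exponents


def cobb_douglas(x1, x2):
    return x1 ** ALPHA * x2 ** BETA


def linear(x1, x2):
    return x1 + x2


def quasi_linear(x1, x2):
    return 2 * math.log(x1) + x2


x1, x2 = 2.0, 3.0   # an arbitrary bundle
print("Cobb-Douglas:", mrs(cobb_douglas, x1, x2), "vs", (ALPHA / BETA) * (x2 / x1))
print("Linear:      ", mrs(linear, x1, x2), "vs", 1.0)
print("Quasi-linear:", mrs(quasi_linear, x1, x2), "vs", 2 / x1)
```

Evaluating `mrs` at several bundles along one indifference curve is one way to get a feel for the curve's shape before sketching it.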

Note: If your utility function is $u(x_1, x_2)$ and mine is $v(x_1, x_2) = a u(x_1, x_2) + b$, where $a > 0$, then we have the same preferences. Why? It can be shown that we have the same indifference curves, only with different labels. The result holds more generally for $v = f(u)$, where $f$ is a monotonically increasing function.
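A quick way to convince yourself of this is to rank a handful of bundles under $u$ and under an increasing transformation of $u$. The Python sketch below is only an illustration: the bundles and the transformation $v = 3 \log u + 7$ are arbitrary assumptions.

```python
import math


def u(x1, x2):
    return x1 * x2


def v(x1, x2):
    # A monotonically increasing transformation of u (hypothetical choice).
    return 3 * math.log(u(x1, x2)) + 7


bundles = [(1, 4), (2, 3), (3, 3), (5, 1), (2, 2)]

ranking_u = sorted(bundles, key=lambda b: u(*b))
ranking_v = sorted(bundles, key=lambda b: v(*b))

print(ranking_u)
print(ranking_v)
print("same ordering:", ranking_u == ranking_v)
```

The utility numbers differ, but the ordering of bundles (and hence every indifference curve) is unchanged.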

You may be familiar with the concept of diminishing marginal rate of substitution (DMRS). Unless stated otherwise, we shall assume DMRS in most of the examples throughout these notes.

Figure 2.6: (a) DMRS (b) constant MRS (c) increasing MRS

Under DMRS, along an indifference curve (holding utility constant), the MRS decreases with $x_1$: the more $x_1$ one has, the less one values an additional unit of $x_1$ in terms of $x_2$. DMRS implies that consumers always prefer averages. Suppose we have two distinct bundles $(x_1^0, x_2^0)$ and $(x_1', x_2')$ on the same indifference curve. Then a bundle that is a weighted average of $(x_1^0, x_2^0)$ and $(x_1', x_2')$, e.g. $\alpha (x_1^0, x_2^0) + (1 - \alpha)(x_1', x_2')$, where $0 < \alpha < 1$, is strictly preferred to either of the original bundles.

Figure 2.7: The dashed line represents the set of all weighted averages of $x^0$ and $x^*$, that is, the set $S = \{\alpha x^0 + (1 - \alpha) x^* : 0 < \alpha < 1\}$. Under DMRS these are strictly preferred to both $x^0$ and $x^*$. Equivalently, the upper contour set $\{x \in \mathbb{R}^2 : u(x) > u(x^0)\}$ is convex. (One can see this by noting the shape of the region above the indifference curve.)
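The preference for averages can be illustrated with a few lines of Python. This is a sketch under assumed numbers, not part of the notes: it takes $u = x_1 x_2$, which has DMRS, two bundles on the indifference curve $u = 4$, and several weights $\alpha$.

```python
def u(x1, x2):
    return x1 * x2


# Two hypothetical bundles on the same indifference curve, u = 4.
bundle_a = (1.0, 4.0)
bundle_b = (4.0, 1.0)


def weighted_average(a, b, alpha):
    """alpha * a + (1 - alpha) * b, computed coordinate by coordinate."""
    return tuple(alpha * ai + (1 - alpha) * bi for ai, bi in zip(a, b))


for alpha in (0.1, 0.25, 0.5, 0.75, 0.9):
    mix = weighted_average(bundle_a, bundle_b, alpha)
    print(alpha, mix, "u =", u(*mix), "> 4:", u(*mix) > u(*bundle_a))
```

For these two bundles the utility of the average works out to $4 + 9\alpha(1 - \alpha)$, which exceeds 4 for every $\alpha$ strictly between 0 and 1.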

It is important to understand that DMRS is not the same as diminishing marginal utility, nor are the two even related. Given a utility function $u$, the marginal utility of $x_1$ is $u_1$. We say that $u$ exhibits diminishing marginal utility if $u_{11} = (u_1)_1 < 0$. However, the sign of $u_{11}$ says nothing about the MRS, as the following examples show:

• $u(x_1, x_2) = (x_1^2 + x_2^2)^{1/4}$

\[ u_1(x_1, x_2) = \tfrac{1}{2} x_1 (x_1^2 + x_2^2)^{-3/4} \]

\[ u_{11}(x_1, x_2) = \tfrac{1}{4} (2 x_2^2 - x_1^2)(x_1^2 + x_2^2)^{-7/4} < 0 \quad \text{whenever } x_1 > \sqrt{2}\, x_2 \]

$\implies$ decreasing marginal utility of $x_1$ at such bundles, but the indifference curves are circles, which exhibit increasing MRS.

• $u(x_1, x_2) = x_1^3 x_2^3$

\[ u_1(x_1, x_2) = 3 x_1^2 x_2^3 \]

\[ u_{11}(x_1, x_2) = 6 x_1 x_2^3 > 0 \]

$\implies$ increasing marginal utility, but the indifference curves are hyperbolas, which exhibit DMRS.
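Both examples can be checked numerically. The Python sketch below is an illustration, not part of the notes: it estimates $u_{11}$ and the MRS by finite differences at a few points along one indifference curve of each utility function. The sample points are assumptions; for the first function they are chosen on the unit circle with $x_1$ well above $x_2$.

```python
def mrs(u, x1, x2, h=1e-6):
    """u1/u2 estimated by central differences."""
    u1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)
    u2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)
    return u1 / u2


def u11(u, x1, x2, h=1e-4):
    """Second partial derivative with respect to x1, by central differences."""
    return (u(x1 + h, x2) - 2 * u(x1, x2) + u(x1 - h, x2)) / h ** 2


def circle_utility(x1, x2):
    return (x1 ** 2 + x2 ** 2) ** 0.25


def cubic_utility(x1, x2):
    return x1 ** 3 * x2 ** 3


# Points along one indifference curve of each function, ordered by increasing x1.
circle_points = [(x1, (1 - x1 ** 2) ** 0.5) for x1 in (0.85, 0.90, 0.95)]   # x1^2 + x2^2 = 1
hyperbola_points = [(x1, 1 / x1) for x1 in (0.5, 1.0, 2.0)]                 # x1 * x2 = 1

for name, util, points in (("circle", circle_utility, circle_points),
                           ("cubic", cubic_utility, hyperbola_points)):
    for x1, x2 in points:
        print(name, "x1 =", x1, "u11 =", round(u11(util, x1, x2), 4),
              "MRS =", round(mrs(util, x1, x2), 4))
```

Along the circle the estimated $u_{11}$ is negative while the MRS rises with $x_1$; along the hyperbola $u_{11}$ is positive while the MRS falls.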

2.3 Consumer’s Optimum

Analytically, the consumer’s problem is to solve

\[ \max_{x_1, x_2}\; u(x_1, x_2) \quad \text{s.t.} \quad p_1x_1 + p_2x_2 = I \]

Have a look at Figure 2.8. Clearly, a bundle $(x_1^0, x_2^0)$ is optimal if two things are true:

Figure 2.8: The consumer chooses the bundle that lands her on the highest indifference curve while still lying on the budget line.

1. $p_1x_1^0 + p_2x_2^0 = I$,

2. $MRS(x_1^0, x_2^0) = p_1/p_2$.

Condition (2), the tangency condition, expresses the simple fact that if $(x_1^0, x_2^0)$ is optimal, then there are no gains to be made by trading in the market any further. If $MRS > p_1/p_2$, then the consumer values $x_1$ more than the market does, in terms of $x_2$, so it would benefit the consumer to sell $x_2$ and buy more $x_1$, as you can see in Figure 2.9.


Figure 2.9: $MRS > p_1/p_2$. On the margin, the consumer values $x_1$ more than the market does, in terms of $x_2$, and there is room for a profitable trade! What happens if $MRS < p_1/p_2$?

To proceed analytically, let’s use the Lagrangian method:

L(x1, x2, λ) = u(x1, x2)− λ(p1x1 + p2x2 − I)

L1 = u1(x1, x2)− λp1 = 0 (2.1)

L2 = u2(x1, x2)− λp2 = 0 (2.2)

Lλ = −p1x1 − p2x2 + I = 0 (2.3)

Dividing (2.1) by (2.2) gives the tangency condition

\[ \frac{u_1(x_1, x_2)}{u_2(x_1, x_2)} = \frac{p_1}{p_2} \]

Also,

\[ \lambda = \frac{u_1(x_1, x_2)}{p_1} = \frac{u_2(x_1, x_2)}{p_2} \]

With an extra dollar to spend one could either

(a) buy 1/p1 units of x1 and increase utility by u1(x1, x2)/p1 = λ, or

(b) buy 1/p2 units of x2 and increase utility by u2(x1, x2)/p2 = λ.

For this reason, λ is sometimes called the marginal utility of income.

For example, if u(x1, x2) = x1x2, then L = x1x2 − λ(p1x1 + p2x2 − I), and the FONC are:

L1 = x2 − λp1 = 0

L2 = x1 − λp2 = 0

Lλ = −p1x1 − p2x2 + I = 0


Therefore, x1 = λp2 and x2 = λp1. Plugging these results back into (2.3):

\[ p_1(\lambda p_2) + p_2(\lambda p_1) = I \implies 2p_1p_2\lambda = I \implies \lambda = \frac{I}{2p_1p_2} \implies \begin{cases} x_1 = x_1(p_1, p_2, I) = I/(2p_1), \\ x_2 = x_2(p_1, p_2, I) = I/(2p_2). \end{cases} \]

The functions $x_1(p_1, p_2, I)$ and $x_2(p_1, p_2, I)$ are called the demand functions. Notice that $p_1x_1 = p_2x_2 = I/2$, so the consumer spends half his or her income on each good! As an exercise, re-do the analysis for $U(x_1, x_2) = x_1^\alpha x_2^\beta$ with different values of $\alpha$ and $\beta$.
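As a rough numerical cross-check of the $u = x_1x_2$ example, the sketch below searches along the budget line for the best bundle and compares it with $I/(2p_1)$ and $I/(2p_2)$. The prices, the income and the helper name `best_bundle_on_budget_line` are made up for illustration.

```python
def best_bundle_on_budget_line(u, p1, p2, income, steps=100_000):
    """Grid search over bundles that exhaust income; returns the best (x1, x2)."""
    best = (0.0, income / p2)
    best_u = u(*best)
    for k in range(1, steps + 1):
        x1 = (income / p1) * k / steps
        x2 = (income - p1 * x1) / p2   # spend whatever is left on good 2
        value = u(x1, x2)
        if value > best_u:
            best, best_u = (x1, x2), value
    return best

p1, p2, income = 2.0, 5.0, 100.0          # hypothetical prices and income
x1, x2 = best_bundle_on_budget_line(lambda a, b: a * b, p1, p2, income)

print("numerical optimum:", (x1, x2))
print("formula I/(2p):   ", (income / (2 * p1), income / (2 * p2)))
# From the FONC, lambda = x2/p1 = x1/p2 = I/(2*p1*p2).
print("lambda:", x2 / p1, x1 / p2, income / (2 * p1 * p2))
```

The same function can be reused for the exercise by passing, say, `lambda a, b: a**0.3 * b**0.7` as the utility function.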

2.4 Special Problems

• Preferences do not satisfy DMRS (Figure 2.10). Often, we restrict preferences by requiring the indifference curves to be convex to the origin. (Functions with this property are called quasi-concave. A function $u : \mathbb{R}^2 \to \mathbb{R}$ is quasi-concave if the upper contour sets $S_k = \{(x_1, x_2) \in \mathbb{R}^2 : u(x_1, x_2) \ge k\}$ are convex for all $k$.)

• Even with quasi-concave preferences, i.e. with convex indifference curves, we still can run into problems (Figure 2.11). Most consumers consume zero units of most goods, so the endpoint problem is potentially one that economists must deal with. The problem is much worse the more narrowly goods are defined (e.g. Coke versus Pepsi), and becomes less serious the more broadly they are defined (e.g. beverages in general). A considerable amount of applied research regarding consumer demand involves the so-called discrete choice approach, focusing on whether consumers buy some or none of a given commodity. Daniel McFadden won the Nobel Prize for his research showing how to link the “buy, don’t buy” decision to underlying utility functions.


Figure 2.10: (a) Indifference curves exhibit CMRS, and there is no bundle with $MRS = p_1/p_2$. (b) $MRS = p_1/p_2$ but this point is not a maximum: what's wrong?

Figure 2.11: Endpoint optima: (a) $MRS < p_1/p_2$, $(x_1, x_2) = (0, I/p_2)$ (b) $MRS > p_1/p_2$, $(x_1, x_2) = (I/p_1, 0)$.


3 Two Applications of Indifference Curve Analysis

We have seen that the consumer’s optimum is represented by a tangency between an indifference curve and the budget constraint. This condition expresses the simple economic idea that the consumer, on the margin, cannot adjust her consumption bundle to spend the same amount of money and simultaneously achieve higher utility. Recall that the tangency condition holds only when the indifference curves exhibit DMRS and we don’t have an endpoint optimum.

3.1 Analysis of a Subsidy

In many economies, certain commodities are subsidized by the government. A subsidy is a negative tax that is usually introduced to aid low income consumers. Economists generally argue that subsidies are inefficient. Why?

Let there be two commodities: food $f$ and “other stuff” $x$. The price of other stuff is $p_x$, and the price of food is $p_f$. A typical consumer has income $I$ and normal preferences (quasi-concave utility, i.e. indifference curves with DMRS). The budget constraint is $p_xx + p_ff = I$. See Figure 3.1.

Figure 3.1: Budget constraints with and without food subsidy. $(x^*, f^*)$ denotes the optimal choice under the subsidy arrangement.

Suppose now that a subsidy of $s$ dollars per unit is introduced on food. The budget constraint becomes $p_xx + (p_f - s)f = I$. If the consumer chooses the bundle $(x^*, f^*)$, then the cost of the subsidy to the government (for this consumer alone) is $sf^*$ dollars. Most economists would argue that you should instead give the consumer $sf^*$ dollars directly and leave the price of food alone. To see this, suppose the lump sum is given to the consumer directly, but she is forced to pay the market, unsubsidized price for food. In this case her budget constraint is

\[ p_xx + p_ff = I + sf^* \qquad (3.1) \]

Notice that the bundle $(x^*, f^*)$ satisfies this budget constraint, since originally

\[ p_xx^* + (p_f - s)f^* = I \]


In other words, if I give the consumer $sf^*$ dollars she still can afford $(x^*, f^*)$. But she can do even better, as shown in Figure 3.2.

Figure 3.2: The unsubsidized budget constraint corresponding to $I + sf^*$ cuts the original indifference curve and therefore enables the consumer to achieve higher utility.

The reason is that the budget line (3.1), with the lump sum, is flatter than the budget line with the subsidy. They both pass through $(x^*, f^*)$, so the budget line (3.1) cuts through an indifference curve and therefore enables the consumer to choose a bundle with higher utility.

Figure 3.3 illustrates the same point.
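A small numerical illustration of the subsidy argument follows. It is a sketch under strong assumptions that are not in the notes: Cobb-Douglas preferences $u = x^{1/2}f^{1/2}$ (so half of income is spent on each good) and made-up values for prices, income and the subsidy.

```python
def cobb_douglas_demand(price_x, price_f, income, a=0.5):
    """Demands for u = x**a * f**(1 - a): spend share a on x, 1 - a on f."""
    return a * income / price_x, (1 - a) * income / price_f

def utility(x, f, a=0.5):
    return x**a * f**(1 - a)

px, pf, income, s = 1.0, 2.0, 100.0, 1.0   # hypothetical numbers

# Per-unit subsidy: the consumer faces the food price pf - s.
x_sub, f_sub = cobb_douglas_demand(px, pf - s, income)
subsidy_cost = s * f_sub                    # what the government pays

# Lump sum of equal cost: unsubsidized prices, income I + s*f_sub.
x_lump, f_lump = cobb_douglas_demand(px, pf, income + subsidy_cost)

print("subsidy bundle :", (x_sub, f_sub), "utility", utility(x_sub, f_sub))
print("lump-sum bundle:", (x_lump, f_lump), "utility", utility(x_lump, f_lump))
# The subsidy bundle is still exactly affordable under the lump sum.
print("cost of subsidy bundle at market prices:", px * x_sub + pf * f_sub)
```

Under these assumptions the lump sum leaves the consumer strictly better off at the same cost to the government, and she buys less food than under the subsidy.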

3.2 The Consumer Price Index

The CPI is a measure of how much it costs today (in today’s dollars) to buy a fixed bundle of commodities. We currently use 1982-84 as our reference period, which means the CPI is calculated by finding the cost of the bundle relative to its cost in 1982-84, which is normalized to 100.

Suppose the CPI is 177.5 (which it was in July 2001). That means it now costs 1.775 times as much to purchase the “standard bundle” as it did on average in 1982-84. If someone earns 1.78 times as much as he did in the early 80s, then he is at least as well off as he was then.

Does your nominal income necessarily have to rise in proportion with the CPI? Suppose that in 1983 you purchased $(x^0, y^0)$ at prices $(p_x^0, p_y^0)$. Your income was $I^0$, and

\[ x^0p_x^0 + y^0p_y^0 = I^0 \]

Now suppose that in 2001 prices are $(p_x^0(1+\pi),\, p_y^0(1+\pi))$. In this case both prices increased at the rate of $\pi$. How much would your income have to increase in order to offset the increase in prices? See Figure 3.4.


Figure 3.3: Note that ∆ = sf∗/px, or the subsidy at initial optimum, in terms of x.

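On the other hand, suppose $p_x$ rises at the rate $3\pi/2$ and $p_y$ rises at the rate $\pi/2$, i.e.

\[ p_x = p_x^0\left(1 + \tfrac{3}{2}\pi\right), \qquad p_y = p_y^0\left(1 + \tfrac{1}{2}\pi\right). \]

The increase in the cost of living is represented by the increase in the cost of the reference bundle $(x^0, y^0)$:

\[ p_x^0\left(1 + \tfrac{3}{2}\pi\right)x^0 + p_y^0\left(1 + \tfrac{1}{2}\pi\right)y^0 - p_x^0x^0 - p_y^0y^0 = \tfrac{3}{2}\pi\, p_x^0x^0 + \tfrac{1}{2}\pi\, p_y^0y^0. \]

If you initially spent half your income on each of $x$ and $y$, then $p_x^0x^0 = p_y^0y^0 = I^0/2$, and the increase in the cost of living is

\[ \frac{3\pi}{2}\cdot\frac{I^0}{2} + \frac{\pi}{2}\cdot\frac{I^0}{2} = \pi I^0, \]

a proportional increase of $\pi$. But, if your income increases by $\pi$, you are better off!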

The reasoning is as follows: If your income increases by enough to allow you to buy $(x^0, y^0)$, your budget is represented by the dashed line. But with that budget, you will not consume $(x^0, y^0)$; you will consume a bundle with more $y$, less $x$, and higher utility. You respond to the change in relative prices by altering your consumption. See Figure 3.5.
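The argument can be made concrete with numbers. The sketch below assumes $u = \sqrt{xy}$ (so that half of income is spent on each good, as in the text), initial prices of 1, income 100 and $\pi = 0.2$; all of these values are illustrative assumptions. The minimum income needed to stay at the original utility is obtained by inverting the indirect utility $v = I/(2\sqrt{p_xp_y})$ implied by this utility function.

```python
import math

def demand(px, py, income):
    # u = sqrt(x*y): half of income is spent on each good.
    return income / (2 * px), income / (2 * py)

def utility(x, y):
    return math.sqrt(x * y)

def min_income_for(u0, px, py):
    # Inverts v(px, py, I) = I / (2*sqrt(px*py)) for I.
    return 2 * u0 * math.sqrt(px * py)

rate = 0.2                                  # the "pi" of the text (assumed)
px0, py0, income0 = 1.0, 1.0, 100.0         # assumed base-period values
x0, y0 = demand(px0, py0, income0)
u0 = utility(x0, y0)

px1 = px0 * (1 + 1.5 * rate)                # x rises at rate 3*pi/2
py1 = py0 * (1 + 0.5 * rate)                # y rises at rate pi/2

cpi_income = px1 * x0 + py1 * y0            # income that buys the old bundle
x1, y1 = demand(px1, py1, cpi_income)
u1 = utility(x1, y1)
needed = min_income_for(u0, px1, py1)       # income that keeps utility at u0

print("income indexed to the CPI:", cpi_income)
print("new bundle:", (x1, y1), "utility", u1, "vs original", u0)
print("income actually needed:", needed, "overcompensation:", cpi_income - needed)
```

With these numbers the CPI-indexed income rises by exactly $\pi$, the consumer shifts toward $y$, and she ends up on a higher indifference curve than before.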

The CPI is really a weighted average of prices for a fixed set of purchases. See Table 1 for an example of some of the major categories and their weights. Note the slow growth of apparel prices (usually attributed to the rapid rise in cheap imports) and the very rapid growth in medical prices.


Figure 3.4: If all prices rise by the same factor, the consumer is in fact worse off.

Figure 3.5: If some prices rise more than others, the new budget line (assuming income rises in proportion to CPI) cuts the original indifference curve.


Table 1: Major Purchase Categories in CPI and Corresponding Weights

Category            Weight    Price Index (Dec. 2000)
All                  100.0    174.1
Food & Beverage       16.3    169.5
Housing               39.6    171.6
Apparel                4.7    131.8
Transportation        17.5    155.2
Medical                5.8    264.1
Recreation             6.0    103.7*
Education              2.7    115.4*
Communication          2.7     92.3*
Other Items            4.7    276.2

* Reference period is Dec. 1997, not 1982-84.

The difference between the rate of increase in the average price of the reference bundle and the minimum increase in income necessary in order to maintain the original level of utility is called the substitution bias in the CPI. Note that it depends on two things: how disproportionately prices for different goods are rising, and how convex one’s indifference curves are. The more dispersion in relative price increases, and the flatter (less sharply curved) the indifference curves, the bigger the substitution bias; with right-angled indifference curves there is no substitution and hence no bias. The Boskin Commission estimates that on average substitution bias was about 0.5% per year in the U.S. over the past couple of decades.

There are lots of other, bigger sources of bias in the CPI. One that is hard to measure is quality bias: consumer goods change over time, which makes it hard to hold the reference bundle constant. Some new inventions since the early 80s: CD/DVD players, airbags and anti-lock brakes, the internet, laser printers, portable PCs, cell phones, The X-Files. Roughly speaking, quality changes are handled in the CPI by attempting to subtract the part of any price change that is due to quality, measured at the time the higher quality product is introduced. So, for example, when airbags first became available manufacturers charged about $500 extra for them. Thus, when we compare the price of a new car in 2001 that is equipped with airbags to a similar model in 1990 without airbags, we subtract $500 from the 2001 price before computing the price ratio.


4 Indirect Utility and the Expenditure Function

4.1 Indirect Utility

We characterized the solution to the problem

\[ \max_{x_1, x_2}\; u(x_1, x_2) \quad \text{s.t.} \quad p_1x_1 + p_2x_2 = I \]

as an optimal pair $(x_1^0, x_2^0)$ that satisfies the first order conditions (tangency, budget constraint). Note that $(x_1^0, x_2^0)$ varies with $(p_1, p_2, I)$. We call the optimal choices at a given level of prices and income the “demand functions” and write:

\[ x_1 = x_1^0(p_1, p_2, I) \]

\[ x_2 = x_2^0(p_1, p_2, I) \]

Note that $p_1x_1^0(p_1, p_2, I) + p_2x_2^0(p_1, p_2, I) = I$, so the demand functions satisfy the budget constraint by definition, even as prices vary. This gives rise to restrictions on the demand functions.

The highest level of utility that can be achieved under $(p_1, p_2, I)$ is $u(x_1^0(p_1, p_2, I), x_2^0(p_1, p_2, I))$, which is the utility of the optimal choices under the budget parameters. We define the indirect utility function to be

\[ \begin{aligned} v(p_1, p_2, I) &= \max_{x_1, x_2}\; u(x_1, x_2) \quad \text{s.t.} \quad p_1x_1 + p_2x_2 = I \\ &= u(x_1^0(p_1, p_2, I), x_2^0(p_1, p_2, I)) \end{aligned} \]

It should be clear to the reader that v is decreasing in p1 and p2, and increasing in I.

Example: $u(x_1, x_2) = x_1^\alpha x_2^\beta$, where $\alpha + \beta = 1$. We saw in Section 2.3 that $x_1^0(p_1, p_2, I) = \alpha I/p_1$ and $x_2^0(p_1, p_2, I) = \beta I/p_2$. Note that $x_1^0$ does not depend on $p_2$, and $x_2^0$ does not depend on $p_1$. The indirect utility function is given by

\[ v(p_1, p_2, I) = \alpha^\alpha \beta^\beta p_1^{-\alpha} p_2^{-\beta} I \]
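The closed form for $v$ can be checked against its definition, the utility of the demanded bundle. The snippet below is a minimal check; the values $\alpha = 0.3$, $\beta = 0.7$ and the prices and income are arbitrary illustrative choices.

```python
ALPHA, BETA = 0.3, 0.7        # hypothetical exponents with ALPHA + BETA = 1

def demands(p1, p2, income):
    return ALPHA * income / p1, BETA * income / p2

def utility(x1, x2):
    return x1**ALPHA * x2**BETA

def indirect_utility(p1, p2, income):
    return ALPHA**ALPHA * BETA**BETA * p1**(-ALPHA) * p2**(-BETA) * income

p1, p2, income = 2.0, 5.0, 100.0
print("u at the demanded bundle:", utility(*demands(p1, p2, income)))
print("closed-form v:           ", indirect_utility(p1, p2, income))
# v falls when a price rises and rises with income.
print(indirect_utility(p1 * 1.1, p2, income) < indirect_utility(p1, p2, income))
print(indirect_utility(p1, p2, income * 1.1) > indirect_utility(p1, p2, income))
```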

4.2 Expenditure Function

Instead of maximizing utility subject to a budget constraint, one could minimize spending, subject to a utility constraint:

\[ \min_{x_1, x_2}\; p_1x_1 + p_2x_2 \quad \text{s.t.} \quad u(x_1, x_2) = u^0 \]

The Lagrangian is

\[ L(x_1, x_2, \mu) = p_1x_1 + p_2x_2 - \mu[u(x_1, x_2) - u^0] \]

The FONC are:

p1 − µu1(x1, x2) = 0

p2 − µu2(x1, x2) = 0

u(x1, x2) = u0


Note that the first two conditions are equivalent to the tangency condition $p_1/p_2 = u_1/u_2$. Take a look at Figure 4.1. The parallel lines represent “iso-cost lines”: combinations such that $p_1x_1 + p_2x_2$ is constant. These can be thought of as the contours of the objective function. Their slope is $-p_1/p_2$. (Why?)

Figure 4.1: How does the consumer reach u0 with as little income as possible?

The utility maximization (u-max) and expenditure minimization (e-min) problems are called “dual” problems, since they reverse the objective and the constraint.

What are the solutions to the e-min problem? The choices $(x_1, x_2)$ that minimize spending subject to a utility constraint are like demand functions, with the exception that they take utility, rather than income, as given. We call these compensated demand functions, and denote them as follows:

\[ x_1 = x_1^c(p_1, p_2, u^0) \]

\[ x_2 = x_2^c(p_1, p_2, u^0) \]

Sometimes these are called Hicksian demand functions, after John Hicks, the English economist who discovered them (and shared the 1972 Nobel prize in economics).

Under $(p_1, p_2, u^0)$, and having chosen $x_1^c, x_2^c$, one spends a total of

\[ p_1x_1^c(p_1, p_2, u^0) + p_2x_2^c(p_1, p_2, u^0) \]

We define the expenditure function (analogous to the indirect utility function, for it gives the amount spent assuming one has solved the e-min problem) to be

\[ \begin{aligned} e(p_1, p_2, u^0) &= \min_{x_1, x_2}\; p_1x_1 + p_2x_2 \quad \text{s.t.} \quad u(x_1, x_2) = u^0 \\ &= p_1x_1^c(p_1, p_2, u^0) + p_2x_2^c(p_1, p_2, u^0) \end{aligned} \]

Note that $e(p_1, p_2, u^0)$ tells you the minimum amount of money necessary to achieve utility $u^0$ under prices $(p_1, p_2)$.


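Example: $u(x_1, x_2) = x_1^\alpha x_2^\beta$, where $\alpha + \beta = 1$. The Lagrangian is

\[ L = p_1x_1 + p_2x_2 - \mu(x_1^\alpha x_2^\beta - u^0) \]

FONC:

\[ \left. \begin{aligned} L_1 &= p_1 - \mu\alpha x_1^{\alpha-1}x_2^{\beta} = 0 \\ L_2 &= p_2 - \mu\beta x_1^{\alpha}x_2^{\beta-1} = 0 \end{aligned} \right\} \implies \frac{p_1}{p_2} = \frac{\alpha}{\beta}\times\frac{x_2}{x_1} \implies x_2 = \frac{\beta}{\alpha}\times\frac{p_1}{p_2}\,x_1 \]

Substituting this into the utility constraint,

\[ x_1^\alpha\left[\frac{\beta}{\alpha}\times\frac{p_1}{p_2}\,x_1\right]^\beta = u^0 \]

which implies

\[ x_1 = u^0\left(\frac{p_2}{p_1}\times\frac{\alpha}{\beta}\right)^\beta, \qquad x_2 = u^0\left(\frac{p_1}{p_2}\times\frac{\beta}{\alpha}\right)^\alpha \]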
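The compensated demands of this example can be sanity-checked numerically. The sketch below, with made-up values for $\alpha$, $\beta$, prices and $u^0$, verifies that the bundle reaches $u^0$, satisfies the tangency condition, and that spending it at the utility level $v(p_1, p_2, I)$ from Section 4.1 costs exactly $I$ (the duality between u-max and e-min).

```python
ALPHA, BETA = 0.3, 0.7        # hypothetical exponents with ALPHA + BETA = 1

def utility(x1, x2):
    return x1**ALPHA * x2**BETA

def indirect_utility(p1, p2, income):
    return ALPHA**ALPHA * BETA**BETA * p1**(-ALPHA) * p2**(-BETA) * income

def compensated_demands(p1, p2, u0):
    x1 = u0 * (ALPHA * p2 / (BETA * p1)) ** BETA
    x2 = u0 * (BETA * p1 / (ALPHA * p2)) ** ALPHA
    return x1, x2

def expenditure(p1, p2, u0):
    # e = p1*x1c + p2*x2c, straight from the definition.
    x1, x2 = compensated_demands(p1, p2, u0)
    return p1 * x1 + p2 * x2

p1, p2, u0, income = 2.0, 5.0, 10.0, 100.0
xc1, xc2 = compensated_demands(p1, p2, u0)

print("utility reached:", utility(xc1, xc2), "target:", u0)
print("price ratio:", p1 / p2, "MRS:", (ALPHA / BETA) * (xc2 / xc1))
print("e(p, v(p, I)):", expenditure(p1, p2, indirect_utility(p1, p2, income)))
```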


5 Comparative Statics of Consumer Choice

In this section we characterize the changes in consumer demands that occur as income and prices vary. Our goal is to describe the consumer’s demand functions. Analytically, the demand functions for the goods $x$ and $y$ are a pair of functions

\[ x = x(p_x, p_y, I) \]

\[ y = y(p_x, p_y, I) \]

that describe the consumer’s optimal choices of $x$ and $y$, given prices and income. As you can imagine, the nature of these functions is important in a wide variety of applications.

5.1 Change in Demand with Respect to Income, Engel Curves

As income changes, the budget constraint shifts in a parallel fashion: inward if $I$ decreases, outward if $I$ increases.

In commodity space (xy-space, or in our case the plane), the tangencies of the budget constraints with higher and higher indifference curves trace out the income expansion path shown in Figure 5.1. For a good $x$, if the quantity of $x$ demanded increases with income, then $x$ is said to be a normal good. For some goods, the quantity demanded falls with income; such goods are called inferior. Analytically, $\partial x/\partial I > 0 \implies x$ normal, while $\partial x/\partial I < 0 \implies x$ inferior.

Figure 5.1: Fix prices. Then $x(p_x, p_y, I) = x(I)$, and $y(p_x, p_y, I) = y(I)$. The income expansion path is $\{(x(I), y(I)) : I \ge 0\}$.

Figure 5.2: (a) x, y normal (b) x normal, y borderline inferior (c) x inferior, y normal

A couple of interesting implications of the budget constraint for changes in $x$ and $y$ with respect to income:

• Using the fact that income is always exhausted,

\[ I = p_xx + p_yy \implies dI = p_x\,dx + p_y\,dy \implies 1 = p_x\frac{dx}{dI} + p_y\frac{dy}{dI} \]

so clearly both goods cannot be inferior, for in that case the RHS would be negative.

• Starting from the previous equation,

\[ \frac{xp_x}{I}\times\frac{I}{x}\frac{dx}{dI} + \frac{yp_y}{I}\times\frac{I}{y}\frac{dy}{dI} = 1 \]

which is equivalent to

\[ s_xe_x + s_ye_y = 1 \]

where $s_x$ and $s_y$ are the expenditure shares (the fraction of income spent on each good), and $e_x$ and $e_y$ are the income elasticities (the percent change in demand $\Delta x/x$ divided by the percent change in income $\Delta I/I$, or, in the limit as $\Delta I \to 0$, $(dx/x)/(dI/I)$). This equation can be summarized as follows: the expenditure-weighted sum of income elasticities is unity.
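The adding-up result $s_xe_x + s_ye_y = 1$ holds for any demand system that exhausts income. The sketch below illustrates it with a hypothetical linear expenditure system (the functional form and the parameters `A`, `GX`, `GY` are assumptions chosen only because the income elasticities then differ from one); elasticities are computed by finite differences.

```python
A, GX, GY = 0.4, 5.0, 10.0    # hypothetical demand parameters

def demands(px, py, income):
    # Buy the minimum quantities GX, GY, then split what is left A : (1 - A).
    leftover = income - px * GX - py * GY
    return GX + A * leftover / px, GY + (1 - A) * leftover / py

def income_elasticities(px, py, income, h=1e-4):
    x, y = demands(px, py, income)
    x_hi, y_hi = demands(px, py, income + h)
    x_lo, y_lo = demands(px, py, income - h)
    ex = (x_hi - x_lo) / (2 * h) * income / x
    ey = (y_hi - y_lo) / (2 * h) * income / y
    return ex, ey

px, py, income = 2.0, 4.0, 200.0
x, y = demands(px, py, income)
sx, sy = px * x / income, py * y / income     # expenditure shares
ex, ey = income_elasticities(px, py, income)

print("shares:", sx, sy)
print("income elasticities:", ex, ey)
print("share-weighted sum:", sx * ex + sy * ey)
```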

The relation between $x$ and $I$, holding prices constant, is called the Engel curve, and is shown in Figure 5.3.

The data in Table 2 confirm Engel’s Law, that as income increases, the expenditure share of food decreases. The implication is that the income elasticity of food is less than unity. Why? Let $x$ be food. Then $s_x = xp_x/I$ is the expenditure share of food, and

\[ \frac{ds_x}{dI} = \frac{p_x\frac{dx}{dI}}{I} - \frac{1}{I^2}\,xp_x = \frac{\frac{xp_x}{I}\cdot\frac{I}{x}\frac{dx}{dI}}{I} - \frac{1}{I}\cdot\frac{xp_x}{I} = \frac{s_x}{I}(e_x - 1) \]

or

\[ \frac{I}{s_x}\frac{ds_x}{dI} = e_x - 1 \]


Figure 5.3: The Engel curve starts from the origin if $x = 0$ when $I = 0$ (which is a reasonable assumption). The Engel curve has positive slope if $x$ is a normal good.

Figure 5.4: (a) Linear Engel curves: $dx/dI = x/I \implies e_x = 1$. (b) Convex Engel curves: $dx/dI > x/I \implies e_x > 1$. (c) Concave Engel curves: $dx/dI < x/I \implies e_x < 1$.

So, if $e_x < 1$, then food share is declining with income. An alternative proof employs a favorite trick of economists, taking natural logs:

\[ \log s_x = \log x + \log p_x - \log I \]

\[ \frac{d\log s_x}{d\log I} = \frac{d\log x}{d\log I} - 1 \]

or

\[ \frac{I}{s_x}\frac{ds_x}{dI} = e_x - 1 \]

In some contexts, the food share is used as an indicator of welfare. It has been proposed that families in different countries with the same food share are equally well off.

5.2 Change in Demand with Respect to Price

A change in one of the prices causes the budget line to rotate; as it does so, the tangencies with higher and higher indifference curves trace out the price consumption path.

You should be familiar with the demand curve, which is the graph of the demand function $x(p_x) = x(p_x, p_y^0, I^0)$, where $p_y^0$ and $I^0$ are fixed. See Figure 5.6.


Table 2: Food Share of Std. Budget in Various Years

Year        Food Share in Std. Budget*
1935-39     35.4
1952        32.2
1963        25.2
1992        19.6
2000        16.3

* Budget used in calculation of CPI.

Figure 5.5: A rise in px is accompanied by a reduction in x.

Note that we traditionally plot demand (the dependent variable) on the horizontal axis and the price (the independent variable) on the vertical axis.[3] The negative slope of the demand curve reflects the idea that consumption of a commodity falls as its price increases. However, demand curves are not necessarily downward sloping! We turn now to a decomposition of the change in demand due to a change in price. We show that there are two factors:

1. the curvature of the indifference curves

2. the nature of the income effect on demand

5.3 Graphical Decomposition of a Change in Demand

Suppose $p_x$ increases from $p_x^0$ to $p_x^1$; demand changes from $(x^0, y^0)$ to $(x^1, y^1)$. We can decompose the change from $x^0$ to $x^1$ as follows:

1. First, think of the change in $x$ that arises purely due to the fact that $x$ now costs more. Draw a budget line with slope $p_x^1/p_y$ (in absolute value) that still allows the consumer to reach the indifference curve through $(x^0, y^0)$ (call this indifference curve $u^0$). Note that, since it’s steeper than the old budget line, it has a tangency with $u^0$ to the left of $(x^0, y^0)$.[4] This “artificial” budget constraint is represented by the dashed line in Figure 5.7.

2. Second, move from this intermediate point to the final optimum. Observe that this movement is a movement along an income expansion path, since the intermediate optimum occurs where $u^0$ has a tangency with a budget line with slope $p_x^1/p_y$.

Figure 5.6: The reader is presumed to be familiar with the demand curve.

Figure 5.7: The movement from $(x^0, y^0)$ to $(x^*, y^*)$ takes place along the indifference curve.

[3] We owe this convention to Alfred Marshall. As a result of this, steep demand curves are “inelastic,” whereas flat demand curves are “elastic.”

Analytically,

\[ \Delta x = x^1 - x^0 = (x^1 - x^*) + (x^* - x^0) \]

where $x^*$ denotes the aforementioned intermediate optimum. We refer to the change $(x^* - x^0)$, which holds utility constant, as the substitution effect. We refer to the change $(x^1 - x^*)$ as the income effect. Thus we write

\[ \Delta x = \Delta x^S + \Delta x^I \]

[4] Assuming DMRS.

Figure 5.8: (a) Step 1: move to new tangency on old indifference curve. (b) Step 2: Move along IEP to new optimum.

5.4 Substitution Effect

The substitution effect represents movement along an indifference curve. It tells you how far to move in order for the indifference curve to be parallel to the new budget line, i.e. in order for the MRS to equal the new price ratio. Obviously, then, if the indifference curves are relatively flat, you have to go a long way before the MRS equals the new price ratio, and the substitution effect is substantial. If the indifference curves are highly convex, the MRS changes rapidly and you do not need to go far: the substitution effect is small. See Figure 5.9.

Figure 5.9: (a) $u^0$ flat $\implies$ more substantial substitution effect (b) $u^0$ highly curved $\implies$ lesser substitution effect

Note that if $\Delta p_x > 0$, the substitution effect is negative. (Why?) What about the substitution effect of $\Delta p_x$ on $y$?

5.5 Income Effect

Intuitively, one might think the income effect is larger the greater $x^0$, i.e. the greater $x$ was in the first place. If, initially, you consumed very little $x$, the income effect would be relatively small. Take a look at Figure 5.10:

• Notice that the intermediate budget constraint almost passes through $(x^0, y^0)$. (It always cuts below, if not by much.)

• So, the income effect is approximately proportional to the change in income from the budget line through $(x^0, y^0)$ to the final budget line.

Figure 5.10: The income effect is approximately proportional to the perpendicular distance between the budget lines.

What is the change in income? The final budget constraint limits the consumer to $I$, just as the initial constraint does. Therefore $I = p_x^0x^0 + p_yy^0$. In order to be able to afford $(x^0, y^0)$ under the new prices, you would need $p_x^1x^0 + p_yy^0$, or $\Delta I = \Delta p_x\,x^0$ more than before. For a small change in $p_x$, the intermediate optimum is close to the initial one, so the difference in income from the intermediate constraint to the final one is approximately $\Delta p_x\,x^0$. (The approximation is exact in the limit $\Delta p_x \to 0$.)

This confirms our intuition: the movement along the income expansion path from the intermediate optimum to the final optimum (the income effect) will be larger, the larger was $x^0$, our initial level of consumption of $x$.
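The quality of the approximation $\Delta I \approx \Delta p_x\,x^0$ can be seen numerically. The sketch below assumes Cobb-Douglas preferences $u = x^{\alpha}y^{\beta}$ with made-up parameter values, and obtains the exact compensating income from the expenditure function found by inverting the indirect utility of Section 4.1; it is an illustration, not a general proof.

```python
ALPHA, BETA = 0.3, 0.7        # hypothetical exponents with ALPHA + BETA = 1

def indirect_utility(px, py, income):
    return ALPHA**ALPHA * BETA**BETA * px**(-ALPHA) * py**(-BETA) * income

def expenditure(px, py, u0):
    # Obtained by solving v(px, py, I) = u0 for I.
    return u0 * px**ALPHA * py**BETA / (ALPHA**ALPHA * BETA**BETA)

px, py, income = 2.0, 5.0, 100.0
x0 = ALPHA * income / px                    # initial demand for x
u0 = indirect_utility(px, py, income)

ratios = []
for dp in (0.5, 0.1, 0.01):
    exact = expenditure(px + dp, py, u0) - income   # income needed to stay on u0
    approx = dp * x0                                # income needed to buy the old bundle
    ratios.append(exact / approx)
    print(f"dp = {dp}: exact = {exact:.5f}, dp * x0 = {approx:.5f}, ratio = {exact / approx:.4f}")
```

The ratio approaches one from below as the price change shrinks: the old bundle is always affordable with slightly more income than is strictly needed to stay on $u^0$.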


6 Slutsky’s Equation

6.1 Review

Expenditure function:

\[ \begin{aligned} e(p_1, p_2, u^0) &= \min_{x_1, x_2}\; p_1x_1 + p_2x_2 \quad \text{s.t.} \quad u(x_1, x_2) = u^0 \\ &= p_1x_1^c(p_1, p_2, u^0) + p_2x_2^c(p_1, p_2, u^0) \end{aligned} \]

where $x_1^c$ and $x_2^c$ are the compensated demands, the cheapest choices that enable one to achieve utility level $u^0$ at prices $(p_1, p_2)$.

The Lagrangian for the e-min problem is

L(x1, x2, µ) = p1x1 + p2x2 − µ[u(x1, x2)− u0]

The FONC are:

p1 − µu1(x1, x2) = 0

p2 − µu2(x1, x2) = 0

u(x1, x2) = u0

As for the derivatives of the expenditure function with respect to prices,

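\[ \frac{\partial e(p_1, p_2, u^0)}{\partial p_1} = x_1^c(p_1, p_2, u^0) + p_1\frac{\partial x_1^c(p_1, p_2, u^0)}{\partial p_1} + p_2\frac{\partial x_2^c(p_1, p_2, u^0)}{\partial p_1}. \qquad (6.1) \]

The reader is presumed to be familiar with the Envelope Theorem, which says the second and third terms on the RHS cancel.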

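Proof: Recall that $u(x_1^c(p_1, p_2, u^0), x_2^c(p_1, p_2, u^0)) = u^0$. Differentiate both sides with respect to $p_1$:

\[ u_1\frac{\partial x_1^c}{\partial p_1} + u_2\frac{\partial x_2^c}{\partial p_1} = 0 \]

But $u_1 = p_1/\mu$ and $u_2 = p_2/\mu$ by the FONC. It follows by substitution that

\[ \frac{p_1}{\mu}\cdot\frac{\partial x_1^c}{\partial p_1} + \frac{p_2}{\mu}\cdot\frac{\partial x_2^c}{\partial p_1} = 0 \]

which means

\[ p_1\frac{\partial x_1^c}{\partial p_1} + p_2\frac{\partial x_2^c}{\partial p_1} = 0 \]

Thus we have

\[ \frac{\partial e(p_1, p_2, u^0)}{\partial p_1} = x_1^c(p_1, p_2, u^0) \qquad \blacksquare \]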

There is a story we tell to go along with this. If you initially are minimizing expenditure, and the price of good 1 rises, what do you do? Your first order response is simply to continue buying the old bundle; this increases your spending by $x_1^c \times \Delta p_1$. That is the first term on the RHS of (6.1). But then you would like to adjust your choices of goods 1 and 2 to reflect the new prices. The adjustments are the second and third terms on the RHS of (6.1). But because your initial choices were optimal (they satisfied the FONC), when you attempt to adjust $x_1$ and $x_2$ you don’t save any more, at least to first order.
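The envelope result can be verified by finite differences for the Cobb-Douglas example of Section 4.2. The parameter values below are arbitrary, and the compensated demands are the ones implied by that example under the assumption $\alpha + \beta = 1$.

```python
ALPHA, BETA = 0.3, 0.7        # hypothetical exponents with ALPHA + BETA = 1

def compensated_demands(p1, p2, u0):
    x1 = u0 * (ALPHA * p2 / (BETA * p1)) ** BETA
    x2 = u0 * (BETA * p1 / (ALPHA * p2)) ** ALPHA
    return x1, x2

def expenditure(p1, p2, u0):
    x1, x2 = compensated_demands(p1, p2, u0)
    return p1 * x1 + p2 * x2

p1, p2, u0, h = 2.0, 5.0, 10.0, 1e-6

# Left-hand side of (6.1): derivative of e with respect to p1.
de_dp1 = (expenditure(p1 + h, p2, u0) - expenditure(p1 - h, p2, u0)) / (2 * h)

# Second and third terms on the right-hand side of (6.1).
x1_hi, x2_hi = compensated_demands(p1 + h, p2, u0)
x1_lo, x2_lo = compensated_demands(p1 - h, p2, u0)
adjustment = p1 * (x1_hi - x1_lo) / (2 * h) + p2 * (x2_hi - x2_lo) / (2 * h)

xc1, _ = compensated_demands(p1, p2, u0)
print("de/dp1 =", de_dp1, " x1c =", xc1, " adjustment terms =", adjustment)
```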

6.2 Slutsky Decomposition

Now we are ready to analyze what happens to the uncompensated, or regular, demand functions when prices rise/fall. Suppose we start with prices $(p_1^0, p_2^0)$ and income $I^0$. Initially the optimal choices are $x_1^0 = x_1(p_1^0, p_2^0, I^0)$ and $x_2^0 = x_2(p_1^0, p_2^0, I^0)$, where $x_1(\cdot)$ and $x_2(\cdot)$ are the regular demand functions.

We decompose the effect of a change in price $\Delta p_1 = p_1^1 - p_1^0$ as follows:

(a) Starting from $(x_1^0, x_2^0)$, imagine the adjustment you would make if you could remain on the old indifference curve. This would lead you to a new bundle $(x_1^*, x_2^*)$. Since prices have risen this bundle costs more than you were spending before. This move is called the substitution effect of the price increase.

(b) Then, from $(x_1^*, x_2^*)$, imagine the adjustment you would make to get back to the original income level. This would be a move inward along an income expansion path (IEP), and would lead you to $(x_1^1, x_2^1)$. This move is called the income effect of a price increase.

Figure 6.1: A decomposition of the change in demand into its constituent parts: movement along the indifference curve followed by movement inward along an IEP.

Note that the total change in $x_1$ is

\[ \Delta x_1 = x_1^1 - x_1^0 = (x_1^1 - x_1^*) + (x_1^* - x_1^0) = \Delta x_1^I + \Delta x_1^S \]


What are the relative magnitudes of the constituent parts? To begin, observe that $(x_1^0, x_2^0)$ and $(x_1^*, x_2^*)$ are on $u^0$. Now,

\[ x_1^0 = x_1(p_1^0, p_2^0, I^0) = x_1^c(p_1^0, p_2^0, u^0) \qquad (6.2) \]

Also,

\[ x_1^* = x_1^c(p_1^1, p_2^0, u^0) \]

so

\[ \Delta x_1^S = x_1^* - x_1^0 = x_1^c(p_1^1, p_2^0, u^0) - x_1^c(p_1^0, p_2^0, u^0) \approx \frac{\partial x_1^c(p_1^0, p_2^0, u^0)}{\partial p_1}\times\Delta p_1 \]

The substitution effect depends on the rate at which compensated demands change: this is purely a function of the curvature of the indifference curves.

How about the income effect?

\[ \Delta x_1^I = x_1^1 - x_1^* \]

First note that $x_1^1 = x_1(p_1^1, p_2^0, I^0)$: it is the regular demand given $(p_1^1, p_2^0, I^0)$. But what is $x_1^*$? It is the choice one would make with enough income to remain on $u^0$ even at the new prices. How much money would it take? The answer is $e(p_1^1, p_2^0, u^0)$! So,

\[ x_1^* = x_1(p_1^1, p_2^0, e(p_1^1, p_2^0, u^0)) \]

Thus

\[ \begin{aligned} \Delta x_1^I &= x_1(p_1^1, p_2^0, I^0) - x_1(p_1^1, p_2^0, e(p_1^1, p_2^0, u^0)) \\ &\approx \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I}\,\bigl(I^0 - e(p_1^1, p_2^0, u^0)\bigr) \end{aligned} \]

So the income effect depends on the income derivative of demand times the change in income $\Delta I = I^0 - e(p_1^1, p_2^0, u^0)$. Note that $\Delta I < 0$ since one would need more than $I^0$ to achieve $U = u^0$ at prices $(p_1^1, p_2^0)$.

But how big is $\Delta I$? We need one last trick. We know that $I^0 = e(p_1^0, p_2^0, u^0)$, so we can write

\[ \begin{aligned} \Delta I &= I^0 - e(p_1^1, p_2^0, u^0) \\ &= e(p_1^0, p_2^0, u^0) - e(p_1^1, p_2^0, u^0) \\ &\approx \frac{\partial e(p_1^0, p_2^0, u^0)}{\partial p_1}\,(p_1^0 - p_1^1) \\ &= \frac{\partial e(p_1^0, p_2^0, u^0)}{\partial p_1}\times(-\Delta p_1) \\ &= -\frac{\partial e(p_1^0, p_2^0, u^0)}{\partial p_1}\times\Delta p_1 \end{aligned} \]

(which is negative for an increase in $p_1$). Finally we have

\[ \begin{aligned} \frac{\partial e(p_1^0, p_2^0, u^0)}{\partial p_1} &= x_1^c(p_1^0, p_2^0, u^0) && \text{by (6.1)} \\ &= x_1^0 && \text{by (6.2)} \end{aligned} \]

and combining the last few results,

\[ \Delta I \approx -x_1^0\,\Delta p_1 \]

Note that the size of the income effect depends on the original level of consumption of $x_1$.

Putting it all together,

\[ \Delta x_1^I = \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I}\times\Delta I = -\frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I}\times x_1^0\,\Delta p_1 \]

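Thus

\[ \begin{aligned} \Delta x_1 &= \Delta x_1^I + \Delta x_1^S \\ &= -\frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I}\times x_1^0\,\Delta p_1 + \frac{\partial x_1^c(p_1^0, p_2^0, u^0)}{\partial p_1}\times\Delta p_1 \end{aligned} \]

or

\[ \frac{\Delta x_1}{\Delta p_1} = -x_1^0\,\frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I} + \frac{\partial x_1^c(p_1^0, p_2^0, u^0)}{\partial p_1} \]

Now in the limit $\Delta p_1 \to 0$ the ratio $\Delta x_1/\Delta p_1$ equals the derivative of the regular demand function with respect to $p_1$. We have established:

\[ \frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial p_1} = -x_1^0\,\frac{\partial x_1(p_1^0, p_2^0, I^0)}{\partial I} + \frac{\partial x_1^c(p_1^0, p_2^0, u^0)}{\partial p_1} \]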

This is called Slutsky’s equation, after the Russian economist who proved it over 100 years ago. Slutsky’s equation says the derivative of the regular demand function with respect to $p_1$ is a combination of the income and substitution effects. The income effect depends on the derivative of demand with respect to income, times the original level of consumption of $x_1$. The substitution effect depends on the derivative of the compensated demand function.
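Slutsky’s equation can be checked numerically for a specific case. The sketch below uses the Cobb-Douglas demands and compensated demands from Section 4 with made-up parameter values, and approximates all three derivatives by central differences.

```python
ALPHA, BETA = 0.3, 0.7        # hypothetical exponents with ALPHA + BETA = 1

def demand_x1(p1, p2, income):
    return ALPHA * income / p1

def indirect_utility(p1, p2, income):
    return ALPHA**ALPHA * BETA**BETA * p1**(-ALPHA) * p2**(-BETA) * income

def compensated_x1(p1, p2, u0):
    return u0 * (ALPHA * p2 / (BETA * p1)) ** BETA

p1, p2, income, h = 2.0, 5.0, 100.0, 1e-6
x1_0 = demand_x1(p1, p2, income)
u0 = indirect_utility(p1, p2, income)

# Total price effect on the regular demand.
dx1_dp1 = (demand_x1(p1 + h, p2, income) - demand_x1(p1 - h, p2, income)) / (2 * h)
# Income effect: -x1 * dx1/dI.
dx1_dI = (demand_x1(p1, p2, income + h) - demand_x1(p1, p2, income - h)) / (2 * h)
income_effect = -x1_0 * dx1_dI
# Substitution effect: derivative of the compensated demand at u0.
substitution_effect = (compensated_x1(p1 + h, p2, u0)
                       - compensated_x1(p1 - h, p2, u0)) / (2 * h)

print("total effect:       ", dx1_dp1)
print("income effect:      ", income_effect)
print("substitution effect:", substitution_effect)
print("sum of the two:     ", income_effect + substitution_effect)
```

Both effects are negative here, as expected for a normal good whose price rises.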

A useful feature of Slutsky’s equation is that it provides a way to recover information about indif-ference curves from the derivatives of the demand functions with respect to prices and incomes. Inprinciple, we can observe ∂x1/∂p1 and ∂x1/∂I, which would enable us to infer

∂xc1(p01, p

02, u

0)

∂p1=∂x1(p0

1, p02, I

0)

∂p1+ x0

1

∂x1(p01, p

02, I

0)

∂I

Suppose we get an estimate of ∂xc1/∂p1 that is nearly zero. The indifference curves must thereforebe almost Leontief (“right angles”).
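As a numerical sanity check on Slutsky's equation, the following sketch evaluates both sides by central differences. It assumes Cobb-Douglas preferences $u = x_1^{a} x_2^{1-a}$, which are not part of the derivation above; the function names, the share parameter `a`, the prices, and the step size are all illustrative choices.

```python
# Illustrative check of Slutsky's equation, assuming Cobb-Douglas utility
# u = x1**a * x2**(1 - a). All names and numbers here are hypothetical.

def demand_x1(p1, p2, I, a=0.3):
    """Regular (Marshallian) demand for good 1 under Cobb-Douglas."""
    return a * I / p1

def utility_at_optimum(p1, p2, I, a=0.3):
    """Utility reached at the optimal bundle for prices (p1, p2) and income I."""
    x1 = a * I / p1
    x2 = (1 - a) * I / p2
    return x1**a * x2**(1 - a)

def compensated_x1(p1, p2, u, a=0.3):
    """Compensated (Hicksian) demand for good 1 under Cobb-Douglas."""
    return u * (a * p2 / ((1 - a) * p1))**(1 - a)

def central_diff(f, x, h=1e-5):
    """Two-sided finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

p1, p2, I = 2.0, 3.0, 100.0
u0 = utility_at_optimum(p1, p2, I)
x1_0 = demand_x1(p1, p2, I)

# Left-hand side: slope of the regular demand curve.
lhs = central_diff(lambda p: demand_x1(p, p2, I), p1)
# Right-hand side: income effect plus substitution effect.
income_term = -x1_0 * central_diff(lambda m: demand_x1(p1, p2, m), I)
substitution_term = central_diff(lambda p: compensated_x1(p, p2, u0), p1)
rhs = income_term + substitution_term

print(lhs, rhs)
```

With these numbers the two sides agree up to finite-difference error, and the substitution term is negative, as the theory requires.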


7 Using Market Level Demand Curves

Since the demand curve graphs $x = f(p_x, p_y, I)$, if $p_y$ or $I$ changes, the demand curve shifts. For example, if income were to increase by $dI > 0$, then at a given price, demand would increase by $dx = (\partial x/\partial I)\,dI$. For a normal good $\partial x/\partial I > 0$, so the demand curve would shift to the right as in Figure 7.1.

Figure 7.1: A shift in the demand curve due to an increase in $I$, assuming $x$ is a normal good.

If the elasticities of demand are approximately constant, then

\[
d(\log x) = \frac{dx}{x} = \left(\frac{\partial x}{\partial I}\cdot\frac{I}{x}\right)\frac{dI}{I} = e_x\,\frac{dI}{I} = e_x\,d(\log I)
\]

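where $e_x$ is the income elasticity of demand for $x$.⁵ Similarly, if $p_y$ changes, the demand curve shifts unless $\partial x/\partial p_y = 0$ (as in the case of Cobb-Douglas preferences). If $\partial x/\partial p_y > 0$, an increase in the price of $y$ causes the demand curve to shift to the right; if $\partial x/\partial p_y < 0$, it shifts to the left.

For the purposes of evaluating the effect of relatively small changes in prices and income, we often assume the demand function has constant elasticities:

\[
\begin{aligned}
\frac{\partial x}{\partial p_x} \times \frac{p_x}{x} &= \frac{\partial \log x}{\partial \log p_x} = \eta_{xx} && \text{(constant)} \\
\frac{\partial x}{\partial p_y} \times \frac{p_y}{x} &= \frac{\partial \log x}{\partial \log p_y} = \eta_{xy} && \text{(constant)} \\
\frac{\partial x}{\partial I} \times \frac{I}{x} &= \frac{\partial \log x}{\partial \log I} = e_x && \text{(constant)}
\end{aligned}
\]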

This is equivalent to assuming that the demand function is log-linear:

\[
\log x = \eta_{xx} \log p_x + \eta_{xy} \log p_y + e_x \log I + c
\]

⁵ You should be familiar with the concept of elasticity from Econ 1. In particular, you should be able to verify that elasticity is a unitless quantity.


where $c$ is a constant. Note that homogeneity implies $\eta_{xx} + \eta_{xy} + e_x = 0$. Put differently, if prices and income all rise by one percent, then $x$ remains constant.⁶

As you recall from introductory economics, the market is constructed by introducing a supply curve of the form $x = S(p_x)$. (See Figure 7.2.) It is usually assumed that supply is upward sloping. (We defer the derivation of market supply curves until later.) For now, we shall assume that elasticity of supply is constant:

\[
\frac{dS(p_x)}{dp_x}\cdot\frac{p_x}{S(p_x)} = \sigma_x
\]

where $\sigma_x$ denotes elasticity of supply. We now can combine supply and demand curves to analyze the effects of exogenous shocks to income or other prices. We have

\[
x = S(p_x) = f(p_x, p_y, I)
\]

a system of two equations in two unknowns, $p_x$ and $x$ (unit price of $x$ and quantity of $x$, respectively), given income and other prices. This is pictured in Figure 7.3.

Figure 7.2: The reader is presumed to be familiar with the upward sloping supply curve.

7.1 An Increase in Income

Obviously, if $x$ is a normal good, both $x$ and $p_x$ increase with $I$. But by how much? Take a look at Figure 7.4. Starting at equilibrium, with $x = x^0$ and $p_x = p_x^0$, the changes in demand and supply are:

\[
\begin{aligned}
\frac{\Delta x}{x} &= \eta_{xx}\,\frac{\Delta p_x}{p_x} + e_x\,\frac{\Delta I}{I} && \text{(demand)} \\
\frac{\Delta x}{x} &= \sigma_x\,\frac{\Delta p_x}{p_x} && \text{(supply)}
\end{aligned}
\]

⁶ A proof would involve recognizing that if $x$ remains constant, then so does $\log x$, and therefore setting the total differential of $\log x$ equal to zero. The details are left to the reader.


Figure 7.3: The market is in equilibrium when the price is such that supply and demand are balanced.

Figure 7.4: How much does px increase due to an outward shift in the demand curve?

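The proportional changes in supply and demand have to be the same in order to restore equilibrium. Therefore

\[
\eta_{xx}\,\frac{\Delta p_x}{p_x} + e_x\,\frac{\Delta I}{I} = \sigma_x\,\frac{\Delta p_x}{p_x}
\]

which implies

\[
\frac{\Delta p_x}{p_x} = \left(\frac{e_x}{\sigma_x - \eta_{xx}}\right)\frac{\Delta I}{I}
\]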

Note that σx > 0 and ηxx < 0, so σx − ηxx is strictly positive. Furthermore,

\[
\frac{\Delta x}{x} = \sigma_x\,\frac{\Delta p_x}{p_x} = \left(\frac{\sigma_x e_x}{\sigma_x - \eta_{xx}}\right)\frac{\Delta I}{I}
\]


For example, suppose the following:

σx = 0.60 (short run)

ηxx = −1.40

ex = 0.40

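If $\Delta I/I = 0.10$ (a 10% increase), then

\[
\frac{\Delta p_x}{p_x} = \left(\frac{0.40}{0.60 + 1.40}\right)(0.10) = 0.02,
\qquad
\frac{\Delta x}{x} = (0.60)(0.02) = 0.012
\]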
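The arithmetic in this example can be reproduced with a few lines of code. This is a minimal sketch of the two formulas derived above; the function name `income_shock` and its argument names are hypothetical.

```python
# Sketch of the constant-elasticity formulas for an income shock.
# The function and argument names are illustrative, not from the notes.

def income_shock(sigma_x, eta_xx, e_x, dI_over_I):
    """Return the proportional changes (dp/p, dx/x) in equilibrium price and quantity."""
    dp_over_p = e_x / (sigma_x - eta_xx) * dI_over_I
    dx_over_x = sigma_x * dp_over_p  # move along the supply curve
    return dp_over_p, dx_over_x

dp, dx = income_shock(sigma_x=0.60, eta_xx=-1.40, e_x=0.40, dI_over_I=0.10)
print(round(dp, 4), round(dx, 4))
```

Because the denominator is $\sigma_x - \eta_{xx}$, a more elastic supply or demand curve dampens the price response to a given income change.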

As an exercise, calculate the effect of a 10% drop in the price of a substitute good (good $y$) on the market for $x$. Use an estimate for the cross-price elasticity between $x$ and $y$ of 0.67 ($\eta_{xy} = 0.67$).

7.2 Tax Incidence

If a tax of $t$ dollars per unit is imposed on $x$, it creates a gap of $t$ dollars per unit between the price that consumers pay and the price that producers receive. You are presumed to be familiar with the diagram shown in Figure 7.5.

Starting from an equilibrium at $(p_x^0, x^0)$, the price received by producers falls to $p_x^1$, the price paid by consumers rises to $p_x^1 + t$, and the quantity falls to $x^1$. Consider the two markets shown in Figure 7.6, each with the same tax. Obviously, the effect of the tax on the prices paid/received by the two sides depends on the relative elasticities of supply and demand. To see this more formally, we proceed based on the assumption that elasticities are roughly constant. Letting $p_x$ denote the price received by producers, the change in supply is

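\[
\frac{\Delta x}{x} = \sigma_x\,\frac{\Delta p_x}{p_x}
\]

The change in prices for consumers is $\Delta p_x + t$. Therefore, the change in quantity demanded is

\[
\frac{\Delta x}{x} = \eta_{xx}\left(\frac{\Delta p_x + t}{p_x}\right)
\]

Market equilibrium requires that change in demand equals change in supply:

\[
\eta_{xx}\left(\frac{\Delta p_x + t}{p_x}\right) = \sigma_x\,\frac{\Delta p_x}{p_x}
\]

Solving for the equilibrium change in prices, we have

\[
\eta_{xx}\,\frac{t}{p_x} = \frac{\Delta p_x}{p_x}\,(\sigma_x - \eta_{xx})
\]

and

\[
\frac{\Delta p_x}{p_x} = \left(\frac{\eta_{xx}}{\sigma_x - \eta_{xx}}\right)\frac{t}{p_x}
\]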


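where $t/p_x$ is the proportional tax rate. Since $\sigma_x > 0$ and $\eta_{xx} < 0$, $\sigma_x - \eta_{xx}$ is strictly positive, and therefore $\Delta p_x < 0$. With regard to quantity,

\[
\frac{\Delta x}{x} = \sigma_x\,\frac{\Delta p_x}{p_x} = \left(\frac{\sigma_x \eta_{xx}}{\sigma_x - \eta_{xx}}\right)\frac{t}{p_x} < 0
\]

For producers, the change in price is

\[
\frac{\Delta p_x}{p_x} = \left(\frac{\eta_{xx}}{\sigma_x - \eta_{xx}}\right)\frac{t}{p_x}
\]

and for consumers it is

\[
\frac{\Delta p_x + t}{p_x} = \left(\frac{\eta_{xx}}{\sigma_x - \eta_{xx}}\right)\frac{t}{p_x} + \frac{t}{p_x}
= \left(\frac{\sigma_x}{\sigma_x - \eta_{xx}}\right)\frac{t}{p_x} > 0
\]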

Notice that the ratio of the changes in prices for producers versus consumers is $\eta_{xx}/\sigma_x$. So, if demand is highly inelastic, i.e. $|\eta_{xx}|$ is small (e.g. $\eta_{xx} = -0.1$), and supply is moderately elastic (e.g. $\sigma_x = 1.0$), then producer prices don't fall by much relative to consumer prices. On the other hand, if demand is highly elastic, i.e. if $|\eta_{xx}|$ is big (e.g. $\eta_{xx} = -3.0$), then producer prices are more affected.
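To make the comparison concrete, the sketch below evaluates the incidence formulas for the two demand elasticities mentioned above. The function name `tax_incidence` and the 5% proportional tax rate are assumptions chosen for illustration.

```python
# Sketch of the tax incidence formulas under constant elasticities.
# The function name and the tax rate t/p = 0.05 are hypothetical.

def tax_incidence(sigma_x, eta_xx, t_over_p):
    """Return proportional changes in (producer price, consumer price, quantity)."""
    denom = sigma_x - eta_xx                  # strictly positive
    producer = eta_xx / denom * t_over_p      # change in price received (negative)
    consumer = sigma_x / denom * t_over_p     # change in price paid (positive)
    quantity = sigma_x * producer             # move along the supply curve
    return producer, consumer, quantity

results = {}
for eta in (-0.1, -3.0):
    results[eta] = tax_incidence(sigma_x=1.0, eta_xx=eta, t_over_p=0.05)
    print(eta, results[eta])
```

In both cases the consumer and producer price changes differ by exactly $t/p_x$; what changes with the elasticities is how that wedge is split between the two sides.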

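Last we consider the effect of a per unit subsidy of $s$ on the price of $x$. (For example, prior to the recent rise in electricity rates, electricity prices were subsidized throughout most of California.) The change in price received by producers is $\Delta p_x$, whereas the change in price paid by consumers is $\Delta p_x - s$. The proportional changes in quantity are:

\[
\begin{aligned}
\frac{\Delta x}{x} &= \eta_{xx}\left(\frac{\Delta p_x - s}{p_x}\right) && \text{(demand)} \\
\frac{\Delta x}{x} &= \sigma_x\,\frac{\Delta p_x}{p_x} && \text{(supply)}
\end{aligned}
\]

Setting the two equal, we have

\[
\frac{\Delta p_x}{p_x} = \left(\frac{-\eta_{xx}}{\sigma_x - \eta_{xx}}\right)\frac{s}{p_x} > 0
\]

which implies that part of the effect of the subsidy is mitigated by a rise in prices. In fact, the change in price paid by consumers is

\[
\frac{\Delta p_x - s}{p_x} = \left(\frac{-\eta_{xx}}{\sigma_x - \eta_{xx}}\right)\frac{s}{p_x} - \frac{s}{p_x}
= \left(\frac{-\sigma_x}{\sigma_x - \eta_{xx}}\right)\frac{s}{p_x} < 0
\]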

Note that −σx/(σx − ηxx) is less than one in absolute value.
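The subsidy case is the mirror image of the tax case, which a short sketch makes easy to see. The function name `subsidy_incidence` and the numerical values are hypothetical and chosen only to illustrate the formulas above.

```python
# Sketch of the subsidy formulas under constant elasticities.
# Names and numbers are illustrative assumptions.

def subsidy_incidence(sigma_x, eta_xx, s_over_p):
    """Return proportional changes in (producer price, consumer price)."""
    denom = sigma_x - eta_xx
    producer = -eta_xx / denom * s_over_p    # rise in price received by producers
    consumer = -sigma_x / denom * s_over_p   # fall in price paid by consumers
    return producer, consumer

producer_change, consumer_change = subsidy_incidence(
    sigma_x=0.60, eta_xx=-1.40, s_over_p=0.10
)
print(producer_change, consumer_change)
```

The consumer price falls by less than the full subsidy because part of it is absorbed by the higher price received by producers.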


Figure 7.5: The new price $p_x^1$ is such that when consumers pay $p_x^1 + t$ and suppliers receive $p_x^1$, equilibrium is restored.

Figure 7.6: (a) Demand inelastic, supply elastic. (b) Demand elastic, supply inelastic.


8 Labor Supply

In this section we consider the choice of how many hours to work by an individual who faces an hourly wage $w > 0$, and also has non-labor income $y$. The individual is assumed to value leisure $\ell$ and consumption of goods $x$, using a utility function $u(x, \ell)$. We assume there is an upper bound $T$ on leisure, and that the sum of leisure $\ell$ and hours of work $h$ is $T$:

\[
\ell + h = T, \quad \text{or} \quad h = T - \ell
\]

The graph looks a little unusual since preferences are only defined up to the point where $\ell = T$, as the reader can see in Figure 8.1.

Figure 8.1: The budget constraint for an agent who works for w/h and consumes a numeraire good x.

The budget constraint is $px = wh + y$ but we shall assume $p = 1$. The consumer's objective is

\[
\max_{x,\ell}\; u(x, \ell) \quad \text{s.t.} \quad x = w(T - \ell) + y, \quad \text{or} \quad x + w\ell = y + wT
\]

Note that if you think of the consumption bundle as $(x, \ell)$, then the budget constraint says the total cost of the bundle has to be $y + wT$, for this is all the income you would have if you "bought" no leisure. This "full income" depends on $w$, and therein lies the key difference between labor supply and other consumer choice problems: as the price of one good (leisure) rises, the consumer is actually richer. Intuitively this is because a worker is a net seller of leisure: he or she starts at an "endowment point" $(x, \ell) = (y, T)$. From there he or she can trade with the market by giving up leisure in return for cash, which is then used to purchase goods.

We proceed by the method of Lagrange:

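\[
\begin{aligned}
L(x, \ell, \lambda) &= u(x, \ell) - \lambda(x + w\ell - y - wT) \\
L_x &= u_x(x, \ell) - \lambda = 0 \\
L_\ell &= u_\ell(x, \ell) - \lambda w = 0 \\
L_\lambda &= -x - w\ell + y + wT = 0
\end{aligned}
\]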


The first two FONC imply the usual tangency condition: $u_\ell(x, \ell)/u_x(x, \ell) = w$. The solutions are:

\[
\begin{aligned}
x &= x(w, y) \\
\ell &= \ell(w, y) \\
h(w, y) &= T - \ell(w, y)
\end{aligned}
\]

Now consider the rise in $w$ (from $w^0$ to $w^1$) shown in Figure 8.2. As you can see, the substitution effect causes a drop in $\ell$, or equivalently a rise in $h$. But the income effect works in the opposite direction: as a net seller of leisure the agent is better off and uses some of her extra income to buy more leisure.

Figure 8.2: For this individual the income and substitution effects have opposite signs.

To formally analyze the income and substitution effects we rely on the expenditure function for the labor supply case: this is the amount of non-labor income needed to achieve utility $u^0$, given $w$:

\[
e(w, u^0) = \min_{x,\ell}\; x - w(T - \ell) \quad \text{s.t.} \quad u(x, \ell) = u^0
\]

\[
\begin{aligned}
L(x, \ell, \mu) &= x - w(T - \ell) - \mu[u(x, \ell) - u^0] \\
L_x &= 1 - \mu u_x(x, \ell) = 0 \\
L_\ell &= w - \mu u_\ell(x, \ell) = 0 \\
L_\mu &= -u(x, \ell) + u^0 = 0
\end{aligned}
\]

The first two FONC imply the tangency condition: $u_\ell(x, \ell)/u_x(x, \ell) = w$. The solutions are:

\[
\begin{aligned}
x &= x^c(w, u^0) \\
\ell &= \ell^c(w, u^0) \\
h^c(w, u^0) &= T - \ell^c(w, u^0)
\end{aligned}
\]

The expenditure function is thus

\[
e(w, u^0) = x^c(w, u^0) - w[T - \ell^c(w, u^0)] = x^c(w, u^0) - w h^c(w, u^0)
\]


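and

\[
\frac{\partial e}{\partial w} = \underbrace{\frac{\partial x^c}{\partial w} - w\,\frac{\partial h^c}{\partial w}}_{0} - h^c = -h^c
\]

To see that $\partial x^c/\partial w - w\,\partial h^c/\partial w = 0$, we use the same trick as we did in Section 6 when dealing with the usual expenditure function. So, recalling that $(x^c(w, u^0), \ell^c(w, u^0))$ yields utility $u^0$,

\[
u(x^c(w, u^0), \ell^c(w, u^0)) = u^0
\]

and therefore differentiating both sides,

\[
u_x(x^c(w, u^0), \ell^c(w, u^0))\,\frac{\partial x^c}{\partial w} + u_\ell(x^c(w, u^0), \ell^c(w, u^0))\,\frac{\partial \ell^c}{\partial w} = 0
\]

But $w u_x = u_\ell$ by the tangency condition, and $\partial h^c/\partial w = -\partial \ell^c/\partial w$, hence the desired result. (Again, this is an example of the Envelope Theorem.)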

To summarize, we have shown that $\partial e/\partial w = -h^c(w, u^0)$. To understand this, think of your mom when she finds out you got a raise at your summer job: she reduces your allowance by an amount proportional to how much you were working.

Now let's see how leisure choice depends on wages. Assume we start with $(w^0, y^0)$, and that $w$ rises from $w^0$ to $w^1$. The rise in $w$ causes a substitution effect and an income effect:

\[
\Delta \ell = \Delta \ell^S + \Delta \ell^I
\]

As usual, we can write

\[
\Delta \ell^S = \frac{\partial \ell^c}{\partial w}\,\Delta w
\]

representing the compensated adjustment to the higher cost of leisure on the indifference curve corresponding to level $u^0$. Also,

\[
\Delta \ell^I = \ell(w^1, y^0) - \ell(w^1, y^1)
\]

where $y^0$ = original non-labor income, and $y^1 = e(w^1, u^0)$. We use our standard trick of taking first order approximations, based on the expenditure function. First, we can approximate

\[
\ell(w^1, y^0) - \ell(w^1, y^1) \approx \frac{\partial \ell(w^1, y^1)}{\partial y} \times (y^0 - y^1)
\]

and recognizing that $y^0 = e(w^0, u^0)$,

\[
\begin{aligned}
y^0 - y^1 &= e(w^0, u^0) - e(w^1, u^0) \\
&\approx \frac{\partial e(w^0, u^0)}{\partial w}\,(-\Delta w) \\
&= -h^c(w^0, u^0)(-\Delta w) \\
&= h^0\,\Delta w
\end{aligned}
\]

So,

\[
\Delta \ell^I \approx \frac{\partial \ell(w^1, y^1)}{\partial y} \times h^0\,\Delta w
\]


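The income effect is proportional to $h^0 \Delta w$: if you had been working more, there would be a bigger positive income effect. Finally, then, we have

\[
\Delta \ell = \Delta \ell^S + \Delta \ell^I = \frac{\partial \ell^c(w^0, u^0)}{\partial w}\,\Delta w + \frac{\partial \ell(w^1, y^1)}{\partial y} \times h^0\,\Delta w
\]

Dividing both sides by $\Delta w$, and taking the limit $\Delta w \to 0$,

\[
\frac{\partial \ell}{\partial w} = \lim_{\Delta w \to 0} \frac{\Delta \ell}{\Delta w} = \frac{\partial \ell^c(w^0, u^0)}{\partial w} + h^0\,\frac{\partial \ell(w^0, y^0)}{\partial y}
\]

This is Slutsky's equation for leisure demand. In terms of hours, recall that $h = T - \ell$, so

\[
\frac{\partial h}{\partial w} = -\frac{\partial \ell}{\partial w} \quad \text{and} \quad \frac{\partial h}{\partial y} = -\frac{\partial \ell}{\partial y}
\]

and therefore

\[
\frac{\partial h}{\partial w} = \frac{\partial h^c(w^0, u^0)}{\partial w} + h^0\,\frac{\partial h(w^0, y^0)}{\partial y}
\]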

When the wage rises there is a positive substitution effect and a negative income effect on labor supply. Note in particular that when a person gets a raise, he won't necessarily work more.
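The decomposition for hours can be checked numerically. The sketch below assumes Cobb-Douglas preferences over goods and leisure, $u(x, \ell) = x^{a}\ell^{1-a}$, which the notes do not specify; the time endowment, the share parameter, and the wage and income values are hypothetical.

```python
# Illustrative check of Slutsky's equation for hours of work, assuming
# u(x, l) = x**a * l**(1 - a). All parameter values are hypothetical.

T = 16.0   # assumed time endowment (hours)
A = 0.6    # assumed Cobb-Douglas weight on goods

def leisure(w, y):
    """Leisure demand: a share (1 - A) of full income y + w*T is spent on leisure."""
    return (1 - A) * (y + w * T) / w

def hours(w, y):
    return T - leisure(w, y)

def utility(w, y):
    l = leisure(w, y)
    x = w * (T - l) + y
    return x**A * l**(1 - A)

def compensated_hours(w, u):
    """Hours along the indifference curve u, from the tangency u_l / u_x = w."""
    l_c = u * ((1 - A) / (A * w))**A
    return T - l_c

def central_diff(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2 * h)

w0, y0 = 20.0, 50.0
u0 = utility(w0, y0)
h0 = hours(w0, y0)

total_effect = central_diff(lambda w: hours(w, y0), w0)
substitution_effect = central_diff(lambda w: compensated_hours(w, u0), w0)
income_effect = h0 * central_diff(lambda y: hours(w0, y), y0)

print(total_effect, substitution_effect, income_effect)
```

For these parameters the substitution effect on hours is positive, the income effect is negative, and their sum matches the slope of the uncompensated labor supply function up to finite-difference error.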


9 Intertemporal Consumption

The two-period consumption model concerns a consumer whose lifetime spans two periods. In period one the consumer has income $y_1$ and spends $c_1$; in period two the consumer has income $y_2$ and spends $c_2$. The consumer can borrow or lend at a rate of interest equal to $r$.

We express the consumer's budget constraint in terms of period-two dollars. The choice is arbitrary, but this way it ends up simplifying the algebra, for then we basically have two goods with prices $1 + r$ and 1, respectively (rather than 1 and $1/(1 + r)$, which would be the case in period-one dollars). Having $1 + r$ in the numerator, not the denominator, is a big help. Total consumption is limited by total income, so the budget constraint is given by

(1 + r)c1 + c2 = (1 + r)y1 + y2

The consumer’s objective is to solve

\[
\max_{c_1, c_2}\; u(c_1, c_2) \quad \text{s.t.} \quad (1 + r)c_1 + c_2 = (1 + r)y_1 + y_2
\]

The Lagrangian is

\[
L(c_1, c_2, \lambda) = u(c_1, c_2) - \lambda[(1 + r)c_1 + c_2 - (1 + r)y_1 - y_2]
\]

and the FONC are

L1 = u1(c1, c2)− λ(1 + r) = 0

L2 = u2(c1, c2)− λ = 0

Lλ = −(1 + r)c1 − c2 + (1 + r)y1 + y2 = 0

These give rise to the tangency condition $u_1/u_2 = 1 + r$ and the budget constraint, as usual. The solutions are functions of $r$, $y_1$, and $y_2$:

c1 = c1(r, y1, y2)

c2 = c2(r, y1, y2)

These demand functions are a little unusual because they specify not just total available resources, or "wealth" $w = (1 + r)y_1 + y_2$, but also the composition of $w$. To clarify the effects of a change in $r$ on $c_1$ it is helpful to define two other consumption functions, which depend on the interest rate and total wealth (measured in period-two dollars):

\[
\begin{aligned}
c_1 &= c_1^w(r, w) \\
c_2 &= c_2^w(r, w)
\end{aligned}
\]

These optimal choice functions are related by:

\[
\begin{aligned}
c_1(r, y_1, y_2) &= c_1^w(r, (1 + r)y_1 + y_2) \\
c_2(r, y_1, y_2) &= c_2^w(r, (1 + r)y_1 + y_2)
\end{aligned}
\]

You can see that as we change $r$, the effect on $c_1(r, y_1, y_2)$ depends on both $\partial c_1^w/\partial r$ and $\partial c_1^w/\partial w$.


Now let's define the expenditure function as the minimum cost to reach a given level of utility (again, measured in period-two dollars). Specifically, define $e$ as follows:

\[
e(r, u^0) = \min_{c_1, c_2}\; (1 + r)c_1 + c_2 \quad \text{s.t.} \quad u(c_1, c_2) = u^0
\]

The Lagrangian is

\[
L(c_1, c_2, \mu) = (1 + r)c_1 + c_2 - \mu[u(c_1, c_2) - u^0]
\]

and the FONC are

\[
\begin{aligned}
L_1 &= 1 + r - \mu u_1(c_1, c_2) = 0 \\
L_2 &= 1 - \mu u_2(c_1, c_2) = 0 \\
L_\mu &= -u(c_1, c_2) + u^0 = 0
\end{aligned}
\]

The solutions are the compensated demand functions $c_1^c(r, u^0)$ and $c_2^c(r, u^0)$. As usual

\[
e(r, u^0) = (1 + r)c_1^c(r, u^0) + c_2^c(r, u^0)
\]

Differentiating,

\[
\frac{\partial e(r, u^0)}{\partial r} = c_1^c(r, u^0) + (1 + r)\frac{\partial c_1^c}{\partial r} + \frac{\partial c_2^c}{\partial r}
\]

and (as usual) it is easy to show that $(1 + r)\,\partial c_1^c/\partial r + \partial c_2^c/\partial r = 0$, so

\[
\frac{\partial e(r, u^0)}{\partial r} = c_1^c(r, u^0)
\]

Thus we have three optimal consumption functions for first period consumption:

• $c_1(r, y_1, y_2)$, which depends on $y_1$ and $y_2$

• $c_1^w(r, w)$, which depends only on $w$

• $c_1^c(r, u^0)$, which depends on utility

We also have two relations connecting the three:

\[
c_1(r, y_1, y_2) = c_1^w(r, (1 + r)y_1 + y_2) \tag{9.1}
\]

\[
c_1^c(r, u^0) = c_1^w(r, e(r, u^0)) \tag{9.2}
\]

Now it may seem clear why we defined $c_1^w$: it's the function that links the compensated demand and the demand we ultimately are interested in, $c_1(r, y_1, y_2)$. We can differentiate these two equations with respect to $r$. Starting with (9.1),

\[
\frac{\partial c_1(r, y_1, y_2)}{\partial r} = \frac{\partial c_1^w(r, (1 + r)y_1 + y_2)}{\partial r} + y_1\,\frac{\partial c_1^w(r, (1 + r)y_1 + y_2)}{\partial w} \tag{9.3}
\]

This means that when you change $r$, the response of the demand for $c_1$ as a function of $(r, y_1, y_2)$ has an income effect, reflecting the fact that as $r$ rises, so does the value of wealth.


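From (9.2) we get an expression like we've seen before:

\[
\begin{aligned}
\frac{\partial c_1^c(r, u^0)}{\partial r} &= \frac{\partial c_1^w(r, e(r, u^0))}{\partial r} + \frac{\partial c_1^w(r, e(r, u^0))}{\partial w} \times \frac{\partial e(r, u^0)}{\partial r} \\
&= \frac{\partial c_1^w(r, e(r, u^0))}{\partial r} + \frac{\partial c_1^w(r, e(r, u^0))}{\partial w}\,c_1^c(r, u^0)
\end{aligned}
\]

Rearranging, we get a Slutsky equation for $c_1^w$:

\[
\begin{aligned}
\frac{\partial c_1^w(r, e(r, u^0))}{\partial r} &= \frac{\partial c_1^c(r, u^0)}{\partial r} - \frac{\partial c_1^w(r, e(r, u^0))}{\partial w}\,c_1^c(r, u^0) \\
&= \frac{\partial c_1^c(r, u^0)}{\partial r} - \frac{\partial c_1^w(r, e(r, u^0))}{\partial w}\,c_1(r, y_1, y_2)
\end{aligned}
\tag{9.4}
\]

assuming $u^0$ is the level of utility one can achieve with income $(y_1, y_2)$ and interest rate $r$.

Finally, plugging (9.4) into (9.3),

\[
\begin{aligned}
\frac{\partial c_1(r, y_1, y_2)}{\partial r} &= \frac{\partial c_1^w(r, (1 + r)y_1 + y_2)}{\partial r} + y_1\,\frac{\partial c_1^w(r, (1 + r)y_1 + y_2)}{\partial w} \\
&= \frac{\partial c_1^c(r, u^0)}{\partial r} + \frac{\partial c_1^w(r, e(r, u^0))}{\partial w}\,[y_1 - c_1(r, y_1, y_2)] \\
&= \frac{\partial c_1^c(r, u^0)}{\partial r} + \frac{\partial c_1^w(r, e(r, u^0))}{\partial w}\,s_1(r, y_1, y_2)
\end{aligned}
\]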

where s1(r, y1, y2) = y1 − c1(r, y1, y2) is the optimal level of period-one savings.

The income effect of a rise in $r$ on optimal consumption $c_1(r, y_1, y_2)$ is positive or negative, depending on whether $s_1$ is positive or negative. For a saver, $s_1 > 0$ and a rise in $r$ has a positive income effect (because the consumer is a net supplier of funds to the market, as in the case of labor supply). But for a borrower, $s_1 < 0$ and a rise in $r$ has a negative income effect (because the consumer is a net demander of funds, as in the case of basic commodity demand).
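The saver/borrower distinction can be illustrated numerically. The sketch below assumes log utility, $u(c_1, c_2) = \log c_1 + \beta \log c_2$, which is not specified in the notes; the value of `BETA`, the interest rate, and the income profiles are hypothetical.

```python
import math

# Illustrative check of the intertemporal Slutsky decomposition, assuming
# u(c1, c2) = log(c1) + BETA * log(c2). All numbers are hypothetical.

BETA = 0.9

def c1_wealth(r, w):
    """Period-one consumption as a function of r and wealth w (period-two dollars)."""
    return w / ((1 + BETA) * (1 + r))

def c1(r, y1, y2):
    return c1_wealth(r, (1 + r) * y1 + y2)

def c1_compensated(r, u):
    """Compensated period-one consumption from the tangency c2 = BETA*(1+r)*c1."""
    return math.exp((u - BETA * math.log(BETA * (1 + r))) / (1 + BETA))

def decomposition(r, y1, y2, h=1e-6):
    """Return (total effect, substitution effect, income effect) of r on c1."""
    w = (1 + r) * y1 + y2
    c1_0 = c1(r, y1, y2)
    c2_0 = w - (1 + r) * c1_0
    u0 = math.log(c1_0) + BETA * math.log(c2_0)
    total = (c1(r + h, y1, y2) - c1(r - h, y1, y2)) / (2 * h)
    substitution = (c1_compensated(r + h, u0) - c1_compensated(r - h, u0)) / (2 * h)
    dc1_dw = (c1_wealth(r, w + h) - c1_wealth(r, w - h)) / (2 * h)
    savings = y1 - c1_0
    return total, substitution, dc1_dw * savings

saver = decomposition(r=0.05, y1=100.0, y2=0.0)      # all income arrives early
borrower = decomposition(r=0.05, y1=0.0, y2=100.0)   # all income arrives late
print(saver)
print(borrower)
```

The substitution effect is the same sign in both cases, while the income effect flips sign with savings, matching the discussion above.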


10 Production and Cost I

The technology available to a given firm is summarized by its production function. This function gives the quantities of output produced by various combinations of inputs. For example, an airline uses labor inputs, fuel, and machinery (airplanes, loading equipment, etc.) to produce the output "passenger seats." We write $y = f(a, b)$ to signify that with inputs $a$ and $b$, it is possible to produce $y$ units of output.

Examples:

One Input

• $y = a^{\gamma}$

• $y = \begin{cases} 0 & a < \bar{a} \\ 1 & a > \bar{a} \end{cases}$

Two Inputs

• $y = a^{\alpha} b^{\beta}$ (Cobb-Douglas)

• $y = \min\{a, b\}$ (Leontief, CRS)

• $y = a + b$ (Additive, CRS)

For two or more inputs, production functions are a lot like utility functions. The important difference is that output is measurable and has natural units (e.g. passenger seats). It's as if the "indifference curves" have numbers attached to them that matter.

A second, less obvious, way to summarize technology is to compute the cost associated with producing a given output level $y$, at fixed prices for the inputs. In principle, if you know the production function, it is easy to find the cost function in two steps:

1. enumerate all possible ways of producing $y$

2. determine the cheapest one, and evaluate its cost

Most of the economic behavior of firms is studied via the cost function. In the next few sections, we demonstrate how to derive the cost function and illustrate the connection between its properties and those of the production function.

10.1 One-Factor Production and Cost Functions

10.1.1 Production Functions

Suppose there is only one input (apart from, perhaps, a "set-up cost"). Then we have a picture along the lines of Figure 10.1. Note that $f(0) = 0$ by convention.

Definitions and Facts:


Figure 10.1: A representative production function. Note the “S” shape.

• The marginal product of factor $a$ is the increase in $y$ that accompanies a unit increase in $a$:

\[
MP_a = \frac{\partial f(a)}{\partial a} = f'(a)
\]

Factor $a$ is said to be useful if $f'(a) > 0$.

• The average product of factor $a$ is the ratio of total output to total input of $a$:

\[
AP_a = \frac{f(a)}{a}
\]

• If the MP of factor $a$ is increasing, then $f''(a) > 0$ and we say that there are increasing marginal returns: as the scale of output is expanded, each additional unit of input contributes more. If the MP is decreasing, then $f''(a) < 0$ and we say there are diminishing marginal returns. See Figure 10.2.

Figure 10.2: (a) Increasing marginal returns. (b) Decreasing marginal returns.

• If $MP_a > AP_a$, then $AP_a$ is increasing; if $MP_a < AP_a$, then $AP_a$ is decreasing.

Think baseball, with AP = career batting average and MP = season batting average. A hitter who has a better-than-average season raises his career average. See Figure 10.3. In


general,

\[
\frac{dAP_a}{da} = \frac{a f'(a) - f(a)}{a^2} = \frac{1}{a}\left[f'(a) - \frac{f(a)}{a}\right] = \frac{1}{a}(MP_a - AP_a)
\]

Figure 10.3: At $a = a_1$, $AP = f(a_1)/a_1 < f'(a_1) = MP$, so AP is increasing. At $a = a_2$, the opposite is true.

Examples:

• f(a) = ka, where k > 0 (linear). APa = MPa = k.

• $f(a) = a^{\beta}$, where $0 < \beta < 1$ (concave). See Figure 10.4.

Figure 10.4: The greater $\beta$, the less concave the production function, up to $\beta = 1$.

• $f(a) = 9a^2 - a^3$, $a < 6$. See Figure 10.5. For this function we have the following:

\[
f'(a) = 18a - 3a^2 \implies [f'(a) \geq 0 \iff a \leq 6]
\]

\[
f''(a) = 18 - 6a \implies
\begin{cases}
f''(a) > 0 \iff a < 3 \\
f''(a) < 0 \iff a > 3
\end{cases}
\]


Figure 10.5: The production function of the preceding example, $f(a) = 9a^2 - a^3$.
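The relationship between marginal and average product can be checked directly for the example $f(a) = 9a^2 - a^3$. The sketch below searches a grid for the input level that maximizes average product; the grid spacing and the variable names are arbitrary illustrative choices.

```python
# Sketch: marginal and average product for f(a) = 9a^2 - a^3 on 0 < a < 6.
# The grid resolution is an arbitrary choice for illustration.

def f(a):
    return 9 * a**2 - a**3

def marginal_product(a):
    return 18 * a - 3 * a**2   # f'(a)

def average_product(a):
    return f(a) / a

grid = [i / 100 for i in range(1, 600)]    # a = 0.01, 0.02, ..., 5.99
a_star = max(grid, key=average_product)    # input level with the highest AP

print(a_star, marginal_product(a_star), average_product(a_star))
```

On this grid the average product peaks where it equals the marginal product; to the left of the peak MP exceeds AP, and to the right MP falls below AP.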

10.1.2 Cost Functions

What is the cost function for a one-factor production function? Let $w$ denote the price per unit of factor $a$. Then

\[
c(y, w) = \min_a\; wa \quad \text{s.t.} \quad y = f(a)
\]

But $y = f(a)$ implies $a = f^{-1}(y)$.⁷ Therefore $c(y, w) = w f^{-1}(y)$. See Figure 10.6 for an illustration of this process. If $w$ is fixed, then we often write the cost function as a function of $y$ only: $c(y)$. Define marginal cost $MC(y) = c'(y)$, and average cost $AC(y) = c(y)/y$.

Examples:

• $y = f(a) = ka$ (linear) $\implies a = y/k$ (linear input requirement function)

\[
c(y, w) = w\left(\frac{y}{k}\right) = \frac{1}{k}\,wy \quad \text{(linear in both $y$ and $w$)}
\]

• $y = f(a) = \sqrt{a} \implies a = y^2$ (convex input requirement function)

\[
c(y, w) = wy^2 \quad \text{(linear in $w$ but convex in $y$; see Figure 10.7)}
\]

10.1.3 Connection between MC and MP

Marginal cost is the amount it would cost, at the current level of output, to produce an additional unit. By definition of $MP_a$, one unit of input adds $MP_a = f'(a)$ units of output. It follows that

• $1/MP_a = 1/f'(a)$ units of $a$ are needed to produce one unit of $y$

• the marginal cost of an additional unit is $MC(y) = w/f'(a)$, when the production function is given by $y = f(a)$

⁷ Assume, for the moment, that $f$ is one-to-one.


Figure 10.6: The graph in (b) is obtained by rotating quadrant II in (a) 90 degrees clockwise.

Alternatively, $c(y) = w f^{-1}(y)$, using as input requirement function $a = f^{-1}(y)$. Thus⁸

\[
c'(y) = w\,\frac{d f^{-1}(y)}{dy} = \frac{w}{f'(a)}
\]
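The link between marginal cost and marginal product can be verified numerically for the example $f(a) = \sqrt{a}$. In the sketch below the input requirement function is obtained by bisection rather than in closed form; the helper names, the search interval, and the values of $w$ and $y$ are hypothetical.

```python
import math

# Sketch: one-factor cost function c(y, w) = w * f^{-1}(y), with f inverted
# numerically. Helper names and numerical values are illustrative assumptions.

def invert(f, y, lo=0.0, hi=1e6):
    """Bisection for an increasing f: find the input a with f(a) = y."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def cost(f, y, w):
    return w * invert(f, y)

f = math.sqrt
w, y = 3.0, 2.0
a = invert(f, y)                      # input needed to produce y

step = 1e-4
marginal_cost = (cost(f, y + step, w) - cost(f, y - step, w)) / (2 * step)
marginal_product = (f(a + 1e-6) - f(a - 1e-6)) / 2e-6

print(cost(f, y, w), marginal_cost, w / marginal_product)
```

Both routes to marginal cost, differentiating $c(y)$ and computing $w/f'(a)$, give the same number here, which also matches the closed form $2wy$ for this example.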

10.1.4 Geometry of c, AC, and MC

Take a look at Figure 10.8a. Note the following:

• when MC < AC, AC is falling

• when MC > AC, AC is rising

• when AC is at a minimum, AC = MC

⁸ Recall that if $f'(x_0) \neq 0$, then $\left[\dfrac{d f^{-1}(y)}{dy}\right]_{y = f(x_0)} = \dfrac{1}{f'(x_0)}$.


Figure 10.7: The production function $y = \sqrt{a}$ and the corresponding cost function $c = wy^2$, where $w$ is the per-unit cost of $a$.

We sometimes add a "set up" cost $F$ (also called a fixed cost). The total cost is then

\[
c(y) = \text{fixed cost} + \text{variable cost} = F + VC(y)
\]

The implications of this model are illustrated in Figure 10.8b.


Figure 10.8: Compare (b) to (a) and note the following: 1. $\min AC$ occurs to the right of $\min AVC$. Why? 2. $MC$ intersects both $AC$ and $AVC$ at their respective minimums. Why?


11 Production and Cost II

The analysis of production and cost is more interesting when it involves combinations of two or more inputs to produce $y$. The production function is $y = f(a, b)$. As in consumer theory, we begin by thinking about combinations of inputs that produce the same level of output. In the firm case these are called isoquants.

We define the marginal rate of technical substitution (MRTS) as the slope of an isoquant. It indicates how many units of $b$ one would need to add, per unit of $a$ given up, to keep output constant. See Figure 11.1.

Figure 11.1: The marginal rate of technical substitution is analogous to the consumer's MRS. This bears comparison to Figure 2.5.

Formally, suppose $y^0 = f(a^0, b^0)$, and consider varying $a$ and $b$ in such a way that output remains fixed at $y^0$:

\[
dy = f_a\,da + f_b\,db = 0
\]

which implies

\[
\left[\frac{db}{da}\right]_{y^0} = -\frac{f_a(a^0, b^0)}{f_b(a^0, b^0)} = -\frac{MP_a}{MP_b}
\]

The MRTS is analogous to the marginal rate of substitution (MRS) in consumer theory. When there are two or more inputs, the production function is characterized by both the degree of substitutability between inputs (curvature of isoquants) and the extent to which output expands as inputs are expanded proportionately. The latter gives rise to the idea of returns to scale. Recall that for a production function $y = f(a, b)$, we say $f$ has constant returns to scale (CRS) if

\[
f(\gamma a, \gamma b) = \gamma f(a, b), \quad \gamma > 0
\]

We say that $f$ has decreasing returns to scale (DRS) if

\[
f(\gamma a, \gamma b) < \gamma f(a, b), \quad \gamma > 1
\]


With DRS, if you double both inputs, you get less than twice the output. On the other hand, the same inequality implies that if you reduce inputs by some proportion, your output falls by a smaller proportion. So DRS suggests that smaller firms are necessarily more efficient. Conversely we say that $f$ has increasing returns to scale (IRS) if

\[
f(\gamma a, \gamma b) > \gamma f(a, b), \quad \gamma > 1
\]

Figure 11.2: (a) CRS and (b) DRS. This can be seen by noting the shape of the intersection of the surface with the plane $a = b$, for example.

Examples:

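• One Input: $f(a) = a^{\alpha}$

– CRS if $\alpha = 1$

– DRS if $\alpha < 1$

– IRS if $\alpha > 1$

• Cobb-Douglas: $f(a, b) = a^{\alpha} b^{\beta}$

– CRS if $\alpha + \beta = 1$

– DRS if $\alpha + \beta < 1$

– IRS if $\alpha + \beta > 1$

As a check, suppose $\alpha + \beta = 1$. Then

\[
\begin{aligned}
f(\gamma a, \gamma b) &= (\gamma a)^{\alpha} (\gamma b)^{\beta} \\
&= \gamma^{\alpha + \beta} a^{\alpha} b^{\beta} \\
&= \gamma f(a, b)
\end{aligned}
\]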
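The same check can be run numerically for any exponents. The sketch below compares $f(\gamma a, \gamma b)$ with $\gamma f(a, b)$ at a single input bundle and a single scale factor, so it illustrates the definitions rather than proving the property globally; the function names, test point, and tolerance are assumptions.

```python
# Sketch: classify returns to scale by comparing f(g*a, g*b) with g*f(a, b)
# at one point. Names, the test point, and the tolerance are illustrative.

def cobb_douglas(alpha, beta):
    """Return the production function f(a, b) = a**alpha * b**beta."""
    return lambda a, b: a**alpha * b**beta

def returns_to_scale(f, a=2.0, b=3.0, gamma=2.0, tol=1e-9):
    scaled = f(gamma * a, gamma * b)
    baseline = gamma * f(a, b)
    if abs(scaled - baseline) <= tol * baseline:
        return "CRS"
    return "IRS" if scaled > baseline else "DRS"

print(returns_to_scale(cobb_douglas(0.5, 0.5)))
print(returns_to_scale(cobb_douglas(0.3, 0.4)))
print(returns_to_scale(cobb_douglas(0.8, 0.6)))
```

For Cobb-Douglas technology the classification depends only on $\alpha + \beta$, so the choice of test point does not matter; for other functional forms a single-point check like this would not be conclusive.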


Geometrically, returns to scale indicates whether $f$ is concave or convex over the top of a ray emanating from the origin. (See Figure 11.2.)

11.1 Derivation of the Cost Function

Given a production function f(a, b) and prices wa, wb, we can write

\[
c(w_a, w_b, y) = \min_{a, b}\; w_a a + w_b b \quad \text{s.t.} \quad f(a, b) \geq y
\]

Define $L = w_a a + w_b b - \mu[f(a, b) - y]$, and proceed by the method of Lagrange:

\[
\begin{aligned}
L_a &= w_a - \mu f_a(a, b) = 0 \\
L_b &= w_b - \mu f_b(a, b) = 0 \\
L_\mu &= -f(a, b) + y = 0
\end{aligned}
\]

The ratio of the first two FONC gives

\[
\frac{w_a}{w_b} = \frac{f_a(a, b)}{f_b(a, b)} = MRTS \text{ (in absolute value)}
\]

Geometrically, we find the point of tangency of the constraint $f(a, b) = y$ with the "iso-cost" lines

\[
w_a a + w_b b = \text{const.}
\]

See Figure 11.3. Notice the problem is reversed relative to that of a consumer. In the cost problem, you are constrained to an isoquant and have to find the lowest budget, or iso-cost line. In the consumer problem, you are constrained to a budget line and have to find the highest isoquant, or indifference curve.

Figure 11.3: The firm's objective is to minimize cost subject to a given level of output. This is done by moving along an isoquant until the tangency condition is satisfied.

If we consider finding the most inexpensive way to achieve different levels of output given $w_a$ and $w_b$, we trace out the scale expansion path (SEP) shown in Figure 11.4. Note the similarity between


a firm's SEP and a consumer's IEP. Geometrically, the shape of the cost function (as a function of $y$) depends on the shape of the production function "over the top" of the SEP. See Figure 11.5 for an illustration. If the curve over the SEP is S-shaped as in Figure 11.5b we get cost functions of the usual shape.

Figure 11.4: The scale expansion path traces out the optimal input demands as production varies.

Figure 11.5: The shape of the cost function depends on the shape of the production function over the top of the SEP. In other words, if the SEP is given by $g(a, b) = g_0$, then the cost function is shaped like the intersection of $y = f(a, b)$ with $g(a, b) = g_0$, where the latter is promoted to three dimensions.

11.2 Marginal Cost

If we were to produce an additional unit of $y$, we could use input $a$, or input $b$, or both. If we used $a$ only, it would take $1/MP_a$ units of $a$ for a single unit of $y$. The marginal cost is $w_a/MP_a$ (just as


in the one-factor case). By symmetry, we could also use $b$ only, at marginal cost of $w_b/MP_b$. But from the FONC

\[
\frac{w_a}{w_b} = \frac{MP_a}{MP_b} \implies \frac{w_a}{MP_a} = \frac{w_b}{MP_b}
\]

So, on the margin, one should be indifferent to expanding output via increases in $a$ or increases in $b$. This reflects the fact that $a$ and $b$ were optimally chosen to begin with. Note also that

\[
\mu = \frac{w_a}{f_a(a, b)} = \frac{w_a}{MP_a} = \frac{w_b}{MP_b}
\]

Thus the Lagrange multiplier in the cost-minimization problem gives marginal cost.

Examples:

• f(a, b) = min{a, b/k}. At a cost minimum we must have a = b/k = y, which implies

c(wa, wb, y) = y(wa + kwb)

Note that this production function exhibits CRS.

• $f(a, b) = a + kb$. These are linear isoquants, with $f_a/f_b = 1/k$. If $w_a/w_b > 1/k$, use only $b$, in which case $y = kb \implies b = y/k$, and $c(w_a, w_b, y) = w_b y/k$. But if $w_a/w_b < 1/k$, use only $a$, in which case $y = a$, and $c(w_a, w_b, y) = w_a y$. Combining these results, for any $w_a, w_b$, we have $c(w_a, w_b, y) = y \times \min\{w_a, w_b/k\}$.

The previous two examples illustrate what is called the dual relationship between cost and production functions. Leontief production functions imply linear cost functions; linear production functions imply Leontief-like cost functions.

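• $f(a, b) = a^\alpha b^\beta$. (You may have seen this in a problem set!) The Lagrangian is $L(a, b, \mu) = w_a a + w_b b - \mu(a^\alpha b^\beta - y)$.

$$L_a = w_a - \mu\alpha a^{\alpha-1} b^\beta = 0$$

$$L_b = w_b - \mu\beta a^\alpha b^{\beta-1} = 0$$

$$L_\mu = -a^\alpha b^\beta + y = 0$$

Taking the ratio of the first two FONC, we have

$$\frac{w_a}{w_b} = \frac{\alpha a^{\alpha-1} b^\beta}{\beta a^\alpha b^{\beta-1}} = \frac{\alpha b}{\beta a}$$

or

$$b = \frac{\beta a w_a}{\alpha w_b}$$

By substitution,

$$a^\alpha b^\beta = a^\alpha \left(\frac{\beta a w_a}{\alpha w_b}\right)^\beta = a^{\alpha+\beta} \beta^\beta w_a^\beta \alpha^{-\beta} w_b^{-\beta} = y$$

from which we can easily retrieve the input requirement function (IRF) for $a$:

$$a = y^{\frac{1}{\alpha+\beta}} \left(\frac{\alpha}{\beta}\right)^{\frac{\beta}{\alpha+\beta}} w_a^{-\frac{\beta}{\alpha+\beta}} w_b^{\frac{\beta}{\alpha+\beta}}$$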


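The IRF for $b$ can be found by substitution, or by symmetry:

$$b = y^{\frac{1}{\alpha+\beta}} \left(\frac{\beta}{\alpha}\right)^{\frac{\alpha}{\alpha+\beta}} w_a^{\frac{\alpha}{\alpha+\beta}} w_b^{-\frac{\alpha}{\alpha+\beta}}$$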

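Finally $c(w_a, w_b, y) = w_a a + w_b b$ when $a$ and $b$ are set to their respective cost-minimizing values, so

$$c(w_a, w_b, y) = y^{\frac{1}{\alpha+\beta}} \left(\frac{\alpha}{\beta}\right)^{\frac{\beta}{\alpha+\beta}} w_a^{\frac{\alpha}{\alpha+\beta}} w_b^{\frac{\beta}{\alpha+\beta}} + y^{\frac{1}{\alpha+\beta}} \left(\frac{\beta}{\alpha}\right)^{\frac{\alpha}{\alpha+\beta}} w_a^{\frac{\alpha}{\alpha+\beta}} w_b^{\frac{\beta}{\alpha+\beta}}$$

$$= y^{\frac{1}{\alpha+\beta}} w_a^{\frac{\alpha}{\alpha+\beta}} w_b^{\frac{\beta}{\alpha+\beta}} \left[\left(\frac{\alpha}{\beta}\right)^{\frac{\beta}{\alpha+\beta}} + \left(\frac{\beta}{\alpha}\right)^{\frac{\alpha}{\alpha+\beta}}\right]$$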

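If $\alpha + \beta = 1$ (CRS), this simplifies considerably:

$$c(w_a, w_b, y) = y w_a^\alpha w_b^\beta \left[\left(\frac{\alpha}{\beta}\right)^\beta + \left(\frac{\beta}{\alpha}\right)^\alpha\right] = y w_a^\alpha w_b^\beta \left(\alpha^{-\alpha}\beta^{-\beta}\right)$$

So with CRS, cost is linear in output. In general the exponent of $y$ in the cost function is $(\alpha + \beta)^{-1}$, so if $\alpha + \beta > 1$, cost is concave in output (IRS), whereas if $\alpha + \beta < 1$, cost is convex in output (DRS).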
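The Cobb-Douglas formulas can be checked numerically. The sketch below is only an illustration: the function names and parameter values are hypothetical, and the grid bounds in the brute-force search are arbitrary. It compares the closed-form cost with a scan along the isoquant, and compares the multiplier $\mu = w_a/f_a$ with a finite-difference estimate of marginal cost.

```python
import math


def irf_a(y, wa, wb, alpha, beta):
    """Closed-form input requirement function for a (formula from the notes)."""
    s = alpha + beta
    return (y ** (1 / s) * (alpha / beta) ** (beta / s)
            * wa ** (-beta / s) * wb ** (beta / s))


def irf_b(y, wa, wb, alpha, beta):
    """Closed-form input requirement function for b (by symmetry)."""
    s = alpha + beta
    return (y ** (1 / s) * (beta / alpha) ** (alpha / s)
            * wa ** (alpha / s) * wb ** (-alpha / s))


def cost(y, wa, wb, alpha, beta):
    """c(wa, wb, y) = wa*a + wb*b evaluated at the cost-minimizing inputs."""
    return wa * irf_a(y, wa, wb, alpha, beta) + wb * irf_b(y, wa, wb, alpha, beta)


def brute_force_cost(y, wa, wb, alpha, beta, lo=1e-2, hi=1e2, n=20000):
    """Cheapest bundle on the isoquant a^alpha * b^beta = y, found by scanning
    a over a log-spaced grid (grid bounds are arbitrary illustrative choices)."""
    best = math.inf
    for i in range(n + 1):
        a = lo * (hi / lo) ** (i / n)
        b = (y / a ** alpha) ** (1 / beta)  # stay on the isoquant
        best = min(best, wa * a + wb * b)
    return best


# Hypothetical parameter values (DRS case: alpha + beta < 1).
y, wa, wb, alpha, beta = 4.0, 2.0, 3.0, 0.3, 0.5
a_star = irf_a(y, wa, wb, alpha, beta)
c_closed = cost(y, wa, wb, alpha, beta)
c_grid = brute_force_cost(y, wa, wb, alpha, beta)

# The multiplier mu = wa / f_a should match marginal cost dc/dy.
f_a = alpha * y / a_star  # f_a = alpha * a^(alpha-1) * b^beta = alpha * y / a
mu = wa / f_a
h = 1e-6
mc = (cost(y + h, wa, wb, alpha, beta) - cost(y - h, wa, wb, alpha, beta)) / (2 * h)

print(f"closed-form cost {c_closed:.6f}, grid-search cost {c_grid:.6f}")
print(f"mu = {mu:.6f}, finite-difference MC = {mc:.6f}")
```

The grid search never beats the closed form, which is what cost minimization requires, and the two marginal cost figures agree up to finite-difference error.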


12 Cost Functions and IRFs

Suppose we are given a production function $f(x_1, x_2)$, and the associated cost function $c(y, w_1, w_2)$. We determine $c$ by solving the cost minimization problem:

$$\min \; w_1 x_1 + w_2 x_2 \quad \text{s.t.} \quad f(x_1, x_2) = y$$

We define the Lagrangian $L = w_1 x_1 + w_2 x_2 - \mu[f(x_1, x_2) - y]$. The FONC are:

$$L_1 = w_1 - \mu f_1(x_1, x_2) = 0$$

$$L_2 = w_2 - \mu f_2(x_1, x_2) = 0$$

$$L_\mu = -f(x_1, x_2) + y = 0$$

The first two of these imply the tangency condition $w_1/w_2 = f_1/f_2$, while the third is equivalent to the constraint. Solving these two equations in two unknowns we get the IRFs:

$$x_1 = x_1^*(y, w_1, w_2)$$

$$x_2 = x_2^*(y, w_1, w_2)$$

The IRFs are analogous to the consumer’s demand functions: they represent the optimal (cost-minimizing) input choices to produce $y$ when input prices are $(w_1, w_2)$. With these we obtain the cost function

$$c(y, w_1, w_2) = w_1 x_1^*(y, w_1, w_2) + w_2 x_2^*(y, w_1, w_2) \tag{12.1}$$

which is simply the cost of the cost-minimizing combination of inputs.

12.1 Shephard’s Lemma

It turns out that given $c$, one can recover the IRFs by simple differentiation:

$$x_1^*(y, w_1, w_2) = \frac{\partial c(y, w_1, w_2)}{\partial w_1}$$

At a glance, this appears to be inconsistent with (12.1). Indeed, differentiating (12.1) with respect to $w_1$ gives three terms:

$$\frac{\partial c(y, w_1, w_2)}{\partial w_1} = x_1^*(y, w_1, w_2) + w_1 \frac{\partial x_1^*(y, w_1, w_2)}{\partial w_1} + w_2 \frac{\partial x_2^*(y, w_1, w_2)}{\partial w_1} \tag{12.2}$$

However, when an input price changes, $x_1^*(y, w_1, w_2)$ and $x_2^*(y, w_1, w_2)$ are constrained to move along an isoquant as in Figure 12.1. In other words, we have

$$f(x_1^*(y, w_1, w_2), x_2^*(y, w_1, w_2)) = y$$

and this holds even as $w_1$ varies, so, differentiating w.r.t. $w_1$:

$$f_1 \frac{\partial x_1^*}{\partial w_1} + f_2 \frac{\partial x_2^*}{\partial w_1} = 0$$

This means

$$\frac{\partial x_2^*}{\partial w_1} = -\frac{f_1}{f_2} \times \frac{\partial x_1^*}{\partial w_1}$$

So, since $x_1^*$ falls in response to a rise in $w_1$, $x_2^*$ has to rise, and the rates of change are in the ratio $f_1/f_2$. (Note that $x_1^*$ responds to a change in $w_1$ just as a demand function does in consumer theory; the response is like a substitution effect. Since the isoquant exhibits DMRTS, a rise in $w_1$ implies a fall in $x_1^*$.) Substituting this expression into (12.2),

$$\frac{\partial c}{\partial w_1} = x_1^* + \frac{\partial x_1^*}{\partial w_1}\left(w_1 - w_2 \frac{f_1}{f_2}\right)$$

But $w_1 - w_2(f_1/f_2) = 0$ by the tangency condition, so the second and third terms on the RHS of (12.2) always cancel, leaving us with the lemma, $\partial c/\partial w_1 = x_1^*$.

The lemma says that if $w_1$ rises, the first order effect on cost is proportional to the amount of $x_1$ the firm originally was using. Although the optimal choices of $x_1$ and $x_2$ also change, they do so in such a way that $y$ remains constant, and because of the initial tangency condition the movements in the inputs leave cost unchanged to first order.
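As a numerical illustration of the lemma, the sketch below uses the CRS Cobb-Douglas cost function from the example in Section 11.2 and compares a finite-difference estimate of $\partial c/\partial w_1$ with the closed-form input demand. The function names and the parameter values are assumptions made for this example only.

```python
def cd_cost(y, w1, w2, alpha):
    """CRS Cobb-Douglas cost function (beta = 1 - alpha)."""
    beta = 1.0 - alpha
    return y * w1 ** alpha * w2 ** beta * alpha ** (-alpha) * beta ** (-beta)


def cd_irf_1(y, w1, w2, alpha):
    """Cost-minimizing demand for input 1 under the same technology."""
    beta = 1.0 - alpha
    return y * (alpha / beta) ** beta * w1 ** (-beta) * w2 ** beta


def dcost_dw1(y, w1, w2, alpha, h=1e-6):
    """Central finite-difference approximation to dc/dw1."""
    return (cd_cost(y, w1 + h, w2, alpha) - cd_cost(y, w1 - h, w2, alpha)) / (2 * h)


# Hypothetical values for illustration.
y, w1, w2, alpha = 10.0, 2.0, 5.0, 0.3
x1_star = cd_irf_1(y, w1, w2, alpha)
slope = dcost_dw1(y, w1, w2, alpha)
print(f"x1* = {x1_star:.6f}, dc/dw1 = {slope:.6f}")
```

The two printed numbers should agree up to finite-difference error, even though the differentiation never touches the input demands directly.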

Figure 12.1: The price of $x_1$ changes, and the firm adjusts $x_1^*$ and $x_2^*$ without affecting production.


13 Supply

13.1 Supply Determination

So far we have studied cost, taking output as given. In this lecture, we consider the output or supply decision of individual competitive firms. By competitive, we mean the firm takes the prices of inputs and outputs as exogenous (i.e. beyond the firm’s control). For any firm, profit is defined as revenue minus cost. For a competitive firm that uses two inputs, 1 and 2, to produce a single output $y$ with unit price $p$, profit is given by

$$\pi(y) = py - c(y, w_1, w_2)$$

Note that revenue $py$ is linear in output, whereas the cost function is potentially non-linear. Assume the firm selects $y$ so as to maximize profit:

$$\max \; py - c(y, w_1, w_2)$$

FONC:

$$\frac{d\pi}{dy} = p - c_y(y^*, w_1, w_2) = 0$$

or, equivalently, price = marginal cost at $y = y^*$. The SOC for a maximum is

$$\frac{d^2\pi}{dy^2} < 0 \implies -c_{yy}(y^*, w_1, w_2) < 0 \implies c_{yy}(y^*, w_1, w_2) > 0 \implies \text{MC is increasing at } y = y^*$$

The diagram is shown in Figure 13.1a. Note that $y^*$ is a function of $p$ and $w = (w_1, w_2)$. We define the supply function to be $y = y^*(p, w_1, w_2)$. What if $\pi < 0$ at $y^*(p, w)$? See Figure 13.1b.

(a) (b)

Figure 13.1: (a) The firm selects $y^*$ such that $MC = p$. (b) $p < AVC \implies y^* = 0$, and $AVC < p < AC \implies$ the firm is not turning a profit but it’s covering its operating costs, so it may be advised to stay in business and hope for better times.

• If $p < AVC$ then $y^* = 0$. The firm is losing on both fixed and variable inputs: the best choice is to shut down.

• If $p > AC$, the firm is turning a profit, so $y^*$ is such that $p = MC(y^*)$.

• If $AVC < p < AC$, the firm is incurring a loss, but it’s covering its operating costs, failing only to cover its fixed costs. The firm may well stay in business and hope for better times.

Figure 13.2 is a useful representation of the firm’s optimal choice.

Figure 13.2: The rectangle represents revenue $py^*$ while the area underneath MC represents costs (not including fixed costs). Thus the shaded area represents profits (not including fixed cost payments). Here we are using the fact that $c(y) = \int_0^y MC(s)\,ds + F$.

Observations

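• If MC is constant (e.g. Cobb-Douglas with $\alpha + \beta = 1$), then, assuming no fixed costs, $p < MC \implies$ loss $\implies y^* = 0$; $p = MC \implies \pi = 0$ at every output level, so supply is indeterminate; and $p > MC \implies \pi \propto y \implies y^* = \infty$ (infinite profit).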

• If MC is always decreasing, then supply is undefined, if not zero.

Figure 13.3: At $y^*$ defined by $p = MC(y^*)$, profit is not maximized. Why? Consider a reduction in output. Cost falls by MC and revenue falls by $p$, so $\pi$ actually increases. The SOC are not satisfied since $c_{yy} < 0$.

Examples:


• $y = x^a$, $0 < a < 1$ (one input, DRS)

The input requirement function is $x^*(y) = y^{1/a}$, which does not depend on prices. Thus

$$c(w, y) = w x^*(y) + F = w y^{1/a} + F$$

where $F$ = fixed costs, and

$$MC(y) = \frac{w}{a} y^{\frac{1-a}{a}}$$

$$AC(y) = \frac{F}{y} + w y^{\frac{1-a}{a}}$$

The optimal output supply choice $y^*$ solves $p = MC(y)$, which implies

$$p = \frac{w}{a} (y^*)^{\frac{1-a}{a}}$$

or

$$y^*(p, w) = \left(\frac{ap}{w}\right)^{\frac{a}{1-a}}$$

Note the following:

$y^*$ is homogeneous of degree zero in $(p, w)$

$y^*$ increases with $p$, decreases with $w$

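• $y = x_1^\alpha x_2^\beta$, $\alpha + \beta < 1$ (Cobb-Douglas with DRS)

Recall that

$$c(y, w_1, w_2) = k_1 w_1^{\frac{\alpha}{\alpha+\beta}} w_2^{\frac{\beta}{\alpha+\beta}} y^{\frac{1}{\alpha+\beta}}$$

for some $k_1 > 0$. Therefore

$$MC(y) = k_2 y^{\frac{1-\alpha-\beta}{\alpha+\beta}} w_1^{\frac{\alpha}{\alpha+\beta}} w_2^{\frac{\beta}{\alpha+\beta}}$$

for some constant $k_2$. Setting $p = MC$ and solving for $y$ gives

$$y^* = k_3 p^{\frac{\alpha+\beta}{1-\alpha-\beta}} w_1^{-\frac{\alpha}{1-\alpha-\beta}} w_2^{-\frac{\beta}{1-\alpha-\beta}}$$

for some constant $k_3$. Or, equivalently,

$$\log y^* = \text{constant} + \frac{\alpha+\beta}{1-\alpha-\beta}\log p - \frac{\alpha}{1-\alpha-\beta}\log w_1 - \frac{\beta}{1-\alpha-\beta}\log w_2$$

Again $y^*$ is homogeneous of degree zero in $(p, w)$, increasing in $p$, and decreasing in $w_1$ and $w_2$.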
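The properties claimed for the one-input supply function can be checked with a few lines of code. This is a minimal sketch under assumed parameter values; the function names are hypothetical and fixed costs are set to zero by default.

```python
def supply(p, w, a):
    """Supply y*(p, w) = (a*p/w)^(a/(1-a)) for the technology y = x^a, 0 < a < 1."""
    return (a * p / w) ** (a / (1.0 - a))


def marginal_cost(y, w, a):
    """MC(y) = (w/a) * y^((1-a)/a), from c(w, y) = w*y^(1/a) + F."""
    return (w / a) * y ** ((1.0 - a) / a)


def profit(y, p, w, a, F=0.0):
    """Profit p*y - c(w, y) with fixed cost F."""
    return p * y - (w * y ** (1.0 / a) + F)


# Hypothetical numbers for illustration.
p, w, a = 6.0, 2.0, 0.5
y_star = supply(p, w, a)

print("y* =", y_star)
print("MC(y*) =", marginal_cost(y_star, w, a), "vs p =", p)
print("supply at (3p, 3w) =", supply(3 * p, 3 * w, a))  # homogeneity of degree zero
print("profit at y*, and at y* -/+ 0.1:",
      profit(y_star, p, w, a),
      profit(y_star - 0.1, p, w, a),
      profit(y_star + 0.1, p, w, a))
```

Scaling both prices by the same factor leaves supply unchanged, and profit is lower on either side of $y^*$, consistent with the SOC.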

As an exercise, prove that for a general cost function, the competitive supply response is homogeneous of degree zero in all prices (input and output). Hint: The cost function is homogeneous of degree one in all input prices.


13.2 The Law of Supply

The Law of Supply states that competitive supply functions are always upward sloping:

$$\frac{\partial y^*}{\partial p} > 0$$

Why? At the optimal level of supply, $p = MC$. But MC is increasing by the SOC, so if $p$ increases, the new optimal level of supply increases, too: we simply move along the MC schedule as in Figure 13.4.

Figure 13.4: Assuming the SOC is satisfied, an increase in $p$ is accompanied by an increase in $y^*$ since the intersection moves upward and to the right.

Formally, $y^*$ is defined as the solution to

$$p - c_y(y^*(p, w_1, w_2), w_1, w_2) = 0. \tag{13.1}$$

This FONC holds even if we move $p$ (or either of $w_1$ or $w_2$ for that matter). Therefore, differentiating both sides of (13.1) w.r.t. $p$,

$$1 - c_{yy}(y^*(p, w_1, w_2), w_1, w_2)\frac{\partial y^*}{\partial p} = 0$$

hence

$$\frac{\partial y^*}{\partial p} = \frac{1}{c_{yy}(y^*, w_1, w_2)}.$$

But $c_{yy}(y^*(p, w_1, w_2), w_1, w_2) > 0$ by the SOC, so $\partial y^*/\partial p > 0$!
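The implicit-function argument can be illustrated numerically. The sketch below assumes a hypothetical cost function $c(y) = wy^3$ (so MC is increasing), solves $p = MC(y)$ by bisection, and compares a finite-difference slope of supply with $1/c_{yy}$. The cost function, the bracket for the bisection, and the parameter values are all assumptions chosen for this example.

```python
def marginal_cost(y, w=2.0):
    """MC for the hypothetical cost function c(y) = w*y**3, i.e. MC = 3*w*y**2."""
    return 3.0 * w * y ** 2


def c_yy(y, w=2.0):
    """Slope of MC: the second derivative of c(y) = w*y**3."""
    return 6.0 * w * y


def supply(p, lo=0.0, hi=1e3, tol=1e-12):
    """Solve p = MC(y) by bisection; valid because MC is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal_cost(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p, h = 24.0, 1e-4
y_star = supply(p)
slope_fd = (supply(p + h) - supply(p - h)) / (2 * h)  # finite-difference dy*/dp
slope_formula = 1.0 / c_yy(y_star)                     # 1 / c_yy, as in the text

print(f"y* = {y_star:.6f}")
print(f"finite-difference slope = {slope_fd:.6f}, 1/c_yy = {slope_formula:.6f}")
```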

13.3 Changes in Input Prices

What is the effect of an increase in input prices on the firm’s output decisions? An increase in input prices (say $w_1$) is associated with a shift in MC. See Figure 13.5.

In the case where MC rises with $w_1$, we have $\partial y^*/\partial w_1 < 0$. Is this always the case? We shall see in the next section!

Figure 13.5: An increase in $w_1$ causes the MC curve to shift, usually upward, which causes the intersection of $p$ and MC to move inward.


14 Input Demand for a Competitive Firm

In this lecture we describe the determination of input demands for a competitive firm that sells output $y$ at price $p$. Its production function is $y = f(x_1, x_2)$. Inputs 1 and 2 have prices $w_1$ and $w_2$.

The firm’s optimal choice of $(x_1, x_2)$ is determined in two steps. First, the firm constructs its cost function $c(y, w_1, w_2)$. This implicitly defines the optimal input demands $x_1$ and $x_2$ for each level of $y$, given input prices.

$$c(y, w_1, w_2) = \min_{x_1, x_2} \; w_1 x_1 + w_2 x_2 \quad \text{s.t.} \quad y = f(x_1, x_2)$$

$$= w_1 x_1^c(y, w_1, w_2) + w_2 x_2^c(y, w_1, w_2)$$

where $x_1^c(y, w_1, w_2)$ and $x_2^c(y, w_1, w_2)$ are the conditional factor demands. The word conditional signifies that these input demands depend on the output choice. Note that $x_1^c$ and $x_2^c$ are very much like the compensated demand functions for the consumer. In particular, setting $L = w_1 x_1 + w_2 x_2 - \mu[f(x_1, x_2) - y]$, we have the following FONC:

$$L_1 = w_1 - \mu f_1(x_1, x_2) = 0$$

$$L_2 = w_2 - \mu f_2(x_1, x_2) = 0$$

$$L_\mu = -f(x_1, x_2) + y = 0$$

The ratio of the first two FONC implies that $w_1/w_2 = f_1/f_2$. Recall that $f_1$ is the marginal product of input 1. The ratio $f_1/f_2$ is called the marginal rate of technical substitution (MRTS). This is the firm’s equivalent of the consumer’s MRS; it gives the (absolute) slope of an isoquant at $(x_1, x_2)$. So, the first order conditions for the cost-min problem are illustrated in Figure 14.1.

Figure 14.1: Illustration of FOC for cost-min problem.

Recall from Section 12.1 that

$$x_i^c(y, w_1, w_2) = \frac{\partial c(y, w_1, w_2)}{\partial w_i}, \quad i = 1, 2$$

Having determined the cost of producing a given level of output, the next step for the firm is to choose what level of output to produce. It does so by maximizing profit $\pi = py - c(y, w_1, w_2)$:

$$p - c_y = 0 \implies p = MC \tag{14.1}$$

$$-c_{yy} < 0 \implies \frac{\partial MC}{\partial y} > 0 \tag{14.2}$$

Equation (14.2) means that marginal cost must be rising. See Figure 13.1a. The optimal choice of $y$, given $(p, w_1, w_2)$, is the value $y^*$ such that

$$p = MC(y^*, w_1, w_2)$$

i.e. output is chosen so that price equals marginal cost. Now we are ready to define the firm’s unconditional input choices. The firm’s unconditional input demands are simply:

$$x_i(p, w_1, w_2) = x_i^c(y^*(p, w_1, w_2), w_1, w_2) \quad (i = 1, 2)$$

In other words, the unconditional input demands are the conditional demands, for the optimal choice of $y$. We can think of the problem of finding optimal input demand choices as one of solving two problems simultaneously: cost-min and $p = MC$.

Figure 14.2: The level of production plays the role of utility in the consumer choice analogy: $w_1$ rises, conditional input demand falls.

What happens when $w_1$ rises? Since

$$x_1(p, w_1, w_2) = x_1^c(y^*(p, w_1, w_2), w_1, w_2)$$

we have

$$\frac{\partial x_1}{\partial w_1} = \frac{\partial x_1^c}{\partial w_1} + \frac{\partial x_1^c}{\partial y} \times \frac{\partial y^*}{\partial w_1}$$

The first term is the response of optimal input demand, holding constant $y$. This is called the substitution effect. It is just like the consumer’s substitution effect, which is defined as the change in demand, holding constant $u$. Instead of being constrained to move along an indifference curve, the firm is constrained to move along an isoquant as one can see in Figure 14.2.

The second term is called the scale effect. It is somewhat similar to the consumer’s income effect, except the analogy can be misleading. It reflects the fact that when $w_1$ rises, the firm’s MC curve shifts, so the optimal choice of $y$ shifts. See Figure 14.3.

Figure 14.3: The optimal choice of $y$ shifts due to a change in $w_1$. Assuming input 1 is non-inferior, the shift in MC is upward.

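Recall that if input 1 is non-inferior, then MC shifts upward when $w_1$ rises. Why?

$$\frac{\partial MC}{\partial w_1} = \frac{\partial}{\partial w_1}\left[\frac{\partial c}{\partial y}\right] = \frac{\partial^2 c}{\partial y\,\partial w_1} = \frac{\partial^2 c}{\partial w_1\,\partial y} = \frac{\partial}{\partial y}\left[\frac{\partial c}{\partial w_1}\right] = \frac{\partial x_1^c}{\partial y}$$

Thus the derivative of MC w.r.t. $w_1$ is the same quantity as the derivative of the conditional input demand function w.r.t. $y$. If input 1 is non-inferior, then $\partial x_1^c/\partial y > 0$, so MC shifts upward whenever $w_1$ rises.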


In this case we have the pictures shown in Figure 14.4. When $w_1$ rises, the substitution effect and scale effect both cause a reduction in demand for input 1.

With an inferior input, when $w_1$ rises, MC shifts downward. (E.g. when shovels rise in price the marginal cost of holes goes down.) But the scale effect is also negative because although the rise in $w_1$ causes the firm to want to increase output, input 1 is inferior, so the expansion in output reduces demand! See Figure 14.5.

There is another way to look at the problem of input demands, a so-called “direct approach.” Suppose the firm simply chose $x_1$ and $x_2$ to maximize

$$\pi = p f(x_1, x_2) - w_1 x_1 - w_2 x_2$$

This is an unconstrained optimization problem, so the FONC are:

$$p f_1(x_1, x_2) - w_1 = 0 \tag{14.3}$$

$$p f_2(x_1, x_2) - w_2 = 0 \tag{14.4}$$

Note that combining (14.3) and (14.4) returns the tangency condition $f_1/f_2 = w_1/w_2$. Also, the firm sets $w_1/f_1 = w_2/f_2 = p$. What do these equations mean? If the firm had to increase output, it could do so by increasing input 1 or input 2. If it used input 1, it would require $1/f_1(x_1, x_2)$ units of input 1 to produce an additional unit of output. The marginal cost would be $w_1/f_1(x_1, x_2) = p$. If instead the firm used input 2, the marginal cost would again be $w_2/f_2(x_1, x_2) = p$.

Looking back at the Lagrangian for the cost-min problem, notice that the FONC are

$$w_1 = \mu f_1 \implies \mu = w_1/f_1$$

and

$$w_2 = \mu f_2 \implies \mu = w_2/f_2$$

Remember that $\mu$ is marginal cost. So, when the firm solves the cost-min problem and sets $p = MC = \mu$, it achieves the same result as if it had carried out the direct approach. Sometimes one method is more convenient than the other, that’s all.
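The equivalence of the two approaches can be verified for a specific technology. The sketch below assumes a Cobb-Douglas production function $f = x_1^\alpha x_2^\beta$ with $\alpha + \beta < 1$ and hypothetical prices. It follows the two-step route (conditional demands, then $p = MC$) and then checks that the resulting inputs satisfy the direct FONC (14.3) and (14.4). All function names are illustrative.

```python
def conditional_demands(y, w1, w2, alpha, beta):
    """Conditional factor demands x1^c, x2^c for f = x1^alpha * x2^beta."""
    s = alpha + beta
    x1 = y ** (1 / s) * (alpha / beta) ** (beta / s) * w1 ** (-beta / s) * w2 ** (beta / s)
    x2 = y ** (1 / s) * (beta / alpha) ** (alpha / s) * w1 ** (alpha / s) * w2 ** (-alpha / s)
    return x1, x2


def optimal_output(p, w1, w2, alpha, beta):
    """Step 2: solve p = MC(y). Since c(y) = K*y^(1/s), MC = (K/s)*y^(1/s - 1)."""
    s = alpha + beta
    x1, x2 = conditional_demands(1.0, w1, w2, alpha, beta)
    K = w1 * x1 + w2 * x2  # cost of producing one unit, c(1, w1, w2)
    return (p * s / K) ** (s / (1 - s))


# Hypothetical values; alpha + beta < 1 so that MC is increasing.
p, w1, w2, alpha, beta = 10.0, 2.0, 3.0, 0.3, 0.5
y_star = optimal_output(p, w1, w2, alpha, beta)
x1, x2 = conditional_demands(y_star, w1, w2, alpha, beta)

# Direct approach: the FONC p*f1 = w1 and p*f2 = w2 should hold at (x1, x2).
f1 = alpha * x1 ** (alpha - 1) * x2 ** beta
f2 = beta * x1 ** alpha * x2 ** (beta - 1)
print("p*f1 =", p * f1, "w1 =", w1)
print("p*f2 =", p * f2, "w2 =", w2)
```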


(a) (b)

Figure 14.4: SE causes x1 to decrease, scale effect does too.

(a) (b)

Figure 14.5: Once again despite x1 being an inferior input, SE causes x1 to decrease, and so does scaleeffect.


15 Industry Supply

The supply curve for an industry consists of the “horizontal sum” of the supply curves of each individual firm as shown in Figure 15.1.

Notice that if firms vary with respect to their costs, at any market price some firms are profiting, some are on the margin, and others are out of business. A good example of this is the case of oil wells. Some wells have low variable costs and always are profitable to operate. Others are high-cost, and are activated only when crude prices are high. We usually call the profits earned by the infra-marginal suppliers rents. Presumably the lower costs of these firms arise from their control over a scarce resource.

A competitive market is in equilibrium if the following conditions hold:

1. Each existing firm has p = MC and π ≥ 0.

2. No remaining firm can afford to enter the market.

These ideas are applicable to the case of a single firm with multiple facilities, or plants. For example if a firm owns two plants, with MC schedules $MC_1(y_1)$ and $MC_2(y_2)$, then the firm operates efficiently by viewing the plants as separate suppliers. See Figure 15.2 for an example of this principle, called the principle of decentralization.


Figure 15.1: When prices reach $p_1$, Firm 1 enters the market, when prices then reach $p_2$, Firm 2 enters the market (causing the discontinuity in the supply curve), and so on.

Figure 15.2: At prices below $p_1$, the firm is completely inactive; at prices between $p_1$ and $p_2$, only plant 1 is active, while at prices above $p_2$, both plants are active.
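The horizontal sum is easy to compute once each firm’s supply function is known. The sketch below assumes, purely for illustration, that each firm has a linear marginal cost $MC_i(y) = c_i + d_i y$ and no fixed cost, so that it is active only when $p > c_i$. With these assumptions industry supply has kinks rather than the jumps described in Figure 15.1, which arise when firms enter at a positive output level. The names and numbers are hypothetical.

```python
def firm_supply(p, c0, d):
    """Supply of a firm with MC(y) = c0 + d*y and no fixed cost: produce where
    p = MC if p exceeds minimum AVC (here c0), otherwise shut down."""
    return max(0.0, (p - c0) / d)


def industry_supply(p, firms):
    """Horizontal sum: add up quantities across firms at a common price p."""
    return sum(firm_supply(p, c0, d) for c0, d in firms)


# Hypothetical (c0, d) pairs: a low-cost, a medium-cost, and a high-cost firm.
firms = [(1.0, 0.5), (3.0, 1.0), (6.0, 2.0)]
for p in (0.5, 2.0, 4.0, 8.0):
    active = sum(1 for c0, d in firms if p > c0)
    print(f"p = {p}: industry supply = {industry_supply(p, firms):.2f}, "
          f"active firms = {active}")
```

The same function applies to a single firm with several plants if each pair is read as one plant’s MC schedule.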


16 Monopoly I

16.1 Monopolist’s Objective

A monopolist is the sole supplier in a given market. The critical feature of monopolistic behavior is the fact that a monopolist sets the price, or quantity. Monopolies arise

(a) through exclusive control over resources, e.g. De Beers’ monopoly of diamond marketing

(b) through exclusive legal rights, e.g. public utilities, drug companies with patents, etc.

Suppose the demand for output is represented by the function $y = D(p)$. Then we can invert this to $p = p(y)$, where $p = D^{-1}$ is usually referred to as the inverse demand function. A monopolist’s profit is

$$\pi(w_1, w_2, y) = y\,p(y) - c(w_1, w_2, y)$$

The FONC for profit maximization is

$$p(y) + y\,p'(y) - c_y(w_1, w_2, y) = 0 \implies p(y) + y\,p'(y) = c_y(w_1, w_2, y)$$

The LHS represents marginal revenue $MR(y) = p(y) + y\,p'(y)$. If demand is downward sloping, as usual, then $p'(y) < 0$, so $MR(y) < p$. This is the key point about a monopoly. Since a monopolist controls the market, it cannot treat price as exogenous. Rather, it has to take into account the fact that a rise in sales will necessarily come at the expense of a reduction in price. Note that there may be close substitutes for a product. But as long as a firm is the sole supplier of a given product, it has monopoly power.

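Define the elasticity of demand

$$\eta = \frac{\partial y}{\partial p} \cdot \frac{p}{y} = \frac{1}{p'(y)} \cdot \frac{p}{y} \implies p'(y) = \frac{1}{\eta} \cdot \frac{p(y)}{y}$$

We then have

$$MR(y) = p(y) + y\,p'(y) = p(y) + y\left(\frac{1}{\eta}\right)\left(\frac{p(y)}{y}\right) = p(y)\left(1 + \frac{1}{\eta}\right)$$

So, for a monopolist,

$$p(y)\left(1 + \frac{1}{\eta}\right) = MC$$

As the market demand becomes closer and closer to a horizontal line, $\eta \to -\infty$, demand becomes perfectly elastic, and $p = MC$. In other words, in the limiting case, monopoly becomes perfect competition.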
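The identity $MR = p(1 + 1/\eta)$ can be checked numerically on a demand curve whose elasticity varies along the curve. The sketch below assumes a hypothetical linear demand $y = a - bp$ with $a = 100$ and $b = 2$, and compares a finite-difference derivative of revenue with the elasticity formula at three output levels.

```python
def inverse_demand(y, a=100.0, b=2.0):
    """p(y) for the hypothetical linear demand y = a - b*p."""
    return (a - y) / b


def elasticity(y, a=100.0, b=2.0):
    """eta = (dy/dp) * (p/y) = -b * p / y."""
    return -b * inverse_demand(y, a, b) / y


def marginal_revenue_fd(y, h=1e-6):
    """Central finite-difference derivative of revenue R(y) = y * p(y)."""
    def revenue(q):
        return q * inverse_demand(q)
    return (revenue(y + h) - revenue(y - h)) / (2 * h)


for y in (20.0, 50.0, 80.0):
    p, eta = inverse_demand(y), elasticity(y)
    print(f"y = {y}: eta = {eta:.3f}, MR = {marginal_revenue_fd(y):.3f}, "
          f"p(1 + 1/eta) = {p * (1 + 1 / eta):.3f}")
```

At the largest output level demand is inelastic and marginal revenue is negative, which is why a monopolist would not choose to operate there.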

The picture associated with monopoly is shown in Figure 16.1.

Observations:

• A monopolist always sets $MR = MC$. Since $MR = p(1 + 1/\eta)$ and $\eta < 0$, $MR < p$. If $|\eta| < 1$, then $1/\eta < -1$ and MR is negative. It follows that a monopolist never operates in a market in which demand is inelastic. Intuitively, if demand were inelastic, one could increase revenue by raising the price! This is a very powerful result. It says that some markets cannot be considered monopolies, namely those with measured elasticities of demand less than 1 in absolute value.

Figure 16.1: The monopolist selects $y^*$ such that $MC = MR$.

• If the monopolist’s MC schedule were the MC schedule of a price taker (or a set of price takers, i.e. a competitive industry), then equilibrium would occur at $p = MC$. This would entail higher output and lower price, but lower profit to the industry as a whole. See Figure 16.2.

Figure 16.2: The area of the region bounded by $p = p_M$, $y = D(p_M)$, and MC, is greater than the area of the region bounded by $p = p_C$, $y = D(p_C)$, and MC.

• A monopolist does not have a supply schedule per se. First, the monopolist examines the demand function. Then she establishes the price. There is no schedule of price/quantity combinations.

• The SOC for profit maximization is $\frac{\partial}{\partial y}(MR - MC) < 0$, or slope of MR < slope of MC. Even if MC is downward sloping, there may still exist an equilibrium for the monopolist.

16.2 Comparative Statics

See Figure 16.3. Note the following:


Figure 16.3: If MC increases, output falls, assuming MR is negatively sloped, which is usually the case.

• If MC increases (say because an input becomes more expensive), output will fall, assuming MR is negatively sloped.

• Factors that shift MR will cause output to increase or decrease along the MC schedule. A constant elasticity of demand function gives

$$y = A p^\eta p_z^\gamma I^e$$

where $z$ is another good, $I$ is income, $p$ = price of $y$, and $p_z$ = price of $z$. Inverse demand is given by

$$p = A^{-1/\eta} y^{1/\eta} p_z^{-\gamma/\eta} I^{-e/\eta}$$

and

$$MR = p\,(1 + 1/\eta) = (1 + 1/\eta) A^{-1/\eta} y^{1/\eta} p_z^{-\gamma/\eta} I^{-e/\eta}$$

Thus increases in $I$ or $p_z$ shift MR.

Examples:

• Linear $y = a - bp$, and $c(y) = \alpha + \beta y$. To find inverse demand, note that $p = a/b - y/b$. Let $a' = a/b$ and $b' = 1/b$. Inverse demand may be written $p = a' - b'y$. Revenue is given by $y\,p(y) = a'y - b'y^2 \implies MR(y) = a' - 2b'y$. See Figure 16.4. Equating MC and MR, we obtain $a' - 2b'y = \beta$, or

$$y^* = \frac{a' - \beta}{2b'}$$

and

$$p^* = a' - b'y^* = \frac{a' + \beta}{2}$$

• Exponential $y = ap^\eta$, $\eta < -1$. Inverse demand is given by $p = a'y^{1/\eta}$, where $a' = a^{-1/\eta}$, and revenue equals $y\,p(y) = a'y^{1+1/\eta}$, hence we have

$$MR = (1 + 1/\eta)\,a'y^{1/\eta}$$

Suppose cost also is exponential, i.e. $c(y) = \alpha y^\beta$, $\beta > 0$. This implies $MC = \alpha\beta y^{\beta-1}$. Profit is thus $a'y^{1+1/\eta} - \alpha y^\beta$, and the FONC is

$$(1 + 1/\eta)\,a'y^{1/\eta} = \alpha\beta y^{\beta-1}$$

The SOC is

$$\frac{1 + 1/\eta}{\eta}\,a'y^{1/\eta-1} - \alpha\beta(\beta - 1)y^{\beta-2} < 0$$

which is automatically satisfied whenever $\beta > 1$, since $\eta < -1$ makes the first term negative. Solving the FONC,

$$y = \left[\left(1 + \frac{1}{\eta}\right)\frac{a'}{\alpha\beta}\right]^{\frac{\eta}{\eta(\beta-1)-1}}$$

Note that the optimal choice depends on the parameters of the demand and cost functions. A change in elasticity of demand causes a shift in the optimal choice of output.
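Both examples can be evaluated with concrete numbers. In the sketch below the function names and all parameter values are hypothetical. The linear case returns $(y^*, p^*)$ from the closed forms, and the constant-elasticity case returns $y^*$, after which MR and MC are computed separately to confirm they coincide.

```python
def linear_monopoly(a, b, beta):
    """Demand y = a - b*p, cost c(y) = alpha + beta*y (alpha does not affect y*).
    Returns (y*, p*) using a' = a/b and b' = 1/b."""
    a_p, b_p = a / b, 1.0 / b
    y_star = (a_p - beta) / (2.0 * b_p)
    p_star = a_p - b_p * y_star  # equals (a' + beta) / 2
    return y_star, p_star


def ce_monopoly(a, eta, alpha, beta):
    """Demand y = a*p^eta with eta < -1, cost c(y) = alpha*y^beta.
    Returns y* from the closed-form solution of MR = MC."""
    a_p = a ** (-1.0 / eta)
    base = (1.0 + 1.0 / eta) * a_p / (alpha * beta)
    return base ** (eta / (eta * (beta - 1.0) - 1.0))


# Hypothetical parameter values for the linear case.
y_lin, p_lin = linear_monopoly(a=100.0, b=2.0, beta=10.0)
print("linear demand: y* =", y_lin, "p* =", p_lin)

# Hypothetical parameter values for the constant-elasticity case.
a, eta, alpha, beta = 100.0, -2.0, 1.0, 2.0
y_ce = ce_monopoly(a, eta, alpha, beta)
a_p = a ** (-1.0 / eta)
mr = (1.0 + 1.0 / eta) * a_p * y_ce ** (1.0 / eta)
mc = alpha * beta * y_ce ** (beta - 1.0)
print("constant elasticity: y* =", y_ce, "MR =", mr, "MC =", mc)
```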

Figure 16.4: Marginal revenue in the case of linear demand.

16.3 Monopoly in Two or More Markets

Suppose a monopolist has access to two markets.

Market 1: $p_1 = p_1(y_1)$

Market 2: $p_2 = p_2(y_2)$

If trade is restricted between the two markets, then $p_1$ and $p_2$ can differ. The firm’s profits are

$$\pi = p_1 y_1 + p_2 y_2 - c(y_1 + y_2)$$

The FONC are

$$p_1 + y_1 \frac{\partial p_1}{\partial y_1} - c'(y_1 + y_2) = 0, \quad \text{or } MR_1 = MC$$

and

$$p_2 + y_2 \frac{\partial p_2}{\partial y_2} - c'(y_1 + y_2) = 0, \quad \text{or } MR_2 = MC$$

Since $MR_1 = p_1(1 + 1/\eta_1)$ and $MR_2 = p_2(1 + 1/\eta_2)$, and in light of the FONC, our model predicts that

$$p_1(1 + 1/\eta_1) = p_2(1 + 1/\eta_2) \implies \frac{p_1}{p_2} = \frac{1 + 1/\eta_2}{1 + 1/\eta_1} = \frac{\eta_1}{\eta_2}\left(\frac{1 + \eta_2}{1 + \eta_1}\right)$$

For example, if $\eta_1 = -1.5$ and $\eta_2 = -2.5$, then $p_1/p_2 = 1.8$. The monopolist charges more in the more inelastic market. This is known as price discrimination.
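The price ratio in the example can be reproduced directly from the pricing rule $p_i(1 + 1/\eta_i) = MC$. The sketch below treats the elasticities as constants and assumes a hypothetical common marginal cost of 10; the function name is illustrative.

```python
def discriminating_price(mc, eta):
    """Price in a market with constant elasticity eta, from p*(1 + 1/eta) = MC."""
    if eta >= -1.0:
        raise ValueError("MR = MC has no solution unless demand is elastic (eta < -1)")
    return mc / (1.0 + 1.0 / eta)


mc = 10.0                 # hypothetical common marginal cost
eta1, eta2 = -1.5, -2.5   # elasticities from the example in the text
p1 = discriminating_price(mc, eta1)
p2 = discriminating_price(mc, eta2)
ratio_formula = (eta1 / eta2) * (1.0 + eta2) / (1.0 + eta1)
print("p1 =", p1, "p2 =", p2)
print("p1/p2 =", p1 / p2, "formula =", ratio_formula)
```

The ratio does not depend on the level of marginal cost, only on the two elasticities.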


17 Monopoly II

We have shown that a monopolist prefers to distinguish between markets and charge more to customers with more inelastic demand. This phenomenon is called price discrimination. Sellers have a strong incentive to attempt to separate customers according to their demand elasticities, and charge discriminatory prices. Consumers, on the other hand, have a strong incentive to imitate high-elasticity consumers. There are many devices to separate consumers according to elasticity:

• Advance-purchase versus regular coach fares on airlines. Here, the airlines discriminate against customers who book at the last minute (typically business travelers) and charge lower prices to consumers who are willing to shop around.

• Single tokens versus monthly passes on public transit. Presumably, commuters have more elastic demand for public transit than out-of-town or occasional passengers.

• Discount coupons. Here, retailers are willing to charge lower prices to consumers who are better informed while continuing to charge high prices to consumers who “can’t be bothered” with coupons (and therefore reveal themselves as having inelastic demands).

• After-season sales, special Monday-and-Tuesday only sales. Again, retailers are attempting to separate high-elasticity consumers from those who want only up-to-date items at the peak of their popularity.

In each case, the key to price discrimination is to impose a cost on the low-price consumers (those with more elastic demands), in order to prevent high-price consumers from masquerading as low-price consumers. The cost must be too high for low-elasticity consumers, yet not high enough to discourage others from buying altogether.

As an example, suppose that, across a population, individual demand elasticities are negatively correlated with wages. Those with the highest wages have the most inelastic demand; those with the lowest wages have the most elastic demand. A firm can use a queue, or “line-up,” as follows: charge a high price with no waiting time, and a low price to those who are willing to line up in a queue for a while (e.g. the price difference between buying a ticket at a box office versus buying over the phone). For a consumer with wage $w$, the full price is given by

$$p' = \begin{cases} p & \text{if she decides not to wait} \\ p + wt - d & \text{if she waits, where } d = \text{price discount and } t = \text{waiting time} \end{cases}$$

For this individual, if $wt > d$, she bypasses the line and pays $p$, whereas if $wt < d$, she waits in line and pays $p - d$. The firm has successfully charged two prices!

Another way to implement price discrimination is by charging less to those who buy more. Suppose, for example, that there are two kinds of buyers, (1) low-volume buyers with inelastic demand, and (2) high-volume buyers with elastic demand. See Figure 17.1. The monopolist can choose $y_0$ between $y_1^*$ and $y_2^*$ and offer a two-tiered price system: $p_1$/unit for those who buy less than $y_0$, and $p_2$/unit for those who buy at least $y_0$. Note that we must have $p_1 y_1^* < p_2 y_2^*$, or else the low-volume customers would buy $y_0$ units and discard what they don’t need.

Figure 17.1: On the left, low-volume buyers with inelastic demand, and on the right, high-volume buyers with elastic demand. The monopolist can price discriminate in this case.

The ultimate price discrimination strategy would involve charging a separate price for each unit sold, as in Figure 17.2 (for the first unit sold, charge $p_1$, for the 20th unit, $p_{20}$, etc.). Notice that in this case the MR of the next unit sold is equal to its price, since the seller doesn’t have to lower prices on the infra-marginal (previously sold) units to sell an additional unit, which means the MR curve is identical to the demand function.⁹

Figure 17.2: Ideally, the monopolist would charge a separate price for each unit sold.

Thus under perfect price discrimination:

• quantity is equal to its level under perfect competition

• monopolist revenue = area underneath demand curve

Relative to a perfect price discrimination scheme, consumers benefit when all consumers pay the same price. The savings to all consumers is the shaded area underneath the demand curve, above the price line in Figure 17.3. This area is sometimes called consumer’s surplus (CS). By analogy, the area over the MC curve and up to the price line is called producer’s surplus (PS). We have noted previously that this area equals revenue less total variable cost, so $PS = \pi + F$. Also, we saw that in a competitive industry, the supply schedule is simply the combined MC schedule of the firms that comprise the industry. The area between the supply and demand curves (or MC and demand) represents the total surplus CS + PS pictured in Figure 17.4. This is consistent because CS and PS both are measured in dollars. Applied economists often evaluate the effect of a government intervention in a given market by computing $\Delta CS + \Delta PS + GC$, where GC denotes the cost to the government.

Figure 17.3: The shaded region is sometimes called consumer’s surplus.

Figure 17.4: The dark region is referred to as producer’s surplus.

The appeal of this exercise is obvious: it assigns a dollar value to the inefficiency that arises due to monopolization or the imposition of a tax/subsidy. Nonetheless, there is a problem. Recall that if $y = D(p)$ is the demand function, $D$ indicates how much is purchased when $y$ costs $p$/unit. On the other hand, $D$ says nothing regarding demand when the price of the next unit is $p$ but prices for all previous units are higher. In general, having paid more for the inframarginal units, the consumers who purchased them have less income with which to purchase additional units. Higher prices on the inframarginal units have an income effect that is not captured by the ordinary demand function. In fact, the only case in which CS and PS analysis is completely legitimate is the one in which demand does not depend on income (the slope of the indifference curve through $x^1 = (x_1, x_2^1)$, at $x^1$, equals the slope of the indifference curve through $x^2 = (x_1, x_2^2)$, at $x^2$), or each consumer buys at most one unit of the commodity (so that higher prices for the first, and only, unit purchased don’t lower subsequent demands).

⁹ Actually, as you shall see if you continue reading, this is not quite true. The demand function represents the number of units demanded when all units sell for a given price.

In spite of this problem, CS and PS analysis is a good starting point for evaluating the merits of a market intervention. For example, suppose that a market is in equilibrium at p = p0, x = x0, when a per-unit tax of t is imposed as in Figure 17.5. Demand falls to x1, and the amount received by suppliers falls to p1. Tax revenue is tx1. The combined loss in CS and PS, however, exceeds the tax revenue by an amount equal to the area of the shaded triangle. This excess loss is referred to as the deadweight loss due to the tax. It provides a rough estimate of the inefficiency brought on by the tax.

Figure 17.5: The tax scenario pictured is inefficient because the shaded triangle can be thought of as lost, since it is neither revenue nor savings to any of the parties involved in this market.
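As a rough numerical check of the deadweight-loss triangle, the following Python sketch works through one example. The linear demand and supply curves and the tax rate are hypothetical values chosen for illustration; they do not come from the notes.

```python
from fractions import Fraction

def equilibrium(a, b, d, t):
    """Assumed curves: demand x = a - b*(buyer price), supply x = d*(seller price).

    With a per-unit tax t the buyer pays the seller price plus t.
    Returns (seller price, buyer price, quantity).
    """
    p_seller = Fraction(a - b * t, b + d)  # solves a - b*(p + t) = d*p
    return p_seller, p_seller + t, d * p_seller

a, b, d, t = 100, 2, 3, 5                  # hypothetical parameters

p0, _, x0 = equilibrium(a, b, d, 0)        # before the tax
p1, p_buyer, x1 = equilibrium(a, b, d, t)  # after the tax

choke = Fraction(a, b)                     # price at which demand falls to zero
delta_cs = (choke - p_buyer) * x1 / 2 - (choke - p0) * x0 / 2
delta_ps = p1 * x1 / 2 - p0 * x0 / 2       # this supply curve passes through the origin
tax_revenue = t * x1

# Loss in CS + PS that is not recovered as tax revenue
deadweight_loss = -(delta_cs + delta_ps) - tax_revenue
triangle = Fraction(t) * (x0 - x1) / 2     # area of the shaded triangle

print(deadweight_loss, triangle)
```

In this example the surplus lost beyond the tax revenue coincides with the triangle of height t and base x0 − x1, which is what the figure suggests for linear curves.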

Exercises:

1. Calculate deadweight loss in terms of elasticities of supply and demand.

2. Prove that CS + PS is a maximum when D intersects MC.

3. Calculate ∆CS + ∆PS when a competitive industry is monopolized.


18 Consumer’s Surplus

In Econ 1 you probably were introduced to the concept of consumer's surplus (CS). Consider a consumer who is choosing between two goods, x and y. Denote by $x(p_x, p_y, I)$ the consumer's demand for good x, given prices $p_x$ and $p_y$, and income $I$. Now suppose the price of good x rises from $p_x^0$ to $p_x^1$. The change in consumer's surplus is the shaded region in Figure 18.1, which can be written as

$$\Delta CS = \int_{p_x^0}^{p_x^1} x(p_x, p_y, I)\, dp_x$$

As we noted previously, there is a problem with CS: although the vertical height of the inverse demand function appears to be the most you would be willing to pay for each additional unit, if someone actually charged you a different price for each unit, your demand would not be given by the conventional demand curve (since that is derived under the assumption that you pay the same price for every unit you purchase). There is, however, a measure of welfare that does make sense; in fact, there are two. Let $u^0$ represent utility at $(p_x^0, p_y, I)$, and let $u^1$ represent utility at $(p_x^1, p_y, I)$. Note that $u^0 > u^1$ since a rise in prices makes a consumer worse off.

Figure 18.1: The shaded region represents the change in consumer's surplus due to an increase in $p_x$.

Also note that $I = e(p_x^0, p_y, u^0)$, where $e$ is the expenditure function. In other words, $I$ is the minimum amount of money needed to achieve $u^0$ at prices $p_x^0$ and $p_y$. This follows from the fact that the consumer wasn't wasting money initially.

Likewise, $I = e(p_x^1, p_y, u^1)$. (Make sure you understand why this must be true.)

Consider the quantity

$$EV = e(p_x^1, p_y, u^1) - e(p_x^0, p_y, u^1) = I - e(p_x^0, p_y, u^1)$$

This is the amount of money one would have to take away from our consumer initially, leaving prices alone, so that he would be indifferent regarding a rise in prices. This is called equivalent variation. It can be thought of as the income equivalent of a rise in prices or, more specifically, a natural means of measuring the effect on welfare of a rise in prices.

Alternatively, consider the quantity

$$CV = e(p_x^1, p_y, u^0) - e(p_x^0, p_y, u^0) = e(p_x^1, p_y, u^0) - I$$

This is the amount of money one would have to provide our consumer in order for him to be as well off under the new prices as he was initially. This is called the compensating variation. It also appears to be a plausible measure of the effect on welfare of a rise in prices.

Now we shall use Shephard's Lemma to connect these two quantities to the area underneath compensated demand curves. Specifically, start from the fact that

$$\partial_{p_x} e(p_x, p_y, u^0) = x^c(p_x, p_y, u^0)$$

By the Fundamental Theorem of Calculus,

$$e(p_x^1, p_y, u^0) = e(p_x^0, p_y, u^0) + \int_{p_x^0}^{p_x^1} \partial_{p_x} e(p_x, p_y, u^0)\, dp_x$$

hence we have

$$CV = \int_{p_x^0}^{p_x^1} x^c(p_x, p_y, u^0)\, dp_x$$

which is the area "underneath" the compensated demand curve from $p_x^0$ to $p_x^1$. See Figure 18.2. Note that $x^c(p_x^0, p_y, u^0) = x(p_x^0, p_y, I)$, so the regular demand curve, with $I$, and the compensated demand curve, with $u^0$, intersect at $(x^0, p_x^0)$. But the regular demand curve is flatter. Why? Recall that by Slutsky:

$$\frac{\partial x}{\partial p_x} = \frac{\partial x^c}{\partial p_x} - x\,\frac{\partial x}{\partial I}$$

Figure 18.2: The area of the shaded region is the Compensating Variation.

If x is a normal good, then ∂x/∂I > 0, and a rise in prices causes the regular demand to decrease faster than the compensated demand because of the income effect, hence it appears flatter. All of this implies CV > ∆CS for a normal good.

For the EV,

$$\begin{aligned}
e(p_x^1, p_y, u^1) &= e(p_x^0, p_y, u^1) + \int_{p_x^0}^{p_x^1} \partial_{p_x} e(p_x, p_y, u^1)\, dp_x \\
&= e(p_x^0, p_y, u^1) + \int_{p_x^0}^{p_x^1} x^c(p_x, p_y, u^1)\, dp_x
\end{aligned}$$

so

$$EV = \int_{p_x^0}^{p_x^1} x^c(p_x, p_y, u^1)\, dp_x$$

which is the area "underneath" the compensated demand curve between $p_x^0$ and $p_x^1$, with $u = u^1$, which intersects the regular demand curve at $(x^1, p_x^1)$. So we have Figure 18.3 for a normal good x.

We have shown that CV > ∆CS > EV, so you can think of ∆CS as approximating either one of these.

Figure 18.3: The area of the shaded region is the Equivalent Variation.
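To see the ranking CV > ∆CS > EV in numbers, here is a minimal Python sketch. It assumes Cobb-Douglas preferences u = x^α y^(1−α), which are not used in the notes; for that utility function the ordinary demand is x = αI/px and the expenditure and indirect utility functions have the closed forms coded below. All parameter values are hypothetical.

```python
import math

def expenditure(px, py, u, alpha):
    """Minimum spending needed to reach utility u (Cobb-Douglas assumption)."""
    return u * (px / alpha) ** alpha * (py / (1 - alpha)) ** (1 - alpha)

def indirect_utility(px, py, income, alpha):
    """Utility reached at prices (px, py) with the given income."""
    return income * (alpha / px) ** alpha * ((1 - alpha) / py) ** (1 - alpha)

alpha, py, income = 0.5, 1.0, 100.0   # hypothetical parameters
p0, p1 = 1.0, 2.0                     # the price of x rises from p0 to p1

u0 = indirect_utility(p0, py, income, alpha)
u1 = indirect_utility(p1, py, income, alpha)

cv = expenditure(p1, py, u0, alpha) - income   # CV = e(p1, py, u0) - I
ev = income - expenditure(p0, py, u1, alpha)   # EV = I - e(p0, py, u1)

# Ordinary demand is alpha*I/px, so its integral from p0 to p1 is a logarithm
delta_cs = alpha * income * math.log(p1 / p0)

print(round(cv, 2), round(delta_cs, 2), round(ev, 2))
```

Good x is normal under this assumption, so the three measures should come out ordered as the text predicts, with ∆CS between the other two.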


19 Duopoly

The simplest market to analyze, in between the two extremes of perfect competition and monopoly, is one with two suppliers. In particular, suppose there are two suppliers of a homogeneous good, one that cannot be differentiated by consumers. Let y1 denote the amount supplied by Firm 1, and y2 the amount supplied by Firm 2, so that the inverse demand function is given by p(y1 + y2). Note that inverse demand is a function of the sum y1 + y2, reflecting the assumption that the two outputs are perfect substitutes. We shall assume the following, for simplicity:

• p(y1 + y2) = a− b(y1 + y2), i.e. linear demand

• MC = c/unit, a constant

The problem facing these firms is simple:

Firm 1: choose y1 so as to maximize π1(y1, y2) = y1p(y1 + y2)− cy1

Firm 2: choose y2 so as to maximize π2(y1, y2) = y2p(y1 + y2)− cy2

Note that Firm 1's objective function depends on Firm 2's choice, and vice versa.

19.1 Monopolization

What would a monopolist do? Suppose a monopolist owned both firms. Then she would choose y1 and y2 as follows:

$$\begin{aligned}
\max_{y_1, y_2}\; y_1 p(y_1 + y_2) + y_2 p(y_1 + y_2) - c y_1 - c y_2 &= \max_{y_1 + y_2}\; (y_1 + y_2)\, p(y_1 + y_2) - c(y_1 + y_2) \\
&= \max_{y}\; y\, p(y) - c y \\
&= \max_{y}\; (a - c) y - b y^2
\end{aligned}$$

The FONC is (a − c) − 2by = 0, or

$$y = y^M = \frac{a - c}{2b}$$

where M signifies monopoly. This implies that

$$p = p^M = \frac{a + c}{2}$$

Now suppose p = pM but the firms have separate ownership groups, each producing yM/2. (This would constitute a perfect cartel.) Is this an equilibrium? Probably not. For Firm 1,

$$MR_1 = \frac{\partial}{\partial y_1}\left[ y_1 p(y_1 + y_2) \right]$$

If Firm 1 could increase output with the assurance that Firm 2 would not follow suit, then

$$\begin{aligned}
MR_1 &= p(y_1 + y_2) + y_1 \frac{\partial p}{\partial y_1} \\
&= p^M - \frac{b}{2}\, y^M \\
&= \frac{a + c}{2} - \frac{a - c}{4} \\
&= \frac{3c}{4} + \frac{a}{4} > c = MC
\end{aligned}$$

where the second line uses ∂p/∂y1 = −b and y1 = yM/2, and the inequality holds because a > c.

Thus under a joint monopoly (both firms producing yM/2), each firm has an incentive to cheat. For the industry as a whole,

$$(y_1 + y_2)\, p(y_1 + y_2) = a(y_1 + y_2) - b(y_1 + y_2)^2 \implies MR = a - 2b(y_1 + y_2)$$

For an individual firm, however,

$$y_1 p(y_1 + y_2) = y_1[a - b(y_1 + y_2)] \implies MR_1 = a - 2by_1 - by_2 > MR$$

This is a fundamental problem with a cartel; each firm has an incentive to cheat and produce more if it has any reason to believe the other firm will hold constant its production. The reason is that when Firm 1 increases output, it considers only how this affects the price of the units it sells; Firm 1 ignores the fact that prices fall for Firm 2 as well. A monopolist, by contrast, takes account of the full effect of a price change on all units sold.

19.2 Duopoly Equilibrium

How does the duopoly market equilibrate? The answer depends on how much each firm believes the other will react to a change in the level of output. The simplest assumption is the one we made above, that Firm 1 does not believe Firm 2 will adjust its output and vice versa. This assumption was suggested by Cournot, a 19th century French economist. Let's consider Firm 1's optimal choice in this case. Fix y2. Then Firm 1's objective is

$$\max_{y_1}\; y_1 p(y_1 + y_2) - c y_1 = \max_{y_1}\; y_1[a - b(y_1 + y_2)] - c y_1$$

The FONC is

$$a - c - b y_2 - 2b y_1 = 0$$

The SOC is not a concern. (Why?) This leads to

$$y_1 = y_1^*(y_2) = \frac{a - c - b y_2}{2b}$$

The function y∗1 is called Firm 1's reaction function. It represents the optimal choice by Firm 1, as a function of Firm 2's level of output, under the Cournot assumption that Firm 2 will not respond further.

Observations:


• If y2 = 0, then Firm 1 acts as a monopolist: y∗1(0) = (a− c)/2b = yM .

• If y2 ≥ (a− c)/b = 2yM , then y∗1(y2) = 0, that is, Firm 1 is driven out of the market.

• The slope of the reaction function is −1/2. Every two additional units produced by Firm 2 cause Firm 1 to reduce output by one unit.

Figure 19.1: Firm 1’s reaction function, assuming linear demand.

By the same token, there is a reaction function for Firm 2, taking Firm 1's output as given. Following the same procedure as above,

$$y_2^*(y_1) = \frac{a - c}{2b} - \frac{y_1}{2}$$

If Firm 1 decides upon its output, given Firm 2's output, and Firm 2 does the same, then where does this process end? Presumably, it ends when Firm 1's choice, taking Firm 2's output as given, is such that given this level of output, Firm 2 produces the same level of output as Firm 1 thought it would. Formally,

$$y_1 = y_1^*(y_2^*(y_1))$$

If y1 is an equilibrium choice, then it has the property that when Firm 1 chooses y1, Firm 2 chooses $y_2^*(y_1)$, and the optimal response by Firm 1 is $y_1^*(y_2^*(y_1))$, which leads us back to y1. In mathematical terms, y1 is called a fixed point of the composition of functions $y_1^* \circ y_2^*$.

Fortunately for us, there is a convenient way of visualizing a Cournot equilibrium. We simply plot the reaction functions (remembering which is which!) as in Figure 19.2. Equilibrium occurs when

$$y_1 = y_1^*(y_2) = y_1^*(y_2^*(y_1)) = \frac{a - c}{2b} - \frac{1}{2}\left( \frac{a - c}{2b} - \frac{y_1}{2} \right)$$

Solving,

$$y_1 = y_2 = \frac{2}{3}\, y^M$$

and therefore

$$p = \frac{a + 2c}{3}$$

The details are left to the reader.


Figure 19.2: Cournot equilibrium supply y0 = (2/3) yM.
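The fixed-point description of the Cournot equilibrium can be illustrated by letting the two firms take turns applying their reaction functions. The Python sketch below does this for hypothetical values of a, b and c; the iteration scheme is only an illustration of the fixed point, not a claim about how real firms adjust.

```python
def best_response(y_other, a, b, c):
    """Cournot reaction function (a - c - b*y_other) / (2b), floored at zero."""
    return max(0.0, (a - c - b * y_other) / (2 * b))

a, b, c = 10.0, 1.0, 2.0          # hypothetical demand and cost parameters

y1 = 0.0                          # arbitrary starting guess for Firm 1
for _ in range(60):
    y2 = best_response(y1, a, b, c)   # Firm 2 reacts to Firm 1
    y1 = best_response(y2, a, b, c)   # Firm 1 reacts to Firm 2
y2 = best_response(y1, a, b, c)

price = a - b * (y1 + y2)
y_monopoly = (a - c) / (2 * b)

print(y1, y2, price, y_monopoly)
```

Because the composed reaction function has slope 1/4, each round shrinks the distance to the fixed point, and the outputs settle at two thirds of the monopoly output, as derived above.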

19.3 Price Setting vs. Quantity Setting

The previous section was an analysis of the outcome when two duopolists take each other's output as given. A similar analysis can be carried out when duopolists set prices. For example, consider end-to-end railroads that wish to set the rates for freight. Railroad 1 hauls from point A to point B at p1 per ton, and Railroad 2 hauls from point B to point C at p2 per ton. Demand for transport services from A to C depends on p1 + p2. Assume, for the sake of simplicity, that demand is linear:

$$x = a - b(p_1 + p_2)$$

Note that this means the two segments are perfect complements. (They are consumed together, so demand is a function of p1 + p2 only.) Let's assume, too, that cost per ton for Railroad 1 is c1, and cost per ton for Railroad 2 is c2. Suppose a single firm owned both railroads. Then it would choose a total price p so as to maximize

$$\pi(p) = (a - bp)(p - c_1 - c_2)$$

The FONC implies

$$p = \frac{a + b(c_1 + c_2)}{2b} = p^M$$

where pM denotes the monopolist's price.

Now suppose the two railroads act as duopolists, each taking the other's price as given. For the first railroad,

$$\pi_1(p_1, p_2) = [a - b(p_1 + p_2)](p_1 - c_1)$$

The FONC implies

$$p_1^*(p_2) = \frac{a - bp_2 + bc_1}{2b}$$

which looks a lot like the reaction function in the quantity-setting scenario. In particular, the slope is again −1/2. By symmetry, Railroad 2's reaction function is

$$p_2^*(p_1) = \frac{a - bp_1 + bc_2}{2b}$$


In equilibrium, $p_1 = p_1^*(p_2^*(p_1))$. Solving,

$$p_1^* = \frac{a}{3b} + \frac{2c_1 - c_2}{3}$$

and

$$p_2^* = \frac{a}{3b} + \frac{2c_2 - c_1}{3}$$

For price-setting duopolists who sell perfectly complementary products in a market with linear demand,

$$p_1^* + p_2^* = \frac{2a}{3b} + \frac{c_1 + c_2}{3}$$

Note that

$$p_1^* + p_2^* - p^M = \frac{a - b(c_1 + c_2)}{6b}$$

If the railroads charged a combined c1 + c2, demand would be x = a − b(c1 + c2) > 0, so the difference above is positive. Thus the duopolists actually charge an even higher price than a monopolist. This special result is due to the perfect complementarity.
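The railroad result is easy to verify with exact arithmetic. The following Python sketch plugs hypothetical values of a, b, c1 and c2 into the formulas above, checks that the two equilibrium prices are mutual best responses, and compares their sum with the monopoly price.

```python
from fractions import Fraction

def monopoly_price(a, b, c1, c2):
    """Total price chosen by a single owner of both segments."""
    return Fraction(a + b * (c1 + c2), 2 * b)

def reaction(p_other, a, b, c_own):
    """Best-response price of one railroad, taking the other's price as given."""
    return Fraction(a - b * p_other + b * c_own, 2 * b)

a, b, c1, c2 = 12, 1, 1, 2        # hypothetical parameters

# Closed-form equilibrium prices from the text
p1 = Fraction(a, 3 * b) + Fraction(2 * c1 - c2, 3)
p2 = Fraction(a, 3 * b) + Fraction(2 * c2 - c1, 3)

# Each price must be the best response to the other
mutual_best_response = (reaction(p2, a, b, c1) == p1 and
                        reaction(p1, a, b, c2) == p2)

gap = (p1 + p2) - monopoly_price(a, b, c1, c2)
print(p1, p2, monopoly_price(a, b, c1, c2), gap, mutual_best_response)
```

With these numbers the combined duopoly price exceeds the monopoly price by (a − b(c1 + c2))/6b, in line with the last displayed formula.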


20 Symmetric Cournot Equilibria

20.1 n-Firm Symmetric Cournot Equilibria

Duopoly is simple when the two firms are identical and equilibrium is symmetric, with each firm producing an equal share of the industry supply. Let us continue to assume linear demand. Recall that for Firm 1,

$$\pi_1 = y_1 p(y_1 + y_2) - c y_1 = a y_1 - b y_1 (y_1 + y_2) - c y_1 = (a - b y_2 - c) y_1 - b y_1^2$$

The FONC is

$$a - b y_2 - c = 2b y_1$$

Let y1 = y2 = y0, since we are in a symmetric equilibrium. Now solving the FONC,

$$y^0 = \frac{a - c}{3b} = \frac{2}{3}\, y^M$$

The same appeal to symmetry enables us to solve for equilibrium output in a market with n suppliers. In this case

$$\pi_1 = y_1\, p\!\left( \sum_{i=1}^{n} y_i \right) - c y_1 = a y_1 - b y_1 \sum_{i=1}^{n} y_i - c y_1$$

The FONC is

$$a - b \sum_{i=2}^{n} y_i - 2b y_1 - c = 0$$

As before, yi = y0 for all i, so

$$y^0 = \frac{a - c}{b(n + 1)}$$

and

$$p = p^0 = a - b(n y^0) = \left( \frac{1}{n + 1} \right) a + \left( \frac{n}{n + 1} \right) c$$

As the number of firms increases (relative to the "size" of the market), the symmetric Cournot equilibrium has each firm supplying less and less, and price converging to the competitive price c.
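A short Python sketch makes the convergence concrete. It evaluates the symmetric n-firm formulas for hypothetical values of a, b and c and checks the first-order condition along the way.

```python
from fractions import Fraction

def symmetric_cournot(n, a, b, c):
    """Per-firm output and price in the symmetric n-firm Cournot equilibrium."""
    y = Fraction(a - c, b * (n + 1))
    p = a - b * n * y
    return y, p

a, b, c = 10, 1, 2                # hypothetical parameters

results = {}
for n in (1, 2, 5, 50, 500):
    y, p = symmetric_cournot(n, a, b, c)
    # FONC for one firm when the other n - 1 firms each produce y
    assert a - b * (n - 1) * y - 2 * b * y - c == 0
    results[n] = (y, p)
    print(n, float(y), float(p))
```

With n = 1 the formulas reproduce the monopoly outcome, with n = 2 the duopoly outcome, and for large n the price approaches marginal cost c while each firm's output shrinks toward zero.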

As a practical matter, the presence of fixed costs often prevents us from having a large number of firms in a given industry. With fixed costs, there is a social cost to more firms (namely, that total fixed costs associated with the industry rise) as well as a benefit due to less monopolistic behavior. In our example, since costs are constant, there is no inefficiency as output per firm falls.

The problem is illustrated by Figure 20.1. In each case, the firm has $c(y) = ky^2/2 + F$, and therefore $MC = ky$ and $AC = ky/2 + F/y$, which is U-shaped. Minimum AC is achieved by choosing y such that MC = AC, or $y^E = \sqrt{2F/k}$. (E signifies efficient scale.) In Figure 20.1b, in order to have three or more firms, p must exceed p0; otherwise, firms would fail to recover their fixed costs. In some cases p exceeds even the price that a monopolist would charge.


Figure 20.1: (a) Competitive paradigm. min AC is achieved at small scale relative to the size of the market. (b) Non-competitive paradigm. min AC is achieved at a level of output that is large relative to the size of the market.

20.2 Alternatives to the Cournot Assumption

1. Return for a moment to the duopoly model of Section 19, with linear demand and constant marginal cost. Recall that Firm 1's objective is to maximize

$$\pi_1 = y_1 p(y_1 + y_2) - c y_1$$

Under the Cournot assumption Firm 1 selects $y_1^*(y_2)$ under the assumption that y2 is fixed. Suppose, however, that Firm 1 has reason to believe Firm 2 will respond to Firm 1's choice by setting y2 = ψ(y1). What does Firm 1 do in this case? The FONC is

$$p(y_1 + y_2) + y_1 p'(y_1 + y_2)[1 + \psi'(y_1)] - c = 0$$

For example, Firm 2 might announce: "We plan to increase our output in (constant) proportion to yours." Then

$$\frac{dy_2}{y_2} = \frac{dy_1}{y_1}$$

which implies

$$\psi'(y_1) = \frac{dy_2}{dy_1} = \frac{y_2}{y_1}$$

and the FONC becomes

$$p(y_1 + y_2) + (y_1 + y_2)\, p'(y_1 + y_2) - c = 0$$

But this should remind you of the FONC for joint profit maximization that we saw in Section 19.1. Therefore, if each firm announced to the other the rule that

$$\frac{dy_i}{dy_j} = \frac{y_i}{y_j}, \qquad i, j = 1, 2$$

the two firms maintain the same level of output as a joint monopoly (provided each believes the other).


2. A second class of alternatives to the Cournot assumption involves a duopoly in which one firm is "savvy" and the other one is "naive." Suppose for example that Firm 2 always takes y1 as given, i.e. Firm 2 adopts the Cournot assumption. Firm 1 on the other hand is savvy, and recognizes Firm 2 is employing the Cournot reaction function y∗2(y1). Firm 1 is said to be a Stackelberg leader while Firm 2 is a Stackelberg follower. (Stackelberg was an early 20th century German economist.) It can be shown that (1) the leader does better than the follower, (2) the leader does better than either firm would in a symmetric Cournot model, and (3) the follower does worse than either firm would in a symmetric Cournot model.[10]

[10] Condition (3) is redundant since the symmetric Cournot equilibrium is Pareto optimal.
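Claims (1) to (3) can be checked numerically for the linear model. The Python sketch below is an illustration under hypothetical values of a, b and c. The leader's output (a − c)/2b is not derived in the notes; it comes from substituting the follower's reaction function into the leader's profit and maximizing, and the code spot-checks that nearby outputs do worse.

```python
from fractions import Fraction

def follower_response(y1, a, b, c):
    """Follower's Cournot reaction function, floored at zero."""
    return max(Fraction(0), Fraction(a - c, 2 * b) - Fraction(y1) / 2)

def profits(y1, y2, a, b, c):
    """Profits of (firm 1, firm 2) with linear demand and constant MC."""
    margin = a - b * (y1 + y2) - c
    return margin * y1, margin * y2

a, b, c = 10, 1, 2                         # hypothetical parameters

y_leader = Fraction(a - c, 2 * b)          # assumed closed form for the leader
y_follower = follower_response(y_leader, a, b, c)
pi_leader, pi_follower = profits(y_leader, y_follower, a, b, c)

pi_cournot = Fraction((a - c) ** 2, 9 * b) # per-firm symmetric Cournot profit

# Spot check: small deviations by the leader lower the leader's profit
for eps in (Fraction(1, 10), Fraction(-1, 10)):
    y_dev = y_leader + eps
    assert profits(y_dev, follower_response(y_dev, a, b, c), a, b, c)[0] < pi_leader

print(pi_leader, pi_cournot, pi_follower)
```

In this example the leader earns more than a Cournot duopolist, who in turn earns more than the follower.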


21 Game Theory I

In Sections 19 and 20 we considered a duopoly with linear demand

$$p(y_1 + y_2) = a - b(y_1 + y_2)$$

and constant marginal cost

$$MC_1 = MC_2 = c$$

We identified three possible strategies:

1. Cooperation. Each firm produces yM/2.

$$y_i = \frac{y^M}{2} = \frac{a - c}{4b}, \qquad i = 1, 2$$

$$p = p^M = \frac{a + c}{2}$$

$$\pi_i = \frac{\pi^M}{2} = \frac{(a - c)^2}{8b}, \qquad i = 1, 2$$

2. Joint non-cooperation. Each firm produces y0 = 2yM/3.

$$y_i = \frac{2}{3}\, y^M = \frac{a - c}{3b}, \qquad i = 1, 2$$

$$p = p^0 = \frac{a + 2c}{3}$$

$$\pi_i = \pi^0 = \frac{(a - c)^2}{9b}, \qquad i = 1, 2$$

The situation is jointly non-cooperative in the sense that each firm is acting in its own, narrowly defined best interest, given what the other firm is producing. Given that Firm 1 produces y0, Firm 2 is advised to produce y0 as well.

3. Cheating given that your competitor is cooperating. For example, if Firm 1 sets y1 = yM/2, Firm 2's best response is

$$y_2^*\!\left( \frac{y^M}{2} \right) = y^M - \frac{1}{2}\left( \frac{y^M}{2} \right) = \frac{3}{4}\, y^M$$

which means

$$p = p^C = \frac{3a + 5c}{8}$$

We have also

$$\pi_1 = \pi^L = \frac{3(a - c)^2}{32b}$$

$$\pi_2 = \pi^W = \frac{9(a - c)^2}{64b}$$

where W stands for "winner" (in this case cheater!), and L of course stands for loser. (Firm 1 sells yM/2 = (a − c)/4b units and Firm 2 sells 3(a − c)/8b units, each at a margin of pC − c = 3(a − c)/8.) Notice that

$$\pi^W > \frac{1}{2}\, \pi^M > \pi^0 > \pi^L$$

Cooperation is better than joint non-cooperation but, given that your competitor is cooperating, your best response is to cheat.

We can illustrate the dilemma in a box such as Figure 21.1, showing each firm's actions and the resulting payoffs as ordered pairs.

                                    Firm 2
                           Cooperate        Don't Cooperate
Firm 1   Cooperate         πM/2, πM/2       πL, πW
         Don't Cooperate   πW, πL           π0, π0

Figure 21.1: The strategies are listed along the edges of the box. The payoffs are listed in order with Player 1 first.

E.g. if Firm 1 cooperates and Firm 2 does not, the payoffs are (πL, πW), where the first coordinate corresponds to Firm 1 and the second to Firm 2. Set (a − c)2/b = 1, i.e. normalize $(a - c)^2/b = 1$. Then our "game box" looks like Figure 21.2.

                     Player 2
                 C              ¬C
Player 1   C     1/8, 1/8       3/32, 9/64
           ¬C    9/64, 3/32     1/9, 1/9

Figure 21.2: "C" stands for Cooperate, and ¬C for not C, or Don't Cooperate.

Given a box like this, we can figure out which strategy each player will adopt.

• Suppose Firm 2 believes Firm 1 will play C (cooperate). Firm 2 then checks the second coordinate of each entry in row 1: 9/64 > 1/8, so Firm 2 plays ¬C (don't cooperate). If instead Firm 2 believes Firm 1 will play ¬C, then Firm 2 checks the second coordinate of each entry in row 2: 1/9 > 3/32, so again Firm 2 plays ¬C.

• We can evaluate Firm 1's choices the same way, only this time we check columns rather than rows, and we compare first coordinates. The result is the same: Firm 1 is better off playing ¬C, regardless of Firm 2's choice.
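The row-and-column check can be automated. The Python sketch below rebuilds the four payoff pairs from the output levels in strategies 1 to 3, using hypothetical parameters a = 2, b = 1, c = 1 chosen so that (a − c)²/b = 1, and then tests whether ¬C (written "~C" in the code) is a dominant strategy for each firm.

```python
from fractions import Fraction

a, b, c = 2, 1, 1                      # chosen so that (a - c)**2 / b == 1
y_m = Fraction(a - c, 2 * b)           # monopoly output

def profit(y_own, y_other):
    """Profit of a firm producing y_own when its rival produces y_other."""
    return (a - b * (y_own + y_other) - c) * y_own

coop, cournot, cheat = y_m / 2, 2 * y_m / 3, 3 * y_m / 4

# payoffs[(Firm 1 action, Firm 2 action)] = (Firm 1 profit, Firm 2 profit)
payoffs = {
    ("C", "C"):   (profit(coop, coop), profit(coop, coop)),
    ("C", "~C"):  (profit(coop, cheat), profit(cheat, coop)),
    ("~C", "C"):  (profit(cheat, coop), profit(coop, cheat)),
    ("~C", "~C"): (profit(cournot, cournot), profit(cournot, cournot)),
}

def is_dominant(player, action):
    """True if `action` beats the alternative whatever the other player does."""
    other = "C" if action == "~C" else "~C"
    for rival in ("C", "~C"):
        mine = (action, rival) if player == 0 else (rival, action)
        alt = (other, rival) if player == 0 else (rival, other)
        if payoffs[mine][player] <= payoffs[alt][player]:
            return False
    return True

print(is_dominant(0, "~C"), is_dominant(1, "~C"))
```

Both checks come out true, so (¬C, ¬C) is the outcome the box predicts even though both firms would earn more at (C, C).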

Notice that in this game there is always an incentive for each player to choose ¬C, regardless of what the other player does. An action that is always the best response is called a dominant strategy. The game pictured in Figure 21.3 doesn't have a unique dominant strategy. In this game, we say that (C,C) and (¬C,¬C) are Nash equilibria. A Nash equilibrium in a 2-player game is a combination of strategies (S, T) such that


1. Given that Player 1 has chosen S, Player 2’s best response is T .

2. Given that Player 2 has chosen T , Player 1’s best response is S.

                     Player 2
                 C              ¬C
Player 1   C     2, 2           1/2, 3/2      ← 2 chooses C if 1 plays C
           ¬C    3/2, 1/2       1, 1          ← 2 chooses ¬C if 1 plays ¬C
                 ↑              ↑
                 1 chooses C    1 chooses ¬C
                 if 2 plays C   if 2 plays ¬C

Figure 21.3: (C,C) and (¬C,¬C) are Nash equilibria since given that Player 1 or Player 2 plays C, his opponent's best response is to play C, and likewise for ¬C.

The duopoly game has a unique Nash equilibrium in (¬C,¬C). The game in Figure 21.3 has two Nash equilibria, although one is superior to the other.

You may have seen the duopoly game in disguise. One common version is known as the Prisoner's Dilemma. Suppose you and a former friend are involved in a legal dispute. You and he will appear before a judge who will determine who takes custody of the cat you bought together. You can hire a lawyer, or not. Suppose further that you estimate the probability of winning is 1/2 if neither of you hires a lawyer, or if you both hire a lawyer. But if one of you hires a lawyer and the other one does not, the one who is represented by a lawyer wins with probability 3/4. As we can see by looking at the box in Figure 21.4, hiring a lawyer is a dominant strategy. The problem is lawyers cost money, so your true "payoff" with a lawyer is lower than the box suggests. In fact both parties are better off agreeing not to hire lawyers. But this is not a Nash equilibrium. Figure 21.5 displays real data pertaining to child custody cases in California in the early 1980s.

                      ex-Friend
                  No Lawyer     Lawyer
You   No Lawyer   1/2, 1/2      1/4, 3/4
      Lawyer      3/4, 1/4      1/2, 1/2

Figure 21.4: Hiring a lawyer is a dominant strategy in this game.

It may be possible to induce cooperation in a game that is played repeatedly. For example, consider the following long-term strategy by a participant in a duopoly game:

• If Player 1 sees that the price last time was pM, then she produces yM/2.

• If Player 1 sees that the price last time was pC, she infers that Player 2 cheated, and "punishes" Player 2 by producing y0 the next k times, after which she reverts to yM/2.

Questions to consider:


                          Mother
                     No Lawyer    Lawyer
Father   No Lawyer   75%          86%
         Lawyer      49%          65%

Figure 21.5: Percentage of Mothers Awarded Child Physical Custody in San Mateo and Santa Clara Counties, California, 1984. Source: Mnookin, Maccoby, Depha, and Albiston (1989)

1. Does the threat of punishment stop Player 2 from cheating?

2. Is the threat credible?

To answer Question 1, consider the costs and benefits of cheating in the current period, ignoring the time value of money (with the normalization (a − c)²/b = 1 used in Figure 21.2):

Benefit = πW − πM/2 = 9/64 − 1/8 = 1/64

Cost = k(πM/2 − π0) = k(1/8 − 1/9) = k/72

Clearly, if k ≥ 2, the threat will deter cheating! As for Question 2, the punishment is to produce y0. This is not too crazy, but is it credible? Given that Player 2 has cheated, he could simply claim that it was an honest mistake and promise not to do it again. Player 1 could then bypass the punishment (does this sound familiar?) and save herself k(πM/2 − π0) too! So she has a strong incentive not to follow through on her threat. This is an example of a dynamic inconsistency. Player 1 would like to commit herself to carrying out the punishment in return for a deviation from cooperative play but, given that Player 2 has cheated, she hurts herself by doing so and therefore has an incentive to bail out early.
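The threshold on k follows from a one-line comparison, sketched here in Python with the normalized payoffs from above (time discounting is ignored, as in the text).

```python
from fractions import Fraction
import math

# Normalized per-period profits, with (a - c)**2 / b = 1
pi_coop = Fraction(1, 8)       # each firm's share of monopoly profit
pi_cournot = Fraction(1, 9)    # profit during a punishment period
pi_cheat = Fraction(9, 64)     # profit in the period of cheating

benefit = pi_cheat - pi_coop               # one-time gain from cheating
cost_per_period = pi_coop - pi_cournot     # loss in each punishment period

# Smallest whole number of punishment periods whose total cost covers the gain
k_min = math.ceil(benefit / cost_per_period)
print(benefit, cost_per_period, k_min)
```

A single punishment period costs the cheater 1/72, slightly less than the 1/64 gained, which is why one period is not enough.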


22 Game Theory II

22.1 Tree Diagrams

In Section 21 we described a punishment strategy for the repeated Cournot game in which a player chooses his current level of output based on last period's price: if $p_{t-1} < p^M$, he decides to produce y0 for the next k periods. Afterwards he reverts to cooperative play, producing yM/2. We showed that this strategy is effective, provided his competitor believes he actually will execute it. But should his competitor believe him? The same issue arises in numerous contexts:

• The Cold War. The U.S. threatened to start nuclear war with the USSR if the USSR invaded Western Europe. Many Europeans themselves believed that even if the USSR invaded, the U.S. simply would cut its losses.

• Flood relief. The government would like to discourage homeowners from living in flood-prone areas such as the New Jersey shore. But when a flood strikes, the government inevitably offers disaster relief.

• Entry deterrence. A grocery store currently is a monopolist in a certain town. Another chain is considering building a new store to compete with the existing one. The incumbent threatens to reduce prices if the chain enters the market.

Figure 22.1: The first coordinate is the payoff to Player 1, the incumbent, and the second coordinate is the payoff to Player 2, the potential entrant. Note that we could replace (0, 0) and (π0, π0) with (0, −F) and (π0, π0 − F), where F denotes fixed cost of entry.

We can analyze simple dynamic games with the aid of a tree diagram such as the one in Figure 22.1, which shows each party's possible moves. Consider the entry deterrence game. First, the potential entrant decides whether to enter. Then, the incumbent decides whether to engage in a price war.


As before,

πM = profit per year for incumbent without any competitors

π0 = profit per year for each firm if entry followed by Cournot duopoly

In a price war, the incumbent charges p = c and earns no profit. Notice that once the potential entrant (Player 2) has acted, it's up to the incumbent to decide where to go from there. Suppose Player 2 has entered. The incumbent (Player 1) has to choose between the top two nodes: π0 > 0, so clearly it doesn't make sense to fight once the competitor has entered. Thus we can conclude that Player 1 will choose "don't fight."

Player 2 on the other hand looks at the ultimate payoffs to entry. If she enters, she gets π0 since she knows that Player 1 will choose "don't fight," so she always enters.

The method we used to analyze this game is called backward induction. At the last stage, depending on whose turn it is, we deduce this player's action by comparing his payoffs. Then we back up to the previous move.

Notice that (enter, don't fight) is the dynamically consistent equilibrium here. (enter, fight) is not dynamically consistent even though Player 1 threatens to fight, because given that Player 2 has entered, Player 1 seeks to maximize his payoff and thus doesn't fight.

Implications:

• In the Cold War, it was not a credible threat to promise all-out nuclear war if the USSR invaded Western Europe.

• In hostage situations, it is not a credible threat to claim that you "don't negotiate" with terrorists.

• In the entry game above, it is not a credible threat to claim that you will wage a price war if another supplier enters the market.

• The punishment strategy outlined in Section 21 is not a credible threat.
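Backward induction on the entry game can be written out in a few lines of Python. The payoff pairs follow the caption of Figure 22.1, in the order (incumbent, entrant); the numerical values of πM and π0 are hypothetical, taken from the normalized duopoly example.

```python
PI_M, PI_0 = 1 / 4, 1 / 9      # hypothetical monopoly and Cournot profits

# Payoffs are (incumbent, entrant), as in Figure 22.1
after_entry = {
    "fight": (0.0, 0.0),
    "don't fight": (PI_0, PI_0),
}
no_entry = (PI_M, 0.0)

# Last stage: the incumbent picks the move that maximizes his own payoff
incumbent_move = max(after_entry, key=lambda move: after_entry[move][0])

# Earlier stage: the entrant anticipates that move and compares her payoffs
payoff_if_enter = after_entry[incumbent_move]
entrant_move = "enter" if payoff_if_enter[1] > no_entry[1] else "stay out"

print(entrant_move, incumbent_move)
```

The procedure returns (enter, don't fight), the dynamically consistent outcome described above; the threat to fight never gets carried out.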

22.2 Interpretation

The preceding analysis is predicated on players behaving rationally; despite threatening to do something, once the time comes to make good on the threat, they always do what is in their best interest, regardless of the events leading up to that time. This is encountered quite often in economics and finance, e.g. "bygones are bygones" and "sunk costs don't count."

Notice that in our entry game, the incumbent (Player 1) would like to be able to commit herself to behaving irrationally. If the entrant (Player 2) knows that the incumbent will in fact fight, then he won't enter, especially if there is a fixed cost of entry.

Suppose there is an earlier decision that Player 1 can make to alter the payoffs should Player 2 enter. Here the decision might involve investing in overhead that increases the operating costs for the incumbent so that the payoffs are

πM − C, if Player 2 doesn’t enter, and


π0 −A, if Player 2 enters and Player 1 doesn’t fight

Figure 22.2: Here the investment decision reduces Player 1's payoffs by A if he doesn't fight in the event of entry and C if there is no entry.

This is illustrated in Figure 22.2. Will Player 1 invest in this strategy? Again, the answer is found by backward induction:

• Path 1 (Player 1 doesn't invest). We know that if Player 2 enters, Player 1 won't fight. Player 2 gets π0 if he enters and 0 otherwise, so he will enter, which means Player 1 gets π0.

• Path 2 (Player 1 does invest). We know that if Player 2 enters, Player 1 will fight if A > π0. Assuming this condition holds, Player 2 knows Player 1 will fight, so Player 2 won't enter, which means Player 1 gets πM − C. This is worthwhile if C < πM − π0.

• Thus Player 1’s payoffs boil down to the following:

– don’t invest in entry deterrence, earn π0

– invest in entry deterrence, earn πM − C

Conclusion: Player 1 may make an investment in overhead provided:

• it reduces the profit from not fighting when Player 2 enters (A > π0)

• it isn’t too costly when Player 2 doesn’t enter (C < πM − π0)

The key to entry deterrence is that once the incumbent decides to invest, the decision must affect his payoffs. He is committing to fighting by changing his payoffs in the latter stage of the game.
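The two conditions can be explored with a small Python function that runs the same backward induction with and without the investment. It assumes, as the condition A > π0 suggests, that a price war pays both firms 0 whether or not the investment was made; that assumption and all numerical values are illustrative.

```python
def incumbent_payoff(invest, pi_m, pi_0, A, C):
    """Incumbent's payoff by backward induction in the entry game.

    Assumption: a price war gives the incumbent and the entrant 0 each,
    with or without the investment.
    """
    no_fight = pi_0 - A if invest else pi_0   # incumbent's payoff from accommodating
    fights = 0 > no_fight                     # incumbent's best move after entry
    entrant_payoff = 0 if fights else pi_0
    if entrant_payoff > 0:                    # entrant enters only for a positive payoff
        return max(0, no_fight)
    return pi_m - C if invest else pi_m

pi_m, pi_0 = 1 / 4, 1 / 9                     # hypothetical profits

without = incumbent_payoff(False, pi_m, pi_0, A=0.2, C=0.05)
with_commitment = incumbent_payoff(True, pi_m, pi_0, A=0.2, C=0.05)   # A > pi_0
too_weak = incumbent_payoff(True, pi_m, pi_0, A=0.05, C=0.05)         # A < pi_0
too_costly = incumbent_payoff(True, pi_m, pi_0, A=0.2, C=0.2)         # C > pi_m - pi_0

print(without, with_commitment, too_weak, too_costly)
```

Only the case with A > π0 and C < πM − π0 leaves the incumbent better off than not investing, matching the two bullet points in the conclusion.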


There is an extension of this model to the case in which potential entrants don't know whom they're dealing with. Suppose there are two types of incumbents:

• rational incumbents with payoffs 0 in a price war and π0 in a duopoly

• “mad dog” incumbents with payoffs π∗ in a price war and π0 − S in a duopoly

The possibility that π∗ > 0 reflects the idea that the mad dog likes to fight. S > 0 can thus be thought of as the shame that a mad dog feels for backing down. If π∗ > π0 − S, the mad dog will fight, and this is the case if the mad dog really enjoys fighting or feels a substantial amount of shame for backing down.

Suppose there is a fixed cost of entry F. The game looks like Figure 22.3. If Player 2 enters and the incumbent is a mad dog, then a fight ensues and the entrant gets −F. If Player 2 enters and the incumbent is rational, the incumbent doesn’t put up a fight and the entrant gets π0 − F. Player 2’s expected profit¹¹ is

E[π2] = P (mad dog)× (−F ) + P (rational)× (π0 − F )

As the incumbent, you want to raise the entrant’s belief that you are crazy!
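To see how strong the entrant's belief has to be, the expected-profit formula can be evaluated directly. The following is a minimal sketch; the function names and the numbers for π0 and F are hypothetical, and the threshold 1 − F/π0 is simply the belief at which the expression above equals zero.

```python
def entrant_expected_profit(q_mad, pi_0, f):
    """Expected profit from entering when P(mad dog) = q_mad."""
    return q_mad * (-f) + (1 - q_mad) * (pi_0 - f)


def deterrence_threshold(pi_0, f):
    """Belief q_mad at which the expected profit from entry is exactly zero."""
    return 1 - f / pi_0


pi_0, f = 10.0, 4.0  # hypothetical duopoly profit and entry cost
for q in (0.2, 0.6, 0.8):
    print(q, entrant_expected_profit(q, pi_0, f))
```

Entry is profitable only for beliefs below the threshold, which is why the incumbent gains from any signal that pushes the entrant's belief above it.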

Figure 22.3: The potential entrant has no idea which type of incumbent he’s dealing with. It behooves the incumbent to signal that he’s crazy!

11 See Section 23 for a definition of expected value.


23 Uncertainty I: Income Lotteries

In the next four sections we extend the theory of consumer choice to the context of choice under uncertainty. For simplicity, we deal mainly with uncertainty regarding income. Assuming that prices are fixed, alternative realizations of random income translate directly into alternative utility levels. We begin with a brief review of statistics.

23.1 Review of Basic Statistical Concepts

We define the mean, or expected value of a random variable X, denoted by E[X] (or sometimes by X̄), to be

$$E[X] = \sum_{i=1}^{n} p_i x_i$$

where X takes the value xi with probability pi. The mean is just a weighted average of the alternative realizations of X, with the weights being the probabilities associated with the respective realizations.

Consider the two random variables X1 and X2 with probability distributions as shown in Figure 23.1. Note that

E[X1] = 10× .1 + 20× .2 + 30× .4 + 40× .2 + 50× .1 = 30,

E[X2] = 10× .5 + 50× .5 = 30,

so while these distributions have the same mean, X2 is more dispersed (X1 on the other hand is more concentrated near its mean).

Figure 23.1: Two different distributions with identical means.

One way to describe the level of dispersion of a random variable is by its variance, denoted V[X]:

$$V[X] = \sum_{i=1}^{n} p_i (x_i - \bar{X})^2.$$

The variance of X is the mean squared difference between X and X̄. As an exercise, calculate V[X1] and V[X2] above. We say that a random variable X is degenerate if X = E[X] with probability one, in which case V[X] = 0.
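The two definitions translate directly into code. The sketch below is one way to check the exercise numerically; the helper names are arbitrary, and the two distributions are those implied by the calculations of E[X1] and E[X2] above.

```python
def mean(values, probs):
    """E[X]: probability-weighted average of the realizations."""
    return sum(p * x for x, p in zip(values, probs))


def variance(values, probs):
    """V[X]: probability-weighted average squared deviation from the mean."""
    m = mean(values, probs)
    return sum(p * (x - m) ** 2 for x, p in zip(values, probs))


# (realizations, probabilities) for X1 and X2
x1 = ([10, 20, 30, 40, 50], [0.1, 0.2, 0.4, 0.2, 0.1])
x2 = ([10, 50], [0.5, 0.5])

for name, (values, probs) in (("X1", x1), ("X2", x2)):
    print(name, mean(values, probs), variance(values, probs))
```

Both distributions have the same mean, but the variance of X2 comes out several times larger than that of X1, which is the sense in which X2 is more dispersed.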


We can also consider functions of random variables. If g is a function defined on R, then Y = g(X) is a random variable. We define the mean E[Y] as follows:

$$E[Y] = E[g(X)] = \sum_{i=1}^{n} p_i\, g(x_i).$$

If g is linear, i.e. if g(x) = ax+ b for some choice of a and b, then

$$\begin{aligned}
E[Y] &= \sum_{i=1}^{n} p_i (a x_i + b) \\
     &= a \sum_{i=1}^{n} p_i x_i + b \underbrace{\sum_{i=1}^{n} p_i}_{1} \\
     &= a\,E[X] + b.
\end{aligned}$$

As an exercise, show that V[aX + b] = a²V[X] for any choice of a, b.

23.2 Choices Over Uncertain Incomes

We now suppose that individuals are asked to make choices between alternative income lotteries. Each lottery is essentially a probability distribution of income. In ranking two alternative lotteries, we hold constant income in the absence of either lottery (which in reality could be random).

Let y denote income. In a world without uncertainty individuals always prefer more income to less, so the following utility functions are all equivalent in the sense that they give rise to the same indifference curves:

u(y) = ay + b, a > 0

u(y) = e^y

u(y) = y³

Since each function is increasing, it indicates a preference for more income. This is all we need, if all we want to know is how to rank incomes.

On the other hand, suppose we wish to rank income lotteries. For example, consider:

             Payoff   Probability
Lottery 1:   $100     0.5
             $0       0.5

             Payoff   Probability
Lottery 2:   $70      0.5
             $30      0.5

In the 1940s John von Neumann and Oskar Morgenstern asked: is there some way of assigning a utility number to each possible outcome in such a way that we can compare these lotteries by


comparing the expected utilities:

0.5× u(100) + 0.5× u(0) in case of Lottery 1

0.5× u(70) + 0.5× u(30) in case of Lottery 2

The answer is yes (under some assumptions), although we won’t prove it. Thus, if preferences satisfy certain conditions, then there is a utility function, call it an expected utility function, defined on the set of all possible incomes, that we can use to compare both certain incomes (which is trivially easy anyway) and income lotteries. The idea is that if we get the utility differences between different incomes just right, then we can use the expected utility criterion to compare lotteries.

NOTE: Normally we don’t care about the gauge of a given utility function. That is, if u is a utility function, then we regard v = g(u) as equivalent, provided g is a strictly increasing function.

How do you feel about Lottery 1 versus Lottery 2? Chances are, you would take Lottery 2. This reveals something about the shape of your expected utility function.

Figure 23.2: Concave expected utility function. u(50) > 0.5×u(30)+0.5×u(70) > 0.5×u(0)+0.5×u(100).

An expected utility function u is always increasing since more money is always better than less (for an economist anyway). If u is linear, e.g. u(y) = ay + b, then

u(0) = b,

u(30) = 30a+ b,

u(70) = 70a+ b, and

u(100) = 100a+ b,

so clearly 0.5× u(70) + 0.5× u(30) = 0.5× u(0) + 0.5× u(100). This leads to our first result:

If the expected utility function is linear, then lotteries with equal expected values are considered equally good.

On the other hand, if you prefer Lottery 2, then your expected utility function must be concave as in Figure 23.2. If you prefer Lottery 1, this reveals that your expected utility function is convex.
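The link between curvature and the ranking of Lottery 1 and Lottery 2 can be illustrated numerically. In the sketch below the three utility functions are arbitrary examples chosen for illustration (one linear, one concave, one convex); they are not taken from the notes.

```python
import math

# Each lottery is a list of (payoff, probability) pairs.
lottery_1 = [(100, 0.5), (0, 0.5)]
lottery_2 = [(70, 0.5), (30, 0.5)]


def expected_utility(lottery, u):
    """Probability-weighted average of u over the lottery's payoffs."""
    return sum(p * u(y) for y, p in lottery)


# ASSUMED example utility functions, one of each shape.
utilities = {
    "linear": lambda y: 2 * y + 1,
    "concave": math.sqrt,
    "convex": lambda y: y ** 2,
}

for name, u in utilities.items():
    eu1 = expected_utility(lottery_1, u)
    eu2 = expected_utility(lottery_2, u)
    print(name, eu1, eu2)
```

The linear function ranks the two lotteries equally, the concave one favors Lottery 2, and the convex one favors Lottery 1.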


In general, it is useful to assume that people are risk-averse (gambling is an exception). We say that a person is risk-averse if he prefers x for sure rather than x + ε, where ε is a random variable with E[ε] = 0:

E[u(x)] ≥ E[u(x+ ε)]

If u is concave, this inequality holds. Why? For any realization of ε, say ε = εi,

u concave =⇒ u(x+ εi) ≤ u(x) + εiu′(x)

So, taking expectations over all realizations of ε,

E[u(x+ ε)] ≤ E[u(x)] + E[εu′(x)] = u(x) + u′(x)E[ε] = u(x)

since E[ε] = 0 by assumption.
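The two steps of this argument (the tangent-line bound for each realization, then taking expectations) can be checked on a small example. The sketch assumes u(y) = log(y) and a two-point zero-mean gamble; both are illustrative choices, not part of the notes.

```python
import math


def u(y):
    return math.log(y)  # ASSUMED concave utility


def u_prime(y):
    return 1.0 / y


x = 100.0
eps = [(-20.0, 0.5), (20.0, 0.5)]  # (realization, probability), mean zero

mean_eps = sum(p * e for e, p in eps)
# Step 1: tangent-line bound, realization by realization.
tangent_ok = all(u(x + e) <= u(x) + e * u_prime(x) for e, _ in eps)
# Step 2: expected utility of the gamble versus the sure thing.
expected_u = sum(p * u(x + e) for e, p in eps)
print(mean_eps, tangent_ok, expected_u, u(x))
```

The expected utility of the gamble falls short of u(x), as the inequality requires.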


24 Uncertainty II: Expected Utility

24.1 Expected Utility

In Section 23 we introduced the idea of a special utility function, defined over nonrandom incomes, with curvature such that a consumer can use it to rank income lotteries. In particular, if an income lottery is available that pays yi with probability pi, then it can be compared with any other lottery based on the expected utility criterion:

$$E[u(y)] = \sum_i p_i\, u(y_i)$$

The function u is called a von Neumann-Morgenstern utility function (vN-M), or sometimes simply an expected utility function.

Examples:

• Linear. u(y) = ay + b, gives rise to an expected value ranking.

• Power function. u(y) = y^α, where 0 < α < 1. This function is concave, so people with preferences such as these are risk-averse.

• Exponential. u(y) = − exp(−ry), where r > 0. This function is increasing and concave, and ranges from −∞ to 0. This particular function is often used in finance because if all income lotteries are normally distributed, we get a nice ranking: for y ∼ N(µ, σ²), it can be shown that¹²

E[− exp(−ry)] = − exp(−rµ + r²σ²/2) = − exp[−r(µ − rσ²/2)]

Therefore, a lottery with mean µ and variance σ² is assigned a value based on µ − rσ²/2. This is nice because, given µ, individuals with higher values of r assign a greater discount to a lottery with higher risk (variance).
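The closed form for the exponential case can be sanity-checked without the moment generating function by integrating numerically. The sketch below uses a plain trapezoid rule over ±10 standard deviations; the function names, the grid, and the parameter values are all assumptions made for the illustration.

```python
import math


def certainty_equivalent(mu, sigma, r):
    """The value mu - r*sigma^2/2 assigned to a normal lottery."""
    return mu - r * sigma ** 2 / 2


def expected_exp_utility(mu, sigma, r, n=20001, width=10.0):
    """Trapezoid-rule approximation of E[-exp(-r*y)] for y ~ N(mu, sigma^2)."""
    lo = mu - width * sigma
    step = 2 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        y = lo + k * step
        density = math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        weight = 0.5 if k in (0, n - 1) else 1.0  # end points get half weight
        total += weight * (-math.exp(-r * y)) * density
    return total * step


mu, sigma, r = 1.0, 0.5, 2.0  # hypothetical lottery and risk-aversion parameter
numeric = expected_exp_utility(mu, sigma, r)
closed_form = -math.exp(-r * certainty_equivalent(mu, sigma, r))
print(numeric, closed_form)
```

The two numbers should agree to many decimal places, and raising r lowers the value µ − rσ²/2 assigned to the same lottery.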

We know that vN-M utility functions are not invariant under arbitrary increasing transformations. If your vN-M utility function is u(y) = αy, then you are risk-neutral, and care only about expected values. If my vN-M utility function is

v(y) = √u(y)

then mine is concave (v ∝ √y), and therefore I am risk-averse. Thus you and I evaluate lotteries differently. Expected utility functions are, however, invariant under increasing linear transformations. In other words, if your vN-M utility function is u(y) and mine is v(y) = au(y) + b, where a > 0, then we evaluate lotteries the same way. To see this, consider a pair of lotteries y1 and y2. Suppose you prefer y1, i.e.

E[u(y1)] > E[u(y2)]

Then it also is true that

aE[u(y1)] + b > aE[u(y2)] + b ⇐⇒ E[au(y1) + b] > E[au(y2) + b]

12 This can be shown by manipulating the moment generating function (MGF) for the normal distribution. The reader is advised to consult a book on mathematical statistics.


so I too prefer y1. This fact is very useful because it means we can rescale a vN-M utility function so that the worst income realization (among a given set of lotteries) is assigned the value 0 and the best one is assigned the value 1. To see this, imagine that we are comparing several lotteries: the worst outcome is −10,000 and the best outcome is 250,000. Suppose u(−10,000) = u0 and u(250,000) = u1. Then v(y) = au(y) + b, where

$$a = \frac{1}{u_1 - u_0}, \qquad b = -\frac{u_0}{u_1 - u_0}$$

has v(−10,000) = 0 and v(250,000) = 1. We have seen already that v evaluates lotteries in the same way as u, so we may as well use v instead.

Figure 24.1: Risk-neutral individual has p(1,000) + (1 − p)(−100) = 250, or p = 0.318, whereas risk-averse individual has p250 > 0.318.

We now are in a position to describe how to derive one’s own vN-M utility function. Assume the best possible outcome among lotteries under consideration is 1,000, and the worst is −100. We wish to assign utilities to all possible incomes ranging from −100 to 1,000. Begin by setting u(−100) = 0 and u(1,000) = 1. For any intermediate income level, e.g. 250, ask yourself:

If I had to choose between 250, and a lottery in which I receive 1,000 with probability p, and −100 with probability 1 − p, what value of p would make me indifferent?

Call this quantity p250. Clearly 0 < p250 < 1. Also, p251 > p250 (although not by much, probably). Now simply set u(250) = p250. Why does this work? By definition

p250u(1, 000) + (1− p250)u(−100) = u(250)

and we’ve normalized u so that u(1,000) = 1 and u(−100) = 0, hence u(250) = p250. Experimental economists use this idea in the lab to figure out whether a subject is more or less risk-averse. As Figure 24.1 shows, the more risk-averse one is (the more concave u), the bigger is p250, and the better the chances of winning 1,000 have to be in order to forfeit 250 for certain.
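The elicitation can also be run in reverse: given a utility function, compute the probability that makes its owner indifferent. The sketch below does this for a risk-neutral person and for a person with exponential utility; the coefficient 0.001 and the function names are assumptions for illustration only.

```python
import math


def indifference_probability(u, sure, worst=-100.0, best=1000.0):
    """p solving p*u(best) + (1 - p)*u(worst) = u(sure).

    Viewed as a function of `sure`, this is u rescaled so that
    u(worst) = 0 and u(best) = 1.
    """
    return (u(sure) - u(worst)) / (u(best) - u(worst))


def risk_neutral(y):
    return y


def risk_averse(y):
    return -math.exp(-0.001 * y)  # ASSUMED concave example


p_neutral = indifference_probability(risk_neutral, 250.0)
p_averse = indifference_probability(risk_averse, 250.0)
print(p_neutral, p_averse)
```

The risk-neutral value reproduces the 0.318 of Figure 24.1, and the risk-averse person requires a higher probability of winning before giving up the sure 250.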


24.2 The Demand for Insurance

We now use the expected utility function to show that if you are risk-averse, and you have access to actuarially fair insurance, then you will insure yourself fully against any risk. For example, suppose your income is 30,000, and the probability that you will have an accident is p = 0.05. In the event of an accident your medical bills will be 10,000. Your vN-M utility function is u. Without insurance your expected utility of income is

(1 − p)u(30,000) + pu(20,000)

How does insurance work in a simple world? An insurance contract for 1 worth of coverage is a promise by the insurance company to pay you 1 if you have an accident, and nothing otherwise. If the premium, i.e. the cost to you, is π, then the expected value of the contract to the insurance company is

(1− p)π + p(π − 1)

With probability 1 − p, you pay the premium and nothing happens. With probability p, you pay the premium but there is a claim and therefore a benefit payment of 1. If insurance companies were risk-neutral, they would compete for business by reducing π to the point that

(1 − p)π + p(π − 1) = 0

i.e. π = p. This is so-called actuarially fair insurance: coverage of 1 is available for a premium equal to the probability of a claim.

Suppose you buy c units of coverage at a premium of π. Your expected utility is

ϕ(c) = (1− p)u(30, 000− πc) + pu(20, 000− πc+ c)

where the function ϕ captures the value of different levels of coverage. If you choose c so as to maximize ϕ, the FONC is

ϕ′(c) = −π(1 − p)u′(30,000 − πc) + p(1 − π)u′(20,000 − πc + c) = 0

The SOC is not a concern since

ϕ′′(c) = π2(1− p)u′′(30, 000− πc) + p(1− π)2u′′(20, 000− πc+ c)

is always negative under the assumption that you are risk-averse. (Why? u concave =⇒ u′′ < 0.) Consider the FONC carefully for π = p. In this case

u′(30, 000− pc) = u′(20, 000 + c(1− p))

If u′′ < 0 as usual, then u′ is strictly decreasing and therefore one-to-one, so

u′(x) = u′(y) ⇐⇒ x = y

hence

30,000 − pc = 20,000 + c(1 − p)

or c = 10,000!
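The full-insurance result can be confirmed numerically by solving the FONC for a specific utility function. The sketch below assumes u(z) = log(z), so that u′(z) = 1/z, and finds the root of ϕ′ by bisection; the function name, the bracketing interval and the alternative premiums are illustrative assumptions.

```python
def optimal_coverage(premium, p=0.05, income=30_000.0, loss=10_000.0,
                     u_prime=lambda z: 1.0 / z):
    """Coverage c that maximizes
    (1 - p)*u(income - premium*c) + p*u(income - loss + (1 - premium)*c).

    u_prime is marginal utility; the default corresponds to u(z) = log(z),
    an ASSUMED risk-averse utility function.
    """
    def phi_prime(c):
        return (-premium * (1 - p) * u_prime(income - premium * c)
                + p * (1 - premium) * u_prime(income - loss + (1 - premium) * c))

    if phi_prime(0.0) <= 0:
        return 0.0  # even the first unit of coverage is not worth buying
    lo, hi = 0.0, 2 * loss  # ASSUMED bracket; phi' is decreasing in c
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if phi_prime(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(optimal_coverage(0.05))  # actuarially fair premium
print(optimal_coverage(0.06))  # premium above p
print(optimal_coverage(0.20))  # premium far above p
```

At the fair premium the solver returns full coverage. The other two calls preview Exercise 2 for this particular utility function: a premium above p yields less than full coverage, and a sufficiently high premium yields none.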

Exercises:


1. Redo the analysis of Section 24.2 assuming that if you buy insurance at all, you have to pay an underwriting fee of f. The price per unit of coverage remains p (total cost of c units of coverage is pc + f). Show that there is a number F such that

f ≤ F =⇒ you insure yourself fully

f > F =⇒ you don’t buy insurance at all

2. Redo the analysis of Section 24.2 assuming π > p, i.e. the insurance is not actuarially fair.


25 Uncertainty III: Moral Hazard

One of the most interesting problems in markets with uncertainty is that of moral hazard, the tendency of economic agents to change their behavior inefficiently upon having entered a contract of some sort. We owe this term to the insurance industry: a policy holder who fails to exercise due caution because he is insured is known as a moral hazard. A good example of this is a driver who rents a car and purchases the “full insurance” option. Moral hazard can arise in other contexts as well. For example, it often is argued that welfare systems discourage those who are in the system from seeking employment. In this section we analyze the demand for insurance when policyholders, through their own efforts, are capable of influencing the likelihood of an accident. We show that

1. With full insurance, policyholders have no incentive to avoid accidents.

2. A solution to the moral hazard problem involves a deductibility clause.

In particular, a high deductible generally will induce a greater level of preventive care at the cost of inducing variability in the policy holder’s income. Thus there is a tradeoff between insurance and efficiency.

The model is simple. In each state of the world (accident/no accident) the insured has y initially. In the event of an accident he loses ℓ. The insurance company offers to pay c in this event, in return for a per unit charge of π regardless. Expected utility depends on both ultimate wealth and effort x expended on accident prevention. Assume that consumers evaluate income-effort bundles according to

u(ultimate income)− d(effort)

where u is an expected utility function and d represents the cost of a concerted effort to avoid an accident. Assume d is convex, with d(0) = d′(0) = 0 as in Figure 25.1.

Figure 25.1: A representative cost-of-effort function.

The probability of an accident is p(x), where p is a decreasing function with p(0) = 0.5. We must have p(x) > 0 and p′(x) < 0 for all x > 0. A consumer who buys c units of coverage and who expends x units of effort has expected utility

ϕ(c, x) = p(x)[u(y − πc − ℓ + c) − d(x)] + (1 − p(x))[u(y − πc) − d(x)]

= p(x)u(y − ℓ + c(1 − π)) + (1 − p(x))u(y − πc) − d(x)


Notice that since equal effort is expended whether or not there is an accident, we end up subtracting d(x) from expected utility of income. Suppose the insurance company, through vast experience, knows p(x), i.e. knows how much effort the insured will expend. If they break even, then

(1− p(x))π − p(x)(1− π) = 0 (25.1)

by the same line of reasoning as in Section 24.2, so they charge π = p(x).

The consumer views π as exogenous and chooses c and x so as to maximize expected utility. The FONC are¹³

ϕc = p(x)(1 − π)u′(y − ℓ + c(1 − π)) − π(1 − p(x))u′(y − πc) ≥ 0 (25.2)

ϕx = p′(x)[u(y − ℓ + c(1 − π)) − u(y − πc)] − d′(x) = 0 (25.3)

Since the insurance company sets π so that (25.1) holds, i.e. π = p(x), dividing (25.2) by p(x)(1 − p(x)) > 0 gives

u′(y − ℓ + c(1 − π)) − u′(y − πc) ≥ 0 (25.4)

Suppose that equality holds in (25.4), i.e. the insured gets all the coverage he wants. Then, as in Section 24.2

y − ℓ + c∗(1 − π) = y − πc∗ ⇐⇒ c∗ = ℓ

But, with full coverage is there any incentive to be cautious? If the insured goes out of his way to be careful, p falls, and he saves

u(y − ℓ + c(1 − π)) − u(y − πc)

in utility. With full coverage the savings are nil: he doesn’t reap the benefit of his actions because the insurance company bears all of the risk. Therefore, if d′(0) = 0, the FONC are satisfied with x∗ = 0 and c∗ = ℓ, i.e. the insured takes minimal care. Insurance companies expect this and set premiums accordingly.

This level of care is socially inefficient because the marginal cost of care is 0 when x = 0. If the insured were just a little bit more careful, it would cost next to nothing, yet it would result in fewer accidents and lower premiums. There is a breakdown in the usual argument about markets leading to socially efficient outcomes because each consumer views the premium as exogenous even though ultimately π = p(x) since, in the long run, the insurance company understands what is going on.

25.1 Solution with No Moral Hazard

Suppose the consumer recognizes that π = p(x) (this would be true if the insurance company could monitor her behavior). In this case her objective is to maximize

ϕ(x, c) = p(x)u(y − ℓ + c(1 − p(x))) + (1 − p(x))u(y − cp(x)) − d(x)

13 The “>” in (25.2) reflects the idea that the consumer is “rationed.”


The FONC are

ϕc = p(x)[1 − p(x)]u′(y − ℓ + c(1 − p(x))) − p(x)[1 − p(x)]u′(y − cp(x)) = 0 ⇐⇒ c = ℓ

ϕx = p′(x)[u(y − ℓ + c(1 − p(x))) − u(y − cp(x))] − d′(x)

− cp(x)p′(x)u′(y − ℓ + c(1 − p(x)))

− cp′(x)[1 − p(x)]u′(y − cp(x))

= 0 (25.5)

Compare this to (25.3) and note that allowing premiums to vary according to effort gives rise to extra terms. Now use y − ℓ + c[1 − p(x)] = y − cp(x) in (25.5):

− p′(x)u′(y − cp(x))ℓ = d′(x) (25.6)

which has the following interpretation: if the insured expends more effort, the cost is d′(x), the RHS. On the other hand this reduces the likelihood of an accident by p′(x), saving ℓ times marginal utility of income u′(y − cp(x)), the LHS. The optimal level of caution is such that the marginal costs and marginal benefits are perfectly balanced. Note that (25.6) usually implies a level of caution greater than zero, unless p′(x) = 0, in which case an increase in effort doesn’t reduce the likelihood of an accident. Notice too that the optimal solution has marginal benefit of accident prevention equal to u′(y − cp(x)) × ℓ.
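To get a feel for condition (25.6), it can be solved numerically once functional forms are chosen. The sketch below assumes p(x) = p0 − αx, d(x) = βx² and u(z) = log(z), with full coverage c = ℓ; these forms and all parameter values are hypothetical choices for illustration (p0 = 0.5 matches p(0) = 0.5 above).

```python
def efficient_effort(income=30_000.0, loss=10_000.0, p0=0.5, alpha=0.01, beta=0.05):
    """Solve -p'(x) * u'(income - loss*p(x)) * loss = d'(x) by bisection.

    ASSUMED forms: p(x) = p0 - alpha*x, d(x) = beta*x**2, u(z) = log(z),
    and full coverage (c equal to the loss).
    """
    def gap(x):
        p = p0 - alpha * x
        marginal_benefit = alpha * loss / (income - loss * p)  # -p'(x) u'(.) loss
        marginal_cost = 2 * beta * x                            # d'(x)
        return marginal_benefit - marginal_cost

    lo, hi = 0.0, p0 / alpha  # keep p(x) nonnegative
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x_star = efficient_effort()
print(x_star)
```

With these made-up parameters the solution is small but strictly positive, in contrast to the x∗ = 0 obtained when the premium is taken as given.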

25.2 A Partial Solution

How can we incentivize the insured to expend effort avoiding accidents when her efforts aren’t rewarded with a lower premium? Look at equations (25.2) and (25.3). Assuming π = p,

u′(y − ℓ + c(1 − π)) − u′(y − πc) ≥ 0 (25.7)

−p′(x)[u(y − πc) − u(y − ℓ + c(1 − π))] = d′(x) (25.8)

Suppose the insurance company refuses to sell full coverage (c < ℓ). Utility in the accident state is less than it is otherwise,

u(y − πc) > u(y − ℓ + c(1 − π))

and there is in fact an incentive to avoid accidents. The insured prefers more coverage (because the LHS of (25.7) is positive) but the insurance company refuses to sell any more. The insurance company is instituting a deductible that the insured must pay in the event of an accident. The amount of the deductible, ℓ − c, influences the amount of care taken by the insured.

Let a = ℓ − c. Then (25.8) becomes

−p′(x)[u(y − πℓ + πa) − u(y − πℓ − a(1 − π))] = d′(x)

or

$$\Delta u = u(y - \pi\ell + \pi a) - u(y - \pi\ell - a(1 - \pi)) = -\frac{d'(x)}{p'(x)}$$

See what the deductible does? It provides the insured with less income in the accident state. Now we can show that a higher deductible causes the insured to try even harder to avoid an accident. For example, if

p(x) = p− αx


and

d(x) = βx²

then optimal effort x∗ is such that

$$\Delta u = \frac{2\beta x^*}{\alpha}$$

or

$$x^* = \frac{\alpha}{2\beta}\,\Delta u$$

i.e. optimal effort increases with ∆u, which in turn increases with a.
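The comparative static can be tabulated for a concrete case. The sketch below assumes u = log and treats the premium π as a fixed number, as the consumer does; the incomes, the premium, α, β and the list of deductibles are hypothetical values chosen for illustration.

```python
import math


def utility_gap(a, income=30_000.0, loss=10_000.0, premium=0.05, u=math.log):
    """Delta u: utility without an accident minus utility with one,
    when the deductible is a (coverage is loss - a)."""
    no_accident = income - premium * loss + premium * a
    accident = income - premium * loss - a * (1 - premium)
    return u(no_accident) - u(accident)


def optimal_effort(a, alpha=0.01, beta=0.05):
    """x* = alpha * Delta u / (2 * beta), for p(x) = p - alpha*x and d(x) = beta*x**2."""
    return alpha * utility_gap(a) / (2 * beta)


deductibles = [0, 1_000, 2_500, 5_000]  # hypothetical deductibles
efforts = [optimal_effort(a) for a in deductibles]
for a, x in zip(deductibles, efforts):
    print(a, x)
```

Effort is zero with no deductible and rises with each increase in a, at the cost of a wider gap between incomes in the two states.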


26 Uncertainty IV: The State-preference Approach and Adverse Selection

In this section we continue to consider insurance problems with the following characteristics:

• An individual with income y is at risk of losing ℓ.

• The loss occurs with probability p.

• The individual has a vN-M utility function u(z), where z denotes (net, or ultimate) income.

• The individual has access to actuarially fair insurance with a per-unit premium of π = p.

• The individual’s expected utility is

ϕ(c) = pu(y − ℓ + c(1 − π)) + (1 − p)u(y − πc)

With c units of coverage the problem is summarized in Table 3.

Table 3: A summary of our assumptions regarding a policy holder.

State          Probability   Income               Utility
accident       p             y − ℓ + c(1 − π)     u(y − ℓ + c(1 − π))
no accident    1 − p         y − πc               u(y − πc)

We now introduce a graphical way of analyzing a problem such as this. The approach we take is called the state-preference approach, and it applies only if p is fixed. Therefore it is less useful in moral hazard style problems where p varies according to the endogenous variable x.

26.1 Setup

Figure 26.1: The slope of the indifference curves, on their way through the line yA = yN , is −p/(1− p).


Think of the consumer as choosing a bundle consisting of two goods: income in the accident state yA and income in the no-accident state yN. His expected utility is then

v(yA, yN ) = pu(yA) + (1− p)u(yN )

We can draw indifference curves in yAyN -space as in Figure 26.1. Note that in this case

$$\text{MRS} = \frac{v_1}{v_2} = \frac{p}{1 - p} \times \frac{u'(y_A)}{u'(y_N)}$$

Along the line yA = yN ,

$$\text{MRS} = \frac{p}{1 - p}$$

The consumer of Table 3 is represented graphically by Figure 26.2.

Figure 26.2: E denotes the consumer’s endowment, her incomes in each state without insurance.

Note that every point on the line through (y − ℓ, y) with slope −p/(1 − p) has the same expected income. Why? Consider varying yA and yN in such a way as to hold expected income constant:

$$p\,dy_A + (1 - p)\,dy_N = 0$$

or

$$\frac{dy_N}{dy_A} = -\frac{p}{1 - p}$$

The consumer has access to insurance with a per-unit premium of π = p, so with c units of coverage his incomes are

yA = y − ℓ + c(1 − π)

yN = y − πc

As coverage increases, the consumer moves along a line with slope −π/(1 − π) since each unit of coverage raises income in the accident state by 1 − π and reduces income in the no-accident state by π. But π = p, so we have Figure 26.3. Recall that on yA = yN, MRS = p/(1 − p), so, if the consumer can buy as much insurance as he wants, then he will satisfy the tangency condition for a constrained optimum by choosing c = ℓ.

Figure 26.3: The tangency condition is satisfied by full coverage.

Figure 26.4: Here π > p, so the “budget line” is rotated about E.

What happens if π > p? Have a look at Figure 26.4. Starting from the endowment (y − ℓ, y), the budget line has slope¹⁴ π/(1 − π) > p/(1 − p), hence the optimum lies above the line yA = yN.¹⁵ This means that when insurance companies sell policies with a “load” factor, consumers buy less than full coverage. Note the following:

• If π is too high as in Figure 26.5, the consumer won’t buy any coverage.

• If π < p, the consumer will over-insure as in Figure 26.6.

14 Steepness, actually.

15 All indifference curves have slope −p/(1 − p) on the line yA = yN, and u concave =⇒ v exhibits DMRS; verify this!


Figure 26.5: Here π is prohibitively high, so the consumer does without insurance altogether.

Figure 26.6: Here π is so low that the consumer can afford to provide himself with more income in the accident state than he has initially, so he tends to over-insure.

26.2 Adverse Selection

We now are ready to consider the case of two types of consumers: high-risk consumers with p = pH and low-risk consumers with p = pL. We assume that consumers know their own type but that the insurance company cannot tell who’s who. Suppose the population is half high-risk, half low-risk, and the average level of risk is therefore given by

$$\bar{p} = \frac{p_H + p_L}{2}$$

If the insurance company were to charge everyone π = p̄, the low-risk consumers would buy less than full coverage and the high-risk consumers would over-insure (or buy as much as they are allowed, up to c = ℓ).

How might equilibrium work in this case? One possibility is a signaling equilibrium in which the insurance companies offer two types of policies: one with a low premium πL = pL that requires a deductible a, another with a high premium πH = pH and no deductible. If a is high enough, the high-risk consumers will self-select and opt for the high-premium policy. The deductible has to be such that high-risk consumers are a little worse off with the low-premium policy than they would be with a high premium and no deductible, otherwise they would masquerade as low-risk consumers.¹⁶

Figure 26.7: The low-risk consumer is in blue, and the high-risk consumer is in red.

In this model, which was described first by M. Rothschild and J. Stiglitz (QJE, 1976), the deductible associated with the low-risk contract serves the same purpose as extra education acquired by job seekers in M. Spence’s job-market signaling model. The point is that low-risk consumers have a lesser cost of bearing risk (deductible) because they know an accident is relatively unlikely. The presence of the high-risk consumers is a problem for the low-risk consumers: if they can identify themselves credibly, they can achieve higher utility, but the only way to identify themselves credibly is by purchasing a contract that a high-risk consumer would turn down.

Note that R-S equilibrium involves firms and consumers. Firms charge actuarially fair premiums and therefore don’t profit in the long run. Consumers of all types are happy to make the appropriate choices.
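The size of the deductible needed to separate the two types can be computed for a concrete case. The sketch below is not the Rothschild-Stiglitz analysis itself, only an illustration of the self-selection condition: it assumes u = log, hypothetical values for income, loss, pH and pL, and finds by bisection the deductible at which a high-risk consumer is exactly indifferent between the two contracts.

```python
import math


def separating_deductible(p_high, p_low, income=30_000.0, loss=10_000.0, u=math.log):
    """Deductible a at which a high-risk consumer is just indifferent between
    full coverage at premium p_high and the p_low contract with deductible a."""
    def gain_from_mimicking(a):
        accident = income - p_low * loss - a * (1 - p_low)
        no_accident = income - p_low * loss + p_low * a
        mimic = p_high * u(accident) + (1 - p_high) * u(no_accident)
        return mimic - u(income - p_high * loss)

    # The gain is positive at a = 0 and decreasing in a; it is ASSUMED to be
    # negative at a = loss (true for the numbers used below).
    lo, hi = 0.0, loss
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gain_from_mimicking(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a_min = separating_deductible(p_high=0.2, p_low=0.05)
print(a_min)
```

Any deductible slightly above a_min keeps high-risk consumers in the high-premium contract. With these numbers the low-risk consumers still prefer the deductible contract to full coverage at the high premium, although they would be better off with full coverage at πL = pL if they could prove their type.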

16 This should remind the reader of price discrimination.


27 Auctions I: Types of Auctions

Many items are sold by auction, including treasury bills, broadcasting rights, real estate, livestock, fine art, and natural resources (e.g. timber lands and oil fields). Large companies and governments also use procedures that are equivalent to auctions to determine who will supply goods or services in some cases.

In this section and the next we examine how economists model auctions. Although auctions have existed for centuries, the basic theory thereof is quite modern. One good, somewhat advanced reference is Paul Klemperer, “Auction Theory: A Guide to the Literature,” Journal of Economic Surveys Vol 13 (3), July 1999, which is available at http://www.nuff.ox.ac.uk/users/klemperer/survey.pdf.

27.1 Basic Types of Auction

There are four basic types of auction for a single good:

1. English Auction. Also known as an “ascending bid” auction, this probably is the one with which you are most familiar. An auctioneer acts as moderator, and asks for bids from a group of n bidders. If a bidder bids b(n), and no one outbids him, then he wins the auction and pays b(n) in return for the good. Note that the auctioneer may be a computer. (eBay essentially is an English auction arena, although each of the auctions has a time limit, which is unusual for an English auction.)

2. Dutch Auction. Also known as a “descending bid” auction, the auctioneer calls out a descending sequence of prices, starting from a price that is clearly too high. The first bidder to announce that she is willing to accept the current price, b(n), wins the auction and pays b(n) in return for the good.

3. First-Price Sealed-Bid Auction. Bidders submit written bids. At a certain point the bidding is closed. The auctioneer then selects the highest bid b(n), which is declared the winner. The winner pays b(n).

4. Second-Price Sealed-Bid Auction. Also known as a “Vickrey” auction, bidders again submit written bids and, at a certain point, the bidding is closed. The auctioneer then selects the highest bid b(n), which is declared the winner; however, the winner pays the second highest bid b(n−1).
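The two sealed-bid formats differ only in the payment rule, which a few lines of code make explicit. The function below is a sketch with hypothetical names and bids; the tie-breaking rule (lowest index wins) is an assumption, since the notes do not specify one.

```python
def sealed_bid_outcome(bids, second_price=False):
    """Return (index of the winner, price paid) for a sealed-bid auction.

    bids : one bid per bidder
    Ties go to the bidder with the lowest index (an ASSUMED rule).
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    # Indices ordered from highest to lowest bid; the sort is stable.
    ranked = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner = ranked[0]
    price = bids[ranked[1]] if second_price else bids[winner]
    return winner, price


bids = [120, 150, 90]  # hypothetical bids
print(sealed_bid_outcome(bids))                     # first-price rule
print(sealed_bid_outcome(bids, second_price=True))  # second-price rule
```

The same bidder wins under both rules; only the price changes, from the winner's own bid to the highest losing bid.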

Auction models differ in their assumptions regarding how the value of an item at auction varies from one person to the next, and how much the bidders know about their own potential valuations as well as those of other bidders. The value of the item to bidder i will be denoted vi.

We shall focus on three important cases:

1. Private Values. Each valuation is independent and known only to the bidder.

2. Common Value. vi = v for all i but v is unknown. (Examples might include an auction to sell the rights to drill for oil in a certain tract of land.)


3. Affiliated Values. vi varies across bidders but bidders themselves do not know their own valuations with certainty, and the valuations are positively correlated. (Examples might include an auction for a house.)

27.2 Important Results Concerning the Private Values Case

1. A Dutch auction is equivalent to a first-price sealed-bid auction.

In a Dutch auction there is no dynamic choice: one must choose an opt-in price ex ante and, if the price falls to that level, opt in and receive the good for that price. This is the same problem as deciding what bid to submit in a first-price sealed-bid auction. (We defer for the moment the optimal choice of bidding strategy in these auctions.)

2. In an English auction the optimal strategy is to keep bidding until the current highest bid b exceeds your valuation $v_i$. Why?

(a) If $b > v_i$, you are advised to walk away, for otherwise, if you bid b′ > b, then the eventual winner will pay at least $b > v_i$.

(b) If vi > b, and you walk away, you leave a surplus vi − b > 0 “on the table.”

(c) If b = vi, then bid b+ ε and, if no one outbids you, break even.

3. In light of (2c), in an English auction the bidder with the highest valuation wins, and pays the second highest valuation (plus a marginal amount ε needed to surpass the second highest bidder).

4. In a second-price sealed-bid auction, the optimal strategy is to bid your valuation.

Suppose your true value is v and you bid v − x, where x ≥ 0. Suppose the highest bid among all other bidders is w.

(a) If v − x > w, you win and pay w.

(b) If v − x < w, you lose and pay nothing.

Your expected surplus,

\[ s = (v - w)\,P(v - x > w) \]

is maximized by setting x = 0!

Now suppose you bid v + x, where x and w are as before.

(a) If v + x < w, then you lose and pay nothing.

(b) If v + x > w, you win and pay w. Your surplus is v − w. There are two cases:

v ≥ w =⇒ you want to win

v < w =⇒ you don’t want to win

If you set x = 0, you win iff v > w, so x = 0 is the best choice.

5. Based on (4), in a second-price sealed-bid auction the winner is the bidder with the highest valuation, who then pays a price equal to the second highest valuation.


6. Results (3) and (5) imply that in the private values case an English auction is equivalent to a second-price sealed-bid auction.

7. In the common value case English auctions may be different because a bidder can make inferences based on the identity/size of the remaining pool of bidders.
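The argument behind result (4) can be checked numerically. The following is a minimal R sketch, assuming purely for illustration that the highest competing bid w is drawn from Uniform(0, 1); the function name sp.surplus and the numbers used are hypothetical and not part of the notes.

```r
# Average surplus in a second-price auction from bidding `bid` when the true
# valuation is v. Assumption (illustrative only): the highest competing bid w
# is drawn from Uniform(0, 1).
sp.surplus = function(bid, v, w){
  mean(ifelse(bid > w, v - w, 0))    # win iff bid > w, and then pay w
}

set.seed(1)
w = runif(1e5)                       # simulated highest competing bids
v = 0.6                              # hypothetical valuation
s.low   = sp.surplus(v - 0.2, v, w)  # shading: forgo some profitable wins
s.truth = sp.surplus(v,       v, w)  # bid the valuation
s.high  = sp.surplus(v + 0.2, v, w)  # overbidding: add some unprofitable wins
print(c(s.low, s.truth, s.high))
```

Because the same draws of w are used for all three bids, bidding the valuation can never do worse than either alternative in this sketch: it wins exactly when v − w is positive.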

27.3 Bidding in a First-price Auction

How does one bid in a first-price auction? Let us start by making a few assumptions:

• The valuations $v_1, \dots, v_n$ are independent and identically distributed (IID), with $P(v_i \le x) = F(x)$ for all i. (This is sometimes written $v_1, \dots, v_n \overset{\text{IID}}{\sim} F$, where F is the cumulative distribution function, or CDF.)

• Each bidder adopts the same strategy and bids bi = B(vi), where B is the bid function.

What does the bid function look like?

• B is increasing, for otherwise the bidder with the highest valuation wouldn’t necessarily win.

• B increasing =⇒ B invertible =⇒ if b = B(v), then v = g(b), where g denotes the inverse bid function $B^{-1}$.

Assuming each bidder bids according to B,

\[ P(\text{you win with bid } b) = P(v_j < g(b) \text{ for all other bidders } j) = [F(g(b))]^{n-1} \]

Let s = s(b, v) = expected surplus given bid b and valuation v. Then

\[ s = \underbrace{(v - b)}_{\text{surplus if win}} \times \underbrace{[F(g(b))]^{n-1}}_{\text{prob. of winning}} \]

What is the FONC for b?

\begin{align*}
\frac{\partial s}{\partial b} &= -[F(g(b))]^{n-1} + (v - b)(n-1)[F(g(b))]^{n-2} \times \frac{\partial}{\partial b} F(g(b)) \\
&= -[F(g(b))]^{n-1} + (n-1)(v - b)[F(g(b))]^{n-2} f(g(b))\, g'(b)
\end{align*}

where f = F ′ is the probability density function. Setting ∂s/∂b = 0,

\[ [F(g(b))]^{n-1} = (n-1)(v - b)[F(g(b))]^{n-2} f(g(b))\, g'(b) \]

which implies

\[ v - b = \frac{F(g(b))}{f(g(b))} \times \frac{1}{g'(b)} \times \frac{1}{n-1} \tag{27.1} \]

Note that since B′ > 0, so is g′ = 1/B′, and therefore v − b > 0. This means one should always “shade” his or her bid. Why? If you bid more, then although you win more often you also pay more. Like a monopolist, you must take this into account.


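As an example, suppose $v_1, \dots, v_n \overset{\text{IID}}{\sim} \text{Uniform}(0, 1)$, that is, suppose the valuations are independent and identically distributed with

\[ F(v) = \begin{cases} 0, & v < 0 \\ v, & 0 \le v \le 1 \\ 1, & v > 1 \end{cases} \]

and therefore

\[ f(v) = \begin{cases} 1, & 0 \le v \le 1 \\ 0, & \text{otherwise} \end{cases} \]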

In this case (27.1) says

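\[ g(b) - b = \frac{1}{n-1} \times \frac{g(b)}{g'(b)} \]

which is a differential equation with solution

\[ g(b) = \frac{n}{n-1}\, b \]

This is easily verifiable:

\[ g(b) - b = \frac{n}{n-1}\, b - b = \frac{n - (n-1)}{n-1}\, b = \frac{b}{n-1} = \frac{1}{n-1} \times \frac{\frac{n}{n-1}\, b}{\frac{n}{n-1}} = \frac{1}{n-1} \times \frac{g(b)}{g'(b)} \]

The bid function is recovered by setting v = g(b) and solving

\[ v = \frac{n}{n-1}\, b \]

for b = B(v),

\[ B(v) = \frac{n-1}{n}\, v \]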
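As a check on the uniform example, one can maximize the expected surplus numerically for a single bidder while the other n − 1 bidders follow B(v) = (n − 1)v/n. This is a minimal R sketch; the function name fp.surplus and the values n = 5, v = 0.7 are hypothetical choices made for illustration.

```r
# Expected surplus from bidding b with valuation v when the other n - 1
# bidders have Uniform(0, 1) valuations and bid B(v) = (n - 1) v / n,
# so that g(b) = n b / (n - 1) and F(g(b)) = min(g(b), 1).
fp.surplus = function(b, v, n){
  (v - b) * pmin(n * b / (n - 1), 1)^(n - 1)
}

n = 5      # hypothetical number of bidders
v = 0.7    # hypothetical valuation
best = optimize(fp.surplus, interval = c(0, v), v = v, n = n, maximum = TRUE)
print(c(best$maximum, (n - 1) / n * v))   # numerical optimum vs. B(v)
```

The two printed numbers should agree up to the tolerance of optimize, which is what it means for B to be a best response to itself.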


28 Auctions II: Winner’s Curse

In this section we analyze the winner’s curse. The winner’s curse arises in common value and affiliated values auctions, in which each bidder estimates the value of the item at auction. Bidders with higher guesses have, on average, made a positive error. So bidders must shade their bids to compensate for the fact that when they win, on average it is the result of being overly optimistic.

A very simple example of the winner’s curse is the following: a police car is to be sold at auction using a second-price sealed-bid system. Each bidder inspects the car, then the bidding begins.

The true value of the car is v, a random variable with mean µ and variance $\sigma_v^2$. This reflects the idea that over many auctions, the average value of an old police car is µ. But there is variability from one car to the next, captured by $\sigma_v^2$.

Each bidder hires a mechanic to estimate the value of the car. The mechanic reports his or her estimated value $t_i = v + \varepsilon_i$, where $\varepsilon_i$ is the error in the mechanic’s assessment. A bidder doesn’t know the values reported to his competitors by their respective mechanics.

We assume $\varepsilon_1, \dots, \varepsilon_n \overset{\text{IID}}{\sim} F_\varepsilon$, with mean zero and variance $\sigma_\varepsilon^2$. Note that the larger is $\sigma_\varepsilon^2$ relative to $\sigma_v^2$, the “noisier” the mechanics’ reports.

Based on $t_i$, the ith bidder estimates the true value of the car. In particular, the bidder forms an estimate

\[ y_i = \lambda t_i + (1 - \lambda)\mu \]

The idea behind this is that if $t_i$ is significantly noisy, the bidder should downweight the mechanic’s report and assume instead that the average value at auction is more credible. Note that

E[yi] = λE[ti] + (1− λ)µ = λE[v + εi] + (1− λ)µ = λµ+ (1− λ)µ = µ

What is the optimal value of λ? The forecast error for a given value of λ is

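\begin{align*}
\delta_i &= y_i - v \\
&= \lambda t_i + (1 - \lambda)\mu - v \\
&= \lambda(v + \varepsilon_i) + (1 - \lambda)\mu - v \\
&= (1 - \lambda)(\mu - v) + \lambda\varepsilon_i
\end{align*}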

The variance of the forecast error is

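\begin{align*}
V[\delta_i] &= V[(1 - \lambda)(\mu - v) + \lambda\varepsilon_i] \\
&= (1 - \lambda)^2 V[v] + \lambda^2 V[\varepsilon_i] \\
&= (1 - \lambda)^2 \sigma_v^2 + \lambda^2 \sigma_\varepsilon^2
\end{align*}

(The second equality holds provided the mechanic’s error $\varepsilon_i$ is uncorrelated with the true value v.)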

We choose λ to minimize the variance of the forecast error. The FONC is

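\[ \frac{\partial V[\delta_i]}{\partial \lambda} = -2(1 - \lambda)\sigma_v^2 + 2\lambda\sigma_\varepsilon^2 = 0 \]

which implies $(1 - \lambda)\sigma_v^2 = \lambda\sigma_\varepsilon^2$, or

\[ \lambda = \lambda^* = \frac{\sigma_v^2}{\sigma_v^2 + \sigma_\varepsilon^2} \]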


You may have seen this before: λ∗ is the signal-to-total-variance ratio. If $\sigma_\varepsilon^2$ is small, then λ∗ is nearly one, and the result is a weighted average with more weight on the mechanic’s report.
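The optimal weight can be confirmed by minimizing the forecast-error variance numerically. The following is a minimal R sketch; the function name fe.var and the variances 4 and 1 are hypothetical values chosen for illustration.

```r
# Variance of the forecast error as a function of lambda
fe.var = function(lambda, s2v, s2e) (1 - lambda)^2 * s2v + lambda^2 * s2e

s2v = 4   # hypothetical variance of the true value v
s2e = 1   # hypothetical variance of the mechanic's error
opt = optimize(fe.var, interval = c(0, 1), s2v = s2v, s2e = s2e)
print(c(opt$minimum, s2v / (s2v + s2e)))   # numerical vs. closed-form weight
```

With these numbers the closed-form weight is 4/5, so the mechanic’s report receives most of the weight because it is much less noisy than the variation in v across cars.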

Based on the mechanic’s report, plus the optimal choice of λ = λ∗, each bidder now has a good idea as to the value of the car in the current auction. Since it is a second-price auction, one might think each bidder simply should bid

y∗i = λ∗ti + (1− λ∗)µ

But this will give rise to a winner’s curse! The highest bidder wins the auction; this is the bidder whose mechanic made the biggest positive error. He pays the amount of the second highest bid, which includes the second biggest positive error. So, even in a second-price auction, one must take into account the fact that on average the second highest bidder also was overly optimistic, and shade his or her bid.

In an auction with 10 bidders, if $\varepsilon_1, \dots, \varepsilon_n \overset{\text{IID}}{\sim} N(0, 1)$, i.e. if the errors are normally distributed with mean zero and variance one,¹⁷ then the expected value of the second highest error $\varepsilon_{(n-1)}$ is approximately 1.001. Table 4 lists a few other cases. As you can see, $E[\varepsilon_{(n-1)}]$ grows with n. This might cause you to worry a little about eBay: with thousands of bidders on a given item, if you do not know what the item is worth to you, then you ought to shade your bid quite a bit. But what should you bid?

Table 4: Expected Second Highest Error for Various Numbers of Bidders

  n      $E[\varepsilon_{(n-1)}]$
  10     1.001
  25     1.524
  35     1.692
  100    2.148

Suppose each bidder shades his or her bid by k:

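\begin{align*}
y_i &= \lambda^* t_i + (1 - \lambda^*)\mu - k \\
&= \lambda^* v + (1 - \lambda^*)\mu + \lambda^*\varepsilon_i - k
\end{align*}

Let $\bar{\varepsilon} = E[\varepsilon_{(n-1)}]$. (This quantity usually is estimated by computer simulation.¹⁸) The expected bid of the second-highest bidder is

\[ \lambda^* v + (1 - \lambda^*)\mu + \lambda^*\bar{\varepsilon} - k. \]

So, if everyone sets $k = \lambda^*\bar{\varepsilon}$, then each bidder should expect to pay

\[ E[\lambda^* v + (1 - \lambda^*)\mu] = \mu \]

in the event he wins, which means that on average he pays what the item is worth.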

Bottom line: in the second-price auction we have set up, each bidder bids what she believes the item is worth based on her information, minus a discount equal to λ∗ times the expected second-highest error, $E[\varepsilon_{(n-1)}]$, in the observed signal. Note that in the real world bidders hire consultants to simulate the auction (by making educated guesses as to $\sigma_\varepsilon^2$ and $\sigma_v^2$).
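The effect of shading can be illustrated by simulation. The following is a minimal R sketch under assumptions that are not in the notes: the true value and the errors are taken to be normal, and the names wc.sim and e2 and all numerical values are hypothetical. The discount uses the expected second highest standard normal error, scaled by the error standard deviation.

```r
# Simulate B second-price auctions with n bidders who all shade by k.
# Assumptions (illustrative): v ~ N(mu, s2v) and errors ~ N(0, s2e).
wc.sim = function(B, n, mu, s2v, s2e, k){
  lambda = s2v / (s2v + s2e)
  value = numeric(B); paid = numeric(B)
  for(i in 1:B){
    v = rnorm(1, mu, sqrt(s2v))                     # true common value
    report = v + rnorm(n, 0, sqrt(s2e))             # mechanics' reports t_i
    bids = lambda * report + (1 - lambda) * mu - k  # shaded estimates
    value[i] = v
    paid[i] = sort(bids)[n - 1]                     # winner pays 2nd highest bid
  }
  c(value = mean(value), paid = mean(paid))
}

# Expected second highest of n standard normal draws (formula of Appendix 28.1)
e2 = function(n){
  f = function(x) x * n * (n - 1) * pnorm(x)^(n - 2) * dnorm(x) * (1 - pnorm(x))
  integrate(f, -Inf, Inf)$value
}

set.seed(1)
n = 10; mu = 10; s2v = 1; s2e = 1
lambda = s2v / (s2v + s2e)
naive  = wc.sim(20000, n, mu, s2v, s2e, k = 0)
shaded = wc.sim(20000, n, mu, s2v, s2e, k = lambda * sqrt(s2e) * e2(n))
print(rbind(naive, shaded))
```

In the naive row the average payment should exceed the average value, which is the winner’s curse; in the shaded row the two averages should be close.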

¹⁷ See Figure 28.1.
¹⁸ See Appendix 28.1.


Figure 28.1: Standard normal distribution with PDF in red and CDF in black.

28.1 Appendix: Order Statistics

Let $X_1, \dots, X_n \overset{\text{IID}}{\sim} F_X$, and let $Y_k = X_{(n-k+1)}$ denote the kth biggest observation, 1 ≤ k ≤ n. Then

\begin{align*}
F_{Y_2}(x) &= P(\text{exactly one observation is greater than } x) + P(\text{all observations are at most } x) \\
&= n[1 - F_X(x)] \times [F_X(x)]^{n-1} + [F_X(x)]^n \\
&= n[F_X(x)]^{n-1} - n[F_X(x)]^n + [F_X(x)]^n \\
&= n[F_X(x)]^{n-1} - (n-1)[F_X(x)]^n
\end{align*}

and therefore

\begin{align*}
f_{Y_2}(x) &= F'_{Y_2}(x) \\
&= n(n-1)[F_X(x)]^{n-2} f_X(x) - n(n-1)[F_X(x)]^{n-1} f_X(x) \\
&= n(n-1)[F_X(x)]^{n-2} f_X(x) S_X(x)
\end{align*}

where $S_X(x)$ is defined to be $1 - F_X(x)$. Then

\[ E[Y_2] = n(n-1) \int_{\mathbb{R}} x [F_X(x)]^{n-2} f_X(x) S_X(x)\, dx \]


28.1.1 Uniform Distribution

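If $X_1, \dots, X_n \overset{\text{IID}}{\sim} \text{Uniform}(0, 1)$, then

\begin{align*}
E[Y_2] &= n(n-1) \int_0^1 x^{n-1}(1 - x)\, dx \\
&= n(n-1) \left( \frac{1}{n} - \frac{1}{n+1} \right) = \frac{n-1}{n+1}
\end{align*}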
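The closed form can be compared with direct numerical integration of the general formula for E[Y2]. This is a minimal R sketch; the function name e2.unif and the choice n = 10 are hypothetical.

```r
# Expected second highest of n Uniform(0, 1) draws by numerical integration
e2.unif = function(n){
  f = function(x) x * n * (n - 1) * punif(x)^(n - 2) * dunif(x) * (1 - punif(x))
  integrate(f, 0, 1)$value
}

n = 10
print(c(e2.unif(n), (n - 1) / (n + 1)))   # numerical integral vs. closed form
```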

28.1.2 Normal Distribution

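If $X_1, \dots, X_n \overset{\text{IID}}{\sim} N(0, 1)$, then

\[ E[Y_2] = n(n-1) \int_{\mathbb{R}} x [\Phi(x)]^{n-2} [1 - \Phi(x)] \varphi(x)\, dx \]

where Φ is the standard normal CDF and

\[ \varphi(x) = \Phi'(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \]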

This can be computed by Gaussian quadrature with the following R¹⁹ function:

ESB.gq = function(n, CDF, PDF){
  # - computes approx. EV of 2nd biggest observation among n IID draws from dist. "CDF"
  # - corresponding density is "PDF"
  f = function(x){x*n*(n-1)*(CDF(x))^(n-2)*PDF(x)*(1 - CDF(x))}
  # - "f" is x times the density of the 2nd biggest observation
  integrate(f, -Inf, Inf)$value
}

As a check, we may approximate E[Y2] for n = 100 by simulation. The following R script returns E[Y2] ≈ 2.148444, which agrees fairly well with the previous result.

ESB.sim = function(n,B){
  # - computes EV of 2nd biggest observation among n IID draws from standard normal
  x = 0
  for(i in 1:B) x = x + sort(rnorm(n))[n-1]
  x/B
}

print(ESB.sim(100,1e06))

¹⁹ R is a programming language for statistical computing that is available free of charge at http://www.r-project.org.


29 Finance I: Capital Asset Pricing Model

In this section we consider the implications of the simple assumption that investors hold only portfolios of assets that are mean-variance efficient. This turns out to have the surprising consequence that the price of a stock, i.e. the price of a share in a publicly traded company, depends on the covariance between the return on the stock and the return on the market as a whole. This result was discovered in the early 1960s by William Sharpe, who shared the Nobel Prize in Economics on the basis of his work. The theoretical model is called the Capital Asset Pricing Model (CAPM).

29.1 Assumptions

We shall assume that investors can choose among a set of assets, i = 1, . . . , n. An investor with x to invest selects a portfolio, or in other words a list of the amount invested in each of the possible assets. Denote by $\alpha_i$ the share of x in asset i. Note that $\alpha_i \ge 0$ for all i, and $\sum_{i=1}^n \alpha_i = 1$. You can think of any vector $\alpha \in \mathbb{R}^n$ with non-negative elements that add up to one as a portfolio.

We shall assume for simplicity that our investor has a one-period horizon, or holding period. An investment of \$1 in asset i will be worth $1 + R_i$ at the end of the holding period. Thus $R_i$ is the proportional return on asset i over the holding period. Note that $R_i \ge -1$ if assets have limited liability, for in that case the worst event would be to lose one’s entire investment. $R_1, \dots, R_n$ are random variables. The mean of $R_i$ is $r_i = E[R_i]$, and the variance is $\sigma_i^2$. Asset i is said to be riskier than asset j if $\sigma_i^2 > \sigma_j^2$. We do not assume that the returns on the respective assets are independent. Instead, we assume there to be potential covariances

\[ \sigma^2_{ij} = \mathrm{Cov}[R_i, R_j] = E[(R_i - r_i)(R_j - r_j)] \]

Note that the covariance of the return on asset i, with itself, is

\[ E[(R_i - r_i)^2] = V[R_i] = \sigma_i^2 \]

so we shall occasionally write $\sigma^2_{ii}$ instead of $\sigma^2_i$.

Recall²⁰ the following: Given two random variables X and Y, and observations $(X_1, Y_1), \dots, (X_n, Y_n)$, if you are interested in a line of best fit revealing the dependence of Y upon X, then you carry out a linear regression of Y on X. This procedure returns the least squares estimates α and β in the linear model

\[ Y_i = \alpha + \beta X_i + \varepsilon_i \]

where $\varepsilon_1, \dots, \varepsilon_n$ are the residual errors. The optimal coefficients are found by minimizing the residual sum of squares

\[ \sum_{i=1}^n \varepsilon_i^2 = \sum_{i=1}^n (\alpha + \beta X_i - Y_i)^2 \]

²⁰ If the reader has not taken econometrics or statistics, then this may not look familiar.


This can be done by the same method we used in Section 28 to analyze the winner’s curse, and gives

\[ \beta = \frac{\mathrm{Cov}[X, Y]}{V[X]} = \frac{\sigma^2_{XY}}{\sigma^2_X} \]

Now if the investor with x selects a portfolio $\alpha = (\alpha_1, \dots, \alpha_n)$, then how much will he have at the end of the holding period?

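• The total amount invested in asset i is $\alpha_i x$.

• At the end of the holding period this investment is worth $\alpha_i x(1 + R_i)$.

• The investor now has

\[ \sum_{i=1}^n \alpha_i x(1 + R_i) = x\left( \sum_{i=1}^n \alpha_i + \sum_{i=1}^n \alpha_i R_i \right) = x\left( 1 + \sum_{i=1}^n \alpha_i R_i \right) \]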

The return on the portfolio is

\[ R = \frac{x\left(1 + \sum_{i=1}^n \alpha_i R_i\right) - x}{x} = \sum_{i=1}^n \alpha_i R_i \]

which is a weighted average of the returns on the respective assets, with each weight equal to the corresponding share. The expected return is

\[ E[R] = E\left[ \sum_{i=1}^n \alpha_i R_i \right] = \sum_{i=1}^n \alpha_i r_i \]

What is the variance of R?

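\begin{align*}
V[R] &= E\left[ \left( \sum_{i=1}^n \alpha_i R_i - \sum_{i=1}^n \alpha_i r_i \right)^2 \right] \\
&= E\left[ \left( \sum_{i=1}^n \alpha_i (R_i - r_i) \right)^2 \right] \\
&= E\left[ \sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j (R_i - r_i)(R_j - r_j) \right] \\
&= \sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j E[(R_i - r_i)(R_j - r_j)] \\
&= \sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j \sigma^2_{ij} \\
&= \sum_{i=1}^n \alpha_i^2 \sigma^2_{ii} + \sum_{1 \le i \le n} \sum_{\substack{1 \le j \le n \\ j \ne i}} \alpha_i \alpha_j \sigma^2_{ij}
\end{align*}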
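The three equivalent ways of writing the portfolio variance can be compared on a small example. This is a minimal R sketch; the shares and the covariance matrix are hypothetical numbers chosen only for illustration.

```r
# Hypothetical shares and covariance matrix for n = 3 assets
alpha = c(0.5, 0.3, 0.2)
Sigma = matrix(c(0.04, 0.01, 0.00,
                 0.01, 0.09, 0.02,
                 0.00, 0.02, 0.16), nrow = 3, byrow = TRUE)

# Double sum over all pairs (i, j)
v.sum = 0
for(i in 1:3) for(j in 1:3) v.sum = v.sum + alpha[i] * alpha[j] * Sigma[i, j]

# Same quantity written as a quadratic form
v.quad = as.numeric(t(alpha) %*% Sigma %*% alpha)

# Own-variance terms plus cross (covariance) terms
cross = (alpha %o% alpha) * Sigma
v.split = sum(alpha^2 * diag(Sigma)) + sum(cross[row(Sigma) != col(Sigma)])

print(c(v.sum, v.quad, v.split))
```

The split in the last line makes visible how much of the portfolio variance comes from covariances rather than from the assets’ own variances.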


The CAPM hinges on two assumptions:

1. There exists a risk-free asset—call it asset 1.

2. The market portfolio (the portfolio consisting of equal amounts of all shares) is mean-variance efficient in the sense that no other portfolio realizes the same return but with lesser variance.

29.2 Conclusion

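Consider an efficient portfolio $\alpha^P = (\alpha^P_1, \dots, \alpha^P_n)$, with return $R_P = \sum_{i=1}^n \alpha^P_i R_i$, and which has the least possible variance subject to yielding an expected rate of return of $r_P = E[R_P]$. Such a portfolio solves

\[ \min_{\alpha} \sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j \sigma^2_{ij} \quad \text{s.t.} \quad \sum_{i=1}^n \alpha_i r_i = r_P \quad \text{and} \quad \sum_{i=1}^n \alpha_i = 1 \]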

The Lagrangian is

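\[ L(\alpha, \lambda, \mu) = \sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j \sigma^2_{ij} - \lambda\left( \sum_{i=1}^n \alpha_i r_i - r_P \right) - \mu\left( \sum_{i=1}^n \alpha_i - 1 \right) \]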

FONC w.r.t. α:

\[ L_{\alpha_j} = 2\sum_{i=1}^n \alpha^P_i \sigma^2_{ij} - \lambda r_j - \mu = 0, \quad 1 \le j \le n \tag{29.1} \]

In particular, (29.1) holds for asset 1; however, asset 1 is risk-free by assumption, and therefore $\sigma^2_{1i} = 0$ for all i. Thus, for asset 1, (29.1) implies

\[ \mu = -\lambda r_1 = -\lambda r, \]

where r denotes the risk-free rate of return. Hence we can rewrite (29.1) as follows:

\[ 2\sum_{i=1}^n \alpha^P_i \sigma^2_{ij} = \lambda(r_j - r), \quad 1 \le j \le n \tag{29.2} \]

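Multiplying by $\alpha^P_j$,

\[ 2\alpha^P_j \sum_{i=1}^n \alpha^P_i \sigma^2_{ij} = \alpha^P_j \lambda (r_j - r), \quad 1 \le j \le n \]

and summing across all assets,

\[ 2\sum_{j=1}^n \alpha^P_j \sum_{i=1}^n \alpha^P_i \sigma^2_{ij} = \sum_{j=1}^n \alpha^P_j \lambda (r_j - r) \]

which is equivalent to

\[ 2\sum_{j=1}^n \sum_{i=1}^n \alpha^P_i \alpha^P_j \sigma^2_{ij} = \lambda\left( \sum_{j=1}^n \alpha^P_j r_j - r \right) \]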


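or

\[ 2\sum_{j=1}^n \sum_{i=1}^n \alpha^P_i \alpha^P_j \sigma^2_{ij} = \lambda(r_P - r) \tag{29.3} \]

since $\sum_{j=1}^n \alpha^P_j r_j = r_P$. Let $\sigma^2_P = \sum_{j=1}^n \sum_{i=1}^n \alpha^P_i \alpha^P_j \sigma^2_{ij}$ denote the minimum variance of the portfolio with expected return $r_P$. Then (29.3) says $2\sigma^2_P = \lambda(r_P - r)$, or

\[ \lambda = \frac{2\sigma^2_P}{r_P - r} \]

Plugging the result back into (29.2),

\[ r_j - r = (r_P - r) \sum_{i=1}^n \alpha^P_i \frac{\sigma^2_{ij}}{\sigma^2_P} \tag{29.4} \]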

Finally, notice that

\begin{align*}
\sum_{i=1}^n \alpha^P_i \sigma^2_{ij} &= E\left[ (R_j - r_j) \sum_{i=1}^n \alpha^P_i (R_i - r_i) \right] \\
&= \mathrm{Cov}\left[ R_j, \sum_{i=1}^n \alpha^P_i R_i \right] = \mathrm{Cov}[R_j, R_P]
\end{align*}

In other words, $\sum_{i=1}^n \alpha^P_i \sigma^2_{ij}$ is the covariance of the return on asset j, with the return on the efficient portfolio that has expected return $r_P$. We can rewrite (29.4) as follows:

\[ r_j - r = (r_P - r) \frac{\mathrm{Cov}[R_j, R_P]}{\sigma^2_P} \tag{29.5} \]

Suppose now that (29.5) holds for the return on the market portfolio, $R_M$. Let $\sigma^2_M$ denote $V[R_M]$. Then it follows by (29.5) that

\[ r_j = r + \beta(r_M - r) \]

where

\[ \beta = \frac{\mathrm{Cov}[R_j, R_M]}{\sigma^2_M} \]

is the regression coefficient one would get by carrying out linear regression of the return on asset j, on the return on the market portfolio. This regression coefficient is called the asset’s beta.
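Relation (29.5) can be verified on a small numerical example. The following is a minimal R sketch with hypothetical inputs; it relies on the standard result that, with a risk-free asset, the risky-asset shares of a mean-variance efficient portfolio are proportional to the inverse covariance matrix times the vector of excess expected returns. The non-negativity of shares is not imposed here, although it happens to hold for these numbers.

```r
# Hypothetical inputs: risk-free rate r, and two risky assets with
# expected returns m and covariance matrix S.
r = 0.03
m = c(0.08, 0.12)
S = matrix(c(0.04, 0.01,
             0.01, 0.09), nrow = 2, byrow = TRUE)

# Risky-asset shares of an efficient portfolio are proportional to
# S^{-1}(m - r); the remainder is held in the risk-free asset.
w = solve(S, m - r)
w = 0.5 * w / sum(w)                    # put half of wealth in risky assets

rP    = r + sum(w * (m - r))            # expected return on the portfolio
s2P   = as.numeric(t(w) %*% S %*% w)    # its variance
covjP = as.numeric(S %*% w)             # Cov[R_j, R_P] for each risky asset j

print(cbind(lhs = m - r, rhs = (rP - r) * covjP / s2P))
```

The two columns should coincide, asset by asset, which is exactly the statement of (29.5).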

29.3 Summary

An asset’s beta measures the amount of systematic risk the asset carries. E.g. if β > 1, then the asset is expected to outperform the market in good times and perform worse than the market in bad times. This is risky! For an asset with β = 1.5, in a market with r = 0.05 and $r_M$ = 0.13, an expected rate of return of

0.05 + 1.5× (0.13− 0.05) = 0.17

would be considered fair compensation for assuming the risk associated with this asset.
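The calculation above, and the interpretation of beta as a regression coefficient, can be sketched in R as follows. The function name capm.return is hypothetical, and the return series are simulated under assumed parameters purely for illustration.

```r
# Expected return required by the CAPM
capm.return = function(r, rM, beta) r + beta * (rM - r)
print(capm.return(r = 0.05, rM = 0.13, beta = 1.5))

# Beta as a regression coefficient, using simulated (hypothetical) returns
set.seed(1)
RM = rnorm(500, 0.01, 0.04)                   # market returns
Rj = 0.002 + 1.5 * RM + rnorm(500, 0, 0.02)   # asset returns, true beta 1.5
beta.cov = cov(Rj, RM) / var(RM)              # Cov[Rj, RM] / V[RM]
beta.ols = unname(coef(lm(Rj ~ RM))[2])       # slope from linear regression
print(c(beta.cov, beta.ols))
```

The covariance ratio and the regression slope are the same number, and both should be close to the value 1.5 used to generate the data.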


30 Finance II: Efficient Market Hypothesis

30.1 Review

In Section 29 we considered the implications of the CAPM assumptions with respect to the return on a risky asset i. In particular, if asset i has random return $R_i$ and beta β (i.e. if, on average, when $R_M$ is 1% higher/lower than usual, then $R_i$ is β% higher/lower than usual), then $r_i = E[R_i]$ is required to satisfy

\[ r_i = \pi_0 + \beta\pi_1 \tag{30.1} \]

where, in the CAPM, $\pi_0 = r$ is the risk-free rate of return and $\pi_1 = r_M - r$ is the excess return on the market portfolio.

Researchers in the 1970s and 80s attempted to test the CAPM by estimating betas for classes of assets and checking whether, on average, assets with bigger betas had higher expected returns. This was not always successful. These days, economists interpret the model less literally. Often, they augment the model with additional factors. The original CAPM says that all one needs to know about an asset is its covariance with the market. A more agnostic view is that while beta matters, there may be additional considerations. So it is common to see models such as the following:

\[ r_i = \pi_0 + \beta\pi_1 + \gamma\pi_2 \]

where γ measures the asset’s exposure to some other factor. A typical factor is one that reflects how a given asset covaries with a portfolio composed of small stocks, or with a portfolio made up of bonds as opposed to stocks.

One key use of (30.1) is the determination of how to discount returns on different assets. For example, if we are dealing with an asset that is priced at $P_0$ in the current period, and will sell for $P_1$ in the next, then the expected return on the asset is $(E[P_1] - P_0)/P_0$. If the asset has beta equal to β, then under the CAPM the expected return is $\pi_0 + \beta\pi_1$. Thus we have

\[ \frac{E[P_1]}{P_0} - 1 = \pi_0 + \beta\pi_1 \implies P_0 = \frac{E[P_1]}{1 + \pi_0 + \beta\pi_1} \]

This very simple equation has many immediate implications.
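One implication is that, for a given expected future price, a higher beta means a lower price today. The following is a minimal R sketch of the pricing formula; the function name capm.price and the numbers are hypothetical.

```r
# Price today implied by the CAPM discount rate pi0 + beta * pi1
capm.price = function(EP1, pi0, pi1, beta) EP1 / (1 + pi0 + beta * pi1)

# Hypothetical numbers: expected price of 110 next period,
# risk-free rate 5%, excess return on the market 8%
prices = capm.price(110, 0.05, 0.08, beta = c(0, 1, 1.5))
print(prices)   # current price for beta = 0, 1 and 1.5
```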

30.2 Efficient Market Hypothesis

Consider a very short holding period, e.g. one week. The discounting is negligible, thereby giving

P0 = E[P1] (30.2)

This means the price today has to be the expected value of the price next week. A stochastic process is simply a sequence of random variables $X_1, X_2, \dots$

Examples:

• The height of the Nile River at a given location on June 1, some year onward.

• The closing price of the S&P 500 on Friday, some week onward.


Roughly speaking, a random walk is a stochastic process with the additional property that

E[Xt|Xt−1, . . . , X1] = Xt−1

i.e. the best forecast of the value in the next period is the current value. So people sometimes say that asset prices constitute a random walk.

Equation (30.2) is sometimes called the efficient market hypothesis (EMH). The key insight in this equation is that all the information we have to forecast the value of the asset tomorrow is factored into the current value. As an example, suppose you think a share of Google stock will be worth x in six weeks. Then you should be willing to pay almost x for the stock now (subject to the discount factor only).

Suppose (30.2) holds for a stock. The realized gain from buying the stock today and selling it in the next period is

P1 − P0 = P1 − E[P1]

But the deviation of a random variable from its expected value is unpredictable. This means that techniques such as drawing charts, etc. (so-called “technical analysis”) cannot work!
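The unpredictability of realized gains can be illustrated with a simulated random walk. This is a minimal R sketch under the assumption of independent normal shocks; the starting price and sample size are hypothetical.

```r
set.seed(1)
e = rnorm(5000)        # unpredictable shocks
P = 100 + cumsum(e)    # random walk: P_t = P_{t-1} + e_t
dP = diff(P)           # realized gains P_t - P_{t-1}

# Correlation between successive gains: past gains do not help forecast
# future gains, so this should be close to zero.
rho = cor(dP[-1], dP[-length(dP)])
print(rho)
```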

Suppose new information is revealed about an asset. Take, for example, news concerning a drug company such as Merck: the news could be regarding a prospective drug that is being evaluated in a randomized clinical trial, or the discovery of side effects associated with an existing drug, or a decision by the FDA, etc. Equation (30.2) says the news, whatever it may be, should cause an instantaneous adjustment of the stock price, up or down. Likewise, news with implications for the entire economy, e.g. the results of a Federal Reserve “Open Market” meeting, should cause the market as a whole to adjust, up or down, instantaneously, as people adjust their expectations.

This leads to the idea of an event study. If one is trying to evaluate the effect of news on the value of a firm, one looks at the excess returns on the firm’s stock:

\[ XR_t = \frac{P_t - P_{t-1}}{P_{t-1}} - \beta\, \frac{M_t - M_{t-1}}{M_{t-1}} \]

where $P_t$ is the value of a share in the firm under consideration, at closing on day t, and $M_t$ is the value of the market index at the same time.

The cumulative excess return is the sum of the excess returns over some horizon:

\[ CXR_t = \sum_{i=1}^t XR_i \]

where period zero is some time prior to the breaking of the news, e.g. 7–14 days beforehand. One then plots $CXR_t$, 1 ≤ t ≤ T, where T is several days after the breaking of the news. Ideally one should observe random fluctuations before and after the news, with a “jump” on the day of the news.²¹
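The excess return and cumulative excess return calculations can be sketched in R as follows. The price series, the index series, and the value of beta are all hypothetical, with a jump built into the firm’s price on day 4.

```r
# Hypothetical closing prices for the firm (P) and the market index (M),
# for periods 0, 1, ..., 6
P = c(50.0, 50.5, 50.2, 50.6, 55.0, 55.3, 55.1)
M = c(1000, 1005, 1002, 1006, 1008, 1011, 1009)
beta = 1.2                                  # assumed to be estimated beforehand

ret = function(z) diff(z) / z[-length(z)]   # (z_t - z_{t-1}) / z_{t-1}
XR  = ret(P) - beta * ret(M)                # excess returns, t = 1, ..., 6
CXR = cumsum(XR)                            # cumulative excess returns
print(round(cbind(XR, CXR), 4))
```

Plotting CXR against t would show small fluctuations around zero followed by a jump at t = 4, the day on which the hypothetical news arrives.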

An implication of the EMH is that on average there is no advantage to following the suggestions of advisors (at least, adjusting for the excess risk of the portfolios they recommend). It is widely believed that the high returns reported by some funds in some periods are merely strings of luck. Tables 5–7 (drawn from a paper by Burton Malkiel, “The Efficient Market Hypothesis and Its Critics,” Journal of Economic Perspectives, Winter 2003) demonstrate this idea.

Figure 30.1: Plot of cumulative abnormal return for earning announcements from event day −20 to event day 20. The abnormal return is calculated using the market model as the normal return measure. Source: MacKinlay (1997).

²¹ A good reference on the topic is A. Craig MacKinlay, “Event Studies in Economics and Finance,” Journal of Economic Literature, March 1997.


Table 5: Percentage of Large Capitalization Equity Funds Outperformed by Index Ending 6/30/2002

                                            1 year   3 years   5 years   10 years
S&P 500 vs. Large Cap Equity Funds          63%      56%       70%       79%
Wilshire 5000 vs. Large Cap Equity Funds    72%      64%       69%       74%

Note: All large capitalization mutual funds in existence are covered with the exception of “sector” funds and funds investing in foreign securities.

Source: Lipper Analytic Services.

Table 6: Median Total Returns Ending 12/31/2001

                          10 years   15 years   20 years
Large Cap Equity Funds    10.98%     11.95%     13.42%
S&P 500 Index             12.94%     13.74%     15.24%

Source: Lipper Analytic Services, Wilshire Associates, Standard & Poor’s and The Vanguard Group.

Table 7: Getting Burned by Hot Funds

                                   1998 – 1999           2000 – 2001
                                       Average               Average
                                       Annual                Annual
Fund Name                       Rank   Return         Rank   Return
Van Wagoner:Emrg Growth         1      105.52         1106   −43.54
Rydex:OTC Fund;Inv              2      93.43          1103   −36.31
TCW Galileo:AGr Eq;Instl        3      92.78          1098   −34.00
RS Inv:Emrg Growth              4      90.19          1055   −26.17
PBHG:Large Cap 20               5      84.56          1078   −29.03
Janus Olympus Fund              6      77.24          1061   −27.03
Van Kampen Aggr Gro;A           7      76.70          1067   −28.04
Janus Mercury                   8      76.31          1057   −26.35
PBHG:Sel Equity                 9      76.21          1097   −33.19
WM:Growth;A                     10     74.77          1046   −25.82
Berger new Generation;Inv       11     73.31          1107   −45.96
Janus Enterprise                12     72.28          1101   −35.40
Janus Venture                   13     72.22          1091   −30.89
Fidelity Aggr Growth            14     70.56          1105   −38.02
Janus Twenty                    15     69.09          1090   −30.83
Amer Cent:New Oppty             16     67.64          1033   −24.11
Morg Stan Sm Cap Gro;B          17     66.59          1102   −35.96
Van Kampen Emrg Gro;A           18     65.67          1021   −22.70
TCW Galileo:SC Gro;Instl        19     64.87          1099   −34.77
Black Rock:Md Cap Gro;Instl     20     64.44          1009   −22.18
Average Fund Return                    76.72                 −31.52
S&P 500 Return                         24.75                 −10.50

Source: Analytic Services and Bogle Research Institute, Valley Forge, PA.


31 Public and Near-public Goods

A pure public good is one such as public radio, with two properties:

1. The amount of the good consumed by one person has no effect on its availability to others. (This is called the no rivalry, or no congestion condition.)

2. A person cannot be prevented from consuming the good. (This is called the non-exclusionary condition.)

Sometimes condition (1) is true while (2) is not. This is arguably the case with intellectual property distributed via the internet. (If I download a song or a software program, my use does not affect anyone else’s use.)

Additional examples of near public goods:

• parks and wildlife reserves (although in some cases these can become congested)

• national defense.

There are many goods/services that are widely thought of as public goods yet really aren’t, e.g. schools, which are subject to congestion and also are excludable.

31.1 Optimal Provision of Goods with No-rivalry Characteristics

Consider a public good which comes in various amounts. Let x be the amount provided, at a cost of p dollars per unit.

An economy has n consumers, i = 1, . . . , n. Consumer i has income $y_i$, and pays a tax $t_i$ toward the purchase of the public good. Additionally, consumer i has utility given by $u^i(c_i, x) = u^i(y_i - t_i, x)$.

31.1.1 Case 1: one consumer; x = t1/p.

The objective is

\[ \max_{t_1}\; u^1(y_1 - t_1, t_1/p) \]

FONC:

\[ -u^1_c(y_1 - t_1, t_1/p) + \frac{1}{p}\, u^1_x(y_1 - t_1, t_1/p) = 0 \implies \frac{u^1_x(y_1 - t_1, t_1/p)}{u^1_c(y_1 - t_1, t_1/p)} = p \]

i.e. $\mathrm{MRS}_1(y_1 - t_1, t_1/p) = p$. Recall that $\mathrm{MRS}_1$ is consumer 1’s willingness to pay for the last unit of the public good x, in units of consumption c, or dollars.

31.1.2 Case 2: two consumers; x = (t1 + t2)/p.

The objective is:

\[ \max_{t_1, t_2}\; u^1(y_1 - t_1, (t_1 + t_2)/p) \quad \text{s.t.} \quad u^2(y_2 - t_2, (t_1 + t_2)/p) \ge k_2 \]


Why? A social optimum must maximize consumer 1’s utility subject to consumer 2’s current utility. Such an outcome is called Pareto optimal. (If Pareto optimality fails to hold, then we could reallocate resources with the end result that both consumers are better off.) Varying $k_2$ traces out a full range of potential social optima.

The Lagrangian is:

\[ L(t_1, t_2, \lambda; k_2) = u^1(y_1 - t_1, (t_1 + t_2)/p) + \lambda\left[ u^2(y_2 - t_2, (t_1 + t_2)/p) - k_2 \right] \]

(In what follows we shall occasionally omit functional dependencies for the sake of notational simplicity.) Let $v^1$ be the maximum value of $u^1$ s.t. $u^2 \ge k_2$. We know by the Envelope Theorem that $\partial v^1/\partial k_2 = \partial L/\partial k_2 = -\lambda$. Since raising $k_2$ tightens the constraint and lowers $v^1$, we have λ > 0. A higher value of λ assigns greater weight to consumer 2’s outcome.

FONC:

\begin{align*}
L_{t_1} &= -u^1_c + \frac{1}{p}\, u^1_x + \frac{\lambda}{p}\, u^2_x = 0 \\
L_{t_2} &= -\lambda u^2_c + \frac{1}{p}\, u^1_x + \frac{\lambda}{p}\, u^2_x = 0
\end{align*}

Note that the second and third terms in each of the above equations are the same, so we get

\[ u^1_c = \lambda u^2_c, \]

or $\lambda = u^1_c / u^2_c$. The intuition behind this is that the social planner can rearrange taxes on consumers 1 and 2 while keeping x constant. If consumer 1 pays one less tax dollar, his utility increases by $u^1_c$. Likewise, if consumer 2 pays one less tax dollar, her utility increases by $u^2_c$. At the optimum, a gain of one unit in consumer 2’s utility corresponds to a gain of λ in consumer 1’s utility.

The first of the FONC can be rewritten as follows:

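\begin{align*}
u^1_c &= \frac{1}{p}\, u^1_x + \frac{\lambda}{p}\, u^2_x \\
&= \frac{1}{p}\, u^1_x + \frac{u^1_c / u^2_c}{p}\, u^2_x \\
&= \frac{1}{p}\left[ u^1_x + u^1_c \left( \frac{u^2_x}{u^2_c} \right) \right] \implies \frac{u^1_x}{u^1_c} + \frac{u^2_x}{u^2_c} = p
\end{align*}

or

\[ \mathrm{MRS}_1 + \mathrm{MRS}_2 = p \]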

This means the optimal choice of x has the property that p equals the aggregate willingness to pay!
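To see the condition at work, consider a worked example under an assumed quasi-linear specification, $u^i(c_i, x) = c_i + a_i \ln x$ with $a_i > 0$. This functional form is hypothetical (it does not appear in the notes) and is chosen only because it makes the marginal rates of substitution easy to compute.

```latex
\begin{align*}
u^i_c &= 1, \qquad u^i_x = \frac{a_i}{x}
  \quad\Longrightarrow\quad \mathrm{MRS}_i = \frac{u^i_x}{u^i_c} = \frac{a_i}{x}, \\
\mathrm{MRS}_1 + \mathrm{MRS}_2 &= \frac{a_1 + a_2}{x} = p
  \quad\Longrightarrow\quad x^* = \frac{a_1 + a_2}{p}.
\end{align*}
```

For instance, with $a_1 = 2$, $a_2 = 3$ and p = 1 the optimal provision is x* = 5, whereas consumer 1 acting alone as in Case 1 would choose only $x = a_1/p = 2$.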


31.1.3 Case 3: n consumers; x = τ/p, where $\tau = \sum_{i=1}^n t_i$.

The objective is

\[ \max_{t_1, \dots, t_n}\; u^1(y_1 - t_1, \tau/p) \quad \text{s.t.} \quad \begin{cases} u^2(y_2 - t_2, \tau/p) \ge k_2 \\ u^3(y_3 - t_3, \tau/p) \ge k_3 \\ \quad \vdots \\ u^n(y_n - t_n, \tau/p) \ge k_n \end{cases} \]

This is the n-consumer version of Pareto optimality. The optimal choice of taxes is the one that maximizes consumer 1’s utility subject to minimum levels of utility for the other n − 1 consumers. The Lagrangian is

\[ L = u^1(y_1 - t_1, \tau/p) + \sum_{i=2}^n \lambda_i [u^i(y_i - t_i, \tau/p) - k_i]. \]

For convenience define $\lambda_1 = 1$ and $k_1 = 0$. Then

\[ L = \sum_{i=1}^n \lambda_i [u^i(y_i - t_i, \tau/p) - k_i] \]

FONC:

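\[ L_{t_i} = -\lambda_i u^i_c + \frac{1}{p} \sum_{j=1}^n \lambda_j u^j_x = 0, \quad 1 \le i \le n \]

Note that the sum is constant with respect to i, so we must have

\[ \lambda_1 u^1_c = \lambda_2 u^2_c = \cdots = \lambda_n u^n_c \]

In particular,

\[ u^1_c = \lambda_i u^i_c, \quad 2 \le i \le n \]

and thus

\[ \lambda_i = \frac{u^1_c}{u^i_c}, \quad 2 \le i \le n \]

Putting the last result back into the first of the FONC gives

\[ u^1_c = \frac{1}{p} \sum_{i=1}^n \left( \frac{u^1_c}{u^i_c} \right) u^i_x \]

Dividing by $u^1_c$ and multiplying by p, we see that

\[ p = \sum_{i=1}^n \frac{u^i_x}{u^i_c} = \sum_{i=1}^n \mathrm{MRS}_i \]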

As in the case of two consumers, p equals the aggregate willingness to pay.
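The n-consumer condition can be checked numerically. The following is a minimal R sketch under an assumed quasi-linear utility, $u^i(c_i, x) = c_i + a_i \ln x$, which is hypothetical and not taken from the notes. With this specification a Pareto optimum maximizes total surplus, and each consumer’s MRS is $a_i/x$.

```r
a = c(2, 3, 1.5, 0.5)   # hypothetical taste parameters a_i
p = 2                   # hypothetical unit cost of the public good

# Total surplus from providing x units (sum of a_i * log(x) minus cost)
surplus = function(x) sum(a) * log(x) - p * x
x.opt = optimize(surplus, interval = c(0.01, 100), maximum = TRUE)$maximum

# At the optimum the sum of the MRS_i = a_i / x should equal p
print(c(x.opt, sum(a / x.opt), p))
```

Here the optimum is at x = sum(a)/p, so the aggregate willingness to pay for the last unit matches its cost even though no single consumer’s MRS does.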

Implications:


• For a non-rivalrous good, the optimal provision of the good has the property that the marginal cost $p$ equals the aggregate willingness to pay. This is called the Samuelson condition because it was derived by the great American economist Paul Samuelson in 1954.

• A simple market mechanism will not necessarily achieve the optimality condition. With non-excludable goods, in fact, it is hard to see why anyone is willing to contribute voluntarily (although people do). Thus, the provision of pure public goods usually is left to political mechanisms.

• With excludable goods such as proprietary software, a per-user fee may be reasonable. Note that the producer receives the sum of the user fees.

• For questions such as how much to invest in wilderness areas, some suggest polling the public and asking how much people would be willing to pay to expand/protect the wilderness versus selling it off. This practice is controversial because it's unclear whether those polled understand the questions, or tell the truth. Moreover, goods such as wilderness areas are valued in a passive way since most people never will experience them first hand. Unlike ordinary consumer goods, there is no observable behavior that can be traced back to a person's willingness to pay. Despite these issues, this method, known as contingent valuation, was used to value the environmental damage (or lost passive use) caused by the Exxon Valdez oil spill.

31.2 Appendix: Social Optimum with Ordinary Goods

You may be wondering how the idea of a social optimum works with ordinary goods. Let's consider the decision of how to allocate an ordinary good $x$. The government collects a tax $t_i$ from the $i$th consumer, and allocates to the consumer $x_i$ units of the good. The budget constraint for the government in this case is $\tau = p\chi$, where $\tau = \sum_{i=1}^n t_i$, $\chi = \sum_{i=1}^n x_i$, and $p$ is the price of $x$.

Assume, as before, that consumer $i$ has income $y_i$ and uses his or her after-tax income to buy $c_i = y_i - t_i$ units of the numeraire good.

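The objective is

$$
\begin{aligned}
\max_{t_1,\ldots,t_n,\,x_1,\ldots,x_n}\ & u^1(y_1 - t_1, x_1) \quad \text{s.t.} \\
& u^2(y_2 - t_2, x_2) \ge k_2 \\
& u^3(y_3 - t_3, x_3) \ge k_3 \\
& \qquad\vdots \\
& u^n(y_n - t_n, x_n) \ge k_n \\
& \tau = p\chi
\end{aligned}
$$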

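The Lagrangian is

$$\mathcal{L} = u^1(y_1 - t_1, x_1) + \sum_{i=2}^n \lambda_i\left[u^i(y_i - t_i, x_i) - k_i\right] + \mu(\tau - p\chi)$$

Once again define $\lambda_1 = 1$ and $k_1 = 0$ so that

$$\mathcal{L} = \sum_{i=1}^n \lambda_i\left[u^i(y_i - t_i, x_i) - k_i\right] + \mu(\tau - p\chi)$$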


FONC:

$$\mathcal{L}_{t_i} = -\lambda_i u^i_c + \mu = 0, \quad 1 \le i \le n$$

$$\mathcal{L}_{x_i} = \lambda_i u^i_x - \mu p = 0, \quad 1 \le i \le n$$

The first collection of FONC implies

$$u^1_c = \lambda_i u^i_c, \quad 2 \le i \le n$$

or

$$\lambda_i = \frac{u^1_c}{u^i_c}, \quad 2 \le i \le n$$

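Combining these results with the second collection of FONC gives $u^1_x = \mu p$, or, equivalently (since $\mu = u^1_c$), $p = u^1_x/u^1_c = MRS_1$, and

$$
\begin{aligned}
\lambda_i u^i_x &= \mu p \\
\implies \left(\frac{u^1_c}{u^i_c}\right)u^i_x &= u^1_c\,p \\
\implies p &= \frac{u^i_x}{u^i_c} = MRS_i, \quad 1 \le i \le n
\end{aligned}
$$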

Thus at a social optimum we have

MRSi = p, 1 ≤ i ≤ n (31.1)

Note that this is the same condition that would result from opening a market in good $x$, and charging $p$ dollars per unit of $x$. However, in order to reach a particular social optimum we would have to redistribute income via our choice of $t_1, \ldots, t_n$.

It is possible to show the following:

• Any particular social optimum can be achieved by opening a free market in good $x$, and redistributing income via taxes.

• For any given distribution of income, setting all taxes equal to zero achieves one possible Pareto optimum. This may not be the one that people particularly like (it will result in highest utility for the person with highest income) but it is nonetheless efficient in the sense that it satisfies (31.1).


32 Externalities

Externalities arise when the consumption or production of a good by one economic agent causes a side effect for others. Examples include air pollution caused by burning fossil fuels, the playing of loud music, etc. Externalities can be positive as well: a classic example is bees, which are needed to pollinate fruit trees!

This section deals primarily with air pollution, which is like a public good to the extent that air quality affects the entire population of an area.

32.1 Consumption Externalities

We shall use an extended version of the model used in our analysis of public goods. Assume that consumers care about three things:

• consumption of a basic, numeraire good c

• consumption of a good x, with an externality

• the level z of the externality

Think of $x$ as gasoline and $z$ as the amount of smog in the air. Consider an economy with $n$ consumers, $i = 1, \ldots, n$. Consumer $i$ has income $y_i$ and with it consumes $c_i$ and $x_i$. The level $z$ of the externality is determined by the total consumption of $x$:

$$z = \alpha\chi$$

where $\chi = \sum_{i=1}^n x_i$ and $\alpha$ is the amount of smog produced per gallon of gas used. Let $p$ denote the price (and the marginal cost) of $x$. The utility of consumer $i$ is given by

$$u^i(c_i, x_i, z) = u^i(y_i - px_i, x_i, \alpha\chi)$$

We are assuming that $u^i_c > 0$, $u^i_x > 0$, and $u^i_z < 0$, i.e. $z$ is bad. Notice the similarity between $z$ and the public goods we studied previously: consumer $i$'s “consumption” of $z$ has no effect on the amount of $z$ “available” to others.

32.1.1 Market Equilibrium

Consumer 1 takes $p$ as given, and while he realizes $z = \alpha\sum_{i=1}^n x_i$, he also takes $x_2, \ldots, x_n$ (gas consumption of others) as given. His objective is

$$\max_{x_1}\ u^1(y_1 - px_1, x_1, \alpha\chi)$$

FONC:

$$-pu^1_c + u^1_x + \alpha u^1_z = 0 \implies \underbrace{\frac{u^1_x}{u^1_c}}_{MRS_1(x,c)} = p - \alpha\frac{u^1_z}{u^1_c}$$


In general, a consumer is advised to set her MRS (for $x$ relative to $c$) equal to $p - \alpha u^i_z/u^i_c$. If $u^i_z < 0$, then $p - \alpha u^i_z/u^i_c > p$, so the consumer acts as if the price of $x$ is actually higher. The price difference $-\alpha u^i_z/u^i_c$ is $\alpha$ (the rate of production of $z$ per unit of $x$) times the magnitude of the marginal willingness to pay for clean air, $u^i_z/u^i_c$.

32.1.2 Social Optimum

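A social planner has to allocate $x$ and collect taxes $t_i$ ($i = 1, \ldots, n$) that balance the government's costs: $\tau = p\chi$, where $\tau = \sum_{i=1}^n t_i$ and $\chi = \sum_{i=1}^n x_i$. As before, we look for Pareto optimal outcomes. The social planner's objective is:

$$
\begin{aligned}
\max_{t_1,\ldots,t_n,\,x_1,\ldots,x_n}\ & u^1(y_1 - t_1, x_1, \alpha\chi) \quad \text{s.t.} \\
& u^2(y_2 - t_2, x_2, \alpha\chi) \ge k_2 \\
& \qquad\vdots \\
& u^n(y_n - t_n, x_n, \alpha\chi) \ge k_n \\
& \tau = p\chi
\end{aligned}
$$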

Define $\lambda_1 = 1$, $k_1 = 0$. The Lagrangian is

$$\mathcal{L} = \sum_{i=1}^n \lambda_i\left[u^i(y_i - t_i, x_i, \alpha\chi) - k_i\right] + \mu(\tau - p\chi)$$

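FONC:

$$\mathcal{L}_{t_i} = -\lambda_i u^i_c + \mu = 0, \quad 1 \le i \le n \tag{32.1}$$

$$\mathcal{L}_{x_i} = \lambda_i u^i_x + \alpha\sum_{j=1}^n \lambda_j u^j_z - \mu p = 0, \quad 1 \le i \le n \tag{32.2}$$

Equations (32.1) imply

$$\mu = \lambda_i u^i_c, \quad 1 \le i \le n$$

and in particular

$$\mu = u^1_c \tag{32.3}$$

As a consequence,

$$\lambda_i = \frac{u^1_c}{u^i_c}, \quad 1 \le i \le n \tag{32.4}$$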


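Equations (32.2) imply

$$
\begin{aligned}
\lambda_i u^i_x &= \mu p - \alpha\sum_{j=1}^n \lambda_j u^j_z \\
\implies \frac{\lambda_i}{\mu}u^i_x &= p - \alpha\sum_{j=1}^n \frac{\lambda_j}{\mu}u^j_z \\
\implies \frac{\lambda_i}{u^1_c}u^i_x &= p - \alpha\sum_{j=1}^n \frac{\lambda_j}{u^1_c}u^j_z && \text{by (32.3)} \\
\implies \frac{u^1_c/u^i_c}{u^1_c}u^i_x &= p - \alpha\sum_{j=1}^n \frac{u^1_c/u^j_c}{u^1_c}u^j_z && \text{by (32.4)} \\
\implies \underbrace{\frac{u^i_x}{u^i_c}}_{MRS_i(x,c)} &= p - \alpha\sum_{j=1}^n \frac{u^j_z}{u^j_c}, \quad 1 \le i \le n
\end{aligned}
$$

This means everyone has to set

$$MRS = p + \tau$$

where $\tau = -\alpha\nu$, and $\nu = \sum_{i=1}^n u^i_z/u^i_c$ is the aggregate marginal willingness to pay for clean air. (Here $\tau$ denotes a per-unit charge on $x$, not the total tax revenue $\sum_i t_i$ used in the planner's budget constraint.)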

32.1.3 Market Equilibrium versus Social Optimum

Market Eq: $MRS_i(x, c) = p - \alpha u^i_z/u^i_c$

Social Opt: $MRS_i(x, c) = p - \alpha\sum_{j=1}^n u^j_z/u^j_c = p + \tau$

So, in the social optimum, consumer $i$ takes account of the effect of her gas consumption on everyone else whereas in the market equilibrium she cares only about herself.

The sum $p + \tau$ is the social marginal cost of consuming gas. It exceeds the private cost $p$ if $\alpha$ is non-zero, and if there is some value to clean air (which obviously is the case if $u^i_z/u^i_c < 0$ for all $i$). In the real world, $\alpha$ is very small but $n$ is very big, so while $\alpha u^i_z/u^i_c$ is negligible, $\tau$ can be significant.

In the 1920s the English economist Arthur C. Pigou figured out that one can “correct” an externality by taxing the activity that creates it, with a tax $\tau$. We have shown that the optimal Pigouvian tax for a consumption externality that affects the entire population is

$$\tau = \alpha\sum_i \{\text{consumer } i\text{'s willingness to pay for a marginal reduction in the externality}\}.$$
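To get a feel for the magnitudes, here is a minimal sketch comparing the market equilibrium with the social optimum. It assumes identical consumers with utility $u(c, x, z) = c + a\ln x - bz$, a functional form chosen only for illustration; the parameter values are hypothetical. With this form $u_c = 1$ and $u_z = -b$, so the market condition is $a/x = p + \alpha b$, the social condition is $a/x = p + \alpha n b$, and the Pigouvian tax formula gives $\tau = \alpha n b$.

```python
def market_x(a, p, alpha, b):
    """Gas use when each consumer counts only her own share of the smog."""
    return a / (p + alpha * b)


def social_x(a, p, alpha, b, n):
    """Gas use per consumer when the damage to all n consumers is counted."""
    return a / (p + alpha * n * b)


def pigouvian_tax(alpha, b, n):
    """tau = -alpha * sum_i (u_z / u_c) = alpha * n * b for the assumed utility."""
    return alpha * n * b


# Hypothetical parameter values, for illustration only.
a, p = 10.0, 2.0       # taste for gas, price of gas
alpha, b = 0.001, 0.5  # smog per gallon, marginal disutility of smog
n = 10_000             # number of consumers

tau = pigouvian_tax(alpha, b, n)
x_mkt = market_x(a, p, alpha, b)
x_opt = social_x(a, p, alpha, b, n)

print("tau =", tau)
print("x per consumer: market =", x_mkt, " social optimum =", x_opt)
print("MRS at social optimum =", a / x_opt, " p + tau =", p + tau)
```

The example mirrors the point in the text: each consumer's own term $\alpha b$ is tiny, yet the tax $\alpha n b$ is larger than the price itself because $n$ is large.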

32.1.4 Other Examples

• Taxes for “wear and tear” on the road. The usual justification for a gas tax, apart from the air pollution effect, is that driving causes the roadways to deteriorate. If the wear and tear caused by a given car is proportional to the car's gas mileage, a Pigouvian tax on gas is sensible.


• Taxes on cigarettes are sometimes justified because they are a tax on second hand smoke.

• Some people have proposed a tax on foods that cause obesity. This is a more complicated case but the basis of their argument is that health care costs for those over 65 (which is when most costs are incurred) are heavily subsidized through Medicare. Thus, if someone eats too much and as a result winds up with diabetes later in life, this person contributes to the Medicare bill, which we all pay.

32.2 Production Externalities

We will restrict our attention to a very simple example of a production externality. The example is motivated by the electric power industry, which in most places uses coal to create electricity.

Assume there are $n$ plants, $i = 1, \ldots, n$. Plant $i$ has cost function $c_i(s_i, y_i)$, where $y_i$ is the amount of electricity (kWh) produced, and $s_i$ is a choice variable representing the choice of factors that affect the amount of SO2 produced. For example, $s_i$ could represent the choice of what type of coal to use (more expensive coal from the Western US, which burns cleaner, versus cheaper coal from the East), or the choice of what kind of scrubber to install. The amount of SO2 emitted by the plant is

$$z_i = y_i\alpha_i(s_i)$$

where $\alpha_i'(s_i) < 0$ and $\alpha_i''(s_i) > 0$, i.e. $\alpha_i$ is decreasing and convex as in Figure 32.1.

Figure 32.1: $\alpha_i'(s_i) < 0$ and $\alpha_i''(s_i) > 0$.

Let $\nu$ be the aggregate willingness to pay to avoid SO2 (across the entire population, not only the power industry) and let $p$ be the value of a kWh of electricity. From the point of view of an industry regulator, the objective is to maximize the industry surplus, valuing SO2 at $-\nu$ per ton:

$$\max_{s_1,\ldots,s_n,\,y_1,\ldots,y_n}\ \pi - \nu\zeta$$

where

$$\pi = \sum_{i=1}^n \pi_i = \sum_{i=1}^n \left[py_i - c_i(s_i, y_i)\right]$$


is the total profit of the industry as a whole, and

$$\zeta = \sum_{i=1}^n z_i = \sum_{i=1}^n y_i\alpha_i(s_i)$$

is the total amount of SO2 emitted by the industry. (As an alternative, we could set up the problem by having utility functions for all the local residents, who each use electricity and consume another, numeraire good $c$, and wish to avoid having SO2 in the air. As an exercise, set up the problem this way.)

FONC w.r.t. $y_i$ (writing $c^i_{y_i}$ for $\partial c_i/\partial y_i$):

$$p - c^i_{y_i} - \nu\alpha_i(s_i) = 0, \quad 1 \le i \le n \tag{32.5}$$

This means the output of plant $i$ should be chosen so that

$$c^i_{y_i} + \nu\alpha_i(s_i) = p$$

The LHS, $c^i_{y_i} + \nu\alpha_i(s_i)$, is called the marginal social cost of production at plant $i$. The regulator wants to set this equal to $p$, the “social” value of a kWh of electricity.

FONC w.r.t. $s_i$:

$$-c^i_{s_i} - \nu y_i\alpha_i'(s_i) = 0, \quad 1 \le i \le n \tag{32.6}$$

Dividing by $y_i$,

$$\underbrace{\frac{1}{y_i}c^i_{s_i}}_{\partial AC_i/\partial s_i} = -\nu\alpha_i'(s_i)$$

The optimal choice is the one for which the marginal increase in average cost offsets the marginal value of the reduced pollution per unit of output. Assuming $AC_i(s_i, y_i)$ is convex in $s_i$ (so that with higher $s_i$, an additional increase in $s_i$ has a bigger effect on $AC_i$) and that $\alpha_i$ is decreasing and convex, we have Figure 32.2.

Figure 32.2: How can a regulator get Plant $i$ to choose $s_i^*$?

Method 1 (Pigouvian Tax):


• Tax each plant ν per ton of SO2 produced.

• Buy electricity at p/kWh.

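The manager of plant $i$ will then attempt to maximize

$$\pi_i = py_i - c_i(s_i, y_i) - \nu y_i\alpha_i(s_i)$$

(the tax bill is $\nu$ times the plant's emissions $z_i = y_i\alpha_i(s_i)$), which has FONC equivalent to (32.5) and (32.6) above.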
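The equivalence can be illustrated with a small numerical sketch. The functional forms below, $c_i(s, y) = y^2/2 + sy$ and $\alpha_i(s) = e^{-s}$, are assumptions made only for this example (they satisfy $\alpha' < 0$, $\alpha'' > 0$), and the values of `p` and `nu` are hypothetical. A plant taxed $\nu$ per ton on emissions $y\alpha(s)$ then faces exactly the regulator's first-order conditions.

```python
import math


def alpha(s):
    """Assumed emission rate per kWh: decreasing and convex in s."""
    return math.exp(-s)


def cost(s, y):
    """Assumed cost function c_i(s_i, y_i); cleaner technology raises average cost."""
    return 0.5 * y * y + s * y


def after_tax_profit(s, y, p, nu):
    """Profit when electricity sells at p and emissions y*alpha(s) are taxed at nu per ton."""
    return p * y - cost(s, y) - nu * y * alpha(s)


def first_order_conditions(s, y, p, nu):
    """Left-hand sides of (32.5) and (32.6) under the assumed functional forms."""
    d_y = p - (y + s) - nu * alpha(s)  # p - c_y - nu * alpha(s)
    d_s = -y + nu * y * alpha(s)       # -c_s - nu * y * alpha'(s), with alpha'(s) = -alpha(s)
    return d_y, d_s


p, nu = 5.0, 2.0            # hypothetical price per kWh and damage per ton
s_star = math.log(nu)       # from (32.6): nu * exp(-s) = 1
y_star = p - s_star - 1.0   # from (32.5), using nu * alpha(s_star) = 1

print("s* =", s_star, " y* =", y_star)
print("FONC residuals:", first_order_conditions(s_star, y_star, p, nu))
```

Both residuals vanish (up to rounding) at $(s^*, y^*)$, and moving either choice variable away from that point lowers the plant's after-tax profit.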

Method 2 (Cap & Trade):

• Distribute among the plants a fixed amount of SO2 emission rights, each of which entitles the bearer to produce a ton of SO2.

• Allow the plants to trade emission rights among themselves.

• Buy electricity at p/kWh.

Let $q$ be the value of an emission right, where $q > 0$. A plant manager who owns $k$ emission rights will then attempt to maximize

$$py_i - c_i(s_i, y_i) + v$$

where $v = kq - qy_i\alpha_i(s_i)$ is the value of the emission rights she can sell on the market (or will have to buy). Notice that if $q = \nu$, the FONC for this plant is equivalent to (32.5) and (32.6). This is how SO2 really is regulated.

Why use Method 2?

• In reality, no one knows what $\nu$ to charge. So instead the regulator looks at the total amount of SO2 emitted at some reference point in time, then issues a somewhat smaller number of emission rights, e.g. 80% of that total. This method ensures that SO2 is reduced by 20% “efficiently.”

• Firms prefer this method because they get the emission rights “free of charge.” (Emission rights were distributed in the early 1990s, and plants were allowed to trade them, but the rules forbidding them from exceeding the limits didn't take effect until 1995.)

• It is claimed that enforcement is easier.


33 Empirical Methods in Microeconomics

This section provides the reader with an overview of how microeconomists use real data to test alternative theories and (in some cases) estimate the relevant parameters of a particular model. The examples are drawn from my own work in labor economics.

33.1 Experiments and Counterfactuals

Suppose one is interested in testing a prediction of microeconomic theory. To be concrete, we shall consider four examples:

• If single mothers currently on welfare are offered an earnings subsidy, will they work more?

• If the supply of low-skilled workers in a local labor market is increased by an influx of immigrants, will wages of native, low-wage workers fall?

• If the minimum wage is increased, will low-wage employers hire fewer workers?

• If people without health insurance are provided insurance, will they use more health care services? Will they become healthier?

The classical scientific approach to such questions would be to conduct a randomized experiment. In such an experiment, a population whose behavior is to be studied would be randomly divided into two groups: the treatment group, members of which receive the “treatment,” and the control group, members of which do not. For the welfare question, the population would be single mothers currently on welfare. For the immigrant question, the population would be cities (or other geographic entities such as counties). For the minimum wage question, the population would be employers. For the final question, the population would be the uninsured. Note that some of these experiments seem harder to carry out than others.

Let's assume that one could conduct a randomized experiment on welfare mothers. (In reality, such an experiment was conducted in two Canadian provinces in the mid-90s. We will examine the data shortly.) How would one do this? Presumably, one could tabulate the employment rates of the treatment group $Y_T$ and the control group $Y_C$ some time after the subsidy was in place. One would then calculate the treatment effect

$$\Delta = Y_T - Y_C$$

The idea of a randomized experiment is that in the absence of the treatment, the two groups would have had equal outcomes. Randomization is key: if treatment status really is randomly assigned to the general population, then it is reasonable to expect the two groups to exhibit the same behavior in the absence of treatment. The impact of “statistical accidents” is minimized by using big groups. The behavior of the control group represents a counterfactual for assessing whether or not the treatment has an effect. If a theory predicts that a subsidy will increase work effort, for example, then we want to test the null hypothesis $H_0: \Delta = 0$ versus the alternative hypothesis $H_1: \Delta > 0$.
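As a rough illustration of how such a test might be carried out when the outcome is an employment rate, the sketch below computes $\Delta$ and a one-sided p-value using a standard large-sample normal approximation for a difference in proportions. The group sizes and employment counts are invented for the example and are not SSP data; the function name is hypothetical.

```python
import math


def treatment_effect_test(employed_t, n_t, employed_c, n_c):
    """Estimate Delta = Y_T - Y_C and test H0: Delta = 0 against H1: Delta > 0.

    Uses the large-sample normal approximation with an unpooled standard error.
    """
    y_t = employed_t / n_t
    y_c = employed_c / n_c
    delta = y_t - y_c
    se = math.sqrt(y_t * (1 - y_t) / n_t + y_c * (1 - y_c) / n_c)
    z = delta / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z) for a standard normal Z
    return delta, se, z, p_value


# Hypothetical counts: 800 of 2500 treated and 650 of 2500 controls are employed.
delta, se, z, p_value = treatment_effect_test(800, 2500, 650, 2500)
print("Delta =", delta, " s.e. =", se, " z =", z, " one-sided p-value =", p_value)
```

The standard error shrinks as the groups get larger, which is the sense in which big groups limit the role of "statistical accidents."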

A randomized experiment is considered the gold standard for scientific evidence. The FDA, for example, requires drug companies to evaluate the efficacy of a new drug by means of a randomized experiment. The high status of randomized experiments is due to several features:


1. Randomization ensures that $Y_C$ is a valid counterfactual. So, except for chance errors, $\Delta$ is truly attributable to the treatment, not to some inherent difference between the two groups.

2. Once the experimental design is determined, the researcher's hands are tied. There is no room for weaseling. (The experimental design is a full description of the population, the sample size, the randomization procedure, the treatment, and the data collection process.)

3. Because of (1) and (2), randomized experiments are easy to understand and therefore have a lot of credibility.

33.1.1 The Self Sufficiency Project (SSP)

SSP is the name of a randomized experiment conducted in Canada during the 1990s. Half a random sample of single mothers who had been on welfare for at least a year was assigned to the treatment group. The other half was assigned to the control group. Members of the control group were eligible to receive their regular welfare benefit, a fixed monthly sum based on the number of children in the home as well as the province (e.g. $712 per month for a mother of one in New Brunswick). Welfare payments are reduced dollar-for-dollar for those who earn over $200 per month. Members of the treatment group were allowed to remain on welfare but were offered an earnings subsidy $S = (M - E)/2$, where $M$ is a monthly earnings target ($2500/month) and $E$ is actual earnings. So, if a participant earned $650 in a month, she received a subsidy of $925. Participants qualified for the subsidy only if they worked at least 30 hours per week, for up to three years. They also had to receive their first subsidy payment within a year of entering the treatment group or they forfeited all future eligibility. Figure 33.1 shows the monthly budget constraint, and Figure 33.2 shows the fractions of each group on welfare as a function of time, in months, since random assignment, along with a graph of the average employment rate for each group.

Figure 33.1: Monthly budget constraint for members of the treatment group in the SSP.
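The subsidy rule is simple enough to write down directly. The sketch below is only an illustration of the formula $S = (M - E)/2$ with the 30-hour requirement; the function and parameter names are made up here, and treating earnings at or above the target as receiving no subsidy is an assumption rather than something stated in the notes.

```python
def ssp_subsidy(monthly_earnings, weekly_hours=30.0, target=2500.0, min_hours=30.0):
    """Monthly SSP earnings subsidy S = (M - E) / 2 for a treatment-group member.

    Returns 0 if the hours requirement is not met. Earnings at or above the
    target are assumed (not stated in the notes) to receive no subsidy.
    """
    if weekly_hours < min_hours or monthly_earnings >= target:
        return 0.0
    return (target - monthly_earnings) / 2.0


earnings = 650.0
subsidy = ssp_subsidy(earnings)
print("subsidy:", subsidy, " earnings plus subsidy:", earnings + subsidy)
```

For the example in the text, monthly earnings of $650 give a subsidy of $925, so earnings plus subsidy come to $1575.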


[Figure 33.2, two panels. First panel: fraction of group on IA versus time (months), for the control group, the treatment group, and the difference. Second panel: fraction of group employed versus time (months), for the control group, the treatment group, and the difference.]

Figure 33.2: Source: D. Card and D. Hyslop, “Estimating the Effects of a Time-limited Earnings Subsidy for Welfare Leavers,” Econometrica 73, November 2005.


33.2 Research Designs Based on Natural Experiments

Often we cannot carry out an experiment, either because it would cost a lot and be quite invasive (e.g. SSP), or because it would be impractical. How do we proceed in such cases?

One approach is to consider events that occur, and gauge whether an analysis of the event could be interpreted as if the event were a random experiment. A very simple example is a paper I wrote on the Mariel Boatlift. In that paper, I examined the movements in wages and unemployment rates in Miami (where the Marielitos landed), and within a control group composed of four other cities: Tampa, Houston, Atlanta, and Los Angeles. A key difference between a true randomized experiment and a natural experiment is that in the latter treatment is not randomly assigned. So it is debatable whether the control group provides a valid counterfactual. For my paper, I examined trends in employment in Miami versus the average of the four other cities throughout the 70s: the two moved in close parallel. (Ironically, the editor of the journal forced me to remove this graph from the published paper!)

In a natural experiment, it may not happen that outcomes are exactly the same in both groups, even before the treatment. Let

$$\Delta^0 = Y_T^0 - Y_C^0$$

represent the pre-existing gap in the outcome (or measurable quantity) of interest (e.g. average wages), and let

$$\Delta^1 = Y_T^1 - Y_C^1$$

represent the gap at some time after the treatment has begun. Then we might want to look at the “difference-in-differences”

$$DD = \Delta^1 - \Delta^0 = (Y_T^1 - Y_T^0) - (Y_C^1 - Y_C^0)$$

This is the change in the treatment group relative to the change in the control group. The implicit assumption is that in the absence of treatment, the gap would have remained constant at $\Delta^0$.
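The calculation is a one-liner. In the sketch below the function name is invented, and the inputs are the Table 8 entries for mean log hourly earnings of blacks in Miami and in the comparison cities; taking 1979 as the "before" year and 1981 as the "after" year is a choice made here purely for illustration.

```python
def diff_in_diff(y_t_pre, y_t_post, y_c_pre, y_c_post):
    """DD = (change in treatment group) - (change in control group)."""
    return (y_t_post - y_t_pre) - (y_c_post - y_c_pre)


# Table 8, blacks: Miami 1.59 (1979) and 1.61 (1981);
# comparison cities 1.74 (1979) and 1.72 (1981).
dd_blacks = diff_in_diff(y_t_pre=1.59, y_t_post=1.61, y_c_pre=1.74, y_c_post=1.72)
print("DD for blacks, 1979 to 1981:", round(dd_blacks, 2))
```

For this pair of years the difference-in-differences is roughly 0.04 log points. Other choices of the "after" year give different numbers, so a single comparison like this should not be read as the paper's estimate.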

33.2.1 The Mariel Boatlift

In the Boatlift, about 125,000 Cuban immigrants were transported on a flotilla of small boats to Miami, over the period from April 1980 to July of the same year. This represented an increase of about 7% in the Miami labor force, mainly in the ranks of the unskilled. One simple hypothesis is that such an influx would reduce wages for unskilled workers already in Miami. Table 8 shows outcomes for blacks in Miami relative to the comparison cities.

Table 8: Logarithms of Real Hourly Earnings of Workers Age 16–61 in Miami and Four Comparison Cities, 1979–85.

Group                 1979    1980    1981    1982    1983    1984    1985

Miami:
  Whites              1.85    1.83    1.85    1.82    1.82    1.82    1.82
                     (.03)   (.03)   (.03)   (.03)   (.03)   (.03)   (.05)
  Blacks              1.59    1.55    1.61    1.48    1.48    1.57    1.60
                     (.03)   (.02)   (.03)   (.03)   (.03)   (.03)   (.04)
  Cubans              1.58    1.54    1.51    1.49    1.49    1.53    1.49
                     (.02)   (.02)   (.02)   (.02)   (.02)   (.03)   (.04)
  Hispanics           1.52    1.54    1.54    1.53    1.48    1.59    1.54
                     (.04)   (.04)   (.05)   (.05)   (.04)   (.04)   (.06)

Comparison Cities:
  Whites              1.93    1.90    1.91    1.91    1.90    1.91    1.92
                     (.01)   (.01)   (.01)   (.01)   (.01)   (.01)   (.01)
  Blacks              1.74    1.70    1.72    1.71    1.69    1.67    1.65
                     (.01)   (.02)   (.02)   (.01)   (.02)   (.02)   (.03)
  Hispanics           1.65    1.63    1.61    1.61    1.58    1.60    1.58
                     (.01)   (.01)   (.01)   (.01)   (.01)   (.01)   (.02)

Note: Entries represent means of log hourly earnings (deflated by the Consumer Price Index, 1980=100) for workers age 16–61 in Miami and four comparison cities: Atlanta, Houston, Los Angeles, and Tampa–St. Petersburg.

Source: D. Card, “The Impact of the Mariel Boatlift on the Miami Labor Market,” Industrial and Labor Relations Review, January 1990. Based on samples of employed workers in the outgoing rotation groups of the Current Population Survey in 1979–85. Due to a change in SMSA coding procedures in 1985, the 1985 sample is based on individuals in outgoing rotation groups for January–June of 1985 only.

33.3 Natural Experiments with Several Control Groups

In a natural experiment, one never can be sure the control group provides a valid counterfactual. Sometimes it is possible to do additional checks by using two or more control groups. Then you can construct

$$DD_1 = (Y_T^1 - Y_T^0) - (Y_{C_1}^1 - Y_{C_1}^0)$$

$$DD_2 = (Y_T^1 - Y_T^0) - (Y_{C_2}^1 - Y_{C_2}^0)$$

$$DD_3 = (Y_{C_2}^1 - Y_{C_2}^0) - (Y_{C_1}^1 - Y_{C_1}^0)$$

where $C_1$ refers to control group 1, and $C_2$ refers to control group 2. Ideally it will be the case that $DD_1 = DD_2$, or equivalently, $DD_3 = 0$.
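The equivalence of the two conditions follows from subtracting the second difference-in-differences from the first; the change in the treatment group cancels, leaving only the comparison between the two control groups:

```latex
\begin{aligned}
DD_1 - DD_2
  &= \left[(Y_T^1 - Y_T^0) - (Y_{C_1}^1 - Y_{C_1}^0)\right]
   - \left[(Y_T^1 - Y_T^0) - (Y_{C_2}^1 - Y_{C_2}^0)\right] \\
  &= (Y_{C_2}^1 - Y_{C_2}^0) - (Y_{C_1}^1 - Y_{C_1}^0) \\
  &= DD_3 .
\end{aligned}
```

So $DD_3$ is a check on the control groups themselves: it involves no treated units at all.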

33.3.1 The New Jersey Minimum Wage

In April 1992, the minimum wage rose from $4.25 to $5.05 per hour in the state of NJ. Elsewhere, it remained $4.25. The statute that raised the minimum wage had been passed in fall of the year before, and, in anticipation, Alan Krueger and I developed a survey of fast food restaurants in NJ and PA. We surveyed a set of about 400 restaurants first in February–March of 1992 (just before the increase), and again in late fall. We were extremely careful to track down all the restaurants that were surveyed in the first round. The treatment group consisted of restaurants in NJ whose starting wages were less than $5.00 per hour prior to the increase. There were two control groups: restaurants in PA, and restaurants in NJ that already were paying relatively high wages ($5.00 or more per hour prior to the increase). Table 9 shows the comparisons of employment growth between groups.

Table 9: Average Employment Per Store Before and After the Rise in the NJ Minimum Wage

33.4 The Discontinuity Research Design

Sometimes one cannot find a good natural experiment; it is nonetheless possible to find a good counterfactual by looking at treatments that affect some groups but not other, extremely similar groups. A good example is Medicare. When individuals who have worked for at least 10 years turn 65, they become eligible for “free” health insurance. (One also is eligible if one's spouse worked 10 years.) This age limit suggests that we compare individuals who are just a few months younger than 65 with those who are a few months older. Figure 33.3 shows the fractions of people with health insurance, by age (measured in quarters). The plots are for two groups: (relatively) more educated whites (over 12 years of education), and less educated minorities (blacks and Hispanics with less than 12 years of education). The idea of the discontinuity design is that the rule that grants free insurance to those who reach their 65th birthday creates an experiment: we think of those just over 65 as the treatment group, and those just under 65 as the control group. There are some potential problems with this idea, depending on the application:

• It may be that other factors, apart from the primary treatment, also change at the same point in time. So it is important to check very carefully that these factors are very similar between groups.


[Figure 33.3. Fraction of group insured versus age (55 to 75). Series: Whites, High Edu.; Overall; Minorities, Low Edu.; each shown as actual and predicted.]

Figure 33.3: Health insurance coverage rates by age, based on 1992–2001 data from NHIS.

• There may be an age trend in the outcome of interest, so that even without treatment, individuals who are a little over 65 tend to be a little different from those under 65 in a certain respect. This can be checked by looking at the age profile of the outcome of interest.

• If individuals know they soon will be eligible for Medicare, they may act differently when they are just under 65 from the way they would if there were no such rule.
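One common way to allow for an age trend is to fit a separate line on each side of the cutoff and compare the two fitted values at age 65. The sketch below does this with ordinary least squares written out by hand. It is only an illustration, not the method used to produce the figures; the function names, the bandwidth, and the data (generated from a made-up linear age profile with a built-in jump of 0.10) are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def discontinuity_estimate(ages, outcomes, cutoff=65.0, bandwidth=5.0):
    """Jump in the outcome at the cutoff, allowing a linear age trend on each side."""
    below = [(a - cutoff, y) for a, y in zip(ages, outcomes)
             if cutoff - bandwidth <= a < cutoff]
    above = [(a - cutoff, y) for a, y in zip(ages, outcomes)
             if cutoff <= a <= cutoff + bandwidth]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    # Age is centered at the cutoff, so each intercept is the fitted value at 65.
    return intercept_above - intercept_below


# Hypothetical quarterly data from age 60 to 70: a mild upward age trend in
# coverage plus a jump of 0.10 at age 65.
ages = [60.0 + 0.25 * k for k in range(41)]
coverage = [0.85 + 0.002 * (a - 65.0) + (0.10 if a >= 65.0 else 0.0) for a in ages]

jump = discontinuity_estimate(ages, coverage)
print("estimated jump at 65:", round(jump, 4))
```

A simple comparison of average outcomes above and below 65 would mix the jump with the age trend; fitting the trend separately on each side addresses the second concern in the list above, though not the first or the third.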


[Figure 33.4, two panels. First panel, “Percentage Who Did Not Get Medical Care Last Year for Cost Reasons”: percentage versus age (55 to 75), for Whites, High Edu.; Overall; Minorities, Low Edu. Second panel, “Florida Outpatient Data, 1997–2002”: Log(No. of Cataract Surgeries) versus age (55 to 75), for White, Hispanic, and Black.]

Figure 33.4: Here are plots showing the fractions of individuals belonging to three demographic groups who report that they did not receive medical care in the last year because they could not afford it, and the number of cataract surgeries by age in Florida. You can see the discontinuities in the cataract data.