· nonparametric instrumental variable estimation of structural quantile effects victor...

33
Working Paper Series _______________________________________________________________________________________________________________________ National Centre of Competence in Research Financial Valuation and Risk Management Working Paper No. 453 NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS V. Chernozhukov P. Gagliardini O. Scaillet First version: December 2006 Current version: June 2011 This research has been carried out within the NCCR FINRISK project on “Financial Econometrics for Risk Management” ___________________________________________________________________________________________________________

Upload: others

Post on 25-Apr-2020

25 views

Category:

Documents


0 download

TRANSCRIPT

Page 1:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

Working Paper Series

_______________________________________________________________________________________________________________________

National Centre of Competence in Research Financial Valuation and Risk Management

Working Paper No. 453

NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS

V. Chernozhukov P. Gagliardini

O. Scaillet

First version: December 2006 Current version: June 2011

This research has been carried out within the NCCR FINRISK project on

“Financial Econometrics for Risk Management”

___________________________________________________________________________________________________________

Page 2:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OFSTRUCTURAL QUANTILE EFFECTS

VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET

Abstract. We study the asymptotic distribution of Tikhonov Regularized estimation of

quantile structural e!ects implied by a nonseparable model. The nonparametric instrumental

variable estimator is based on a minimum distance principle. We show that the minimum

distance problem without regularization is locally ill-posed, and consider penalization by the

norms of the parameter and its derivatives. We derive pointwise asymptotic normality and

develop a consistent estimator of the asymptotic variance. We study the small sample prop-

erties via simulation results, and provide an empirical illustration to estimation of nonlinear

pricing curves for telecommunications services in the U.S.

Key words: Nonparametric Quantile Regression, Instrumental Variable, Ill-Posed Inverse

Problems, Tikhonov Regularization, Nonlinear Pricing Curve.

JEL classification: C13, C14, D12, G12. AMS 2000 classification: 62G08, 62G20,

62P20.

Date: 15 June 2011.We thank the editor and referees for helpful suggestions. We thank J. Hausman and G. Sidak for generously

providing us with the data on telecommunications services in the U.S. We are especially grateful to X. Chen for

comments that have led to major improvements in the paper. We also thank G. Dhaene, R. Koenker, O. Linton,

E. Mammen, M. Wolf, and seminar participants at Athens University, Zurich University, Bern University, Boston

University, Queen Mary, Greqam Marseille, Mannheim University, Leuven University, Toulouse University,

ESEM 2008 and SITE 2009 workshop for helpful comments. The last two authors acknowledge the support

provided by the Swiss National Science Foundation through the National Center of Competence in Research:

Financial Valuation and Risk Management (NCCR FINRISK).

1

Page 3:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

2

1. Introduction

We propose and analyze estimators of the nonseparable model Y = g(X,U), where theerror U is independent of the instrument Z, and has a uniform distribution U ! U(0, 1)(Chernozhukov and Hansen (2005), Chernozhukov, Imbens and Newey (2007)). The functiong(x, u) is strictly monotonic increasing w.r.t. u " [0, 1]. The variable X has compact supportX = [0, 1] and is potentially endogenous. The variables Y and Z have compact supports Y #[0, 1] and Z = [0, 1]dZ . The parameter of interest is the quantile structural e!ect !0(x) = g(x, ")on X for a given " " (0, 1). The function !0 measures the structural impact of the regressorX on the " -quantile of the dependent variable Y . Formally it satisfies the conditional quantilerestriction

P [Y $ !0(X) | Z] = P [g(X,U) $ g(X, ") | Z] = P [U $ " | Z] = ", (1.1)

which yields the endogenous quantile regression representation (Horowitz and Lee (2007)):

Y = !0(X) + V, P [V $ 0 | Z] = ". (1.2)

The main contribution of this paper is the derivation of the large sample distribution of aTikhonov regularized estimator of !0. This is the first distributional result in the literature onnonlinear problems, and it is noteworthy because of a fundamental di"culty of linearization ofa nonlinear ill-posed problem such as (1.2), as pointed out in Horowitz and Lee (2007). Eventhough this paper focuses on a particular case (1.2) of a nonlinear ill-posed problem, the resultsof the paper are conceptually amenable to other problems. Indeed, the nonsmooth case (1.2)analyzed here is in some sense the hardest; so our analysis could be applied to other problems,such as nonlinear ill-posed pricing problems in finance (see e.g. Egger and Engl (2005), Chenand Ludvigson (2009)) along similar lines.

We build on a series of fundamental papers on ill-posed endogenous mean regressions (Aiand Chen (2003), Darolles, Fan, Florens, and Renault (2011), Newey and Powell (2003), Halland Horowitz (2005), Horowitz (2007), Blundell, Chen, and Kristensen (2007)), and the re-view paper by Carrasco, Florens, and Renault (CFR, 2007). The main issue in nonparametricestimation with endogeneity is overcoming ill-posedness of the associated inverse problem.It occurs since the mapping of the reduced form parameter (that is, the distribution of thedata) into the structural parameter (that is, the instrumental regression function) is not con-tinuous. We need a regularization of the estimation to recover consistency. Here we fol-low Gagliardini and Scaillet (GS, 2006) and study a Tikhonov Regularized (TiR) estimator(Tikhonov (1963a,b), Groetsch (1984), Kress (1999)). We achieve regularization by adding acompactness-inducing penalty term, the Sobolev norm, to a functional minimum distance cri-terion. For nonparametric instrumental variable estimation of endogenous quantile regression

Page 4:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

3

(NIVQR), Chernozhukov, Imbens and Newey (2007) discuss identification and estimation viaa constrained minimum distance criterion. Horowitz and Lee (2007) give optimal consistencyrates for a L2-norm penalized estimator.

In independent work for a general setting, Chen and Pouzo (2008, 2009) study semiparamet-ric sieve estimation of conditional moment models based on possibly nonsmooth generalizedresidual functions. Specifically, Chen and Pouzo (2009) focus on the semiparametric e"ciency,asymptotic normality, and a weighted bootstrap procedure for the finite-dimensional param-eter, and use a finite-dimensional sieve to estimate the functional parameter. They coverpartially linear IVQR as a particular example. Chen and Pouzo (2008) give an in-depth, uni-fying treatment of convergence rates of penalized sieve-based estimators, and characterize whenthe sieve or the penalization dominates the convergence rates. Our and results of Chen andPouzo (2008, 2009) are complementary to each other (their results do not nest our results, andvice versa, since we work in an infinite-dimensional parameter space, while Chen and Pouzo(2008) work in finite-dimensional parameter spaces of increasing dimensions). The most im-portant di!erence is the derivation of pointwise asymptotic normality which is not available inChen and Pouzo (2008, 2009). Our other specific contributions for NIVQR include a proof ofill-posedness and a proof of consistency under weak conditions on the penalization parameter.

We organize the rest of the paper as follows. In Section 2, we prove local ill-posedness, andclarify the importance of including a derivative in the penalization. In Section 3, we proveconsistency of our Q-TiR estimator. In Section 4, we show pointwise asymptotic normality,and introduce a consistent estimator of the asymptotic variance. In Section 5, we provide com-putational experiments and present an empirical illustration to estimation of nonlinear pricingcurves for telecommunications services in the U.S. In the Appendix, we gather the technical as-sumptions and some proofs. We place all omitted proofs in the online supplementary materials(Chernozhukov, Gagliardini, Scaillet (2011)).

2. Ill-posedness in nonseparable models

From (1.1), the quantile structural e!ect !0 is a solution of the nonlinear functional equationA (!0) = ", where the operator A is defined by

A (!) (z) =!

FY |X,Z (!(x)|x, z) fX|Z(x|z)dx, z " Z,

and FY |X,Z and fX|Z denote the c.d.f. of Y given X,Z, and the p.d.f. of X given Z, re-spectively. Alternatively, in terms of the conditional c.d.f. FU |X,Z of U given X,Z, we canrewrite A (!) (z) =

"FU |X,Z

#g!1(x,!(x))|x, z

$fX|Z(x|z)dx, where g!1(x, .) denotes the gen-

eralized inverse of function g(x, .) w.r.t. its second argument. The functional parameter !0

Page 5:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

4

belongs to a subset # of the weighted Sobolev space H l[0, 1], l " N % {&}, that is the com-pletion of

%! " C l[0, 1] | '!'H < &

&w.r.t. the weighted Sobolev norm '!'H := (!,!)1/2

H ,where (!,#)H :=

'ls=0 as(*s!,*s#), as > 0, is the weighted Sobolev scalar product, and

(!,#) ="!(x)#(x)dx. Below we focus on (i) as = 1, when l < &, (ii) as = 1/s!, when l = &.

The former gives the classical Sobolev space of order l (Adams and Fournier (2003)) while thelatter gives the Sobolev space H"[0, 1] :=

%! " C"[0, 1] |

'"s=0

1s! (*

s!,*s!) < &&

of infi-nite order (Dubinskij (1986)). These Sobolev spaces are Hilbert spaces w.r.t. (., .)H , and theembeddings H"[0, 1] # H l[0, 1] # L2[0, 1], l " N, are compact (Adams and Fournier (2003),Theorem 6.3). We use the L2-norm '!' = (!,!)1/2 as consistency norm, and we assume that# is bounded and closed w.r.t. '.' .

We assume that !0 is globally identified on # (see Appendix C in Chernozhukov and Hansen(2005) for a discussion) and interior.

Assumption 1: (i) A (!) + " = 0, ! " #, if and only if ! = !0; (ii) function !0 is aninterior point of set # w.r.t. norm '.' .

We use the conditional moment restriction m(!0, z) := A (!0) (z) + " = 0, z " Z, andconsider the criterion

Q"(!) :=1

"(1 + ")E(m(!, Z)2

)=: 'A (!) + "'2

L2(FZ ,!) ,

where FZ denotes the marginal distribution of Z and L2(FZ , ") denotes the L2 space w.r.t.measure FZ/ ("(1 + ")). The true structural function !0 minimizes this criterion function Q".

The following proposition shows that the minimum distance problem above is locally ill-posed (see e.g. Definition 1.1 in Hofmann and Scherzer (1998)). There are sequences of in-creasingly oscillatory functions arbitrarily close to !0 that approximately minimize Q" whilenot converging to !0. In other words, function !0 is not identified in # as an isolated mini-mum of Q". Therefore, ill-posedness can lead to inconsistency of the naive analog estimatorsbased on the empirical analog of Q". In order to rule out these explosive solutions we usepenalization.

Proposition 1: Under Assumptions 1 (i) and A.3: (a) the problem is locally ill-posed,namely for any r > 0 small enough, there exist $ " (0, r) and a sequence (!n) # Br(!0) :=%! " L2[0, 1] : '!+ !0' < r

&such that '!n + !0' , $ and Q" (!n) - Q"(!0) = 0; (b) any

sequence (!n) # Br(!0) such that '!n + !0' , $ for r > $ > 0 and Q" (!n) - 0 satisfies

lim supn#"

'*!n' = +&.

The proof of result (a) gives explicit sequences (!n) generating ill-posedness. Since there isno general characterization of the ill-posedness of a nonlinear problem through conditions onits linearization, i.e., on the Frechet derivative of the operator (Engl, Kunisch and Neubauer

Page 6:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

5

(1989), Schock (2002)), this result does not follow from the ill-posedness of the linearizedversion of our problem. Under a stronger condition than Assumption 1 (i), namely localinjectivity of A, the definition of local ill-posedness is equivalent to A!1 being discontinuousin a neighborhood of A (!0) (see Engl, Hanke and Neubauer (2000), Chapter 10). Part (b)provides a theoretical underpinning for including the norm '*!' of the derivative in thepenalty term.

3. Consistency of the Q-TiR estimator

We consider a penalized criterion LT (!) := QT (!) + %T '!'2H , where %T > 0, P -a.s., and

QT (!) :=1

T "(1 + ")

T*

t=1

m (!, Zt)2 . (3.1)

The conditional moment m(!, z) is estimated nonparametrically by

m (!, z) :=!

FY |X,Z (! (x) |x, z) fX|Z(x|z)dx + " =: A (!) (z) + " , z " Z, (3.2)

where fX|Z and FY |X,Z denote kernel estimators of fX|Z and FY |X,Z with bandwidth hT > 0and kernel K satisfying Assumption A.2.

Proposition 2: Suppose %T is a stochastic sequence such that %T > 0, %T - 0, P -a.s., and1"T

+log T

ThdZ+1T

+ h2mT

,= Op(1), where m , 2 is the order of di!erentiability of the joint density

fX,Y,Z of (X,Y,Z) . Then, under Assumptions 1 (i) and A.1-A.3, the Q-TiR estimator !defined by

! := arg inf#$!

QT (!) + %T '!'2H , (3.3)

is consistent, namely '!+ !0'p- 0, as sample size T - &.

Term %T '!'2H in definition (3.3) penalizes highly oscillating components of the estimated

function induced by ill-posedness, and restores its consistency. In (3.3) we work with a func-tion space-based estimator as in Horowitz and Lee (2007) (see also the suggestion in Neweyand Powell (2003), p. 1573). In Section 5 we compute the estimator based on a finite largenumber of polynomials. The discrepancy between the function space-based estimator and theimplemented estimator is of a numerical nature since our type of asymptotics does not rely ona sieve approach.

To show Proposition 2 we use two results. First, the Sobolev penalty implies that thesequence of estimates ! for T " N is tight in

#L2 [0, 1] , '.'

$. This induces an e!ective com-

pactification of the parameter space: there exists a compact set that contains ! for any largeT with probability 1+ &, for any arbitrarily small & > 0. Second, we obtain a suitable uniform

Page 7:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

6

convergence result for QT on an infinite-dimensional and possibly non-totally bounded param-eter set # by exploiting the specific expression of m(!, z) given in (3.2). We are able to reducethe sup over # to a sup over a bounded subset of a finite-dimensional space. Proposition 6.2in Chen and Pouzo (2008) states a consistency result for nonparametric additive IVQR using aseries estimator for m(!, z) under similar conditions. In the rest of the paper we assume a de-terministic regularization parameter %T . The assumption in Proposition 2 becomes hT . T!$

and %T . T!% for 0 < ' < 1dZ+1 , 0 < ( < min {1 + ' (dZ + 1) , 2m'}; the relation aT . bT , for

positive sequences aT and bT , means that aT /bT is bounded away from 0 and & as T - &.

4. Asymptotic distribution of the Q-TiR estimator

In this section we derive a feasible asymptotic normality theorem. After deriving the first-order condition (Section 4.1) we show how to control the error induced by linearization ofthe problem under suitable smoothness assumptions (Sections 4.2 and 4.3). The validity of aBahadur-type representation for the functional estimator makes it possible to show asymptoticnormality (Section 4.4). We then provide a consistent estimator for the asymptotic variance(Section 4.5).

4.1. First-order condition. The asymptotic expansion of the Q-TiR estimator is derived byfollowing the same steps as in the usual finite-dimensional setting. To cope with the functionalnature of !0, we exploit an appropriate notion of di!erentiation to get the first-order condition.More precisely, we introduce the operator from L2[0, 1] to L2(FZ , ") corresponding to theFrechet derivative A := DA (!0) of operator A at !0 (see Appendix A.4):

A! (z) =!

fX,Y |Z(x,!0(x)|z)! (x) dx, (4.1)

and the operator from L2[0, 1] to L2(FZ , ") corresponding to the Frechet derivative A :=DA (!) of operator A at !:

A! (z) =!

fX,Y |Z(x, !(x)|z)! (x) dx, (4.2)

where z " Z and ! " L2[0, 1]. The space L2(FZ , ") is endowed with the scalar product(#1,#2)L2(FZ ,!) := 1

T !(1!!)'T

t=1 #1 (Zt)#2 (Zt). Under Assumptions A.4 (ii)-(iii), operatorA is compact, which implies that the linearized version of our problem is ill-posed. UnderAssumption 1 (ii), the Q-TiR estimator satisfies w.p.a. 1 the first-order condition

0 =d

d$LT (!+ $!)

----&=0

=2

T "(1 + ")

T*

t=1

.A(!)(Zt) + "

/A!(Zt) + 2%T (!,!)H

= 20A%

.A (!) + "

/+ %T !,!

1

H, (4.3)

Page 8:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

7

for all ! " H l[0, 1], where the second line in (4.3) comes from the definition of the op-erator A% through

0A%#,!

1

H:= 1

T !(1!!)'T

t=1 # (Zt) A! (Zt) , for any ! " H l[0, 1] and

# " L2(FZ , "). Operator A% is the adjoint of A w.r.t. the scalar products (., .)H on H l[0, 1]and (., .)L2(FZ ,!) on L2(FZ , "). It is the empirical counterpart of the adjoint operator A% ofA w.r.t. the Sobolev scalar product on H l[0, 1] and scalar product (., .)L2(FZ ,!) on L2(FZ , ").The operator A% can be characterized in terms of the adjoint A of A w.r.t. the L2[0, 1]scalar product, defined by A#(x) = 1

!(1!!)"

fX,Y,Z(x,!0(x), z)#(z)dz. For l = 1 we haveA% = D!1A, where D!1 denotes the inverse of operator D : H2

0 [0, 1] - L2[0, 1] with H20 [0, 1] :=%

! " H2[0, 1] : *!(0) = *!(1) = 0&

and D! =#1 + *2

$! (see Appendix A.4 for the deriva-

tion and the characterization with l > 1). From (4.3) holding for all ! " H l[0, 1], estimate !satisfies the nonlinear integro-di!erential equation of Type II:

A%.A (!) + "

/+ %T ! = 0. (4.4)

4.2. Highlighting the nonlinearity issue. We can rewrite Equation (4.4) by using thesecond-order expansion

2A (!) = 2A (!0) + A0$!+ R,

where

A0!(z) :=!

fX,Y |Z(x,!0(x)|z)!(x)dx, $! := !+ !0,

and R := R (!,!0) is the second-order residual term (see Lemma A.7). Then, after rearrangingterms and performing an asymptotic expansion, we get the decomposition (Appendix A.5):

$! = (%T + A%A)!1 A%) + BT + RT + KT ($!) , (4.5)

where )(z) :=" "

(" + 1 {y $ !0(x)}) "fX,Y,Z(x,y,z)fZ(z) dydx, $f = f + f , and

BT :=3(%T + A%A)!1 A%A + 1

4!0. (4.6)

The Bahadur-type representation (4.5) of the Q-TiR estimator (see e.g. Koenker (2005)) iskey to our asymptotic normality result. The stochastic term (%T + A%A)!1 A%) corresponds tothe leading term yielding asymptotic normality of the standard exogenous quantile regression.The deterministic term BT is the bias function induced by regularization. The remainder termRT accounts for kernel estimation of operator A and its expression is given in (A.8) below.

The nonlinearity term KT ($!) is defined by KT ($!) := +.%T + A%

0A0

/!1A%

0R, where A%0 is

defined as A%, but with !0 substituted for !.

The major di!erence between our ill-posed setting and standard finite-dimensional paramet-ric estimation problems, or well-posed functional estimation problems, concerns the behaviour

Page 9:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

8

of the nonlinearity term KT ($!). Controlling KT ($!) is a fundamental di"culty of lineariza-tion of a nonlinear ill-posed inverse problem. We prove in Lemma A.8 that KT ($!) satisfiesa quadratic bound

555KT ($!)555 $ C/

%T'$!'2 , (4.7)

w.p.a. 1, with a suitable constant C. In the RHS of (4.7) the coe"cient of the quadratic bounddiverges as the sample size increases. Hence, the usual argument that the quadratic nonlinearityterm is negligible w.r.t. the first-order term no matter the convergence rate of the latter, doesnot apply. Still, under Assumptions 2-4 discussed below and ensuring '$!' = op

#/%T$, we

can control the nonlinearity term KT ($!) and the remainder term RT .

4.3. Assumptions for asymptotic normality. Let !" := arg inf#$! Q" (!)+% '!'2H , with

% > 0, denote the nonlinear Tikhonov solution in the population. We will make the followingassumptions as well as the more technical Assumptions A.4-5 stated in the Appendix.

Assumption 2: The solution !" is unique and the equation DA(!)% (A (!) + ")+%! = 0,! " #, admits the unique solution ! = !", for all small % > 0.

This assumption involves the first-order condition for minimization of the penalized mini-mum distance criterion in the population. It rules out local extrema over # di!erent from theglobal minimum !".

Assumption 3: A! = 0 for ! " L2[0, 1] if and only if ! = 0.

This is an injectivity condition for the Frechet derivative A, that is, a local identification

assumption. Since operator A is such that A!(z) = E

67'g'u (X, ")

8!1!(X)|Z = z, U = "

9

and *g/*u > 0, Assumption 3 is equivalent to completeness of X by Z conditional onU = " ; see Chernozhukov, Imbens and Newey (2007) for examples and further discussionon the relationship with completeness.

Assumption 4: (i) The function !0 satisfies the source condition'"

j=1&(j ,#0'2H)2!

j< &,

& " (1/2, 1], where +j 0 0 and ,j are the eigenvalues and eigenfunctions of the compact,

self-adjoint operator A%A on H l[0, 1], with ',j'H = 1. (ii)'"

j=1&(j ,#0'2H

)j< 1/c2, where c :=

supx,y,z

--*yfX,Y |Z(x, y|z)--. (iii) % (%) := inf#$Hl[0,1]:

(#(=1

'A!'2L2(FZ ,!) +% '!'2

H , C%a, for C > 0,

a " (0, 1/2).

The source condition in Assumption 4 (i) requires that function !0 can be well approximatedby the elements of the eigenfunction basis {,j : j " N} associated with the larger eigenvaluesof operator A%A. The coe"cients (,j ,!0)H have to decrease to zero as j - & su"ciently fastcompared to the eigenvalues +j. This condition controls the bias contribution as in the proof

Page 10:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

9

of Proposition 3.11 in CFR, and implies (see Appendix A.5)

'BT 'H = O.%*T

/. (4.8)

Hence, the larger the parameter &, the smaller the regularization bias. Assumption 4 (ii) isthe analog of Assumption 6 in Horowitz and Lee (2007) for Sobolev penalization. It controlsthe distance between the nonlinear Tikhonov solution in the population and !0 along the linesof Proposition 10.7 in Engl, Hanke and Neubauer (2000). Together with Assumption 4 (i) itimplies '!"T + !0'H = O

#%*T$

(see the proof of Lemma B.6 in the supplementary materials).Finally, Assumption 4 (iii) implies that

inf#$!:

(#!#0()d*"

Q" (!) + % '!'2H + Q"(!0) + % '!0'2

H , C%% (%) , (4.9)

as % - 0, for any C < d2 (see Lemma B.6 in the supplementary materials). Inequality(4.9) replaces the usual ”identifiable uniqueness” condition (White and Wooldridge (1991))inf#$!:(#!#0()&Q" (!)+Q" (!0) > 0, which does not hold with ill-posedness (see Proposition1 (a)). Function %(%) characterizes how well the penalized criterion distinguishes !0 fromfunctions ! outside a neighbourhood of !0, when the radius

/% of the neighbourhood shrinks

to zero. The smaller the parameter a, the more e!ective the penalization restores “identifiableuniqueness”. Inequality (4.9) is sharp in the sense that for a linear problem the LHS is equalto d2%% (%) + O

#%3/2

$. Assumptions 2-4 are used to control the large deviation probability

P ['$!' , d/%T ], d > 0 (see Lemma B.7 in the supplementary materials). From (4.7) this

yields an upper bound for the nonlinearity term KT ($!) (see Lemmas A.10 and A.12).

In order to discuss plausibility of Assumption 4 (iii), let {,j : j " N} be an orthonormalbasis in L2[0, 1] of eigenfunctions of AA to eigenvalues +j. Consider the three conditions

(a)"*

j,k=1:j +=k

(,j , ,k)2H',j'2

H',k'2H

< 1, (b) ',j'2H , C1-j, (c) +j , C2.j , C1, C2 > 0, 1j , 1,

where -j 2 & and .j 0 0 are two given positive sequences. Condition (a) states that theeigenfunctions are not very correlated under (., .)H . Condition (b) gives a lower bound on thespeed of divergence of the Sobolev norms of the eigenfunctions in terms of sequence -j . Simi-larly, condition (c) gives a lower bound on the speed of convergence to zero of the eigenvaluesin terms of sequence .j . The divergence of sequence -j must dominate the convergence to zeroof sequence .j for Assumption 4 (iii) to hold. Specifically, conditions (a)-(c) with -j = jp and.j = j!+, 0 < / < p, imply Assumption 4 (iii) with a = +

++p . The hyperbolic decay of +j cor-responds to mild ill-posedness for operator AA. Moreover, conditions (a)-(c) with -j = exp(j2)and .j = exp (+/j), / > 0, imply Assumption 4 (iii) with any a " (0, 1/2). The geometric

Page 11:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

10

decay of +j corresponds to severe ill-posedness for operator AA (see CFR and Kress (1999) forthe terminology).

Remark 1. In the online supplementary materials we provide an example where the orthonor-mal eigenfunctions of AA in L2[0, 1] are given by ,j(x) = 1, ,j(x) =

/2 cos(0(j + 1)x),

j = 2, 3, . . . These functions satisfy (,j , ,k)H = 1{j = k}'l

s=0 as(0j)2s. Thus, for hyperbolicdecay +j , Cj!+ and finite Sobolev order l > //2, Assumption 4 (iii) holds with a = +

++2l .Similarly, for geometric decay +j , C exp (+/j) and Sobolev order l = &, Assumption 4 (iii)holds with any a " (0, 1/2).

The example in Remark 1 shows that Assumption A.4 (iii) involves an adaptation conditionbetween the speed of decay of the spectrum of operator AA (mild, resp. severe, ill-posednesss)and the Sobolev order l (finite, resp. infinite). This is related to condition (8.28) for operatorA in Engl, Hanke and Neubauer (2000). Within our setting of assumptions for asymptoticnormality, allowing for higher-order Sobolev penalties permits to accommodate various formsof ill-posedness. While the decay behaviours of the spectra of operators AA and A%A areexpected to be tightly related, making the link explicit appears di"cult in a general framework.

The regularity conditions on the eigenfunctions of A%A are more restrictive than in Horowitzand Lee (2007), and Chen and Pouzo (2008) (see also the discussion of Assumption A.5 inAppendix 1). They are useful to establish pointwise asymptotic normality.

4.4. Asymptotic normality. Let us define the following quantities

12T (x) :=

"*

j=1

+j

(%T + +j)2,2

j(x) and VT (%T ) :=1T

!12

T (x)dx.

The following proposition computes the limit distribution of the Q-TiR estimator for (anysequence of) typical points. By (sequence of) typical points, we mean sets XT # X = [0, 1]such that for x " XT

VT (%T )12

T (x)/T= O(1), (4.10)

that is, the variance at a typical point x is not much smaller than the integrated variance.

Proposition 3: Suppose Assumptions 1-4 and A.1-5 hold. Let hT and %T be such that:(a) hT . T!$ and %T . T!% with 0 < ' < 1

2(dZ+2) , 0 < ( < 12 min

71 + ' (dZ + 1) ,m', 1

1+a

8;

(b) %2*T = O (VT (%T )), VT (%T ) = o (%T ); (c) h1/4

TVT ("T ;2)VT ("T ) = o(1) and h2m

TVT ("T ;2m+&1)

VT ("T ) = o(1),for some $1 > 1, where VT (%T ; $) := 1

T

'"j=1

)j

("T +)j)2 ',j'2 j&. Then for any sequence x " XT

:T/12

T (x) (!(x) + !0(x) + BT (x)) d- N(0, 1).

Page 12:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

11

Furthermore, if %2*T = o (VT (%T )),

:T/12

T (x) (!(x) + !0(x)) d- N(0, 1).

The variance function 12T (x) involves the spectrum of operator A%A and the regularization

parameter %T . The penalty term %T '!'2H in the criterion defining the Q-TiR estimator implies

that the inverse eigenvalues 1/+j of the inverse of operator A%A are ridged with +j/(%T + +j)2.The variance formula is reminiscent of the usual asymptotic variance of the quantile regressionestimator: it involves the factor fV |X,Z(0|x, z) (see (1.2)) through the Frechet derivative A,

and the factor "(1 + ") through the adjoint A%, hidden in the spectrum of A%A. Since +j =(,j , A%A,j)H = 'A,j'2

L2(FZ ,!) $ 'A'2L ',j'2, where 'A'L denotes the operator norm of A,

we deduce that +!1j ',j'2 , 'A'!2

L , which implies'"

j=1 +!1j ',j'2 = &. Then, by using

(4.10), we get 12T (x) - & as %T - 0. Thus, 12

T (x) summarizes the impact of ill-posedness on

the nonparametric convergence rate:

T/12T (x). The conditions (a)-(c) on hT and %T ensure

that the asymptotic distribution of the nonlinear estimator ! is the same as the one inducedby linearization. These conditions depend on the instrument dimension dZ , the smoothnessproperties of the joint density of the observations via m, as well as on parameters & and a

introduced in Assumption 4. In particular, we need & > 1/2 > a. Since BT (x) = O('BT 'H) =O#%*T$

as shown in the proof of Proposition 3, %2*T = o(VT (%T )) is a su"cient condition for bias

negligibility at a typical point. Below in Remark 3 we state an example to discuss the mutualcompatibility of the conditions on hT and %T for linearization and bias negligibility. Finally, weshow in the supplementary materials that under a strengthening of Assumption A.5 (iii), themean integrated square error (MISE) is asymptotically like E['!+!0'2] = MT (%T ) (1 + o(1)),where MT (%T ) :=

" #1T 1

2T (x) + BT (x)2

$dx.

4.5. Estimating the asymptotic variance. The estimation of the asymptotic variance of! requires the estimation of the spectrum of operator A%A. Let us assume the followingsemiparametric specification for the decay of the eigenvalues +j and eigenfunction values ,j(x)at x " X when j - &.

Assumption 5: (i) The eigenvalues are such that +j = c1,j exp(v,j/) and (ii) the eigen-function values are such that ,j(x)2 = c2,j exp(wj2)3j , where v,j = +(j, log j), wj = + log j,

3j , 0 is an unknown periodic sequence with period S, / = (/1,/2)!

and 2 are unknownparameters, and c1,j, c2,j > 0 are unknown sequences converging to strictly positive constantsas j - &.The specification in Assumption 5 (i) accommodates both geometric decay with /2 = 0 andhyperbolic decay with /1 = 0 in the eigenvalues (severe and mild ill-posedness for operator

Page 13:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

12

A%A). The specification in Assumption 5 (ii) accommodates both a slowly-varying trend com-ponent exp(wj2) and an oscillatory component 3j in the eigenfunction values. We can relaxAssumption 5 to cover more general specifications of vj and wj.

Remark 2. For the geometric case of the example in Remark 1, we have +j = c1e!+1(j!1), withc1,/1 > 0. For the Sobolev order l = 1, we also have +j = c1

1+,2(j!1)2e!+1(j!1) and ,1(x)2 = 1,

,j(x)2 = c21+,2(j!1)2

cos2(0(j +1)x), j > 1, with c1, c2 > 0. Such an example is compatible withAssumption 5 where /2 = 2 = 2 and 3j = cos2(0(j + 1)x), if x is a rational number.

Let +j and ,j , for j " N, be the eigenvalues and eigenfunctions of operator A%A, normalizedsuch that

55,j

55H

= 1, where A = DA (!) is based on a pilot estimator !. The pilot estimator! is a Q-TiR estimator with regularization parameter %T and bandwidth hT satisfying theassumptions of Proposition 3. Let nT $ NT be integers growing with T . Define +j = +j

for 1 $ j $ nT , and +j = +nT exp((vj + vnT ),/), for nT < j $ NT , where the estimator /is computed through Ordinary Least Squares (OLS) by regressing log (+j/+nT ) on vj + vnT

for j = nT/2, · · · , nT + 1. Similarly, define ,j(x)2 = ,j(x)2 for 1 $ j $ nT , and ,j(x)2 =,S,nT (x)2 exp((wj + wnT )2)3jmodS , for nT < j $ NT . Here, ,S,j(x)2 :=

'S!1k=0 ,j!k(x)2, the

estimator 2 is computed through OLS by regressing log#,S,j(x)2/,S,nT (x)2

$on wj + wnT for

j = nT /2, · · · , nT + 1, and 3j = 2SnT

'k:nT/2-k<nT ,k=jmodS ,k(x)2/,S,k(x)2, for j = 1, ..., S,

where k = jmodS if k + j is an integer multiple of S. The estimator of the variance functionis 12

T (x) ='NT

j=1)j

()j+"T )2,j(x)2.

We have introduced the parameter nT since the nonparametric estimates of the spectrumof A%A may be unprecise in relative terms for j close to the truncation parameter NT . Thenonparametric estimates are replaced by extrapolated estimates between nT and NT . Theextrapolation procedure exploits the supposed decay behaviour of the spectrum in Assumption5. For the eigenfunction values, we estimate and extrapolate the parametric trend componentby using that the filtered spectral coe"cients ,S,j(x)2 :=

'S!1k=0 ,j!k(x)2 approaches exp(wj2)

for j - &, up to a scale constant. Then, we estimate the periodic component by averagingthe detrended square eigenfunction values over all lags j with same phase of the cycle.

Proposition 4: Denote $+j := mink-j (+k!1 + +k). Let NT be such that

"*

j=NT +1

+j ',j'2 = o#T%2

TVT (%T )$, (4.11)

and nT such that NT = O(nT ) and

1exp(wnT 2)$+nT

+1

Th2T

+ h2mT + MT

#%T$,1/2

= Op

.T!b

/, (4.12)

Page 14:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

13

for b > 0. Assume that 12%,T (x)/12

T (x) = O(1), where 12%,T (x) =

'"j=1

)j

()j+"T )2c2,j exp (wj2).

Then, under Assumptions 1-5 and A.1-5, -2T (x)

-2T (x)

p- 1.

Condition (4.11) ensures that the truncation bias is negligible. Condition (4.12) ensuresthat +j and ,j(x)2 are consistent in relative terms uniformly over 1 $ j $ nT . It involvesthe asymptotic MISE MT (%T ) of the pilot estimator. The condition NT = O(nT ) and theextrapolation procedure yield relative consistency of +j and ,j(x)2 uniformly over 1 $ j $ NT .

Remark 3. Let the eigenvalues and eigenfunctions satisfy Assumption 5 with /1,2 > 0,/2 , 0 and ',j'2 . j!. (see Remarks 1 and 2). Let us verify that there exist admissiblevalues of (, ', (, ', so that the conditions on %T , hT , %T , hT , nT , NT , to get Proposi-tions 3 and 4 are all compatible. In the supplementary materials we prove that VT (%T ; $) .

1T"T [log(1/"T )]""# for $ , 0, and 12

T (x),12%,T (x) . 1

"T [log(1/"T )]". If 2

m(1+2*) < ' < 1dZ+1

2*!12*+1 ,

11+2* < ( < 1

2 min7

1 + (dZ + 1)',m', 11+a

8, the pilot estimator satisfies the assumptions of

Proposition 3. If 2m(1+2*) < ' < 1

dZ+12*!12*+1 ,

11+2* < ( < 1

2 min7

1 + (dZ + 1)',m', 11+a

8, es-

timator ! is asymptotically normal with vanishing bias. The conditions on (, ', (, ' arecompatible if m > 2(dZ + 1) and & > 1

2 + max7

dZ+1m , a

8. Since

'"j=n+1 +j ',j'2 $ Ce!+1n,

n " N, and $+j = +j!1 + +j . j!+2e!+1j , conditions (4.11) and (4.12) are satisfied fornT = c1 log T and NT = c2 log T such that c1 < 1

+1min

71!2$

2 ,m', 1!%2 , &(

8and c2 > 1

+1(.

Then:

T/12T (x) (!(x) + !0(x)) d- N(0, 1), from which asymptotically valid pointwise confi-

dence intervals can be computed.

5. Monte-Carlo results and empirical illustration

5.1. Monte-Carlo results. We consider an experiment, following Newey and Powell (2003),where the errors U%

1 , U%2 and the instrument Z follow a trivariate normal distribution, with

zero means, unit variances and a correlation coe"cient of 0.5 between U%1 and U%

2 . We takeX% = Z + U%

2 , and build X = & (X%), Y = sin (0X) + U%1 , and U = &(U%

1 ). The mediancondition is P [Y + !0 (X) $ 0 | Z] = .5, where the functional parameter is !0(x) = sin (0x),x " [0, 1]. We consider sample size T = 1000 and a classical Sobolev penalty of order l "{1, 2}. For l = 1 we consider % = % " {.00002, .00005, .0001}, while for l = 2 we consider% = % " {.0000003, .000001, .000003}. We use Gaussian kernels and select the bandwidth withthe standard rule of thumb (Silverman (1986)). To compute ! and ! we opt for a numericalseries approximation based on standardized shifted Chebyshev polynomials of the first kind,and a user-supplied analytical gradient and Hessian optimization procedure. We report resultsusing 16 polynomials (order 0 to 15); results using the first 8, or more, polynomials are nearlyidentical. The variance estimator 12

T (x) is computed with NT = nT = 4. Figure 1 displays

Page 15:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

14

the QQ-plots of the finite sample distributions of:

T/12T (x) (!(x) + !0(x)), x = .5, for l = 1,

% = % = .00005 (left), and l = 2, % = % = .000001 (right), built on 1000 replications. Thefinite sample distributions are close to the standard normal distribution. For the selectedvalues of %,%, the regularization bias is rather small. Table I reports the finite sample coverageof pointwise confidence intervals for !0(x), x " {.10, .25, .50, .75, .90} using the di!erent valuesof l, %, %. Let us first consider the results for l = 1 (left panel). The finite sample coverageis close to the nominal coverage at 90%, 95%, 99% for x = .5 and % = % = .00005. At x = .5we observe overrejection for % = % = .0001 because of regularization bias, and underrejectionfor % = % = .00002 because of a too small number of terms in the estimated variance. Forinstance, with NT = nT = 5 and at x = .5, the finite sample coverage at 90%, 95%, 99% is.964, .985, .998 for l = 1, % = % = .00002, and .947, .974, .999 for l = 2, % = % = .0000003. Forx = .10, .25, .75, .90 and the considered values of %,%, we typically observe some overrejection.Finally, the results for l = 2 are qualitatively similar to those for l = 1.

5.2. Empirical illustration. This section presents an empirical illustration with U.S. long-distance call data extracted from the sample of Hausman and Sidak (2004). They investigatenonlinear price schedules chosen by consumers of message toll service o!ered by long-distanceinterexchange carriers. We estimate median structural e!ects for nonlinear pricing curvesbased on the conditional quantile condition P [Y $ !0 (X) | Z] = .5, with X = & (min) andZ = & (Inc). Variable Y is the price per minute in dollars, min is the standardized amount ofuse in minutes, and Inc is the standardized logarithm of annual income. We look at clients ofa leading long-distance interexchange carrier with age between 30 and 45. To study the e!ectof education on the chosen nonlinear price schedule we divide the sample into people withat most 12 years of education (T = 978), and people with more than 12 years of education(T = 435).

We set % = .0001, and start the optimization algorithm with the NIVR estimates of GS. Thespecification test of Gagliardini and Scaillet (2007) does not reject the null hypothesis of thecorrect specification of the moment restriction used in estimating the mean pricing curve at the5% significance level (p-value = .32, .77). We apply a reduction factor .8 to the regularizationparameters selected by the heuristic data-driven procedure of GS run on the linearization. Forthe two education categories, we get % = .154, .034 under a Sobolev penalty with l = 1, and% = .088, .001 under a Sobolev penalty with l = 2. We present the empirical results withsixteen polynomials. They remain virtually unchanged when increasing gradually the numberof polynomials from eight to sixteen. There is a stabilization of the value of the optimizedobjective function, of the loadings in the numerical series approximation, and of the data-driven regularization parameter. Higher order polynomials receive loadings which are closer

Page 16:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

15

and closer to zero. This suggests that we can limit ourselves to a small number of polynomialsin this application.

Figure 2 plots the estimated NIV median structural e!ect, pointwise asymptotic confidenceintervals at 95%, and the linear IVQR estimate for the two education categories. The upperpanel for l = 1 and the lower panel for l = 2 show that estimated NIVQR and IVQR structurale!ects are close, and their patterns di!er across the two education categories. As in Hausmanand Sidak (2004) we observe that less educated customers pay more than better educatedcustomers when the number of minutes of use increases. A possible explanation is that thelatter exploit better the tari! options for long-distance calls available at those ranges.

Figure 3 is a picture ”a la box-plot” where we represent the estimated quartile structurale!ects and the estimated mean structural e!ect with l = 1. The box-plot interpretation comesfrom the conditional probability of the shaded area being asymptoticallyP [g (X, 0.25) $ Y $ g (X, 0.75) |Z = z] = .5, for any given value z of the instrument. Ver-tical sections of the shaded areas correspond to measures of dispersion. For both educationcategories, the dispersion is smaller at high usage of the service, likely because of the e!ort ofpeople to find the most convenient tari!s.

References

[1] Adams, R. and J. Fournier (2003): Sobolev Spaces. Boston: Academic Press.

[2] Ai, C. and X. Chen (2003): E"cient Estimation of Models with Conditional Moment Restrictions Con-

taining Unknown Functions, Econometrica, 71, 1795-1843.

[3] Alt, H. (1992): Linear Functional Analysis. Berlin: Springer Verlag.

[4] Andrews, D. (1994): Empirical Process Methods in Econometrics, in the Handbook of Econometrics, Vol.

4, ed. by R. Engle and D. McFadden. Amsterdam: North Holland, 2247-2294.

[5] Blundell, R., X. Chen, and D. Kristensen (2007): Semi-Nonparametric IV Estimation of Shape Invariant

Engel Curves, Econometrica, 75, 1613-1669.

[6] Bosq, D. (1998): Nonparametric Statistics for Stochastic Processes. Berlin: Springer Verlag.

[7] Bosq, D. (2000): Linear Processes in Function Spaces. Theory and Applications. Berlin: Springer Verlag.

[8] Carrasco, M., J.-P. Florens, and E. Renault (2007): Linear Inverse Problems in Structural Econometrics:

Estimation Based on Spectral Decomposition and Regularization, in the Handbook of Econometrics, Vol.

6, ed. by J. Heckman and E. Leamer. Amsterdam: North Holland, 5633-5751.

[9] Chen, X. and S. Ludvigson (2009): Land of Addicts? An Empirical Investigation of Habit-Based Asset

Pricing Models, Journal of Applied Econometrics, 24, 1057-1093.

[10] Chen, X. and D. Pouzo (2008): Estimation of Nonparametric Conditional Moment Models with Possibly

Nonsmooth Moments, Cowles Foundation Discussion Paper 1650.

[11] Chen, X. and D. Pouzo (2009): E"cient Estimation of Semiparametric Conditional Moment Models with

Possibly Nonsmooth Residuals, Journal of Econometrics, 152, 46-60.

[12] Chernozhukov, V., P. Gagliardini, and O. Scaillet (2011): Nonparametric Instrumental Variable Estimation

of Structural Quantile E!ects, Supplementary material: Proofs.

Page 17:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

16

[13] Chernozhukov, V. and C. Hansen (2005): An IV Model of Quantile Treatment E!ect, Econometrica, 73,

245-261.

[14] Chernozhukov, V., G. Imbens, and W. Newey (2007): Instrumental Variable Estimation of Nonseparable

Models, Journal of Econometrics, 139, 4-14.

[15] Darolles, S., Y. Fan, J.-P. Florens, and E. Renault (2011): Nonparametric Instrumental Regression, forth-

coming in Econometrica.

[16] Dubinskij, J. (1986): Sobolev Spaces of Infinite Order and Di!erential Equations. Dordrecht: Kluwer

academic publishers.

[17] Egger, H., and H. Engl (2005): Tikhonov Regularization Applied to the Inverse Problem of Option Pricing:

Convergence Analysis and Rates, Inverse Problems, 21, 1027-1045.

[18] Engl, H., M. Hanke, and A. Neubauer (2000): Regularization of Inverse Problems. Dordrecht: Kluwer

academic publishers.

[19] Engl, H., K. Kunisch, and A. Neubauer (1989): Convergence Rates for Tikhonov Regularization of Non-

linear Ill-posed Problems, Inverse problems, 5, 523-540.

[20] Gagliardini, P. and O. Scaillet (2006): Tikhonov Regularization for Nonparametric Instrumental Variable

Estimators, Swiss Finance Institute Discussion Paper 2006.33.

[21] Gagliardini, P. and O. Scaillet (2007): A Specification Test for Nonparametric Instrumental Variable Re-

gression, Swiss Finance Institute Discussion Paper 2007.13.

[22] Groetsch, C. (1984): The Theory of Tikhonov Regularization for Fredholm Equations of the First Kind.

Boston: Pitman Advanced Publishing Program.

[23] Hall, P. and J. Horowitz (2005): Nonparametric Methods for Inference in the Presence of Instrumental

Variables, Annals of Statistics, 33, 2904-2929.

[24] Hall, P. and M. Hosseini-Nasab (2006): On Properties of Functional Principal Components Analysis,

Journal of the Royal Statistical Society, Series B, 68, 109-126.

[25] Hansen, B. (2008): Uniform Convergence Rates for Kernel Estimation with Dependent Data, Econometric

Theory, 24, 726-748.

[26] Hausman, J. and G. Sidak (2004): Why Do the Poor and the Less-Educated Pay More for Long-Distance

Calls?, Contributions to Economic Analysis and Policy, 3, Article 3.

[27] Hofmann, B. and O. Scherzer (1998): Local Ill-posedness and Source Conditions of Operator Equations in

Hilbert Spaces, Inverse Problems, 14, 1189-1206.

[28] Horowitz, J. (2007): Asymptotic Normality of a Nonparametric Instrumental Variables Estimator, Inter-

national Economic Review, 48, 1329-1349.

[29] Horowitz, J. and S. Lee (2007): Nonparametric Instrumental Variables Estimation of a Quantile Regression

Model, Econometrica, 75, 1191-1208.

[30] Koenker, R. (2005): Quantile Regression, Econometric Society Monographs, 38. Cambridge: Cambridge

University Press.

[31] Kress, R. (1999): Linear Integral Equations. New York: Springer.

[32] Newey, W. and J. Powell (2003): Instrumental Variable Estimation of Nonparametric Models, Economet-

rica, 71, 1565-1578.

[33] Reed, M. and B. Simon (1980): Functional Analysis. San Diego: Academic Press.

[34] Schock, E. (2002): Non-linear Ill-posed Equations: Counter-examples, Inverse problems, 18, 715-717.

[35] Silverman, B. (1986): Density Estimation for Statistics and Data Analysis. London: Chapman and Hall.

Page 18:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

17

[36] Tikhonov, A. N. (1963a): On the Solution of Incorrectly Formulated Problems and the Regularization

Method, Soviet Math. Doklady, 4, 1035-1038 (English Translation).

[37] Tikhonov, A. N. (1963b): Regularization of Incorrectly Posed Problems, Soviet Math. Doklady, 4, 1624-1627

(English Translation).

[38] White, H. and J. Wooldridge (1991): Some Results on Sieve Estimation with Dependent Observations, in

Nonparametric and Semiparametric Methods in Econometrics and Statistics, ed. by W. Barnett, J. Powell

and G. Tauchen. New York: Cambridge University Press, 459-493.

Appendix A. Regularity Conditions and Proofs

A.1. Assumptions A. Below we list the additional technical regularity conditions. In partic-ular, we invoke A.1-3 for proofs of local ill-posedness and consistency and A.4-5 for asymptoticnormality. For a function f of variable s in Rds and a multi-index / " Nds , we denote*+f := *+1

s1· · ·*+ds

sdsf , |/| :=

'dsi=1 /i, 'f'" := sups |f(s)| and 'Df'" :=

'+:|+|=1 '*+f'".

A.1: (i) {(Xl, Yl, Z%l ) : l = 1, ..., T %} is a sample of i.i.d. observations of random variable

(X,Y,Z%) admitting a density fX,Y,Z# on the support X 3 Y 3 Z% # Rd, where X = [0, 1],Y = [0, 1], Z% # RdZ , d = 2 + dZ . (ii) The density fX,Y,Z# is in class Cm

#Rd$, with

m , 2, and *+fX,Y,Z# is uniformly continuous and bounded, for any / " Nd with |/| = m.

(iii) The random variable (X,Y,Z) is such that (X,Y,Z) = (X,Y,Z%) if Z% " Z, whereZ = [0, 1]dZ is interior to Z%, and the density fZ of Z is such that infz$Z fZ(z) > 0.A.2: The kernel K on Rd is such that (i)

"K(u)du = 1 and K is bounded ; (ii) K has

compact support; (iii) K is di!erentiable, with bounded derivatives; (iv)"

u+K(u)du = 0 forany / " Nd with |/| < m.

A.3: (i) The function " 4- g(x, ") is strictly monotonic increasing and continuous, for almostany x " (0, 1), and supx,! |g(x, ")| < &, supx,! |*xg(x, ")| < &; (ii) 'DfX|Z'" < & ; (iii)'DFU |X,Z'" < &.

A.4: (i) There exists h > 0 such that function q (s) :='

+:|+|-m supv$Bh(s) |*+fX,Y,Z(v)|,

s " S := X 3 Y 3 Z, is integrable and satisfies"S

q(s)2

fX,Y,Z(s)ds < &, where Bh(s) denotes theball in Rd of radius h centered at s; (ii) 'fX,Y |Z'" < &; (iii) '*yfX,Y |Z'" < &.

A.5: (i)'"

j,k=1,j +=k&(j ,(k'2

((j(2((k(2 < &; (ii) The functions #j(z) := 1*)j

(A,j) (z), j " N, satisfy

supj$N E(|#j (Z)|s

)1/s< &, for s , 4; (iii) The functions #j are in class Cm

#RdZ

$such that

E(|*+#j (Z)|s

)1/s = O#j|+|

$for any / " NdZ with |/| $ m.

In Assumption A.1 (i), the compact support of X and Y is used for technical reasons.Assuming univariate X simplifies the exposition. Assumptions A.1 (ii) and A.2 are classicalconditions in kernel density estimation concerning smoothness of the density and of the kernel.

Page 19:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

18

In particular, when m > 2, K is a higher order kernel. Moreover, we assume a compact supportfor the kernel K to simplify the set of regularity conditions. In Assumption A.1 (iii), variableZ is obtained by truncating Z% on the compact set Z, and the density fZ of Z is boundedfrom below away from 0 on the support Z. The corresponding observations are Zt, t = 1, ..., T ,where T $ T %. We get the estimator fX,Y,Z of the density fX,Y,Z from the kernel estima-tor fY,X,Z#(x, y, z) = 1

T #hdT

'T #

l=1 K ((Xl + x)/hT )K ((Yl + y)/hT )K ((Z%l + z)/hT ) of density

fX,Y,Z# by normalization, namely fY,X,Z = fY,X,Z#/"X.Y.Z fY,X,Z# = fY,X,Z#/

"Z fZ#. This

trick is used in the proofs to control for small values of the estimator of the density of Z

appearing in denominators and to avoid edge e!ects. Alternative approaches to address thesetechnical issues consist in using trimming (see e.g. Hansen (2008)), boundary kernels (see e.g.Hall and Horowitz (2005) for the use of such kernels in NIVR) or density weighting (see e.g.Horowitz and Lee (2007)). Assumption A.3 (i) is a boundedness and smoothness condition onfunction g(x, ") w.r.t. both its arguments. Assumptions A.3 (ii) and (iii) concern boundednessand smoothness of the p.d.f. of X given Z, and the c.d.f. of U given X,Z, respectively.

Assumption A.4 concerns the joint density fX,Y,Z and the conditional density fX,Y |Z . Specif-ically, Assumption A.4 (i) imposes an integrability condition on a suitable measure of localvariation of density fX,Y,Z and its derivatives. This assumption is used in the proof of LemmaA.10 to bound higher order terms in the asymptotic expansion of the estimator coming fromkernel estimation bias. Assumptions A.4 (ii)-(iii) are used to show that A is Frechet di!er-entiable, with compact Frechet derivative. These assumptions can be rewritten in terms ofdensities fU |X,Z, fX|Z and function g. The formulation as in Assumptions A.4 (ii)-(iii) is closerto the use in the proofs, and simplifies the exposition. Assumption A.4 (iii) also implies aLipschitz behaviour of the Frechet derivative operator DA(!) w.r.t. ! in a neighborhood ofthe true function (Assumption ii) in Theorem 10.4 of Engl, Hanke and Neubauer (2000)). Fi-nally, Assumption A.5 concerns the singular system

%/+j ,,j ,#j ; j " N

&of operator A (Kress

(1999), p. 278). Assumption A.5 (i) requires that the (., .)H-orthonormal basis of eigenfunc-tions of A%A satisfy a summability condition w.r.t. (., .). This assumption eases the derivationof the upper bound of the MISE in Lemma A.10. Assumptions A.5 (ii) and (iii) ask for theexistence of bounds for moments of derivatives of functions #j , j " N. Functions #j , j " N,are an orthonormal system in L2(FZ , "). These assumptions control for terms of the type"#j(z) [1 {y $ !0(x)} + " ] fX,Y,Z(s)ds, uniformly in j " N, in the proof of Lemmas A.10, A.11

and A.14.

A.2. Proof of Proposition 1. In Step 1 we show local ill-posedness of the nonseparablesetting (part (a)), and in Step 2 we prove that sequences generating ill-posedness exhibitdiverging L2-norm of their first derivative (part (b)). We place all omitted proofs to thesupplementary materials.

Page 20:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

19

Step 1. (Proof of part (a)) We use the next Lemma A.1, which is a local version ofProposition 10.1 in Engl, Hanke and Neubauer (2000).

Lemma A.1. Suppose that: (i) Operator A is compact. (ii) For any r > 0 small enough,there exists a sequence (!n) # Br(!0) s.t. !n ! !0 and A (!n) w- A (!0) where w- denotesweak convergence. Then the minimum distance problem is locally ill-posed.

Condition (i) in Lemma A.1 follows from the next Lemma A.2, which is proved using aresult in Alt (1992).

Lemma A.2. Under Assumptions A.3 (ii) and (iii), operator A is compact.

Let us now verify Condition (ii) in Lemma A.1. Define #(x) := sin (20x) and #n(x) :=$#(nx), x " X , where 0 < $ < min{", 1 + "}. Further, let !n(x) := g(x, " + #n(x)), x " X .Then, we deduce that '!n + !0'2 =

"X [g(x, " + $#(nx)) + g (x, ")]2 dx. Split the integral

w.r.t. x over the partition ((k + 1) /n, k/n], k = 1, ..., n, of (0, 1]. It follows that

'!n + !0'2 =n*

k=1

! k/n

(k!1)/n[g(x, " + $#(nx)) + g (x, ")]2 dx

=n*

k=1

1n

! 1

0

6g

+k + 1

n+

y

n, " + $#(y)

,+ g

+k + 1

n+

y

n, "

,92

dy,

using the periodicity of #. Using Assumption A.3 (i),

'!n + !0'2 =n*

k=1

1n

! 1

0

6g

+k + 1

n, " + $#(y)

,+ g

+k + 1

n, "

,92

dy + O(1/n).

The first term is a converging Riemann sum, and we get '!n + !0'2 - I& :=!

X

! 1

0[g(x, " + $#(y)) + g (x, ")]2 dydx as n - &. Then, I& > 0, and I& - 0 as $ - 0 by the

dominated convergence Theorem. Thus, for $ > 0 su"ciently small, we have (!n) # Br(!0)and !n ! !0. For q " L2(FZ , ") we have,

(q,A (!n))L2(FZ ,!) =1

" (1 + ")

!q(z)fZ(z)

!

XfX|Z(x|z)FU |X,Z (" + #n (x) |x, z) dxdz.

Thus, we have to show

Jn :=!

q(z)fZ(z)!

XfX|Z(x|z)FU |X,Z (" + #n (x) |x, z) dxdz - "

!q(z)fZ(z)dz. (A.1)

Page 21:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

20

To this end, split the integral w.r.t. x over the partition ((k + 1) /n, k/n] with k = 1, ..., n

Jn =n*

k=1

! k/n

(k!1)/n

!q(z)fZ(z)fX|Z(x|z)FU |X,Z (" + $# (nx) |x, z) dzdx

=n*

k=1

1n

! 1

0

!q(z)fZ(z)fX|Z

+k + 1

n+

1n

y|z,

FU |X,Z

+" + $# (y) |k + 1

n+

1n

y, z

,dzdy

after a change of variable and using periodicity of function #. Then we have

Jn =n*

k=1

1n

!q(z)fZ(z)fX|Z

+k + 1

n|z,! 1

0FU |X,Z

+" + $# (y) |k + 1

n, z

,dydz + I1,n, (A.2)

where |I1,n| $'n

k=11n2

" 10

"q(z)fZ(z) supu,x,z |*xH(u, x|z)| ydzdy = O(1/n) and H(u, x|z) :=

FU |X,Z (u|x, z) fX|Z(x|z), with supu,x,z |*xH(u, x|z)| < & from Assumptions A.3 (ii)-(iii).Since the Riemann sum in (A.2) converges to the corresponding integral, we get

Jn -!

X

!q(z)fZ(z)fX|Z (x|z)

! 1

0FU |X,Z (" + $# (y) |x, z) dydzdx =: J .

Using that"X fX|Z (x|z) FU |X,Z(u|x, z)dx = P [U $ u|Z = z] = u by the independence of U

and Z, and the uniform distribution of U , we get J = ""

q(z)fZ(z)dz+$"

q(z)fZ(z)" 10 # (y) dy =

""

q(z)fZ(z)dz, and (A.1) follows.

Step 2. (Proof of part (b)) The proof is by contradiction. Suppose that there existsB < & such that '*!n' $ B for any n large enough. Since # is bounded, and by thecompact embedding theorem (see Adams and Fournier (2003)), set {! " # : '*!' $ B} iscompact w.r.t. the norm '.'. Therefore, there exists a subsequence (!mn) which converges innorm '.' to !% " #, say. Since Q" is continuous, we get Q" (!mn) - Q" (!%), and thusQ" (!%) = 0. By identification (Assumption 1 (i)), we deduce !% = !0, and the subsequence(!mn) converges to !0. But this is impossible, since '!mn + !0' , $ > 0.

A.3. Proof of Proposition 2. We establish existence of the Q-TiR estimator in Step 1 beforeshowing its consistency in Step 2.

Step 1. (Existence) Since QT (!) = 1T !(1!!)

'Tt=1 m(!, Zt)2 is positive, a function ! " #

minimizes QT (!) + %T '!'2H if and only if

! = arg inf#$!

QT (!) + %T '!'2H , s.t. %T '!'2

H $ LT (!0). (A.3)

The solution ! in (A.3) exists P -a.s. since (i) mapping !- '!'2H is lower semicontinuous on

H l[0, 1] w.r.t. the norm '.' (see Reed and Simon (1980), p. 358) and mapping ! - QT (!) iscontinuous on # w.r.t. the norm '.', P -a.s., for any T ; (ii) set

7! " # : '!'2

H $ L8

is compactw.r.t. the norm '.', for any constant 0 < L < & (compact embedding theorem; see Adamsand Fournier (2003)).

Page 22:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

21

Step 2. (Consistency) The next Lemma A.3 establishes (uniform) convergence of the min-imum distance criterion QT (!) by using results in Andrews (1994), Bosq (1998) and Hansen(2008).

Lemma A.3. Under Assumptions A.1, A.2, A.3 (ii)-(iii): (i) QT (!0) + Q" (!0) = Op (aT ),

where aT :=log T

ThdZ+1T

+ h2mT ; (ii) sup#$! |QT (!) + Q" (!)| = Op

./aT + 1*

T

/= op(1) for

aT = o(1).

By Lemma A.3 (i) and the condition on %T , we have

0 $ QT (!) + %T '!'2H $ QT (!0) + %T '!0'2

H = Op(aT + %T ) = Op(%T ). (A.4)

By QT , 0, this implies that %T '!'2H = Op(%T ), that is, '!'2

H = Op(1). Thus, by the compactembedding theorem, the sequence of minimizers ! is tight in (L2[0, 1], '·'). Namely, for any & >

0, there exists a compact subset K* of (L2[0, 1] 5 #, ' ·' ), such that! " K* wp ! 1 + &, where notation wp ! 1 + & means ”with probability at least 1 + &

for all su"ciently large sample sizes”.

Next we have that for any $ > 0 and & > 0, and any T su"ciently large,

P [! 6" B&(!0)] $ P [{! 6" B&(!0)} 5{ ! " K*}] + P [! /" K*]

$ P [{! 6" B&(!0)} 5{ ! " K*}] + &

$ P [ inf#$K!/!\B#(#0)

QT (!) + %T '!'2H $ QT (!) + %T '!'2

H ] + &.

Using bound (A.4) and Lemma A.3 (ii), we get:

P [! 6" B&(!0)] $ P [ inf#$K!/!\B#(#0)

Q"(!) + %T '!'2H + op(1) $ Op(%T )] + &

$ P [ inf#$K!/!\B#(#0)

Q"(!) + op(1) $ Op(%T )] + &

$ P [ inf#$K!/!\B#(#0)

Q"(!) $ op(1)] + &.

Now let .*,& := inf#$K!/!\B#(#0) Q"(!). By compactness of K* , continuity of Q" and identi-fication, we have .*,& = Q"(!%

*,&) > 0 for some !%*,& " K* 5# \ B&(!0). Thus

P [! 6" B&(!0)] $ P [.*,& $ op(1)] + & - &, as T - &.

Since & can be made arbitrary small, we conclude that P [! 6" B&(!0)] - 0. Since $ > 0 isarbitrary, consistency follows.

Page 23:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

22

A.4. Frechet derivative of A and characterization of its adjoints. We start with theFrechet derivative A of operator A before characterizing the adjoints A and A%.

Lemma A.4. Under Assumption A.4 (ii)-(iii), the Frechet derivative of A at !0 is thelinear operator A := DA (!0) defined by A! (z) =

"fX,Y |Z(x,!0(x)|z)! (x) dx, z " Z,

for ! " L2[0, 1]. Moreover we have A (!) = A (!0) + A$! + R (!,!0), where the resid-ual R (!,!0) is such that 'R (!,!0)'L2(FZ ,!) $ 1

2/!(1!!)

c '$!'2, $! := ! + !0, and c :=

supx$X ,y$Y ,z$Z |*yfX,Y |Z(x, y|z)|.

The characterization of the adjoint A of operator A w.r.t. the L2 scalar product (., .) isobtained from:

(#, A!)L2(FZ ,!) =1

" (1 + ")

!#(z)

+!fX,Y |Z(x,!0(x)|z)! (x) dx

,f(z)dz

=1

" (1 + ")

! +!fX,Y,Z(x,!0(x), z)#(z)dz

,! (x) dx =

0A#,!

1,

for ! " L2[0, 1] and # " L2(FZ , "), which yields A#(x) = 1!(1!!)

"fX,Y,Z(x,!0(x), z)#(z)dz.

The characterization of the adjoint A% of operator A w.r.t. the Sobolev scalar product (., .)H

involves the solution of a Dirichlet problem of finite, or infinite, order (see Dubinskij (1986),Chapter 2, for elliptic boundary value problems of infinite order). Let us introduce the functionp(%) =

'ls=0 as%s. When l < & and as = 1, the function p(%) is a polynomial of order l.

When l = & and as = 1/s!, we get p(%) = exp(%). Moreover, let us introduce the completeorthonormal system of L2[0, 1] given by #1(x) = 1, #j(x) =

/2 cos(0 (j + 1) x), j > 1. Let us

define:

S l[0, 1] =

;<

=! " L2[0, 1] :"*

j=1

3p#(0 (j + 1))2

$(!, #j)

42< &

>?

@ , l " N% {&} .

Space S l[0, 1] is a linear vector subspace of L2[0, 1] made of functions whose basis coe"cients(!, #j) feature rapid decay such that p

#(0 (j + 1))2

$(!, #j), j = 1, · · · , are square-summable.

It is an Hilbert space w.r.t. the scalar product (!,,)S :='"

j=1

(p#(0 (j + 1))2

$)2 (!, #j)(,, #j).We denote by '!'S := (!,!)1/2

S the associated norm. When l < &, the space S l[0, 1] is equiv-alent to H2l

0 [0, 1] :=%! " H2l[0, 1] : *s!(0) = *s!(1) = 0, s = 1, 3, · · · , 2l + 1

&, that is the

Sobolev space of order 2l with boundary conditions for the odd-order derivatives (see Kress(1999), Chapter 8, for similar results with periodic functions). When l = &, the functionsin S"[0, 1] are C"-functions with odd-order derivatives vanishing at the boundary, and expo-nentially decaying basis coe"cients.

Page 24:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

23

Lemma A.5. (i) The di!erential operator D := p(+*2) is well-defined on S l[0, 1]. (ii) Forany f " L2[0, 1], the (infinite-order) ODE:

Du = f, u " S l[0, 1], (A.5)

admits the unique solution u ='"

j=11

p((,(j!1))2)(f, #j)#j . (iii) The operator D from S l[0, 1]to L2[0, 1] is invertible, and the inverse D!1 : L2[0, 1] - S l[0, 1] is continuous.

For any a " [0, 1] and s = 1, · · · , l + 1, the linear functional ! 4- *s!(a) is continuouson H l[0, 1]. Denote by &(s)a " H l[0, 1] its Riesz representant, i.e., (&(s)a ,!)H = *s!(a) for any! " H l[0, 1]. The next lemma gives the characterization of the adjoint A% for l , 1.

Lemma A.6. (i) The adjoint of operator A w.r.t. the Sobolev scalar product (., .)H is A% =ED!1A, where the operator E : S l[0, 1] - H l[0, 1] is defined by: Eu = u, if l = 1; Eu =u +

'0l/21j=1 (+1)j

.uj(1)&

(2j!1)1 + uj(0)&

(2j!1)0

/, where uj =

'l!ji=j(+*2)iu, if 2 $ l < &; Eu =

u +'"

j=1(+1)j 1j!

.uj(1)&

(2j!1)1 + uj(0)&

(2j!1)0

/, where uj =

'"i=j

j!(i+j)!(+*2)iu, if l = &, for

u " S l[0, 1]. (ii) The operator E : S l[0, 1] - H l[0, 1] is continuous.

A.5. Proof of Proposition 3. The steps are as follows: 1) getting the first-order condition,2) deriving a Bahadur representation and an asymptotic expansion of the MISE, 3) provingasymptotic normality and 4) showing bias negligibility.

Step 1. (First-order condition) The following lemma provides the Frechet derivative ofoperator A.

Lemma A.7. Under Assumption A.2, the Frechet derivative of A at ! is the linear operatorA := DA (!) defined by A! (z) =

"fX,Y |Z(x, !(x)|z)! (x) dx, z " Z, for ! " L2[0, 1].

Moreover we have A (!) = A (!) + A (!+ !) + R (!, !), where R (!, !) is such that P -a.s.,555R (!, !)555

L2(FZ ,!)$ 1

2/!(1!!)

c '!+ !'2, and c := supx$X ,y$R,z$Z |*y fX,Y |Z(x, y|z)|.

By Assumption 1 (ii), let r > 0 be such that Br (!0) 5 H l[0, 1] is contained in #. When'$!' < r we have: 1! " H l[0, 1], 74 = 4 (!) > 0 : ! + $! " # for any $ " R s.t. |$| < 4.By Lemma A.7, the estimator ! satisfies the first order condition (4.3). We show in thesupplementary materials that P ['$!' , r] = O

.T!b

/, for any b > 0.

Step 2. (Bahadur representation and asymptotic expansion of the MISE) We first rewrite(4.4) as

$! = $# + KT ($!) , (A.6)

where$# := #+!0, with # :=.%T + A%

0A0

/!1A%

0r+.%T + A%

0A0

/!1 .A% + A%

0

/.A (!) + "

/

=: #1 + #2 and r := " + A0!0 + 2A(!0), and KT ($!) defined in Section 4.2. The interpretationof #1 is as a linearized solution obtained from applying a Tikhonov regularization to the linear

Page 25:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

24

proxy A0! = r. Impact of nonlinearity is two-fold. We face the second-order term R inKT ($!) because of the expansion, and we face A% + A%

0 in #2 because of ! in A. Now, wedecompose $# as:

$# = (%T + A!A)"1 A!) + BT + RT , (A.7)

RT =6.%T + A!

0A0

/"1+ (%T + A!A)"1

9A!) +

6.%T + A!

0A0

/"1A!

0A0 + (%T + A!A)"1 A!A

9!0

+.%T + A!

0A0

/"1 .A!

0

.) + q

/+ A!)

/+.%T + A!

0A0

/"1 .A! + A!

0

/.A (!) + "

/, (A.8)

where q := A (!0) + " + ). The Bahadur-type representation (4.5) follows from (A.6) and(A.7).

We show by using Lemma A.8 that the nonlinearity term KT ($!) in Equation (A.6) satisfiesa quadratic bound w.p.a. 1.

Lemma A.8. Under Assumptions A.1-A.3, A.4 (ii)-(iii) and ' < 1dZ+4 , for any b > 0 and

C > 12/!(1!!)

supx,y,z

--*yfX,Y |Z(x, y|z)--:

P

6555KT ($!)555 >

C/%T

'$!'29

= O.T!b

/.

Next, by exploiting (A.6) and the quadratic nature of KT ($!) , we get an asymptoticexpansion of the MISE E

3'$!'2

4in terms of %T and the expectations of powers of

555$#555.

Lemma A.9. Under Assumptions 1-4, A.1-A.3, A.4 (ii)-(iii), ' < 14+dZ

and provided that

( < min7

1!$(dZ+1)1+2a , 2m$

1+2a , 12(1+a)

8we have that for any b > 0

E3'$!'2

4= E

6555$#555

29

+ O

+1/%T

E

6555$#555

39,

+ O.T!b

/.

To compute moments of555$#

555 , we use the decomposition (A.7) and get the next upperbound for the asymptotic MISE.

Lemma A.10. Under Assumptions 1-4, A.1-5, and ' < 12(dZ+2) , ( < 1

2 min {1 + ' (dZ + 1) ,m',

11+a

8, VT (%T ) = o (%T ) , h1/4

TVT ("T ;2)VT ("T ) = o(1): E

3'$!'2

4= O (MT (%T )), where MT (%T ) =

" #1T 1

2T (x) + BT (x)2

$dx.

Step 3. (Asymptotic normality) From the Bahadur representation (4.5), we have:

T/12T (x) (! (x) + !0 (x)) =

:T/12

T (x) (%T + A%A)!1 A%.) + E)

/(x)

+:

T/12T (x)BT (x) +

:T/12

T (x)KT ($!) (x)

+:

T/12T (x) (%T + A%A)!1 A%E) (x) +

:T/12

T (x)RT (x)

=: (I) + (II) + (III) + (IV) + (V).

Page 26:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

25

Step 3(a). (Asymptotic normality of I) Since {,j : j " N} is an orthonormal basis w.r.t.(., .)H , we can write:

(%T + A%A)!1 A%.) + E)

/(x) =

"*

j=1

0,j , (%T + A%A)!1 A%

.) + E)

/1

H,j(x)

="*

j=1

1%T + +j

0A,j , ) + E)

1

L2(FZ ,!),j(x),

for almost any x " [0, 1]. Then, we get

:T/12

T (x) (%T + A%A)!1 A%.) + E)

/(x) =

"*

j=1

wj,T (x)Zj,T , (A.9)

where Zj,T :=0#j ,

/T.) + E)

/1

L2(FZ ,!)and wj,T (x) :=

*)j

"T +)j,j (x) /1T (x), j = 1, 2, · · · .

Note that'"

j=1 wj,T (x)2 = 1. Let gj(r) := 1!(1!!)#j(z)1#0 (w) where 1#0 (w) := 1{y $

!0(x)} + " , w = (y, x).

Lemma A.11. Under Assumptions A.1, A.2, A.5 (iii), ( < m', VT ("T )-2

T (x)/T= O(1),

/hT

VT ("T ;&1)VT ("T ) = o(1), h2m

TVT ("T ;2m+&1)

VT ("T ) = o(1), $1 > 1 :'"

j=1 wj,T (x)Zj,T = 1*T

'Tt=1 YtT +

op(1), where YtT :='"

j=1 wj,T (x)gj(Rt), Rt = (Xt, Yt, Zt).

From Lemma A.11 it is su"cient to prove that T!1/2'Tt=1 YtT is asymptotically N(0, 1) dis-

tributed. Note that E [gj(R)] = 1!(1!!)

1*)j

E [(A,j) (Z)E [1#0 (W ) |Z]] = 0, andCov [gj(R), gk(R)] = 1*

)j*)k

(,j, A%A,k)H = &j,k. Thus E [YtT ] = 0 and V [YtT ] = 1. Bythe Lyapunov CLT, it is su"cient to show that

1T 1/2

E3|YtT |3

4- 0, T - &. (A.10)

To this goal, using the triangular inequality and denoting 'gj'3 := E3|gj(R)|3

41/3, we get

E3|YtT |3

41/3$

'"j=1 |wj,T (x)| 'gj'3 $ C

-T (x)

'"j=1

*)j

"T +)j|,j (x)|, where C :=

1!(1!!) supj$N E

3|#j (Z)|3

41/3< & from Assumption A.5 (ii). From the Cauchy-Schwarz

inequality we have'"

j=1

*)j

"T +)j|,j (x)| $

.'"j=1

)j

("T +)j)2,j (x)2 j&2

/1/2 .'"j=1

1j#2

/1/2, and

'"j=1

1j#2 < &, for any $2 > 1. From (4.10) we get 1

T 1/2 E3|YtT |3

4= O

+31

T 1/3*T (x)

VT ("T )

43/2,

,

where &T (x) := 1T

'"j=1

)j

("T +)j)2,j (x)2 j&2 . Condition (A.10) follows from

"1

T 1/3*T (x)

VT ("T )dx =1

T 1/3VT ("T ;&2)VT ("T ) = O

.h1/4

TVT ("T ;&2)VT ("T )

/= o(1).

Step 3(b). (Negligibility of III) We use the following lemma.

Page 27:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

26

Lemma A.12. Under Assumptions A.1-A.3, A.4 (ii)-(iii), ' < 1dZ+4 :

KT ($!) (x) = Op

.1*"T

'$!'2/

By Lemmas A.10 and A.12, we get KT ($!) (x) = Op

.1*"T

MT (%T )/. Since

"BT (x)2dx =

O#%2*

T

$(see Step 4) and %2*

T = O (VT (%T )), we have MT (%T ) = O (VT (%T )). Then, we get:

T/12T (x)KT ($!) (x) = Op

+:VT ("T )"T

,= op(1) from (4.10) and VT (%T ) = o (%T ) .

Step 3(c). (Negligibility of IV and V) Here we rely on the following lemmas.

Lemma A.13. Under Assumptions A.2, A.4 (i), ( < m$2 , %2*

T = O (VT (%T )), VT ("T )-2

T (x)/T= O(1) :

:T/12

T (x) (%T + A%A)!1 A%E) (x) = o(1).

Lemma A.14. Under Assumptions A.1-5, %2*T = O (VT (%T )), VT ("T )

-2T (x)/T

= O(1), VT (%T ) =

o(%T ), ' < 12(dZ+2) , ( < 1

2 min7

1 + ' (dZ + 1) ,m', 11+a

8::

T/12T (x)RT (x) = op(1) .

Step 4. (Bias negligibility) Similarly to the proof of Proposition 3.11 in CFR, we have

'BT '2H =

"*

j=1

%2T (,j ,!0)2H(%T + +j)2

= %2*T

"*

j=1

%2!2*T +2*

j

(%T + +j)2(,j ,!0)2H+2*

j

$ %2*T

"*

j=1

(,j ,!0)2H+2*

j

= O.%2*

T

/,

from Assumption 4 (i). This proves (4.8). Moreover, from Lemma C.1 in the supplemen-tary materials, we have BT (x) $ 2'BT 'H , for any x " [0, 1]. Thus, from (4.10) we get:

T/12T (x)BT (x) = O

+:%2*

T /VT (%T ),

= o(1) for any typical point x, if %2*T = o (VT (%T )).

A.6. Proof of Proposition 4. Condition -2T (x)-2

T (x)

p- 1 is equivalent to |-2T (x)!-2

T (x)|-2

T (x)= op(1). Let

us introduce the truncated series 120,T (x) =

'NTj=1

)j

()j+"T )2,j(x)2, and the residual 12

1,T (x) ='"

j=NT +1)j

()j+"T )2,j(x)2. We have |-2

T (x)!-2T (x)|

-2T (x)

$ |-2T (x)!-2

0,T (x)|-2

T (x)+

-21,T (x)

-2T (x)

. In Step 1 we show

that the residual in the truncated series is negligible: 5T :=-21,T (x)

-2T (x)

= o(1). In Step 2 we show

that the truncated estimator is consistent for the truncated series: &T := |-2T (x)!-2

0,T (x)|-2

T (x)= op(1).

In Step 3 we show the relative consistency of the estimated spectrum.

Step 1. (Condition on the cut-o!) Using" 10 1

21,T (x)dx =

'"j=NT +1

)j

()j+"T )2',j'2 $

%!2T

'"j=NT +1 +j ',j'2 and Condition (4.11), we get

" 10 -

21,T (x)dx/T

VT ("T ) = o(1). This implies-21,T (x)/T

VT ("T ) =

o(1) for almost all x " [0, 1]. By using VT ("T )-2

T (x)/T= O(1), it follows that 5T = o(1).

Step 2. (Consistent estimation of the truncated series) We have

12T (x) + 12

0,T (x) =NT*

j=1

+j

(+j + %T )2.,j(x)2 + ,j(x)2

/+

NT*

j=1

A+j

(+j + %T )2+ +j

(+j + %T )2

B

,j(x)2.

Page 28:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

27

Moreover )j

()j+"T )2+ )j

()j+"T )2= 1

()j+"T )2()j+"T )2#%2

T + +j+j$(+j + +j) , and

--- )j

()j+"T )2+ )j

()j+"T )2

--- $1

()j+"T )2|+j + +j | + 1

()j+"T )()j+"T ) |+j + +j | . Thus:

&T $ 112

T (x)

NT*

j=1

+j

(+j + %T )2---,j(x)2 + ,j(x)2

---+1

12T (x)

NT*

j=1

1(+j + %T )2

|+j + +j| ,j(x)2

+1

12T (x)

NT*

j=1

1(+j + %T ) (+j + %T )

|+j + +j | ,j(x)2. (A.11)

Let us define 5%j := c2,j exp (wj2), 61,T := sup1-j-NT

|)j!)j |)j

and 62,T := sup1-j-NT

|(j(x)2!(j(x)2|/#j

.

For any j $ NT , we have---,j(x)2 + ,j(x)2

--- $ 5%j 62,T , and |+j + +j| ,j(x)2 $ +j61,T

.,j(x)2 + 5%j 62,T

/.

Moreover, by the Cauchy-Schwartz inequality:

NT*

j=1

1(+j + %T ) (+j + %T )

|+j + +j| ,j(x)2 $NT*

j=1

C+j +j

---,j(x)---.|,j(x)| +

:5%j 62,T

/

(+j + %T ) (+j + %T )61,TC

1 + 61,T

$

D

ENT*

j=1

+j,j(x)2

(+j + %T )2

F

G1/2

H

IJ

D

ENT*

j=1

+j,j(x)2

(+j + %T )2

F

G1/2

+ /62,T

D

ENT*

j=1

+j5%j

(+j + %T )2

F

G1/2K

LM61,TC

1 + 61,T.

Thus, from (A.11), we get

&T $12%,0,T (x)12

T (x)(1 + 61,T ) 62,T +

120,T (x)12

T (x)61,T +

1T (x)10,T (x)12

T (x)61,TC

1 + 61,T+1T (x)1%,0,T (x)

12T (x)

61,T/62,TC

1 + 61,T,

where 12%,0,T (x) :=

'NTj=1

)j/#j()j+"T )2

. Now, by using 120,T (x) $ 12

T (x), 12%,0,T (x) $ 12

%,T (x) $C12

T (x), for a constant C, and 12T (x) $ 12

T (x) (&T + 1 + 5T ) , we obtain

&T $ C62,T (1 + 61,T ) + 61,T +C

1 + &T + 5T61,TC

1 + 61,T

.1 +

CC62,T

/.

Since 5T = o(1) from Step 1, we deduce &T = op(1) if we can show 61,T = op(1) and 62,T = op(1).

Step 3(a). (Relative consistency of +j and ,j(x)2 uniformly over 1 $ j $ nT ) Let us first

show that sup1-j-nT

|)j!)j |)j

= Op(T!b) and sup1-j-nT

|(j(x)2!(j(x)2|/#j

= Op(T!b), for b > 0.These conditions are implied by

1+nT

sup1-j-nT

|+j + +j| = op(T!b),15%nT

sup1-j-nT

55,j + ,j

55 = op(T!b), (A.12)

for b > 0. To prove (A.12) we use the next Lemma A.15, which follows from Lemmas 4.2and 4.3 in Bosq (2000); see also Theorem 1 in Hall and Hosseini-Nasab (2006) for a spectraldecomposition in L2[0, 1].

Page 29:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

28

Lemma A.15. For any j: (i) |+j + +j| $555D

555H

, and (ii)55,j + ,j

55 $ 2*

2")j

555D555

H, where

D := A%A + A%A, and555D

555H

:= sup#$Hl[0,1]:(#(H=1

555D!555

Hdenotes the operator norm in

H l[0, 1].

From Lemma A.15, an upper bound on the rate of convergence of |+j + +j | and55,j + ,j

55

can be deduced from the rate of convergence of555D

555H

.

Lemma A.16. Under Assumptions A.1-5:555D

555H

= Op

+1/Th2

T

+ hmT + '!+ !0'

,.

From Lemmas A.15 and A.16, and '!+ !0' = Op

.MT

#%T$1/2

/from Lemma A.10, we get

sup1-j-nT

|+j + +j | = Op (.T ) , sup1-j-nT

55,j + ,j

55 = Op

+.T

$+nT

,,

where .T := 1/Th2

T

+hmT +MT

#%T$1/2

. Thus, (A.12) follows from condition (4.12) and $+nT $+nT!1.

Step 3(b). (Relative consistency of +j and ,j(x)2 uniformly over 1 $ j $ NT ) Finally, weprove in the next Lemma A.17 that the uniform convergence of +j and ,j can be extended toj $ NT by the extrapolation procedure.

Lemma A.17. Under Assumptions 5 and A.1-5, and if NT = O(nT ): (i) supnT <j-NT

|)j!)j |)j

=

op(1), (ii) supnT <j-NT

|(j(x)2!(j(x)2|/#j

= op(1).

Page 30:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

29

l = 1 l = 2% = % = .00002 % = % = .0000003

x = 0.1 x = .25 x = .5 x = .75 x = .9 x = 0.1 x = .25 x = .5 x = .75 x = .990% .909 .952 .754 .947 .868 .928 .986 .718 .990 .91795% .956 .973 .850 .978 .937 .967 .993 .807 .998 .95399% .989 .996 .953 .993 .979 .994 .997 .915 1 .989

% = % = .00005 % = % = .000001x = 0.1 x = .25 x = .5 x = .75 x = .9 x = 0.1 x = .25 x = .5 x = .75 x = .9

90% .964 .968 .879 .956 .947 .989 .985 .922 .973 .98695% .989 .989 .935 .981 .974 .996 .997 .971 .996 .99699% .996 1 .983 .996 .998 .999 .999 .994 .999 1

% = % = .0001 % = % = .000003x = 0.1 x = .25 x = .5 x = .75 x = .9 x = 0.1 x = .25 x = .5 x = .75 x = .9

90% .985 .958 .954 .951 .977 .999 .943 .992 .927 195% .995 .980 .984 .979 .994 .999 .978 .999 .964 199% .999 .992 .998 .998 1 1 .995 1 .995 1

Table I: Finite-sample coverage probabilities of confidence intervals for !0(x).

Page 31:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

30

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3x = 0.50, l = 1, ! = ! = .00005

Empirical quantile

N(0,

1) q

uant

ile

−3 −2 −1 0 1 2 3−3

−2

−1

0

1

2

3x = 0.50, l = 2, ! = ! = .000001

Empirical quantile

N(0,

1) q

uant

ile

Figure 1. QQ-plot of:

T/12T (x) (!(x) + !0(x)).

Page 32:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

31

0 100 200 3000

0.05

0.1

0.15

0.2

0.25

long−distance calls (in minutes)

price

per

min

ute

(in d

olla

rs)

Years of education below or equal to 12

0 100 200 3000

0.05

0.1

0.15

0.2

0.25

long−distance calls (in minutes)

price

per

min

ute

(in d

olla

rs)

Years of education above 12

0 100 200 3000

0.05

0.1

0.15

0.2

0.25

long−distance calls (in minutes)

price

per

min

ute

(in d

olla

rs)

Years of education below or equal to 12

0 100 200 3000

0.05

0.1

0.15

0.2

0.25

long−distance calls (in minutes)

price

per

min

ute

(in d

olla

rs)

Years of education above 12

Figure 2. Estimated median structural e!ect for the two education categories:NIVQR (solid line), IVQR (dashed line), 95% confidence intervals (dotted lines).The upper panel uses a Sobolev penalty of order l = 1, the lower panel usesl = 2.

Page 33:  · NONPARAMETRIC INSTRUMENTAL VARIABLE ESTIMATION OF STRUCTURAL QUANTILE EFFECTS VICTOR CHERNOZHUKOV PATRICK GAGLIARDINI OLIVIER SCAILLET Abstract. We study the asymptotic distributi

32

0 100 200 3000

0.05

0.1

0.15

0.2

0.25

long−distance calls (in minutes)

price

per

min

ute

(in d

olla

rs)

Years of education below or equal to 12

0 100 200 3000

0.05

0.1

0.15

0.2

0.25

long−distance calls (in minutes)

price

per

min

ute

(in d

olla

rs)

Years of education above 12

Figure 3. Estimated quartile structural e!ects (solid lines) and mean struc-tural e!ects (dashed lines) for the two education categories, with a Sobolevpenalty of order l = 1.

Massachusetts Institute of Technology;

University of Lugano and Swiss Finance Institute;

University of Geneva and Swiss Finance Institute