Model Estimation and Application Liuren Wu Zicklin School of Business, Baruch College Option Pricing Liuren Wu (Baruch) Estimation and Application Option Pricing 1 / 40


Page 1: Model Estimation and Application - Baruch Collegefaculty.baruch.cuny.edu/lwu/890/ADP_Estimation.pdf · Model Estimation and Application Liuren Wu Zicklin School of Business, Baruch

Model Estimation and Application

Liuren Wu

Zicklin School of Business, Baruch College

Option Pricing

Liuren Wu (Baruch) Estimation and Application Option Pricing 1 / 40


Outline

1 Statistical dynamics

2 Risk-neutral dynamics

3 Joint estimation

4 Applications


Estimating statistical dynamics

Construct the likelihood of the Levy return innovation by Fourier inversion of the characteristic function.

If the model is a Levy process without time change, the maximum likelihood estimation procedure is straightforward.

Given initial guesses on the model parameters that control the Levy triplet (µ, σ, π(x)), derive the characteristic function.

Apply FFT to generate the probability density on a fine grid of possible return realizations. Choose a large N and a large η to generate a fine grid of density values.

Interpolate to generate density values at the observed return values.

Take logs of the densities and sum them.

Numerically maximize the aggregate likelihood to determine the parameter estimates.

Trick: Do as much pre-calculation and pre-processing as you can to speed up the estimation.

Standardizing the data can also help reduce numerical issues.

Example: CGMY, 2002, The Fine Structure of Asset Returns, Journal of Business, 75(2), 305–332.

Liuren Wu, Dampened Power Law, Journal of Business, 2006, 79(3), 1445–1474.
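The estimation steps on this slide can be sketched in Python as follows. This is only an illustration, not the code behind the cited papers: the Merton jump-diffusion characteristic function, the function names, and the grid settings (`n`, `eta`) are assumptions chosen for the example, and the grid is sized for standardized returns of roughly unit scale.

```python
import numpy as np

def density_from_cf(cf, n=2**12, eta=0.25):
    """Invert a characteristic function cf(u) onto a return grid via FFT."""
    lam = 2.0 * np.pi / (n * eta)       # spacing of the return grid
    b = 0.5 * n * lam                   # grid covers [-b, b)
    u = eta * np.arange(n)              # frequency grid u_j = j * eta
    x = -b + lam * np.arange(n)         # return grid x_k = -b + lam * k
    w = np.ones(n)
    w[0] = 0.5                          # trapezoid weight at u = 0
    # f(x_k) = (1/pi) Re sum_j exp(-i u_j x_k) cf(u_j) w_j eta
    dens = np.real(np.fft.fft(np.exp(1j * b * u) * cf(u) * w * eta)) / np.pi
    return x, np.maximum(dens, 1e-300)  # floor keeps the logs finite

def mjd_cf(u, mu, sigma, lam_j, mu_j, sig_j, dt):
    """Characteristic function of a Merton jump-diffusion increment (example model)."""
    jump = lam_j * (np.exp(1j * u * mu_j - 0.5 * sig_j**2 * u**2) - 1.0)
    return np.exp(dt * (1j * u * mu - 0.5 * sigma**2 * u**2 + jump))

def log_likelihood(params, returns, dt=1.0):
    """Sum of log densities, interpolated at the observed returns."""
    x, dens = density_from_cf(lambda u: mjd_cf(u, *params, dt))
    return np.sum(np.log(np.interp(returns, x, dens)))
```

The negative of `log_likelihood` could then be handed to a numerical optimizer such as `scipy.optimize.minimize`. The grid and the trapezoid weights do not depend on the parameters, so they are natural candidates for the pre-calculation the slide recommends.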


Estimating statistical dynamics

The same MLE method can be extended to cases where only the innovation is driven by a Levy process, while the conditional mean and variance can be predicted by observables:

$$dS_t/S_t = \mu(Z_t)\,dt + \sigma(Z_t)\,dX_t,$$

where $X_t$ denotes a Levy process, and $Z_t$ denotes a set of observables that can predict the mean and variance.

Perform Euler approximation:

$$R_{t+\Delta t} = \frac{S_{t+\Delta t} - S_t}{S_t} = \mu(Z_t)\Delta t + \sigma(Z_t)\sqrt{\Delta t}\,(X_{t+\Delta t} - X_t)$$

From the observed return series $R_{t+\Delta t}$, derive a standardized return series,

$$SR_{t+\Delta t} = (X_{t+\Delta t} - X_t) = \frac{R_{t+\Delta t} - \mu(Z_t)\Delta t}{\sigma(Z_t)\sqrt{\Delta t}}$$

Since $SR_{t+\Delta t}$ is generated by the increment of a pure Levy process, we can build the likelihood just like before.

Given the Euler approximation, the exact forms of µ(Z) and σ(Z) do not matter as much.
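The standardization step can be sketched as below. The function name and the convention that `mu` and `sigma` hold the predicted values µ(Z_t) and σ(Z_t) aligned with the price at time t are assumptions made for this illustration.

```python
import numpy as np

def standardized_returns(prices, mu, sigma, dt):
    """Euler-standardized returns SR = (R - mu*dt) / (sigma*sqrt(dt))."""
    prices = np.asarray(prices, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    ret = prices[1:] / prices[:-1] - 1.0          # simple returns R_{t+dt}
    # use the predictors known at time t, hence the [:-1] alignment
    return (ret - mu[:-1] * dt) / (sigma[:-1] * np.sqrt(dt))
```

The resulting series can then be treated as increments of a pure Levy process when building the likelihood.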


Estimating statistical dynamics

dSt/St = µ(Zt)dt + σ(Zt)dXt

When Z is unobservable (such as stochastic volatility, activity rates), the estimation becomes more difficult.

One normally needs some filtering technique to infer the hidden variables Z from the observables.

Maximum likelihood with partial filtering: Alireza Javaheri, Inside Volatility Arbitrage: The Secrets of Skewness.

MCMC Bayesian estimation: Eraker, Johannes, Polson (2003, JF): The Impact of Jumps in Equity Index Volatility and Returns; Li, Wells, Yu (RFS, forthcoming): A Bayesian Analysis of Return Dynamics with Levy Jumps.

GARCH: Use observables (return) to predict un-observable (volatility).

Constructing variance swap rates from options and realized variance from high-frequency returns to make activity rates more observable. Wu, Variance Dynamics: Joint Evidence from Options and High-Frequency Returns.


Estimating statistical dynamics

Wu, Variance Dynamics: Joint Evidence from Options and High-Frequency Returns.

Use index options to replicate variance swap rates, VIX.

Under affine specifications,

$$VIX_t^2 = \frac{1}{h} E^{Q}\left[\int_t^{t+h} v_s\,ds\right] = a(h) + b(h)\,v_t,$$

where (a(h), b(h)) are functions of the risk-neutral v-dynamics.

Solve for $v_t$ from VIX: $v_t = (VIX_t^2 - a(h))/b(h)$.

Build the likelihood on $v_t$ as an observable:

$$dv_t = \mu(v_t)\,dt + \sigma(v_t)\,dX_t$$

Use Euler approximation to solve for the Levy component $X_{t+\Delta t} - X_t$ from $v_t$.

Build the likelihood on the Levy component based on FFT inversion of the characteristic function.

Use high-frequency returns to construct daily realized variance (RV).

Treat RV as a noisy estimator of $v_t$: $RV_t = v_t\Delta t + \text{error}$.

Given $v_t$, build a quasi-likelihood function on the realized variance error.

Future research: Incorporate more observables.


Estimating the risk-neutral dynamics

Nonlinear weighted least squares to fit Levy models to option prices. Daily calibration (Bakshi, Cao, Chen (1997, JF), Carr and Wu (2003, JF)).

The key issue is how to define the pricing error and how to build the weight:

The in-the-money option value is dominated by the intrinsic value, not by the model. At each strike, use the out-of-the-money option: call when K > F and put when K ≤ F.

Pricing errors can be either absolute errors (market minus model) or percentage errors (log(market/model)).

Using absolute errors favors options with higher values (longer maturity, near the money).

Using percentage errors puts more uniform weight across options, but may put too much weight on illiquid options (far out of the money).

Errors can be either in dollar prices or in implied volatilities.

My current choice: Use out-of-the-money option prices to define absolute errors, and use the inverse of vega as weights.
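A minimal sketch of the "current choice" above: pick the out-of-the-money option at each strike and scale the absolute price error by the inverse of vega. The function names, the argument layout, and the use of the Black vega on forward prices with a discount factor `df` are assumptions for illustration.

```python
import numpy as np

def black_vega(F, K, T, sigma, df=1.0):
    """Black vega with respect to volatility, on forward F (assumed form)."""
    d1 = (np.log(F / K) + 0.5 * sigma**2 * T) / (sigma * np.sqrt(T))
    return df * F * np.sqrt(T) * np.exp(-0.5 * d1**2) / np.sqrt(2.0 * np.pi)

def weighted_pricing_errors(F, K, T, iv_mkt, call_mkt, put_mkt, call_mod, put_mod):
    """Vega-weighted absolute errors on out-of-the-money options."""
    K = np.asarray(K, dtype=float)
    use_call = K > F                       # call when K > F, put when K <= F
    mkt = np.where(use_call, call_mkt, put_mkt)
    mod = np.where(use_call, call_mod, put_mod)
    vega = black_vega(F, K, T, np.asarray(iv_mkt, dtype=float))
    return (mkt - mod) / vega              # market minus model, weight = 1/vega

def objective(errors):
    """Sum of squared weighted errors for a least-squares calibration."""
    return float(np.sum(np.asarray(errors) ** 2))
```

Dividing a price error by vega converts it approximately into an implied-volatility error, which is one reason this weighting puts options of different moneyness and maturity on a comparable scale.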


Estimating the risk-neutral dynamics

Sometimes separate calibration per maturity is needed for a simple Levy model (e.g., VG, MJD).

Levy processes with finite variance imply that non-normality dies away quickly with time aggregation.

Model-generated implied volatility smile/smirk flattens out at long maturities.

Separate calibration is necessary to capture smiles at long maturities.

Adding a persistent stochastic volatility process (time change) helps improve the fitting along the maturity dimension.

Daily calibration: Activity rates and model parameters are treated the same, as free parameters.

Dynamically consistent estimation: Parameters are fixed; only activity rates are allowed to vary over time.


Static v. dynamic consistency

Static cross-sectional consistency: Option values across different strikes/maturities are generated from the same model (same parameters) at a point in time.

Dynamic consistency: Option values over time are also generated from the same no-arbitrage model (same parameters).

While most academics and practitioners appreciate the importance of being both cross-sectionally and dynamically consistent, it can be difficult to achieve both while generating good pricing performance. So it comes down to compromises.

Market makers:

Achieving static consistency is sufficient.

Matching market prices is important to provide two-sided quotes.

Long-term convergence traders:

Pricing errors represent trading opportunities.

Dynamic consistency is important for long-term convergence trading.

A well-designed model (with several time-changed Levy components) can achieve both dynamic consistency and good performance.

Fewer parameters (parsimony), more activity rates.


Dynamically consistent estimation

Nested nonlinear least squares (Huang and Wu (2004)): Often has convergence issues.

Cast the model into state-space form and use MLE.

Define the state propagation equation based on the P-dynamics of the activity rates. (Need to specify market price on activity rates, but not on return risks.)

Define the measurement equation based on option prices (out-of-money values, weighted by vega, ...).

Use an extended version of the Kalman filter (EKF, UKF, PKF) to predict/filter the distribution of the states and measurements.

Define the likelihood function based on forecasting errors on the measurement equations.

Estimate model parameters by maximizing the likelihood.


The Classic Kalman filter

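The Kalman filter (KF) generates efficient forecasts and updates under a linear-Gaussian state-space setup:

$$\text{State}: \quad X_{t+1} = F X_t + \sqrt{\Sigma_x}\,\varepsilon_{t+1},$$

$$\text{Measurement}: \quad y_t = H X_t + \sqrt{\Sigma_y}\,e_t$$

The ex ante predictions are

$$\bar X_t = F \hat X_{t-1}; \qquad \bar V_{x,t} = F \hat V_{x,t-1} F^\top + \Sigma_x;$$

$$\bar y_t = H \bar X_t; \qquad \bar V_{y,t} = H \bar V_{x,t} H^\top + \Sigma_y.$$

The ex post filtering updates are

$$K_t = \bar V_{x,t} H^\top \left(\bar V_{y,t}\right)^{-1} = \bar V_{xy,t}\left(\bar V_{y,t}\right)^{-1}, \quad \rightarrow \text{Kalman gain}$$

$$\hat X_t = \bar X_t + K_t\,(y_t - \bar y_t),$$

$$\hat V_{x,t} = \bar V_{x,t} - K_t \bar V_{y,t} K_t^\top = (I - K_t H)\,\bar V_{x,t}$$

The log likelihood is built on the forecasting errors of the measurements,

$$l_t = -\frac{1}{2}\log\left|\bar V_{y,t}\right| - \frac{1}{2}\left((y_t - \bar y_t)^\top \left(\bar V_{y,t}\right)^{-1}(y_t - \bar y_t)\right).$$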
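One prediction-update cycle of the classic filter can be sketched as below. The function name and argument order are assumptions; the log likelihood omits the constant term, as on the slide.

```python
import numpy as np

def kalman_step(x_prev, V_prev, y, F, H, Sx, Sy):
    """One Kalman predict/update step; returns filtered mean, covariance, log-lik."""
    # ex ante predictions
    x_bar = F @ x_prev
    V_bar = F @ V_prev @ F.T + Sx
    y_bar = H @ x_bar
    Vy = H @ V_bar @ H.T + Sy
    # Kalman gain K = V_bar H' Vy^{-1}, computed with a linear solve
    K = np.linalg.solve(Vy, H @ V_bar).T
    # ex post filtering updates
    innov = y - y_bar
    x_hat = x_bar + K @ innov
    V_hat = V_bar - K @ Vy @ K.T
    # log likelihood contribution from the forecasting error
    _, logdet = np.linalg.slogdet(Vy)
    loglik = -0.5 * logdet - 0.5 * innov @ np.linalg.solve(Vy, innov)
    return x_hat, V_hat, loglik
```

Summing `loglik` over all dates gives the objective that is maximized over the model parameters.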


Numerical twists

Kalman filter with fading memory: Replace $\bar V_{x,t}$ with

$$\bar V_{x,t} = \alpha^2 F \hat V_{x,t-1} F^\top + \Sigma_x,$$

with α > 1 (e.g., α ≈ 1.01). This slight increase in $\bar V_{x,t}$ increases the Kalman gain and hence the responsiveness of the states to new observations (versus old observations). This twist allows the filter to forget old observations gradually.

Application: Recursive least squares estimation of the coefficient X in $y_t = H_t X + e_t$. In this case, the coefficient X is the hidden state, and we can estimate it recursively, updating the estimate after each new observation $y_t$: $\hat X_t = \hat X_{t-1} + K_t(y_t - H_t \hat X_{t-1})$, with

$$\bar V_{x,t} = \hat V_{x,t-1}\,\alpha^2,$$

$$K_t = \bar V_{x,t} H_t^\top \left(H_t \bar V_{x,t} H_t^\top + \Sigma_y\right)^{-1},$$

$$\hat V_{x,t} = (I - K_t H_t)\,\bar V_{x,t}.$$
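The recursive least squares application can be sketched as follows for a scalar observation per step. The function name, the diffuse initial covariance `v0`, and the scalar noise variance `sigma_y` are assumptions made for the illustration.

```python
import numpy as np

def rls_fading(H_rows, y, alpha=1.01, sigma_y=1.0, v0=1e4):
    """Recursive least squares for y_t = H_t X + e_t with fading memory."""
    k = H_rows.shape[1]
    x = np.zeros(k)                  # coefficient estimate (the hidden state)
    V = v0 * np.eye(k)               # diffuse initial covariance (assumption)
    path = []
    for h, yt in zip(H_rows, y):
        V_bar = alpha**2 * V                         # inflate: discount old data
        K = V_bar @ h / (h @ V_bar @ h + sigma_y)    # Kalman gain for scalar y_t
        x = x + K * (yt - h @ x)                     # update on forecast error
        V = V_bar - np.outer(K, h) @ V_bar           # (I - K H_t) V_bar
        path.append(x.copy())
    return np.array(path)                            # estimate after each step
```

With `alpha = 1` this reduces to ordinary recursive least squares; a value slightly above one lets the estimate track slowly drifting coefficients.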


The Extended Kalman filter: Linearly approximating the measurement equation

If we specify affine-diffusion dynamics for the activity rates, the state dynamics (X) can be regarded as Gaussian linear, but option prices (y) are not linear in the states:

$$\text{State}: \quad X_{t+1} = F X_t + \sqrt{\Sigma_{x,t}}\,\varepsilon_{t+1},$$

$$\text{Measurement}: \quad y_t = h(X_t) + \sqrt{\Sigma_y}\,e_t$$

One way to use the Kalman filter is by linearly approximating the measurement equation,

$$y_t \approx H_t X_t + \sqrt{\Sigma_y}\,e_t, \qquad H_t = \left.\frac{\partial h(X_t)}{\partial X_t}\right|_{X_t = \bar X_t}$$

It works well when the nonlinearity in the measurement equation is small.

Numerical issues (some are well addressed in the engineering literature):

How to compute the gradient?

How to keep the covariance matrix positive definite?
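Two common answers to these numerical questions are sketched below: a central finite-difference gradient, and the Joseph-form covariance update, which is algebraically equal to the standard update for the optimal gain but keeps the result symmetric and positive semidefinite under rounding. These are generic techniques offered as an illustration, not necessarily the choices made in the lecture; the function names are assumptions.

```python
import numpy as np

def numerical_jacobian(h, x, eps=1e-6):
    """Central-difference Jacobian of h at x, shape (len(h(x)), len(x))."""
    x = np.asarray(x, dtype=float)
    y0 = np.asarray(h(x), dtype=float)
    J = np.empty((y0.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps * max(1.0, abs(x[j]))          # step scaled to the state
        J[:, j] = (np.asarray(h(x + dx)) - np.asarray(h(x - dx))) / (2.0 * dx[j])
    return J

def joseph_update(V_bar, K, H, Sy):
    """Joseph-form covariance update: (I-KH) V (I-KH)' + K Sy K'."""
    A = np.eye(V_bar.shape[0]) - K @ H
    V = A @ V_bar @ A.T + K @ Sy @ K.T
    return 0.5 * (V + V.T)                         # enforce exact symmetry
```

When h is an option pricing function, each Jacobian column costs two pricing calls, so analytical gradients are worth having if the model provides them.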


Approximating the distribution

$$\text{Measurement}: \quad y_t = h(X_t) + \sqrt{\Sigma_y}\,e_t$$

The Kalman filter applies Bayesian rules in updating the conditionally normal distributions.

Instead of linearly approximating the measurement equation $h(X_t)$, we directly approximate the distribution and then apply Bayesian rules on the approximate distribution.

There are two ways of approximating the distribution:

Draw a large number of random numbers, and propagate these random numbers: particle filter. (more generic)

Choose “sigma” points deterministically to approximate the distribution (think of a binomial tree approximating a normal distribution): unscented filter. (faster, easier to implement, and works reasonably well when X follows pure diffusion dynamics)


The unscented transformation

Let k be the number of states. A set of 2k + 1 sigma vectors $\chi_i$ are generated according to:

$$\chi_{t,0} = \hat X_t, \qquad \chi_{t,i} = \hat X_t \pm \left(\sqrt{(k+\delta)\,\hat V_{x,t}}\right)_j, \quad j = 1, \dots, k, \qquad (1)$$

where $(\cdot)_j$ denotes the j-th column of the matrix square root, with corresponding weights $w_i$ given by

$$w_0^m = \delta/(k+\delta), \qquad w_0^c = \delta/(k+\delta) + (1 - \alpha^2 + \beta), \qquad w_i = 1/[2(k+\delta)],$$

where $\delta = \alpha^2(k+\kappa) - k$ is a scaling parameter, α (usually between $10^{-4}$ and 1) determines the spread of the sigma points, κ is a secondary scaling parameter usually set to zero, and β is used to incorporate prior knowledge of the distribution of x. It is optimal to set β = 2 if x is Gaussian.

We can regard these sigma vectors as forming a discrete distribution with $w_i$ as the corresponding probabilities.

Think of sigma points as a trinomial tree v. particle filtering as simulation.

If the state vector does not follow diffusion dynamics and hence can no longer be approximated by a Gaussian, the sigma points may not be enough. Particle filtering is needed.
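The construction in (1) can be sketched as below, using the Cholesky factor as one valid choice of matrix square root. The function name and the default α = 1 are assumptions for the illustration; the default is chosen only to keep the weights well scaled in a small example.

```python
import numpy as np

def sigma_points(x, V, alpha=1.0, beta=2.0, kappa=0.0):
    """Return the 2k+1 sigma points (as rows) with mean and covariance weights."""
    k = x.size
    delta = alpha**2 * (k + kappa) - k
    L = np.linalg.cholesky((k + delta) * V)   # columns are the offsets in (1)
    chi = np.vstack([x, x + L.T, x - L.T])    # row i is sigma point i
    wm = np.full(2 * k + 1, 0.5 / (k + delta))
    wc = wm.copy()
    wm[0] = delta / (k + delta)
    wc[0] = delta / (k + delta) + (1.0 - alpha**2 + beta)
    return chi, wm, wc
```

By construction the weighted sigma points reproduce the mean and covariance they were drawn from, which is the sense in which they form a discrete approximation of the distribution.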


The unscented Kalman filter

State prediction:

$$\chi_{t-1} = \left[\hat X_{t-1},\; \hat X_{t-1} \pm \sqrt{(k+\delta)\,\hat V_{x,t-1}}\right], \quad \text{(draw sigma points)}$$

$$\chi_{t,i} = F \chi_{t-1,i},$$

$$\bar X_t = \sum_{i=0}^{2k} w_i^m \chi_{t,i},$$

$$\bar V_{x,t} = \sum_{i=0}^{2k} w_i^c\,(\chi_{t,i} - \bar X_t)(\chi_{t,i} - \bar X_t)^\top + \Sigma_x. \qquad (2)$$

Measurement prediction:

$$\chi_t = \left[\bar X_t,\; \bar X_t \pm \sqrt{(k+\delta)\,\bar V_{x,t}}\right], \quad \text{(re-draw sigma points)}$$

$$\zeta_{t,i} = h(\chi_{t,i}),$$

$$\bar y_t = \sum_{i=0}^{2k} w_i^m \zeta_{t,i}. \qquad (3)$$

Redrawing the sigma points in (3) incorporates the effect of the process noise $\Sigma_x$.

Since the state propagation equation is linear in our specification, we can replace the state prediction step in (2) by the Kalman filter and only draw sigma points in (3).


The unscented Kalman filter

Measurement update:

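$$\bar V_{y,t} = \sum_{i=0}^{2k} w_i^c\,[\zeta_{t,i} - \bar y_t][\zeta_{t,i} - \bar y_t]^\top + \Sigma_y,$$

$$\bar V_{xy,t} = \sum_{i=0}^{2k} w_i^c\,[\chi_{t,i} - \bar X_t][\zeta_{t,i} - \bar y_t]^\top,$$

$$K_t = \bar V_{xy,t}\left(\bar V_{y,t}\right)^{-1},$$

$$\hat X_t = \bar X_t + K_t\,(y_t - \bar y_t),$$

$$\hat V_{x,t} = \bar V_{x,t} - K_t \bar V_{y,t} K_t^\top.$$

One can also use the square-root UKF to increase the numerical precision and to maintain the positive definiteness of the covariance matrix.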
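The sketch below puts the unscented filter steps together for the case noted in the lecture where the state equation is linear, so the state prediction uses the Kalman formulas and sigma points are drawn only for the measurement step. Function names, the Cholesky square root, and the default α = 1 are assumptions for the illustration.

```python
import numpy as np

def sigma_points(x, V, alpha=1.0, beta=2.0, kappa=0.0):
    """Sigma points (rows) and weights; Cholesky used as the matrix square root."""
    k = x.size
    delta = alpha**2 * (k + kappa) - k
    L = np.linalg.cholesky((k + delta) * V)
    chi = np.vstack([x, x + L.T, x - L.T])
    wm = np.full(2 * k + 1, 0.5 / (k + delta))
    wc = wm.copy()
    wm[0] = delta / (k + delta)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    return chi, wm, wc

def ukf_step(x_prev, V_prev, y, F, h, Sx, Sy):
    """One UKF cycle with linear state dynamics and nonlinear measurement h."""
    # state prediction: linear, so use the Kalman filter formulas
    x_bar = F @ x_prev
    V_bar = F @ V_prev @ F.T + Sx
    # measurement prediction: draw sigma points around the predicted state
    chi, wm, wc = sigma_points(x_bar, V_bar)
    zeta = np.array([h(c) for c in chi])
    y_bar = wm @ zeta
    # measurement update
    dz = zeta - y_bar
    dx = chi - x_bar
    Vy = (wc[:, None] * dz).T @ dz + Sy
    Vxy = (wc[:, None] * dx).T @ dz
    K = np.linalg.solve(Vy, Vxy.T).T          # K = Vxy Vy^{-1}
    x_hat = x_bar + K @ (y - y_bar)
    V_hat = V_bar - K @ Vy @ K.T
    return x_hat, V_hat
```

A useful sanity check is that with a linear measurement function the output coincides with the classic Kalman filter.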


The Square-root UKF

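Let $S_t$ denote the Cholesky factor of $V_t$ such that $V_t = S_t S_t^\top$.

State prediction:

$$\chi_{t-1} = \left[\hat X_{t-1},\; \hat X_{t-1} \pm \sqrt{(k+\delta)}\,\hat S_{x,t-1}\right], \quad \text{(draw sigma points)}$$

$$\chi_{t,i} = F \chi_{t-1,i},$$

$$\bar X_t = \sum_{i=0}^{2k} w_i^m \chi_{t,i},$$

$$\bar S_{x,t} = \mathrm{qr}\left\{\left[\sqrt{w_1^c}\,(\chi_{t,1:2k} - \bar X_t) \quad \sqrt{\Sigma_x}\right]\right\},$$

$$\bar S_{x,t} = \mathrm{cholupdate}\left\{\bar S_{x,t},\; \chi_{t,0} - \bar X_t,\; w_0^c\right\}. \qquad (4)$$

Measurement prediction:

$$\chi_t = \left[\bar X_t,\; \bar X_t \pm \sqrt{(k+\delta)}\,\bar S_{x,t}\right], \quad \text{(re-draw sigma points)}$$

$$\zeta_{t,i} = h(\chi_{t,i}), \qquad \bar y_t = \sum_{i=0}^{2k} w_i^m \zeta_{t,i}. \qquad (5)$$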


The Square-root UKF

Measurement update:

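$$\bar S_{y,t} = \mathrm{qr}\left\{\left[\sqrt{w_1^c}\,(\zeta_{t,1:2k} - \bar y_t) \quad \sqrt{\Sigma_y}\right]\right\},$$

$$\bar S_{y,t} = \mathrm{cholupdate}\left\{\bar S_{y,t},\; \zeta_{t,0} - \bar y_t,\; w_0^c\right\},$$

$$\bar V_{xy,t} = \sum_{i=0}^{2k} w_i^c\,[\chi_{t,i} - \bar X_t][\zeta_{t,i} - \bar y_t]^\top,$$

$$K_t = \left(\bar V_{xy,t} / \bar S_{y,t}^\top\right) / \bar S_{y,t},$$

$$\hat X_t = \bar X_t + K_t\,(y_t - \bar y_t),$$

$$U = K_t \bar S_{y,t},$$

$$\hat S_{x,t} = \mathrm{cholupdate}\left\{\bar S_{x,t},\; U,\; -1\right\}.$$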


Joint estimation of P and Q dynamics

Pan (2002, JFE): GMM. Choosing moment conditions becomes increasingly difficult with an increasing number of parameters.

Eraker (2004, JF): Bayesian with MCMC. Choose 2-3 options per day. Throw away lots of cross-sectional (Q) information.

Bakshi & Wu (2010, MS), “The Behavior of Risk and Market Prices of Risk over the Nasdaq Bubble Period”: MLE with filtering.

Cast activity rate P-dynamics into the state equation, and cast option prices into the measurement equation.

Use UKF to filter out the mean and covariance of the states and measurement.

Construct the likelihood function of options based on forecasting errors (from UKF) on the measurement equations.

Given the filtered activity rates, construct the conditional likelihood on the returns by FFT inversion of the conditional characteristic function.

The joint log likelihood equals the sum of the log likelihood of option pricing errors and the conditional log likelihood of stock returns.


Existing issues

When the state dynamics are discontinuous, the performance of UKF can deteriorate. Using a particle filter instead increases the computational burden dramatically.

Inconsistency regarding the current state of the state vector.

When we price options, we derive the option values (Fourier transforms) as a function of the state vector.

In model estimation, the value of the state vector is never known; we only know its distribution (characterized by the sigma points).

The two components (pricing and model estimation) are inherently inconsistent with each other.

Several papers try to use incomplete information to explain the correlation between credit events and their arrival intensity.

Conditional on incomplete information about the state vector (the current state of the state vector is not known in full), how should the option valuation adjust?


Why are we doing this?

Go back to the overview slides:

1 Inferring expectations and risk preferences.

The literature on this line of research is very long...

You can tell a more detailed story with options than with the primary securities alone.

2 Interpolation and extrapolation:

Market makers normally perform frequent calibration of a simpler model. The calibration could even be on each maturity separately.

Data vendors (such as Bloomberg, Markit): This is a rapidly growing business. Populating quotes using models for markets without valid quotes.

3 Alpha generation: Statistical arbitrage based on no-arbitrage models

Normally needs a dynamically consistent way of estimating the model.

The model can be more rigid but needs to be stable.

I’ll give some examples on this line of application.


Statistical arbitrage based on no-arbitrage models

Example I: Statistical arbitrage on interest rate swaps based on no-arbitrage dynamic term structure models.

Reference: Bali, Heidari, and Wu, Predictability of Interest Rates and Interest-Rate Portfolios, Journal of Business and Economic Statistics, 2009, 27(4), 517-527.

This is not on options or an option pricing model, but it is one of the first papers that provide detailed steps of such an application.

Basic idea:

Interest rates across different maturities are related.

A dynamic term structure model provides a (smooth) functional form for this relation that excludes arbitrage.

The model usually consists of specifications of risk-neutral factor dynamics (X) and the short rate as a function of the factors, e.g., $r_t = a_r + b_r^\top X_t$.

Nothing about the forecasts: The “risk-neutral dynamics” are estimated to match historical term structure shapes.

A model is well-specified if it can fit most of the term structure shapes reasonably well.


A 3-factor affine model

with adjustments for discrete Fed policy changes:

Pricing errors on USD swap rates in bps

Maturity    Mean    MAE    Std    Auto     Max    R2 (%)
2 y         0.80   2.70   3.27    0.76   12.42    99.96
3 y         0.06   1.56   1.94    0.70    7.53    99.98
5 y        -0.09   0.68   0.92    0.49    5.37    99.99
7 y         0.08   0.71   0.93    0.52    7.53    99.99
10 y       -0.14   0.84   1.20    0.46    8.14    99.99
15 y        0.40   2.20   2.84    0.69   16.35    99.90
30 y       -0.37   4.51   5.71    0.81   22.00    99.55

Superb pricing performance: R-squared is greater than 99%. The maximum pricing error is 22 bps.

Pricing errors are transient compared to swap rates (autocorrelation 0.99):

The average half life of the pricing errors is 3 weeks.

The average half life of swap rates is 1.5 years.


Investing in interest rate swaps based on dynamic term structure models

If you can forecast interest rate movements,

Long swap if you think rates will go down.

Forget about dynamic term structure models: They do not help your interest rate forecasting.

If you cannot forecast interest rate movements (it is hard), use the dynamic term structure model not for forecasting, but as a decomposition tool:

$$y_t = f(X_t) + e_t$$

What the model captures ($f(X_t)$) is the persistent component.

What the model misses (the pricing error e) is the more transient and hence more predictable component.

Form swap-rate portfolios that

neutralize their first-order dependence on the persistent factors.

only vary with the transient residual movements.

Result: The portfolios are strongly predictable, even though the individual interest-rate series are not.


Static arbitrage trading based on no-arbitrage dynamic term structure models

For a three-factor model, we can form a 4-swap rate portfolio that has zero exposure to the factors.

The portfolio should have duration close to zero.

No systematic interest rate risk exposure.

The fair value of the portfolio should be relatively flat over time.

The variation of the portfolio’s market value is mainly induced by short-term liquidity shocks...

Long/short the swap portfolio based on its deviation from the fair model value.

Provide liquidity where the market needs it and receive a premium for doing so.
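The zero-exposure portfolio can be sketched as a null-space computation: with k factors and k + 1 swap rates, the weights orthogonal to all factor loadings are unique up to scale. The function name, the loading matrix `B` (sensitivities of each swap rate to each factor), and the normalization are assumptions made for this illustration.

```python
import numpy as np

def factor_neutral_weights(B):
    """Weights w with B.T @ w = 0 for loadings B of shape (k + 1, k)."""
    # Full SVD of the (k, k+1) matrix B.T: the last right-singular vector
    # spans its null space when B has full column rank.
    _, _, vt = np.linalg.svd(B.T)
    w = vt[-1]
    return w / np.abs(w).sum()        # normalize gross weights to one
```

With first-order factor exposure removed, the portfolio value moves mainly with the pricing errors of the four swap rates.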


The time-series of 10-year USD swap rates

Hedged (left) v. unhedged (right)

[Figure: two time-series panels, Jan96 to Jan02. Left panel: interest rate portfolio, bps. Right panel: 10-year swap, %.]

It is much easier to predict the hedged portfolio (left panel) than the unhedged swap contract (right panel).


Back-testing results from a simple investment strategy

95-00: In sample. Holding each investment for 4 weeks.


Caveats

Convergence takes time: We take a 4-week horizon.

Accurate hedging is vital for the success of the strategy. The model needs to be estimated with dynamic consistency:

Parameters are held constant. Only state variables vary.

Appropriate model design is important: parsimony, stability, adjustment for some calendar effects.

The model not only needs to generate mean-reverting pricing errors, but also needs to generate stable risk sensitivities. The swap portfolio constructed based on the model must be really free of market risk.

Daily fitting of a simpler model (with daily varying parameters) is dangerous.

Spread trading (one factor) generates low Sharpe ratios.

Butterfly trading (2 factors) is also not guaranteed to succeed.


Another example: Trading the linkages between sovereign CDS and currency options

When a sovereign country’s default concern (over its foreign debt) increases, the country’s currency tends to depreciate, and currency volatility tends to rise.

“Money as stock” corporate analogy.

Observation: Sovereign credit default swap spreads tend to move positively with the currency’s

option implied volatilities (ATMV): A measure of the return volatility.

risk reversals (RR): A measure of distributional asymmetry.


Co-movements between CDS and ATMV/RR

[Figure: four time-series panels, Dec01 to Mar05. Mexico and Brazil: CDS spread (%) with the implied volatility factor (%). Mexico and Brazil: CDS spread (%) with the risk reversal factor (%).]


A no-arbitrage model that prices CDS and currency options

Reference: Carr & Wu, Theory and Evidence on the Dynamic Interactions Between Sovereign Credit Default Swaps and Currency Options, Journal of Banking and Finance, 2007, 31(8), 2383-2403.

Model specification:

At normal times, the currency price (dollar price of a local currency, say peso) follows a diffusive process with stochastic volatility.

When the country defaults on its foreign debt, the currency price jumps by a large amount.

The arrival rate of sovereign default is also stochastic and correlated with the currency return volatility.

Under these model specifications, we can price both CDS and currency options via no-arbitrage arguments. The pricing equations are tractable. Numerical implementation is fast.

Estimate the model with dynamic consistency: Each day, three things vary: (i) currency price (both diffusive moves and jumps), (ii) currency volatility, and (iii) default arrival rate.

All model parameters are fixed over time.


The hedged portfolio of CDS and currency options

Suppose we start with an option contract on the currency. We need four other instruments to hedge the risk exposure of the option position:

1 The underlying currency to hedge infinitesimal movements in the exchange rate.

2 A risk reversal (out-of-the-money option) to hedge the impact of default on the currency value.

3 A straddle (at-the-money option) to hedge the currency volatility movement.

4 A CDS contract to hedge the default arrival rate variation.

The portfolio needs to be rebalanced over time to remain neutral to the risk factors.

The value of the hedged portfolio is much more transient than volatilities or CDS spreads.


Back-testing results


A more generic setup

Let $y_t \in \mathbb{R}^N$ denote a vector of observed derivative prices, and let $h(X_t)$ denote the model-implied value as a function of the state vector $X_t \in \mathbb{R}^k$. Let $H_t = \partial h(X_t)/\partial X_t \in \mathbb{R}^{N \times k}$ denote the gradient matrix at time t.

$-e_t = h(X_t) - y_t$ can be regarded as the “alpha” of the asset (by assuming that it will go to zero down the road) and $H_t$ as its risk exposure.

We can solve the following quadratic program:

$$\max_{w_t}\; -w_t^\top e_t - \frac{1}{2}\gamma\, w_t^\top V_e w_t$$

subject to factor exposure constraints:

$$H_t^\top w_t = c \in \mathbb{R}^k.$$

The above program maximizes the expected return (alpha) of the portfolio subject to factor exposure constraints.

Setting c = 0 maintains (first order) factor-neutrality.

One can also take on factor exposures to obtain factor risk premium.

Caveat: Special care is needed for factor neutrality when the state vector can jump.
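Because the program is an equality-constrained quadratic, it can be solved in one linear solve of the first-order (KKT) conditions, γ V_e w + H λ = −e and Hᵀ w = c. The sketch below assumes the risk penalty is (γ/2) wᵀ V_e w with V_e positive definite and H of full column rank; the function name is hypothetical.

```python
import numpy as np

def alpha_portfolio(e, H, Ve, gamma, c=None):
    """Solve max_w -w'e - (gamma/2) w'Ve w  subject to  H'w = c."""
    n, k = H.shape
    c = np.zeros(k) if c is None else np.asarray(c, dtype=float)
    # KKT system: [gamma*Ve  H] [w     ]   [-e]
    #             [H'        0] [lambda] = [ c]
    kkt = np.block([[gamma * Ve, H], [H.T, np.zeros((k, k))]])
    rhs = np.concatenate([-np.asarray(e, dtype=float), c])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n]                     # portfolio weights; sol[n:] are multipliers
```

Leaving `c` at its default gives the factor-neutral portfolio; a nonzero `c` takes on the stated factor exposures.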


Bottom line

If you have a working crystal ball, others’ risks become your opportunities.

Forget about no-arbitrage models; lift the market.

No-arbitrage type models become useful when

You cannot forecast the future accurately: Risk persists.

Hedge risk exposures, or

Take a controlled exposure to certain risk sources and receive risk premiums accordingly.

Understanding risk premium behavior is useful not just for academic papers.

Perform statistical arbitrage trading on derivative products that profit from short-term market dislocations.

Caveat: When hedging is off, risk can overwhelm profit opportunities.
