kevin sheppard :// · volatility overview what is volatility? why does it change? what are arch,...

Univariate Volatility Modeling

Kevin Sheppardhttp://www.kevinsheppard.com

Oxford MFEThis version: February 2, 2018

February 5 – 6, 2018

Volatility Overview

� What is volatility?� Why does it change?� What are ARCH, GARCH, TARCH, EGARCH, SWARCH, ZARCH, APARCH,STARCH, etc. models?

� What does time-varying volatility look like?� What are the basic properties of ARCH and GARCH models?� What is the news impact curve?� How are the parameters of ARCH models estimated? What aboutinference?

� Twists on the standard model� Forecasting conditional variance� Realized Variance� Implied Volatility

2 / 40

Model Building

� ARCH and GARCH models are essentially ARMA modelsI Box-Jenkins Methodology

– Parsimony principle

Steps:1. Inspect the ACF and PACF of ε2t

ε2t = ω + (α+ β)ε2t−1 − βνt−1 + νt

– ACF indicates α (or ARCH of any kind)– PACF indicates β

2. Build initial model based on these observation3. Iterate between model and ACF/PACF of e2t =

ε2tσ2t

3 / 40

S&P 500 ε2t ACF/PACFSquared Residuals ACF Squared Residuals PACF

10 20 30 400.0

0.1

0.2

0.3

10 20 30 400.0

0.1

0.2

0.3

Standardized Squared Residuals ACF Standardized Squared Residuals PACF

10 20 30 400.0

0.1

0.2

0.3

10 20 30 400.0

0.1

0.2

0.3

4 / 40

WTI ε2t ACF/PACFSquared Residuals ACF Squared Residuals PACF

10 20 30 40

0.0

0.1

0.2

10 20 30 40

0.0

0.1

0.2

Standardized Squared Residuals ACF Standardized Squared Residuals PACF

10 20 30 40

0.0

0.1

0.2

10 20 30 40

0.0

0.1

0.2

5 / 40

How I built a model for the S&P 500

α1 α2 γ1 γ2 β1 β2 LL

GARCH(1,1) 0.099(0.000)

0.890(0.000)

−7374.3

GARCH(1,2) 0.099(0.000)

0.890(0.000)

0.000(0.999)

−7374.3

GARCH(2,1) 0.057(0.003)

0.063(0.023)

0.865(0.000)

−7368.1

GJR-GARCH(1,1,1) 0.000(0.999)

0.171(0.000)

0.901(0.000)

−7261.6

GJR-GARCH(1,2,1) 0.000(0.999)

0.147(0.000)

0.030(0.507)

0.897(0.000)

−7261.1

TARCH(1,1,1)? 0.000(0.999)

0.158(0.000)

0.918(0.000)

−7231.2

TARCH(1,2,1) 0.000(0.999)

0.157(0.000)

0.001(0.971)

0.918(0.000)

−7231.2

TARCH(2,1,1) 0.000(0.999)

0.011(0.728)

0.156(0.000)

0.909(0.000)

−7230.0

EGARCH(1,0,1) 0.201(0.000)

0.981(0.000)

−7381.3

EGARCH(1,1,1) 0.121(0.000)

−0.137(0.000)

0.980(0.000)

−7238.5

EGARCH(1,2,1) 0.114(0.000)

−0.203(0.000)

0.073(0.016)

0.982(0.000)

−7232.4

EGARCH(2,1,1) 0.017(0.652)

0.122(0.006)

−0.146(0.000)

0.976(0.000)

−7230.2

6 / 40

Testing for (G)ARCH

� ARCH is autocorrelation in ε2tI All ARCH processes have this, whether GARCH or EGARCH or otherI ARCH-LM testI Directly test for autocorrelation:

ε2t = φ0 + φ1ε2t−1 + . . .+ φPε

2t−P + ηt

I H0 : φ1 = φ2 = . . . = φP = 0I T × R2 d→ χ2PI Standard LM test from a regression.I More powerful test: Fit an ARCH(P) modelI The forbidden hypothesis

σ2t = ω + α1ε2t−1 + β1σ

2t−1

H0 : α1 = 0, H1α > 0

7 / 40

Forecasting: ARCH(1)

� Simple ARCH model

εt ∼ N(0, σ2t )

σ2t = ω + α1ε2t−1

I 1-step ahead forecast is known todayI All ARCH-family models have this property

εt ∼N(0, σ2t )

σ2t =ω + α1ε2t−1

Et[σ2t+1] =Et[ω + α1ε2t ]

=ω + α1ε2t

I Note: Et[ε2t+1] = Et[e2t+1σ2t+1] = σ2t+1Et[e

2t+1] = σ2t+1

I Further:Et[ε2t+h] = Et[Et+h−1[e2t+hσ

2t+h]] = Et[Et+h−1[e2t+h]σ

2t+h] = Et[σ2t+h]

8 / 40

Forecasting: ARCH(1)

� 2-step ahead

Et[σ2t+2] = Et[ω + α1ε2t+1]

=ω + α1Et[ε2t+1]

=ω + α1(ω + α1ε2t )

=ω + α1ω + α21ε2t

I h-step ahead forecast

Et[σ2t+h] =

h−1∑i=0

αi1ω + αh1ε2t

I Just the AR(1) forecasting formula

– Why?

9 / 40

Forecasting: GARCH(1,1)

� 1-step ahead

Et[σ2t+1] = Et[ω + α1ε2t + β1σ

2t ]

= ω + α1ε2t + β1σ

2t

� 2-step ahead

Et[σ2t+2] = Et[ω + α1ε2t+1 + β1σ

2t+1]

= ω + α1Et[ε2t+1] + β1Et[σ2t+1]

= ω + α1Et[e2t+1σ2t+1] + β1Et[σ2t+1]

= ω + α1Et[e2t+1]σ2t+1 + β1Et[σ2t+1]

= ω + α1 · 1 · σ2t+1 + β1Et[σ2t+1]

= ω + α1σ2t+1 + β1Et[σ2t+1]

= ω + (α1 + β1)σ2t+1

10 / 40

Forecasting: GARCH(1,1)

� h-step ahead

Et[σ2t+h] =

h−1∑i=0

(α1 + β1)iω + (α1 + β1)

h−1(α1ε2t + β1σ

2t )

� Also essentially an AR(1), technically ARMA(1,1)

11 / 40

Forecasting: TARCH(1,0,0)

� This one is a messI Nonlinearities cause problems

– All ARCH-family models are nonlinear, but some are linearity in ε2t– Others are not

σt = ω + α1|εt−1|I Forecast for t + 1 is known at time t

– Always, always, always, . . .

Et[σ2t+1] = Et[(ω + α1|εt|)2]

= Et[ω2 + 2ωα1|εt|+ α21ε2t ]

= ω2 + 2ωα1Et[|εt|] + α21Et[ε2t ]

= ω2 + 2ωα1|εt|+ α21ε2t

12 / 40

TARCH(1,0,0) continued...

� Multi-step is less straightforward

Et[σ2t+2] = Et[(ω + α1|εt+1|)2]

= Et[ω2 + 2ωα1|εt+1|+ α21ε2t+1]

= ω2 + 2ωα1Et[|εt+1|] + α21Et[ε2t+1]

= ω2 + 2ωα1Et[|et+1|σt+1] + α21Et[e2t σ

2t+1]

= ω2 + 2ωα1Et[|et+1|]Et[σt+1] + α21Et[e2t ]Et[σ

2t+1]

= ω2 + 2ωα1Et[|et+1|](ω + α1|εt|) + α21 · 1 · (ω2 + 2ωα1|εt|+ α21ε2t )

If et+1 ∼ N(0, 1), E[|et+1|] =√

2π

Et[σ2t+2] = ω2 + 2ωα1

√2π

(ω + α1|εt|) + α21(ω2 + 2ωα1|εt|+ α21ε

2t )

13 / 40

Assessing forecasts: Augmented MZ

� Start from Et[r2t+h] ≈ σ2t+h|tI Standard Augmented MZ regression:

ε2t+h − σ2t+h|t = γ0 + γ1σ2t+h|t + γ2z1t + . . .+ γK+1zKt + ηt

I ηt is heteroskedastic in proportion to σ2t : Use GLS.I An improved GMZ regression (GMZ-GLS)

ε2t+h − σ2t+h|tσ2t+h|t

= γ01

σ2t+h|t+ γ11+ γ2

z1tσ2t+h|t

+ . . .+ γK+1zKtσ2t+h|t

+ νt

I Better to use Realized Variance to evaluate forecasts

RVt+h − σ2t+h|t = γ0 + γ1σ2t+h|t + γ2z1t + . . .+ γK+1zKt + ηt

I Also can use GLS versionI Both RVt+h and ε2t+h are proxies for the variance at t + h

– RV is just better, often 10×+ more precise

14 / 40

Assessing forecasts: Diebold-Mariano

� Relative forecast performanceI MSE loss

δt =(ε2t+h − σ2A,t+h|t

)2−(ε2t+h − σ2B,t+h|t

)2I H0 : E[δt] = 0, HA

1 : E[δt] < 0, HB1 : E[δt] > 0

ˆδ = R−1R∑r=1

δr

I Standard t-test, 2-sided alternativeI Newey-West covariance always neededI Better DM using “Q-Like” loss (Normal log-likelihood “Kernel”)

δt =

(ln(σ2A,t+h|t) +

ε2t+hσ2A,t+h|t

)−

(ln(σ2B,t+h|t) +

ε2t+hσ2B,t+h|t

)I Patton & Sheppard (2008)

15 / 40

Realized Variance

Realized Variance

� Variance measure computed using ultra-high-frequency data (UHF)I Uses all available information to estimate the variance over someperiod

– Usually 1 day

I Variance estimates from RV can be treated as “observable”– Standard ARMA modeling– Variance estimates are consistent– Asymptotically unbiased– Variance converges to 0 as the number of samples increases

I Problems arise when applied to market data

– Noise (bid-ask bounce)– Market closure– Prices discrete– Prices not continuously observable– Data quality

16 / 40

Realized Variance

� AssumptionsI Log-prices are generated by an arbitrage-free semi-martingale

– Prices are observable– Prices can be sampled often

I Defined

RV (m)t =

m∑i=1

(pi,t − pi−1,t

)2=

m∑i=1

r2i,t.

– m-sample Realized Variance– pi,t is the ith log-price on day t– ri,t is the ith return on day t

I Only uses information on day t to estimate the variance on day tI Consistent estimator of the integrated variance∫ t+1

tσ2s ds

I “Total variance” on day t

17 / 40

Why Realized Variance Works

� Consider a simple Brownian motion

dpt = µ dt + σ dWt

I m-sample Realized Variance

RV (m)t =

m∑i=1

r2i,t

I Returns are IID normal

ri,ti.i.d.∼ N

(µ

m,σ2

m

)I Nearly unbiased

E[RV (m)

t

]=µ2

m+ σ2

I Variance close to 0

V[RV (m)

t

]= 4

µ2σ2

m2 + 2σ4

m18 / 40

Why Realized Variance Works

� Works for models with time-varying drift and stochastic volatility

dpt = µt dt + σt dWt

I No arbitrage imposes some restrictions on µtI Works with price processes with jumpsI In the general case:

RV (m)t

p→∫ 1

0σ2s ds+

N∑n=1

J2n

I Jn are jumps

19 / 40

Why Realized Variance Doesn’t Work� Multiple prices at the same time

I Define the price as the average share price (volume weighted price)I Use simple average or medianI Not a problem

� Prices only observed on a discrete gridI $.01 or £.0025I Nothing can be doneI Small problem

� Data qualityI UHF price data is generally messyI TyposI Wrong time-stampsI Pre-filter to remove obvious errorsI Often remove “round trips”

� No price available at some point in timeI Use the last observed price: last price interpolationI Averaging prices before and after leads to bias

20 / 40

Solutions to bid-ask bounce type noise

� Bid-ask bounce is a big problemI Simple model with “pure” noise

pi,t = p∗i,t + νi,t

– pi,t is the observed price with noise– p∗i,t is the unobserved efficient price– νi,t is the noise

I Easy to showri,t = r∗i,t + ηi,t

– r∗i,t is the unobserved efficient return– ηi,t = νi,t − νi−1,t is a MA(1) error

I RV is badly biasedRV (m)

t ≈ RV t +mτ 2

– Bias is increasing in m– Variance is also increasing in m

21 / 40

Simple solution

� Do not sample frequentlyI 5-30 minutes

– Better than daily but still inefficient

I Remove MA(1) by filtering

– ηi,t is an MA(1)– Fit an MA(1) to observed returns

ri,t = θεi−1,t + εi,t

– Use fit residuals εi,t to compute RV– Generally biased downward

I Use mid-quotes

– A little noise– My usual solution

22 / 40

A modified Realized Variance estimator: RVAC1

� Best solution is to use a modified RV estimatorI RVAC1

RVAC1(m)t =

m∑i=1

r2i,t + 2m∑i=2

ri,tri−1,t

I Adds a term to RV to capture the MA(1) noiseI Looks like a simple Newey-West estimatorI Unbiased in pure noise modelI Not consistentI Realized Kernel Estimator

– Adds more weighted cross-products– Consistent in the presence of many realistic noise processes– Fairly easy to implement

23 / 40

One final problem

� Market closureI Markets do not operate 24 hours a day (in general)I Add in close-to-open return squared

RV (m)t = r2CtO,t +

m∑i=1

r2i,t

– rCtO,t = pOpen,t − pClose,t−1I Compute a modified RV by weighting the overnight and open hourestimates differently

RV(m)t = λ1r2CtO,t + λ2RV

(m)t

24 / 40

The volatility signature plot

� Hard to know how often to sampleI Visual inspection may be useful

Definition (Volatility Signature Plot)

The volatility signature plot displays the time-series average of RealizedVariance

RV (m)t = T−1

T∑t=1

RV (m)t

as a function of the number of samples, m. An equivalent representationdisplays the amount of time, whether in calendar time or tick time(number of trades between observations) along the X-axis.

25 / 40

Some empirical results

� S&P 500 Depository ReceiptsI SPiDeRsI AMEX: SPYI Exchange Traded FundI Ultra-liquid

– 100M shares per day– Over 100,000 trades per day– 23,400 seconds in a typical trading day

I January 1, 2006 - July 31, 2008I Filtered by daily High-Low dataI Some cleaning of outliers

26 / 40

SPDR Realized Variance (RV )RV , 15 seconds RV , 1 minute

2008 2012 2016

40

80

120

2008 2012 2016

40

80

120

RV , 5 minutes RV , 15 minutes

2008 2012 2016

40

80

120

2008 2012 2016

40

80

120

27 / 40

SPDR Realized Variance (RVAC1)RVAC1, 15 seconds RVAC1, 1 minute

2008 2012 2016

40

80

120

2008 2012 2016

40

80

120

RVAC1, 5 minutes RVAC1, 15 minutes

2008 2012 2016

40

80

120

2008 2012 2016

40

80

120

28 / 40

Volatility Signature PlotsVolatility Signature Plot for SPDR RV

1s 5s 15s 30s 1m 2m 5m 10m 30m14

16

18

20

22

Volatility Signature Plot for SPDR RVAC1

1s 5s 15s 30s 1m 2m 5m 10m 30m14

15

16

17

29 / 40

Bitcoin Realized Variance1-minute RV

Jul Jul Jul0

100

200

300

5-minute RV

Jan2016

Jan2017

Jan2018

Jul Jul Jul0

100

200

300

30 / 40

Modeling Realized Variance

� Two choices� Treat volatility as observable and model as ARMA

I Really simply to doI Forecasts are equally simpleI Theoretical motivation why RV may be well modeled by an ARMA(P,1)

� Continue to treat volatility as latent and use ARCH-type modelI Realized Variance is still measured with errorI A more precise measure of conditional variance that daily returnssquared, r2t , but otherwise similar

31 / 40

Treating RV as observable

� Main model used is a Heterogeneous Autoregression� Restricted AR(22) in levels

RVt = φ0 + φ1RVt−1 + φ5RV5,t−1 + φ22RV22,t−1 + εt

� Or in logs

lnRVt = φ0 + φ1 lnRVt−1 + φ5 lnRV5,t−1 + φ22 lnRV22,t−1 + εt

where RV j,t−1 = j−1∑j

i=1 RVt−i is a j lag moving average� Model picks up volatility changes at the daily, weekly, and monthlyscale

� Fits and forecasts RV fairly wellI MA term may still be needed

32 / 40

Leaving RV latent

� Alternative if to treat RV as latent and use a non-negativemultiplicative error model (MEM)

� MEMs specify the mean of a process as µt × ψt where ψt is a mean 1shock.

� A χ21 is a natural choice here� ARCH models are special cases of a non-negative MEM model� Easy to model RV using existing ARCH mdoes

1. Construct rt = sign (rt)√RVt

2. Use standard ARCH model building to construct a model for rt

σ2t = ω + α1r2t−1 + γ1r2t−1I[rt−1<0] + β1σ2t−1

becomes

σ2t = ω + α1RVt−1 + γ1RVt−1I[rt−1<0] + β1σ2t−1

33 / 40

Implied Variance

Implied Volatility and VIX

� Implied volatility is very different from ARCH and Realized measures� Market based: Level of volatility is calculated from options prices� Forward looking: Options depend on future price path� “Classic” implied relies on the Black-Scholes pricing formula� “Model free” implied volatility exploits a relationship between thesecond derivative of the call price with respect to the strike and therisk neutral measure

� VIX is a Chicago Board Options Exchange (CBOE) index based on amodel free measure

� Allows volatility to be directly traded

34 / 40

Black-Scholes Implied Volatility

� Black-Scholes Options Pricing� Prices follow a geometric Brownian Motion

dSt = µStdt + σStdWt

� Constant drift and volatility� Price of a call is

C(T ,K) = SΦ(d1) + Ke−rTΦ(d2)

where

d1 =ln (S/K) +

(r + σ2/2

)T

σ√T

d2 =ln (S/K) +

(r − σ2/2

)T

σ√T

.

� Can invert to produce a formula for the volatility given the call priceC(T ,K)

σImpliedt = g (Ct(T ,K), St,K,T , r)

35 / 40

BSIV against Moneyness for SPYJan 15, 2018 options expiring on Feb 2, 2018

85 90 95 100 105Moneyness

0.10

0.15

0.20

0.25

0.30 Put IVCall IV

36 / 40

Model Free Implied Volatility� Model free uses the relationship between option prices and RN density� The price of a call option with strike K and maturity t is

C(t,K) =

∫ ∞K

(St − K)φt (St) dSt

� φt (St) is the risk-neutral density at maturity t� Differentiating with respect to strike yields

∂C(t,K)

∂K= −

∫ ∞K

φt (St) dSt

� Differentiating again with respect to strike yields

∂2C(t,K)

∂K2= φt (K)

� The change in an option price as a function of the strike K is theprobability of the stock price having value K at time t

� Allows for risk-neutral density to be recovered from a continuum of optionswithout assuming a model for stock prices

37 / 40

Model Free Implied Volatility

� The previous result allows a model free IV to be computed from

EF

[∫ t

0

(∂FsFs

)2ds

]= 2

∫ ∞0

CF(t,K)− (F0 − K)+

K2dK

� Devil is in the detailsI Only finitely many callsI Thin tradingI Truncation

M∑m=1

[g(T ,Km) + g(T ,Km−1)] (Km − Km−1)

where

g(T ,K) =C(t,K/B(0, t))− (S0 − K)

+

K2� See Jiang & Tian (2005, RFS) for a very useful discussion

38 / 40

VIX

� VIX is continuously computed by the CBOE� Uses a model-free style formula� Uses both calls and puts� Focuses on out-of-the-money options

I OOM options are more liquid

� Formula:

σ2 =2TerT

N∑i=1

∆KiK2i

Q(Ki)−1T

(F0K0− 1)2

I Q(Ki) is the mid-quote for a strike of Ki, K0 is the first strike below theforward index level

I Only uses out-of-the-money optionsI VIX appears to have information about future realized volatility that isnot in other backward looking measures (GARCH/RV)

39 / 40

VIX against TARCH(1,1,1) Forward-volVIX and Forward Volatility

19952000

20052010

20150

20

40

60

80 VIX22-day Forward Vol

VIX Forward Volatility DIfference

19952000

20052010

2015

40

20

0

20

40 / 40

kevin sheppard :// · volatility overview what is volatility? why does it change? what are arch,...

Documents