kevin sheppard :// · volatility overview what is volatility? why does it change? what are arch,...
TRANSCRIPT
Univariate Volatility Modeling
Kevin Sheppardhttp://www.kevinsheppard.com
Oxford MFEThis version: February 2, 2018
February 5 – 6, 2018
Volatility Overview
� What is volatility?� Why does it change?� What are ARCH, GARCH, TARCH, EGARCH, SWARCH, ZARCH, APARCH,STARCH, etc. models?
� What does time-varying volatility look like?� What are the basic properties of ARCH and GARCH models?� What is the news impact curve?� How are the parameters of ARCH models estimated? What aboutinference?
� Twists on the standard model� Forecasting conditional variance� Realized Variance� Implied Volatility
2 / 40
Model Building
� ARCH and GARCH models are essentially ARMA modelsI Box-Jenkins Methodology
– Parsimony principle
Steps:1. Inspect the ACF and PACF of ε2t
ε2t = ω + (α+ β)ε2t−1 − βνt−1 + νt
– ACF indicates α (or ARCH of any kind)– PACF indicates β
2. Build initial model based on these observation3. Iterate between model and ACF/PACF of e2t =
ε2tσ2t
3 / 40
S&P 500 ε2t ACF/PACFSquared Residuals ACF Squared Residuals PACF
10 20 30 400.0
0.1
0.2
0.3
10 20 30 400.0
0.1
0.2
0.3
Standardized Squared Residuals ACF Standardized Squared Residuals PACF
10 20 30 400.0
0.1
0.2
0.3
10 20 30 400.0
0.1
0.2
0.3
4 / 40
WTI ε2t ACF/PACFSquared Residuals ACF Squared Residuals PACF
10 20 30 40
0.0
0.1
0.2
10 20 30 40
0.0
0.1
0.2
Standardized Squared Residuals ACF Standardized Squared Residuals PACF
10 20 30 40
0.0
0.1
0.2
10 20 30 40
0.0
0.1
0.2
5 / 40
How I built a model for the S&P 500
α1 α2 γ1 γ2 β1 β2 LL
GARCH(1,1) 0.099(0.000)
0.890(0.000)
−7374.3
GARCH(1,2) 0.099(0.000)
0.890(0.000)
0.000(0.999)
−7374.3
GARCH(2,1) 0.057(0.003)
0.063(0.023)
0.865(0.000)
−7368.1
GJR-GARCH(1,1,1) 0.000(0.999)
0.171(0.000)
0.901(0.000)
−7261.6
GJR-GARCH(1,2,1) 0.000(0.999)
0.147(0.000)
0.030(0.507)
0.897(0.000)
−7261.1
TARCH(1,1,1)? 0.000(0.999)
0.158(0.000)
0.918(0.000)
−7231.2
TARCH(1,2,1) 0.000(0.999)
0.157(0.000)
0.001(0.971)
0.918(0.000)
−7231.2
TARCH(2,1,1) 0.000(0.999)
0.011(0.728)
0.156(0.000)
0.909(0.000)
−7230.0
EGARCH(1,0,1) 0.201(0.000)
0.981(0.000)
−7381.3
EGARCH(1,1,1) 0.121(0.000)
−0.137(0.000)
0.980(0.000)
−7238.5
EGARCH(1,2,1) 0.114(0.000)
−0.203(0.000)
0.073(0.016)
0.982(0.000)
−7232.4
EGARCH(2,1,1) 0.017(0.652)
0.122(0.006)
−0.146(0.000)
0.976(0.000)
−7230.2
6 / 40
Testing for (G)ARCH
� ARCH is autocorrelation in ε2tI All ARCH processes have this, whether GARCH or EGARCH or otherI ARCH-LM testI Directly test for autocorrelation:
ε2t = φ0 + φ1ε2t−1 + . . .+ φPε
2t−P + ηt
I H0 : φ1 = φ2 = . . . = φP = 0I T × R2 d→ χ2PI Standard LM test from a regression.I More powerful test: Fit an ARCH(P) modelI The forbidden hypothesis
σ2t = ω + α1ε2t−1 + β1σ
2t−1
H0 : α1 = 0, H1α > 0
7 / 40
Forecasting: ARCH(1)
� Simple ARCH model
εt ∼ N(0, σ2t )
σ2t = ω + α1ε2t−1
I 1-step ahead forecast is known todayI All ARCH-family models have this property
εt ∼N(0, σ2t )
σ2t =ω + α1ε2t−1
Et[σ2t+1] =Et[ω + α1ε2t ]
=ω + α1ε2t
I Note: Et[ε2t+1] = Et[e2t+1σ2t+1] = σ2t+1Et[e
2t+1] = σ2t+1
I Further:Et[ε2t+h] = Et[Et+h−1[e2t+hσ
2t+h]] = Et[Et+h−1[e2t+h]σ
2t+h] = Et[σ2t+h]
8 / 40
Forecasting: ARCH(1)
� 2-step ahead
Et[σ2t+2] = Et[ω + α1ε2t+1]
=ω + α1Et[ε2t+1]
=ω + α1(ω + α1ε2t )
=ω + α1ω + α21ε2t
I h-step ahead forecast
Et[σ2t+h] =
h−1∑i=0
αi1ω + αh1ε2t
I Just the AR(1) forecasting formula
– Why?
9 / 40
Forecasting: GARCH(1,1)
� 1-step ahead
Et[σ2t+1] = Et[ω + α1ε2t + β1σ
2t ]
= ω + α1ε2t + β1σ
2t
� 2-step ahead
Et[σ2t+2] = Et[ω + α1ε2t+1 + β1σ
2t+1]
= ω + α1Et[ε2t+1] + β1Et[σ2t+1]
= ω + α1Et[e2t+1σ2t+1] + β1Et[σ2t+1]
= ω + α1Et[e2t+1]σ2t+1 + β1Et[σ2t+1]
= ω + α1 · 1 · σ2t+1 + β1Et[σ2t+1]
= ω + α1σ2t+1 + β1Et[σ2t+1]
= ω + (α1 + β1)σ2t+1
10 / 40
Forecasting: GARCH(1,1)
� h-step ahead
Et[σ2t+h] =
h−1∑i=0
(α1 + β1)iω + (α1 + β1)
h−1(α1ε2t + β1σ
2t )
� Also essentially an AR(1), technically ARMA(1,1)
11 / 40
Forecasting: TARCH(1,0,0)
� This one is a messI Nonlinearities cause problems
– All ARCH-family models are nonlinear, but some are linearity in ε2t– Others are not
σt = ω + α1|εt−1|I Forecast for t + 1 is known at time t
– Always, always, always, . . .
Et[σ2t+1] = Et[(ω + α1|εt|)2]
= Et[ω2 + 2ωα1|εt|+ α21ε2t ]
= ω2 + 2ωα1Et[|εt|] + α21Et[ε2t ]
= ω2 + 2ωα1|εt|+ α21ε2t
12 / 40
TARCH(1,0,0) continued...
� Multi-step is less straightforward
Et[σ2t+2] = Et[(ω + α1|εt+1|)2]
= Et[ω2 + 2ωα1|εt+1|+ α21ε2t+1]
= ω2 + 2ωα1Et[|εt+1|] + α21Et[ε2t+1]
= ω2 + 2ωα1Et[|et+1|σt+1] + α21Et[e2t σ
2t+1]
= ω2 + 2ωα1Et[|et+1|]Et[σt+1] + α21Et[e2t ]Et[σ
2t+1]
= ω2 + 2ωα1Et[|et+1|](ω + α1|εt|) + α21 · 1 · (ω2 + 2ωα1|εt|+ α21ε2t )
If et+1 ∼ N(0, 1), E[|et+1|] =√
2π
Et[σ2t+2] = ω2 + 2ωα1
√2π
(ω + α1|εt|) + α21(ω2 + 2ωα1|εt|+ α21ε
2t )
13 / 40
Assessing forecasts: Augmented MZ
� Start from Et[r2t+h] ≈ σ2t+h|tI Standard Augmented MZ regression:
ε2t+h − σ2t+h|t = γ0 + γ1σ2t+h|t + γ2z1t + . . .+ γK+1zKt + ηt
I ηt is heteroskedastic in proportion to σ2t : Use GLS.I An improved GMZ regression (GMZ-GLS)
ε2t+h − σ2t+h|tσ2t+h|t
= γ01
σ2t+h|t+ γ11+ γ2
z1tσ2t+h|t
+ . . .+ γK+1zKtσ2t+h|t
+ νt
I Better to use Realized Variance to evaluate forecasts
RVt+h − σ2t+h|t = γ0 + γ1σ2t+h|t + γ2z1t + . . .+ γK+1zKt + ηt
I Also can use GLS versionI Both RVt+h and ε2t+h are proxies for the variance at t + h
– RV is just better, often 10×+ more precise
14 / 40
Assessing forecasts: Diebold-Mariano
� Relative forecast performanceI MSE loss
δt =(ε2t+h − σ2A,t+h|t
)2−(ε2t+h − σ2B,t+h|t
)2I H0 : E[δt] = 0, HA
1 : E[δt] < 0, HB1 : E[δt] > 0
ˆδ = R−1R∑r=1
δr
I Standard t-test, 2-sided alternativeI Newey-West covariance always neededI Better DM using “Q-Like” loss (Normal log-likelihood “Kernel”)
δt =
(ln(σ2A,t+h|t) +
ε2t+hσ2A,t+h|t
)−
(ln(σ2B,t+h|t) +
ε2t+hσ2B,t+h|t
)I Patton & Sheppard (2008)
15 / 40
Realized Variance
Realized Variance
� Variance measure computed using ultra-high-frequency data (UHF)I Uses all available information to estimate the variance over someperiod
– Usually 1 day
I Variance estimates from RV can be treated as “observable”– Standard ARMA modeling– Variance estimates are consistent– Asymptotically unbiased– Variance converges to 0 as the number of samples increases
I Problems arise when applied to market data
– Noise (bid-ask bounce)– Market closure– Prices discrete– Prices not continuously observable– Data quality
16 / 40
Realized Variance
� AssumptionsI Log-prices are generated by an arbitrage-free semi-martingale
– Prices are observable– Prices can be sampled often
I Defined
RV (m)t =
m∑i=1
(pi,t − pi−1,t
)2=
m∑i=1
r2i,t.
– m-sample Realized Variance– pi,t is the ith log-price on day t– ri,t is the ith return on day t
I Only uses information on day t to estimate the variance on day tI Consistent estimator of the integrated variance∫ t+1
tσ2s ds
I “Total variance” on day t
17 / 40
Why Realized Variance Works
� Consider a simple Brownian motion
dpt = µ dt + σ dWt
I m-sample Realized Variance
RV (m)t =
m∑i=1
r2i,t
I Returns are IID normal
ri,ti.i.d.∼ N
(µ
m,σ2
m
)I Nearly unbiased
E[RV (m)
t
]=µ2
m+ σ2
I Variance close to 0
V[RV (m)
t
]= 4
µ2σ2
m2 + 2σ4
m18 / 40
Why Realized Variance Works
� Works for models with time-varying drift and stochastic volatility
dpt = µt dt + σt dWt
I No arbitrage imposes some restrictions on µtI Works with price processes with jumpsI In the general case:
RV (m)t
p→∫ 1
0σ2s ds+
N∑n=1
J2n
I Jn are jumps
19 / 40
Why Realized Variance Doesn’t Work� Multiple prices at the same time
I Define the price as the average share price (volume weighted price)I Use simple average or medianI Not a problem
� Prices only observed on a discrete gridI $.01 or £.0025I Nothing can be doneI Small problem
� Data qualityI UHF price data is generally messyI TyposI Wrong time-stampsI Pre-filter to remove obvious errorsI Often remove “round trips”
� No price available at some point in timeI Use the last observed price: last price interpolationI Averaging prices before and after leads to bias
20 / 40
Solutions to bid-ask bounce type noise
� Bid-ask bounce is a big problemI Simple model with “pure” noise
pi,t = p∗i,t + νi,t
– pi,t is the observed price with noise– p∗i,t is the unobserved efficient price– νi,t is the noise
I Easy to showri,t = r∗i,t + ηi,t
– r∗i,t is the unobserved efficient return– ηi,t = νi,t − νi−1,t is a MA(1) error
I RV is badly biasedRV (m)
t ≈ RV t +mτ 2
– Bias is increasing in m– Variance is also increasing in m
21 / 40
Simple solution
� Do not sample frequentlyI 5-30 minutes
– Better than daily but still inefficient
I Remove MA(1) by filtering
– ηi,t is an MA(1)– Fit an MA(1) to observed returns
ri,t = θεi−1,t + εi,t
– Use fit residuals εi,t to compute RV– Generally biased downward
I Use mid-quotes
– A little noise– My usual solution
22 / 40
A modified Realized Variance estimator: RVAC1
� Best solution is to use a modified RV estimatorI RVAC1
RVAC1(m)t =
m∑i=1
r2i,t + 2m∑i=2
ri,tri−1,t
I Adds a term to RV to capture the MA(1) noiseI Looks like a simple Newey-West estimatorI Unbiased in pure noise modelI Not consistentI Realized Kernel Estimator
– Adds more weighted cross-products– Consistent in the presence of many realistic noise processes– Fairly easy to implement
23 / 40
One final problem
� Market closureI Markets do not operate 24 hours a day (in general)I Add in close-to-open return squared
RV (m)t = r2CtO,t +
m∑i=1
r2i,t
– rCtO,t = pOpen,t − pClose,t−1I Compute a modified RV by weighting the overnight and open hourestimates differently
RV(m)t = λ1r2CtO,t + λ2RV
(m)t
24 / 40
The volatility signature plot
� Hard to know how often to sampleI Visual inspection may be useful
Definition (Volatility Signature Plot)
The volatility signature plot displays the time-series average of RealizedVariance
RV (m)t = T−1
T∑t=1
RV (m)t
as a function of the number of samples, m. An equivalent representationdisplays the amount of time, whether in calendar time or tick time(number of trades between observations) along the X-axis.
25 / 40
Some empirical results
� S&P 500 Depository ReceiptsI SPiDeRsI AMEX: SPYI Exchange Traded FundI Ultra-liquid
– 100M shares per day– Over 100,000 trades per day– 23,400 seconds in a typical trading day
I January 1, 2006 - July 31, 2008I Filtered by daily High-Low dataI Some cleaning of outliers
26 / 40
SPDR Realized Variance (RV )RV , 15 seconds RV , 1 minute
2008 2012 2016
40
80
120
2008 2012 2016
40
80
120
RV , 5 minutes RV , 15 minutes
2008 2012 2016
40
80
120
2008 2012 2016
40
80
120
27 / 40
SPDR Realized Variance (RVAC1)RVAC1, 15 seconds RVAC1, 1 minute
2008 2012 2016
40
80
120
2008 2012 2016
40
80
120
RVAC1, 5 minutes RVAC1, 15 minutes
2008 2012 2016
40
80
120
2008 2012 2016
40
80
120
28 / 40
Volatility Signature PlotsVolatility Signature Plot for SPDR RV
1s 5s 15s 30s 1m 2m 5m 10m 30m14
16
18
20
22
Volatility Signature Plot for SPDR RVAC1
1s 5s 15s 30s 1m 2m 5m 10m 30m14
15
16
17
29 / 40
Bitcoin Realized Variance1-minute RV
Jul Jul Jul0
100
200
300
5-minute RV
Jan2016
Jan2017
Jan2018
Jul Jul Jul0
100
200
300
30 / 40
Modeling Realized Variance
� Two choices� Treat volatility as observable and model as ARMA
I Really simply to doI Forecasts are equally simpleI Theoretical motivation why RV may be well modeled by an ARMA(P,1)
� Continue to treat volatility as latent and use ARCH-type modelI Realized Variance is still measured with errorI A more precise measure of conditional variance that daily returnssquared, r2t , but otherwise similar
31 / 40
Treating RV as observable
� Main model used is a Heterogeneous Autoregression� Restricted AR(22) in levels
RVt = φ0 + φ1RVt−1 + φ5RV5,t−1 + φ22RV22,t−1 + εt
� Or in logs
lnRVt = φ0 + φ1 lnRVt−1 + φ5 lnRV5,t−1 + φ22 lnRV22,t−1 + εt
where RV j,t−1 = j−1∑j
i=1 RVt−i is a j lag moving average� Model picks up volatility changes at the daily, weekly, and monthlyscale
� Fits and forecasts RV fairly wellI MA term may still be needed
32 / 40
Leaving RV latent
� Alternative if to treat RV as latent and use a non-negativemultiplicative error model (MEM)
� MEMs specify the mean of a process as µt × ψt where ψt is a mean 1shock.
� A χ21 is a natural choice here� ARCH models are special cases of a non-negative MEM model� Easy to model RV using existing ARCH mdoes
1. Construct rt = sign (rt)√RVt
2. Use standard ARCH model building to construct a model for rt
σ2t = ω + α1r2t−1 + γ1r2t−1I[rt−1<0] + β1σ2t−1
becomes
σ2t = ω + α1RVt−1 + γ1RVt−1I[rt−1<0] + β1σ2t−1
33 / 40
Implied Variance
Implied Volatility and VIX
� Implied volatility is very different from ARCH and Realized measures� Market based: Level of volatility is calculated from options prices� Forward looking: Options depend on future price path� “Classic” implied relies on the Black-Scholes pricing formula� “Model free” implied volatility exploits a relationship between thesecond derivative of the call price with respect to the strike and therisk neutral measure
� VIX is a Chicago Board Options Exchange (CBOE) index based on amodel free measure
� Allows volatility to be directly traded
34 / 40
Black-Scholes Implied Volatility
� Black-Scholes Options Pricing� Prices follow a geometric Brownian Motion
dSt = µStdt + σStdWt
� Constant drift and volatility� Price of a call is
C(T ,K) = SΦ(d1) + Ke−rTΦ(d2)
where
d1 =ln (S/K) +
(r + σ2/2
)T
σ√T
d2 =ln (S/K) +
(r − σ2/2
)T
σ√T
.
� Can invert to produce a formula for the volatility given the call priceC(T ,K)
σImpliedt = g (Ct(T ,K), St,K,T , r)
35 / 40
BSIV against Moneyness for SPYJan 15, 2018 options expiring on Feb 2, 2018
85 90 95 100 105Moneyness
0.10
0.15
0.20
0.25
0.30 Put IVCall IV
36 / 40
Model Free Implied Volatility� Model free uses the relationship between option prices and RN density� The price of a call option with strike K and maturity t is
C(t,K) =
∫ ∞K
(St − K)φt (St) dSt
� φt (St) is the risk-neutral density at maturity t� Differentiating with respect to strike yields
∂C(t,K)
∂K= −
∫ ∞K
φt (St) dSt
� Differentiating again with respect to strike yields
∂2C(t,K)
∂K2= φt (K)
� The change in an option price as a function of the strike K is theprobability of the stock price having value K at time t
� Allows for risk-neutral density to be recovered from a continuum of optionswithout assuming a model for stock prices
37 / 40
Model Free Implied Volatility
� The previous result allows a model free IV to be computed from
EF
[∫ t
0
(∂FsFs
)2ds
]= 2
∫ ∞0
CF(t,K)− (F0 − K)+
K2dK
� Devil is in the detailsI Only finitely many callsI Thin tradingI Truncation
M∑m=1
[g(T ,Km) + g(T ,Km−1)] (Km − Km−1)
where
g(T ,K) =C(t,K/B(0, t))− (S0 − K)
+
K2� See Jiang & Tian (2005, RFS) for a very useful discussion
38 / 40
VIX
� VIX is continuously computed by the CBOE� Uses a model-free style formula� Uses both calls and puts� Focuses on out-of-the-money options
I OOM options are more liquid
� Formula:
σ2 =2TerT
N∑i=1
∆KiK2i
Q(Ki)−1T
(F0K0− 1)2
I Q(Ki) is the mid-quote for a strike of Ki, K0 is the first strike below theforward index level
I Only uses out-of-the-money optionsI VIX appears to have information about future realized volatility that isnot in other backward looking measures (GARCH/RV)
39 / 40
VIX against TARCH(1,1,1) Forward-volVIX and Forward Volatility
19952000
20052010
20150
20
40
60
80 VIX22-day Forward Vol
VIX Forward Volatility DIfference
19952000
20052010
2015
40
20
0
20
40 / 40