Applied Statistics I

Applied Statistics, Vincent JEANNIN – ESGF 4IFM Q1 2012

Upload: vincent-jeannin

Post on 22-Jun-2015


Category:

Economy & Finance



DESCRIPTION

First course of Applied Statistics, MSc level in Business School.

TRANSCRIPT

Page 1: Applied Statistics I

Applied Statistics Vincent JEANNIN – ESGF 4IFM

Q1 2012

vinzjeannin@hotmail.com

Page 2: Applied Statistics I

Summary of the session (est. 4.5h)
• Introduction & Objectives
• Bibliography
• First approach: Descriptive Statistics
• The Normal Distribution
• Applications (GBM, B&S, Greeks, CRR)

Page 3: Applied Statistics I

Introduction & Objectives

• What are statistics?
  Describe data behaviour
  Model data behaviour

• Why should you use them?
  Business decisions (pricing, investments,…)

• Take the opportunity to remember financial mathematics basics

• Acquire theory knowledge on statistics

• Usage of R and Excel

Page 4: Applied Statistics I

Bibliography


Page 5: Applied Statistics I

First Approach: Descriptive statistics


FCOJ Front Month: 31st Dec 2008 / 30th Sep 2011

Page 6: Applied Statistics I


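First step, calculate the linear returns: $R_t = \frac{V_t}{V_{t-1}} - 1$

Then, the mean: $\bar{R} = \frac{1}{n}\sum_{i=1}^{n} R_i$

Expected return, not average return!

How to calculate the average return on the period (compound return)? How to obtain it by a sum?

$R_{Comp} = \sqrt[n]{\frac{V_n}{V_1}} - 1$

With log returns, the sum telescopes:

$R_1 = \ln\frac{V_2}{V_1} = \ln V_2 - \ln V_1$

$R_2 = \ln\frac{V_3}{V_2} = \ln V_3 - \ln V_2$

...

$R_{n-2} = \ln\frac{V_{n-1}}{V_{n-2}} = \ln V_{n-1} - \ln V_{n-2}$

$R_{n-1} = \ln\frac{V_n}{V_{n-1}} = \ln V_n - \ln V_{n-1}$

$R_{Comp} = \sum_{i=2}^{n} \ln\frac{V_i}{V_{i-1}} = \ln V_n - \ln V_1 = \ln\frac{V_n}{V_1}$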
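The return definitions on this slide can be checked with a few lines of R. This is a minimal sketch on a made-up price vector `V` (illustrative values, not the FCOJ data); the variable names are assumptions. It shows that compounding the linear returns and summing the log returns give the same total return, and that the per-period compound return differs from the arithmetic mean.

```r
# Hypothetical price series (illustrative values only, not FCOJ data)
V <- c(100, 102, 99, 104, 107)

# Linear returns: R_t = V_t / V_{t-1} - 1
R_lin <- V[-1] / V[-length(V)] - 1

# Log returns: R_t = ln(V_t / V_{t-1})
R_log <- diff(log(V))

# Total return over the period, obtained two ways
total_compound  <- prod(1 + R_lin) - 1   # compounding the linear returns
total_from_logs <- exp(sum(R_log)) - 1   # summing the log returns

# Per-period compound return: root taken over the number of return periods
n_periods <- length(R_lin)
avg_compound <- (V[length(V)] / V[1])^(1 / n_periods) - 1

print(c(mean_linear = mean(R_lin), avg_compound = avg_compound))
print(c(total_compound = total_compound, total_from_logs = total_from_logs))
```

The arithmetic mean of the linear returns is never below the compound figure, which is one way to read the "expected return, not average return" remark.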

Page 7: Applied Statistics I


Excel and R can give an idea of the distribution

• Excel: functions MIN, MAX, AVERAGE, PERCENTILE
• R: easier, faster,… Function summary

R is:
• Free
• Open Source
• Extended by developments shared by developers

Page 8: Applied Statistics I


R can easily show the distribution of returns

Interesting shape but what next?

Page 9: Applied Statistics I


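The four moments

Mean (expected return): $\bar{R} = \frac{1}{n}\sum_{i=1}^{n} R_i$

Standard Deviation: dispersion from the mean, square root of the variance

$\sigma = \sqrt{E\left[(X - \bar{X})^2\right]}$

SD is the square root of the mean of squared differences to the mean

$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(R_i - \bar{R}\right)^2}$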

Page 10: Applied Statistics I


Quick check: what is the SD of {1,2,-3,0,-2,1,1}?

Excel function STDEVP

R function sd

FCOJ has a SD of 2.16%
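The quick check can be worked through in R. One caveat when comparing tools: Excel's STDEVP divides by n (the formula on the previous slide), whereas R's `sd` divides by n - 1, so the two functions do not return the same number on a small sample. The sketch below computes both; the variable names are illustrative.

```r
x <- c(1, 2, -3, 0, -2, 1, 1)

# Population SD: divide by n (the formula above, and Excel's STDEVP)
sd_pop <- sqrt(mean((x - mean(x))^2))

# R's sd() divides by n - 1 (sample SD, like Excel's STDEV)
sd_sample <- sd(x)

# Converting the sample SD into the population SD
n <- length(x)
sd_pop_from_sample <- sd_sample * sqrt((n - 1) / n)

print(c(population = sd_pop, sample = sd_sample))
```

On 693 observations the gap between the two conventions is negligible, but on seven values it is visible.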

Page 11: Applied Statistics I


> xBar <- mean(FCOJ$V1)
> SD <- sd(FCOJ$V1)
> # Bin edges at the mean plus or minus 1 to 6 standard deviations
> hist(FCOJ$V1, breaks=xBar+(-6:6)*SD, main="FCOJ Returns", xlab="Return", ylab="Occurrence")

Histogram centred on the mean, with bins one SD wide

Symmetric-ish

Page 12: Applied Statistics I


693 observations

75.04% within ±1𝜎

94.23% within ±2𝜎

98.99% within ±3𝜎
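For reference, these shares can be set against what a normal distribution would give. A small sketch, assuming only base R; the FCOJ figures are the ones quoted on this slide, typed in by hand.

```r
k <- 1:3

# Probability of falling within k standard deviations of the mean for a normal distribution
normal_share <- pnorm(k) - pnorm(-k)

# Shares observed on the FCOJ returns, as quoted above
fcoj_share <- c(0.7504, 0.9423, 0.9899)

comparison <- data.frame(k = k,
                         normal = round(normal_share, 4),
                         fcoj = fcoj_share)
print(comparison)
```

More observations than expected close to the mean, and fewer than expected within three SDs, is the pattern one would associate with a peaked, fat-tailed distribution.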

Page 13: Applied Statistics I


Skewness, the third moment

$SKEW(X) = E\left[\left(\frac{X - \bar{X}}{\sigma}\right)^3\right] = \frac{E\left[(X - \bar{X})^3\right]}{\left(E\left[(X - \bar{X})^2\right]\right)^{3/2}}$

Asymmetry of the distribution

• Negative skew: long left tail, mass on the right, skew to the left
• Positive skew: long right tail, mass on the left, skew to the right

Should I rather buy or sell a positively skewed asset?

Page 14: Applied Statistics I


Excel function SKEW

R function skewness (package moments)

FCOJ is positively skewed

> library(moments)

> skewness(FCOJ$V1)

[1] 0.2030842
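The skewness formula from the previous slide can also be coded directly in base R, which makes the sign convention easy to see. This is a sketch with a hypothetical helper `skew_manual` and made-up data; it is not the implementation of the moments package.

```r
# Skewness following SKEW(X) = E[(X - mean)^3] / (E[(X - mean)^2])^(3/2)
skew_manual <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^(3 / 2)
}

symmetric  <- c(-2, -1, 0, 1, 2)   # illustrative data, symmetric around 0
right_tail <- c(1, 2, 3, 10)       # illustrative data with a long right tail
left_tail  <- -right_tail          # mirror image: long left tail

print(skew_manual(symmetric))
print(skew_manual(right_tail))
print(skew_manual(left_tail))
```

Mirroring the data flips the sign of the skewness and leaves its magnitude unchanged.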

Page 15: Applied Statistics I


Kurtosis, the fourth moment

Peakedness of the distribution

• Positive excess Kurtosis: high peak around the mean, fat tails
• Negative excess Kurtosis: low peak around the mean, thin tails

$KURT(X) = E\left[\left(\frac{X - \bar{X}}{\sigma}\right)^4\right] = \frac{E\left[(X - \bar{X})^4\right]}{\left(E\left[(X - \bar{X})^2\right]\right)^2}$

It is customary to work with the excess kurtosis, i.e. the kurtosis relative to the normal distribution, obtained by subtracting 3

Which distribution would you rather buy or sell?

Page 16: Applied Statistics I


What is the most platykurtic distribution in nature?

Toss it!

Head = Success = 1 / Tail = Failure = 0

> library(moments)
> toss <- rbinom(10000000, 1, 0.5)
> mean(toss)
[1] 0.5001777
> kurtosis(toss)
[1] 1.000001
> kurtosis(toss) - 3
[1] -1.999999
> hist(toss, breaks=10, main="Tossing a coin 10 million times", xlab="Result of the trial", ylab="Occurrence")
> sum(toss)
[1] 5001777

Page 17: Applied Statistics I


50.01777% rate of success: fair or not fair? Trick coin?

Will be tested later with a Bayesian approach

On a perfect 50/50, Kurtosis would be 1, Excess Kurtosis -2: the minimum!

This is a Bernoulli trial, the case $n = 1$ of the binomial distribution $B(n, p)$, with $n$ a positive integer, $p \in \mathbb{R}$ and $0 < p < 1$

Mean: $p$

SD: $\sqrt{p(1-p)}$

Skewness: $\frac{1 - 2p}{\sqrt{p(1-p)}}$

Kurtosis: $\frac{1}{p(1-p)} - 3$

It is easy to demonstrate that the Kurtosis is the lowest if p=0.5. It is a bit more complicated to demonstrate that this is the minimum for any distribution
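The claim about p = 0.5 can be checked numerically. The sketch below codes the Bernoulli moments as a hypothetical helper `bernoulli_moments` (single trial, kurtosis not in excess form) and scans a grid of probabilities; it is an illustration, not a proof.

```r
# Theoretical moments of a single Bernoulli trial with success probability p
bernoulli_moments <- function(p) {
  q <- 1 - p
  c(mean = p,
    sd = sqrt(p * q),
    skewness = (1 - 2 * p) / sqrt(p * q),
    kurtosis = 1 / (p * q) - 3)
}

print(bernoulli_moments(0.5))

# Kurtosis over a grid of p: look for the minimum
p_grid <- seq(0.05, 0.95, by = 0.05)
kurt <- sapply(p_grid, function(p) bernoulli_moments(p)["kurtosis"])
print(p_grid[which.min(kurt)])
```

The kurtosis depends on p only through p(1 - p), which is largest at p = 0.5; that is why the minimum sits there.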

Page 18: Applied Statistics I


Excel function KURT

R function kurtosis (package moments)

FCOJ is leptokurtic

> library(moments)
> kurtosis(FCOJ$V1)
[1] 6.34176
> kurtosis(FCOJ$V1) - 3
[1] 3.34176

Page 19: Applied Statistics I


Sum-up:
• Positive expected return
• Positive skew
• Positive excess kurtosis

Buy or Sell?

Is that actually enough to make an investment decision? What next? How different is the FCOJ distribution from the Normal Distribution?

Page 20: Applied Statistics I


The Normal Distribution

Snapshot, 4 moments:

Mean: 0
SD: 1
Skewness: 0
Kurtosis: 3

Snapshot, Shape:

Let’s discuss the standard normal first…

Page 21: Applied Statistics I


Density: $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

Notation: $N(\mu, \sigma)$

Zero-mean distributions with the following SDs: 0.5 / 0.75 / 1 / 1.5 / 2

Which one is which?

Page 22: Applied Statistics I


> x=seq(-4,4,length=500)

> y1=dnorm(x,mean=0,sd=0.5)

> y2=dnorm(x,mean=0,sd=0.75)

> y3=dnorm(x,mean=0,sd=1)

> y4=dnorm(x,mean=0,sd=1.5)

> y5=dnorm(x,mean=0,sd=2)

> plot(x,y1,type="l",lwd=3,col="red",

main="Normal Distributions", ylab="f(x)")

> lines(x,y2,type="l",lwd=3,col="blue")

> lines(x,y3,type="l",lwd=3,col="black")

> lines(x,y4,type="l",lwd=3,col="yellow")

> lines(x,y5,type="l",lwd=3,col="pink")

All other things equal, a low SD means a high peak: values are more concentrated around the mean

• FCOJ has a mean of 1.364% and a SD of 2.164%
• Let’s compare the distribution with a normal distribution with the same mean and SD

FCOJ <- read.csv(file="C:/Users/Vinz/Desktop/FCOJStats.csv", header=FALSE, sep=",")
x = seq(-0.2, 0.2, length=200)
y1 = dnorm(x, mean=mean(FCOJ$V1), sd=sd(FCOJ$V1))
# freq=FALSE plots densities, so the histogram is on the same scale as dnorm
hist(FCOJ$V1, breaks=100, freq=FALSE, main="FCOJ Returns / Normal Distribution", xlab="Return", ylab="Density")
lines(x, y1, type="l", lwd=3, col="red")

Page 23: Applied Statistics I


The excess Kurtosis sign is obvious, isn’t it?

Page 24: Applied Statistics I


Same SD, different means: more straightforward

Page 25: Applied Statistics I


Cumulative Distribution

This is the integral of the density function

Reminder: the CDF (Cumulative Distribution Function) is the probability that the random variable X, for a given distribution, is lower than or equal to x

$P(X \le x) = \Phi(x) = \int_{-\infty}^{x} f(u)\,du$

Important Properties

$P(X = x) = 0$

$P(y \le X \le x) = P(X \le x) - P(X \le y)$

$P(X \ge x) = 1 - P(X \le x)$

$\lim_{x \to -\infty} P(X \le x) = 0$

$\lim_{x \to +\infty} P(X \le x) = 1$
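These properties are easy to verify with `pnorm` (the normal CDF) and `dnorm` (the density) in base R. A minimal sketch for a standard normal, with arbitrary bounds `a` and `b` chosen for illustration:

```r
a <- -1
b <- 2

# P(a <= X <= b) = P(X <= b) - P(X <= a), for a standard normal X
p_interval <- pnorm(b) - pnorm(a)

# The same probability obtained by numerically integrating the density
p_integral <- integrate(dnorm, lower = a, upper = b)$value

# P(X >= b) = 1 - P(X <= b)
p_upper <- 1 - pnorm(b)

print(c(interval = p_interval, integral = p_integral, upper = p_upper))
```

The agreement between `p_interval` and `p_integral` is the "CDF is the integral of the density" statement in numerical form.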

Page 26: Applied Statistics I


Can’t be expressed with elementary functions:
- Help with tables
- Help with calculator

Again, let’s discuss the standard normal first…

$P(X \le 0) = 0.5$
$P(X \le -1) = 0.159$
$P(X \le -2) = 0.023$
$P(X \le -3) = 0.001$
$P(-1 \le X \le 1) = 0.683$
$P(-2 \le X \le 2) = 0.954$
$P(-3 \le X \le 3) = 0.997$
$P(X \le -1.645) = 0.05$
$P(X \le -2.326) = 0.01$

> x=seq(-4,4,length=500)
> plot(x, pnorm(x,mean=0,sd=1), col="red", type="l", lwd=3, xlab="x", ylab="P(X<=x)", main="Standard Normal CDF")

Page 27: Applied Statistics I


> x=seq(-4,4,length=500)
> plot(x, pnorm(x,mean=0,sd=1), col="black", type="l", lwd=3, xlab="x", ylab="P(X<=x)", main="Normal Distributions - CDFs")
> lines(x, pnorm(x,mean=0,sd=0.75), col="red", type="l", lwd=3)
> lines(x, pnorm(x,mean=0,sd=1.25), col="pink", type="l", lwd=3)
> lines(x, pnorm(x,mean=1,sd=1.25), col="yellow", type="l", lwd=3)

Identify: N(0,0.75) / N(0,1) / N(0,1.25) / N(1,1.25)

General Case

$P(X \le \mu) = 0.5$
$P(X \le \mu - \sigma) = 0.159$
$P(X \le \mu - 2\sigma) = 0.023$
$P(X \le \mu - 3\sigma) = 0.001$
$P(\mu - \sigma \le X \le \mu + \sigma) = 0.683$
$P(\mu - 2\sigma \le X \le \mu + 2\sigma) = 0.954$
$P(\mu - 3\sigma \le X \le \mu + 3\sigma) = 0.997$
$P(X \le \mu - 1.645\sigma) = 0.05$
$P(X \le \mu - 2.326\sigma) = 0.01$

Page 28: Applied Statistics I


Standardization

$X \sim N(\mu, \sigma)$

$Y = \frac{X - \mu}{\sigma}$

$Y \sim N(0, 1)$

Only one statistical table to use

$P(X \le x) = P\left(Y \le \frac{x - \mu}{\sigma}\right)$ with $Y \sim N(0, 1)$

Page 29: Applied Statistics I


Let X~N(2,4). Find $P(X \le -1.86)$

$P(X \le -1.86) = P\left(Y \le \frac{-1.86 - 2}{4}\right)$ with Y~N(0,1)

$P(Y \le -0.965) = ?$

Use the table! Linear interpolation is acceptable

$P(Y \le -0.96) = 0.1685$

$P(Y \le -0.97) = 0.1660$

$P(Y \le -0.965) = 0.16725$

$P(X \le -1.86) = 0.16725$
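The table lookup can be cross-checked with `pnorm`. The sketch assumes, as the calculation on this slide does, that the second parameter of N(2,4) is the standard deviation; the variable names are illustrative.

```r
mu <- 2
sigma <- 4      # read as a standard deviation, as in the division by 4 above
x <- -1.86

z <- (x - mu) / sigma                         # standardised value

p_direct   <- pnorm(x, mean = mu, sd = sigma) # P(X <= x) on N(mu, sigma)
p_standard <- pnorm(z)                        # P(Y <= z) on N(0, 1)

# Linear interpolation between the two table values quoted above
p_table <- (0.1685 + 0.1660) / 2

print(c(z = z, direct = p_direct, standard = p_standard, table = p_table))
```

Both `pnorm` calls agree exactly, and the interpolated table value is close to them, which is why a single standard normal table is enough.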

Page 30: Applied Statistics I


Back to FCOJ… Let’s compare the FCOJ CDF with the Normal Distribution (same mean/SD)

> x=seq(-0.2,0.2,length=500)
> plot(ecdf(FCOJ$V1), do.points=FALSE, col="red", lwd=3, main="Normal Distribution against FCOJ - CDFs", xlab="x", ylab="P(X<=x)")
> lines(x, pnorm(x,mean=mean(FCOJ$V1),sd=sd(FCOJ$V1)), col="blue", type="l", lwd=3)

Where can you see the excess kurtosis?

Page 31: Applied Statistics I


>qqnorm(FCOJ$V1)

>qqline(FCOJ$V1)

• This is the QQ plot, which compares the sample quantiles with those of a normal distribution
• Observations lying on the fitted line would suggest a normal distribution; departures from the line suggest otherwise

Conclusion? Fat tails

Following intuition is the first step of descriptive statistics; formally testing it is even better! Later step…

Page 32: Applied Statistics I


Discussion

• Would you rather trade a financial product with a high or a low SD?
• Would you rather trade a financial product whose returns have a negative mean?

SD measures the risk, the volatility: it depends on risk appetite

• The mean is irrelevant on its own, and you could bet on mean reversion
• Very often in finance, the mean is set to 0 whatever its real value is

Page 33: Applied Statistics I

Applications


Geometric Brownian Motion

Based on the Stochastic Differential Equation $ds_t = \mu s_t\,dt + \sigma s_t\,dW_t$

Discrete form: $ds_t = \mu s_t\,dt + \sigma s_t\sqrt{dt}\,\varepsilon$ with $\varepsilon \sim N(0,1)$

Used for random walk, martingale, Monte-Carlo, Black & Scholes…

It becomes easy to simulate the price process, but what are the problems?

Volatility depends on the square root of time: problem of extrapolation

$\sigma_T = \sigma_t \sqrt{\frac{T}{t}}$

1% daily volatility is:
• 4.58% Monthly
• 7.94% Quarterly
• 15.87% Yearly
• 35.50% 5 Years
• 50.20% 10 Years

Is this realistic?
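The scaled volatilities can be reproduced in one line of R. The day counts below are an assumption (21 business days per month, 63 per quarter, 252 per year); they are the ones that match the figures quoted on this slide.

```r
daily_vol <- 0.01

# Horizons in business days (assumed: 21 per month, 63 per quarter, 252 per year)
horizons <- c(Monthly = 21, Quarterly = 63, Yearly = 252,
              `5 Years` = 5 * 252, `10 Years` = 10 * 252)

# sigma_T = sigma_t * sqrt(T / t), with t = 1 day
scaled_vol <- daily_vol * sqrt(horizons / 1)

print(round(100 * scaled_vol, 2))   # in percent
```

The rule only holds if returns are independent with constant volatility, which is the point of the "is this realistic?" question.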

Page 34: Applied Statistics I


First Excel problem, on the RAND function:
• Random number generation is pseudo-random
• Uniform distribution [0,1]
• No seed fixing = heavy memory usage (new numbers are generated whenever the spreadsheet is recalculated)

3 acceptable solutions:
• Assume the generated number is a probability and then invert it with NORM.INV(RAND(), mean, standard_dev), but fatter tails
• Box-Muller method using SQRT(-2*LN(RAND()))*SIN(2*PI()*RAND()), but it is only exact with a perfect uniform random number generation
• Central Limit Theorem: the normal distribution is approached by the sum of 12 uniform random variables on [0,1] minus 6, so use RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()+RAND()-6, but thinner tails (the result is bounded by -6 and 6)

Actual normality of such methods will be tested later…
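The three spreadsheet recipes translate directly to R, which makes them easy to compare side by side. This is a sketch only: the sample size, the seed and the helper `excess_kurtosis` are arbitrary choices, and the summary statistics are a quick look rather than a formal normality test.

```r
set.seed(42)          # fixed seed so the draws can be reproduced
n <- 100000

# 1. Inverse transform: treat a uniform draw as a probability and invert the CDF
z_inverse <- qnorm(runif(n))

# 2. Box-Muller: two independent uniforms give one standard normal draw
u1 <- runif(n)
u2 <- runif(n)
z_box_muller <- sqrt(-2 * log(u1)) * sin(2 * pi * u2)

# 3. Central Limit Theorem: sum of 12 uniforms minus 6
z_clt <- rowSums(matrix(runif(12 * n), ncol = 12)) - 6

# Sample excess kurtosis, to compare the tails with the normal (0)
excess_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

results <- sapply(list(inverse = z_inverse, box_muller = z_box_muller, clt = z_clt),
                  function(z) c(mean = mean(z), sd = sd(z), ex_kurt = excess_kurtosis(z)))
print(round(results, 3))
```

All three have mean close to 0 and SD close to 1; the differences show up in the tails, which is where the excess kurtosis column is informative.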

Page 35: Applied Statistics I


So Excel is a hassle… Use R!
• Proper random number generation on any chosen distribution
• Seed fixable
• Quicker

Let’s show why it’s better to use a discretisation
• Let’s assume a stock with an annual drift (expected return) of 5% and a yearly volatility of 5%, and simulate the price in one year by two methods
• One year “one shot”
• One year with daily (252 business days) steps

> Drift<-0.05

> Volat<-0.05

> Spot<-100

> Simul<-Spot+Drift*Spot+Volat*Spot*rnorm(10)

> plot(c(Spot,Simul[1]), type="l",

ylim=c(min(Simul)-1,max(Simul)+1),

main="Simulation one shot", xlab="T", ylab="S")

> # Remaining nine scenarios
> for (i in 2:10) lines(c(Spot,Simul[i]), type="l")

Page 36: Applied Statistics I


> summary(Simul)

Min. 1st Qu. Median Mean 3rd Qu. Max.

96.51 105.10 107.00 106.60 108.80 116.50

> sd(Simul)

[1] 5.23066

Very sensitive to the random number picked

19.99 difference between the lowest and highest scenario (116.50 - 96.51)

SD of 5.23 in the results

What would be the mean in a perfect situation?

Very sensitive to the number of trials

Page 37: Applied Statistics I


library(sde)

nbsim<-252

Drift<-0.05

Volat<-0.05

Spot<-100

G1<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

G2<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

G3<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

G4<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

G5<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

G6<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

G7<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

G8<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

G9<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

G10<-GBM(x=Spot,r=Drift, sigma=Volat,N=nbsim)

plot(G1,ylim=c(90,115), col=1, main="GBM day by day",

xlab="T", ylab="S")

lines(G2, col=2)

lines(G3, col=3)

lines(G4, col=4)

lines(G5, col=5)

lines(G6, col=6)

lines(G7, col=7)

lines(G8, col=8)

lines(G9, col=9)

lines(G10, col=10)

Use the package sde of R for the step by step (discrete) method
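For readers without the sde package, the same day-by-day idea can be sketched in base R by applying the discrete form of the GBM step by step. This is an Euler-type discretisation written for illustration, not the algorithm used inside `GBM`; the function name `simulate_path` and the seed are assumptions.

```r
set.seed(1)
drift <- 0.05
volat <- 0.05
spot <- 100
n_steps <- 252                 # daily steps over one year
dt <- 1 / n_steps
n_paths <- 10

# One path: apply S_{t+dt} = S_t * (1 + drift*dt + volat*sqrt(dt)*eps) step by step
simulate_path <- function() {
  eps <- rnorm(n_steps)
  growth <- 1 + drift * dt + volat * sqrt(dt) * eps
  spot * c(1, cumprod(growth))
}

paths <- replicate(n_paths, simulate_path())   # (n_steps + 1) x n_paths matrix

matplot(paths, type = "l", lty = 1, main = "GBM day by day (base R)",
        xlab = "T", ylab = "S")

final_prices <- paths[n_steps + 1, ]
print(summary(final_prices))
print(sd(final_prices))
```

Storing the paths in a matrix avoids the ten separate variables and makes it easy to raise `n_paths` to see how the summary stabilises with more trials.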

Page 38: Applied Statistics I


> FinalS <- c(G1[nbsim+1],G2[nbsim+1],G3[nbsim+1],G4[nbsim+1],G5[nbsim+1],G6[nbsim+1],G7[nbsim+1],G8[nbsim+1],G9[nbsim+1],G10[nbsim+1])

> summary(FinalS)

Min. 1st Qu. Median Mean 3rd Qu. Max.

97.81 101.80 103.00 103.70 105.80 109.00

> sd(FinalS)

[1] 3.535826

Less sensitive to the random numbers chosen

11.19 difference between the lowest and highest scenario (109.00 - 97.81)

SD of 3.54

Still sensitive to the number of trials

Page 39: Applied Statistics I


Introduction to Lognormality

Do you remember slide 6?

$R_{lin} = e^{R_{ln}} - 1$ and $V_t = V_{t-1}\,(1 + R_{lin})$

$R_{ln} = \ln(R_{lin} + 1)$ and $V_t = V_{t-1}\,e^{R_{ln}}$

FCOJ <- read.csv(file="S:/Vincent/FCOJStats.csv", header=FALSE, sep=",")
# Convert linear returns into log returns
FCOJ$V1 <- log(FCOJ$V1+1)
# freq=FALSE plots densities, so the histogram is on the same scale as dnorm
hist(FCOJ$V1, breaks=100, freq=FALSE, main="FCOJ LogReturns / Normal Distribution", xlab="LogReturn", ylab="Density")
x = seq(-0.2, 0.2, length=200)
y1 = dnorm(x, mean=mean(FCOJ$V1), sd=sd(FCOJ$V1))
lines(x, y1, type="l", lwd=3, col="red")

Page 40: Applied Statistics I


The LogReturns seem normally (ish) distributed

If LogReturns are normally distributed, the stock price is lognormally distributed (a useful property, as the price is bounded below by 0 and continuously compounded returns can be used)

$S_t = S_{t-1}\,e^{r\,dt}$ and $S_{t-1} = S_t\,e^{-r\,dt}$

Page 41: Applied Statistics I


Black & Scholes

Let’s look at the underlying price diffusion process from another angle

[Figure: diffusion of the price S over Time, labelled $R_f + \mu$]

Job done, isn’t it?

Page 42: Applied Statistics I


Pricing Principle

• Price distribution of the underlying at maturity
• The payoff distribution of the option at maturity can be deduced
• The expected payoff can be calculated
• The present value of the expected payoff is the option price!

Assumptions

• No arbitrage opportunity (no free lunch).
• Existence of a risk-free rate (borrower and lender).
• No liquidity problem on long and short positions.
• No fees or costs.
• Market efficiency.
• Stock price follows a geometric Brownian motion with constant drift and volatility.
• No dividend.

This is obviously not true… Very strong assumptions!

Page 43: Applied Statistics I


Geometric Brownian Motion & Black & Scholes Option Valuation

Based on the Stochastic Differential Equation $ds_t = \mu s_t\,dt + \sigma s_t\,dW_t$

$W_t$ is a Brownian Motion, in other words a random walk following a normal distribution (zero mean)

$\frac{dS}{S} = \mu\,dt + \sigma\,dW$

A small variation of price has a drift term $\mu\,dt$ (known, expected return) and a diffusion term $\sigma\,dW$ (uncertain)

Over longer horizons, the price is lognormally distributed (so it can’t go below 0, we’ll come back to this)

Risk neutral probability: an option perfectly hedged on a continuous basis is risk free, and the portfolio earns the risk free rate. The drift then has no impact

Demonstration based on integration with Itô's lemma and risk neutral probability (11.6 / 12.7 in John Hull)

Page 44: Applied Statistics I


Pricing Formulas

$C = N(d_1)\,S - N(d_2)\,K e^{-r(T-t)}$

$P = -N(-d_1)\,S + N(-d_2)\,K e^{-r(T-t)}$

$d_1 = \frac{\ln\frac{S}{K} + \left(r + \frac{\sigma^2}{2}\right)(T-t)}{\sigma\sqrt{T-t}}$

$d_2 = d_1 - \sigma\sqrt{T-t}$

Buy the Call, Sell the Put… Arbitrage?

$C - P = N(d_1)\,S - N(d_2)\,K e^{-r(T-t)} + N(-d_1)\,S - N(-d_2)\,K e^{-r(T-t)}$

$C - P = N(d_1)\,S - N(d_2)\,K e^{-r(T-t)} + \left(1 - N(d_1)\right) S - \left(1 - N(d_2)\right) K e^{-r(T-t)}$

$C - P = S - K e^{-r(T-t)}$

The price difference is the spot price minus the present value of the strike. No arbitrage opportunity!

When does C=P?
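The formulas above fit in a short R function, which also lets the put-call parity be checked numerically. This is a sketch: `bs_price` and its argument names are illustrative, `tau` stands for T - t in years, and the inputs are those of the 105 call used in the binomial example further on in this course.

```r
# Black & Scholes prices of a European call and put (no dividend)
bs_price <- function(S, K, r, sigma, tau) {
  # tau is the time to maturity T - t, in years
  d1 <- (log(S / K) + (r + sigma^2 / 2) * tau) / (sigma * sqrt(tau))
  d2 <- d1 - sigma * sqrt(tau)
  call <- pnorm(d1) * S - pnorm(d2) * K * exp(-r * tau)
  put  <- -pnorm(-d1) * S + pnorm(-d2) * K * exp(-r * tau)
  c(call = call, put = put)
}

S <- 100; K <- 105; r <- 0.05; sigma <- 0.10; tau <- 1.5
prices <- bs_price(S, K, r, sigma, tau)
print(round(prices, 4))

# Put-call parity: C - P = S - K * exp(-r * tau)
parity_gap <- unname(prices["call"] - prices["put"]) - (S - K * exp(-r * tau))
print(parity_gap)
```

The parity gap is zero up to floating-point error, and setting `S` equal to `K * exp(-r * tau)` gives a call and a put with the same price.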

Page 45: Applied Statistics I


Greeks - Delta

For a Call: $\Delta_C = \frac{\partial C}{\partial S} = N(d_1)$

For a Put: $\Delta_P = N(d_1) - 1$

• First derivative of the value of the option with respect to the underlying price S
• Underlying equivalent position
• Approximate probability of the option being in the money at expiry

Delta ~0.5 if… S is the present value of the strike for a call

Delta [0,1] if… For a Call

Delta [-1,0] if… For a Put

What is the exact delta of a Long Call ATMF?

What is the delta of a combined Long Call and Long Put ATMF?

What is the delta of a combined Long Call and Short Put ATMF?

What is the new price of the Call ($7.9683) if S moves up $1.5 with delta=0.5398?

Page 46: Applied Statistics I


Greeks - Gamma

• Second derivative of the value of the option with respect to the underlying price S
• First derivative of the delta with respect to the underlying price S
• Pace of the delta movement
• Second order Greek

$\gamma = \frac{\partial \Delta}{\partial S} = \frac{N'(d_1)}{S \sigma \sqrt{T-t}}$

Gamma [0,1] if… Long option

Gamma [-1,0] if… Short option

Gamma=max if… ATMF

What is the new price of the Call ($7.9683) if S moves up $1.5 with delta=0.5398 and a gamma of 0.0198?

Need to use a second order expansion (Taylor Series)

Page 47: Applied Statistics I


Greeks – Delta/Gamma

$C_{new} = C + \Delta\,dS + \frac{1}{2}\,\gamma\,dS^2$

New price: 8.8003

Forgetting Gamma is dangerous, difference is 0.25% in our example!

What is the new delta? 0.5695

How to delta hedge and gamma hedge?

Third order known as Speed, hardly used…

Write the Taylor Development until the Speed level… The extra term is $\frac{1}{6}\,Speed\,dS^3$
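The delta and delta-gamma estimates are a two-line calculation. The sketch below uses the figures given on these slides; the variable names are illustrative.

```r
C0 <- 7.9683       # current call price
delta <- 0.5398
gamma <- 0.0198
dS <- 1.5          # move in the underlying

# First-order (delta only) and second-order (delta + gamma) estimates
C_delta <- C0 + delta * dS
C_delta_gamma <- C0 + delta * dS + 0.5 * gamma * dS^2

# Gamma is the rate of change of delta, so the new delta is approximately:
delta_new <- delta + gamma * dS

# Relative error made by ignoring gamma
gamma_effect <- (C_delta_gamma - C_delta) / C_delta_gamma

print(round(c(delta_only = C_delta, delta_gamma = C_delta_gamma,
              new_delta = delta_new, gamma_effect_pct = 100 * gamma_effect), 4))
```

The gamma term grows with the square of the move, so the gap between the two estimates widens quickly for larger values of `dS`.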

Page 48: Applied Statistics I


Greeks - Vega

• First derivative of the value of the option with respect to the implied volatility
• Volatility sensitivity
• First order Greek

$\tau = \frac{\partial C}{\partial \sigma} = S\,N'(d_1)\sqrt{T-t}$

Note: Vega is not an actual Greek letter! Tau is used…

Vega [0,1] if… Long option

Vega [-1,0] if… Short option

What is the new price of the Call ($7.9683) if the volatility moves up 1.5 point with a 0.7942 Vega?

Second order Greeks exist: Vanna (sensitivity of Vega to the underlying price) and Vomma (sensitivity of Vega to volatility)… Hardly used as they can’t be hedged easily. Volatility of the volatility is THE BIG problem in finance!

Page 49: Applied Statistics I


Greeks - Theta

Simply the time decay: $\theta = \frac{\partial V}{\partial t}$

$\theta_C = -\frac{S\,N'(d_1)\,\sigma}{2\sqrt{T-t}} - rKe^{-r(T-t)}N(d_2)$

$\theta_P = -\frac{S\,N'(d_1)\,\sigma}{2\sqrt{T-t}} + rKe^{-r(T-t)}N(-d_2)$

Theta >0 if… Short option

Theta <0 if… Long option

Time also has noticeable effects on Delta (Charm), Gamma (Color) and Vega (DvegaDtime)

What is the new price of the Call ($7.9683) in 2 days with -0.9920 Theta?

Theta is an annual value

Page 50: Applied Statistics I


Greeks - Rho

• First derivative of the value of the option with respect to the interest rate

$\rho_C = \frac{\partial C}{\partial r} = K\,(T-t)\,e^{-r(T-t)}N(d_2)$

$\rho_P = -K\,(T-t)\,e^{-r(T-t)}N(-d_2)$

What is the new price of the Call ($7.9683) if r moves up 1 basis point with Rho=184.1895?

Careful, high convexity. Need a second order for extreme movement.

Page 51: Applied Statistics I


Sum Up - Example

What is the new price of the Call ($7.9683) if S moves up $1.5 with delta=0.5398 and a gamma of 0.0198, volatility moves up 1.5 point with a 0.7942 Vega, r moves up 1 basis point with Rho=184.1895, and 2 days pass with a final Theta of -0.9920?

Estimate with the Greeks: 10.0147

Real pricing: 10.0094

Difference of only 0.05%, mainly due to the other effects of time decay on the Greeks, but it’s pretty close!

Page 52: Applied Statistics I


Sum Up Greeks/Time

Call 100, S=105, r=5%, Maturity from 4y, Vol=10%

Page 53: Applied Statistics I


Sum Up Greeks/Spot Price

Call 100, r=5%, Maturity 4y, Vol=10%

Page 54: Applied Statistics I


Sum Up Greeks/Strike

S=105, r=5%, Maturity 4y, Vol=10%

Page 55: Applied Statistics I


Sum Up Greeks/Vol

Call 100, S=105, r=5%, Maturity 4y

Page 56: Applied Statistics I


Conclusion on B&S

Great, easy, quick

Strong assumptions, continuous

Only European options

We need a path dependent method!

It will allow us to include early exercise and dividends, and to price European digitals,…

Page 57: Applied Statistics I


Binomial Model (Cox, Ross, Rubinstein, 1979)

Path dependent (valuation of European options, American options, Digital,…)

May include dividends

Discretisation of the continuous random walk

Why?

How?

Page 58: Applied Statistics I


Binomial Model: principles

3 Steps

• “Slice” the maturity into a predefined number of steps
• Construct a tree lattice representing the stock price following a GBM
• Price the option by backward induction

Page 59: Applied Statistics I


Let’s assume the maturity is divided into 2 steps

Page 60: Applied Statistics I


Cox Ross Rubinstein up and down factors based on GBM

At each node, S goes up or down by one SD

$u = e^{\sigma\sqrt{t}}$

$d = \frac{1}{u} = e^{-\sigma\sqrt{t}}$

Product of the factors: $u \times d = 1$

Can you think of other methods? Which ones? Why? Which ones are better?

Page 61: Applied Statistics I


Let’s build a tree with 3 steps, with S=100, σ=10%, 1.5 years to maturity

$u = e^{\sigma\sqrt{t}} = e^{0.1\sqrt{0.5}} = 1.073271$

$d = e^{-\sigma\sqrt{t}} = e^{-0.1\sqrt{0.5}} = 0.931731$

Step 0: 100
Step 1: 107.33 / 93.17
Step 2: 115.19 / 100 / 86.81
Step 3: 123.63 / 107.33 / 93.17 / 80.89

Be clever building it!

What happened to the drift implied by the risk free rate?

Page 62: Applied Statistics I


What is the price of the stock at any given node?

$S_n = S_0\,u^{N_u - N_d}$

How many nodes do you have at the end of the tree?

$n + 1$

If the number of steps is even, what’s the value of the middle node on the last step?

$S_0$

Page 63: Applied Statistics I


Step 0: 100
Step 1: 107.33 / 93.17
Step 2: 115.19 / 100 / 86.81
Step 3: 123.63 / 107.33 / 93.17 / 80.89

Having S at maturity, it’s easy to get the price of a EU Call 105 at maturity

S = 123.63: 18.63
S = 107.33: 2.33
S = 93.17: 0
S = 80.89: 0

Backward induction: we have the probabilities, let’s assume a risk free rate of 5%

Page 64: Applied Statistics I


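[Figure: node S = 115.19 with its up (u) and down (d) branches to 123.63 and 107.33, where the call is worth 18.63 and 2.33]

Need to calculate the new probabilities integrating the Risk Free Rate to comply with the risk neutrality assumption

$S e^{rt} = pSu + (1-p)Sd$

$e^{rt} = pu + (1-p)d$

Therefore:

$p = \frac{e^{rt} - d}{u - d}$

$BV = \left(OpUp \times p + OpDown \times (1-p)\right) e^{-rt}$

Binomial value at this node: 12.78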

Page 65: Applied Statistics I


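Stock price (call value) at each node:

Step 3: 123.63 (18.63) / 107.33 (2.33) / 93.17 (0) / 80.89 (0)
Step 2: 115.19 (12.78) / 100 (1.5) / 86.81 (0)
Step 1: 107.33 (8.74) / 93.17 (0.97)
Step 0: 100 (5.96)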

Page 66: Applied Statistics I


A European 105 Call option with 1.5 years to maturity, a volatility of 10% and a risk free rate of 5%, priced with three steps, is worth 5.96

How much with B&S? 6.22

Significant difference, why?

Sensitivity to the number of steps

The more steps, the less discrete and the more continuous

Extrapolated to infinity, you’d find your GBM and so B&S!
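The three-step tree can be reproduced, and the number of steps varied, with a short function. This is a sketch of the CRR recipe described on the previous slides (up and down factors, risk-neutral probability, backward induction); the function name `crr_european_call` and its argument names are illustrative.

```r
# CRR binomial price of a European call (no dividend)
crr_european_call <- function(S0, K, r, sigma, maturity, n) {
  dt <- maturity / n
  u <- exp(sigma * sqrt(dt))
  d <- 1 / u
  p <- (exp(r * dt) - d) / (u - d)   # risk-neutral probability of an up move
  disc <- exp(-r * dt)

  # Payoffs at maturity, from n up moves (top node) down to 0 up moves
  ups <- n:0
  value <- pmax(S0 * u^ups * d^(n - ups) - K, 0)

  # Backward induction: discounted expected value under p
  for (step in n:1) {
    value <- disc * (p * value[1:step] + (1 - p) * value[2:(step + 1)])
  }
  value
}

print(round(crr_european_call(100, 105, 0.05, 0.10, 1.5, 3), 2))

# Sensitivity to the number of steps
steps <- c(3, 10, 40, 200)
print(round(sapply(steps, function(n)
  crr_european_call(100, 105, 0.05, 0.10, 1.5, n)), 3))
```

With three steps the function returns the tree value found above; as the number of steps grows, the price moves towards the B&S figure.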

Page 67: Applied Statistics I


[Figure: CRR and B&S prices (y-axis from 5.9 to 6.4) against the number of steps (1 to 71)]

B&S / CRR Convergence: usually 40 steps are reasonable

Page 68: Applied Statistics I


I meant American Option! Let’s start all over again…

CRR main advantage is the ability to price American Options

Page 69: Applied Statistics I


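On each node you need to check any early exercise possibility, comparing the binomial value with the intrinsic value

Step 3 (maturity): S = 123.63: 18.63 / S = 107.33: 2.33 / S = 93.17: 0 / S = 80.89: 0

Step 2:
S = 115.19: binomial value 12.78, intrinsic value 10.19
S = 100: binomial value 1.5, intrinsic value 0
S = 86.81: binomial value 0, intrinsic value 0

Step 1:
S = 107.33: binomial value 8.74, intrinsic value 2.33
S = 93.17: binomial value 0.97, intrinsic value 0

Step 0:
S = 100: binomial value 5.96

But sometimes holding is better than exercising. In this case no early exercise is worthwhile, so the European Call and the American Call have the same price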

Page 70: Applied Statistics I


Pricing of an American Put option, S=50, K=50 with a 10% risk free rate, a 40% volatility, 5 steps and time to maturity 0.4167 year.

Tree of stock price

Page 71: Applied Statistics I


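Binomial value at the last and next-to-last steps (i.e. valuing as if it were a European Put), nodes listed from the highest stock price to the lowest

Step 5 (maturity): 0 / 0 / 0 / 5.45 / 14.64 / 21.93
Step 4: 0 / 0 / 2.66 / 9.90 / 18.08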

Page 72: Applied Statistics I


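Is any early exercise worthwhile?

Step 5 (maturity): 0 / 0 / 0 / 5.45 / 14.64 / 21.93
Step 4, binomial value: 0 / 0 / 2.66 / 9.90 / 18.08
Step 4, intrinsic value: 0 / 0 / 0 / 10.31 / 18.5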

Page 73: Applied Statistics I


Finally…

Step 5 (maturity): 0 / 0 / 0 / 5.45 / 14.64 / 21.93
Step 4: 0 / 0 / 2.66 / 10.31 / 18.5
Step 3: 0 / 1.30 / 6.38 / 14.64
Step 2: 0.64 / 3.77 / 10.36
Step 1: 2.16 / 6.96
Step 0: 4.49

Page 74: Applied Statistics I


The American Put is worth 4.49

The European Put is worth 4.32

The difference can be non-negligible
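The early exercise check can be added to the backward induction with one extra comparison per step. The sketch below prices the put of this example both ways; `crr_put` and its argument names are illustrative, and the code assumes no dividend.

```r
# CRR binomial price of a put, European or American (no dividend)
crr_put <- function(S0, K, r, sigma, maturity, n, american = TRUE) {
  dt <- maturity / n
  u <- exp(sigma * sqrt(dt))
  d <- 1 / u
  p <- (exp(r * dt) - d) / (u - d)   # risk-neutral probability of an up move
  disc <- exp(-r * dt)

  # Payoffs at maturity, from the highest stock price to the lowest
  value <- pmax(K - S0 * u^(n:0) * d^(0:n), 0)

  for (step in (n - 1):0) {
    # Binomial value: discounted expected value of holding the option
    hold <- disc * (p * value[1:(step + 1)] + (1 - p) * value[2:(step + 2)])
    if (american) {
      S <- S0 * u^(step:0) * d^(0:step)   # stock prices at this step
      value <- pmax(hold, K - S)          # exercise if the intrinsic value is higher
    } else {
      value <- hold
    }
  }
  value
}

am <- crr_put(50, 50, 0.10, 0.40, 0.4167, 5, american = TRUE)
eu <- crr_put(50, 50, 0.10, 0.40, 0.4167, 5, american = FALSE)
print(round(c(american = am, european = eu), 2))
```

Restricting the `pmax` comparison to selected steps would give a Bermuda-style exercise schedule; that variant is not shown here.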

Page 75: Applied Statistics I


Pricing of a European Digital Put option, Q=15, S=50, K=50 with a 10% risk free rate, a 40% volatility, 5 steps and time to maturity 0.4167 year.

Tree of stock price

The pay-off at maturity is binary: 0 if out of the money, Q if in the money

Page 76: Applied Statistics I


The pay-off on the last step is then straightforward (nodes listed from the highest stock price to the lowest)

Step 5 (maturity): 0 / 0 / 0 / 15 / 15 / 15

Page 77: Applied Statistics I


Then the method doesn’t change… Backward induction.

Step 5 (maturity): 0 / 0 / 0 / 15 / 15 / 15
Step 4: 0 / 0 / 7.33 / 14.88 / 14.88
Step 3: 0 / 3.58 / 10.96 / 14.75
Step 2: 1.75 / 7.15 / 12.72
Step 1: 4.38 / 9.81
Step 0: 7.00

Page 78: Applied Statistics I


Pricing of a Bermuda Put option, S=50, K=50 with a 10% risk free rate, a 40% volatility, 5 steps and time to maturity 0.4167 year.

Tree of stock price

Let’s suppose this Bermuda can only be exercised between the 4th and 5th step

Page 79: Applied Statistics I


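Is any early exercise worthwhile?

Step 5 (maturity): 0 / 0 / 0 / 5.45 / 14.64 / 21.93
Step 4, binomial value: 0 / 0 / 2.66 / 9.90 / 18.08
Step 4, intrinsic value: 0 / 0 / 0 / 10.31 / 18.5

No exercise on the earlier steps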

Page 80: Applied Statistics I


Finally…

Step 5 (maturity): 0 / 0 / 0 / 5.45 / 14.64 / 21.93
Step 4: 0 / 0 / 2.66 / 10.31 / 18.5
Step 3: 0 / 1.30 / 6.38 / 14.22
Step 2: 0.64 / 3.77 / 10.16
Step 1: 2.16 / 6.86
Step 0: 4.44

A “full” American option would have been exercised, not this one

Page 81: Applied Statistics I


Pricing of a Put option, S=50, K=50 with a 10% risk free rate, a 40% volatility, 5 steps and time to maturity 0.4167 year, paying a $2.06 dividend in 3.5 months.

3 Steps

• Construct the usual tree
• Subtract the present value of the dividend on each node before it occurs
• Pricing can continue as usual

The dividend occurs between the 3rd and 4th step

Value at step 0: $PV = 2.06\,e^{-10\% \times \frac{3.5}{12}} = 2.00$

Value at step 1: $PV = 2.06\,e^{-10\% \times \left(\frac{3.5}{12} - \frac{0.4167}{5}\right)} = 2.02$

Value at step 2: $PV = 2.06\,e^{-10\% \times \left(\frac{3.5}{12} - \frac{0.4167 \times 2}{5}\right)} = 2.03$

Value at step 3: $PV = 2.06\,e^{-10\% \times \left(\frac{3.5}{12} - \frac{0.4167 \times 3}{5}\right)} = 2.05$
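The four present values follow one pattern, so they can be computed in a single vectorised expression. A minimal sketch with illustrative variable names:

```r
dividend <- 2.06
r <- 0.10
t_div <- 3.5 / 12        # dividend date, in years
dt <- 0.4167 / 5         # length of one tree step, in years

steps <- 0:3             # steps occurring before the dividend date
pv <- dividend * exp(-r * (t_div - steps * dt))

print(round(setNames(pv, paste("step", steps)), 2))

# Step 4 falls after the dividend date, so no adjustment is needed from there on
print(4 * dt > t_div)
```

The closer the step is to the dividend date, the less the dividend is discounted, which is why the adjustment rises from step 0 to step 3.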

Page 82: Applied Statistics I


Tree of stock prices adjusted for the dividend

Page 83: Applied Statistics I


Pricing by the usual backward induction (don’t forget potential early exercise)

Step 5 (maturity): 0 / 0 / 0 / 5.45 / 14.64 / 21.93
Step 4: 0 / 0 / 2.66 / 10.31 / 18.50
Step 3: 0 / 1.30 / 6.38 / 14.22
Step 2: 0.64 / 3.77 / 10.16
Step 1: 2.16 / 6.86
Step 0: 4.44

Page 84: Applied Statistics I


CRR Sum-Up

• The American Put is worth 4.49
• The European Put is worth 4.32
• The Digital Put paying 15 is worth 7.00
• The Bermuda Put with exercise on the last fifth of the maturity is worth 4.44
• The American Put paying a 2.06 dividend is worth 4.44

You can price virtually anything you want!

What can’t you price?

Page 85: Applied Statistics I


Pricing of a Barrier Put option, S=50, K=50 with a 10% risk free rate, a 40% volatility, 5 steps and time to maturity 0.4167 year with a knock-out barrier at 60

Tree of stock price

The option is cancelled (KO) if S goes to 60

Way to reduce the price of the option

You can’t tell how much the option is worth on this final node: 0 or 5.45?

Page 86: Applied Statistics I


CRR Extension

How to converge faster to the correct option price?

Add a third factor
• Up
• Down
• Stable

Careful, the tree has to recombine!

[Figure: a recombining tree (YES) and a non-recombining tree (NO)]

Page 87: Applied Statistics I


Conclusion

Normal Distribution

GBM

B&S

CRR