
Post on 14-Jan-2022


Risk Management

Topic One – Financial risk: Loss distributions and risk measures

1.1 Types of financial risks

• Systemic risk: Wall Street fails Main Street

• Black Monday in 1987: Market failure arising from program trading

• 2010 Flash crash

• Cash flow and liquidity risk

• Operational / Model / Legal risks

1.2 Hedging of market risks

• Dynamic hedging of options

• Minimum variance hedge ratio


1.3 Portfolio loss distribution

• Credit risk: Loan portfolio losses

• Fitting of loss distribution

1.4 VaR (Value-at-Risk), expected shortfall and economic capital

• VaR calculations

• Expected shortfall

• Coherent risk measures

• Risk control for expected utility-maximizing investors

• Economic capital

• Extreme value theory


1.1 Types of financial risks

Risk can be defined as loss or exposure to mischance, or more quantita-

tively as the volatility of unexpected outcomes, generally related to the

value of assets or liabilities of concern. It is best measured in terms of

probability distribution functions.

• While some firms may passively accept financial risks, others attempt

to create a competitive advantage by judicious exposure to financial

risk. This is similar to the counterparties (buyer of insurance and

insurance company) in insurance contracts. Another example is the

highly liquid credit default swap, where the counterparties are the

protection buyer and protection seller. The protection is referencing

single credit asset or basket of risky assets. In both cases, these risks

must be monitored because of their potential for damage (or even

ruin).

Risk management is the process by which various risk exposures are iden-

tified, measured, and controlled.


Systemic risk: Wall Street fails Main Street

Systemic risk is the risk of loss from some catastrophic event that can

trigger a collapse in a certain industry or economy.

• The Asian turmoil of 1997 wiped off about three-fourths of the dollar

capitalization of equities in Indonesia, Korea, Malaysia, and Thailand.

• The Russian default in August 1998 sparked a global financial crisis

that culminated in the near failure of a big hedge fund, Long Term

Capital Management.

• The subprime lending crisis and mortgage meltdown triggered the financial tsunami in 2008, which saw the bankruptcy of Lehman Brothers and the failure of other major financial institutions, like AIG, Citigroup and Merrill Lynch.

• The European debt crisis in 2011, caused by failure of several euro-

zone member states (PIIGS: Portugal, Italy, Ireland, Greece, Spain)

to repay or refinance their government debts, triggered crashes in the

financial markets around the globe.


Black Monday, October 19, 1987

U.S. stocks (DJIA) collapsed by about 22.6 percent, wiping out $1 trillion in capital. This decline remains the largest one-day percentage decline in the DJIA. It was a trading event, not an economic one.

• The most important factor was program trading. In its intention to

protect every single portfolio from risk, it became the largest single

source of market risk.

• The computer programs in program trading automatically began to

liquidate stocks as certain loss targets were hit, pushing prices lower.

The lower prices fueled more liquidation with stocks dropping 22%

on the day. This is an example of behavior that is rational on an

individual level, but irrational if everyone adopts the same behavior.


• One of the core assumptions of these programs was proven false: they assumed that there would be sufficient buyers and sellers on both sides to provide enough liquidity. The same programs also automatically turned off all buying. Since these programs were widely adopted by institutional investors, buy orders vanished all around the stock market at basically the same time that these programs started selling.

Global market impacts in the aftermath

By the end of October, stock markets had fallen in Hong Kong (45.5%),

Australia (41.8%), Spain (31%), United Kingdom (26.45%). Howev-

er, the US economy was barely affected and growth actually increased

throughout 1987 and 1988, with the DJIA regaining its pre-crash closing

high of 2,722 points in early 1989. Can we invest like Buffett: buy on the fear, sell on the greed?


Downward market corrections occurred on a few days prior to Black Monday, coupled with a political crisis (Iran hit an American supertanker with a Silkworm missile).

Market factor: excessive valuations

• 1987 had been a strong year for the stock market leading up to the

crash, as it continued the bull market that began in 1982. The stock

market and economy were diverging for the first time in the bull

market. Due to this factor, valuation for the stock market climbed to

excessive levels, with the price to earnings ratio climbing above 20.

Future estimates for earnings were trending lower, yet stock prices were unaffected.

• Market participants were aware of these issues, but the use of portfolio

insurance (hedging a portfolio of stocks against the market risk by

short selling index futures) led many to ignore these warning signs.

This false belief ended up fueling excessive risk taking, which only

became apparent when stocks began to weaken in the days leading

up to the stock market crash.


2010 Flash Crash

The May 6, 2010 Flash Crash was a trillion-dollar stock market crash, which

started at 2:32pm and lasted for approximately 36 minutes. The DJIA

plunged 998.5 points (about 9%), most within minutes, only to recover

a large part of the loss.


• Navinder Singh Sarao (London based home trader) was trading E-

mini S&P 500 futures contracts as a spoofer (bid or offer with intent

to cancel before the orders are filled). He would put in a big order to

sell a whole bunch of futures at a price a few ticks higher than the

best offer. He would not sell any futures since he was not offering

the best price. But he had to keep constantly updating his orders to

keep them a few ticks higher than the best offer, to make sure that

he did not accidentally sell any futures as the market moved.

• Sarao built a spoofing robot (an order is canceled by an automated

algorithm if the market gets close) and traded a ton of E-mini fu-

tures during the flash crash: “62,077 E-mini S&P contracts with a

notional value of $3.5 billion” and made “approximately $879,018 in

net profits” on that day.


Cash flow risk: Long Term Capital Management

• Portfolios that are highly leveraged are subject to margin calls from the lender. The portfolio manager may be forced to liquidate the assets, thereby transforming paper losses into realized losses.

• Long Term Capital Management (LTCM) was a hedge fund formed in the mid 1990s. The hedge fund’s investment strategy was known as

convergence arbitrage.

• It would find two bonds, X and Y , issued by the same company

promising the same payoffs, with X being less liquid than Y . The

market always places a value on liquidity. As a result the price of

X would be less than the price of Y . LTCM would buy X, short Y

and wait, expecting the prices of the two bonds to converge at some

future time. Normally, with better liquidity on X provided by LTCM,

price of X moves up while price of Y moves down.


• When interest rates increased (decreased), the company expected

both bonds to move down (up) in price by about the same amount, so

that the collateral it paid on bond X would be about the same as the

collateral it received on bond Y . It therefore expected that there would

be no significant outflow of funds as a result of its collateralization

agreements.

• In August 1998, Russia defaulted on its debt and this led to what is

termed a “flight to quality” in capital markets. One result was that

investors valued liquid instruments more highly than usual and the

spreads between the prices of the liquid and illiquid instruments in

LTCM’s portfolio increased dramatically (instead of convergence).


Posting Collateral

• The prices of the bonds LTCM had bought went down and the prices

of those it had shorted increased. LTCM was required to post collat-

eral on both.

• The company was highly leveraged (about 50 times) and unable to

make the payments required under the collateralization agreements.

The result was that positions had to be closed out and there was a

total loss of about $4 billion.

• If the company had been less highly leveraged, it would probably have

been able to survive the flight to quality and could have waited for

the prices of the liquid and illiquid bonds to become closer (eventual

convergence of prices). Collateralization is meant to safeguard against default risk. However, in many historical cases (like Orange County in its Treasury bond positions and AIG in its senior default swap positions), the call for posting collateral led to default due to the immediate realization of losses.


Liquidity risk

Asset liquidity risk

This arises when a transaction cannot be conducted at prevailing market

prices due to the size of the position relative to normal trading lots.

– Some assets, like Treasury bonds, have deep markets where most

positions can be liquidated easily with very little price impact.

– For other assets, like OTC (over-the-counter) derivatives or emerging market equities, any significant transaction can quickly affect prices.


Measuring market liquidity

If someone is willing to bid for a stock at $9.9 but a seller is only willing to post an offer price at $10.1, then the bid-offer spread is $10.1 − $9.9 = $0.2.

One measure of the market liquidity of an asset is its bid-offer spread.

This can be measured either as a dollar amount or as a proportion of the

asset price. The dollar bid-offer spread is

\[ p = \text{Offer price} - \text{Bid price}. \]

The proportional bid-offer spread for an asset is defined as

\[ s = \frac{\text{Offer price} - \text{Bid price}}{\text{Mid-market price}}, \]

where the mid-market price is halfway between the bid and the offer price.


One measure of the liquidity of a book is how much it would cost to liquidate the book in normal market conditions within a certain time. Suppose that \(s_i\) is an estimate of the proportional bid-offer spread in normal market conditions for the ith financial instrument held by a financial institution and \(\alpha_i\) is the dollar value of the position in the instrument. We then have

\[ \text{cost of liquidation (normal market)} = \sum_{i=1}^{n} \frac{s_i \alpha_i}{2}, \]

where n is the number of positions. The factor 1/2 reflects that a trade is executed at the bid or the offer, each of which lies half a spread away from the mid-market price.

Since \(s_i\) increases with the size of position i, holding many small positions rather than a few large positions tends to entail less liquidity risk. Setting limits on the size of any one position can therefore be one way of reducing liquidity trading risk.
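The two spread definitions and the liquidation-cost sum can be illustrated with a short sketch. The stock quotes ($9.9 bid, $10.1 offer) come from the text; the two-position book, its dollar values and the 5% spread on the second instrument are hypothetical numbers chosen only for illustration.

```python
def proportional_spread(bid: float, offer: float) -> float:
    """Proportional bid-offer spread: (offer - bid) / mid-market price."""
    mid = 0.5 * (bid + offer)
    return (offer - bid) / mid


def liquidation_cost(positions) -> float:
    """Sum of s_i * alpha_i / 2 over (spread, dollar value) pairs."""
    return sum(0.5 * s * alpha for s, alpha in positions)


# Stock from the text: bid $9.9, offer $10.1, so the mid-market price is $10.0.
s_stock = proportional_spread(9.9, 10.1)

# Hypothetical book: $10 million of that stock plus $5 million of a
# second instrument with an assumed 5% proportional spread.
book = [(s_stock, 10_000_000), (0.05, 5_000_000)]
cost = liquidation_cost(book)

print(f"proportional spread = {s_stock:.4f}")
print(f"cost of liquidation = ${cost:,.0f}")
```

Under these assumed numbers the less liquid instrument contributes more to the liquidation cost than the stock, even though the position is half the size.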

Under abnormal market conditions the bid-ask spread may widen; this is known as a liquidity crunch.

Loss of risk controls: Barings Bank disaster

• Nicholas Leeson, an employee of Barings Bank in the Singapore office

in 1995, had a mandate to look for arbitrage opportunities between

the Nikkei 225 futures prices on the Singapore exchange and the

Osaka exchange. Over time Leeson moved from being an arbitrageur to being a speculator without anyone in the Barings London head office noticing that he had changed the way he was using derivatives.

In 1994, Leeson is thought to have made $20 million for Barings,

one-fifth of the total firm’s profit. He drew a $150,000 salary with a

$1 million bonus.

• He began to make losses in 1995, which he was able to hide. He then

began to take bigger speculative positions in an attempt to recover

the losses, but only resulted in making the losses worse. In the end,

Leeson’s total loss was close to 1 billion dollars. As a result, Barings

– a bank that had been in existence for 200 years – was wiped out.


• Lesson to be learnt: Both financial and nonfinancial corporations

must set up controls to ensure that derivatives are being used for

their intended purpose. Risk limits should be set and the activities

of traders should be monitored daily to ensure that the risk limits are

adhered to.

Similar cases continued to occur in later years

• Société Générale lost $6.7 billion in January 2008 after trader Jérôme

Kerviel took unauthorized positions on European stock index futures.

• UBS lost $2 billion in September 2011 after Kweku Adoboli took

unauthorized positions on currency swap trades.


Model risk

• Mathematicians understand the limitations of the model while practi-

tioners use them to advance their flavors.

The mathematical model used to value positions is misused. A good example is the rating of Constant Proportion Debt Obligations (CPDOs) based on flawed mathematical models. Investors received a coupon rate of LIBOR + 200 bps on CPDOs, which had been rated AAA. Most investors in CPDOs lost almost 100% of their investments during the financial tsunami in 2008.

• Suppose 3 different models give prices of $6 million, $7.5 million and

$8.5 million for a particular structured product. Even if the financial

institution believes the first model is the best one and plans to use

that model for daily repricing and hedging, it would ensure that the price it charges the client is at least $8.5 million. If the product is sold at $9 million, the potential profit will be $3 million if market behavior follows the first model.


Legal risk

Legal risk is generally related to credit risk, since counterparties that lose

money on a transaction may try to find legal grounds for invalidating

the transaction. It may take the form of shareholder lawsuits against

counterparty corporations.

Examples

• Two municipalities in Britain had taken large positions in interest rate

swaps that turned out to produce large losses. The swaps were later

ruled invalid by the British High Court. The court decreed that the city

councils did not have the authority to enter into these transactions

and so the cities were not responsible for the losses. Their bank

counterparties had to swallow the losses.

• A Hong Kong example is the Lehman Brothers’ mini-bonds. Local

banks which sold these sophisticated products to layman (measured

by education level) investors had to swallow the losses.


1.2 Hedging of market risks

Dynamic hedging of options

A trader sells 100,000 European call options on a non-dividend-paying

stock: S = $49, X = $50, r = 5%, σ = 20%, T = 20 weeks.

Terminal payoff of a call option = \(\max(S_T - X, 0)\).

Some theoretical considerations

Let \(V(S, t)\) denote the price function of the European call option, where S is the underlying stock price and t is the calendar time.

Write \(\Delta = \dfrac{\partial V}{\partial S}\) as the rate of change of V with respect to S. Note that \(0 \le \Delta \le 1\) (why?). Suppose an issuer writes a European call option; he is faced with the liability of paying \(S_T - X\) at maturity T if \(S_T > X\) (the option buyer chooses to exercise).

The risk arises from the fluctuation of the stock price S.

Given the short position in V, the writer can hedge the liability exposure by buying α units of the stock to make the portfolio delta neutral.

The value \(V(S, t)\) and the delta \(\Delta = \dfrac{\partial V}{\partial S}\) of the call are calculated based on an option pricing model (potential exposure to model risk since the hedger needs to specify the volatility of the underlying asset price).

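Let \(\pi(S, t)\) be the portfolio value, where

\[ \pi(S, t) = -V(S, t) + \alpha S. \]

The risk arises from S, so we are interested in examining \(\dfrac{\partial \pi}{\partial S}\) (the variation of the portfolio value with varying levels of S). We obtain

\[ \frac{\partial \pi}{\partial S} = -\frac{\partial V}{\partial S} + \alpha = -\Delta + \alpha. \]

If we choose \(\alpha = \Delta\), then the portfolio becomes delta neutral.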

Black and Scholes pioneered the concept of riskless hedging to derive the option pricing model. Scholes received the Nobel Prize in Economics in 1997 in recognition of this contribution (Black passed away in 1995).

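Intuitively, suppose \(\Delta = \dfrac{\partial V}{\partial S} = 0.3\). A one dollar increase in S then leads to a $0.3 increase in the option value, so the writer's liability increases by $0.3. If the writer holds 0.3 units of stock simultaneously, the gain on the stock offsets this increase and the net change in the value of the hedged portfolio is zero.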

Note that ∆ has dependence on S and t, so ∆ changes dynamically over

the life of the option as S changes continuously. The number of units

of stock held by the option writer has to change dynamically over time.

This is why we use the term “dynamic hedging”.

This is not the same as the forward contract, where ∆ = 1 (always

holds one unit of the underlying asset) since the forward buyer has the

obligation to buy the underlying asset (unlike an option buyer who has

the right but not the obligation to exercise the option).


Black-Scholes-Merton option pricing formula:

\[ V(S, t) = S N(d_1) - X e^{-r(T-t)} N(d_2), \]

where

\[ N(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt, \]

\[ d_1 = \frac{\ln\frac{S}{X} + \left(r + \frac{\sigma^2}{2}\right)(T - t)}{\sigma\sqrt{T - t}} \quad \text{and} \quad d_2 = \frac{\ln\frac{S}{X} + \left(r - \frac{\sigma^2}{2}\right)(T - t)}{\sigma\sqrt{T - t}}. \]

Here, σ is the volatility (standard deviation of the daily log return) of the stock price. The interest rate r is assumed to be constant since the impact of fluctuation in r on option price is secondary.
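As a rough check of the formula, the sketch below evaluates it for the call in the hedging example (S = $49, X = $50, r = 5%, σ = 20%, T = 20 weeks). It assumes time is measured in years (20/52) and uses the standard result that the call delta equals N(d1); the function names are illustrative only.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal CDF N(x), written via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(S: float, X: float, r: float, sigma: float, tau: float):
    """Return (price, delta) of a European call; tau = T - t in years."""
    d1 = (log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)  # same as the (r - sigma^2/2) expression
    price = S * norm_cdf(d1) - X * exp(-r * tau) * norm_cdf(d2)
    delta = norm_cdf(d1)
    return price, delta


price, delta = bs_call(S=49.0, X=50.0, r=0.05, sigma=0.20, tau=20 / 52)
print(f"call price = {price:.2f}, delta = {delta:.3f}")
```

With these inputs the price and delta come out close to the $2.40 and 0.522 quoted in the worked hedging example.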


The challenge is the specification of σ, which is the major source of the model risk. The hedger may use the implied volatility inferred from traded options or his own choice of volatility. However, in general neither volatility would match the volatility of Mother Nature. The profit or loss depends on the difference between the actual volatility and the implemented volatility, multiplied by

\[ \frac{1}{2} S^2 \frac{\partial \Delta}{\partial S} = \frac{1}{2} S^2 \frac{\partial^2 V}{\partial S^2}, \]

where \(\dfrac{\partial \Delta}{\partial S} = \dfrac{\partial^2 V}{\partial S^2}\) is called the gamma Γ.

As its major contribution to option trading, the option pricing theory provides \(\Delta = \dfrac{\partial V}{\partial S} = N(d_1)\), which can be found once σ is specified. The writer has to rely on the option price formula to obtain Δ in his implementation of the dynamic hedging strategy.

The option pricing theory plays an important role to facilitate trading of

options. Before the advent of the Black-Scholes-Merton theory, trading

of options was not popular since writers did not know how to perform hedging and earn the fee beyond the fair premium at low risk.

• When the current stock price is low, the chance of expiring in-the-money is low, so the delta is close to zero.

• When the current stock price is high (relative to the strike price X), the chance of expiring in-the-money is high, so the delta is close to one.

• When the current stock price is around the strike price, the delta is around 0.5, representing roughly equal chance of expiring in-the-money or out-of-the-money.
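These three regimes can be seen numerically. The sketch below evaluates the call delta N(d1) at three stock prices, keeping the strike, rate, volatility and maturity of the hedging example; the particular stock prices $30, $50 and $70 are arbitrary choices for illustration.

```python
from math import erf, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal CDF N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_delta(S: float, X: float, r: float, sigma: float, tau: float) -> float:
    """Black-Scholes delta of a European call, N(d1)."""
    d1 = (log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    return norm_cdf(d1)


X, r, sigma, tau = 50.0, 0.05, 0.20, 20 / 52
stock_prices = [30.0, 50.0, 70.0]  # deep out-of-the-money, at, deep in-the-money
deltas = [call_delta(S, X, r, sigma, tau) for S in stock_prices]

for S, d in zip(stock_prices, deltas):
    print(f"S = {S:5.1f}  delta = {d:.4f}")
```

The at-the-money delta is somewhat above 0.5 rather than exactly 0.5, because the drift term in d1 is positive.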

Similar ideas can be extended to hedging a portfolio of instruments with

various risk sources. The challenges include

(i) identification of the risk sources;

(ii) sensitivity of the portfolio value with respect to these random risk

sources.

Some of these risk sources may not be prices of available traded instru-

ments, with examples like macro economic factor (say, Fed rate). The

Fed rate is not a direct tradeable instrument. However, there are in-

struments whose values are dependent on the Fed rate. Suppose one’s

portfolio value is dependent on the Fed rate, the portfolio risk can be

hedged by holding off-setting instruments whose values are also depen-

dent on the Fed rate.

The number of units held for the hedging instruments depends on the

ratio of sensitivity of the portfolio value and hedged instrument to the

underlying risk factor.


Dynamic hedging of a European call position at work

At the time of the trade, the call option fair value is $2.40 and the delta

is 0.522. Suppose the amount received by the seller for the options is

$300,000 (good for the seller). Since the seller is short 100,000 options,

the value of the seller’s portfolio is −$240,000.

Immediately after the trade, the seller’s portfolio can be made delta neutral by buying 52,200 shares of the underlying stock. The cost of shares purchased = 52,200 × $49 = $2,557.8 thousand.

Since the delta changes when the stock price changes over the life of the option, the trader has to adjust the stock holding via rebalancing in order to maintain delta neutrality. This is called dynamic hedging.

Scenario One: call option expires in-the-money


Cash flows arising from rehedging (dynamic rebalancing) and interest

costs

The stock price falls by the end of the first week to $48.12. The delta

declines to 0.458. A long position in 45,800 shares is now required to

hedge the option position. A total of 6,400 (= 52,200− 45,800) shares

are therefore sold to maintain the delta neutrality of the hedge.

The sale realizes $308,000 in cash. One week of interest on the initial borrowing,

2,557.8 thousand × 0.05/52 ≈ 2.5 thousand,

has to be added. The cumulative borrowings at the end of week 1 are therefore reduced to $2,252,300, which comes out as (in thousands)

2,557.8 − 308 + 2.5 = 2,252.3.
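The week-1 bookkeeping can be reproduced in a few lines. This is only a sketch of the arithmetic above, assuming the share purchase is fully financed by borrowing at 5% with simple weekly interest; the variable names are illustrative.

```python
OPTIONS_SOLD = 100_000
RATE = 0.05  # annual interest rate, applied as simple interest per week

# Week 0: buy delta * 100,000 shares at $49, financed by borrowing.
shares_week0 = round(OPTIONS_SOLD * 0.522)   # 52,200 shares
borrowing_week0 = shares_week0 * 49.00       # dollars

# Week 1: the delta falls to 0.458, so the excess shares are sold at $48.12.
shares_week1 = round(OPTIONS_SOLD * 0.458)   # 45,800 shares
shares_sold = shares_week0 - shares_week1    # 6,400 shares
proceeds = shares_sold * 48.12

# One week of interest accrues on the week-0 borrowing.
interest = borrowing_week0 * RATE / 52
borrowing_week1 = borrowing_week0 - proceeds + interest

print(f"shares sold          = {shares_sold:,}")
print(f"sale proceeds        = ${proceeds:,.0f}")
print(f"interest for 1 week  = ${interest:,.0f}")
print(f"cumulative borrowing = ${borrowing_week1:,.0f}")
```

Repeating the same three steps (rebalance, accrue interest, update borrowing) each week gives the cumulative cost of the hedge at maturity.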


• During the second week, the stock price reduces to $47.37 and delta

declines again. This leads to 5,800 shares being sold at the end of

the second week.

• During the third week, the stock price increases to over $50 and delta

increases. This leads to 19,600 shares being purchased at the end of

the third week!

Toward the maturity date of the option, it becomes apparent that the

option will be exercised and delta approaches 1.0. By week 20, therefore,

the hedger owns 100,000 shares.

Since the strike price is $50, the hedger receives $5 million (= 100,000 × $50) for these shares when the option is exercised, so that the total cost of hedging is $5,263,300 − $5,000,000 = $263,300.

How would you compare the fair value of the call, which is $2.4 at initiation

of the trade, with the total cost of hedging per unit of the call option,

which is $2.633?


It is necessary to adjust for the time value of money: the value of the call 20 weeks after the trade is $2.4 × (1 + 0.05 × 20/52) ≈ $2.446. Since this is below the hedging cost of $2.633 per option, the seller loses if he charges the price of the call option at the “fair value”.

• The higher cost of hedging when compared with the fair value may

be attributed to the overhedging due to delay in rebalancing (weekly

adjustment of hedging position). However, more frequent rebalancing

means higher transaction costs.

• Luckily, the seller received $3 per call option, so he maintains a gain of $3 × (1 + 0.05 × 20/52) − $2.633 = $3.058 − $2.633 = $0.425 per option at maturity.

• The delta-hedging procedure in effect creates a long position in the

option synthetically to neutralize the seller’s short option position.

The hedger is forced into the buy-high and sell-low trading strategy

since the hedging procedure involves selling stock just after the price

has gone down and buying stock just after the price has gone up.

Note that transaction costs have not been included.
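The comparison between the premium and the hedging cost can be laid out as a small calculation. The sketch below uses the figures quoted above and assumes simple interest over 20 weeks, as in the text; transaction costs are ignored.

```python
OPTIONS_SOLD = 100_000
growth = 1 + 0.05 * 20 / 52          # simple-interest factor over 20 weeks

hedge_cost_total = 5_263_300 - 5_000_000            # dollars, from Scenario One
hedge_cost_per_option = hedge_cost_total / OPTIONS_SOLD

fair_value_at_maturity = 2.40 * growth   # fair premium carried to week 20
premium_at_maturity = 3.00 * growth      # premium actually charged

gain_per_option = premium_at_maturity - hedge_cost_per_option
shortfall_at_fair_value = hedge_cost_per_option - fair_value_at_maturity

print(f"hedging cost per option      = ${hedge_cost_per_option:.3f}")
print(f"fair value at maturity       = ${fair_value_at_maturity:.3f}")
print(f"gain per option at $3        = ${gain_per_option:.3f}")
print(f"shortfall if sold at $2.40   = ${shortfall_at_fair_value:.3f}")
```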


Remarks

• As the call option expires in-the-money (ST = $57.25 and X = $50),

the total sum of the stock units purchased over the 20 weeks must be

100,000 shares. These shares can be delivered to honor the obligation

since the option buyer chooses to exercise the call.

• The fair call option premium received upfront by the writer is the

present value of the total costs of setting up the hedging procedure.


One may query whether the cost of buying 100,000 shares over time can

be covered by the option premium of $3 per each unit. For example, can

the hedger cover the high cost if the stock price increases sharply (say,

up to $150 which is well above X = $50)?

It is not necessary to worry if one follows the dynamic hedging procedure

throughout the whole life of the option (not to start buying more shares

only when the stock price increases sharply). This is quite a miracle.

Indeed, the hedger almost holds the full amount of 100,000 units well

before the stock price rises to $150.


Scenario Two: Call option expires out-of-the-money (no exercise of call)


Remarks

• In case the call option expires out-of-the-money, the net number of

shares bought throughout the hedging procedure would be zero.

• Though the hedged portfolio ends up with zero number of shares held

at maturity, there is cost incurred in performing the dynamic hedging

procedure. This hedging cost is compensated by the option premium

collected at initiation.

Understanding and implementation of the dynamic hedging strategy en-

hance the growth of trading of options (like the strong warrant markets

in Hong Kong).


Minimum variance hedge ratio

Suppose a risk manager wants to hedge his exposure on asset S using

N units of hedging instrument F . One example is the hedge of jet fuel

using available heating oil futures contract. The total change in the value

of the hedged portfolio from the current time to the end of the hedging

period is given by

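\[ \Delta V = \Delta S + N \Delta F. \]

Our goal is to find the number of units of the hedging instrument to be used in order to minimize the variance of \(\Delta V\) (which is used as a proxy of the risk exposure). Recalling that var(X + Y) = var(X) + 2cov(X, Y) + var(Y), we have

\[ \sigma^2_{\Delta V} = \sigma^2_{\Delta S} + 2N \sigma_{\Delta S, \Delta F} + N^2 \sigma^2_{\Delta F}. \]

The variances and covariance are expressed in dollars, not in rates of return. The first order condition to find \(N^*\) that minimizes \(\sigma^2_{\Delta V}\) is given by

\[ 0 = \frac{\partial \sigma^2_{\Delta V}}{\partial N} = 2\sigma_{\Delta S, \Delta F} + 2N \sigma^2_{\Delta F}. \]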

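In our later exposition, we drop Δ for notational simplicity. We obtain

\[ N^* = -\frac{\sigma_{SF}}{\sigma_F^2} = -\rho_{SF} \frac{\sigma_S}{\sigma_F}, \quad \text{where} \quad \rho_{SF} = \frac{\sigma_{SF}}{\sigma_S \sigma_F}. \]

Suppose we perform linear regression of \(\dfrac{\Delta S}{S}\) on \(\dfrac{\Delta F}{F}\), where

\[ \frac{\Delta S}{S} = \alpha + \beta_{SF} \frac{\Delta F}{F} + \epsilon, \]

where ε is the error term with zero mean. The beta coefficient is known to be \(\beta_{SF} = \sigma_{SF}/\sigma_F^2\), with the covariance and variance here taken for the rates of return (recall a similar derivation in CAPM). At the optimal hedge \(N^*\), the variance of \(\Delta V\) becomes

\[ \sigma_V^{*2} = \sigma_S^2 + \left(-\frac{\sigma_{SF}}{\sigma_F^2}\right)^2 \sigma_F^2 + 2\left(-\frac{\sigma_{SF}}{\sigma_F^2}\right)\sigma_{SF} = \sigma_S^2 - \frac{\sigma_{SF}^2}{\sigma_F^2}. \]

The variance is reduced by \(\sigma_{SF}^2/\sigma_F^2\). If the correlation coefficient of F and S is close to one, then

\[ \sigma_V^{*2} = \sigma_S^2 - \rho_{SF}^2 \sigma_S^2 \to 0^+ \quad \text{as} \quad \rho_{SF} \to 1^-. \]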

Effectiveness of the hedge can be quantified by R, where

\[ R^2 = \frac{\sigma_S^2 - \sigma_V^{*2}}{\sigma_S^2} \quad \text{or} \quad \sigma_V^* = \sigma_S \sqrt{1 - R^2}. \]

This represents the relative decrease of variance of S with the introduction of the \(N^*\) units of the hedging instrument. Combining all the above results, we obtain

\[ R^2 = \frac{\sigma_S^2 - \left(\sigma_S^2 - \dfrac{\sigma_{SF}^2}{\sigma_F^2}\right)}{\sigma_S^2} = \frac{\sigma_{SF}^2}{\sigma_S^2 \sigma_F^2} = \rho_{SF}^2. \]

As expected, \(R \to 1^-\) when \(\rho_{SF} \to 1^-\). This ideal scenario of almost perfect correlation corresponds to the absence of basis risk. Basis risk is the risk associated with imperfect hedging, which arises from the difference between the price of the asset to be hedged and the price of the asset serving as the hedge.

Example

An airline knows that it will need to purchase 10,000 metric tons of jet

fuel in three months. It wants some protection against an upturn in prices

using futures contracts.

The company can hedge using heating oil futures contracts traded on

NYMEX. The notional for one contract is 42,000 gallons. As there is no

futures contract on jet fuel, the risk manager wants to check if heating

oil could provide an efficient hedge instead. The current price of jet fuel

is $277/metric ton. The futures price of heating oil is $0.6903/gallon.

The standard deviation of the rate of change in jet fuel prices over three months is 21.17%, that of futures is 18.59%, and the correlation is 0.8243.

Compute the following quantities.

(a) The notional and standard deviation of the unhedged fuel cost in dollars.

(b) The optimal number of futures contracts to buy, rounded to the closest integer.

(c) The standard deviation of the hedged fuel cost in dollars.

Solution

(a) The position notional of the jet fuel is $2,770,000. The standard deviation in dollars is

\[ \sigma_S = 0.2117 \times \$277 \times 10{,}000 = \$586{,}409. \]

For reference, that of one futures contract is

\[ \sigma_F = 0.1859 \times \$0.6903 \times 42{,}000 = \$5{,}389.72, \]

with a futures notional of $0.6903 × 42,000 = $28,992.60.

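(b) The airline company has to buy heating oil futures as protection. First, we compute the beta of the rates of return of the two different types of oil prices, which is

\[ \beta_{sf} = \rho_{sf} \frac{\sigma_s}{\sigma_f} = 0.8243 \times (0.2117/0.1859) = 0.9387. \]

The corresponding covariance term is

\[ \sigma_{sf} = 0.8243 \times 0.2117 \times 0.1859 = 0.03244. \]

Adjusting for the notionals, this is

\[ \sigma_{SF} = 0.03244 \times 2{,}770{,}000 \times 28{,}993 = 2{,}605{,}268{,}452. \]

The optimal hedge ratio is given by

\[ N^* = \frac{\sigma_{SF}}{\sigma_F^2} = \frac{2{,}605{,}268{,}452}{5{,}389.72^2} = 89.7, \]

or 90 contracts after rounding. The sign is positive here because the airline's exposure is a future purchase of fuel (in effect a short position in jet fuel), so the hedge is a long futures position.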

(c) We find the risk of the hedged position and effectiveness of the hedge. The volatility of the unhedged position is \(\sigma_S = \$586{,}409\). The variance of the hedged position is

\[ \begin{array}{lr}
\sigma_S^2 = (\$586{,}409)^2 = & +343{,}875{,}515{,}281 \\
-\sigma_{SF}^2/\sigma_F^2 = -(2{,}605{,}268{,}452/5{,}390)^2 = & -233{,}653{,}264{,}867 \\
V(\text{hedged}) = & +110{,}222{,}250{,}414
\end{array} \]

Taking the square root, the volatility of the hedged position is \(\sigma_V^* = \$331{,}997\). Thus the hedge has reduced the risk from $586,409 to $331,997. Computing \(R^2\), we find that one minus the ratio of the hedged and unhedged variances is

\[ 1 - \frac{110{,}222{,}250{,}414}{343{,}875{,}515{,}281} = 67.95\%. \]

This is exactly the square of the correlation coefficient, \(0.8243^2 = 0.6795\), or the effectiveness of the hedge.
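The whole jet fuel example can be recomputed in a few lines. This is a sketch under the sign convention of the solution above (the airline is short fuel and long N futures, so N* is positive); the helper `hedged_variance` is a name introduced here for illustration, and small differences from the quoted figures are due to rounding in the text.

```python
from math import sqrt

# Inputs from the example
q_fuel, p_fuel = 10_000, 277.0        # metric tons, $ per metric ton
q_fut, p_fut = 42_000, 0.6903         # gallons per contract, $ per gallon
vol_s, vol_f, rho = 0.2117, 0.1859, 0.8243

notional_s = q_fuel * p_fuel          # jet fuel position in dollars
notional_f = q_fut * p_fut            # one futures contract in dollars
sigma_s = vol_s * notional_s          # dollar volatility of the fuel cost
sigma_f = vol_f * notional_f          # dollar volatility of one contract
cov_sf = rho * sigma_s * sigma_f      # dollar covariance


def hedged_variance(n: float) -> float:
    """Variance of the position: short fuel exposure plus n long futures."""
    return sigma_s**2 - 2 * n * cov_sf + n**2 * sigma_f**2


n_star = cov_sf / sigma_f**2                      # optimal number of contracts
sigma_hedged = sqrt(hedged_variance(n_star))      # volatility after hedging
r_squared = 1 - hedged_variance(n_star) / sigma_s**2

print(f"sigma_S = {sigma_s:,.0f}, sigma_F = {sigma_f:,.2f}")
print(f"N* = {n_star:.1f} (rounded: {round(n_star)} contracts)")
print(f"hedged volatility = {sigma_hedged:,.0f}")
print(f"R^2 = {r_squared:.4f}")
```

Evaluating `hedged_variance` at neighbouring values such as N* ± 1 gives a larger variance, which is a quick sanity check of the first order condition.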


Technical result on linear regression

Suppose we perform linear regression of S on F, where

\[ S = \alpha + \beta F + \epsilon, \]

where α is the intercept and β is the slope of the regression line.

The residual ε can be taken to have zero mean, that is E(ε) = 0, since any nonzero E(ε) can be absorbed into α. Also, the random noise ε should be uncorrelated with F.

Consider

\[ \operatorname{cov}(S, F) = \operatorname{cov}(\alpha + \beta F + \epsilon, F) = \operatorname{cov}(\alpha, F) + \beta \operatorname{cov}(F, F) + \operatorname{cov}(\epsilon, F) = \beta \operatorname{var}(F), \]

so that

\[ \beta = \frac{\operatorname{cov}(S, F)}{\operatorname{var}(F)} = \frac{\rho_{SF}\, \sigma_F \sigma_S}{\sigma_F^2}, \]

where \(\rho_{SF}\) is the correlation coefficient of S and F.
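The identity β = cov(S, F)/var(F) can be checked on simulated data. The sketch below is only an illustration: the "true" intercept 1.0, slope 0.8 and the noise levels are hypothetical values, and the sample moments are computed by hand with the standard library.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the run is reproducible

ALPHA, BETA = 1.0, 0.8   # hypothetical true intercept and slope
n = 20_000
F = [random.gauss(0.0, 2.0) for _ in range(n)]
# Noise is drawn independently of F, so it is uncorrelated with F.
S = [ALPHA + BETA * f + random.gauss(0.0, 1.0) for f in F]


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance; cov(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


beta_hat = cov(S, F) / cov(F, F)              # slope = cov(S, F) / var(F)
alpha_hat = mean(S) - beta_hat * mean(F)      # intercept absorbs the means
rho = cov(S, F) / sqrt(cov(S, S) * cov(F, F))
beta_from_rho = rho * sqrt(cov(S, S)) / sqrt(cov(F, F))   # rho * sigma_S / sigma_F

print(f"beta_hat = {beta_hat:.3f}, alpha_hat = {alpha_hat:.3f}")
print(f"rho * sigma_S / sigma_F = {beta_from_rho:.3f}")
```

The two expressions for the slope agree up to floating-point error, since they are algebraically the same quantity.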


1.3 Portfolio loss distribution

Risk elements

1. Exposure at default and recovery rate, both are random variables.

2. Default probability.

3. Credit migration – the process of changing the creditworthiness of an

obligor as characterized by the transition probabilities from one credit

state to other credit states.

• Arrival risk: timing of the event of default, modeled by a stopping

time τ

• Magnitude risk: loss amount (exposure net of the recovery value)

Loss amount = par value (possibly plus accrued interest) − market value of a defaultable bond

Characterization of the credit risk of loans

• Financial variables to be considered include

– default probability (DP )

– loss fraction called the loss given default (LGD)

– exposure at default (EAD)

Goal: Derive the portfolio loss distribution based on the information of

individual risks and their correlations.


Loss variable of single name

\[ \tilde{L} = EAD \times SEV \times L, \]

where \(L = \mathbf{1}_D\) and \(E[\mathbf{1}_D] = DP\). Here, D is the default event that the obligor defaults within a certain period of time. We treat severity (SEV) of loss in case of default as a random variable with \(E[SEV] = LGD\).

Based on the assumption that the exposure, severity and default event are independent, the expected loss (EL) is

\[ EL = E[\tilde{L}] = E[EAD] \times LGD \times DP. \]

Here, EAD is in general stochastic and E[EAD] is the expectation of several relevant underlying random variables.

It is common to have the situation where the severity of losses and the default events are random variables driven by a common set of underlying factors. In this case we need to have the information of the joint distribution of SEV and \(\mathbf{1}_D\) in order to perform the expectation calculations.
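Under the independence assumption, the product formula for EL can be verified by enumerating the joint outcomes of a small discrete example. All numbers below (two exposure levels, two severity levels, a 2% default probability) are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical independent discrete distributions: (value, probability)
ead_dist = [(800_000.0, 0.5), (1_200_000.0, 0.5)]   # exposure at default
sev_dist = [(0.30, 0.5), (0.60, 0.5)]               # severity
default_dist = [(1, 0.02), (0, 0.98)]               # default indicator 1_D

mean_ead = sum(v * p for v, p in ead_dist)
lgd = sum(v * p for v, p in sev_dist)               # LGD = E[SEV]
dp = sum(v * p for v, p in default_dist)            # DP = E[1_D]

# Closed form: EL = E[EAD] x LGD x DP
el_formula = mean_ead * lgd * dp

# Enumeration: independence means joint probabilities are products.
el_enumerated = sum(
    ead * sev * d * p_ead * p_sev * p_d
    for (ead, p_ead), (sev, p_sev), (d, p_d)
    in product(ead_dist, sev_dist, default_dist)
)

print(f"EL (formula)     = {el_formula:,.2f}")
print(f"EL (enumeration) = {el_enumerated:,.2f}")
```

If severity and default were driven by common factors, the joint probabilities would no longer factor into products and the enumeration would need the joint distribution instead.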


Unexpected loss – standard deviation of L

As a measure of the magnitude of the deviation of losses from the EL,

a natural choice is the standard deviation of the loss variable \(\tilde{L} = EAD \times SEV \times 1_D\), which is termed unexpected loss in the risk management literature.

\[ \text{Unexpected loss } (UL) = \sqrt{\mathrm{var}(\tilde{L})} = \sqrt{\mathrm{var}(EAD \times SEV \times 1_D)}. \]

Under the assumption that the severity and the default event \(D\) are independent, and also \(EAD\) is taken to be deterministic, we have

\[ UL = EAD \times \sqrt{\mathrm{var}(SEV) \times DP + LGD^2 \times DP(1 - DP)}. \]

In contrast to EL, UL serves as a proxy for the uncertainty faced by the bank when investing in a portfolio, since UL captures the deviation from the expectation.
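The EL and UL formulas for a single obligor can be sketched in a few lines of Python. This is a minimal illustration that assumes a deterministic EAD and independence of severity and default, as in the text; the function names and the loan parameters are hypothetical.

```python
import math

def expected_loss(ead, lgd, dp):
    """EL = EAD * LGD * DP for a deterministic exposure at default."""
    return ead * lgd * dp

def unexpected_loss(ead, lgd, dp, var_sev):
    """UL = EAD * sqrt(var(SEV) * DP + LGD^2 * DP * (1 - DP))."""
    return ead * math.sqrt(var_sev * dp + lgd ** 2 * dp * (1.0 - dp))

# Hypothetical loan: EAD = 100, LGD = 0.5, DP = 1%, var(SEV) = 0.04
el = expected_loss(100.0, 0.5, 0.01)
ul = unexpected_loss(100.0, 0.5, 0.01, 0.04)
print(f"EL = {el:.4f}, UL = {ul:.4f}")
```

For this hypothetical loan the UL is roughly ten times the EL, which suggests why the expected loss alone is a poor guide to the capital cushion.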


Proof

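We make use of \(\mathrm{var}(X) = E[X^2] - E[X]^2\), so that

\[ \mathrm{var}(1_D) = DP(1 - DP) \]

since \(E[1_D^2] = E[1_D] = DP\). Assuming \(SEV\) and \(1_D\) are independent, we have

\[
\begin{aligned}
\mathrm{var}(SEV\,1_D) &= E[SEV^2\,1_D^2] - E[SEV\,1_D]^2 \\
&= E[SEV^2]\,E[1_D^2] - E[SEV]^2\,E[1_D]^2 \\
&= \left(\mathrm{var}(SEV) + E[SEV]^2\right) DP - E[SEV]^2\,DP^2 \\
&= \mathrm{var}(SEV) \times DP + LGD^2 \times DP(1 - DP).
\end{aligned}
\]

When \(SEV\) is random, an additional contribution to \(\mathrm{var}(SEV\,1_D)\) arises from \(\mathrm{var}(SEV) \times DP\).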


Portfolio losses

Consider a portfolio of \(m\) risky obligors with

\[ L_i = EAD_i \times SEV_i \times 1_{D_i}, \quad i = 1, \ldots, m, \quad P[D_i] = E[1_{D_i}] = DP_i. \]

The random portfolio loss \(L_p\) is given by

\[ L_p = \sum_{i=1}^{m} L_i = \sum_{i=1}^{m} EAD_i \times SEV_i \times 1_{D_i}. \]

Using the additivity of expectation, we obtain

\[ EL_p = \sum_{i=1}^{m} EL_i = \sum_{i=1}^{m} EAD_i \times LGD_i \times DP_i. \]

In the case of UL, additivity holds at the level of variances if the loss variables \(L_i\) are pairwise uncorrelated. That is,

\[ \mathrm{var}\left(\sum_{i=1}^{m} L_i\right) = \sum_{i=1}^{m} \mathrm{var}(L_i), \]

provided that the \(L_i\) are uncorrelated. Unfortunately, correlations are the “main part of the game” and a main driver of credit risk.


The general case with non-zero correlations is given by

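\[ UL_p = \sqrt{\mathrm{var}(L_p)} = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{m} EAD_i \times EAD_j \times \mathrm{cov}(SEV_i \times 1_{D_i},\, SEV_j \times 1_{D_j})}. \]

For a portfolio with constant severities, we have the following simplified formula

\[ UL_p^2 = \sum_{i=1}^{m} \sum_{j=1}^{m} EAD_i \times EAD_j \times LGD_i \times LGD_j \times \sqrt{DP_i(1 - DP_i)\,DP_j(1 - DP_j)}\;\rho_{ij}, \]

where

\[ \rho_{ij} = \text{correlation coefficient between default events} = \frac{\mathrm{cov}(1_{D_i}, 1_{D_j})}{\sqrt{\mathrm{var}(1_{D_i})\,\mathrm{var}(1_{D_j})}}, \]

with \(\mathrm{var}(1_{D_i}) = DP_i(1 - DP_i)\).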
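The constant-severity formula can be evaluated directly as a double sum. The sketch below is an illustration only: the function name `portfolio_ul` and the two-obligor inputs are hypothetical, and the default correlation matrix is supplied by the user rather than estimated.

```python
import math

def portfolio_ul(ead, lgd, dp, rho):
    """UL_p under constant severities; rho[i][j] is the default correlation."""
    m = len(ead)
    # Stand-alone unexpected loss of each obligor: EAD * LGD * sqrt(DP (1 - DP))
    ul = [ead[i] * lgd[i] * math.sqrt(dp[i] * (1.0 - dp[i])) for i in range(m)]
    variance = sum(ul[i] * ul[j] * rho[i][j] for i in range(m) for j in range(m))
    return math.sqrt(variance)

# Hypothetical two-obligor portfolio with EAD = LGD = 1 and DP = 5% each
p = 0.05
for r in (0.0, 0.3, 1.0):
    rho = [[1.0, r], [r, 1.0]]
    print(f"rho = {r}: UL_p = {portfolio_ul([1, 1], [1, 1], [p, p], rho):.4f}")
```

With correlation 1 the result equals twice the stand-alone unexpected loss, and with correlation 0 it is smaller by a factor of the square root of 2.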


Example: Two-asset credit portfolio

Take m = 2, LGDi = EADi = 1, i = 1,2, then

\[ UL_p^2 = p_1(1 - p_1) + p_2(1 - p_2) + 2\rho\sqrt{p_1(1 - p_1)\,p_2(1 - p_2)}, \]

where pi is the default probability of obligor i, i = 1,2, and ρ is the

correlation coefficient.

Remarks on default correlation

(i) When ρ = 0, the two default events are uncorrelated. Under full

diversification with widely different assets of diversified classes, the

correlation is viewed as being close to zero.


(ii) When ρ > 0, the default of one counterparty increases the likelihood

that the other counterparty may also default. Consider

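\[
P[1_{D_2} = 1 \mid 1_{D_1} = 1] = \frac{P[1_{D_2} = 1,\, 1_{D_1} = 1]}{P[1_{D_1} = 1]} = \frac{E[1_{D_1} 1_{D_2}]}{p_1} = \frac{p_1 p_2 + \mathrm{cov}(1_{D_1}, 1_{D_2})}{p_1} = p_2 + \frac{\mathrm{cov}(1_{D_1}, 1_{D_2})}{p_1}.
\]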

Positive correlation leads to a conditional default probability higher

than the unconditional default probability p2 of obligor 2.

• In the case of perfect correlation (\(\rho = 1\)) and \(p = p_1 = p_2\), we have

\[ UL_p = 2\sqrt{p(1 - p)}. \]

This means the portfolio contains the risk of only one obligor but with double intensity (concentration risk). The default of one obligor implies that the other obligor defaults almost surely.


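Matrix representation of portfolio variance \(UL_p^2\) in terms of individual unexpected loss \(UL_i\)

Recall

\[
UL_p^2 = \mathrm{var}(L_p) = \mathrm{cov}(L_1 + \cdots + L_m,\, L_1 + \cdots + L_m) = \sum_{i=1}^{m} \sum_{j=1}^{m} \mathrm{cov}(L_i, L_j) = \sum_{i=1}^{m} \sum_{j=1}^{m} UL_i\,UL_j\,\rho_{ij},
\]

where \(UL_i^2 = \mathrm{var}(L_i)\) and the correlation coefficient between loss variables is

\[ \rho_{ij} = \frac{\mathrm{cov}(L_i, L_j)}{\sqrt{\mathrm{var}(L_i)}\sqrt{\mathrm{var}(L_j)}}, \quad i, j = 1, 2, \ldots, m. \]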


Suppose we write

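\[ \mathbf{L} = (UL_1 \;\; UL_2 \;\; \cdots \;\; UL_m)^T \]

as the vector of unexpected losses of the \(m\) obligors and

\[ \Omega = \begin{pmatrix} \rho_{11} & \cdots & \rho_{1m} \\ \vdots & & \vdots \\ \rho_{m1} & \cdots & \rho_{mm} \end{pmatrix} \]

as the correlation matrix, then the matrix representation of portfolio variance is given by

\[ UL_p^2 = \mathbf{L}^T \Omega\, \mathbf{L}. \]

Since a variance is always nonnegative, \(\Omega\) is positive semidefinite. A matrix \(A\) is said to be positive semidefinite if \(x^T A x \ge 0\) for all \(x\).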


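We examine how \(UL_p\) is affected by a change in \(UL_k\).

For a fixed \(k\), consider

\[
\frac{\partial UL_p^2}{\partial UL_k} = \sum_{j=1}^{m} \sum_{i=1}^{m} \frac{\partial UL_i}{\partial UL_k}\,UL_j\,\rho_{ij} + \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{\partial UL_j}{\partial UL_k}\,UL_i\,\rho_{ij} = \sum_{j=1}^{m} UL_j\,\rho_{kj} + \sum_{i=1}^{m} UL_i\,\rho_{ik}.
\]

Since \(\rho_{kj} = \rho_{jk}\), we obtain

\[ \frac{\partial UL_p^2}{\partial UL_k} = 2 \sum_{j=1}^{m} UL_j\,\rho_{kj}. \]

Using \(\partial UL_p^2 / \partial UL_k = 2\,UL_p\,\partial UL_p / \partial UL_k\), this gives

\[ \frac{\partial UL_p}{\partial UL_k} = \frac{\sum_{j=1}^{m} \rho_{kj}\,UL_j}{UL_p}. \]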


Risk contribution

Recall that \(\partial UL_p / \partial UL_i\) gives the change in the portfolio risk \(UL_p\) due to a one-unit change in the unexpected loss of risky asset \(i\). The risk contribution of a risky asset \(i\) to the portfolio unexpected loss is defined to be the incremental risk that the exposure of a single asset contributes to the portfolio's total risk, namely,

\[ RC_i = UL_i\,\frac{\partial UL_p}{\partial UL_i} = \frac{UL_i \sum_{j} UL_j\,\rho_{ij}}{UL_p}. \]

Using the unexpected losses \(UL_i\) and \(UL_p\) as the quantifiers of risk, we expect that the sum of the risk contributions from the risky assets is simply the total portfolio risk. As a verification, it is seen mathematically that

\[ \sum_{i} RC_i = \frac{\sum_{i} UL_i \sum_{j} UL_j\,\rho_{ij}}{UL_p} = \frac{UL_p^2}{UL_p} = UL_p. \]
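The risk contributions and their additivity can be checked numerically. The following sketch assumes a hypothetical three-asset portfolio; the unexpected losses and the loss correlation matrix are made-up inputs, not values from the lecture.

```python
import math

def risk_contributions(ul, rho):
    """Return (UL_p, [RC_1, ..., RC_m]) from stand-alone ULs and loss correlations."""
    m = len(ul)
    ul_p = math.sqrt(sum(ul[i] * ul[j] * rho[i][j] for i in range(m) for j in range(m)))
    # RC_i = UL_i * (sum_j UL_j * rho_ij) / UL_p
    rc = [ul[i] * sum(ul[j] * rho[i][j] for j in range(m)) / ul_p for i in range(m)]
    return ul_p, rc

# Hypothetical inputs
ul = [3.0, 4.0, 5.0]
rho = [[1.0, 0.2, 0.1],
       [0.2, 1.0, 0.3],
       [0.1, 0.3, 1.0]]
ul_p, rc = risk_contributions(ul, rho)
print(f"UL_p = {ul_p:.4f}, RC = {[round(x, 4) for x in rc]}, sum = {sum(rc):.4f}")
```

Because every correlation here is below 1, each risk contribution is smaller than the corresponding stand-alone unexpected loss, while the contributions still add up to the portfolio figure.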


Calculation of EL, UL and RC for a two-asset credit portfolio

\(\rho\): default correlation coefficient between the two exposures

\(EL_p\): portfolio expected loss, \(EL_p = EL_1 + EL_2\)

\(UL_p\): portfolio unexpected loss, \(UL_p = \sqrt{UL_1^2 + UL_2^2 + 2\rho\,UL_1 UL_2}\)

\(RC_1\): risk contribution from Exposure 1, \(RC_1 = UL_1(UL_1 + \rho\,UL_2)/UL_p\)

\(RC_2\): risk contribution from Exposure 2, \(RC_2 = UL_2(UL_2 + \rho\,UL_1)/UL_p\)

\(UL_p = RC_1 + RC_2\)


Fitting of loss distribution

The two statistical measures of the credit portfolio are

1. the mean, called the portfolio expected loss;

2. the standard deviation, called the portfolio unexpected loss.

We approximate the loss distribution of the original portfolio by a beta distribution through matching the first and second moments of the portfolio loss distribution.

The risk quantiles of the original portfolio can be approximated by the respective quantiles of the approximating random variable X. The price for such convenience of fitting is the model risk.


Beta distribution

The density function of a beta distribution is

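\[
f(x; \alpha, \beta) =
\begin{cases}
\dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha - 1}(1 - x)^{\beta - 1}, & 0 < x < 1 \\
0, & \text{otherwise}
\end{cases}
\qquad \alpha > 0,\ \beta > 0,
\]

where \(\Gamma(\alpha) = \int_0^\infty e^{-x} x^{\alpha - 1}\,dx\). The mean is \(\mu = \dfrac{\alpha}{\alpha + \beta}\) and the variance is \(\sigma^2 = \dfrac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}\). Determine \(\alpha\) and \(\beta\) in terms of \(\mu\) and \(\sigma^2\).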

A beta distribution with only two degrees of freedom is perhaps insufficient

to give an adequate description of the tail events in the loss distribution.
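One way to carry out the moment matching is to note that the variance formula can be written as μ(1 − μ)/(α + β + 1), which gives α + β directly. The sketch below follows that route; the function name and the sample EL and UL values (expressed as fractions of total exposure) are hypothetical.

```python
def fit_beta(mu, sigma2):
    """Method-of-moments beta parameters for a given mean and variance."""
    if not 0.0 < mu < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0.0 < sigma2 < mu * (1.0 - mu):
        raise ValueError("variance must lie strictly between 0 and mu * (1 - mu)")
    # sigma^2 = mu (1 - mu) / (alpha + beta + 1)  =>  alpha + beta = nu
    nu = mu * (1.0 - mu) / sigma2 - 1.0
    return mu * nu, (1.0 - mu) * nu

# Hypothetical portfolio: EL = 2% and UL = 3% of total exposure
alpha, beta = fit_beta(0.02, 0.03 ** 2)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

The check on the variance reflects a constraint of the beta family: a variable confined to (0, 1) with mean μ cannot have variance of μ(1 − μ) or more.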


Third and fourth order moments of a distribution

• Skewness describes the departure from symmetry of a distribution:

\[ \gamma = \frac{\int_{-\infty}^{\infty} (x - E[X])^3 f(x)\,dx}{\sigma^3}. \]

The skewness of a normal distribution is zero. Positive skewness

indicates that the distribution has a long right tail and so entails large

positive values.

• Kurtosis describes the degree of flatness of a distribution:

\[ \delta = \frac{\int_{-\infty}^{\infty} (x - E[X])^4 f(x)\,dx}{\sigma^4}. \]

The kurtosis of a normal distribution is 3. A distribution with kurtosis greater than 3 has tails that decay less quickly than those of the normal distribution, implying a greater likelihood of large values in both tails.


Characteristics of loss distributions for different risk types

Type of risk       Second moment          Third moment   Fourth moment
                   (standard deviation)   (skewness)     (kurtosis)
Market risk        High                   Zero           Low
Credit risk        Moderate               Moderate       Moderate
Operational risk   Low                    High           High

• The market risk loss distribution is symmetrical but not perfectly

normally distributed.


• The credit risk loss distribution is quite skewed with long right tail.

• The operational risk distribution has a quite extreme shape. Most

of the times, losses are modest, but occasionally they are very large.


Loss distribution of a credit portfolio

All risk quantities on a portfolio level are based on the portfolio loss

variable Lp. Once the loss distribution is generated, all risk measures can

be calculated.


1.4 VaR (value-at-risk), expected shortfall and coherent risk measure

The Value-at-Risk (VaR) can be translated as “I am X percent certain

there will not be a loss of more than V dollars in the next N days.” If

the VaR on a risky portfolio is $1 million at a one-month horizon and a 95% confidence level, then there is only a 5% chance that the portfolio loses more than $1 million over the next one-month period.

• The variable V is the VaR of the portfolio. It is a function of (i) time

horizon (N days); (ii) confidence level (X%).

• It is the loss level over N days that has a probability of only (100−X)%

of being exceeded.

• The Bank for International Settlements proposes that banks calculate VaR for market risk with N = 10 and X = 99.


• Weakness of volatility as the measure of risk: it does not care about

the direction of portfolio value movement. Thick “head” of upside

gain is viewed the same as thick “tail” of downside loss by volatility.

Calculation of VaR from the probability distribution of the change in

the portfolio value; confidence level is X%. Gains in portfolio value are

positive; losses are negative.


• VaR disregards the details of loss distribution beyond the VaR Level

(tail risk)

Alternative situation where VaR is the same, but the mean of loss

beyond VaR is larger.

• VaR is commonly calculated based on historical scenarios. Under

catastrophic market conditions or an extreme dependence structure of

assets (clustering effect), VaR may underestimate risk due to survival

bias. We require a default correlation model to quantify tail risk under

distressed state, which may not be captured by historical scenarios.


Formal definition

VaR is defined for some confidence level \(\alpha\) as the \(\alpha\)-quantile of a loss random variable \(X\):

\[ VaR_\alpha(X) = \inf\{x \mid P[X \le x] \ge \alpha\}. \]

For example, take \(\alpha = 99\%\) and a one-month horizon; the above definition states that there is a 99% chance that the loss amount (value of \(X\)) does not exceed \(VaR_\alpha(X)\) within the one-month period.

Banks should hold some capital cushion (economic capital) against unex-

pected losses. Using UL is not sufficient since there might be a significant

likelihood that losses will exceed portfolio’s EL by more than one standard

deviation of the portfolio loss. While VaR is determined with reference to

a given choice of α, there is no consideration of confidence level in UL.

Risk measures that rely on one absolute value and a single choice of

confidence level are subject to game playing by fund managers. A man-

ager may choose a portfolio that meets the VaR requirement but pays no

attention to the severity of losses beyond VaR.


Generalized inverse and α-quantile

Given a nondecreasing function F : R→ R, the generalized inverse of F is

given by

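\[ F^{\leftarrow}(y) = \inf\{x \in \mathbb{R} : F(x) \ge y\} \]

with the convention \(\inf \emptyset = \infty\).

If \(F\) is strictly increasing, then \(F^{\leftarrow} = F^{-1}\). We recover the usual inverse.

Using the generalized inverse, we define the \(\alpha\)-quantile of \(F\) by

\[ q_\alpha(F) = F^{\leftarrow}(\alpha) = \inf\{x \in \mathbb{R} : F(x) \ge \alpha\}, \quad \alpha \in (0, 1). \]

Note that \(VaR_\alpha(F) = q_\alpha(F)\), where \(F\) is the loss distribution. Also, it is seen that

\[ VaR_\alpha(aX + b) = a\,VaR_\alpha(X) + b, \]

for \(a > 0\) and \(b \in \mathbb{R}\).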


Example 1 – Portfolio gain treated as a normal random variable

Suppose that the gain from a portfolio during six months is normally

distributed with a mean of $2 million and a standard deviation of $10

million.

Recall the cumulative normal distribution:

\[ N(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt, \]

and \(N(-2.33) \approx 0.01 = 1\%\).

• From the properties of the normal distribution, the one-percentile

point of this distribution is 2− 2.33× 10, or -$21.3 million.

• The VaR for the portfolio with a time horizon of six months and

confidence level of 99% is therefore $21.3 million.
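The calculation in Example 1 can be reproduced with the standard library's `statistics.NormalDist`. This is a small sketch rather than a general VaR routine; the function name `normal_var` is hypothetical, and figures are in $ millions. The exact quantile is about 2.326 rather than the rounded 2.33, so the result differs slightly from 21.3.

```python
from statistics import NormalDist

def normal_var(mean_gain, sd_gain, alpha):
    """VaR at confidence level alpha when the gain is N(mean_gain, sd_gain^2)."""
    # The (1 - alpha)-quantile of the gain is the worst gain at that confidence;
    # VaR is reported as a positive loss, hence the sign flip.
    worst_gain = NormalDist(mean_gain, sd_gain).inv_cdf(1.0 - alpha)
    return -worst_gain

var_99 = normal_var(2.0, 10.0, 0.99)
print(f"six-month 99% VaR = {var_99:.2f} million")
```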


Example 2

Suppose that for a one-year project all outcomes between a loss of $50 million and a gain of $50 million are considered equally likely.

• The loss from the project has a uniform distribution extending from

−$50 million to +$50 million. There is a 1% chance that there will

be a loss greater than $49 million.

• The VaR with a one-year time horizon and a 99% confidence level is

therefore $49 million.


Calculation of VaR using historical simulation

Suppose that VaR is to be calculated for a portfolio using a 1-day time

horizon, a 99% confidence level, and 501 days of data.

• The first step is to identify the market variables affecting the portfolio.

These are typically exchange rates, equity prices, interest rates, etc.

• Data is then collected on the movements in these market variables

over the most recent 501 days. This provides 500 alternative scenarios

for what can happen between two consecutive days.

• Scenario 1 is where the percentage changes in the values of all vari-

ables are the same as they were between Day 0 and Day 1, scenario

2 is where they are the same as they were between Day 1 and Day 2,

and so on.


Data for VaR historical simulation calculation

Day    Market       Market       ...   Market
       variable 1   variable 2         variable n
0      20.33        0.1132       ...   65.37
1      20.78        0.1159       ...   64.91
2      21.44        0.1162       ...   65.02
3      20.97        0.1184       ...   64.90
...    ...          ...          ...   ...
498    25.72        0.1312       ...   62.22
499    25.75        0.1323       ...   61.99
500    25.85        0.1343       ...   62.10


Define \(v_i\) as the value of a market variable on Day \(i\) and suppose that today is Day \(m\), say \(m = 500\). The \(i\)th scenario assumes that the value of the market variable tomorrow will be

\[ v_m \frac{v_i}{v_{i-1}}. \]

For the first variable, the value today, \(v_{500}\), is 25.85. Also \(v_0 = 20.33\) and \(v_1 = 20.78\). It follows that the value of the first market variable in the first scenario is \(25.85 \times \frac{20.78}{20.33} = 26.42\). For the second scenario, we have \(25.85 \times \frac{21.44}{20.78} = 26.67\).

We generate all 500 historical scenarios for each market variable, and

repeat the calculation for all market variables.
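The scenario construction can be sketched as follows, using the first-variable values quoted in the data table. The function names are hypothetical; `all_scenarios` expects the full series v_0, ..., v_m, which the table shows only in part.

```python
def scenario_value(v_today, v_prev, v_next):
    """Value tomorrow if today's value moves by the ratio v_next / v_prev."""
    return v_today * v_next / v_prev

def all_scenarios(series):
    """Apply every historical one-day ratio to the latest value in the series."""
    v_today = series[-1]
    return [scenario_value(v_today, series[i - 1], series[i])
            for i in range(1, len(series))]

v_today = 25.85                                # v_500 for market variable 1
s1 = scenario_value(v_today, 20.33, 20.78)     # Day 0 -> Day 1 ratio
s2 = scenario_value(v_today, 20.78, 21.44)     # Day 1 -> Day 2 ratio
print(round(s1, 2), round(s2, 2))
```

A series of m + 1 observations yields m scenarios, which is why 501 days of data give 500 scenarios.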


Scenarios generated for tomorrow (Day 501) using data in the last table

Scenario   Market       Market       ...   Market       Portfolio value   Change in value
number     variable 1   variable 2         variable n   ($ millions)      ($ millions)
1          26.42        0.1375       ...   61.66        23.71             0.21
2          26.67        0.1346       ...   62.21        23.12             -0.38
3          25.28        0.1368       ...   61.99        22.94             -0.56
...        ...          ...          ...   ...          ...               ...
499        25.88        0.1354       ...   61.87        23.63             0.13
500        25.95        0.1363       ...   62.21        22.87             -0.63

Under scenario 1, the percentage changes in the market variables between Day 500 and Day 501 repeat those observed between Day 0 and Day 1. Based on today's portfolio value of 23.50, the changes in portfolio value in scenario 1 and scenario 2 are, respectively, 23.71 − 23.50 = 0.21 and 23.12 − 23.50 = −0.38.


How to estimate the 1-percentile point of the distribution of changes in

the portfolio value?

Since there are a total of 500 scenarios, we can estimate this as the fifth

worst number in the final column of the table. With confidence level of

99%, the maximum loss does not exceed the fifth worst number. The

\(N\)-day VaR for a 99% confidence level is calculated as \(\sqrt{N}\) times the 1-day VaR.
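Picking the fifth worst outcome out of 500 scenarios, and scaling by the square root of the horizon, can be sketched as below. The 500 changes in portfolio value are simulated from a seeded normal generator purely as stand-in data (an assumption, since the full scenario table is not reproduced), and the function name is hypothetical.

```python
import math
import random

def historical_var(changes, alpha):
    """VaR from scenario changes in portfolio value (gains positive, losses negative)."""
    # Number of scenarios in the tail: 500 scenarios at 99% gives k = 5
    k = max(1, round(len(changes) * (1.0 - alpha)))
    worst_first = sorted(changes)
    return -worst_first[k - 1]      # k-th worst change, reported as a positive loss

rng = random.Random(0)
changes = [rng.gauss(0.0, 0.5) for _ in range(500)]   # stand-in data, $ millions

var_1d = historical_var(changes, 0.99)
var_10d = math.sqrt(10) * var_1d    # square-root-of-time scaling
print(f"1-day VaR = {var_1d:.3f}, 10-day VaR = {var_10d:.3f}")
```

The square-root scaling relies on daily changes being independent and identically distributed, which is an approximation for real portfolios.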

Query

Why is \(\sqrt{N}\) chosen as the multiplier when the time horizon is \(N\) days?

This is associated with the property that dispersion grows with the square root of time. This is consistent with the physics of diffusion, where diffusion distance \(\sim \sqrt{\text{diffusion time}}\). To double the diffusion distance, the diffusion time required is four-fold.


Expected shortfall

The expected shortfall (tail conditional expectation) with respect to a

confidence level α is defined as

\[ ES_\alpha(X) = E[X \mid X > VaR_\alpha(X)]. \]

Let \(c = VaR_\alpha(X)\) be the critical loss threshold corresponding to some confidence level \(\alpha\). The expected shortfall capital provides a cushion against the mean value of losses exceeding the critical threshold \(c\).

The expected shortfall focuses on the expected loss in the tail, starting at \(c\), of the portfolio's loss distribution.

• When the loss distribution is normal, VaR and expected shortfall give essentially the same information. A normal distribution is fully specified by its mean and standard deviation \(\sigma\), so for a loss with zero mean both VaR and ES are multiples of \(\sigma\). For example, VaR at the 99% confidence level is \(2.33\sigma\) while ES at the same level is \(2.67\sigma\).
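The two multiples can be recovered from the standard normal distribution. For a zero-mean normal loss, the expected shortfall has the known closed form φ(z)/(1 − α), where z is the α-quantile and φ is the standard normal density; the sketch below uses that formula, with a hypothetical function name.

```python
from statistics import NormalDist

def normal_var_es_multiples(alpha):
    """VaR and ES of a zero-mean normal loss, expressed in units of sigma."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(alpha)              # VaR multiple
    es = std_normal.pdf(z) / (1.0 - alpha)     # ES multiple: phi(z) / (1 - alpha)
    return z, es

z_99, es_99 = normal_var_es_multiples(0.99)
print(f"VaR = {z_99:.2f} sigma, ES = {es_99:.2f} sigma")
```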


Expected shortfall: \(E[X \mid X > VaR_\alpha(X)]\).

• The computation of the expected shortfall requires the information

on the tail distribution (extreme value distribution).


Relation between VaR and ES

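For a loss \(L\) with continuous distribution function \(F_L\) and density function \(f_L\), the expected shortfall is given by

\[ ES_\alpha(L) = \frac{1}{1 - \alpha} \int_\alpha^1 VaR_u(L)\,du. \]

To show the claim, note that

\[ ES_\alpha(L) = E[L \mid L \ge VaR_\alpha] = \frac{1}{1 - \alpha}\,E\left[L\,1_{\{L \ge VaR_\alpha\}}\right] = \frac{1}{1 - \alpha} \int_{VaR_\alpha}^{\infty} \ell f_L(\ell)\,d\ell. \]

We set \(u = F_L(\ell)\), so that \(\ell = VaR_u\) and \(du = f_L(\ell)\,d\ell\). Then \(u = \alpha\) when \(\ell = VaR_\alpha\) and \(u \to 1\) when \(\ell\) tends to \(\infty\). Also, we observe \(\ell f_L(\ell)\,d\ell = VaR_u\,du\). Hence, we obtain

\[ ES_\alpha(L) = \frac{1}{1 - \alpha} \int_\alpha^1 VaR_u(L)\,du. \]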
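The integral representation can be checked numerically on a simple case. The sketch below assumes a loss that is uniformly distributed between −$50 million and +$50 million, whose quantile function is linear, and approximates the integral with a midpoint rule; the function names are hypothetical.

```python
def var_uniform(u, low=-50.0, high=50.0):
    """u-quantile (VaR_u) of a loss uniformly distributed on [low, high]."""
    return low + (high - low) * u

def es_from_var(var_fn, alpha, n=1000):
    """ES_alpha = 1/(1 - alpha) * integral of VaR_u over [alpha, 1], midpoint rule."""
    h = (1.0 - alpha) / n
    total = sum(var_fn(alpha + (k + 0.5) * h) for k in range(n))
    return total * h / (1.0 - alpha)

es_99 = es_from_var(var_uniform, 0.99)
print(f"VaR = {var_uniform(0.99):.1f}, ES = {es_99:.1f}")
```

For the uniform case the expected shortfall is simply the midpoint between the 99% VaR of 49 and the maximum loss of 50.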


Coherent risk measures

Let \(X\) and \(Y\) be two random variables, like the dollar loss amounts of two portfolios. A risk measure \(\gamma\) is called a coherent measure if the following properties hold:

1. monotonicity

For \(X \le Y\), \(\gamma(X) \le \gamma(Y)\).

2. translation invariance

For all \(x \in \mathbb{R}\), \(\gamma(X + x) = \gamma(X) + x\).

Here, \(x\) is a deterministic scalar quantity.

3. positive homogeneity

For all \(\lambda > 0\), \(\gamma(\lambda X) = \lambda\gamma(X)\).

4. subadditivity (benefit of diversification)

\(\gamma(X + Y) \le \gamma(X) + \gamma(Y)\).


Financial interpretation of the four properties

1. Monotonicity: If a portfolio produces a worse result than another port-

folio for every state of the world, its risk measure should be greater.

2. Translation invariance: If an amount of cash K is added to a portfolio, its risk measure should go down by K. This is seen by setting x = −K, so that γ(X − K) = γ(X) − K. The risk as quantified by γ(X) can be reduced to zero by adding γ(X) cash into the credit portfolio.

3. Positive homogeneity: Changing the size of a portfolio by a positive

factor λ, while keeping the relative amounts of different items in the

portfolio the same, should result in the risk measure being multiplied

by λ.


4. Subadditivity: The risk measure for two portfolios after they have

been merged should be no greater than the sum of their individual

risk measures before they were merged. This property reflects the

benefit of diversification, in a similar spirit to

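\[ \sigma_{X+Y} \le \sigma_X + \sigma_Y, \]

where \(\sigma_X\) denotes the standard deviation of \(X\). Note that

\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho\,\sigma_X\sigma_Y \le \sigma_X^2 + \sigma_Y^2 + 2\sigma_X\sigma_Y = (\sigma_X + \sigma_Y)^2. \]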

VaR satisfies the first three conditions. However, it does not always

satisfy the fourth one.


Example 3 – Violation of subadditivity in VaR

Suppose each of two independent projects has a probability of 0.02 of a loss of $10 million and a probability of 0.98 of a loss of $1 million during a one-year period. Suppose we set the confidence level α to be 97.5%. In this example, the loss random variable X of a single project can assume two discrete values: $1 million and $10 million.

• We have P[X = 1] = 0.98 and P[X = 10] = 0.02, so that P[X ≤ 1] = 0.98 and P[X ≤ 10] = 1. For 1 ≤ x < 10, we have P[X ≤ x] = 0.98 ≥ 0.975; for x < 1, P[X ≤ x] = 0. The smallest value of x that satisfies P[X ≤ x] ≥ 97.5% is x = 1, so the 97.5% VaR is 1.


One-project portfolio

P [X = 1] = 0.98 and P [X = 10] = 0.02 so that P [X ≤ 1] = 0.98 and

P [X ≤ 10] = 1. The distribution function F (x) = P [X ≤ x] jumps by 0.98

and 0.02 when x crosses 1 and 10, respectively.


Two-project portfolio

P [Y = 2] = 0.9604, P [Y = 11] = 0.0392 and P [Y = 20] = 0.0004 so that

P [Y ≤ 2] = 0.9604, P [Y ≤ 11] = 0.9996 and P [Y ≤ 20] = 1.


When the projects are put in the same portfolio, there is a 0.02× 0.02 =

0.0004 probability of a loss of $20 million, a 2 × 0.02 × 0.98 = 0.0392

probability of a loss of $11 million, and a 0.98×0.98 = 0.9604 probability

of a loss of $2 million.

Let Y denote the loss random variable of the two projects. Note that P[Y ≤ 2] = 0.9604, P[Y ≤ 11] = 0.9996 and P[Y ≤ 20] = 1. For 2 ≤ y < 11, P[Y ≤ y] = 0.9604 < 97.5%; for 11 ≤ y < 20, P[Y ≤ y] = 0.9996 ≥ 97.5%. The smallest value of y that satisfies P[Y ≤ y] ≥ 97.5% is y = 11, so the one-year 97.5% VaR for the portfolio is $11 million.

The sum of the VaRs of the projects considered separately is $2 million.

The VaR of the portfolio is therefore greater than the sum of the VaRs

of the projects by $9 million. This violates the subadditivity condition.
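The quantile definition of VaR can be applied mechanically to the two discrete distributions of Example 3. The sketch below builds the two-project distribution by convolving the single-project one, assuming independence as in the example; the function name and the small rounding tolerance are illustrative choices.

```python
def discrete_var(dist, alpha):
    """Smallest loss x with P[X <= x] >= alpha; dist maps loss -> probability."""
    cumulative = 0.0
    for loss in sorted(dist):
        cumulative += dist[loss]
        if cumulative >= alpha - 1e-12:   # tolerance for floating-point rounding
            return loss
    return max(dist)

one_project = {1: 0.98, 10: 0.02}         # losses in $ millions

# Distribution of the sum of two independent projects
two_projects = {}
for x1, p1 in one_project.items():
    for x2, p2 in one_project.items():
        two_projects[x1 + x2] = two_projects.get(x1 + x2, 0.0) + p1 * p2

var_one = discrete_var(one_project, 0.975)
var_two = discrete_var(two_projects, 0.975)
print(var_one, var_two)
```

The portfolio VaR exceeds the sum of the stand-alone VaRs, which is the subadditivity violation described in the example.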


Expected shortfall

In Example 3, the VaR for one of the projects considered on its own is $1

million. To calculate the expected shortfall for a 97.5% confidence level

we note that, of the 2.5% tail of the loss distribution, X is either equal

to 1 or 10. We observe 2% corresponds to a loss of $10 million and the

remaining 2.5%− 2% = 0.5% to a loss of $1 million.

• Conditional on being in the 2.5% tail of the loss distribution, there is therefore a 0.02/0.025 = 80% probability of a loss of $10 million and a 100% − 80% = 20% probability of a loss of $1 million. The expected loss is 0.8 × 10 + 0.2 × 1, or $8.2 million.

• When the two projects are combined, of the 2.5% tail of the loss distribution, 0.04% corresponds to a loss of $20 million and the remaining 2.5% − 0.04% = 2.46% corresponds to a loss of $11 million. Conditional on being in the 2.5% tail of the loss distribution, the expected loss is therefore (0.04/2.5) × 20 + (2.46/2.5) × 11, or $11.144 million. Since 8.2 + 8.2 > 11.144, the expected shortfall measure does satisfy the subadditivity condition for this example.
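The tail-averaging argument can be written as a short routine that fills the worst 2.5% of probability mass starting from the largest loss. This is a sketch of that calculation for discrete distributions; the function name is hypothetical and the two distributions are those of Example 3.

```python
def tail_average(dist, alpha):
    """Average loss over the worst (1 - alpha) probability mass; dist maps loss -> probability."""
    tail = 1.0 - alpha
    remaining = tail
    total = 0.0
    for loss in sorted(dist, reverse=True):   # start from the largest loss
        mass = min(dist[loss], remaining)     # take only what still fits in the tail
        total += mass * loss
        remaining -= mass
        if remaining <= 1e-15:
            break
    return total / tail

one_project = {1: 0.98, 10: 0.02}                   # losses in $ millions
two_projects = {2: 0.9604, 11: 0.0392, 20: 0.0004}

es_one = tail_average(one_project, 0.975)
es_two = tail_average(two_projects, 0.975)
print(f"ES one project = {es_one:.3f}, ES two projects = {es_two:.3f}")
```

Splitting the probability mass at the VaR level, as done here, is what allows the expected shortfall of a discrete distribution to average over exactly 2.5% of outcomes.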


Example 4 – Violation of subadditivity in VaR

A bank has two $10 million one-year loans, each of which has a 1.25% chance of defaulting. If a default occurs, all losses between 0% and 100% of the principal are equally likely. If the loan does not default, a profit of $0.2 million is made. To simplify matters, we suppose that if one loan defaults it is certain that the other loan will not default. We would like to compute the 99% VaR. Since the loss given default is continuously distributed, this amounts to finding x such that P[X > x] = 0.01, or

\[ P[X > x \mid 1_{D_1} = 1] = \frac{0.01}{0.0125} = 80\%. \]


1. Consider first a single loan. This has a 1.25% chance of defaulting.

When a default occurs, the loss experienced is evenly distributed be-

tween zero and $10 million. Conditional on a loss being made, there

is an 80% (0.8) chance that the loss will be greater than $2 million.

Since the probability of a loss is 1.25% (0.0125), the unconditional

probability of a loss greater than $2 million is 0.8× 0.0125 = 0.01 or

1%. Mathematically, we observe

\[ P[X > 2 \mid 1_{D_1} = 1] = 0.8 \]

so that

\[ P[X > 2] = P[X > 2 \mid 1_{D_1} = 1]\,P[1_{D_1} = 1] = 0.8 \times 0.0125 = 0.01. \]

The one-year 99% VaR is therefore $2 million.


2. Consider next the portfolio of two loans. Each loan defaults 1.25%

of the time and they never default together so that

P [one loss] = 2× 1.25% = 2.5%

P [two losses] = 0.

Upon occurrence of a loan loss event, the loss amount is uniformly

distributed between zero and $10 million.

There is a 2.5% (0.025) chance of one of the loans defaulting and conditional on this event there is a 40% (0.4) chance that the loss on the loan that defaults is greater than $6 million. That is,

\[ P\left[X > 6 \mid \{1_{D_1} = 1,\, 1_{D_2} = 0\} \text{ or } \{1_{D_2} = 1,\, 1_{D_1} = 0\}\right] = \frac{0.01}{0.025} = 40\%. \]

The unconditional probability of a loss from a default being greater

than $6 million is therefore 0.4× 0.025 = 0.01 or 1%.

In the event that one loan defaults, a profit of $0.2 million is made on

the other loan, showing that the one-year 99% VaR is $5.8 million.


The total VaR of the loans considered separately is 2 + 2 = $4 million.

The total VaR after they have been combined in the portfolio is $1.8

million more at the value $5.8 million. This shows that the subadditivity

condition is violated.
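Both VaR figures in Example 4 follow from solving p × (1 − x/10) = 1 − α for x, where p is the probability of a loss event. The sketch below encodes that one-line solution; the function name and the `offset` argument (the profit on the surviving loan) are hypothetical conveniences, and figures are in $ millions.

```python
def var_uniform_default(p_loss, alpha, principal=10.0, offset=0.0):
    """VaR when a loss event has probability p_loss and the loss is uniform on [0, principal].

    offset is a profit earned elsewhere in the portfolio that reduces the net loss.
    Requires p_loss > 1 - alpha, otherwise the VaR lies in the no-default region.
    """
    if p_loss <= 1.0 - alpha:
        raise ValueError("loss event is too rare to reach the tail at this confidence level")
    # Solve p_loss * (1 - x / principal) = 1 - alpha for x
    x = principal * (1.0 - (1.0 - alpha) / p_loss)
    return x - offset

var_single = var_uniform_default(0.0125, 0.99)            # one loan
var_pair = var_uniform_default(0.025, 0.99, offset=0.2)   # two loans, never defaulting together
print(f"single loan VaR = {var_single:.1f}, two-loan VaR = {var_pair:.1f}")
```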

Managers can game the VaR measure to report good risk management

while exposing the firm to substantial risks

• Computing firmwide VaR is often a formidable task to perform. The

alternative is to segment the computations by instruments and risk

drivers, and to compute separate VaRs on tranches and desks of a

company. This is quite necessary in financial institutions since the

technological trading platforms are often desk by desk. With loss of

subadditivity, the firmwide VaR cannot be properly assessed since the

firmwide VaR may become significantly larger than the sum of all

VaRs from individual desks.


Expected shortfall

One-loan portfolio

We showed that the VaR for a single loan is $2 million. In the 1.0% tail

where X > VaR99% = 2, the loss ranges from 2 million to 10 million.

The expected shortfall from a single loan when the time horizon is one year and the confidence level is 99% is the expected loss on the loan conditional on a loss greater than $2 million. Since the loss is uniformly distributed, this is halfway between $2 million and $10 million, or $6 million.

Two-loan portfolio

When one loan defaults, the other loan (by assumption) does not. The

non-defaulting loan contributes a profit of $0.2 million. The profit/loss is

uniformly distributed between a gain of $0.2 million and a loss of $9.8

million. The VaR for a portfolio consisting of the two loans was calculated

in Example 4 as $5.8 million. The expected shortfall from the portfolio is

therefore the expected loss on the portfolio conditional on the loss being

greater than $5.8 million.


In the 1.0% tail of the loss distribution, where X > VaR99% = 5.8, the

loss ranges from 5.8 to 9.8. The expected loss, given that we are in the

part of the uniform distribution between $5.8 million and $9.8 million, is

$7.8 million. This is the expected shortfall of the portfolio.

Since $7.8 million is less than 2×$6 = $12 million, the expected shortfall

measure does satisfy the subadditivity condition.

The subadditivity condition is not of purely theoretical interest. It is

not uncommon for a bank to find that, when it combines two portfolios

(e.g., its equity portfolio and its fixed income portfolio), the VaR of the

combined portfolio goes up.


Risk control for expected utility-maximizing investors

Utility-maximizing investors with VaR constraint optimally choose to con-

struct vulnerable positions that can result in large losses exceeding the

VaR level.

Example

Suppose that an investor invests 100 million yen in the following four

mutual funds:

• concentrated portfolio A, consisting of only one defaultable bond with

4% default rate;

• concentrated portfolio B, consisting of only one defaultable bond with

0.5% default rate;

• a diversified portfolio that consists of 100 defaultable bonds with 5%

default rate;

• a risk-free asset.


The profiles of all bonds in these funds are stated as follows:

• maturity is one year

• defaults are mutually independent

• recovery rate is 10%.

Bond A has higher coupon rate with higher default rate. The yield to

maturity, default rate, and recovery rate are fixed until maturity.

Profiles of bonds included in the mutual funds

                           Number of bonds   Yield to       Default    Recovery
                           included          maturity (%)   rate (%)   rate (%)
Concentrated portfolio A   1                 4.75           4.00       10
Concentrated portfolio B   1                 0.75           0.50       10
Diversified portfolio      100               5.50           5.00       10
Risk-free asset            1                 0.25           0.00       -


Notation: \(W\) is the final wealth, \(W_0\) the initial wealth, \(X_1\) the amount invested in concentrated portfolio A, \(X_2\) the amount invested in concentrated portfolio B, and \(X_3\) the amount invested in the diversified portfolio.

Assuming logarithmic utility, the expected utility of the investor is:

\[
\begin{aligned}
E[u(W)] ={}& \sum_{n=0}^{100} 0.96 \cdot 0.995 \cdot 0.05^n \cdot 0.95^{100-n} \cdot {}_{100}C_n \cdot \ln w(1, 1) \\
&+ \sum_{n=0}^{100} 0.04 \cdot 0.995 \cdot 0.05^n \cdot 0.95^{100-n} \cdot {}_{100}C_n \cdot \ln w(0.1, 1) \\
&+ \sum_{n=0}^{100} 0.96 \cdot 0.005 \cdot 0.05^n \cdot 0.95^{100-n} \cdot {}_{100}C_n \cdot \ln w(1, 0.1) \\
&+ \sum_{n=0}^{100} 0.04 \cdot 0.005 \cdot 0.05^n \cdot 0.95^{100-n} \cdot {}_{100}C_n \cdot \ln w(0.1, 0.1),
\end{aligned}
\]

where

\[ w(a, b) = 1.0475\,a X_1 + 1.0075\,b X_2 + 1.055\,X_3\,\frac{100 - 0.9n}{100} + 1.0025\,(W_0 - X_1 - X_2 - X_3). \]

The multiplier 1.0475 in the first term is 1 + yield to maturity of bond A, etc.

The terms in \(E[u(W)]\) correspond to (i) non-default of the two bonds with \(a = b = 1\), (ii) default of the first bond and non-default of the second bond with \(a = 0.1\) and \(b = 1\), etc. The index \(n\) counts the number of defaults in the basket of 100 bonds. Note that \(\frac{100 - 0.9n}{100}\) is the fraction of par value received from the diversified portfolio when \(n\) of its bonds default: the \(100 - n\) non-defaulting bonds pay in full and each defaulting bond recovers 10%. Also, \(W_0 - X_1 - X_2 - X_3\) is the amount invested in the risk-free asset.
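The expected utility can be evaluated by looping over the four default states of bonds A and B and the 101 possible default counts in the diversified portfolio. The sketch below only evaluates the objective for a given allocation and does not solve the optimization problems; the function name and the sample allocation are hypothetical, and amounts are in millions of yen.

```python
import math

def expected_log_utility(x1, x2, x3, w0=100.0):
    """E[ln W] for amounts x1, x2, x3 in portfolios A, B and the diversified portfolio."""
    eu = 0.0
    for a, prob_a in ((1.0, 0.96), (0.1, 0.04)):          # bond A survives / defaults
        for b, prob_b in ((1.0, 0.995), (0.1, 0.005)):    # bond B survives / defaults
            for n in range(101):                          # defaults among the 100 bonds
                prob_n = math.comb(100, n) * 0.05 ** n * 0.95 ** (100 - n)
                wealth = (1.0475 * a * x1
                          + 1.0075 * b * x2
                          + 1.055 * x3 * (100 - 0.9 * n) / 100
                          + 1.0025 * (w0 - x1 - x2 - x3))
                eu += prob_a * prob_b * prob_n * math.log(wealth)
    return eu

# Hypothetical allocation: 10 in A, 10 in B, 30 in the diversified portfolio
print(f"E[u(W)] = {expected_log_utility(10.0, 10.0, 30.0):.6f}")
```

With nothing invested in risky assets, the final wealth is deterministic and the function returns ln(100.25), which serves as a quick sanity check that the state probabilities sum to one.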

We analyze the impact of risk management with VaR and expected short-

fall on the rational investor’s decisions by solving the following five opti-

mization problems, where the holding period is one year.

1. No constraint

\[ \max_{X_1, X_2, X_3} E[u(W)]. \]


2. Constraint with VaR at the 95% confidence level

\[ \max_{X_1, X_2, X_3} E[u(W)] \quad \text{subject to VaR (95\% confidence level)} \le 3. \]

3. Constraint with expected shortfall at the 95% confidence level

\[ \max_{X_1, X_2, X_3} E[u(W)] \quad \text{subject to expected shortfall (95\% confidence level)} \le 3.5. \]

4. Constraint with VaR at the 99% confidence level

\[ \max_{X_1, X_2, X_3} E[u(W)] \quad \text{subject to VaR (99\% confidence level)} \le 3. \]

5. Constraint with expected shortfall at the 99% confidence level

\[ \max_{X_1, X_2, X_3} E[u(W)] \quad \text{subject to expected shortfall (99\% confidence level)} \le 3.5. \]


Impact in risk concentration under VaR or ES constraint

• We analyze the effect of risk management with VaR and expected

shortfall by comparing solutions (2)-(5) with solution (1).

• The solution of the optimization problem with a 95% VaR constraint

shows that the amount invested in concentrated portfolio A is greater

than that of solution (1): that is, the portfolio concentration is en-

hanced by risk management using VaR. While VaR is reduced from

3.35 (unconstrained case) to 3, the expected shortfall increases from

5.26 (unconstrained case) to 14.35.

• The figure depicts the tails of the cumulative probability distributions

of the profit-loss of the portfolios. The left tail under VaR constraint

(95% confidence level) may suffer significant loss when bond A de-

faults (this risk is not well captured by VaR95%).


95% confidence level: Portfolios obtained with (i) no constraint, (ii) VaR

constraint, (iii) ES constraint.


Cumulative distribution of profit-loss: the left tail (95% confidence level).

VaR95% can be found directly by finding the profit/loss value at 5% of

the cumulative probability. The stepwise increments of the cumulative

distribution reflect the discrete loss amount upon default of a bond in

the portfolio. The extended horizontal segment along the 4% level in the

broken curve (2) exemplifies the occurrence of default of bond A (with

significant holding of 20.1%) with default rate 4%.


Under VaR constraint

When constrained by VaR, the investor must reduce her investment in the diversified portfolio to reduce maximum losses at the 95% confidence level. This is made possible by increasing investments either in a concentrated portfolio or in the risk-free asset.

• Concentrated portfolio A has little effect on VaR, since its default probability of 4% lies beyond the 95% confidence level. Concentrated portfolio A also yields a higher return than the other assets, except the diversified portfolio. Thus, the investor chooses to invest in concentrated portfolio A.

• Although VaR is reduced, the optimal portfolio is vulnerable due to

its concentration and larger losses under conditions beyond the VaR

level. With a high percentage holding (20.1%) in bond A, which provides a high yield to maturity of 4.75%, VaR is kept under control while the tail can become quite fat.


Under expected shortfall constraint

When constrained by expected shortfall, there is a higher chance that

the investor chooses optimally to reallocate his investment to a risk-free

asset, significantly reducing the portfolio risk.

• The investor cannot increase his investment in the concentrated port-

folio without affecting expected shortfall, which takes into account the

losses beyond the VaR level.

• Unlike risk management with VaR, risk management with expected

shortfall does not enhance credit concentration.

Remark

If investors can invest in assets whose loss is infrequent but large (such

as concentrated credit portfolios), the problem of tail risk can be serious.

Investors can manipulate the profit-loss distribution using those assets,

so that VaR becomes small while the tail becomes fat.

Raising the confidence level to 99%

• We examine whether raising the confidence level of VaR solves the

problem. The new table gives the results of the optimization problem

with a 99% VaR or expected shortfall constraint. It shows that when

constrained by VaR at the 99% confidence level, the investor optimally

chooses to increase his/her investment in concentrated portfolio B

because the default rate of concentrated portfolio B is 0.5%, outside

the confidence level of VaR.

• Risk management with expected shortfall reduces the potential loss

beyond the VaR level by reducing credit concentration.

• VaR may enhance credit concentration because it disregards losses

beyond the VaR level, even at high confidence levels. On the other

hand, expected shortfall reduces credit concentration because it takes

into account losses beyond the VaR level as a conditional expectation.
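The contrast can be illustrated with a minimal sketch, assuming a single hypothetical bond that loses a fixed amount with probability 0.5% (the default rate quoted above for concentrated portfolio B) and nothing otherwise. The loss amount of 100 and the function name `var_es_two_point` are illustrative assumptions, not figures taken from the table.

```python
def var_es_two_point(p_default, loss_given_default, alpha):
    """VaR and ES at level alpha for a loss that equals
    loss_given_default with probability p_default and 0 otherwise."""
    # VaR is the smallest x with P[loss <= x] >= alpha.
    var = 0.0 if 1.0 - p_default >= alpha else loss_given_default
    # ES averages the loss quantiles over the levels (alpha, 1].
    tail_mass = 1.0 - alpha
    default_mass_in_tail = min(p_default, tail_mass)
    es = loss_given_default * default_mass_in_tail / tail_mass
    return var, es


# Hypothetical loss of 100 on default; 0.5% default rate as for portfolio B.
var_99, es_99 = var_es_two_point(p_default=0.005, loss_given_default=100.0, alpha=0.99)
print(var_99, es_99)
```

Under these assumptions the 99% VaR is zero because default is rarer than 1%, while the expected shortfall is about half of the loss given default, which is why only the ES constraint penalizes the concentrated position.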

99% confidence level: Portfolios obtained with (i) no constraint, (ii) VaR

constraint, (iii) ES constraint.

• At a higher confidence level, the portfolio obtained with "no constraint"
is expected to show a higher VaR and ES.

Cumulative distribution P[X ≤ x], where X is the random profit

Cumulative distribution of profit-loss when the tail risk of VaR occurs.

Since the cumulative distribution is plotted against profit/loss, VaR can
be read off directly at the point where the horizontal line at level 1 − α
cuts the curve.

The upper curve on the left side shows a higher probability of significant
loss, so its expected shortfall on loss is larger (indicated by the
left-side arrow for expected shortfall). Its VaR, however, is a less
negative value on profit (a smaller VaR on loss). This shows a tradeoff
between VaR and expected shortfall: a lower VaR on loss is achieved at the
cost of a higher expected shortfall on loss.

Spectral risk measure

A risk measure can be characterized by the weights it assigns to quantiles

of the loss distribution.

• VaR gives a 100% weighting to the Xth quantile and zero to other

quantiles.

• Expected shortfall gives equal weight to all quantiles greater than the
Xth quantile and zero weight to all quantiles below the Xth quantile.

A spectral risk measure is defined by making assumptions about the

weights assigned to quantiles. A general result is that a spectral risk

measure is coherent (i.e., it satisfies the subadditivity condition) if the

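weight assigned to the qth quantile of the loss distribution is a
nondecreasing function of q. Expected shortfall satisfies this condition,
since its weights are zero below the Xth quantile and constant above it, so
the nondecreasing property is (marginally) satisfied. VaR does not, since
its weight drops back to zero above the Xth quantile.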
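As a rough illustration of the weighting idea, the sketch below computes a weighted average of empirical loss quantiles and checks the nondecreasing condition numerically. The function names, the mid-point discretization of quantile levels, the exponential weight function, and the loss sample 1, ..., 100 are all assumptions made for this example.

```python
import math


def spectral_risk(losses, weight_fn):
    """Weighted average of the empirical loss quantiles.

    weight_fn maps a quantile level p in (0, 1) to a non-negative weight;
    the weights are normalized so that they sum to one.
    """
    ordered = sorted(losses)
    n = len(ordered)
    levels = [(i + 0.5) / n for i in range(n)]  # mid-point level of each order statistic
    weights = [weight_fn(p) for p in levels]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, ordered)) / total


def es_weight(alpha):
    """Equal weight on quantile levels at or above alpha, zero below."""
    return lambda p: 1.0 if p >= alpha else 0.0


def var_weight(alpha, width):
    """All weight on a narrow band of levels starting at alpha (VaR-like)."""
    return lambda p: 1.0 if alpha <= p < alpha + width else 0.0


def exponential_weight(k):
    """A weight that rises with p, one example of a nondecreasing choice."""
    return lambda p: math.exp(-k * (1.0 - p))


def is_nondecreasing(weight_fn, grid=1000):
    """Check the coherence condition on a grid of quantile levels."""
    values = [weight_fn((i + 0.5) / grid) for i in range(grid)]
    return all(a <= b for a, b in zip(values, values[1:]))


losses = list(range(1, 101))  # hypothetical losses 1, 2, ..., 100
es_95 = spectral_risk(losses, es_weight(0.95))
var_95 = spectral_risk(losses, var_weight(0.95, 0.01))
print(es_95, var_95)
print(is_nondecreasing(es_weight(0.95)), is_nondecreasing(var_weight(0.95, 0.01)))
```

In this discretization the expected-shortfall weights and the exponential weights pass the nondecreasing check, while the VaR-like weights fail it.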

Economic capital (risk capital)

• This is the amount of capital a financial institution needs in order to

absorb losses over a certain time horizon (usually one year) with a

certain confidence level.

• The confidence level depends on financial institutions’ objectives.

Corporations rated AA have a one-year probability of default less than

0.1%. This suggests that the confidence level should be 99.9%, or

even higher.

We take a target level of statistical confidence into account. For a given
level of confidence α, let LP denote the random portfolio loss amount. We
define the credit VaR as the α-quantile of LP:

\[
q_\alpha = \inf\{\, q > 0 : P[L_P \le q] \ge \alpha \,\}.
\]

Also, we define the economic capital as

\[
EC_\alpha = q_\alpha - E[L_P].
\]

For example, with α = 99.98%, ECα is sufficient to cover losses in 9,998
out of 10,000 years (it is exceeded in two years out of 10,000), assuming
a planning horizon of one year.

Why is the quantile qα reduced by the expected loss (EL) in setting ECα?
This follows the usual practice of decomposing the total risk capital into
(i) the expected loss and (ii) a cushion against catastrophic losses.
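The definitions of qα and ECα can be sketched on a set of loss scenarios as follows. The scenario data (10,000 hypothetical one-year losses) and the function name `economic_capital` are assumptions for illustration; the quantile is taken as the smallest observed loss whose empirical cumulative probability reaches α.

```python
def economic_capital(losses, alpha):
    """Return (q_alpha, expected loss, economic capital) from loss scenarios."""
    ordered = sorted(losses)
    n = len(ordered)
    expected_loss = sum(ordered) / n
    # q_alpha is the smallest loss level q with empirical P[L_P <= q] >= alpha.
    q_alpha = ordered[-1]
    for i, loss in enumerate(ordered):
        if (i + 1) / n >= alpha:
            q_alpha = loss
            break
    return q_alpha, expected_loss, q_alpha - expected_loss


# Hypothetical one-year loss scenarios: 10,000 years, most of them loss-free.
scenarios = [0.0] * 9900 + [100.0] * 98 + [1000.0] * 2
q, el, ec = economic_capital(scenarios, alpha=0.9998)
print(q, el, ec)
```

In this made-up sample the two worst years (losses of 1,000) lie beyond the 99.98% quantile, so the capital covers 9,998 of the 10,000 scenario years.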

The expected loss is the mean of the loss, and the unexpected loss is the
standard deviation of the loss.

Extreme value theory

Extreme value theory (EVT) is used to estimate the tails of a distribution.

EVT can be used to improve VaR and ES estimates with a very high

confidence level. It involves smoothing and extrapolating the tails of an

empirical distribution.

Suppose that F (v) = P [V ≤ v] is the cumulative distribution function for

a loss variable V (such as the loss on a portfolio over a certain period of

time) and that u is a value assumed by V in the right-hand tail of the

distribution. The probability that V lies between u and u + y (y > 0) is

F(u + y) − F(u). The probability that V assumes a value greater than u is
1 − F(u). Define Fu(y) as the probability that V lies between u and u + y
conditional on V > u. This is

\[
F_u(y) = P[V \le u + y \mid V > u] = \frac{P[u < V \le u + y]}{P[V > u]} = \frac{F(u + y) - F(u)}{1 - F(u)}.
\]

The function Fu(y) defines the right tail of the probability distribution.
It is the cumulative probability distribution of the amount by which V
exceeds u, given that it does exceed u.
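A small sketch of this conditional distribution, using the empirical distribution in place of F, is given below. The ten loss observations and the helper names `empirical_cdf` and `excess_cdf` are hypothetical and serve only to make the formula concrete.

```python
def empirical_cdf(sample, x):
    """Fraction of observations that are <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def excess_cdf(sample, u, y):
    """F_u(y) = [F(u + y) - F(u)] / [1 - F(u)], using the empirical F."""
    f_u = empirical_cdf(sample, u)
    if f_u >= 1.0:
        raise ValueError("no observations exceed the threshold u")
    return (empirical_cdf(sample, u + y) - f_u) / (1.0 - f_u)


losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical loss observations
value = excess_cdf(losses, u=6, y=2)
print(value)
```

With u = 6, four observations exceed the threshold and two of them are at most 8, so the conditional probability is one half (up to floating-point rounding).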

Generalized Pareto distribution

For a wide class of distributions F (v), the distribution of the tail beyond u

as denoted by Fu(y) converges to a generalized Pareto distribution as the

threshold u is increased. The generalized Pareto (cumulative) distribution

is

\[
G_{\xi,\beta}(y) = 1 - \left(1 + \xi\,\frac{y}{\beta}\right)^{-1/\xi} \approx F_u(y) \quad \text{when } u \text{ is large.}
\]

The distribution has two parameters that have to be estimated from the

data, namely, ξ and β. The parameter ξ is the shape parameter which

determines the heaviness of the tail of the distribution. The parameter β

is a scale parameter that serves to scale y through the form y/β.

When the underlying variable V has a normal distribution, ξ = 0. As

the tails of the distribution become heavier, the value of ξ increases. For

most financial data, ξ is positive and in the range 0.1 to 0.4. For example,
when ξ is 1/2, the exponent −1/ξ equals −2, so the tail probability decays
as an inverse square power. A larger positive ξ corresponds to a thicker
tail.
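The effect of ξ on tail heaviness can be checked numerically with a short sketch of the cumulative distribution. The function name `gpd_cdf` and the evaluation point are assumptions; the ξ = 0 branch uses the exponential form 1 − exp(−y/β), which is the limit of the formula as ξ tends to zero and is not spelled out in the slides.

```python
import math


def gpd_cdf(y, xi, beta):
    """Generalized Pareto cdf G_{xi,beta}(y) for y >= 0, beta > 0 and xi >= 0."""
    if xi == 0.0:
        return 1.0 - math.exp(-y / beta)  # limit of the formula as xi -> 0
    return 1.0 - (1.0 + xi * y / beta) ** (-1.0 / xi)


# Probability of an excess larger than 5 * beta for increasingly heavy tails.
tail_probs = {xi: 1.0 - gpd_cdf(5.0, xi, 1.0) for xi in (0.0, 0.1, 0.4)}
for xi, prob in tail_probs.items():
    print(xi, prob)
```

The probability of a large excess grows as ξ increases, in line with the statement that a larger ξ corresponds to a heavier tail.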

Estimating ξ and β

The parameters ξ and β can be estimated using maximum likelihood

methods. The probability density function, gξ,β(y), of the cumulative

distribution is calculated by differentiating Gξ,β(y) with respect to y. This

gives

\[
g_{\xi,\beta}(y) = \frac{1}{\beta}\left(1 + \frac{\xi y}{\beta}\right)^{-1/\xi - 1}.
\]

We first choose a value for u. Recall that y is the loss above u. A value

close to the 95th percentile point of the empirical distribution usually

works well. We then rank the observations on V from the highest to the

lowest and focus our attention on those observations for which V > u.

Suppose there are nu such observations and they are vi (1 ≤ i ≤ nu).

We calibrate the distribution function by choosing the parameters to max-

imize the joint probability of the data points given the parameter values.

Assuming the data points are sampled independently, the joint probability
is equal to the product of their individual probabilities. The likelihood
function (assuming that ξ ≠ 0) is the product of the probability density
values of these nu observations:

\[
\prod_{i=1}^{n_u} \frac{1}{\beta}\left(1 + \frac{\xi (v_i - u)}{\beta}\right)^{-1/\xi - 1}.
\]

Maximizing this function is the same as maximizing its logarithm:

\[
\sum_{i=1}^{n_u} \ln\left[\frac{1}{\beta}\left(1 + \frac{\xi (v_i - u)}{\beta}\right)^{-1/\xi - 1}\right].
\]

Standard numerical procedures can be used to find the values of ξ and β
that maximize this expression.
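A minimal sketch of such a numerical procedure is shown below. It is not the method used in the slides: it swaps in a simple compass (pattern) search written in plain Python, and the synthetic excesses, the starting values, and the names `gpd_log_likelihood`, `fit_gpd` and `sample_gpd` are assumptions for illustration. In practice a library optimizer would normally be used instead.

```python
import math
import random


def gpd_log_likelihood(excesses, xi, beta):
    """Log-likelihood of GPD(xi, beta) for excesses y_i = v_i - u, with xi > 0."""
    if xi <= 0.0 or beta <= 0.0:
        return -math.inf  # keep the search inside the region xi > 0, beta > 0
    return sum(
        -math.log(beta) - (1.0 / xi + 1.0) * math.log(1.0 + xi * y / beta)
        for y in excesses
    )


def fit_gpd(excesses, xi0=0.3, beta0=40.0, step=0.5, tol=1e-6, max_iter=100000):
    """Maximize the log-likelihood by a simple compass search.

    Each pass tries scaling xi or beta up or down by the current step;
    the step is halved whenever no move improves the log-likelihood.
    """
    xi, beta = xi0, beta0
    best = gpd_log_likelihood(excesses, xi, beta)
    for _ in range(max_iter):
        if step <= tol:
            break
        improved = False
        for d_xi, d_beta in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            cand_xi = xi * (1.0 + step * d_xi)
            cand_beta = beta * (1.0 + step * d_beta)
            value = gpd_log_likelihood(excesses, cand_xi, cand_beta)
            if value > best:
                xi, beta, best, improved = cand_xi, cand_beta, value, True
        if not improved:
            step /= 2.0
    return xi, beta, best


def sample_gpd(n, xi, beta, seed=0):
    """Hypothetical test data: draw n GPD excesses by inverting the cdf."""
    rng = random.Random(seed)
    return [beta / xi * ((1.0 - rng.random()) ** (-xi) - 1.0) for _ in range(n)]


excesses = sample_gpd(200, xi=0.4, beta=30.0)
xi_hat, beta_hat, max_ll = fit_gpd(excesses)
print(xi_hat, beta_hat, max_ll)
```

The search moves multiplicatively, so both parameters stay positive throughout; the returned log-likelihood is never below its value at the starting point.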

Estimating the tail of the distribution

The probability that V > u + y conditional on V > u is 1 − Gξ,β(y). The
probability that V > u is 1 − F(u). The unconditional probability that
V > x (when x > u) is therefore

\[
[1 - F(u)]\,[1 - G_{\xi,\beta}(x - u)], \quad \text{where } y = x - u.
\]

If n is the total number of observations, an estimate of 1−F (u), calculated

from the empirical data, is nu/n. The unconditional probability that V > x

is therefore

\[
P(V > x) = \frac{n_u}{n}\,[1 - G_{\xi,\beta}(x - u)] = \frac{n_u}{n}\left(1 + \xi\,\frac{x - u}{\beta}\right)^{-1/\xi}. \tag{A}
\]

Calculation of VaR and ES

To calculate VaR with a confidence level of q, we solve

F (VaR) = q.

Since F (x) = 1− P(V > x), we obtain

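\[
q = 1 - \frac{n_u}{n}\left(1 + \xi\,\frac{\mathrm{VaR} - u}{\beta}\right)^{-1/\xi},
\]

so that

\[
\mathrm{VaR} = u + \frac{\beta}{\xi}\left\{\left[\frac{n}{n_u}(1 - q)\right]^{-\xi} - 1\right\}. \tag{B}
\]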

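The expected shortfall is the mean loss conditional on the loss exceeding
VaR. Since the unconditional density of V beyond u is (nu/n) gξ,β(x − u)
and P(V > VaR) = 1 − q, it is given (for ξ < 1) by

\[
\mathrm{ES} = \frac{1}{1 - q}\int_{\mathrm{VaR}}^{\infty} x\,\frac{n_u}{n}\,g_{\xi,\beta}(x - u)\,dx
= \frac{1}{1 - q}\,\frac{n_u}{n}\int_{\mathrm{VaR}}^{\infty} \frac{x}{\beta}\left[1 + \frac{\xi (x - u)}{\beta}\right]^{-1/\xi - 1} dx
= \frac{\mathrm{VaR} + \beta - \xi u}{1 - \xi}.
\]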
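The closed form can be motivated by the following sketch. It assumes ξ < 1, so that the mean of the generalized Pareto distribution is finite, and uses the standard facts that the excess of a generalized Pareto variable over a higher threshold is again generalized Pareto and that Gξ,β has mean β/(1 − ξ). The symbols w and z are introduced here for illustration only. Writing w = VaR − u, the loss beyond VaR satisfies

\[
P[V > \mathrm{VaR} + z \mid V > \mathrm{VaR}]
= \frac{\left(1 + \xi\,\frac{w + z}{\beta}\right)^{-1/\xi}}{\left(1 + \xi\,\frac{w}{\beta}\right)^{-1/\xi}}
= \left(1 + \frac{\xi z}{\beta + \xi w}\right)^{-1/\xi},
\]

so the excess over VaR follows a generalized Pareto distribution with the same ξ and scale β + ξ(VaR − u). Adding its mean to VaR gives

\[
\mathrm{ES} = \mathrm{VaR} + \frac{\beta + \xi(\mathrm{VaR} - u)}{1 - \xi} = \frac{\mathrm{VaR} + \beta - \xi u}{1 - \xi}.
\]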

Historical simulation

To express the approach algebraically, define vi as the value of a market

variable on Day i and suppose that today is Day n. The ith scenario in

the historical simulation approach assumes that the value of the market

variable tomorrow will be

\[
\text{value under the } i\text{th scenario} = v_n\,\frac{v_i}{v_{i-1}}.
\]

September 25, 2008, is an interesting date to choose in evaluating an

equity investment. The turmoil in credit markets, which started in August

2007, was more than a year old. Equity prices had been declining for

several months. Volatilities were increasing. Lehman Brothers had filed

for bankruptcy 10 days earlier. The Treasury Secretary’s $700 billion

Troubled Asset Relief Program (TARP) had not yet been passed by the

United States Congress. Note that the Nikkei 225 (Japan) and the FTSE 100
(UK) were hit harder than the DJIA (US) in the midst of the 2008 financial
tsunami.

Sample calculations

The DJIA was 11,022.06 on September 25, 2008. On August 8, 2006,

it was 11,173.59, down from 11,219.38 on August 7, 2006. The value

of the DJIA under Scenario 1 is therefore

\[
11{,}022.06 \times \frac{11{,}173.59}{11{,}219.38} = 10{,}977.08.
\]

Similarly, the values of the FTSE 100, the CAC 40, and the Nikkei 225
(measured in U.S. dollars) are 9,569.23, 6,204.55, and 115.05,
respectively. The value of the portfolio under Scenario 1 is therefore
(in $000s):

\[
4{,}000 \times \frac{10{,}977.08}{11{,}022.06} + 3{,}000 \times \frac{9{,}569.23}{9{,}599.90} + 1{,}000 \times \frac{6{,}204.55}{6{,}200.40} + 2{,}000 \times \frac{115.05}{112.82} = 10{,}014.334.
\]
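The Scenario 1 arithmetic can be reproduced with a short sketch. The index levels and holdings are those quoted above; the function and variable names are illustrative assumptions, and because the inputs are rounded the portfolio value agrees with the quoted figure only to within a few cents (in $000s).

```python
def scenario_value(v_today, v_i, v_prev):
    """Value under scenario i: today's value scaled by the day i-1 to day i change."""
    return v_today * v_i / v_prev


# DJIA: today's level, then the August 8 and August 7, 2006 levels.
djia_scenario_1 = scenario_value(11022.06, 11173.59, 11219.38)

# (holding in $000s, value on September 25, 2008, value under Scenario 1)
positions = {
    "DJIA": (4000.0, 11022.06, djia_scenario_1),
    "FTSE 100": (3000.0, 9599.90, 9569.23),
    "CAC 40": (1000.0, 6200.40, 6204.55),
    "Nikkei 225": (2000.0, 112.82, 115.05),
}

portfolio_value = sum(h * scen / today for h, today, scen in positions.values())
loss = sum(h for h, _, _ in positions.values()) - portfolio_value
print(round(djia_scenario_1, 2), round(portfolio_value, 3), round(loss, 3))
```

A negative loss here means the portfolio gains under Scenario 1.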

Scenario generated for September 26, 2008, using Data in the above

table (all indices are measured in U.S. dollars)

The FTSE100 in British pound × exchange rate of one British pound in

US dollars gives the FTSE100 in US dollars.

Extreme Value Theory Calculations

The parameter u is chosen to be 160 so that nu = 22. The trial values

for β and ξ are 40 and 0.3, respectively.

• If the choice of u is changed to 200, then nu = 8. There is a tradeoff
between u and nu: a higher u leads to a smaller nu, since fewer loss
scenarios exceed 200.

The above table shows calculations for the trial values β = 40 and ξ = 0.3.

The value of the log-likelihood function is −108.37.

The search for the values of β and ξ that maximize the log-likelihood

function gives β = 32.532 and ξ = 0.436, and the maximum value of the

log-likelihood function is −108.21.

Suppose that we wish to estimate the probability that the portfolio loss be-

tween September 25 and September 26, 2008, will be more than $300,000

(or 3% of its value). Using u = 160, we obtain from eq.(A) that

\[
\frac{22}{500}\left(1 + 0.436 \times \frac{300 - 160}{32.532}\right)^{-1/0.436} = 0.0039,
\]

which is more accurate than counting observations. The probability that

the portfolio loss will be more than $500,000 (or 5% of total portfolio

value) is 0.00086 by following a similar procedure.
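Both tail probabilities can be checked with a direct evaluation of eq.(A), as sketched below with the parameter values quoted in the example; the function name `evt_tail_probability` is an assumption.

```python
def evt_tail_probability(x, u, n_u, n, xi, beta):
    """Estimate P(V > x) for x > u from eq. (A)."""
    return (n_u / n) * (1.0 + xi * (x - u) / beta) ** (-1.0 / xi)


# u = 160, n_u = 22, n = 500 and the fitted values of xi and beta from the example.
fitted = dict(u=160.0, n_u=22, n=500, xi=0.436, beta=32.532)
p_300 = evt_tail_probability(300.0, **fitted)
p_500 = evt_tail_probability(500.0, **fitted)
print(round(p_300, 4), round(p_500, 5))
```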

VaR calculations

Using eq.(B), the value of VaR with a 99% confidence level is

\[
160 + \frac{32.532}{0.436}\left\{\left[\frac{500}{22}(1 - 0.99)\right]^{-0.436} - 1\right\} = 227.8
\]

or $227,800. In this instance, the VaR estimate is about $25,000 less
than the fifth worst loss. When the confidence level is increased to 99.9%,
VaR becomes

\[
160 + \frac{32.532}{0.436}\left\{\left[\frac{500}{22}(1 - 0.999)\right]^{-0.436} - 1\right\} = 474.0
\]

or $474,000. When it is increased further to 99.97%, VaR becomes

\[
160 + \frac{32.532}{0.436}\left\{\left[\frac{500}{22}(1 - 0.9997)\right]^{-0.436} - 1\right\} = 742.5
\]

or $742,500.
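The three VaR figures can be reproduced approximately by evaluating eq.(B) directly, as in the sketch below. The function name `evt_var` is an assumption, and because the quoted values of ξ and β are rounded, the results match the figures above only to within about half a unit.

```python
def evt_var(q, u, n_u, n, xi, beta):
    """VaR at confidence level q from eq. (B)."""
    return u + (beta / xi) * ((n / n_u * (1.0 - q)) ** (-xi) - 1.0)


# Parameter values quoted in the example: u = 160, n_u = 22, n = 500.
params = dict(u=160.0, n_u=22, n=500, xi=0.436, beta=32.532)
var_levels = {q: evt_var(q, **params) for q in (0.99, 0.999, 0.9997)}
for q, value in var_levels.items():
    print(q, round(value, 1))
```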

ES calculations

We can improve ES estimates and allow the confidence level used for ES

estimates to be increased. In our example, when the confidence level is

99%, the estimated ES is

\[
\frac{227.8 + 32.532 - 0.436 \times 160}{1 - 0.436} = 337.9
\]

or $337,900. When the confidence level is 99.9%, the estimated ES is

\[
\frac{474.0 + 32.532 - 0.436 \times 160}{1 - 0.436} = 774.8
\]

or $774,800.
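The ES figures follow from the closed-form expression once VaR is known, as the sketch below shows using the VaR values quoted above as inputs. The function name `evt_es` is an assumption. With these rounded inputs the 99% figure is reproduced closely, while the 99.9% figure comes out slightly lower than 774.8, presumably because the quoted figure was computed from unrounded estimates.

```python
def evt_es(var, u, xi, beta):
    """Expected shortfall implied by the fitted tail, valid for xi < 1."""
    return (var + beta - xi * u) / (1.0 - xi)


# VaR inputs are the 99% and 99.9% figures quoted in the example.
es_99 = evt_es(227.8, u=160.0, xi=0.436, beta=32.532)
es_999 = evt_es(474.0, u=160.0, xi=0.436, beta=32.532)
print(round(es_99, 1), round(es_999, 1))
```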

Probability density calculations

The probability density function evaluated at the VaR level for the prob-

ability distribution of the loss, conditional on it being greater than 160,

is given by the gξ,β function. It is

\[
\frac{1}{32.532}\left(1 + \frac{0.436 \times (227.8 - 160)}{32.532}\right)^{-1/0.436 - 1} = 0.0037.
\]

The unconditional probability density function evaluated at the VaR level

is nu/n = 22/500 times 0.0037 = 0.00016.

Choices of u

The choice of u represents a tradeoff between the accuracy of approximating
the tail distribution (higher u) and the number of data points available
for the calibration (lower u).

It is often found that values of ξ and β do depend on u, but the estimates

of F (x) remain roughly the same. We want u to be sufficiently high that

we are truly investigating the shape of the tail of the distribution, but

sufficiently low that the number of data items included in the maximum

likelihood calculation is not too low. More data lead to improved accuracy

in the assessment of the shape of the tail.

A rule of thumb is that u should be approximately equal to the 95th

percentile of the empirical distribution. In the case of the data we have

been looking at, the 95th percentile of the empirical distribution is 156.5.

In the search for the optimal values of ξ and β, both variables should be

constrained to be positive.

Semi-positive definiteness and nonnegativity of eigenvalues

Recall that a matrix A is said to be semi-positive definite if x^T A x ≥ 0
for all x. An eigenvalue λ and the corresponding eigenvector v ≠ 0 of the
matrix A are defined by

Av = λv.

The eigenvalues of a semi-positive definite matrix A are known to be
non-negative.

To prove the claim, suppose not, so that some eigenvalue λ is negative.
Then v^T A v = λ v^T v < 0 since v^T v > 0, which contradicts x^T A x ≥ 0
for all x.
