Estimating the Predictive Distribution for Loss Reserve Models

Glenn Meyers

Casualty Loss Reserve Seminar

September 12, 2006

Objectives of Paper

• Develop a methodology for predicting the distribution of outcomes for a loss reserve model.

• The methodology will draw on the combined experience of other “similar” insurers.
– Use Bayes’ Theorem to identify “similar” insurers.

• Illustrate the methodology on Schedule P data for several insurers.

• Test the predictions of the methodology with data from later Schedule P reports.

• Compare results with reported reserves.

A Quick Description of the Methodology

• Expected loss is predicted by chain ladder/Cape Cod type formula

• The distribution of the actual loss around the expected loss is given by a collective risk (i.e. frequency/severity) model.


• The first step in the methodology is to get the maximum likelihood estimates of the model parameters for several large insurers.

• For an insurer’s data
– Find the likelihood (probability of the data) given the parameters of each model in the first step.
– Use Bayes’ Theorem to find the posterior probability of each model in the first step given the insurer’s data.


• The predictive loss model is a mixture of each of the models from the first step, weighted by its posterior probability.

• From the predictive loss model, one can calculate ranges or statistics of interest such as the standard deviation or various percentiles of the predicted outcomes.

The Data

• Commercial Auto Paid Losses from 1995 Schedule P (from AM Best)
– Long enough tail to be interesting, yet we expect minimal development after 10 years.

• Selected 250 Insurance Groups
– Exposure in all 10 years
– Believable payment patterns
– Set negative incremental losses equal to zero.

16 insurer groups account for one half of the premium volume

Look at Incremental Development Factors

• Accident year 1986

• Proportion of loss paid in the “Lag” development year

• Divided the 250 Insurers into four industry segments, each accounting for about 1/4 of the total premium.

• Plot the payment paths

Incremental Development Factors - 1986

Incremental development factors appear to be relatively stable for the 40 insurers that represent about 3/4 of the premium.

They are highly unstable for the 210 insurers that represent about 1/4 of the premium.

The variability appears to increase as size decreases

Expected Loss Model

• Paid Loss is the incremental paid loss in the AY and Lag
• ELR is the Expected Loss Ratio
• ELR and Dev_Lag are unknown parameters
– Can be estimated by maximum likelihood
– Can be assigned posterior probabilities for Bayesian analysis

• Similar to “Cape Cod” method in that the expected loss ratio is estimated rather than determined externally.

\[ \mathrm{E}[\text{Paid Loss}_{AY,Lag}] = \text{Premium}_{AY} \times \text{ELR} \times \text{Dev}_{Lag} \]
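As a minimal sketch of the expected loss formula, the snippet below fills in every (AY, Lag) cell from a premium vector, a single ELR, and a development pattern. The function names and the 3-year inputs are hypothetical and chosen only for illustration; the deck itself works with 10 accident years and 10 lags.

```python
def expected_paid_loss(premium_by_ay, elr, dev_by_lag):
    """Expected incremental paid loss for every (AY, Lag) cell, 1-indexed."""
    return {
        (ay, lag): premium * elr * dev
        for ay, premium in enumerate(premium_by_ay, start=1)
        for lag, dev in enumerate(dev_by_lag, start=1)
    }


def is_observed(ay, lag, n_years):
    """Cells on or above the latest diagonal have already been paid."""
    return lag <= n_years + 1 - ay


# Hypothetical inputs, not taken from the Schedule P data.
premium = [1000.0, 1100.0, 1200.0]
dev = [0.5, 0.3, 0.2]
expected = expected_paid_loss(premium, elr=0.7, dev_by_lag=dev)

# Expected unpaid loss is the sum over the cells below the latest diagonal.
unpaid = sum(
    value for (ay, lag), value in expected.items() if not is_observed(ay, lag, 3)
)
```

Summing the unobserved cells gives the expected unpaid loss under one (ELR, Dev) parameter set.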

Distribution of Actual Loss around the Expected Loss

• Compound Negative Binomial Distribution (CNB)
– Conditional on Expected Loss: CNB(x | E[Paid Loss])
– Claim count is negative binomial
– Claim severity distribution determined externally

• The claim severity distributions were derived from data reported to ISO. Policy Limit = $1,000,000
– Vary by settlement lag. Later lags are more severe.
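To make the collective risk idea concrete, here is a rough simulation sketch of one compound negative binomial draw. Several pieces are assumptions that do not come from the slides: the negative binomial is generated as a gamma-mixed Poisson with a hypothetical `contagion` parameter (count variance = mean + contagion × mean²), and a lognormal capped at the $1,000,000 policy limit stands in for the ISO severity curves, which are not reproduced here.

```python
import math
import random


def poisson_draw(rng, mean):
    """Knuth's multiplication method; adequate for small means only."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_cnb(rng, expected_count, contagion, sev_mu, sev_sigma,
                 limit=1_000_000.0):
    """One simulated loss: negative binomial count, capped lognormal severity."""
    # Gamma mixing variable with mean 1 and variance `contagion` (assumed form).
    mix = rng.gammavariate(1.0 / contagion, contagion) if contagion > 0 else 1.0
    n_claims = poisson_draw(rng, expected_count * mix)
    # Hypothetical severity: lognormal, censored at the policy limit.
    return sum(min(rng.lognormvariate(sev_mu, sev_sigma), limit)
               for _ in range(n_claims))


rng = random.Random(2006)
sample = [simulate_cnb(rng, 5.0, 0.1, 9.0, 1.5) for _ in range(1000)]
```

The deck evaluates the CNB distribution directly rather than by simulation; the sketch is only meant to show how frequency and severity combine.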

Claim Severity Distributions

[Figure: claim severity distributions, plotted separately for Lag 1, Lag 2, Lag 3, Lag 4, and Lags 5-10]

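Likelihood Function for a Given Insurer’s Losses {x_{AY,Lag}}

\[ \text{Likelihood} = \prod_{AY=1}^{10} \prod_{Lag=1}^{11-AY} \mathrm{CNB}\left(x_{AY,Lag} \mid \mathrm{E}[\text{Paid Loss}_{AY,Lag}]\right) \]

where

\[ \mathrm{E}[\text{Paid Loss}_{AY,Lag}] = \text{Premium}_{AY} \times \text{ELR} \times \text{Dev}_{Lag} \]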
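In practice the product over cells is usually accumulated as a sum of logs. The sketch below shows that structure only: `cell_logpdf` is a hypothetical callable standing in for the log of CNB(x | E[Paid Loss]), and the toy density used in the example is a Gaussian-shaped placeholder, not the CNB.

```python
def log_likelihood(paid, premium_by_ay, elr, dev_by_lag, cell_logpdf):
    """Sum of per-cell log densities over the observed (AY, Lag) cells.

    `paid` maps 1-indexed (ay, lag) to the observed incremental paid loss.
    `cell_logpdf(x, mean)` is a placeholder for log CNB(x | mean).
    """
    total = 0.0
    for (ay, lag), x in paid.items():
        mean = premium_by_ay[ay - 1] * elr * dev_by_lag[lag - 1]
        total += cell_logpdf(x, mean)
    return total


def toy_logpdf(x, mean):
    """Stand-in density for illustration only; NOT the CNB."""
    return -0.5 * ((x - mean) / (0.1 * mean)) ** 2


# Hypothetical two-year triangle.
paid = {(1, 1): 360.0, (1, 2): 200.0, (2, 1): 380.0}
ll = log_likelihood(paid, [1000.0, 1100.0], 0.7, [0.5, 0.3], toy_logpdf)
```

Maximizing this quantity over ELR and the Dev factors, subject to whatever constraints are imposed, gives the maximum likelihood estimates.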

Maximum Likelihood Estimates of Incremental Development Factors

Loss development factors reflect the constraints on the MLEs described on the prior slide.

Contrast this with the observed 1986 loss development factors on the next slide

Incremental Development Factors - 1986 (Repeat of Earlier Slide)

Loss payment factors appear to be relatively stable for the 40 insurers that represent about 3/4 of the premium.

They are highly unstable for the 210 insurers that represent about 1/4 of the premium.

The variability appears to increase as size decreases

Maximum Likelihood Estimates of Expected Loss Ratios

Estimates of the ELRs are more volatile for the smaller insurers.

Using Bayes’ Theorem

• Let Ω = {ELR, Dev_Lag, Lag = 1,2,…,10} be a set of models for the data.
– A model may consist of different “models” or of different parameters for the same “model.”

• For each model in Ω, calculate the likelihood of the data being analyzed.

\[ \Pr\{\text{data} \mid \text{model}\} \]

Using Bayes’ Theorem

• Then using Bayes’ Theorem, calculate the posterior probability of each parameter set given the data.

\[ \text{Posterior}\{\text{model} \mid \text{data}\} \propto \Pr\{\text{data} \mid \text{model}\} \times \text{Prior}\{\text{model}\} \]
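A minimal sketch of this normalization over a finite set of models follows. It assumes log-likelihoods are available and that every prior is strictly positive; subtracting the largest log weight before exponentiating is a common numerical safeguard and is not something the slides specify.

```python
import math


def posterior_probabilities(log_likelihoods, priors):
    """Posterior{model | data} for a finite list of models.

    Works with log-likelihoods so that very small likelihoods, typical
    when many cells are multiplied together, do not underflow to zero.
    """
    log_weights = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    peak = max(log_weights)
    weights = [math.exp(w - peak) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical models with equal prior probability.
posterior = posterior_probabilities([math.log(0.2), math.log(0.6)], [0.5, 0.5])
```

With equal priors the posterior is just the likelihoods rescaled to sum to one, so the second model here receives three times the weight of the first.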

Prior Distribution of Loss Payment Paths

Prior loss payment paths come from the loss development paths of the insurers ranked 1-40, with equal probability

Posterior loss payment path is a mixture of prior loss development paths.

Prior Distribution of Expected Loss Ratios

The prior distribution of expected loss ratios was chosen by visual inspection.

Predicting Future Loss Payments Using Bayes’ Theorem

• For each model, estimate the statistic of choice, S, for future loss payments.

• Examples of S
– Moments of future loss payments.
– The probability density of a future loss payment of x.
– The cumulative probability, or percentile, of a future loss payment of x.

• These examples can apply to single (AY,Lag) cells, or any combination of cells such as a given Lag or accident year.
– Use FFTs to calculate the distribution of the sum of cells
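The FFT step can be sketched as follows, assuming each cell's loss distribution has been discretized onto a common grid and that cells are independent given a model (both are assumptions of this sketch). The distribution of a sum is then the convolution of the cell distributions, computed by multiplying their transforms. The small radix-2 FFT is written out only to keep the example dependency-free.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def convolve_pmfs(pmf_a, pmf_b):
    """Distribution of the sum of two independent discretized losses."""
    out_len = len(pmf_a) + len(pmf_b) - 1
    size = 1
    while size < out_len:  # pad to a power of two to avoid wrap-around
        size *= 2
    fa = fft(list(pmf_a) + [0.0] * (size - len(pmf_a)))
    fb = fft(list(pmf_b) + [0.0] * (size - len(pmf_b)))
    result = fft([x * y for x, y in zip(fa, fb)], inverse=True)
    # The unnormalized inverse needs dividing by size; clip rounding noise.
    return [max(value.real / size, 0.0) for value in result[:out_len]]


# Two hypothetical cells, each paying 0 or 1 grid unit with equal probability.
total_pmf = convolve_pmfs([0.5, 0.5], [0.5, 0.5])
```

Applying `convolve_pmfs` repeatedly gives the distribution for a whole accident year, a whole lag, or the total reserve under one model.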

Predicting Future Loss Payments Using Bayes’ Theorem

• Calculate the Statistic S for each model.

• Then the posterior estimate of S is the model estimate of S weighted by the posterior probability of each model

\[ \text{Posterior Estimate of } S = \sum_{i=1}^{n} \left( S \mid \text{model}_i \right) \times \text{Posterior}\{\text{model}_i \mid \text{data}\} \]
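A short sketch of the weighting, with hypothetical function names and inputs. One point worth making explicit: the formula applies to statistics that are linear in the distribution (moments, densities, cumulative probabilities). A standard deviation is not such a statistic, so the sketch averages the first two moments and derives the standard deviation of the mixture from them.

```python
def posterior_estimate(stat_by_model, posterior):
    """Posterior-weighted average of a per-model statistic S."""
    return sum(s * p for s, p in zip(stat_by_model, posterior))


def mixture_mean_and_sd(means, sds, posterior):
    """Mean and standard deviation of the posterior mixture of models."""
    mean = posterior_estimate(means, posterior)
    second_moments = [sd ** 2 + m ** 2 for m, sd in zip(means, sds)]
    second = posterior_estimate(second_moments, posterior)
    return mean, max(second - mean ** 2, 0.0) ** 0.5


# Two hypothetical models with equal posterior probability.
mean, sd = mixture_mean_and_sd([100.0, 200.0], [10.0, 10.0], [0.5, 0.5])
cv = sd / mean
```

In this example the mixture standard deviation (about 51) is far larger than either model's own standard deviation of 10, because disagreement between the models about the mean adds to the predictive uncertainty.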

Sample Calculations for Selected Insurers

• Coefficient of Variation of predictive distribution of unpaid losses.

• Plot the probability density of the predictive distribution of unpaid losses.

Predictive Distribution: Insurer Rank 7

Predictive Mean = $401,951 K

CV of Total Reserve = 6.9%

Predictive Distribution: Insurer Rank 97

Predictive Mean = $40,277 K

CV of Total Reserve = 12.6%

CV of Unpaid Losses

Validating the Model on Fresh Data

• Examined data from 2001 Annual Statements
– Both 1995 and 2001 statements contained losses paid for accident years 1992-1995.
– Often statements did not agree in overlapping years because of changes in corporate structure. We got agreement in earned premium for 109 of the 250 insurers.

• Calculated the predicted percentiles for the amount paid from 1996 to 2001

• If the model works, the predicted percentiles should be uniformly distributed.

PP Plots on Validation Data

Plot sorted predicted percentiles against uniform distribution.

Significant differences given by Kolmogorov-Smirnov test.

Critical values @ 95% = ±13.03%
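The test can be sketched as below. The statistic is the largest vertical gap between the empirical distribution of the predicted percentiles and the uniform distribution. The critical value uses the common large-sample approximation 1.36 / sqrt(n) for a two-sided 5% test; that the slide used this approximation is an inference, though with n = 109 it gives about 13.03%, which matches the stated band.

```python
import math


def ks_statistic(percentiles):
    """Largest gap between the empirical CDF and Uniform(0, 1).

    `percentiles` are predicted percentiles expressed as fractions in [0, 1].
    """
    ordered = sorted(percentiles)
    n = len(ordered)
    gap = 0.0
    for i, p in enumerate(ordered, start=1):
        # Check the empirical CDF just after and just before each jump.
        gap = max(gap, i / n - p, p - (i - 1) / n)
    return gap


def ks_critical_value_95(n):
    """Large-sample approximation to the two-sided 5% critical value."""
    return 1.36 / math.sqrt(n)


band = ks_critical_value_95(109)  # 109 insurers in the validation sample
```

If `ks_statistic` of the 109 predicted percentiles stays below `band`, the PP plot lies inside the critical lines and uniformity is not rejected at the 5% level.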

Feedback

• If you have paid data, you must also have the posted reserves. How do your predictions match up with reported reserves?

• Your results are conditional on the data reported in Schedule P. Shouldn’t an actuary with access to detailed company data (e.g. case reserves) be able to get more accurate estimates?

Predictive and Reported Reserves

• For the validation sample, the predictive mean (in aggregate) is closer to the 2001 retrospective reserve than to the reserve initially reported in 1995.

• Possible conservatism in reserves. OK?
• “%” means % reported over the predictive mean.
• Retrospective = reported less paid prior to end of 1995.

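| Sample | Predictive Mean (000) | Reported 1995 Reserve, Initial @ 1995 (000) | % | Reported 1995 Reserve, Retrospective @ 2001 (000) | % |
|---|---|---|---|---|---|
| 250 Insurers, AY 1986-1995 | 14,873,303 | 16,221,998 | 9.1% | --- | --- |
| 109 Insurers, AY 1992-1995 | 1,798,794 | 1,976,299 | 9.9% | 1,842,104 | 2.4% |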
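The “%” figures can be checked directly from the table as the reported reserve divided by the predictive mean, minus one. The helper name below is hypothetical; the amounts are the ones shown on the slide, in thousands.

```python
def pct_over(reported, predictive_mean):
    """Percentage by which a reported reserve exceeds the predictive mean."""
    return 100.0 * (reported / predictive_mean - 1.0)


all_insurers_initial = pct_over(16_221_998, 14_873_303)
validation_initial = pct_over(1_976_299, 1_798_794)
validation_retrospective = pct_over(1_842_104, 1_798_794)

for label, value in [
    ("250 insurers, initial @ 1995", all_insurers_initial),
    ("109 insurers, initial @ 1995", validation_initial),
    ("109 insurers, retrospective @ 2001", validation_retrospective),
]:
    print(f"{label}: {value:.1f}%")
```

All three percentages are positive: in aggregate the reported reserves sit above the predictive mean, and the gap narrows once six more years of payments are known.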

Reported Reserves More Accurate?

• Divide the validation sample into two groups and look at subsequent development.
1. Reported Reserve < Predictive Mean
2. Reported Reserve > Predictive Mean

• Expected result if Reported Reserve is accurate.
– Reported Reserve = Retrospective Reserve for each group

• Expected result if Predictive Mean is accurate?
– Predictive Mean ≈ Retrospective Reserve for each group
– There are still some outstanding losses in the retrospective reserve.

Subsequent Reserve Changes

• Group 1
– 50-50 up/down
– Ups are bigger

• Group 2
– More downs than ups

• Results are independent of insurer size

| Subsequent Reserve Changes | Group 1: Reported Reserve @ 1995 < Predictive Mean (000) | Group 2: Reported Reserve @ 1995 > Predictive Mean (000) |
|---|---|---|
| Number of Insurers | 66 | 43 |
| Total Predictive Mean | 926,134 | 872,660 |
| 1995 Reserve @ 1995 | 803,175 | 1,173,124 |
| 1995 Reserve @ 2001 | 856,393 | 985,711 |
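A quick way to read the table is to compute, for each group, how far the 1995 reserve moved between the 1995 and 2001 statements. The dictionary layout and function name are hypothetical; the amounts (in thousands) are from the table.

```python
groups = {
    "Group 1 (reported < predictive mean)": {"at_1995": 803_175, "at_2001": 856_393},
    "Group 2 (reported > predictive mean)": {"at_1995": 1_173_124, "at_2001": 985_711},
}


def pct_change(initial, retrospective):
    """Percentage change in the reserve from the initial to the later view."""
    return 100.0 * (retrospective / initial - 1.0)


changes = {
    name: pct_change(g["at_1995"], g["at_2001"]) for name, g in groups.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Group 1's reserve developed upward by roughly 6.6% and Group 2's downward by roughly 16%, in each case moving toward the group's predictive mean.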

• The CNB formula identified two groups where:
– Group 1 tends to under-reserve
– Group 2 tends to over-reserve

• Incomplete agreement at Group level
– Some in each group get it right

Main Points of Paper

• How do we evaluate a stochastic loss reserve formula?
– Test predictions of future loss payments
– Test on several insurers
– Main focus is the testing

• Are there any formulas that can pass these tests?
– Bayesian CNB does pretty well on Commercial Auto Schedule P data.
– Uses information from many insurers
– Are there other formulas? This paper sets a bar for additional research to raise.
