Electronic copy available at: http://ssrn.com/abstract=1090863
A Mixing Model For Operational Risk
Jim Gustafsson Jens Perch Nielsen
30th June 2008
Abstract
External data can often be useful in improving estimation of operational risk loss distributions.
This paper develops a systematic approach that incorporates external information into internal
loss distribution modelling. The standard statistical model resembles Bayesian methodology or credibility theory in the sense that prior knowledge (external data) carries more weight when internal data is scarce than when internal data is abundant.
Key words and phrases: Mixing data sources, Prior knowledge, Operational risk, Actuarial
loss models, Transformation, Generalized Champernowne distribution, Scarce data.
JEL Classification: C13, C14.
1 Introduction
Calculating loss distributions for operational risk using only internal data often fails to capture potential risks and unprecedented large loss amounts that have a huge impact on the capital. Conversely, calculating the loss distribution using only external data provides a capital figure that is not sensitive to internal losses, does not increase despite the occurrence of a large internal loss, and does not decrease despite improvements in internal controls. The model proposed in this paper combines internal and external data. We first go through an intuitive and simplified example of how prior
RSA Scandinavia & University of Copenhagen | Gammel Kongevej 60 | DK 1790 Copenhagen V, Denmark| E-mail:[email protected]| telephone: +45 33 555 23 57, fax: +45 33 55 21 22 |Cass Business School | E:mail: Jens.Nielsen,[email protected]|
knowledge can be incorporated into modern estimation techniques. Consider the simplified example: let $(H_i)_{1\le i\le n}$ be iid stochastic variables with density $h$ on $[0,1]$, and let $\bar h$ be another density on $[0,1]$ representing prior knowledge of the density $h$. We wish to estimate $h$ guided by our prior knowledge density $\bar h$. Let now $g = h/\bar h$ and let $\hat h = \bar h\,\hat g$ be our estimator of $h$ guided by prior knowledge. When $\bar h$ is close to $h$, then $g$ is close to the density of a Uniform distribution. To estimate $g$ we take advantage of nonparametric smoothing techniques, where we consider the simplest smoothing technique available: kernel smoothing. Note that in the above estimation of $h$ we have changed the task from estimating the completely unknown and complicated density $h$ to estimating the simpler $g$, which is close to the density of a Uniform distribution when $\bar h$ represents good prior knowledge. Now let $\hat g$ be a kernel density estimator of $g$. The nature of kernel smoothing is such that it smooths a lot when data is sparse. In our case this means that $\hat h$ will be close to $\bar h$ for very small sample sizes. When data is more abundant, less kernel smoothing takes place and the complicated features of $g$ can be visualized. Therefore, kernel smoothing incorporating prior knowledge has many similarities to Bayesian methodology or credibility theory, which also use global information as prior knowledge when data is scarce and a more local estimation when data is abundant.
The purpose of this paper is to use prior knowledge from external data to improve our estimation
of the internal data distribution. The underlying dynamics of the new mixing model transformation
approach reflects the simple intuition in the illustration above on how to include prior knowledge
on $[0,1]$-densities. In our case we wish to estimate loss distributions on the entire positive real axis. This is done by estimating a distribution $\hat F$ on the external data. Then we transform the internal data $(X_i)_{1\le i\le n}$ with the estimated external cumulative distribution function, i.e. using $\hat F(X_i)$. The transformed data set will have a density equal to the true internal density divided by the estimated external density, distribution-transformed so that the support is on $[0,1]$. When the external data distribution is a close approximation to the internal data distribution, we have simplified the problem to estimating something close to a Uniform distribution. When this estimation problem has been solved through kernel smoothing, we back-transform to the original axis and obtain our final mixing estimator.
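The transform-smooth-backtransform recipe can be sketched in code as follows. The lognormal external transform, the Epanechnikov kernel and the bandwidth are illustrative assumptions of ours; the paper itself fits a generalized Champernowne distribution to the external data:

```python
import math
import random

def lognorm_cdf(x, mu, sigma):
    # cdf of a lognormal; stands in for the fitted external transform F-hat
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def lognorm_pdf(x, mu, sigma):
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def epanechnikov(v):
    return 0.75 * (1.0 - v * v) if abs(v) <= 1.0 else 0.0

def kde01(u, pts, h):
    # kernel density on [0,1] with a local-constant boundary correction:
    # divide by the kernel mass that actually falls inside [0,1] at u
    num = sum(epanechnikov((u - p) / h) for p in pts) / (len(pts) * h)
    lo, hi = max(-1.0, (u - 1.0) / h), min(1.0, u / h)
    steps = 200
    mass = sum(epanechnikov(lo + (hi - lo) * (k + 0.5) / steps)
               for k in range(steps)) * (hi - lo) / steps
    return num / mass if mass > 0.0 else 0.0

def mixing_density(x, internal, mu, sigma, h=0.15):
    # back-transformed estimator: f-hat(x) = T'(x) * l-hat(T(x))
    u_pts = [lognorm_cdf(xi, mu, sigma) for xi in internal]
    return lognorm_pdf(x, mu, sigma) * kde01(lognorm_cdf(x, mu, sigma), u_pts, h)
```

With perfect prior knowledge the transformed points are uniform, the kernel correction is close to one, and the estimator collapses to the external density; with abundant internal data the kernel part dominates and corrects the external start.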
The proposed mixing model is based on a semi-parametric estimator of the loss distribution. Klug-
man, Panjer and Willmot (1998), McNeil, Frey and Embrechts (2005), Cizek, Hardle and Weron
(2005) and Panjer (2006) are important introductions to actuarial estimation techniques of purely
parametric loss distributions. Cruz (2001) describes a number of parametric distributions which can
be used for modeling loss distributions. Embrechts, Kluppelberg and Mikosch (1999) and Embrechts
(2000) focus on the tail and Extreme Value Theory (EVT). Further, the recent paper by Degen, Embrechts and Lambrigger (2007) discusses some fundamental properties of the g-and-h distribution and how it is linked to the well documented EVT based methodology. However, if one focuses on the excess function, the link to the Generalized Pareto Distribution (GPD) has an extremely slow convergence rate, and capital estimation at small levels of risk tolerance using EVT may lead to inaccurate results if the parametric g-and-h distribution is chosen as model. Diebold, Schuermann and Stroughair (2001) and Dutta and Perry (2006) also stress weaknesses of EVT when it comes to real data analysis. This problem may be solved by non- and semi-parametric procedures. For a review of
modern kernel smoothing techniques see Wand and Jones (1995). During the last decade a class of
semi-parametric methods was developed and designed to work better than a purely non-parametric
approach, see Bolance, Guillen and Nielsen (2003), Hjort and Glad (1995) and Clements, Hurn and
Lindsey (2003). They showed that non-parametric estimators could substantially be improved in a
transformation process and they offer several alternatives for the transformation itself.
The paper by Buch-Larsen, Bolance, Guillen and Nielsen (2005) maps the original data into $[0,1]$ via the parametric start and corrects non-parametrically for possible misspecification. A flexible parametric start helps to achieve an accurate estimation: Buch-Larsen et al. (2005) generalized the Champernowne distribution (GCD), see Champernowne (1936, 1952), and used its cumulative distribution as the transformation function. In the spirit of Buch-Larsen et al. (2005) many remedies have
been proposed. Gustafsson, Hagmann, Nielsen and Scaillet (2008) investigate the performance be-
tween symmetric and asymmetric kernels in [0,1], Gustafsson, Nielsen, Pritchard and Roberts (2006)
estimate operational risk losses with a continuous credibility model, and Guillen, Gustafsson, Nielsen
and Pritchard (2007), and the extended version by Buch-Kromann, Englund, Gustafsson, Nielsen and
Thuring (2007), introduced the concept of under-reporting in operational risk quantification. Bolance,
Guillen and Nielsen (2008) develop a transformation kernel density estimator by using a double
transformation to estimate heavy-tailed distributions. Before describing the new method we mention
other recent methods in the literature. The papers by Shevchenko and Wuthrich (2006), Buhlmann,
Shevchenko and Wuthrich (2007) and Lambrigger, Shevchenko and Wuthrich (2007) combine loss
data with scenario analysis information via Bayesian inference for the assessment of operational risk,
and Verrall, Cowell and Khoon (2007) examine Bayesian networks for operational risk assessment.
Figini, Giudici, Uberti and Sanyal (2008) develop a method that estimates the internal data distribution with help from truncated external data originating from exactly the same underlying distribution.
In our approach we allow the external distribution to differ from the internal distribution; our transformation approach has the consequence that the internal estimation process is guided by the more stable estimator originating from the external data. If the underlying distribution of the external data is close to the underlying distribution of the internal data, then this approach improves the efficiency of estimation. We also present a variation of this approach in which we correct for the estimated median, so that only the shape of the underlying internal data has to be close to the shape of the underlying external data for our procedure to work well.
2 The asymptotic properties of the transformation approach
Let $(X_i)_{1\le i\le n}$ be a sequence of iid random losses from a probability distribution $F(x)$ with unknown density function $f(x)$. Let $T(x)$ be a twice continuously differentiable transformation function, with a first derivative $T'(x)$ that serves as a global start. This density is assumed to provide a meaningful but potentially inaccurate description of $f(x)$. The transformation function $T(x)$ could be a cumulative distribution function, or it could belong to a non-parametric class of functions, see e.g. Wand et al. (1991) and Buch-Larsen et al. (2005). The transformed density estimator with a local constant boundary
correction can be written as
$$\hat f(x) = T'(x)\, a_{01}(T(x),h)^{-1}\, n^{-1} \sum_{i=1}^{n} K_h\big(T(X_i) - T(x)\big) = T'(x)\, \hat l(T(x)), \qquad (2.1)$$

with local model $\hat l(\cdot)$, transformed losses $T(X_i)$, bandwidth $h$ and symmetric kernel $K_h$ with $K_h(\cdot) = \frac{1}{h} K\big(\frac{\cdot}{h}\big)$, and

$$a_{ij}(u,h) = \int_{\max\{-1,\,(u-1)/h\}}^{\min\{1,\,u/h\}} (vh)^{i} K(v)^{j}\, dv. \qquad (2.2)$$
A deeper analysis of the transformation process can be found in e.g. Gustafsson et al. (2008). The asymptotic theory of (2.1) is presented in Theorem 1, and the proof can be found in Buch-Larsen et al. (2005).
Theorem 1 Let the transformation function $T(x)$ be a twice differentiable known function. The bias of $\hat f(x)$ is given by

$$E\big[\hat f(x)\big] - f(x) = 2^{-1} a_{21}(T(x),h)\,\psi(T(x)) + o(h^{2}), \quad \text{with} \quad \psi(T(x)) = \Big(\tfrac{f}{T'} \circ T^{-1}\Big)''(T(x))\; T'(x),$$

and the variance

$$V\big[\hat f(x)\big] = a_{02}(T(x),h)\,(nh)^{-1} f(x)\, T'(x) + o\big((nh)^{-1}\big),$$

where the asymptotics hold for $n \to \infty$, $h \to 0$ and $nh \to \infty$.
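As a concrete illustration (our own, not from the paper), the weight $a_{01}(u,h)$ equals one in the interior and drops below one near the boundaries of $[0,1]$, which is exactly what the local constant correction in (2.1) divides out. For the Epanechnikov kernel (an assumed choice; the paper does not fix the kernel) it can be evaluated numerically:

```python
import math

def epanechnikov(v):
    return 0.75 * (1.0 - v * v) if abs(v) <= 1.0 else 0.0

def a_ij(i, j, u, h, steps=2000):
    # numeric a_{ij}(u,h): integral of (v*h)^i * K(v)^j over
    # [max(-1, (u-1)/h), min(1, u/h)], midpoint rule as in (2.2)
    lo = max(-1.0, (u - 1.0) / h)
    hi = min(1.0, u / h)
    if hi <= lo:
        return 0.0
    w = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        v = lo + (k + 0.5) * w
        total += (v * h) ** i * epanechnikov(v) ** j * w
    return total
```

In the interior, $a_{01} = 1$ and $a_{21} = h^2 \mu_2(K)$, recovering the familiar kernel bias and variance constants appearing in Theorem 1.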
3 The transformation technique
In the previous section a general description of the transformation approach and its asymptotic prop-
erties was given. In this section we assume that the transformation function is a cumulative distribution function $T(x,\theta)$ with density $T'(x,\theta)$, indexed by the global parameter $\theta = \{\theta_1, \theta_2, \ldots, \theta_p\} \in \mathbb{R}^p$. As before, let $(X_i)_{1\le i\le n}$ be a sequence of random losses originating from a
probability distribution $F(x)$ with unknown density $f(x)$. Then estimator (2.1) can be reformulated as

$$\hat f(x,\theta) = T'(x,\theta)\, a_{01}(T(x,\theta),h)^{-1}\, n^{-1} \sum_{i=1}^{n} K_h\big(T(X_i,\theta) - T(x,\theta)\big) = T'(x,\theta)\, \hat l(T(x,\theta)). \qquad (3.1)$$

Now let $\hat\theta$ be some suitable estimator of $\theta$; then the estimator (3.1) becomes

$$\hat f(x,\hat\theta) = T'(x,\hat\theta)\, a_{01}(T(x,\hat\theta),h)^{-1}\, n^{-1} \sum_{i=1}^{n} K_h\big(T(X_i,\hat\theta) - T(x,\hat\theta)\big) = T'(x,\hat\theta)\, \hat l(T(x,\hat\theta)). \qquad (3.2)$$
From e.g. Buch-Larsen et al. (2005) we know that when $\hat\theta$ is a square-root-$n$-consistent estimator of $\theta$, the asymptotic theory of $\hat f(x,\hat\theta)$ is equivalent to the asymptotic theory of $\hat f(x,\theta)$. The asymptotic theory of $\hat f(x,\hat\theta)$ immediately follows from Theorem 1, and

$$\sqrt{nh}\,\Big(\hat f(x,\hat\theta) - E\big[\hat f(x,\hat\theta)\big]\Big) \to N\Big(0,\; a_{02}(T(x),h)\, f(x)\, T'(x)\Big)$$

holds.
Let now $(X_i)_{1\le i\le n}$ be a sequence of collected internal losses and let $(Y_j)_{1\le j\le m}$ be a sequence of externally reported losses. Then $(Y_j)_{1\le j\le m}$ represents prior knowledge that should enrich a limited internal data set. The parametric density estimator based on the external data set offers prior knowledge from the general industry. The correction of this prior knowledge is based on a local kernel smoother of the transformed internal data points. This provides us with a consistent and fully nonparametric estimator of the density of the internal data. The approach is therefore fully nonparametric at the same time as it is informed by prior information based on external data sources. The new mixing local transformation kernel density estimator has asymptotics as (3.1) in Theorem 1. Hence, if $T(x,\theta_y)$ is a misspecified approximation of $f(x)$, $\hat\theta_y$ converges in probability to the pseudo
true value which minimizes the Kullback-Leibler distance¹ between $T'(x,\theta_y)$ and the true density $f(x)$. Therefore $\hat\theta_y$ is a square-root-$n$-consistent estimator of $\theta_y$. The proposed mixing model can be expressed as
$$\hat f(x,\hat\theta_y) = T'(x,\hat\theta_y)\, a_{01}(T(x,\hat\theta_y),h)^{-1}\, n^{-1} \sum_{i=1}^{n} K_h\big(T(X_i,\hat\theta_y) - T(x,\hat\theta_y)\big) = T'(x,\hat\theta_y)\, \hat l(T(x,\hat\theta_y)) \qquad (3.3)$$
and has the same asymptotic theory as (3.1) for $n \to \infty$, $h \to 0$ and $nh \to \infty$.
The global start $T(x,\theta)$ in (3.2) and (3.3) is modelled by the Generalized Champernowne Distribution (GCD) with density

$$T'(x,\theta) = \frac{\alpha (x+c)^{\alpha-1}\big((M+c)^{\alpha} - c^{\alpha}\big)}{\big((x+c)^{\alpha} + (M+c)^{\alpha} - 2c^{\alpha}\big)^{2}}, \qquad x \in \mathbb{R}_+, \qquad (3.4)$$

incorporating internal parameters $\theta = \{\alpha, M, c\}$ and external parameters $\theta_y = \{\alpha_y, M_y, c_y\}$. Here $\alpha$ is a tail parameter, $M$ controls the body of the distribution and $c$ is a shift parameter that controls the domain close to zero. The parameter vectors $\theta$ and $\theta_y$ are estimated by maximum likelihood; for a deeper analysis of the GCD, see Buch-Larsen et al. (2005).
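A small sketch of the GCD as transformation function: its cdf and density, together with a deliberately crude fit in which $M$ is fixed at the empirical median (the estimation route suggested in Buch-Larsen et al. 2005) while $(\alpha, c)$ are picked by a grid search over the log-likelihood. The paper itself uses full maximum likelihood, so the grid search is our simplification:

```python
import math

def gcd_pdf(x, a, M, c):
    # density (3.4) of the generalized Champernowne distribution
    num = a * (x + c) ** (a - 1.0) * ((M + c) ** a - c ** a)
    den = ((x + c) ** a + (M + c) ** a - 2.0 * c ** a) ** 2
    return num / den

def gcd_cdf(x, a, M, c):
    # corresponding cdf, used as the transformation function T(x, theta);
    # note gcd_cdf(M, ...) = 0.5, so M is the median
    return ((x + c) ** a - c ** a) / ((x + c) ** a + (M + c) ** a - 2.0 * c ** a)

def fit_gcd(data, alphas=None, cs=None):
    # crude fit: M at the empirical median, (alpha, c) by log-likelihood grid
    xs = sorted(data)
    n = len(xs)
    M = xs[n // 2] if n % 2 else 0.5 * (xs[n // 2 - 1] + xs[n // 2])
    alphas = alphas or [0.5 + 0.1 * k for k in range(30)]
    cs = cs or [0.0, 0.01, 0.1, 0.5, 1.0]
    best = None
    for a in alphas:
        for c in cs:
            ll = sum(math.log(gcd_pdf(x, a, M, c)) for x in xs)
            if best is None or ll > best[0]:
                best = (ll, a, c)
    return best[1], M, best[2]
```

Because the cdf evaluates to exactly one half at $x = M$, the body parameter is pinned down by the sample median before the tail and shift parameters are searched.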
Another slightly different mixing model will also be included in the paper. This estimator follows the same idea as in Gustafsson et al. (2006). The model is the same as (3.3), but instead of using the prior knowledge parameter vector $\theta_y$, this generalized model transforms the internal losses with a combined parameter vector $\theta_{\bar y} = \{\alpha_y, \hat M, c_y\}$. The difference is that prior knowledge is only provided by the tail parameter $\alpha_y$ and the shift parameter $c_y$, while the body is represented by the estimated internal parameter $\hat M$. With this model we enclose the external parameters $\alpha_y$ and $c_y$ with a body described internally by $\hat M$.
¹The Kullback-Leibler distance is a distance function from a true probability density $f(x)$ to a target probability estimator $T'(x,\theta_y)$, defined as $\int_0^\infty f(w)\log\frac{f(w)}{T'(w,\theta_y)}\,dw$.
4 Data study
This section demonstrates the practical relevance of the mixing model presented above. We estimate a loss distribution and calculate next year's operational risk exposure using the common risk measures Value-at-Risk (VaR) and Tail-Value-at-Risk (TVaR) for different levels of risk tolerance. For the severity estimator we employ models (3.2) and (3.3), and benchmark against the parametric Weibull distribution. For the frequency model we assume that $N(t)$ is an independent homogeneous Poisson process, $N(t) \sim Po(\lambda t)$, with positive intensity $\lambda$. Using Monte Carlo simulation with these severity and frequency assumptions we can create a simulated one-year operational risk loss distribution.
We denote the internal collected losses from the event risk category Employment Practices and Workplace Safety by $(X_i)_{1\le i\le N(t)}$, where $N(t)$ describes the random number of losses over a fixed time period $t = 0,\ldots,T$ with $N(0) = 0$. Further, $(Y_j)_{1\le j\le M(t)}$ represents external data from the same event risk category with random number of losses $M(t)$, $M(0) = 0$. Table 1 reports summary statistics for each data set.
TABLE 1
Statistics for Event Risk Category Employment Practices and Workplace Safety

                Number of   Maximum     Sample      Sample       Standard         Time
                Losses      Loss ($M)   Mean ($M)   Median ($M)  Deviation ($M)   Horizon (T)
Internal data   120         16.80       1.86        0.30         3.66             2
External data   6526        561.43      1.82        0.18         13.41            8
The mean and median are similar for the two data sets. However, the number of losses, the maximum loss, the standard deviation and the collection period differ widely. We condition on $N = n$ and sample the Poisson process of internal events that occur over a one-year time horizon. The maximum likelihood estimator of the annual intensity based on the $n = 120$ internal losses is $\hat\lambda = n/T = 120/2 = 60$. We denote the annual simulated frequencies by $\eta_r$ with $r = 1,\ldots,R$ and number of simulations $R = 10{,}000$. For each $r$ we draw uniformly distributed random samples and combine these with
loss sizes taken from the inverse of the severity distribution function. The outcome is the annual total loss distribution, denoted by

$$S_r = \sum_{k=1}^{\eta_r} \hat F^{-1}(u_{rk}, \hat\theta), \qquad r = 1,\ldots,R,$$

with $u_{rk} \sim U(0,1)$ for $k = 1,\ldots,\eta_r$, where $\hat F^{-1}$ is the inverse of

$$\hat F(x, \hat\theta) = \int_0^{x} \hat f(s, \hat\theta)\, ds,$$

and where the parameter vector is $\hat\theta$ for model (3.2), or the prior knowledge vectors $\hat\theta_y$ and $\hat\theta_{\bar y}$ for (3.3). For
simplicity, we introduce the abbreviations M1 for the semi-parametric model (3.2), and M2 and M3 for (3.3) with $\hat\theta_y$ and $\hat\theta_{\bar y}$ respectively. M4 is the purely external data version of (3.2), where the prior knowledge is locally corrected with external data. This corresponds to the situation where a company has not started to collect internal data and must rely entirely on prior knowledge. Finally, M5 is the benchmark model with a Weibull assumption on the severity. A loss distribution can then be calculated with relevant return periods, thereby identifying the capital to be held to protect against unexpected losses. The two risk measures used are the VaR
$$\mathrm{VaR}_\alpha(S_r) = \sup\{ s \in \mathbb{R} \mid P(S_r \ge s) \ge 1 - \alpha \}$$

and the TVaR, which gives the expectation of the area above the VaR and is defined as

$$\mathrm{TVaR}_\alpha(S_r) = E[\,S_r \mid S_r \ge \mathrm{VaR}_\alpha(S_r)\,]$$
for risk tolerance $\alpha$. Table 2 collects summary statistics for the simulated total loss distribution across each model. Among the usual summary statistics we report VaR and TVaR for risk tolerance $\alpha \in \{0.95, 0.99, 0.999\}$.
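The compound-Poisson simulation described above can be sketched as follows. The exponential inverse cdf is a placeholder for the fitted severity inverse $\hat F^{-1}(\cdot,\hat\theta)$, and the empirical VaR/TVaR estimators are our own simple choices:

```python
import math
import random

def simulate_annual_losses(lam, severity_inv, R=10000, rng=None):
    # compound-Poisson simulation: draw an annual count from Po(lam),
    # then add severities via the inverse-cdf method
    rng = rng or random.Random(1)
    totals = []
    for _ in range(R):
        # Poisson draw by multiplying uniforms (Knuth); fine for moderate lam
        limit, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                break
            k += 1
        totals.append(sum(severity_inv(rng.random()) for _ in range(k)))
    return totals

def var_tvar(totals, alpha):
    # empirical VaR and TVaR at risk tolerance alpha
    s = sorted(totals)
    idx = min(len(s) - 1, int(math.ceil(alpha * len(s))) - 1)
    tail = s[idx:]
    return s[idx], sum(tail) / len(tail)
```

For example, with intensity 60 and a fitted severity one would call `simulate_annual_losses(60.0, inv_cdf)` and read off `var_tvar(totals, 0.95)`; TVaR is by construction at least as large as VaR at the same tolerance.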
TABLE 2
Statistics of Simulated Loss Distributions for Event Risk Category Employment Practices and Workplace Safety ($M)

Model   Mean   Median   Sd    VaR95%   VaR99%   VaR99.9%   TVaR95%   TVaR99%   TVaR99.9%
M1      226    178      149   547      765      980        691       889       1180
M2      244    186      167   609      881      1227       787       1025      1385
M3      296    213      241   633      936      1331       820       1143      1487
M4      316    274      322   801      1165     1644       1073      1388      1858
M5      100    97       27    249      272      294        363       383       405
One can see from Table 2 that all loss distributions are right skewed, since every mean is larger than its respective median. Interpreting the VaR results, one notices that the fully prior knowledge model M4 shows much larger values than the others. The benchmark model M5 predicts the lowest values of the considered models; comparing M5 against the purely internal estimator M1, model M5 suggests only 30%-40% of the capital requirement that M1 recommends. The mixing model M2 has only an 8% higher sample mean and 10%-25% higher VaR values than M1, which is due to the prior knowledge based on external data. The TVaR results show a similar pattern to the VaR results. A conclusion is that the two mixing models, M2 and M3, are able to stabilize the estimation process and do not underestimate the tail as model M5 apparently does.
5 Data Driven Simulation Study
This section provides a data driven simulation study in which we evaluate scenarios with different prior knowledge for the new mixing model. The study confirms what we intuitively expected, namely that the mixing model outperforms model (3.2) in situations with scarce data. As data becomes more abundant, the proposed model loses its advantage and sometimes even hurts performance a bit. Another insight is that the parametric Weibull distribution provides doubtful goodness-of-fit to heavy-tailed data. The study brings into play two true distributions with
different characteristics. The idea of a data driven simulation study, in contrast to a predetermined simulation study, is that the two true distributions are estimated on the real operational risk samples described in the previous section. The two distributions used in the study are presented by their respective densities below.
1. Lognormal: $f(x,\theta_1) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\log(x)-\mu}{\sigma}\right)^{2}}$, $x \in \mathbb{R}_+$, with location and scale parameters $\theta_1 = \{\mu, \sigma\}$.

2. Generalized Champernowne Distribution (GCD): $f(x,\theta_2) = \frac{\alpha(x+c)^{\alpha-1}\left((M+c)^{\alpha}-c^{\alpha}\right)}{\left((x+c)^{\alpha}+(M+c)^{\alpha}-2c^{\alpha}\right)^{2}}$, $x \in \mathbb{R}_+$, with $\theta_2 = \{\alpha, M, c\}$.
The study was also carried out with the Exponential, Weibull, Gamma and Pareto distributions. The outcomes were similar to the lognormal and GCD cases, but are not reported in the paper, although available upon request. The two true distributions are estimated on the two available data sets by maximum likelihood estimation, producing the estimated parameter vectors $\hat\theta_d$ and $\hat\theta_{y,d}$ with $d = 1,2$. Random losses are drawn from the estimated true models to obtain pseudo data. The internal data are expressed by $(X^{b}_{di})_{1\le i\le n_k}$, where $d$ indicates the distribution treated, $b = 1,\ldots,B$ the iteration with $B = 2000$, and the sample sizes simulated for next year's exposure $N(t) = n_k$ are chosen as $n_k = \{10, 20, 50, 100, 200, 500, 1000, 2000, 4000\}$. The external samples are denoted by $(Y^{b}_{dj})_{1\le j\le m_l}$ with $M(t) = m_l$, where $m_l = \{100, 5000\}$. For each data set the internal and external global start, $T(x,\theta)$ in (3.2) and (3.3), is assumed to follow the GCD described in (3.4). The model in (3.2) is then estimated on each sample combination $\{b, d, n_k\}$, and for each sample $n_k$ we estimate the mixing model (3.3) on each external data combination $\{b, d, m_l\}$.

In summary, for each internal sample $n_k$ six estimators will be compared: the pure internal estimator (3.2); four scenarios of the mixing model (3.3), informed with different external sample sizes $m_l$ and different global parametric starts; and, also here, the benchmark parametric Weibull distribution.
The performance measures considered are means over replications of the integrated squared error (ISE), which measures the error between the true and the estimated density with equal weight across the support, and the weighted integrated squared error (WISE), which focuses on the fit in the tails. We
can write the statistical performance measures as

$$\mathrm{ISE} = \left[\frac{1}{B}\sum_{b=1}^{B}\int_0^{\infty}\left(\hat f(x,\hat\theta) - f(x,\hat\theta_d)\right)^{2} dx\right]^{1/2}$$

$$\mathrm{WISE} = \left[\frac{1}{B}\sum_{b=1}^{B}\int_0^{\infty}\left(\hat f(x,\hat\theta) - f(x,\hat\theta_d)\right)^{2} x^{2}\, dx\right]^{1/2}$$
with true density $f(x,\hat\theta_d)$, and the estimator $\hat f(x,\hat\theta)$ as one of the six different models. Note that the true model is only enriched with the internal data.
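A one-replication numeric version of the two error measures (the truncation point and the midpoint integration rule are implementation choices of ours, not from the paper) can be sketched as:

```python
import math

def ise_wise(f_hat, f_true, grid_max=50.0, steps=5000):
    # numeric ISE and WISE between an estimated and a true density:
    # ISE integrates the squared error, WISE weights it by x^2 (tail focus)
    w = grid_max / steps
    ise = wise = 0.0
    for k in range(steps):
        x = (k + 0.5) * w                      # midpoint of the k-th cell
        d2 = (f_hat(x) - f_true(x)) ** 2
        ise += d2 * w
        wise += d2 * x * x * w
    return math.sqrt(ise), math.sqrt(wise)
```

The paper averages the squared errors over $B$ replications before taking the square root; the sketch handles a single replication, so averaging over iterations would be done by the caller.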
Figure 1 presents the results for the true lognormal distribution. We illustrate the results by a relative error, meaning that the ISE and WISE outcomes of the proposed models are divided by those of the pure internal estimator. A curve above one indicates a worse prediction from the mixing model at that specific sample point.
*** Figure 1 About Here ***
The top left-hand graph fluctuates around one; the mixing model therefore has a similar ISE to the pure internal estimator. For some internal sample points the pure internal estimator (3.2) gives less than half a percentage point better results, and among the compared mixing models the one enriched with 5000 external observations, $\hat f(x,\hat\theta_y)_{5000}$, performs best when the maximum number of internal observations is reviewed. The top right-hand graph presents the benchmark model performance: the parametric Weibull grows from 20% worse to above 30% worse as the internal sample size increases. For the tail measure WISE we stress that above $n_4$ the mixing models stabilize at between 1%-2% worse performance. Note that the breakpoint is higher when one focuses on the extremes than when considering the whole axis, as in ISE. The bottom right-hand graph presents the benchmark performance; clearly the Weibull predicts the true underlying density of the internal data set poorly, culminating in 60% poorer results at sample size $n_9 = 4000$.
In the situation with the GCD, the picture is almost identical to the lognormal case. Figure 2 gives a visual interpretation of the results.
*** Figure 2 About Here ***
For the ISE the breakpoint has increased to around $n_5$ compared to the lognormal case; thereafter the mixing models stabilize in the region between 5%-10% poorer prediction power. Focusing on WISE, we see that the breakpoint lies between 100 and 500 internal observations.
6 Conclusions
This paper gives an intuitive and easily implemented model that combines different data sources. From the data study we found that using a light-tailed parametric fit leads to small Value-at-Risk figures. Also, the mixing methods seem to give more stable and more realistic results than parametric methods that do not include external data. From the data driven simulation study we conclude that, in situations with scarce internal data, the proposed mixing model outperforms the model with only one data source. However, as the internal data becomes more abundant, a breakpoint can be identified where (3.2) becomes better than (3.3). We also find that the breakpoints kick in earlier for light-tailed distributions than for heavy-tailed ones. Large data sets will often do better without prior knowledge, in the sense that the transformation is better estimated from pure internal data. However, when the breakpoint is reached, practitioners need to carry out a deep data analysis to find out whether the internal data alone include large losses that they believe cover the whole company's operational risk exposure. Otherwise the mixing model should be preferred, since it automatically enriches the internal exposure with large external losses.
References
[1] Bolance, C., Guillen, M. and Nielsen, J.P. (2003). Kernel density estimation of actuarial loss functions. Insurance: Mathematics and Economics, Vol. 32, 19-36.

[2] Bolance, C., Guillen, M. and Nielsen, J.P. (2008). Inverse Beta transformation in kernel density estimation. Statistics & Probability Letters (to appear).

[3] Buch-Larsen, T., Nielsen, J.P., Guillen, M. and Bolance, C. (2005). Kernel density estimation for heavy-tailed distributions using the Champernowne transformation. Statistics, Vol. 39, No. 6, 503-518.

[4] Buch-Kromann, T., Englund, M., Gustafsson, J., Nielsen, J.P. and Thuring, F. (2007). Non-parametric estimation of operational risk losses adjusted for under-reporting. Scandinavian Actuarial Journal, Vol. 4, 293-304.

[5] Buhlmann, H., Shevchenko, P.V. and Wuthrich, M.V. (2007). A toy model for operational risk quantification using credibility theory. Journal of Operational Risk, Vol. 2, No. 1, 3-20.

[6] Champernowne, D.G. (1936). The Oxford meeting, September 25-29, 1936. Econometrica, Vol. 5, No. 4, October 1937.

[7] Champernowne, D.G. (1952). The graduation of income distributions. Econometrica, Vol. 20, 591-615.

[8] Cizek, P., Hardle, W. and Weron, R. (2005). Statistical Tools for Finance and Insurance. Springer-Verlag, Berlin Heidelberg.

[9] Clements, A.E., Hurn, A.S. and Lindsay, K.A. (2003). Mobius-like mappings and their use in kernel density estimation. Journal of the American Statistical Association, Vol. 98, 993-1000.

[10] Cruz, M.G. (2001). Modeling, Measuring and Hedging Operational Risk. John Wiley & Sons, Ltd.
[11] Degen, M., Embrechts, P. and Lambrigger, D.D. (2007). The Quantitative Modeling of Operational Risk: Between g-and-h and EVT. Astin Bulletin, Vol. 37, No. 2, 265-292.

[12] Diebold, F., Schuermann, T. and Stroughair, J. (2001). Pitfalls and opportunities in the use of extreme value theory in risk management. In: Refenes, A.-P., Moody, J. and Burgess, A. (Eds.), Advances in Computational Finance, Kluwer Academic Press, Amsterdam, pp. 3-12. Reprinted from: Journal of Risk Finance, Vol. 1, 30-36 (Winter 2000).

[13] Dutta, K. and Perry, J. (2006). A tale of tails: an empirical analysis of loss distribution models for estimating operational risk capital. Federal Reserve Bank of Boston, Working Paper No. 06-13.

[14] Embrechts, P., Kluppelberg, C. and Mikosch, T. (1999). Modelling Extremal Events for Insurance and Finance. Springer.

[15] Embrechts, P. (2000). Extremes and Integrated Risk Management. London: Risk Books, Risk Waters Group.

[16] Figini, S., Giudici, P., Uberti, P. and Sanyal, A. (2008). A statistical method to optimize the combination of internal and external data in operational risk measurement. Journal of Operational Risk, Vol. 2, No. 4, 69-78.

[17] Guillen, M., Gustafsson, J., Nielsen, J.P. and Pritchard, P. (2007). Using external data in operational risk. Geneva Papers, Vol. 32, 178-189.

[18] Gustafsson, J., Nielsen, J.P., Pritchard, P. and Roberts, D. (2006). Quantifying Operational Risk Guided by Kernel Smoothing and Continuous Credibility: A Practitioner's View. Journal of Operational Risk, Vol. 1, No. 1, 43-56.

[19] Gustafsson, J., Hagmann, M., Nielsen, J.P. and Scaillet, O. (2008). Local Transformation Kernel Density Estimation of Loss Distributions. Journal of Business and Economic Statistics (to appear).
[20] Hjort, N.L. and Glad, I.K. (1995). Nonparametric Density Estimation with a Parametric Start. Annals of Statistics, Vol. 24, 882-904.

[21] Klugman, S.A., Panjer, H.H. and Willmot, G.E. (1998). Loss Models: From Data to Decisions. New York: John Wiley & Sons, Inc.

[22] Lambrigger, D.D., Shevchenko, P.V. and Wuthrich, M.V. (2007). The quantification of operational risk using internal data, relevant external data and expert opinion. Journal of Operational Risk, Vol. 2, No. 3, 3-27.

[23] McNeil, A.J., Frey, R. and Embrechts, P. (2005). Quantitative Risk Management. Princeton Series in Finance.

[24] Panjer, H.H. (2006). Operational Risk: Modeling Analytics. New York: John Wiley & Sons, Inc.

[25] Shevchenko, P.V. and Wuthrich, M.V. (2006). The structural modeling of operational risk via Bayesian inference. Journal of Operational Risk, Vol. 1, No. 3, 3-26.

[26] Verrall, R.J., Cowell, R. and Khoon, Y.Y. (2007). Modelling Operational Risk with Bayesian Networks. Journal of Risk and Insurance, Vol. 74, No. 4, 795-827.

[27] Wand, M.P., Marron, J.S. and Ruppert, D. (1991). Transformations in Density Estimation (with comments). Journal of the American Statistical Association, Vol. 94, 1231-1241.

[28] Wand, M.P. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall.
[Figure 1: four panels plotting relative ISE (top) and relative WISE (bottom) against internal sample size (10 to 4000) for the lognormal case; the left panels show the mixing models enriched with m = 100 and m = 5000 external observations, the right panels the benchmark Weibull.]

Figure 1: The relative error between model (3.2) and mixing model (3.3) estimated on data driven lognormal data. The internal data used is in the range [10, 4000] and the mixing model (3.3) is enriched by the external sample sizes m = 100 and m = 5000.
[Figure 2: four panels plotting relative ISE (top) and relative WISE (bottom) against internal sample size (10 to 4000) for the GCD case; the left panels show the mixing models enriched with m = 100 and m = 5000 external observations, the right panels the benchmark Weibull.]

Figure 2: The relative error between model (3.2) and mixing model (3.3) estimated on data driven GCD data. The internal data used is in the range [10, 4000] and the mixing model (3.3) is enriched by the external sample sizes m = 100 and m = 5000.