
    A Mixing Model For Operational Risk

    Jim Gustafsson Jens Perch Nielsen

    30th June 2008

    Abstract

    External data can often be useful in improving estimation of operational risk loss distributions.

    This paper develops a systematic approach that incorporates external information into internal

loss distribution modelling. The standard statistical model resembles Bayesian methodology or

    credibility theory in the sense that prior knowledge (external data) has more weight when internal

    data is scarce than when internal data is abundant.

    Key words and phrases: Mixing data sources, Prior knowledge, Operational risk, Actuarial

    loss models, Transformation, Generalized Champernowne distribution, Scarce data.

    JEL Classification: C13, C14.

    1 Introduction

Calculating loss distributions for operational risk by only using internal data often fails to capture potential risks and unprecedented large loss amounts that have a huge impact on the capital. Conversely, calculating a loss distribution by only using external data provides a capital figure that is not sensitive to internal losses, does not increase despite the occurrence of a large internal loss, and does not decrease despite the improvement of internal controls. The proposed model in this paper combines both internal and external data. We first go through an intuitive and simplified example of how prior

RSA Scandinavia & University of Copenhagen | Gammel Kongevej 60 | DK-1790 Copenhagen V, Denmark | E-mail: [email protected] | telephone: +45 33 555 23 57, fax: +45 33 55 21 22 | Cass Business School | E-mail: [email protected]


knowledge can be incorporated into modern estimation techniques. Consider the following simplified example. Let $(H_i)_{1\le i\le n}$ be iid stochastic variables with density $h$ on $[0,1]$, and let $\bar h$ be another density on $[0,1]$ representing prior knowledge of the density $h$. We wish to estimate $h$ guided by our prior knowledge density $\bar h$. Let now $g = h/\bar h$ and let $\hat h = \bar h\,\hat g$ be our estimator of $h$ guided by prior knowledge. When $\bar h$ is close to $h$, then $g$ is close to the density of a Uniform distribution. To estimate $g$ we

    take advantage of nonparametric smoothing techniques, where we consider the simplest smoothing

technique available: kernel smoothing. Note that in the above estimation of $h$ we have moved from estimating the completely unknown and complicated density $h$ to estimating the simpler $g$, which is close to the density of a Uniform distribution when $\bar h$ represents good prior knowledge. Now let $\hat g$ be a kernel density estimator of $g$. The nature of kernel smoothing is such that it smooths a lot when data is sparse. In our case this means that $\hat h$ will be close to $\bar h$ for very small sample sizes. When data is more abundant, less kernel smoothing takes place and the complicated features of $g$ become visible. Therefore, kernel smoothing incorporating prior knowledge has many similarities to Bayesian methodology or credibility theory, which also use global information as prior knowledge when data is scarce and a more local estimation when data is abundant.
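As a concrete illustration (our own sketch in Python, not code from the paper), the following implements a prior-guided kernel estimator in the multiplicative-correction spirit of Hjort and Glad (1995): $\hat g$ is a kernel estimate of $g = h/\bar h$ obtained by weighting each observation with $1/\bar h(H_i)$, so that when data are scarce and the bandwidth is large, $\hat h = \bar h\,\hat g$ inherits most of its shape from the prior $\bar h$. The function names and the Beta densities are illustrative assumptions.

```python
import numpy as np

def prior_guided_density(data, prior_pdf, grid, bandwidth):
    """Estimate a density h on [0,1] guided by a prior density h_bar:
    h_hat(x) = h_bar(x) * g_hat(x), where g_hat is a kernel estimate of
    g = h / h_bar, weighting observation H_i with 1 / h_bar(H_i)."""
    def gauss(u):
        return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    weights = 1.0 / prior_pdf(data)                    # 1 / h_bar(H_i)
    args = (grid[:, None] - data[None, :]) / bandwidth
    g_hat = (gauss(args) * weights).mean(axis=1) / bandwidth
    return prior_pdf(grid) * g_hat                     # h_hat = h_bar * g_hat

# Example: prior h_bar is a Beta(2,2) density; the true h differs from it.
rng = np.random.default_rng(1)
prior = lambda x: 6.0 * x * (1.0 - x)                  # Beta(2,2) pdf on [0,1]
sample = rng.beta(2.5, 1.8, size=30)                   # scarce data from the true h
grid = np.linspace(0.01, 0.99, 99)
h_hat = prior_guided_density(sample, prior, grid, bandwidth=0.15)
```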

    The purpose of this paper is to use prior knowledge from external data to improve our estimation

    of the internal data distribution. The underlying dynamics of the new mixing model transformation

    approach reflects the simple intuition in the illustration above on how to include prior knowledge

on $[0,1]$-densities. In our case we wish to estimate loss distributions on the entire positive real axis. This is done by estimating a distribution $\hat F$ on the external data. Then we transform the internal data $(X_i)_{1\le i\le n}$ with the estimated external cumulative distribution function, i.e. using $\hat F(X_i)$. The transformed data set will have a density equal to the true internal density divided by the estimated external density, transformed such that the support is on $[0,1]$. When the external data distribution is a close approximation to the internal data distribution, we have simplified the problem to estimating something close to a Uniform distribution. When this estimation problem has been solved through kernel smoothing, we back-transform to the original axis and obtain our final mixing estimator.
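To make the transformation step concrete, write $\hat F_{\mathrm{ext}}$ for the cumulative distribution function estimated on the external data (the subscript is our notation). By a standard change of variables, the transformed internal losses $U_i = \hat F_{\mathrm{ext}}(X_i)$ have density

$$l(u) = \frac{f\big(\hat F_{\mathrm{ext}}^{-1}(u)\big)}{\hat f_{\mathrm{ext}}\big(\hat F_{\mathrm{ext}}^{-1}(u)\big)}, \qquad u \in [0,1],$$

where $f$ is the true internal density and $\hat f_{\mathrm{ext}}$ the estimated external density, so that $l$ is exactly the Uniform density when the estimated external distribution coincides with the internal one.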


The proposed mixing model is based on a semi-parametric estimator of the loss distribution. Klugman, Panjer and Willmot (1998), McNeil, Frey and Embrechts (2005), Cizek, Hardle and Weron (2005) and Panjer (2006) are important introductions to actuarial estimation techniques for purely parametric loss distributions. Cruz (2001) describes a number of parametric distributions which can be used for modeling loss distributions. Embrechts, Kluppelberg and Mikosch (1999) and Embrechts (2000) focus on the tail and Extreme Value Theory (EVT). Further, the recent paper by Degen, Embrechts and Lambrigger (2007) discusses some fundamental properties of the g-and-h distribution and how it is linked to the well documented EVT based methodology. However, if one focuses on the excess function, the link to the Generalized Pareto Distribution (GPD) has an extremely slow convergence rate, and capital estimation for small levels of risk tolerance using EVT may lead to inaccurate results if the parametric g-and-h distribution is chosen as model. Also Diebold, Schuermann and Stroughair (2001) and Dutta and Perry (2006) stress weaknesses of EVT when it comes to real data analysis. This problem may be addressed by non- and semi-parametric procedures. For a review of modern kernel smoothing techniques see Wand and Jones (1995). During the last decade a class of semi-parametric methods was developed and designed to work better than a purely non-parametric approach, see Bolance, Guillen and Nielsen (2003), Hjort and Glad (1995) and Clements, Hurn and Lindsay (2003). They showed that non-parametric estimators could be substantially improved by a transformation process, and they offer several alternatives for the transformation itself.

The paper by Buch-Larsen, Nielsen, Guillen and Bolance (2005) maps the original data into [0, 1] via the parametric start and corrects non-parametrically for possible misspecification. A flexible parametric start helps to facilitate accurate estimation; Buch-Larsen et al. (2005) generalized the Champernowne distribution (GCD), see Champernowne (1936, 1952), and used the cumulative distribution as transformation function. In the spirit of Buch-Larsen et al. (2005) many remedies have been proposed. Gustafsson, Hagmann, Nielsen and Scaillet (2008) investigate the performance of symmetric versus asymmetric kernels on [0, 1], Gustafsson, Nielsen, Pritchard and Roberts (2006) estimate operational risk losses with a continuous credibility model, and Guillen, Gustafsson, Nielsen and Pritchard (2007), and the extended version by Buch-Kromann, Englund, Gustafsson, Nielsen and


Thuring (2007), introduced the concept of under-reporting in operational risk quantification. Bolance, Guillen and Nielsen (2008) develop a transformation kernel density estimator that uses a double transformation to estimate heavy-tailed distributions. Before describing the new method we mention other recent methods in the literature. The papers by Shevchenko and Wuthrich (2006), Buhlmann, Shevchenko and Wuthrich (2007) and Lambrigger, Shevchenko and Wuthrich (2007) combine loss data with scenario analysis information via Bayesian inference for the assessment of operational risk, and Verrall, Cowell and Khoon (2007) examine Bayesian networks for operational risk assessment. Figini, Giudici, Uberti and Sanyal (2008) develop a method that estimates the internal data distribution with help from truncated external data originating from the exact same underlying distribution. In our approach we allow the external distribution to be different from the internal distribution; our transformation approach has the consequence that the internal estimation process is guided by the more stable estimator originating from the external data. If the underlying distribution of the external data is close to the underlying distribution of the internal data, then this approach improves the efficiency of estimation. We also present a variation of this approach where we correct for the estimated median, such that only the shape of the underlying internal data has to be close to the shape of the underlying external data for our procedure to work well.

    2 The asymptotic properties of the transformation approach

Let $(X_i)_{1\le i\le n}$ be a sequence of iid random losses from a probability distribution $F(x)$ with unknown density function $f(x)$. Let $T(x)$ be a twice continuously differentiable transformation, with a first derivative $T'(x)$ that serves as a global start. This density is assumed to provide a meaningful but potentially inaccurate description of $f(x)$. The transformation function $T(x)$ could be a cumulative distribution function, or it could belong to a non-parametric class of functions, see e.g. Wand et al. (1991) and Buch-Larsen et al. (2005). The transformed density estimator with a local constant boundary


    correction can be written as

$$\hat f(x) = T'(x)\,\Gamma_{01}\big(T(x),h\big)^{-1}\,n^{-1}\sum_{i=1}^{n} K_h\big(T(X_i) - T(x)\big) = T'(x)\,\hat l\big(T(x)\big), \qquad (2.1)$$

with local model $\hat l(\cdot)$, transformed losses $T(X_i)$, bandwidth $h$, and symmetric kernel $K$ with $K_h(\cdot) = \frac{1}{h}\,K\!\left(\frac{\cdot}{h}\right)$, and

$$\Gamma_{ij}(u,h) = \int_{\max\{-1,\,(u-1)/h\}}^{\min\{1,\,u/h\}} (vh)^{i}\,K(v)^{j}\,dv. \qquad (2.2)$$

A deeper analysis of the transformation process can be found in e.g. Gustafsson et al. (2008). The asymptotic theory of (2.1) is presented in Theorem 1, and the proof can be found in Buch-Larsen et al. (2005).
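A minimal numerical sketch of how (2.1) and (2.2) can be evaluated (our own illustration, assuming an Epanechnikov kernel and a caller-supplied transformation $T$ with derivative $T'$; the function names are ours, not the paper's):

```python
import numpy as np

def epanechnikov(v):
    return 0.75 * (1.0 - v**2) * (np.abs(v) <= 1.0)

def trapz(y, x):
    """Simple trapezoidal rule for numerical integration."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def gamma_ij(u, h, i, j, n_grid=400):
    """Boundary functional (2.2): integral of (v*h)^i * K(v)^j over
    v in [max(-1, (u - 1)/h), min(1, u/h)]."""
    v = np.linspace(max(-1.0, (u - 1.0) / h), min(1.0, u / h), n_grid)
    return trapz((v * h)**i * epanechnikov(v)**j, v)

def transformation_kde(x, data, T, T_prime, h):
    """Estimator (2.1): f_hat(x) = T'(x) * l_hat(T(x)), with a local
    constant boundary correction through Gamma_01."""
    u = T(x)
    raw = epanechnikov((T(data) - u) / h).sum() / (len(data) * h)
    return T_prime(x) * raw / gamma_ij(u, h, 0, 1)
```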

Theorem 1 Let the transformation function $T(x)$ be a twice differentiable known function. The bias of $\hat f(x)$ is given by

$$E\big[\hat f(x)\big] - f(x) = 2^{-1}\,\Gamma_{21}\big(T(x),h\big)\,\lambda\big(T(x)\big) + o(h^2), \qquad \lambda\big(T(x)\big) = \left(\frac{f(x)}{T'(x)}\right)''\frac{1}{T'(x)},$$

and the variance by

$$V\big[\hat f(x)\big] = \Gamma_{02}\big(T(x),h\big)\,(nh)^{-1}\,f(x)\,T'(x) + o\big((nh)^{-1}\big),$$

where the asymptotics are given for $n \to \infty$, $h \to 0$ and $nh \to \infty$.

    3 The transformation technique

In the previous section a general description of the transformation approach and its asymptotic properties was given. In this section we assume that the transformation function is a cumulative distribution function $T(x,\theta)$ with density $T'(x,\theta)$, indexed by the global parameter $\theta = \{\theta_1, \theta_2, \ldots, \theta_p\} \in \mathbb{R}^p$. As before, let $(X_i)_{1\le i\le n}$ be a sequence of random losses originating from a


probability distribution $F(x)$ with unknown density $f(x)$. Then equation (2.1) can be reformulated as

$$\hat f(x,\theta) = T'(x,\theta)\,\Gamma_{01}\big(T(x,\theta),h\big)^{-1}\,n^{-1}\sum_{i=1}^{n} K_h\big(T(X_i,\theta) - T(x,\theta)\big) = T'(x,\theta)\,\hat l\big(T(x,\theta)\big). \qquad (3.1)$$

Now let $\hat\theta$ be some suitable estimator of $\theta$; then the estimator (3.1) becomes

$$\hat f(x,\hat\theta) = T'(x,\hat\theta)\,\Gamma_{01}\big(T(x,\hat\theta),h\big)^{-1}\,n^{-1}\sum_{i=1}^{n} K_h\big(T(X_i,\hat\theta) - T(x,\hat\theta)\big) = T'(x,\hat\theta)\,\hat l\big(T(x,\hat\theta)\big). \qquad (3.2)$$

From e.g. Buch-Larsen et al. (2005) we know that when $\hat\theta$ is a square-root-$n$-consistent estimator of $\theta$, the asymptotic theory of $\hat f(x,\hat\theta)$ is equivalent to the asymptotic theory of $\hat f(x,\theta)$. The asymptotic theory of $\hat f(x,\hat\theta)$ immediately follows from Theorem 1, and

$$\sqrt{nh}\,\Big(\hat f(x,\hat\theta) - E\big[\hat f(x,\hat\theta)\big]\Big) \xrightarrow{d} N\Big(0,\ \Gamma_{02}\big(T(x),h\big)\,f(x)\,T'(x)\Big)$$

holds.

Let now $(X_i)_{1\le i\le n}$ be a sequence of collected internal losses and let $(Y_j)_{1\le j\le m}$ be a sequence of externally reported losses. Then $(Y_j)_{1\le j\le m}$ represents prior knowledge that should enrich a limited internal data set. The parametric density estimator based on the external data set offers prior knowledge from the general industry. The correction of this prior knowledge is based on a local kernel smoother of the transformed internal data points. This provides us with a consistent and fully nonparametric estimator of the density of the internal data. The approach is therefore fully nonparametric at the same time as it is informed by prior information based on external data sources. The new mixing local transformation kernel density estimator has asymptotics as (3.1) in Theorem 1. Hence, if $T'(x,\theta_y)$ is a misspecified approximation of $f(x)$, $\hat\theta_y$ converges in probability to the pseudo


true value $\theta_y^*$ which minimizes the Kullback-Leibler distance¹ between $T'(x,\theta_y^*)$ and the true density $f(x)$. Therefore $\hat\theta_y$ is a square-root-$n$-consistent estimator of $\theta_y^*$. The proposed mixing model can be

    expressed as

$$\hat f(x,\hat\theta_y) = T'(x,\hat\theta_y)\,\Gamma_{01}\big(T(x,\hat\theta_y),h\big)^{-1}\,n^{-1}\sum_{i=1}^{n} K_h\big(T(X_i,\hat\theta_y) - T(x,\hat\theta_y)\big) = T'(x,\hat\theta_y)\,\hat l\big(T(x,\hat\theta_y)\big) \qquad (3.3)$$

and has the same asymptotic theory as (3.1) for $n \to \infty$, $h \to 0$ and $nh \to \infty$.

The global start $T(x,\theta)$ in (3.2) and (3.3) is modelled by the Generalized Champernowne Distribution (GCD) with density

$$T'(x,\theta) = \frac{\alpha (x+c)^{\alpha-1}\left((M+c)^{\alpha} - c^{\alpha}\right)}{\left((x+c)^{\alpha} + (M+c)^{\alpha} - 2c^{\alpha}\right)^{2}}, \qquad x \in \mathbb{R}_+, \qquad (3.4)$$

incorporating internal parameters $\theta = \{\alpha, M, c\}$ and external parameters $\theta_y = \{\alpha_y, M_y, c_y\}$. Here $\alpha$ is a tail parameter, $M$ controls the body of the distribution, and $c$ is a shift parameter that controls the domain close to zero. The parameter vectors $\theta$ and $\theta_y$ are estimated by maximum likelihood; for a deeper analysis of the GCD, see Buch-Larsen et al. (2005).
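Integrating (3.4) gives the corresponding cumulative distribution function, which is the transformation actually applied to the losses (one can verify by differentiation that its derivative is exactly (3.4)):

$$T(x,\theta) = \frac{(x+c)^{\alpha} - c^{\alpha}}{(x+c)^{\alpha} + (M+c)^{\alpha} - 2c^{\alpha}}, \qquad x \in \mathbb{R}_+,$$

so that $T(M,\theta) = 1/2$; that is, $M$ is the median of the GCD.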

Another slightly different mixing model will also be included in the paper. This estimator follows the same idea as in Gustafsson et al. (2005). The model is the same as (3.3), but instead of using the prior knowledge parameter vector $\hat\theta_y$, this generalized model transforms the internal losses with a combined parameter vector $\hat\theta_{\bar y} = \{\hat\alpha_y, \hat M, \hat c_y\}$. The difference is that prior knowledge is only provided by the tail parameter $\hat\alpha_y$ and the shift parameter $\hat c_y$, while the median is represented by the estimated internal parameter $\hat M$. With this model we combine the external parameters $\hat\alpha_y$ and $\hat c_y$ with a body described internally by $\hat M$.
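The following sketch shows, under our reading of the method and with illustrative function names, how the mixing estimator (3.3) can be assembled: fit the GCD to the external losses by maximum likelihood, transform the internal losses with the external GCD distribution function, apply a boundary-corrected kernel smoother on $[0,1]$, and multiply back by the GCD density. The combined vector $\hat\theta_{\bar y}$ is obtained by swapping in the internally estimated median.

```python
import numpy as np
from scipy.optimize import minimize

def gcd_cdf(x, a, M, c):
    """Generalized Champernowne cdf; its derivative is the density (3.4)."""
    return ((x + c)**a - c**a) / ((x + c)**a + (M + c)**a - 2 * c**a)

def gcd_pdf(x, a, M, c):
    num = a * (x + c)**(a - 1) * ((M + c)**a - c**a)
    return num / ((x + c)**a + (M + c)**a - 2 * c**a)**2

def fit_gcd(losses):
    """Maximum likelihood estimate of theta = (alpha, M, c)."""
    def nll(p):
        a, M, c = p
        if a <= 0 or M <= 0 or c < 0:
            return np.inf
        return -np.sum(np.log(gcd_pdf(losses, a, M, c)))
    x0 = np.array([1.0, np.median(losses), 0.1])
    return minimize(nll, x0, method="Nelder-Mead").x

def mixing_density(x, internal, theta, h=0.1):
    """Mixing estimator (3.3): GCD global start with parameter vector
    theta (external, or combined), locally corrected by a kernel
    smoother of the transformed internal losses."""
    a, M, c = theta
    u, t = gcd_cdf(x, a, M, c), gcd_cdf(internal, a, M, c)
    K = lambda v: 0.75 * (1 - v**2) * (np.abs(v) <= 1)   # Epanechnikov
    v = np.linspace(max(-1, (u - 1) / h), min(1, u / h), 400)
    gamma01 = np.sum((K(v)[1:] + K(v)[:-1]) * np.diff(v)) / 2  # Gamma_01
    l_hat = K((t - u) / h).sum() / (len(internal) * h * gamma01)
    return gcd_pdf(x, a, M, c) * l_hat

# theta_y   = fit_gcd(external_losses)                               # model (3.3)
# theta_bar = (theta_y[0], fit_gcd(internal_losses)[1], theta_y[2])  # combined variant
# f_hat     = mixing_density(2.5, internal_losses, theta_y)
```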

¹The Kullback-Leibler distance is a distance function from a true probability density, $f(x)$, to a target probability estimator, $T'(x,\theta_y)$, and is defined as $\int_0^\infty f(\omega)\log\frac{f(\omega)}{T'(\omega,\theta_y)}\,d\omega$.


    4 Data study

This section demonstrates the practical relevance of the mixing model presented above. We estimate a loss distribution and calculate next year's operational risk exposure by using the common risk measures Value-at-Risk (VaR) and Tail-Value-at-Risk (TVaR) for different levels of risk tolerance. For the severity estimator we employ model (3.2) and model (3.3), and benchmark with the parametric Weibull distribution. For the frequency model we assume that $N(t)$ is an independent homogeneous Poisson process, $N(t) \sim \mathrm{Po}(\lambda t)$, with positive intensity $\lambda$. By using Monte Carlo simulation with the severity and frequency assumptions we can create a simulated one-year operational risk loss distribution.

We denote the internal collected losses from the event risk category Employment Practices and Workplace Safety by $(X_i)_{1\le i\le N(t)}$, where $N(t)$ describes the random number of losses over a fixed time period $t = 0,\ldots,T$ with $N(0) = 0$. Further, $(Y_j)_{1\le j\le M(t)}$ represent external data from the same event risk category with random number of losses $M(t)$, $M(0) = 0$. Table 1 reports summary statistics for each data set.

TABLE 1
Statistics for Event Risk Category Employment Practices and Workplace Safety

                Number of   Maximum     Sample      Sample        Standard         Time
                Losses      Loss ($M)   Mean ($M)   Median ($M)   Deviation ($M)   Horizon (T)
Internal data   120         16.80       1.86        0.30          3.66             2
External data   6526        561.43      1.82        0.18          13.41            8

The mean and median are similar for the two data sets. However, the number of losses, the maximum loss, the standard deviation and the collection period are widely different. We condition on $N = n$ and sample the Poisson process of internal events that occur over a one-year time horizon. The maximum likelihood estimator of the annual intensity of internal losses, based on $n = 120$, is $\hat\lambda = n/T = 120/2 = 60$. We denote the annual simulated frequencies by $N_r$ with $r = 1,\ldots,R$, where the number of simulations is $R = 10{,}000$. For each $r$ we draw random uniformly distributed samples and combine these with


    loss sizes taken from the inverse function of the severity distribution. The outcome is the annual total

    loss distribution denoted by

$$S_r = \sum_{k=1}^{N_r} \hat F^{-1}(u_{rk},\hat\theta), \qquad r = 1,\ldots,R,$$

with $u_{rk} \sim U(0,1)$ for $k = 1,\ldots,N_r$, and

$$\hat F(x,\hat\theta) = \int_0^{x} \hat f(\omega,\hat\theta)\,d\omega,$$

where $\hat\theta$ denotes the internal estimate for model (3.2) or the prior knowledge parameter vectors $\hat\theta_y$ and $\hat\theta_{\bar y}$ for (3.3). For simplicity, we introduce the abbreviations M1 for the semi-parametric model (3.2), and M2 and M3 for (3.3) with $\hat\theta_y$ and $\hat\theta_{\bar y}$ respectively. M4 is the purely external data version of (3.2), where the prior

    knowledge is locally corrected with external data. This corresponds to the situation when a company

    has not started to collect internal data and must rely entirely on prior knowledge. Finally, M5 is

the benchmark model with a Weibull assumption on the severity. Then, a loss distribution can be calculated for relevant return periods, thereby identifying the capital to be held to protect against unexpected losses. The two risk measures used are the VaR,

$$\mathrm{VaR}_\alpha(S_r) = \sup\{s \in \mathbb{R} \mid P(S_r \ge s) \ge 1-\alpha\},$$

and the TVaR, which gives the expected loss above the VaR and is defined as

$$\mathrm{TVaR}_\alpha(S_r) = E[S_r \mid S_r \ge \mathrm{VaR}_\alpha(S_r)]$$

for risk tolerance $\alpha$. Table 2 collects summary statistics for the simulated total loss distribution across each model. Among the usual summary statistics we report VaR and TVaR for risk tolerance $\alpha \in \{0.95, 0.99, 0.999\}$.
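To illustrate the simulation scheme and the empirical risk measures, here is a minimal sketch (our own; the lognormal severity is a stand-in for the fitted severity models, and its parameter values are illustrative only):

```python
import numpy as np
from scipy.stats import lognorm

def simulate_total_loss(severity_inv, lam=60.0, R=10_000, seed=0):
    """R draws of the annual total loss: Poisson(lam) frequency and
    severities obtained by inverting the severity cdf at uniforms."""
    rng = np.random.default_rng(seed)
    counts = rng.poisson(lam, size=R)                 # N_r, r = 1..R
    return np.array([severity_inv(rng.uniform(size=n)).sum() for n in counts])

def var_tvar(S, alpha):
    """Empirical VaR (alpha-quantile) and TVaR (mean loss beyond VaR)."""
    q = np.quantile(S, alpha)
    return q, S[S >= q].mean()

# Stand-in severity roughly matching the internal data's median of $0.30M.
severity_inv = lambda u: lognorm.ppf(u, s=1.5, scale=0.30)
S = simulate_total_loss(severity_inv)
for alpha in (0.95, 0.99, 0.999):
    print(alpha, var_tvar(S, alpha))
```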


TABLE 2
Statistics of Simulated Loss Distributions for Event Risk Category Employment Practices and Workplace Safety ($M)

Model   Mean   Median   Sd    VaR95%   VaR99%   VaR99.9%   TVaR95%   TVaR99%   TVaR99.9%
M1      226    178      149   547      765      980        691       889       1180
M2      244    186      167   609      881      1227       787       1025      1385
M3      296    213      241   633      936      1331       820       1143      1487
M4      316    274      322   801      1165     1644       1073      1388      1858
M5      100    97       27    249      272      294        363       383       405

One can see from Table 2 that all loss distributions are right skewed, since all means are larger than their respective medians. Interpreting the VaR results, one notices that the fully prior knowledge model M4 shows much larger values than the others. The benchmark model M5 predicts the lowest values of the considered models, and if we compare M5 against the purely internal estimator M1, model M5 suggests a capital requirement of only 30%-40% of the total amount M1 recommends. For the mixing model M2 we have only an 8% higher sample mean and 10%-25% higher VaR values than M1, and this is due to prior knowledge based on external data. The TVaR results show a similar pattern as the VaR results. A conclusion is that the two mixing models, M2 and M3, are able to stabilize the estimation process and do not underestimate the tail as model M5 apparently does.

    5 Data Driven Simulation Study

This section provides a data driven simulation study. Here we evaluate scenarios with different prior knowledge for the new mixing model. The study confirms the intuitive expectation, namely that the mixing model outperforms model (3.2) in situations with scarce data. As data becomes more abundant, the proposed model loses this advantage and may even hurt performance a bit. Another insight is that the parametric Weibull distribution provides doubtful goodness-of-fit to heavy-tailed data. The study brings into play two true distributions with


unlike characteristics, and the idea of a data driven simulation study, in contrast to a predetermined simulation study, is that the two true distributions are estimated on the real operational risk samples described in the previous section. The two distributions used in the study are presented by their respective densities below.

1. Lognormal, $f(x,\theta_1) = \frac{1}{x\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{\log(x)-\mu}{\sigma}\right)^{2}}$, $x \in \mathbb{R}_+$, with location and scale parameters $\theta_1 = \{\mu, \sigma\}$.

2. Generalized Champernowne Distribution (GCD), $f(x,\theta_2) = \frac{\alpha(x+c)^{\alpha-1}\left((M+c)^{\alpha}-c^{\alpha}\right)}{\left((x+c)^{\alpha}+(M+c)^{\alpha}-2c^{\alpha}\right)^{2}}$, $x \in \mathbb{R}_+$, with $\theta_2 = \{\alpha, M, c\}$.

The study was also made with the following distributions: Exponential, Weibull, Gamma and Pareto. The outcomes were similar to the lognormal and GCD cases, but are not outlined in the paper, although they are available upon request. The two true distributions are estimated on the two available data sets by maximum likelihood estimation, producing the estimated parameter vectors $\hat\theta_d$ and $\hat\theta_{yd}$ with $d = 1, 2$. Random losses are drawn from the estimated true models to obtain pseudo data. The internal data are expressed by $(X^b_{di})_{1\le i\le n_k}$, where $d$ indicates the distribution treated, the iteration is $b = 1,\ldots,B$ with $B = 2000$, and the sample sizes simulated for next year's exposure $N(t) = n_k$ are chosen as $n_k = \{10, 20, 50, 100, 200, 500, 1000, 2000, 4000\}$. The external samples are denoted by $(Y^b_{dj})_{1\le j\le m_l}$ with $M(t) = m_l$, where $m_l = \{100, 5000\}$. For each data set the internal and external global starts, $T(x,\theta)$ in (3.2) and (3.3), are assumed to follow the GCD described in (3.4). The model described in (3.2) is then estimated on each sample combination $\{b, d, n_k\}$, and for each sample $n_k$ we estimate the mixing model (3.3) on each external data combination $\{b, d, m_l\}$.

In summary, for each internal sample $n_k$ six estimators will be compared: the pure internal estimator (3.2); four scenarios of the mixing model (3.3), informed with different external sample sizes $m_l$ and different global parametric starts; and, also here, a benchmark parametric Weibull distribution. The performance measures considered are means over replications of the integrated squared error (ISE), which measures the error between the true and the estimated density with equal weight across the support, and the weighted integrated squared error (WISE), which focuses on the fit in the tails. We


can write the statistical performance measures as

$$\mathrm{ISE} = \left[\frac{1}{B}\sum_{b=1}^{B}\int_{0}^{\infty}\Big(\hat f(x,\hat\theta) - f(x,\hat\theta_d)\Big)^{2}\,dx\right]^{1/2}$$

$$\mathrm{WISE} = \left[\frac{1}{B}\sum_{b=1}^{B}\int_{0}^{\infty}\Big(\hat f(x,\hat\theta) - f(x,\hat\theta_d)\Big)^{2}x^{2}\,dx\right]^{1/2}$$

with true density $f(x,\hat\theta_d)$ and with the estimator $\hat f(x,\hat\theta)$ being one of the six different models. Note that the true model is only enriched with the internal data.
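Numerically, each replication's contribution to ISE and WISE can be approximated on a finite grid; a sketch (the truncation point x_max is our choice), with the $B$ replications to be averaged before taking the square root:

```python
import numpy as np

def ise_wise_terms(f_hat, f_true, x_max=200.0, n_grid=4000):
    """One replication's contribution: the integrated squared error,
    unweighted (ISE) and weighted by x^2 (WISE), on (0, x_max]."""
    x = np.linspace(1e-6, x_max, n_grid)
    d2 = (f_hat(x) - f_true(x))**2
    trap = lambda y: float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)
    return trap(d2), trap(d2 * x**2)

# ISE = sqrt(mean of the first terms over b = 1..B); WISE likewise.
```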

Figure 1 presents the results for the situation with the true lognormal distribution. We illustrate the results by a relative error, meaning that the ISE and WISE outcomes of the proposed models are divided by those of the pure internal estimator. A curve above one indicates a worse prediction from the mixing model at that specific sample point.

    *** Figure 1 About Here ***

The top left-hand graph fluctuates around one. The mixing model therefore has similar ISE to the pure internal estimator. For some internal sample points we have less than half a percent better results using the pure internal estimator (3.2), and of the compared mixing models the one enriched with 5000 external observations, $\hat f(x,\hat\theta_y)_{5000}$, performs best when the maximum number of internal observations is used. The top right-hand graph presents the benchmark model performance. The parametric Weibull grows from 20% worse to above 30% worse as the internal sample size increases. For the tail measure WISE we stress that above $n_4 = 100$ the mixing models stabilize at a performance between 1% and 2% worse. Note that the breakpoint is higher when one focuses on the extremes than when one considers the whole axis as in ISE. The bottom right-hand graph presents the benchmark performance, and clearly the Weibull predicts the true underlying density of the internal data set poorly, culminating in 60% poorer results at sample size $n_9 = 4000$.


In the situation with the GCD an almost identical picture to the lognormal case emerges. Figure 2 gives a visual interpretation of the results.

    *** Figure 2 About Here ***

For the ISE the breakpoint has increased to around $n_5 = 200$ compared to the lognormal case; thereafter the mixing models stabilize in the region between 5% and 10% poorer prediction power. Focusing on WISE, we see that the breakpoint lies between 100 and 500 internal observations.

    6 Conclusions

This paper gives an intuitive and easily implemented model that combines different data sources. From the data study we found that using a light-tailed parametric fit leads to low Value-at-Risk figures. Also, the mixing methods seem to give more stable and more realistic results than parametric methods that do not include external data. From the data driven simulation study we conclude that in the situation with scarce internal data the proposed mixing model outperforms the model with only one data source. However, as the internal data becomes more abundant, a breakpoint can be identified where (3.2) becomes better than (3.3). We also find that the breakpoints kick in earlier for light-tailed distributions than for heavy-tailed ones. Large data sets will often do better without prior knowledge, in the sense that the transformation is better estimated from pure internal data. However, a deeper data analysis needs to be done by practitioners when the breakpoint is reached, to find out whether the internal data alone include large losses that they believe cover the whole company's operational risk exposure. Otherwise the mixing model should be preferred, since it automatically enriches the internal exposure with large external losses.


    References

[1] BOLANCE, C., GUILLEN, M. AND NIELSEN, J.P. (2003). Kernel density estimation of actuarial loss functions. Insurance: Mathematics and Economics, Vol. 32, 19-36.

[2] BOLANCE, C., GUILLEN, M. AND NIELSEN, J.P. (2008). Inverse Beta transformation in kernel density estimation. Statistics & Probability Letters (to appear).

[3] BUCH-LARSEN, T., NIELSEN, J.P., GUILLEN, M. AND BOLANCE, C. (2005). Kernel density estimation for heavy-tailed distributions using the Champernowne transformation. Statistics, Vol. 39, No. 6, 503-518.

[4] BUCH-KROMANN, T., ENGLUND, M., GUSTAFSSON, J., NIELSEN, J.P. AND THURING, F. (2007). Non-parametric estimation of operational risk losses adjusted for under-reporting. Scandinavian Actuarial Journal, Vol. 4, 293-304.

[5] BUHLMANN, H., SHEVCHENKO, P.V. AND WUTHRICH, M.V. (2007). A toy model for operational risk quantification using credibility theory. Journal of Operational Risk, Vol. 2, No. 1, 3-20.

[6] CHAMPERNOWNE, D.G. (1936). The Oxford meeting, September 25-29, 1936. Econometrica, Vol. 5, No. 4, October 1937.

[7] CHAMPERNOWNE, D.G. (1952). The graduation of income distributions. Econometrica, Vol. 20, 591-615.

[8] CIZEK, P., HARDLE, W. AND WERON, R. (2005). Statistical Tools for Finance and Insurance. Springer-Verlag, Berlin Heidelberg.

[9] CLEMENTS, A.E., HURN, A.S. AND LINDSAY, K.A. (2003). Mobius-like mappings and their use in kernel density estimation. Journal of the American Statistical Association, Vol. 98, 993-1000.

[10] CRUZ, M.G. (2001). Modeling, Measuring and Hedging Operational Risk. John Wiley & Sons, Ltd.


[11] DEGEN, M., EMBRECHTS, P. AND LAMBRIGGER, D.D. (2007). The Quantitative Modeling of Operational Risk: Between g-and-h and EVT. Astin Bulletin, Vol. 37, No. 2, 265-292.

[12] DIEBOLD, F., SCHUERMANN, T. AND STROUGHAIR, J. (2001). Pitfalls and opportunities in the use of extreme value theory in risk management. In: Refenes, A.-P., Moody, J. and Burgess, A. (Eds.), Advances in Computational Finance, Kluwer Academic Press, Amsterdam, pp. 3-12. Reprinted from: Journal of Risk Finance, Vol. 1, 30-36 (Winter 2000).

[13] DUTTA, K. AND PERRY, J. (2006). A tale of tails: an empirical analysis of loss distribution models for estimating operational risk capital. Federal Reserve Bank of Boston, Working Paper No. 06-13.

[14] EMBRECHTS, P., KLUPPELBERG, C. AND MIKOSCH, T. (1999). Modelling Extremal Events for Insurance and Finance. Springer.

[15] EMBRECHTS, P. (2000). Extremes and Integrated Risk Management. London: Risk Books, Risk Waters Group.

[16] FIGINI, S., GIUDICI, P., UBERTI, P. AND SANYAL, A. (2008). A statistical method to optimize the combination of internal and external data in operational risk measurement. Journal of Operational Risk, Vol. 2, No. 4, 69-78.

[17] GUILLEN, M., GUSTAFSSON, J., NIELSEN, J.P. AND PRITCHARD, P. (2007). Using external data in operational risk. Geneva Papers, Vol. 32, 178-189.

[18] GUSTAFSSON, J., NIELSEN, J.P., PRITCHARD, P. AND ROBERTS, D. (2006). Quantifying Operational Risk Guided by Kernel Smoothing and Continuous Credibility: A Practitioner's View. Journal of Operational Risk, Vol. 1, No. 1, 43-56.

[19] GUSTAFSSON, J., HAGMANN, M., NIELSEN, J.P. AND SCAILLET, O. (2008). Local Transformation Kernel Density Estimation of Loss Distributions. Journal of Business and Economic Statistics (to appear).


[20] HJORT, N.L. AND GLAD, I.K. (1995). Nonparametric Density Estimation with a Parametric Start. Annals of Statistics, Vol. 24, 882-904.

[21] KLUGMAN, S.A., PANJER, H.H. AND WILLMOT, G.E. (1998). Loss Models: From Data to Decisions. New York: John Wiley & Sons, Inc.

[22] LAMBRIGGER, D.D., SHEVCHENKO, P.V. AND WUTHRICH, M.V. (2007). The quantification of operational risk using internal data, relevant external data and expert opinion. Journal of Operational Risk, Vol. 2, No. 3, 3-27.

[23] MCNEIL, A.J., FREY, R. AND EMBRECHTS, P. (2005). Quantitative Risk Management. Princeton Series in Finance.

[24] PANJER, H.H. (2006). Operational Risk: Modeling Analytics. New York: John Wiley & Sons, Inc.

[25] SHEVCHENKO, P.V. AND WUTHRICH, M.V. (2006). The structural modeling of operational risk via Bayesian inference. Journal of Operational Risk, Vol. 1, No. 3, 3-26.

[26] VERRALL, R.J., COWELL, R. AND KHOON, Y.Y. (2007). Modelling Operational Risk with Bayesian Networks. Journal of Risk and Insurance, Vol. 74, No. 4, 795-827.

[27] WAND, M.P., MARRON, J.S. AND RUPPERT, D. (1991). Transformation in Density Estimation (with comments). Journal of the American Statistical Association, Vol. 94, 1231-1241.

[28] WAND, M.P. AND JONES, M.C. (1995). Kernel Smoothing. Chapman & Hall.


[Figure 1 appears here: four panels for the true Lognormal distribution, plotting relative ISE (top row) and relative WISE (bottom row) against internal sample size in the range 10-4000, for the mixing models $\hat f(x,\hat\theta_y)$ and $\hat f(x,\hat\theta_{\bar y})$ with external sample sizes 100 and 5000 (left column) and the benchmark Weibull (right column).]

Figure 1: The relative error between model (3.2) and mixing model (3.3) estimated on data driven lognormal data. The internal data used is in the range [10, 4000] and the mixing model (3.3) is enriched by the external sample sizes m = 100 and m = 5000.


[Figure 2 appears here: four panels for the true GCD, plotting relative ISE (top row) and relative WISE (bottom row) against internal sample size in the range 10-4000, for the mixing models $\hat f(x,\hat\theta_y)$ and $\hat f(x,\hat\theta_{\bar y})$ with external sample sizes 100 and 5000 (left column) and the benchmark Weibull (right column).]

Figure 2: The relative error between model (3.2) and mixing model (3.3) estimated on data driven GCD data. The internal data used is in the range [10, 4000] and the mixing model (3.3) is enriched by the external sample sizes m = 100 and m = 5000.
