
APPLIED STOCHASTIC MODELS AND DATA ANALYSIS, VOL. 11, 231-244 (1995)

APPLICATIONS OF RISK THEORY AND MULTIVARIATE ANALYSIS IN INSURANCE PRACTICE

B. NIGGEMEYER, M. RADTKE AND A. REICH

The Cologne Re, 11 Theodor-Heuss-Ring, D-50668 Cologne, Germany

SUMMARY

Segmentation strategies and differentiated preselection in underwriting call for new portfolio management techniques, the use of which is becoming increasingly widespread amongst insurance companies faced with growing competitive pressure. In recent years, mathematical procedures have been developed for this purpose and the applicability of existing procedures for use in insurance practice has been recognized. The present paper elucidates various methods for determining the aggregate claims distribution, which describes the performance and volatility of a portfolio. There then follows a presentation of multivariate methods, particularly generalized linear models and methods applying variance and discriminant analysis, which facilitate the analysis of more narrowly defined segments and subportfolios. Finally, the paper contains a description of applications used by numerous insurance companies, primarily for motor and property portfolios.

KEY WORDS multivariate statistics; generalized linear models; aggregate claims

1. INTRODUCTION

Insurance companies in Europe are increasingly exploiting the latitude created by the liberalization of the industry to develop new strategies in respect of products and markets.

Against the backdrop of intensified competition, increasing signs of market saturation and a greater degree of customer orientation, the need to address systematically various customer groups by means of differentiated preselection and segmentation strategies is very much in the spotlight.

Profit-oriented portfolio management entails the goal-directed organization and optimization of both the profit situation and the risk situation of individual subcollectives (segments). On the one hand, a decision must be made as to which risks an insurance company wishes to accept in order to improve its overall performance. On the other hand, in view of the fact that an insurance policy covers claims the scale of which is a priori unknown, it is necessary in economic terms to evaluate and safeguard against possible claims developments.

On the claims side, the stochastic regularity of the influence exerted by random factors must be determined in order to implement a planned reaction vis-à-vis economic considerations; on the premium side, the task is to gauge the degree of flexibility offered by a market whose mood is characterized by sinking profit margins and intense competition. As the dynamics of competition become more vigorous, however, the active aspects of premium determination are frequently subject to extremely tight restrictions.


Opportunities for portfolio management are to be found mainly in the development of strategies for selection and segmentation.

Thus, in terms of its mathematical essence, the first task which arises lies in the measurement of a portfolio’s performance in the light of the multidimensionality of the actual target process. A precondition for such a step is the formulation of economic decision models, which should also evaluate alternative modes of operational portfolio management in the light of their level of goal achievement. The range of options available includes all the measures associated with traditional risk policy. It is just as important to integrate the portfolio into the direct insurer’s overall risk situation by means of an economically efficient organization of risk transfer (in terms of both type and extent) as it is to segment the portfolio in the light of a specific, goal-directed marketing concept or with an eye to satisfactory product placement and design.

In the eighties, the development of mathematical methods for use in the insurance industry, and the fact that their applicability was recognized, served to bridge the long-standing gulf between mathematical theory and insurance practice in the sense of profit-oriented underwriting. These methods provide the three key target variables of profitability, security and growth with an operational basis.

The performance and volatility of a portfolio may be evaluated using so-called aggregate claims distributions. As far as the mathematical aspects are concerned, this involves one of the central topics in the field of risk theory and is an area which has witnessed some interesting developments. Above all, what is at stake here is the conflict (described in Section 2) between models which satisfy practical requirements but which are scarcely calculable (so-called individual models) and (so-called collective) models which can now be quickly handled numerically via recursive algorithms and which can serve as approximations (cf. References 1 and 2).

Leaving aside the generally antinomic goals of profitability and security, in the current business climate the question of growth assumes not only quantitative but also, and primarily, qualitative aspects for insurers; modern portfolio management goes beyond a merely global framework. Although the concept of equalization within the collective, as indeed is reflected in the perception of the portfolio, retains decisive significance for the insurance industry, insurers operating in markets which are in the throes of upheaval are also obliged to consider the level of heterogeneity (in terms of profitability and volatility) displayed by individual subsegments. In the mathematical field, such a differentiated viewpoint requires the application of multivariate methods of mathematical statistics.

The focus here is on a mode of systematic risk segmentation whose scope extends not only to determining and proving the relevance of potential characteristics to the risk in question but also to individual tariffication and product development. A number of multivariate methods of practical relevance are described in Section 3.

By applying all these methods to a direct insurer’s concrete portfolio, it is possible to prepare a quantified description of the portfolio’s profit and risk situations, thereby satisfying a basic precondition for planned management.

An entire package of risk-policy measures relating to aspects of portfolio, reserving, reinsurance and premium policy can be individually tailored to the risk situation and quantified in terms of its effectiveness, thereby facilitating portfolio optimization by simultaneously allowing for both profit and security objectives.

Over the last five years, we have applied these methods on behalf of numerous direct insurance companies in various classes of business, particularly in the Property and Casualty sectors.


2. AGGREGATE CLAIM

A concrete portfolio management policy geared to an insurer’s profit, security and growth objectives is dependent upon a considerable volume of data, both on the portfolio as a whole and on its subcollectives.

For several decades, risk theory has been concerned with the theoretical description and optimization of insurance processes via mathematical models and economic decision criteria. It goes without saying that one of the key tasks facing risk theory has always been perceived to be the juxtaposition of a portfolio’s premium income with the random variable of the aggregate claim in terms of the sum of the individual claims. The focus here is on the determination of an anticipated equalization effect across the collective of all risks. Figure 1 shows the density of the aggregate claim of a large Fire portfolio.

The field of risk theory has long offered two fundamentally different classes of aggregate claim model, namely collective models and individual models. Individual risk theory assumes that the aggregate claim is composed of the sum of n independent (but not necessarily identically distributed, hence ‘individual’) risks

$$S_{\mathrm{ind}} = \sum_{i=1}^{n} X_i$$

where n is the (deterministic) number of all the policies in a portfolio.

[Figure 1. Density of the aggregate claim for the gross portfolio; x-axis: aggregate claim, y-axis: density]


In collective models, however, the aggregate claim is described as the sum of a random number N of independent and identically distributed (i.e. assigned the same distribution function) claims Yj:

$$S_{\mathrm{coll}} = \sum_{j=1}^{N} Y_j$$

In both models, the distribution function of the aggregate claim is to be determined. Although this could be deduced analytically, there is no doubt that it must also be worked out numerically.

As far as the numerical calculability of the distribution function of the aggregate claim is concerned, the individual model benefits from only having to add up a fixed number of random variables, but it also has the disadvantage that these random variables are by nature not identically distributed. By contrast, a collective model has the advantage that only identically distributed random variables need to be added up, but it also has a clear drawback insofar as the length of the sum is stochastic, i.e. it also constitutes a random variable. For the time being, no comment will be made on any connection between the two models.

The usefulness of these models is dependent on two crucial aspects. Firstly, models are irrefragably of no practical value if they fail to facilitate the concrete determination of the key target variable (in this case the numerical calculation of the aggregate claims distribution). This failure was true of both models until the end of the 70s. Secondly, it is important to be aware of algorithms, i.e. calculation methods for determining the aggregate claims distribution, which quickly and accurately define the aggregate claims distribution as a target variable. Not surprisingly, this is a rather complex task owing to the various possibilities of partial claims and large numbers of claims.

Be that as it may, exact algorithms were developed for both models in 1981 and 1989, respectively. With some computational effort, these algorithms may be calculated on the computer equipment currently in use by insurance companies. We examine these algorithms, which in both instances are rendered as recurrence formulae, in the following subsections.

However, simply in view of the fact that the known algorithms for collective models may be computed more quickly than those for individual models, the question arises as to whether adapted, approximative collective models could be used for the pertinent individual models if they were able to offer a sufficient degree of accuracy for practical insurance purposes. Section 2.3 explains how this is possible and adumbrates various adaptation methods.

2.1. Individual model

We start with a portfolio consisting of a finite number of policies whose potential claims $X_i \geq 0$ are assumed to be random variables. Further, it is assumed that the $X_i$, $i = 1, \ldots, n$, are independent of one another and that each $X_i$ has an individual distribution function

$$P(X_i \leq x) = F_i(x), \qquad x \in \mathbb{R}$$

Thus $F_i(x)$ specifies for any selected $x \in \mathbb{R}$ the probability that the claim of the ith policy is less than or equal to x.

Initially, the distribution function F of the aggregate claim

$$S_{\mathrm{ind}} = \sum_{i=1}^{n} X_i$$

can be very simply described with the aid of the $F_i$. The distribution function of the sum of two (and hence also a finitely large number of) independent random variables is by definition the


convolution of the two distribution functions. Hence for any $x \in \mathbb{R}$

$$F(x) = P(S_{\mathrm{ind}} \leq x) = F_1 * F_2 * \cdots * F_n(x)$$

From the standpoint of calculability, this simple convolution formula is utterly impractical for larger n values. If n is greater than 1000 (which is generally the case with an insurance company's portfolios), even medium-sized computers require a far from acceptable amount of computing time for the numerical determination of the aggregate claims distribution. Reference 3 contains data on such computing times, also with regard to other procedures. Thus, since the convolution formula for the numerical determination of the aggregate claims distribution requires so much machine time, efforts have been made to find other methods.
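To make the computational issue concrete, the following minimal sketch (not taken from the paper; the portfolio data are invented and the discretisation of claims on integer monetary units is an assumption) computes the aggregate claims density of an individual model by brute-force convolution of the policy-level densities. The work grows roughly with n times the square of the support size, which is what renders the approach impractical for portfolios with thousands of policies.

```python
import numpy as np

def individual_model_density(policy_densities):
    """Aggregate claims density of an individual model by brute-force convolution.

    policy_densities: list of 1-D arrays; entry k of array i is P(X_i = k),
    i.e. each claim is discretised on 0, 1, 2, ... (monetary units).
    """
    agg = np.array([1.0])          # density of the empty sum: P(S = 0) = 1
    for f in policy_densities:
        agg = np.convolve(agg, f)  # density of S + X_i is the convolution
    return agg

# Hypothetical toy portfolio: each policy produces no claim with probability 1 - q_i
# and otherwise a claim of fixed size m_i (a simple 2-point distribution).
rng = np.random.default_rng(0)
policies = []
for _ in range(200):
    q, m = rng.uniform(0.01, 0.05), rng.integers(1, 20)
    f = np.zeros(m + 1)
    f[0], f[m] = 1.0 - q, q
    policies.append(f)

density = individual_model_density(policies)
print("mean aggregate claim:", np.dot(np.arange(len(density)), density))
```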

The eighties saw significant progress in this area. Firstly, Kornya4 offered an entirely different method for calculating the aggregate claims distribution for a special case (namely if all $X_i$ are only 2-point distributions, i.e. the scenario found in term Life insurance). In 1989, De Pril5 made a decisive breakthrough. Given extremely relaxed assumptions, which are wholly satisfactory for practical purposes, he deduced an approximate solution which can be calculated with an efficient algorithm. The computing time, although still considerable, is substantially less than when using the convolution formula. The assumptions merely state that the $X_i$ are concentrated on $\mathbb{N}_0$ (in particular, then, discrete) and that they are bounded.

2.2. Collective model

Traditionally, collective models have played a much greater role in practice and in theory than individual models. Although there is no immediate connection between the two classes of models, we shall explain in Section 2.3 how collective models can be used very quickly to determine numerically the aggregate claims distribution of an individual model.

In a collective model, it is assumed that the aggregate claim $S_{\mathrm{coll}}$ has the form

$$S_{\mathrm{coll}} = \sum_{j=1}^{N} Y_j$$

with independent and identically distributed real random variables $Y_j > 0$. The length N of this sum is assumed to be a random variable concentrated on $\mathbb{N}_0$, which is independent of $(Y_j)$. $Y_j$ is interpreted as the jth claim and N as the claims number.

If G denotes the distribution function of $Y_j$, which according to the assumption is no longer dependent on j, then

$$G(x) = P(Y_j \leq x)$$

and if

$$p_n = P(N = n)$$

is the probability of precisely n claims occurring, it is quite elementary to demonstrate that the distribution function $F_{\mathrm{coll}}(x) = P(S_{\mathrm{coll}} \leq x)$ of the aggregate claim $S_{\mathrm{coll}}$ may be expressed very simply in terms of $p_n$ and G:

$$F_{\mathrm{coll}}(x) = \sum_{n=0}^{\infty} p_n G^{*n}(x)$$

In this case, $G^{*n}$ is the n-fold convolution power of G. This formula, too, is useless for numerical purposes, firstly because it involves an infinite series, and secondly because the


partial sums also contain high convolution powers of G, the computation of which again requires an excessive amount of computing time.

In 1981, the Canadian Panjer6 made a decisive breakthrough. Given very relaxed assumptions for $Y_j$ and for the 'Panjer class' of N (i.e. Poisson, negative binomial and binomial), he discovered a recursion and thus also a highly efficient algorithm for the aggregate claims distribution $F_{\mathrm{coll}}$. To explain it more precisely, if the $Y_j$ are concentrated on $\mathbb{N}$ with a density g, and if N belongs to the Panjer class, then the density f of the aggregate claims distribution is

$$f(0) = P(N = 0)$$

$$f(x) = \sum_{k=1}^{x} \left( a + \frac{b k}{x} \right) g(k)\, f(x - k), \qquad x = 1, 2, \ldots$$

with appropriate constants $a, b \in \mathbb{R}$, these being clearly determined by N. The assumption in respect of $Y_j$ is sufficiently general, meaning in fact only that claims are payable in full DM amounts. Similarly, the assumption that the claims number obeys either a Poisson, negative binomial or binomial distribution is uncritical. The mathematical reason for the occurrence of precisely these types of claims numbers is well known: typically, they are precisely those claims number distributions which satisfy a linear recursion of the form

$$p_n = \left( a + \frac{b}{n} \right) p_{n-1}, \qquad n = 1, 2, \ldots$$

with some constants $a, b \in \mathbb{R}$.

Since it is not so important for practical purposes, it need only be mentioned here in passing that if the claims size distribution G has a continuous density it is possible to describe the target function $F_{\mathrm{coll}}$ in terms of a solution of a linear Volterra integral equation of the second kind. In practice, however, existing methods for obtaining a numerical solution to such integral equations have remained relatively insignificant.

Of decisive importance for the practicability and application of Panjer’s recurrence formula is the fact that the necessary computer times are acceptable and, in particular, that they are well below those when using the previously known algorithms for individual models. Quantitative data on this point are to be found in Reference 3.
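As an illustration, the following sketch implements the recursion for the special case of a Poisson claims number (i.e. a = 0 and b = λ) and a severity density g concentrated on the positive integers; the parameter values and the uniform severity distribution are invented for the example and are not taken from the paper.

```python
import numpy as np

def panjer_poisson(lam, g, x_max):
    """Panjer recursion for a compound Poisson aggregate claims density.

    lam   : Poisson parameter of the claims number N (a = 0, b = lam)
    g     : 1-D array with g[k] = P(Y = k) for k = 0..len(g)-1, with g[0] = 0
    x_max : largest aggregate claim for which the density is computed
    """
    f = np.zeros(x_max + 1)
    f[0] = np.exp(-lam)                     # f(0) = P(N = 0) when g(0) = 0
    for x in range(1, x_max + 1):
        k = np.arange(1, min(x, len(g) - 1) + 1)
        f[x] = (lam / x) * np.sum(k * g[k] * f[x - k])
    return f

# Hypothetical example: 100 expected claims, severities uniform on {1, ..., 10}.
g = np.zeros(11)
g[1:] = 1.0 / 10.0
f = panjer_poisson(lam=100.0, g=g, x_max=2000)
print("total probability captured:", f.sum())
print("expected aggregate claim  :", np.dot(np.arange(len(f)), f))  # approx. 100 * 5.5
```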

2.3. The combination of individual models with collective models

Owing to the poor calculability of individual models, practice (and theory) has tended to focus much more closely on collective models. However, leaving aside certain special cases remote from actual practice, the aggregate claims distribution of an individual model is certainly never identical with the aggregate claims distribution of any form of calculable collective model. Consequently, the application of a collective model inevitably produces a (perhaps tolerable) error. For a long time, this error simply remained unnoticed. In 1985, Hipp7 offered the first estimate of this error in the event of a standard transition from an individual model to an 'associated' collective model, as is normal practice especially with regard to a Poisson-distributed claims number. As far as medium-sized collectives are concerned, this error, which is in the order of 1/1000, is sufficiently small. This transition to a collective model is performed in such a way that the claims size distribution G of $Y_j$ is obtained in terms of a weighted sum of the conditional claims size distributions $F_i$ of $X_i$.


In other words, we assume an individual model

$$S_{\mathrm{ind}} = \sum_{i=1}^{n} X_i$$

to which a collective model

$$S_{\mathrm{coll}} = \sum_{j=1}^{N} Y_j$$

is associated in such a way that the claims size distribution G of $Y_j$ satisfies

$$G(x) = \sum_{i=1}^{n} \frac{q_i}{n\bar{q}} \cdot \frac{P(X_i \leq x) - (1 - q_i)}{q_i}$$

In this context,

$$q_i = P(X_i > 0), \qquad i = 1, \ldots, n$$

and N is, for example, Poisson-distributed with parameter

$$\lambda = n\bar{q} = q_1 + \cdots + q_n$$

The obvious question as to whether this selection of a collective model, as found in standard practice, is also the optimal approach was answered in the negative by Kuon et al.8 Significantly better collective models are available (cf. Reference 8) in the sense that the accuracy is improved by at least the tenth power.
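The standard transition just described can be sketched as follows (a minimal illustration, not the authors' implementation; the 2-point policy data are invented and the discretisation on integer monetary units is an assumption): the Poisson parameter is the sum of the claim probabilities and the severity density is the q_i-weighted mixture of the conditional claim size densities.

```python
import numpy as np

def associated_compound_poisson(policy_densities):
    """Standard transition from an individual model to an 'associated' collective model.

    Each entry of policy_densities is the discrete density of X_i on 0, 1, 2, ...
    Returns the Poisson parameter lam = sum(q_i) and the mixed severity density g,
    where g is the q_i-weighted average of the conditional densities of X_i given X_i > 0.
    """
    max_len = max(len(f) for f in policy_densities)
    g = np.zeros(max_len)
    lam = 0.0
    for f in policy_densities:
        q_i = 1.0 - f[0]                     # q_i = P(X_i > 0)
        if q_i > 0.0:
            cond = f.copy()
            cond[0] = 0.0                    # conditional density of X_i given X_i > 0
            cond /= q_i
            g[: len(cond)] += q_i * cond
        lam += q_i
    return lam, g / lam

# Hypothetical portfolio as before: 2-point claims with small claim probabilities.
rng = np.random.default_rng(1)
policies = []
for _ in range(500):
    q, m = rng.uniform(0.01, 0.05), rng.integers(1, 20)
    f = np.zeros(m + 1)
    f[0], f[m] = 1.0 - q, q
    policies.append(f)

lam, g = associated_compound_poisson(policies)
print("Poisson parameter lambda =", lam)
# The aggregate claims density of this collective model can now be obtained with the
# Panjer recursion sketched in Section 2.2.
```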

3. MULTIVARIATE PROCEDURES

Collective and individual aggregate claims distributions model the overall performance of a portfolio. In order to obtain perfect control of a risk, however, it is necessary to look beyond the pure portfolio perspective. More searching questions relating to systematic risk segmentation have to be dealt with, i.e. the splitting of the portfolio into homogeneous subsegments in terms of profitability and volatility. Generally speaking, prior to quantifying the segmentation effects the initial task is to evaluate the relevance of the segmentation characteristics by using suitable methods.

All these questions entail the statistically significant proof of largely complex multidimensional correlations. In this connection, applied mathematical statistics has a plurality of multivariate procedures at its disposal; we shall adumbrate certain of these below, paying special attention to their practical relevance. Of crucial importance is the fact that, unlike the commonly used purely univariate analyses, the multivariate statistical procedures under discussion here adequately allow for the underlying multidimensional structural correlations.

Thus, within the scope of a portfolio management analysis, the initial identification of the characteristics relevant to the claims development is of central importance. The features of each of these characteristics will subsequently form the basis of a decision grid in which the rules for accepting or declining risks due to be written are defined. However, multi-layered dependences may mean that a certain characteristic in combination with other characteristics scarcely possesses any explanatory power, even though an effect apparently exists from the


univariate viewpoint. Overall, then, in the light of such multi-layered dependences, it is imperative to use suitable methods from the field of multivariate statistics. In this way, it is possible to solve the problem of obtaining statistically significant proof of multidimensional dependencies and of quantifying such correlations.

Multiple regression and variance analyses, which have to date been regarded as the standard methods in mathematical statistics for dealing with such questions, may be integrated into the comprehensive theory of generalized linear models. They can be used simultaneously to approach problems relating to metrical, ordinal and categorial characteristics. A brief outline of these models is presented below. In this context, only the case of a target variable Y dependent on p covariates $X_1, \ldots, X_p$ is considered, and the model is initially introduced via the classical linear approach for normal observations. The extension to the multivariate case of N target variables follows immediately afterwards.

Let

$$Y_i = Z_i'\beta + \varepsilon_i, \qquad i = 1, \ldots, n$$

be the classical linear model, where $Z_i$, the design vector, is an appropriate function of the covariate vector $X_i = (X_{1,i}, \ldots, X_{p,i})$ and $\beta$ is the vector of unknown parameters. For a vector of metric variables the simplest form of $Z_i$ is $Z_i = (1, X_i)$, and the model then turns into the well-known form

$$Y_i = \beta_0 + \sum_{j=1}^{p} \beta_j X_{j,i} + \varepsilon_i, \qquad i = 1, \ldots, n$$

The errors $\varepsilon_i$ are assumed to be normally distributed and independent, i.e.

$$\varepsilon_i \sim N(0, \sigma^2), \qquad i = 1, \ldots, n$$

where $N(0, \sigma^2)$ denotes the normal distribution with mean 0 and variance $\sigma^2$.

We now rewrite this classical model in a form that directly leads to generalized linear models: the observations $Y_i$ are independent and normally distributed,

$$Y_i \sim N(\mu_i, \sigma^2), \qquad i = 1, \ldots, n$$

with $\mu_i = E(Y_i)$. The mean $\mu_i$ is then given by the linear combination $Z_i'\beta$, i.e.

$$\mu_i = Z_i'\beta, \qquad i = 1, \ldots, n$$

Since, normally, the covariates $X_i$ are stochastic, we assume the vectors $(Y_i, X_i)$ to be independent and identically distributed. So the assumption on $Y_i$ is conditional given $X_i$, which means that both the density of $Y_i$ and the independence of the $Y_i$ are only conditional given $X_i$.

In the following definition of generalized linear models, the preceding assumptions are relaxed.

3.1. Distributional assumptions

Given $X_i$, the $Y_i$ are (conditionally) independent and the conditional distribution of $Y_i$ belongs to a simple exponential family with expectation $E(Y_i \mid X_i) = \mu_i$ and density

$$f(y_i \mid \theta_i, \phi) = \exp\left( \frac{y_i \theta_i - b(\theta_i)}{\phi} + c(y_i, \phi) \right)$$


where $\theta_i$ is the so-called natural parameter; $\phi$ is an additional scale or dispersion parameter; and b, c are specific functions corresponding to the type of exponential family, for example the normal, the binomial, the Poisson, the gamma and the inverse Gaussian distribution.

3.2. Structural assumption

The expectation $\mu_i$ is related to a linear predictor $\eta_i = Z_i'\beta$ by

$$g(\mu_i) = \eta_i = Z_i'\beta$$

where g is the link function; $\beta$ is a vector of unknown parameters of dimension p; and $Z_i$ is a design vector of dimension p, which is determined as an appropriate function $Z_i = Z(X_i)$ of the covariates $X_i$.

The choice of an appropriate link-function depends on the specific underlying exponential family of the model, i.e. on the type of responses. For each exponential family there exists a so-called natural link-function. Natural link-functions relate the natural parameter of the density directly to the linear predictor:

$$\theta = \theta(\mu) = \eta = Z'\beta$$

i.e. $g(\mu) = \theta(\mu)$. For example, the natural link-functions are $\eta = \mu$ for the normal, $\eta = \log \mu$ for the Poisson and $\eta = \log(\mu/(1 - \mu))$ for the binomial distribution.

Thus, in the case of normal responses, the classical models of regression and variance analysis are obtained; for Poisson-distributed responses-for example claims counts in a contingency table-the approach leads to log-linear models, and in the case of binomial data the generalized linear models prove to be logit-models for categorial regression analysis.

Concerning the design vector, metrical variables can be incorporated directly or after appropriate transformations such as $\log(x)$, $x^2$, etc. Categorial covariates, ordered or unordered, have to be coded into dummy vectors using appropriate methods (for further remarks on the theory, we refer to References 9 and 10).

From a practical point of view, it is important that the approach using generalized linear models constitutes a framework in which many of the above-mentioned problems can be dealt with. For example, in Motor Liability business, the estimation of the claims burden dependent on several risk factors (covariates) in the light of risk-significant aspects is nothing more than model fitting, parameter estimation and the application of the appropriate testing procedures. Furthermore, all the statistical procedures for variable selection and checking, i.e. stepwise backwards and forwards selection, can be applied beforehand in order to check the relevance to the risk of new additional risk factors under consideration. The result is a scoring of possible variables with regard to their ability to discriminate the claims values.
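The model-fitting step can be sketched as follows (invented data and invented covariates; numpy only, whereas a real analysis would use a statistical package): a Poisson log-linear model of claim counts is fitted by Fisher scoring, which is the standard estimation procedure for generalized linear models.

```python
import numpy as np

def fit_poisson_glm(Z, y, n_iter=25):
    """Fit a Poisson GLM with natural (log) link by Fisher scoring.

    Z : (n, p) design matrix (first column of ones for the intercept)
    y : (n,) observed claim counts
    """
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        eta = Z @ beta
        mu = np.exp(eta)                      # inverse of the natural link
        W = mu                                # Fisher weights for the Poisson family
        # Fisher scoring step: beta += (Z' W Z)^{-1} Z' (y - mu)
        beta = beta + np.linalg.solve(Z.T @ (Z * W[:, None]), Z.T @ (y - mu))
    return beta

# Invented example: claim counts driven by an age covariate and a regional indicator.
rng = np.random.default_rng(2)
n = 5000
age = rng.uniform(18, 70, n)
urban = rng.integers(0, 2, n)
true_eta = -2.0 - 0.01 * age + 0.4 * urban
y = rng.poisson(np.exp(true_eta))

Z = np.column_stack([np.ones(n), age, urban])
print("estimated coefficients:", fit_poisson_glm(Z, y))
```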

The task of optimally segmenting a portfolio into the most homogeneous possible subsegments is a further straightforward, but extremely effective, application. The purpose here is to use the method to determine both the selection of the segmentation criteria (variables) and the actual portfolio split. The procedures essentially involve a simple linear variance analysis approach in which the dependent variable, such as the loss ratio of the segments that can be formed from a plurality of potential explanatory factors, is modelled in linear dependence on those factors.

If we initially consider just one explanatory factor (covariable), we obtain the model

$$Y_{i,j} = \mu_i + \varepsilon_{i,j}$$


where $Y_{i,j}$ represents the jth observation (risk) in the ith segment, $i = 1, \ldots, I$, $j = 1, \ldots, n_i$; the $\varepsilon_{i,j}$ are independent random variables with $E(\varepsilon_{i,j}) = 0$ and $V(\varepsilon_{i,j}) = \sigma^2$, uncorrelated with the factor; $\mu_i$ is the mean value (e.g. mean loss ratio) of the ith segment; $n_i$ is the number of observations in the ith segment; $n = \sum_{i=1}^{I} n_i$ is the total number of observations; and $\mu = \frac{1}{n} \sum_{i=1}^{I} n_i \mu_i$ is the mean value of the total portfolio.

This model has the following equivalent representation:

$$Y_{i,j} = \mu + \alpha_i + \varepsilon_{i,j}$$

with $\alpha_i = \mu_i - \mu$ and $\sum_{i=1}^{I} n_i \alpha_i = 0$. Then $E(Y_{i,j}) = \mu + \alpha_i$ is valid. Thus, the linear coefficient $\alpha_i$ measures the effect of the ith segment, which is initially defined via exactly one factor, on the dependent variable. As a result, the factor has an effect on the dependent variable precisely in cases where individual $\alpha_i$ are distinct from zero. An appropriate test for this can easily be formulated under normality assumptions on the basis of the variance analysis.

Aided by the analysis of variance, the hypothesis $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$, i.e. all expected values of the segments are equal, can be tested by means of

$$F_{I-1,\,n-I} = \frac{r^2}{1 - r^2} \cdot \frac{n - I}{I - 1}, \qquad r^2 = \frac{\sum_{i=1}^{I} n_i \hat{\alpha}_i^2}{\sum_{i=1}^{I} \sum_{j=1}^{n_i} (Y_{i,j} - \bar{Y})^2}$$

with $F_{I-1,\,n-I}$ F-distributed with parameters $(I - 1, n - I)$. Alternatively, a distribution-free cumulative ranking test can also be formulated for $H_0$, which dispenses with the assumption of normality and is based solely on a continuous distribution. The value $r^2$ represents the classical measure of fit, which can be interpreted as the quotient of the variation between the segments and the total variation. The underlying variance analysis also reveals that the minimization of the variation within the segments (in the sense of $L^2$) is equivalent to the maximization of the variation between the segments and hence also to the maximization of $r^2$.
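A minimal numeric sketch of this test follows (the segment loss ratios are invented; scipy is used only for the F distribution): it computes the measure of fit $r^2$ and the corresponding F-statistic for a one-way segmentation.

```python
import numpy as np
from scipy import stats

def one_way_anova(segments):
    """F-statistic and measure of fit r^2 for a one-way segmentation.

    segments: list of 1-D arrays, one array of observations (e.g. loss ratios) per segment.
    """
    all_obs = np.concatenate(segments)
    n, I = len(all_obs), len(segments)
    grand_mean = all_obs.mean()
    ss_between = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in segments)
    ss_total = ((all_obs - grand_mean) ** 2).sum()
    r2 = ss_between / ss_total
    F = (r2 / (1.0 - r2)) * (n - I) / (I - 1)
    p_value = stats.f.sf(F, I - 1, n - I)
    return F, r2, p_value

# Invented example: loss ratios in three tariff segments.
rng = np.random.default_rng(3)
segments = [rng.normal(0.70, 0.15, 400),
            rng.normal(0.78, 0.15, 300),
            rng.normal(0.95, 0.15, 150)]
F, r2, p = one_way_anova(segments)
print(f"F = {F:.1f}, r^2 = {r2:.3f}, p-value = {p:.2g}")
```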

By means of this basic approach, the multidimensional problem of obtaining an optimal segmentation vis-à-vis both a plurality of potential segmentation characteristics (factors) and the formation of the segments can be solved in a hierarchical, successive procedure via the concrete features of the characteristics being applied. In this connection, the goodness of a segmentation can be measured by means of the heterogeneity of the individual segments, i.e. by the measure of fit $r^2$.

The exact determination of the optimal segmentation generally necessitates the calculation of this measure of goodness for all potential segmentations. Even given a small number of characteristics for investigation, each with relatively few categories, the computational effort required is no longer manageable from a technical viewpoint. For this reason, as a first step, only binary segmentations for each characteristic are considered (cf. the corresponding procedure in cluster analysis), and an optimal solution within the above-mentioned framework (i.e. with regard to the characteristics and their categories) is determined for a binary split. In a


hierarchical top-down approach, this procedure can then be applied successively to a portfolio of risks, i.e. the originally heterogeneous risks located in a common segment are broken down into two sub-portfolios in respect of the segmentation characteristic whose split, also with regard to the categories, displays the highest value of the measure of fit. The resultant segments are more homogeneous than the original segment, since the variation within the segments decreases as the measure of fit increases. In addition, the significance of the analysis, as described above, can be evaluated at each stage and suitable cut-off criteria for the segmentation procedure may be formulated by means of the appropriate tests. Of decisive importance here is the fact that in this hierarchical procedure multidimensional dependences between the segmentation characteristics, so-called interactions, are implicitly allowed for, rather than having to be explicitly modelled as is the case with log-linear modelling approaches. The result of a segmentation analysis, i.e. the successive breakdown of an original portfolio into increasingly homogeneous segments in respect of a claims value, can be clearly illustrated with the aid of tree diagrams.
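The greedy binary-split step can be sketched as follows (a simplified illustration with invented vehicle-class and region data; real implementations add the significance-based stopping rules mentioned above): over all characteristics and all binary groupings of their categories, the split maximizing $r^2$ is selected, and the procedure is then applied recursively to the two resulting sub-portfolios.

```python
import numpy as np
from itertools import combinations

def best_binary_split(y, factors):
    """Find, over all characteristics and all binary groupings of their categories,
    the split that maximizes r^2 (variation between segments / total variation)."""
    ss_total = ((y - y.mean()) ** 2).sum()
    best = (0.0, None, None)                 # (r2, factor index, category subset)
    for f_idx, x in enumerate(factors):
        cats = np.unique(x)
        # enumerate proper, non-empty subsets; fixing the first category avoids mirror splits
        rest = cats[1:]
        for r in range(len(rest) + 1):
            for subset in combinations(rest, r):
                left = np.isin(x, (cats[0],) + subset)
                n_l, n_r = left.sum(), (~left).sum()
                if n_l == 0 or n_r == 0:
                    continue
                ss_between = (n_l * (y[left].mean() - y.mean()) ** 2
                              + n_r * (y[~left].mean() - y.mean()) ** 2)
                r2 = ss_between / ss_total
                if r2 > best[0]:
                    best = (r2, f_idx, (cats[0],) + subset)
    return best

# Invented example: loss ratio depends on vehicle class but not on region.
rng = np.random.default_rng(4)
n = 2000
vehicle = rng.choice(["small", "mid", "sport"], n)
region = rng.choice(["north", "south", "west"], n)
y = 0.7 + 0.3 * (vehicle == "sport") + rng.normal(0.0, 0.1, n)

r2, f_idx, subset = best_binary_split(y, [vehicle, region])
print(f"best split: factor {f_idx}, categories {subset}, r^2 = {r2:.3f}")
# Applied recursively to the two resulting sub-portfolios, this yields the
# hierarchical tree described above.
```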

4. SCOPE FOR PRACTICAL APPLICATION

The need to analyse a portfolio as a whole in terms of sub-segments for management purposes is self-evident.

With markets offering progressively less room for manoeuvre, it is necessary to set clear objectives and implement systematic measures. Whether in the shape of portfolio improvement concepts for a particular class of business or in the form of risk-oriented market segmentation, the desire to fine-tune insurance business with an eye to profitability is more pronounced than ever. Top of the agenda is the need to keep loss ratios below the market average and thereby attain profitable market shares. The manageability of the portfolio as a whole is to be further enhanced by means of numerous segmentation criteria (in Fire insurance, for example, in addition to the EML, risk classifications, regions, share in the original, types of operation, discount, etc.). By applying procedures from the area of multivariate statistics (cf. Section 3), the aim is to obtain the most highly segmented breakdown possible without sacrificing statistical significance.

At this point, concrete problems frequently crop up. Firstly, a quantitative analysis cannot be performed without data access. Many insurers today have at their disposal individual-risk data that are differentiated according to tariff characteristics, or it may even be the case that they expand their master records by means of field statistics on bulk business gathered in close cooperation with market research institutes. Yet the available data are not always analysed systematically. Underpriced risks are not infrequently treated just like overpriced ones. In the most extreme cases, disastrously poor risks even receive special discounts, e.g. due to the large sums insured, because the characteristic responsible for the heavy claims burden has been overlooked. Unfortunately, sales, marketing and the fixing of prices and discounts are still uncoordinated and organized along very undifferentiated lines.

Insurance companies’ underwriting departments are, of course, well aware that the development of risks within a single class of business varies. In order to be able to write risks on an increasingly individualized basis, they would prefer to have a highly detailed portfolio structure that takes into account numerous characteristics. However, if the portfolio is broken down into ever smaller segments, there eventually comes a point where statistically significant statements are no longer possible, for example because class frequencies become too small. On grounds of significance, underwriting departments are obliged to settle for what is clearly feasible rather than what is desirable.


Second, it is necessary at this stage to incorporate methodically sound procedures for major-claim adjustment. In many companies, this interface with the risk analysis of a portfolio has already become a highly controversial topic as far as up-to-date profit monitoring is concerned. Major-claim adjustment does not mean neglecting major claims. Just as random major claims have to be excluded from the empirical data, it is necessary in a second stage to supplement all basic claims with the expected major-claim shares so as also to make adequate allowance for major claims that randomly fail to occur. This is unfortunately not standard practice in all insurance companies, although, in the case of Fire portfolios for example, the EML-based determination of the fluctuation potential in accordance with the methods described in Section 2 for calculating the aggregate claims distribution already offers a way of determining the major-claim shares that is adequate to the risk.

The crux of the entire issue, i.e. the identification and combination of segments that behave homogeneously in terms of their profitability, can now be approached using the methods described in Section 3. The hierarchical segmentation of the portfolio as a whole created by using a top-down approach facilitates the precise localization of profit and loss segments. The extent of the premium insufficiency or of the potential yield of the profitable segments is quantified, and the priority attaching to the various risk-evaluation criteria is determined. A complete portfolio split produces a complex tree (compare Figure 2) with numerous branches and hierarchies together with, at the lowest level, segments that can be addressed directly in operational terms.

Private customer business represents quite a different proposition. In Motor insurance, the key Non-Life class of business, insurers are facing highly differentiated, company-specific tariffs with finely graduated discount systems as well as the possibility of substantially shorter product


and tariff cycles. Following the abolition of the Motor Liability tariff, the Motor market in Germany, in particular, is experiencing a period of upheaval involving considerable implications for the future.

Preparations for a changed competitive environment in which the mastery of underwriting practices will assume crucial importance as a performance factor are already more advanced in Motor than in other classes of business. In addition, car insurers in Germany already have some experience of fixing premiums differentiated along risk-group lines, though it was the industry's Association which was responsible for performing the actual calculations. Whereas previously the characteristics describing customer quality in MTPL were engine size, no-claims bonus and region, and in Comprehensive Coverage vehicle type class and no-claims bonus, since 1992 insurance companies have been searching for additional characteristics. Via field studies and external databases, sub-portfolios consisting of between 20,000 and 100,000 policies have been supplemented by up to seventy characteristics.

However, the required degree of risk transparency does not materialize if, for example, the dependence of the claims burden is determined solely by the car’s mileage, the sex of the driver or some other characteristic. The reason is that the problem as such is multidimensional and can only be solved in several stages with the aid of a wide variety of methods and multivariate statistics.

The decisive factor in economic terms is that discounts calculated on a merely univariate basis can generate negative selection in a market in the throes of upheaval. Thus, for example, frequent women drivers fall increasingly under the scope of the lady tariffs. In the case of characteristics whose overall effect upon the claims burden is positive (e.g. sex, cf. lady tariff), in combination with other characteristics (e.g. convertible), this effect can go into reverse. For this reason, multidimensional handling and comprehension are indispensable. Additionally, characteristics incapable of factual examination, such as annual mileage, can be replaced using these methods via dependences upon other characteristics (occupation, use, sex). Obviously, in addition to the metrical characteristics, categorial characteristics in particular are also taken into consideration.

Once the characteristics relevant to the risk have been derived from the large pool of possible characteristics and their mutual dependences have been quantified, the performance of the portfolio can be drastically improved by appropriate focusing within the framework of systematic market cultivation.

Such an analysis also lays the foundations for the development and implementation of a tariff structure that is both company-specific and adequate to cater for the risks involved. By identifying the profitable customer segments and the characteristics of such segments relevant to the risks, the first stage has been accomplished. It is, of course, then necessary to gear the sales and marketing systems according to the newly defined target groups.

It is not only in Fire, Motor and Liability business that insurance companies must brace themselves in the face of the anticipated upheavals so as to be able to satisfy quickly the requirements of changing markets. On a very general level, there is a need in all classes of insurance for more finely tuned procedures of market differentiation, entailing the approaches described above.

The instruments of modern portfolio management enable an insurance company to develop well-founded proposals in respect of its own strategic options. Taking as a basis strong and weak points as well as opportunities and risks, it is thus possible to define a company's own competitive advantages and to bring them to bear in attractive market sectors.

To put it in a nutshell, the establishment of profit-oriented portfolio management between the poles of risk-oriented underwriting practices and modern database marketing is a major


challenge for insurance companies. Of crucial importance in this framework is the development of underwriting practices which, rather than simply managing portfolios and applying calculated tariffs, proceed from the impulses of selective market activity which are necessary in a deregulated market. The transition from thinking in terms of products to thinking in terms of requirements in the sense of customer-oriented solutions to problems justifies the accompanying counterpole at insurance companies. It is between these two poles, underwriting on the one hand and customer requirements on the other, that the area of tension emerges from which creative future-oriented products in the sense of solutions to problems for the customer can develop.

REFERENCES

1. H. Bühlmann, Mathematical Methods in Risk Theory, Springer-Verlag, Berlin, 1970.
2. C. Hipp and R. Michel, Risikotheorie: Stochastische Modelle und Statistische Methoden, Verlag Versicherungswirtschaft e.V., Karlsruhe, 1990.
3. S. Kuon, A. Reich and L. Reimers, 'Panjer vs Kornya vs De Pril: a comparison from a practical point of view', ASTIN Bull., 17, 183-191 (1987).
4. P. S. Kornya, 'Distribution of aggregate claims in the individual risk theory model', Trans. Soc. Actuaries, 35, 823-836 (1983).
5. N. De Pril, 'The aggregate claims distribution in the individual model with arbitrary positive claims', ASTIN Bull., 19, 9-24 (1989).
6. H. Panjer, 'Recursive evaluation of a family of compound distributions', ASTIN Bull., 12, 22-26 (1981).
7. C. Hipp, 'Approximation of aggregate claims distributions by compound Poisson distributions', Insurance: Math. & Econ., 4, 227-232 (1985).
8. S. Kuon, M. Radtke and A. Reich, 'An appropriate way to switch from the individual risk model to the collective one', ASTIN Bull., 23, 23-54 (1993).
9. L. Fahrmeir and G. Tutz, Multivariate Statistical Modelling Based on Generalized Linear Models, Springer-Verlag, New York, 1994.
10. L. Fahrmeir and A. Hamerle, Multivariate statistische Verfahren, De Gruyter, Berlin, 1984.