
Upload: spedicatogiorgio

Post on 04-Apr-2018


7/31/2019 Art Cladag 2011

http://slidepdf.com/reader/full/art-cladag-2011 1/16

European Actuarial Journal manuscript No.(will be inserted by the editor)

An actuarial model for assessing general practitioners’ prescription costs

Giorgio Spedicato

the date of receipt and acceptance should be inserted later

Abstract Monitoring general practitioners’ prescription costs is an important step towards an efficient allocation of national health insurance resources. To this end, this paper proposes a methodology based on non-life actuarial models. The frequency and cost of patients’ drug prescriptions are modelled by means of Generalized Additive Models for Location, Scale and Shape (GAMLSS). The total cost of the drug prescriptions of a pool of patients is then modelled by means of convolutions, following a classical risk theory approach. An example based on a quasi-real dataset illustrates the proposed methodology.

Keywords GAMLSS · public health insurance · drug prescriptions coverage · predictive

models

1 Introduction

Monitoring the cost of general practitioners’ (GPs) drug prescriptions is an important step towards an efficient allocation of the National Health Insurance (NHI) budget. The prolonged economic downturn has increased the pressure on governments toward rationalization and budget restrictions. For example, the NHI policy discussion in Italy [3] has drawn attention to “standard costs of service”, which should represent the efficient price of any service granted by the NHI. This paper aims to show a rational approach to assessing the standard cost of drug prescriptions charged to the NHI.

Drug prescription expenditure has been widely studied by health econometricians and medical researchers. In particular, [26] analysed GPs’ drug prescription costs in Ireland. The yearly total cost was estimated by means of a linear regression model based on aggregate demographic variables of each GP’s pool of patients. [21] conducted a similar study in Northern Italy, applying multiple linear regression and a LISREL model based on both patient- and GP-level demographic variables. [9] studied the effect of GPs’ age and sex on the number and cost of drug prescriptions in the Catalunya region. Finally, [10] applied a panel data econometric model to data from the Catalunya region. In summary, the medical literature confirms the availability of data and the importance of using statistical models; moreover, empirical studies show that both patient- and GP-level demographic variables


play a significant role in determining the total cost of the yearly prescriptions. However, the approaches proposed in the medical literature have three drawbacks: the number and the cost of single prescriptions are not modelled separately; linear regression models are used, while generalized linear models (GLMs) seem more adequate; and only the expected value of the total cost of the yearly prescriptions is modelled, without taking its variability into account.

Even though no actuarial literature exists on this topic, some well-known actuarial approaches may be usefully applied in this context. In particular, we propose a new methodology which combines four actuarial techniques that are widely used in non-life insurance actuarial practice.

The first technique consists of the convolution of stochastic distributions (see e.g. [4] for a theoretical introduction). In particular, risk theory models the total cost of claims by convolution of the distributions of the number and the cost of claims. One of the major applications of risk theory in the actuarial context is the estimation of insurers’ Solvency Capital Requirement ([2]); see an example in [20]. Here, we propose to model the distribution of the yearly total cost of a single patient’s drug prescriptions as a convolution of the stochastic number and cost of the prescriptions associated with the patient.

The second technique is an extension of GLMs, namely Generalized Additive Models for Location, Scale and Shape (GAMLSS) [19]. GLMs are widely used in non-life rate-making ([1] and [5]). In particular, over-dispersed Poisson and Gamma GLMs are applied to model the frequency and the severity of claims as a function of policyholders’ characteristics in order to assess the risk premium of insurance coverages (see [25] for details). However, the variability of the number and cost of claims is rarely taken into consideration in standard rate-making. GAMLSS make it possible to model as a function of covariates not only the mean, but also the other parameters that completely define the conditional distribution of the dependent variable. Very few actuarial applications of GAMLSS exist. In particular, GAMLSS have been proposed to assess the frequency and the cost of claims in the Australian market in [6] and to analyse mortality trends in [24]. Moreover, [22] applies GAMLSS to better take into account portfolio heterogeneity when assessing the Solvency II capital requirement for premium risk. Here, we propose to model the frequency and cost of each patient’s prescription drugs by means of GAMLSS, in order to estimate location and dispersion parameters as a function of patient characteristics.

The third technique is represented by models for lapse probability and conversion rate, widely used in actuarial practice to predict drop-outs and arrivals, given that a portfolio of policyholders is an open collectivity (see e.g. [25] and [23] for a practical discussion). We propose to model the probability that any subject leaves the GP because of death or other causes, as well as the probability that a new subject enters the GP’s pool of patients.

The fourth technique consists of approximating the total loss distribution of a portfolio by a theoretical distribution (see [14] for details). We extend this approach to approximate the yearly total cost of the drug prescriptions arising from a GP’s pool of patients.

The paper is structured as follows: the methodology is introduced in Section 2, and an example based on a quasi-real data set is discussed in Section 3. Finally, Section 4 provides conclusions and suggestions for further research.

2 The methodology

This section introduces the theoretical tools which are the basis of the new methodology

proposed.


2.1 Risk theory

One of the goals of risk theory is modeling the total cost of a policyholders’ portfolio. Given that patients are heterogeneous, we follow the so-called individual risk theory approach to model the distribution of the yearly total costs of prescription drugs. In particular, the yearly total cost $T$ of prescription drugs can be expressed as the sum of single patients’ costs $t_i$, $i = 1, \dots, N$, that is:

\[ T = \sum_{i=1}^{N} t_i, \tag{1} \]

where both $T$ and $t_i$, $i = 1, \dots, N$, are random variables.

Then, the yearly cost of prescription drugs $t_i$ for patient $i$ can be seen as the sum of the costs $c_{ij}$, $j = 1, \dots, n_i$, of the single prescriptions issued to patient $i$, that is:

\[ t_i = \sum_{j=1}^{n_i} c_{ij}, \tag{2} \]

where $n_i$ represents the stochastic number of drug prescriptions during the exposure period for patient $i$, $c_{ij}$ represents the stochastic cost of the $j$-th of these prescriptions, and $t_i = 0$ when $n_i = 0$. The distribution of $t_i$ is therefore a convolution of the cost distributions with a stochastic number of terms.

2.2 GAMLSS

GLMs extend the classical linear model to the case in which the dependent variable is not conditionally Gaussian distributed; here, the expected value of the dependent variable $\tilde{y}_i$ is expressed as a function of covariates through the GLM link function, that is:

\[ E[\tilde{y}_i] = \mu_i = g^{-1}(\eta_i) = f(x_i), \qquad \operatorname{var}[\tilde{y}_i] = \phi V(\mu_i), \tag{3} \]

where $g$ is the link function (and $g^{-1}$ its inverse), $V(\mu_i)$ is a function that depends on the distribution family and $\phi$ is a constant that can be estimated from the data (see [1] for details). However, the standard GLM framework is restrictive with respect to the variance of $\tilde{y}_i$, since the variance depends on the covariates only through $\mu_i$.

A recent extension of GLMs, the GAMLSS family, overcomes this limitation. GAMLSS make it possible to model up to four parameters of the distribution of $\tilde{y}_i$ as a function of covariates (i.e. location $\mu_i$, scale $\sigma_i$ and shape parameters $\nu_i$ and $\tau_i$). Then, we have:

\[ \mu_i = f_1(x_i), \qquad \sigma_i = f_2(x_i), \qquad \nu_i = f_3(x_i), \qquad \tau_i = f_4(x_i) \tag{4} \]

The distribution of $\tilde{y}_i$ is therefore fully characterized by a set of flexible equations. In particular, equation (4) implies that the moments of $\tilde{y}_i$ can be directly expressed as a function of covariates after a convenient parametrization, that is:

\[ E[\tilde{y}_i] = f(x_i), \qquad \operatorname{var}[\tilde{y}_i] = g(x_i) \tag{5} \]


The current GAMLSS R package [19] supports more than 60 distributions, non-linear and non-parametric relationships (e.g. cubic splines, loess and other non-parametric smoothers) and random-effect modelling; moreover, it provides a full set of diagnostic tools.

In order to assess the total cost in (2) of the drug prescriptions of a GP’s pool of patients, we propose to model $n_i$ and $c_{ij}$ by means of the GAMLSS framework as a function of the patient’s characteristics. This makes it possible to obtain expressions for $E[n_i]$, $\operatorname{var}[n_i]$, $E[c_i]$ and $\operatorname{var}[c_i]$ following equation (5). We propose to use a count data regression model for $n_i$ and a regression model with a positive-valued distribution for $c_i$. Suitable candidates for $n_i$ are the Poisson (POI) and Negative Binomial (NB) distributions, which are widely used in non-life actuarial practice; their advantage is that closed forms for the moments exist as functions of the distribution parameters. Formulas (6) and (7) show convenient parametrizations of the POI and NB probability mass functions, respectively:

\[ p_Y(y \mid \mu) = \frac{e^{-\mu} \mu^{y}}{y!}, \qquad E[Y] = \mu, \qquad \operatorname{var}[Y] = \mu \tag{6} \]

\[ p_Y(y \mid \mu, \sigma) = \frac{\Gamma\left(y + \frac{1}{\sigma}\right)}{\Gamma\left(\frac{1}{\sigma}\right) \Gamma(y+1)} \left(\frac{\sigma\mu}{1+\sigma\mu}\right)^{y} \left(\frac{1}{1+\sigma\mu}\right)^{\frac{1}{\sigma}}, \qquad E[Y] = \mu, \qquad \operatorname{var}[Y] = \mu + \sigma\mu^{2} \tag{7} \]
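As a rough illustration of how equations (5) and (7) combine, the sketch below evaluates the mean and variance of the prescription count for one patient, assuming log links for both $\mu$ and $\sigma$. The covariates (`age`, `female`) and every coefficient value are hypothetical placeholders chosen for the example; they are not estimates from the fitted models discussed in this paper.

```python
import math


def linear_predictor(coefs, covariates):
    """Intercept plus the sum of coefficient * covariate value."""
    return coefs["intercept"] + sum(coefs[name] * value for name, value in covariates.items())


def negative_binomial_moments(mu_coefs, sigma_coefs, covariates):
    """Mean and variance under equation (7), assuming log links for mu and sigma."""
    mu = math.exp(linear_predictor(mu_coefs, covariates))
    sigma = math.exp(linear_predictor(sigma_coefs, covariates))
    return mu, mu + sigma * mu**2


# Hypothetical coefficients, for illustration only.
mu_coefs = {"intercept": 0.5, "age": 0.01, "female": 0.2}
sigma_coefs = {"intercept": 0.8, "age": -0.005, "female": 0.0}
patient = {"age": 45, "female": 1}

mean_n, var_n = negative_binomial_moments(mu_coefs, sigma_coefs, patient)
print(f"E[n] = {mean_n:.3f}, var[n] = {var_n:.3f}")
```

Because $\sigma$ has its own linear predictor, two patients with the same expected count can still have different variances, which is the feature a standard GLM cannot express.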

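Suitable candidates for $c_i$ are the Gamma (GA) and Inverse Gaussian (IG) distributions. Equations (8) and (9) show convenient parametrizations of the GA and IG density functions, respectively:

\[ f_Y(y \mid \mu, \sigma) = \frac{1}{(\sigma^{2}\mu)^{1/\sigma^{2}}} \, \frac{y^{\frac{1}{\sigma^{2}} - 1} \, e^{-y/(\sigma^{2}\mu)}}{\Gamma\left(\frac{1}{\sigma^{2}}\right)}, \qquad E[Y] = \mu, \qquad \operatorname{var}[Y] = \sigma^{2}\mu^{2} \tag{8} \]

\[ f_Y(y \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^{2}y^{3}}} \exp\left(-\frac{(y-\mu)^{2}}{2\mu^{2}\sigma^{2}y}\right), \qquad E[Y] = \mu, \qquad \operatorname{var}[Y] = \sigma^{2}\mu^{3} \tag{9} \]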
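The parametrizations in (7) and (9) can be checked numerically. The following is a minimal sketch using only the Python standard library: the negative binomial is drawn as a gamma-Poisson mixture and the inverse Gaussian with the Michael-Schucany-Haas transformation. These sampling techniques and the parameter values are choices made for this illustration, not the procedure used in the paper, which relies on the GAMLSS R package.

```python
import math
import random
import statistics


def sample_poisson(rng, lam):
    """Poisson draw: count unit-rate exponential arrivals falling in [0, lam]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_negative_binomial(rng, mu, sigma):
    """NB with E[Y] = mu and var[Y] = mu + sigma * mu**2 (gamma-Poisson mixture)."""
    lam = rng.gammavariate(1.0 / sigma, sigma * mu)  # shape 1/sigma, scale sigma*mu
    return sample_poisson(rng, lam)


def sample_inverse_gaussian(rng, mu, sigma):
    """IG with E[Y] = mu and var[Y] = sigma**2 * mu**3 (Michael-Schucany-Haas)."""
    lam = 1.0 / sigma**2  # shape parameter of the standard IG(mu, lambda) form
    y = rng.gauss(0.0, 1.0) ** 2
    root = math.sqrt(4 * mu * lam * y + (mu * y) ** 2)
    x = mu + mu * mu * y / (2 * lam) - mu / (2 * lam) * root
    return x if rng.random() <= mu / (mu + x) else mu * mu / x


rng = random.Random(2011)
counts = [sample_negative_binomial(rng, 3.0, 0.5) for _ in range(20000)]
costs = [sample_inverse_gaussian(rng, 20.0, 0.05) for _ in range(20000)]

# Theoretical values: NB mean 3.0, variance 3.0 + 0.5 * 3.0**2 = 7.5
print(statistics.mean(counts), statistics.variance(counts))
# Theoretical values: IG mean 20.0, variance 0.05**2 * 20.0**3 = 20.0
print(statistics.mean(costs), statistics.variance(costs))
```

The sample moments should land close to the theoretical values quoted in the comments, up to Monte Carlo error.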

2.3 Lapse probability and conversion rate

In order to optimize the proposed tariffs, actuaries usually fit models for lapse probability and conversion rate, which take into account existing customers’ drop-outs and new policyholders’ inflows, respectively. The standard approach is logistic regression with covariates describing the policyholders’ demographic profile and the competitiveness of the market (see [25]). Lapse and conversion modelling makes it possible to define properly the effective period of exposure at risk, $e_i$, for each subject during the time of the study.

Our application models the drug prescription cost of a pool of patients during one calendar year. However, each patient can enter the pool after the beginning of the year, e.g. after a change of residence, and can leave the pool before the end of the year, e.g. because of death. Therefore the effective exposure period becomes a stochastic variable, $e_i$, that must be modelled in order to properly assess $t_i$. We assume the expected value of $n_i$ to be proportional


to $e_i$, as equation (10) shows. GLM modelling handles this issue by means of an offset, as [1] shows: the $\ln(e_i)$ term enters the link equation with its coefficient fixed at 1.

\[ E[n_i] = e_i \exp\left(x_i' \beta\right) \;\rightarrow\; \ln\left(E[n_i]\right) = \ln(e_i) + x_i' \beta \tag{10} \]

However, since the exposure is stochastic in our application, equation (10) must be modified to take into account the contribution of inflows and outflows.

\[ e_i = 1 - e_i^{l} + e_i^{nb} \tag{11} \]

Equation (11) expresses the exposure of the $i$-th patient, $e_i$, as the algebraic sum of three components: the exposure, 1, that would be achieved if the patient stayed within the pool for the full calendar year; less the fraction of the year, $e_i^{l}$, that must not be counted if the patient leaves the pool before the end of the year; plus the exposure contribution, $e_i^{nb}$, of new patients who share the demographic profile of the $i$-th patient.

The lost exposure $e_i^{l} = q_i \tilde{I}_d$ can be expressed as the product of a Bernoulli random variable $q_i$ and a uniform (0,1) random variable $\tilde{I}_d$. In particular, $q_i$ indicates whether the $i$-th patient leaves the pool within the year, while $\tilde{I}_d$ represents the fraction of the year lost. By using a uniform distribution, we are assuming that the lapse probability is constant throughout the year.

Similarly, we can express the exposure due to the flow of new patients as $e_i^{nb} = \sum_{j=1}^{m_i} \tilde{I}_j^{nb}$, where $m_i$ represents the random number of new patients and is modelled by a Poisson distribution with parameter $\lambda_i$. Moreover, we are assuming that these $m_i$ patients share the demographic profile of the $i$-th patient. The interpretation of $\tilde{I}_j^{nb}$ parallels that of $\tilde{I}_d$.
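Equation (11) can be sketched as a small sampler. The code below is a simplified illustration, not the paper's implementation: it takes a single lapse probability and a single new-patient rate as inputs, and the values 0.02 and 0.03 are the flat lapse and conversion assumptions quoted later in Section 3.1 (the death component of the lapse probability is omitted here for brevity).

```python
import random
import statistics


def sample_poisson(rng, lam):
    """Poisson draw: count unit-rate exponential arrivals falling in [0, lam]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_exposure(rng, lapse_prob, new_patient_rate):
    """One draw of e_i = 1 - e_l + e_nb, measured in patient-years."""
    leaves = rng.random() < lapse_prob            # Bernoulli lapse indicator
    lost = rng.random() if leaves else 0.0        # uniform fraction of the year lost
    n_new = sample_poisson(rng, new_patient_rate) # number of new, similar patients
    gained = sum(rng.random() for _ in range(n_new))
    return 1.0 - lost + gained


rng = random.Random(7)
draws = [sample_exposure(rng, lapse_prob=0.02, new_patient_rate=0.03) for _ in range(50000)]

# Expected exposure under these assumptions: 1 - 0.02/2 + 0.03/2 = 1.005
print(statistics.mean(draws))
```

With these inputs the expected exposure per row is slightly above one patient-year, because the assumed inflow rate exceeds the assumed lapse probability.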

2.4 Loss distribution modeling

Many actuarial applications use loss distribution modelling to assess the shape of claim costs. Loss distribution modelling fits the parameters of theoretical distributions to real data in order to fully characterize the distribution that best fits the empirical claim costs under study. Fitting a distribution requires choosing candidate theoretical functions, estimating their parameters and assessing their goodness of fit. Another application of loss distribution modelling lies in approximating the total cost of the insurer’s portfolio, $T$, by a simple theoretical distribution. The book [14] provides a comprehensive treatment of loss distribution modelling.

An analytical expression of the loss distribution makes it possible to obtain key moments (e.g. mean and variance) and other statistics in closed form instead of by simulation, which can be time-consuming. However, real data are very often difficult to summarize by a theoretical distribution, because of data quality problems or excessive heterogeneity.

Loss distribution fitting is applied in this paper in two ways. The first is the selection of the conditional distributions for $n_i$ and $c_{ij}$ when performing GAMLSS modelling. Plots of normalized quantile residuals (see [8] for details) helped to assess the reasonableness of the chosen conditional distributions. The second is the closed-form approximation of the shape of $T$ by means of a log-normal distribution, following the approach outlined in [15].


2.5 The estimation procedure

In order to estimate $T$ we define the distributions of $n_i$ and $c_i$ by means of GAMLSS predictive models. Patients with full-year exposure are used to calibrate the model for $n_i$.

The distributions of $t_i$ and $T$ can be obtained empirically by means of Monte Carlo simulation. In particular, a random realization from the distribution of the total cost $t_i$ for patient $i$ can be simulated using the convolution algorithm:

1. Sample one realization of the effective yearly exposure, $e_i$, of the $i$-th patient.

2. Select the number of drug prescriptions, $k$, at random from the assumed prescription frequency distribution $n_i$.

3. Do the following $k$ times: select the cost of a drug prescription, $z$, at random from the assumed prescription cost distribution $c_i$.

4. The total cost $t_i$ is the sum of the $k$ costs, $z$, selected in step 3.

Then, if the outlined process is repeated for all $N$ patients of the general practitioner’s portfolio, we obtain one random realization from the distribution of the total cost $T$.

Finally, in order to obtain the distribution of $t_i$ or $T$ it is necessary to repeat the previous steps $M$ times ($M \gg 0$).
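The convolution algorithm can be sketched in a few lines of standard-library Python. This is a simplified illustration under stated assumptions, not the paper's R implementation: all patients share the same parameters, the exposure step models lapses only (no new entrants), and the expected count is scaled by the exposure as in equation (10). The parameter values are placeholders; `freq_sigma` and `cost_sigma` were back-solved here from the marginal means and standard deviations quoted in Section 3.2 using the variance formulas in (7) and (9), and are not fitted GAMLSS values.

```python
import math
import random
import statistics


def sample_poisson(rng, lam):
    """Poisson draw: count unit-rate exponential arrivals falling in [0, lam]."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed <= lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_count(rng, mu, sigma):
    """Negative binomial of equation (7), drawn as a gamma-Poisson mixture."""
    return sample_poisson(rng, rng.gammavariate(1.0 / sigma, sigma * mu))


def sample_cost(rng, mu, sigma):
    """Inverse Gaussian of equation (9), Michael-Schucany-Haas method."""
    lam = 1.0 / sigma**2
    y = rng.gauss(0.0, 1.0) ** 2
    root = math.sqrt(4 * mu * lam * y + (mu * y) ** 2)
    x = mu + mu * mu * y / (2 * lam) - mu / (2 * lam) * root
    return x if rng.random() <= mu / (mu + x) else mu * mu / x


def sample_exposure(rng, lapse_prob):
    """Step 1 (simplified): full year unless the patient lapses; no new entrants."""
    return rng.random() if rng.random() < lapse_prob else 1.0


def simulate_patient_cost(rng, patient):
    """Steps 1-4 for one patient: exposure, count k, k costs, and their sum."""
    exposure = sample_exposure(rng, patient["lapse_prob"])
    if exposure <= 0.0:
        return 0.0
    k = sample_count(rng, exposure * patient["freq_mu"], patient["freq_sigma"])
    return sum(sample_cost(rng, patient["cost_mu"], patient["cost_sigma"]) for _ in range(k))


def simulate_total_cost(rng, pool, n_sims):
    """Repeat the per-patient simulation over the pool, n_sims times."""
    return [sum(simulate_patient_cost(rng, p) for p in pool) for _ in range(n_sims)]


# Hypothetical pool of 100 identical patients (placeholder parameters).
patient = {"lapse_prob": 0.02, "freq_mu": 3.33, "freq_sigma": 2.98,
           "cost_mu": 20.3, "cost_sigma": 0.263}
pool = [patient] * 100

rng = random.Random(42)
totals = simulate_total_cost(rng, pool, n_sims=300)
print(statistics.mean(totals), statistics.stdev(totals))
```

In a real application each patient would carry their own predicted parameters, and the list `totals` would be the empirical distribution of $T$ from which moments and percentiles are read.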

3 An empirical application

3.1 Data sources and preparation

3.1.1 Data sources

An empirical application is presented in this study to illustrate numerically the framework outlined previously. We will assess the distribution of the yearly total cost of drug prescriptions of a target GP’s pool of patients. The data sources used in the application are:

1. A data set, the prescriptions data set (PDS), containing the number of prescriptions received by 6,000+ patients from their GPs [11]. Each row of the PDS contains the number of prescriptions during a whole year (the dependent variable) plus a wide choice of demographic data. The PDS will be used to calibrate the frequency model. We have not challenged the reliability of the PDS, since it was impossible to perform such a task. Moreover, the PDS has been collected on patients between 25 and 65 years of age. All analyses will therefore be limited to the corresponding age span, without loss of generality.

2. A life table split by sex, used to model the probability of death as a function of age (source [13]).

3. A data set in the same format as the VDS containing the demographic data of 600 patients, henceforth the target data set (TDS). The TDS represents the pool of patients of a GP that we code as XY. The distribution of $T$ for XY is to be assessed by the methodology proposed in this paper.

4. A data set containing a sample of drug costs along with the age and sex of the patient for whom each prescription was required. This data set, henceforth the Costs Data Set (CDS), will be used to calibrate the drug prescription cost model. It was collected in Spring 2011 thanks to the cooperation of an Italian drugstore.

5. A function that models the probability of drop-out due to reasons other than death (lapse probability). Owing to limited data availability, we have set this probability to a flat value of 2.0%, after a discussion with a panel of experienced GPs.


6. A function that gives the rate of newly enrolled patients (conversion rate). Owing to limited data availability, we have set this rate to a flat value of 3.0%, after a discussion with a panel of experienced GPs.

The standard lapse and conversion models deployed by personal lines pricing actuaries use logistic regression to predict the yearly lapse probability of each policyholder. The variables used in such regression models consist of policyholder demographics, policyholder purchasing behaviour and market competitiveness.

In our problem it is clear that the risk of entering and leaving the pool is not uniform among patients. Age is indeed a systematic risk factor, but we did not have a data source with which to build predictive models with covariates for lapse and conversion rates; we therefore chose a flat lapse rate to model drop-out for reasons other than death. Although this approach is simple, it does allow the patient flows of an open collectivity to be simulated.

As the aim of the paper is to demonstrate the feasibility of the process, we did not seek data sets that match the real problem completely. The PDS and VDS come from a German study on the yearly number of visits to GPs conducted in the 1980s. We have assumed that the number of visits to the doctor is a perfect proxy for the number of drug prescriptions and that the populations sampled in the PDS and VDS data sets are representative of the target population, i.e. the NHI patients of northern Italy. On the other hand, the CDS is a sample of drug prescription amounts collected in Spring 2011 thanks to the cooperation of a drugstore in Nibionno (Italy). The number and the cost data sets were not collected on the same subjects. This does not limit the analysis, because the cost distribution has been assumed to be independent of the distribution of the number of drug prescriptions once the effect of structural variables (like age and sex) is taken into account. Overall, the data sources employed allowed us to illustrate adequately the operative methodology discussed in Section 2.5.

3.2 Predictive models estimation

GAMLSS can be fitted by means of an R package ([18]).

Since the purpose of this article is to illustrate the application of an actuarial methodology to a health economics problem, the modelling stage has been kept simple, following an approach that resembles the usual pricing practice in non-life insurance.

In the PDS the average number of drug prescriptions is 3.33, with a standard deviation of 6.03. The average of the sampled costs of drug prescriptions is 20.3, with a standard deviation of 24.1.

Two predictive models, for $n_i$ and $\tilde{c}_{ij}$, were fitted using the GAMLSS framework.

The model building process consisted of experimenting with and assessing different distributional assumptions for the dependent variables, the significance of candidate predictors and their functional relationship within the regression equation, as described in [17]. Finally, the following decisions were taken with respect to the selected models:

– The negative binomial has been chosen as the underlying distribution for the frequency of prescriptions, while the inverse Gaussian has been chosen as the underlying distribution for the cost of a single drug prescription. They were parametrized using formulas (7) and (9), respectively.


Fig. 1 Drug prescription frequency model: marginal effects plot, µ parameter

Fig. 2 Drug prescription cost model: marginal effects plot, µ parameter


Fig. 3 GAMLSS diagnostic output of drugs prescriptions frequency model

3.3 The simulation process

The distribution of the yearly cost of drug prescriptions for the TDS has been initially simulated as follows:

1. The TDS has been duplicated into two distinct data sets: the first one representing the patients in force at the beginning of the period (henceforth IFP), the second one representing the patients (henceforth NP) that would enter the GP pool after the beginning of the period.

2. The following steps have been repeated $m = 1, \dots, M = 1000$ times in order to simulate the distribution of the total cost of the drug prescriptions of the pool of patients:

(a) The exposure in terms of patient-years, $\tilde{E}$, has been determined for the rows of both the IFP and the NP data sets, as follows:

– For the IFP patients’ exposure, one number $I_i$ has been drawn from a Bernoulli variable with probability equal to $q_i^{(d)} + q_i^{(l)}$, where $q_i^{(d)}$ and $q_i^{(l)}$ represent the probability of lapse due to death and to other causes, respectively. Owing to the limitations of the collected data, the model we built assumes that only age and sex affect the lapse probability, with the contribution of other causes set flat at $q_i^{(l)} = 0.02$. If $I_i = 1$, the yearly exposure of the $i$-th patient is drawn from a uniform [0, 1] distribution. The exposure for the IFP data set records is then expressed as $e_i = 1 \cdot \mathbb{1}(I_i = 0) + U(0, 1) \cdot \mathbb{1}(I_i = 1)$.

– For the NP data set, the exposure has been determined by first sampling, for each row, a number $\widetilde{in}_i$ from a Poisson distribution with rate parameter 0.03. $\widetilde{in}_i$ represents the number of patients with the same demographic characteristics as the patient in the $i$-th row that will enter the data set within the year. For each $\widetilde{in}_i$ the convolution approach has been applied to determine the total exposure of the new incoming patients, sampling $\widetilde{in}_i$ outcomes from a uniform [0, 1] distribution.

Fig. 4 GAMLSS diagnostic output of drugs prescriptions costs model

(b) $E[n_i]$, $\operatorname{var}[n_i]$, $E[c_i]$ and $\operatorname{var}[c_i]$ have been predicted for each row of the IFP and NP data sets using the GAMLSS models calibrated in the previous stage. The distributions of $n_i$ and $c_i$ are therefore fully defined, since both the $\mu$ and the $\sigma$ parameters are known for both of them.

(c) The convolution process has been applied to $n_i$ and $c_i$ to determine the total cost of drug prescriptions in the year, as shown in formula (2), using the parameters of the number and cost distributions estimated in the previous step.

(d) The simulated amounts $t_i$, numbers $n_i$ and exposures $e_i$ have been summed over the IFP and NP data sets in order to determine the yearly patient exposure $\tilde{E}$, the number of prescriptions $\tilde{N}$ and the total cost $T$ for the analysed pool of patients.

The object-oriented structure of R makes it possible to perform the simulation process using the predict methods applied to the GAMLSS regression models estimated in Section 3.2. The simulation has proved quite slow: several hours were needed to simulate the yearly total expenditure of the 600-patient pool of the hypothetical general practitioner XY using just M = 750 simulations on a standard desktop PC. A short-cut would therefore be useful to apply the proposal operationally.

In [15] the log-normal distribution is suggested to fit the total loss distribution of a personal lines non-life portfolio. We have followed this suggestion and fitted the log-normal distribution to the total prescription cost distribution simulated in the previous step by the Monte Carlo approach. The R fitdistrplus package [7] was used to estimate the parameters and to assess the goodness of fit, both graphically and by means of suitable statistical tests (Anderson-Darling and Kolmogorov-Smirnov).

Fig. 5 General practitioner XY yearly total cost of drug prescriptions: log-normal distribution fit

The fitting results shown in Figure 5 suggest that the log-normal distribution provides a very good fit of $T$. Moreover, the p-values of both goodness-of-fit tests were not significant. Therefore, if the parameters of the log-normal distribution of $T$ were known in advance, there would be no need to conduct a time-consuming Monte Carlo simulation to assess the distribution of $T$. We will show that it is possible to know these parameters in advance. As $T$ is the sum of independent $t_i$, equation (12) follows.

\[ E[T] = \sum_{i=1}^{N} E[t_i], \qquad \operatorname{var}[T] = \sum_{i=1}^{N} \operatorname{var}[t_i] \tag{12} \]

Moreover, each $t_i$ represents an outcome of a compound distribution. Following [4], the expected value and the variance of $t_i$ can be obtained in closed form from equation ??. All terms in ?? are obtained from the previously fitted GAMLSS models.
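For reference, the textbook moments of a compound sum are given below. They are stated here as a standard result, not necessarily in the form of the equation cited in the text, and they assume that the costs $c_{ij}$ of a patient are independent and identically distributed as $c_i$ and independent of the count $n_i$:

```latex
\[
E[t_i] = E[n_i]\, E[c_i],
\qquad
\operatorname{var}[t_i] = E[n_i]\, \operatorname{var}[c_i] + \operatorname{var}[n_i]\, E[c_i]^{2}
\]
```

Under the parametrizations (7) and (9), all four ingredients are explicit functions of the $\mu$ and $\sigma$ parameters of the frequency and cost models.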

Since the theoretical expected value and variance of $T$ are known, the parameters of the log-normal approximation of the total cost distribution can be evaluated directly using the method-of-moments formulas (13). The direct estimation of the parameters $\mu_T$ and $\sigma_T$ completely defines the distribution of $T$.

\[ \mu_T = \ln\left(E[T]\right) - \frac{1}{2} \ln\left(1 + \frac{\operatorname{var}[T]}{E[T]^{2}}\right), \qquad \sigma_T^{2} = \ln\left(1 + \frac{\operatorname{var}[T]}{E[T]^{2}}\right) \tag{13} \]

Therefore the total cost distribution can be closely approximated by a simple analytical distribution.
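The method-of-moments step in (13) is easy to sketch with the Python standard library. The inputs below are the mean and standard deviation of the total cost reported in Table 3, used purely as illustrative values; the function names are made up for this example.

```python
import math
from statistics import NormalDist


def lognormal_from_moments(mean, variance):
    """Method-of-moments log-normal parameters (mu, sigma), as in equation (13)."""
    sigma2 = math.log(1.0 + variance / mean**2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def lognormal_quantile(mu, sigma, p):
    """Quantile of a log-normal: exponentiate the matching normal quantile."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


# Illustrative inputs: mean and standard deviation of the total cost in Table 3.
mean_t, sd_t = 39967.79, 2721.93
mu_t, sigma_t = lognormal_from_moments(mean_t, sd_t**2)
p995 = lognormal_quantile(mu_t, sigma_t, 0.995)
print(mu_t, sigma_t, p995)
```

Once `mu_t` and `sigma_t` are available, any percentile of the approximating distribution follows in closed form, without rerunning the simulation.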


3.4 Results

The outlined algorithm has been applied to the TDS, which represents the demographic data of the 600 patients of general practitioner XY. Tables 1, 2 and 3 show key statistics of XY’s patient-years, number of prescriptions and total cost of prescriptions. The 99.5% percentile has been added for the number and the total amount. This figure may be used to budget and monitor GP XY’s drug prescription expenditure.

mean      Q1        Q3
602.84    599.69    606.10

Table 1 Doctor XY patient/years distribution

mean       SD        Q1         Q3         p99.5
1952.96    125.33    1868.50    2038.00    2260.52

Table 2 Doctor XY number of prescriptions distribution

mean        SD         Q1          Q3          p99.5
39967.79    2721.93    38182.61    41814.40    46118.52

Table 3 Doctor XY total cost of prescriptions distribution

4 Conclusions and further research

4.1 Discussion of results

This article has shown how non-life actuarial techniques can be successfully applied to a health economics problem. We have used GAMLSS to evaluate the frequency and the cost of drug prescriptions, following an approach closely resembling personal lines rate-making. The predictive models we propose can be used to assess and explain which demographic risk factors significantly affect the number and the cost of the drug prescriptions paid by the NHI.

Moreover, the convolution approach of collective risk theory has been used to assess the distribution of the yearly expenditure of a GP’s pool of patients. The assessment of the total cost distribution can be used to monitor the prescriptions granted by the GP using a statistically grounded approach.

A relevant limitation of the modelling approach followed is that pandemic events are not handled properly, as each patient is assumed to be independent of the others. Pandemic events would affect many patients at the same time through contagion, especially if spatially

7/31/2019 Art Cladag 2011

http://slidepdf.com/reader/full/art-cladag-2011 14/16

14 Giorgio Spedicato

close (as catastrophic insurance losses). On the other hand seasonal diseases do not repre-

sent an issue due to the yearly period of observation (fractional exposures have appeared to

be a small issue in this problem).

The approach followed in this paper has modelled all prescriptions granted by the GP together, without creating sub-models, e.g. for disease groups. Further subdivision of drug prescriptions might be of interest in order to investigate more deeply the risk factors influencing the frequency and the cost of homogeneous groups of drug expenditures.

We think that the most valuable use of the proposed model within health economics would be a rational assessment of the standard cost of drug prescriptions for a GP's pool of patients. Assessing the distributions of $T$ and $t_i$ would provide statistics useful for the planning and budgeting process, such as:

– The expected value and any desired dispersion measures.

– Extreme percentiles (e.g. the 99th), which may be used as a threshold for further actions in order to investigate potential inefficiencies or abuses.
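The statistics listed above can be read off a simulated sample of total costs, as in the following sketch. The sample used here is a hypothetical lognormal draw, and the function names and the 99.5% default level are illustrative choices rather than part of the model.

```python
import math
import random
import statistics

def cost_summary(totals, tail=0.995):
    """Summarise a sample of simulated yearly total costs."""
    ordered = sorted(totals)
    q1, _median, q3 = statistics.quantiles(ordered, n=4)
    # Empirical tail percentile: smallest observation with at least
    # a share `tail` of the sample at or below it.
    idx = min(len(ordered) - 1, max(0, math.ceil(tail * len(ordered)) - 1))
    return {
        "mean": statistics.fmean(ordered),
        "sd": statistics.stdev(ordered),
        "q1": q1,
        "q3": q3,
        "tail": ordered[idx],
    }

def needs_review(observed_cost, summary):
    """Flag an observed yearly cost that exceeds the tail percentile."""
    return observed_cost > summary["tail"]

# Hypothetical sample standing in for simulated total costs
rng = random.Random(42)
sample = [rng.lognormvariate(10.59, 0.07) for _ in range(5000)]
summary = cost_summary(sample)
print(summary, needs_review(50000.0, summary))
```

A GP whose observed yearly expenditure exceeds the tail percentile would be a candidate for further investigation, not evidence of abuse in itself.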

If the predictive models were calibrated on a “certified” sample, they could be used to estimate the “standard cost” of yearly drug prescriptions for any GP's pool of patients, given the patients' demographics. The use of “standard costs” for government-provided services has been acquiring increasing importance in the period of budget pressure that Italy and many OECD countries are facing. At the same time, the developed model would provide the percentiles of the drug prescription cost distribution, which can be used to monitor expenditures, e.g. by prioritizing routine audits of individual GPs' drug prescriptions.

Moreover, the proposed methodology can easily be used to estimate drug prescription costs taking into account inflation and changes in the coverage offered by the NHI, such as the application of a yearly deductible or a coinsurance percentage. On the actuarial side of the analysis, another relevant application lies in the estimation of the multi-year actuarial present value of the drug prescription costs for any patient, given his or her demographic profile. The GAMLSS models for the number and the cost of prescriptions yield a yearly average total cost (a pure premium), $pr_x$, for any patient as a function of age $x, x+1, \ldots$ and other demographic variables. Once assumptions have been made about the future inflation rate $i_t$, the financial discount factor $v^t$ and the survival probability ${}_t p_x$, the lifetime actuarial present value of the drug prescription cost for any patient in the pool can be expressed by formula 14.

$$c_i = \sum_{t=0}^{\omega - x} (1 + i_t)^t \, {}_t p_x \, v^t \, pr_{x+t} \qquad (14)$$
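Formula 14 can be evaluated with a short loop, sketched below under simplifying assumptions: a constant inflation rate, a constant discount rate, and made-up pure premiums and survival probabilities. The function name and all numerical inputs are hypothetical.

```python
def lifetime_apv(pure_premiums, survival_probs, inflation=0.02, discount_rate=0.03):
    """Actuarial present value of future yearly drug costs for one patient.

    pure_premiums[t]  : expected yearly cost at age x + t
    survival_probs[t] : probability of surviving t years from age x
    Assumes constant inflation and discount rates, which is a simplification.
    """
    if len(pure_premiums) != len(survival_probs):
        raise ValueError("pure_premiums and survival_probs must have equal length")
    v = 1.0 / (1.0 + discount_rate)  # one-year discount factor
    return sum(
        (1.0 + inflation) ** t * p_surv * v ** t * premium
        for t, (premium, p_surv) in enumerate(zip(pure_premiums, survival_probs))
    )

# Hypothetical three-year horizon for a single patient
premiums = [60.0, 65.0, 70.0]
survival = [1.0, 0.98, 0.95]
print(lifetime_apv(premiums, survival))
```

In practice the pure premiums would come from the fitted frequency and cost models and the survival probabilities from a mortality table, with the horizon running up to the limiting age.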

4.2 Further research

Finally, the proposed approach can certainly be applied to more traditional actuarial applications such as personal lines rate-making or capital modelling.

The discussed model shows a rational approach to assessing the distribution of a general practitioner's drug prescription costs. Actuarial techniques common in general insurance pricing and risk management practice have been applied to a health economics problem, along with GAMLSS models, a frontier method in regression modelling.


As a further research direction, predictive models more flexible than the standard log-linear regression framework should be tested in order to better assess the frequency and the cost of prescriptions. More refined models for patient lapses and patient conversions could be built, following what is done in pricing optimization for personal lines rate-making [23], once the data availability issue has been solved.

We suggest expanding the study by increasing the sample of physicians analysed and of patients' transactions. The use of GP-level data would be a valuable improvement to the explanatory power of the data set. In fact, the literature (e.g. [21]) has shown that GP characteristics such as length of practice affect the outcome significantly. This could increase the consistency of model estimates and, last but not least, the inclusion of GP-level variables could improve the explanatory and predictive power of the model.


Acknowledgements The author wishes to thank Dr. Stefania Giacalone for providing the drug prescription cost data, and Simona Minotti for her outstanding contribution in reviewing the document. The data analysis in this paper was performed with R, a statistical software environment released under the GNU General Public License (GPL). For more information on R, the interested reader is referred to R Development Core Team [16].

References

1. Duncan Anderson, Sholom Feldblum, Claudine Modlin, Doris Schirmacher, Ernesto Schirmacher, and

Neeza Thandi. A practitioner’s guide to generalized linear models. Technical report, Casualty Actuarial

Society, 2007.

2. CEIOPS. Qis5 technical specifications, July 2010.

3. Cermlab. Alla ricerca di standard per la sanità federalista.

http://www.cermlab.it/argomenti.php?group=sanita&item=43, 02 2010.

4. C.D. Daykin, T. Pentikäinen, and M. Pesonen. Practical risk theory for actuaries. Monographs on statistics and applied probability. Chapman & Hall, 1994.

5. Piet de Jong and Gillian Heller. Generalized linear models for insurance data. Cambridge University Press, New York, first edition, 2008.

6. Piet de Jong, Mikis Stasinopoulos, Robert Stasinopoulos, and Gillian Heller. Mean and dispersion modeling for policy claim cost. Scandinavian Actuarial Journal, 2007.

7. Marie Laure Delignette-Muller, Regis Pouillot, Jean-Baptiste Denis, and Christophe Dutang. fitdistrplus: help to fit of a parametric distribution to non-censored or censored data, 2010. R package version 0.1-3.

8. Peter Dunn and Gordon K. Smyth. Randomized quantile residuals. J. Computat. Graph. Statist, 5:236–244, 1996.

9. E. Fernandez-Liz, P. Modamio, A. Catalan, C. F. Lastra, T. Rodriguez, and E. L. Marino. Identifying

how age and gender influence prescription drug use in a primary health care environment in catalonia,

spain. Br J Clin Pharmacol, 65:407–417, Mar 2008.

10. M. Garcia-Goni and P. Ibern. Predictability of drug expenditures: an application using morbidity data.

 Health Econ, 17:119–126, Jan 2008.

11. Professor W. Greene. German health care usage data. online:

http://pages.stern.nyu.edu/~wgreene/Econometrics/PanelDataSets.htm, 1997.

12. Frank E. Harrell. title. Technical report, Vanderbilt University School of Medicine, 2011.

13. Istat. Geodemo istat: Tavole di mortalità regionali, 2011. Online; accessed 26-June-2011.

14. S.A. Klugman, H.H. Panjer, and G.E. Willmot. Loss Models: From Data to Decisions (Book, Solutions Manual, and ExamPrep). John Wiley & Sons, 2009.

15. D.E. Papush, G.S. Patrik, and F. Podgaits. Approximations of the aggregate loss distribution. In CAS Forum, pages 175–186, 2001.

16. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation

for Statistical Computing, Vienna, Austria, 2010. ISBN 3-900051-07-0.

17. Bob Rigby and Mikis Stasinopoulos. A flexible regression approach using GAMLSS in R, 11 2009.

18. R. A. Rigby and D. M. Stasinopoulos. Generalized additive models for location, scale and shape (with discussion). Applied Statistics, 54:507–554, 2005.

19. Robert Rigby and Mikis Stasinopoulos. Generalized additive models for location, scale and shape (with discussion). Applied Statistics, 54:507–554, 2005.

20. Nino Savelli and Gianpaolo Clemente. Hierarchical structures in the aggregation of premium risk for

insurance underwriting. Scandinavian Actuarial Journal, 1:1, 2010.

21. G. Simon, C. Francescutti, S. Brusin, and F. Rosa. Variation in drug prescription costs and general

practitioners in an area of north-east italy. the use of current data. Epidemiol Prev, 18:224–229, Dec

1994.

22. Giorgio Alfredo Spedicato. Solvency II premium risk modeling under the direct compensation CARD

system. PhD thesis, La Sapienza, Universita di Roma, 2011.

23. James Tanser. Pretium manual. Towers Watson, 3.1 edition, 2010.

24. Gary Venter. Mortality trend models. Casualty Actuarial Society Forum, 1:1, 2011.

25. Geoff Werner and Claudine Modlin. Basic Ratemaking, 2009.

26. Keith Wilson-Davis and William G. Stevenson. Predicting prescribing costs: A model of northern ireland

general practices. Pharmacoepidemiology and Drug Safety, 1(6):341–345, 1992.