Richard A. Derrig, Ph.D.
OPAL Consulting LLC
Visiting Scholar, Wharton School, University of Pennsylvania
New Applications of Statistics and Data Mining Techniques to Classification
and Fraud Detection in Insurance.
Bogotá, Colombia
November 3, 2005
Insurance Fraud Bureau of Massachusetts
You can steal more
with a ball point than
by gun point.
Insurance Fraud Bureau of Massachusetts
Insurance fraud is known as a high reward, low risk crime.
GENERAL INSURANCE PROBLEMS
WHAT: Product Design
WHERE: Market Characteristics
WHO: Classification & Sale
HOW: Claims Paid
WHEN: Forecasting
WHY: Profit (Expected)
TRADITIONAL MATHEMATICAL TECHNIQUES
Arithmetic (Spreadsheets)
Probability & Statistics (Range of Outcomes)
Curve Fitting (Interpolation & Extrapolation)
Model Building (Equations for Processes)
Valuation (Risk, Investments, Catastrophes)
Numerical Methods (Analytic Solution Rare)
NON-TRADITIONAL MATHEMATICS
Fuzzy Sets & Fuzzy Logic
  Elements: “in/out/partially both”
  Logic: “true/false/maybe”
  Decisions: “incompatible criteria”
Artificial Intelligence: “data mining”
Neural Networks: “learning algorithms”
CLASSIFICATION
Segmentation: A major exercise for insurance underwriting and claims
Underwriting: Find profitable risks from among the available market
Claims: Sort claims into easy pay and claims needing investigation
Fuzzy Logic Compared with Probability
Probability:
  Measures randomness;
  Measures whether or not event occurs; and
  Randomness dissipates over time or with further knowledge.
Fuzziness:
  Measures vagueness in language;
  Measures extent to which event occurs; and
  Vagueness does not dissipate with time or further knowledge.
Fuzzy Logic
The field of Pattern Recognition is a search for structure in data.
Old View: Given N objects, divide them into 2 < C < N clusters of homogeneous or similar types. Similarity can be based upon multiple features or criteria but each object is in one and only one cluster.
New View: Objects can be members of one or several clusters with varying strengths of membership; i.e. fuzzy clusters are Fuzzy Sets of clusters.
Example A: Classification of Individual Risks
Example B: Classification of Injury Claims
Clusters
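The "New View" of graded cluster membership can be illustrated with the membership formula used in fuzzy c-means clustering. This is a minimal sketch, not the method used in the studies cited on these slides: the function name, the fuzzifier m = 2, and the toy points and centers are all assumptions chosen for illustration.

```python
import numpy as np

def fuzzy_memberships(points, centers, m=2.0):
    """Fuzzy c-means style membership of each point in each cluster.

    points: (n, d) array, centers: (c, d) array, m > 1 is the fuzzifier.
    Returns an (n, c) array whose rows sum to 1.
    """
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Distance from every point to every center, shape (n, c).
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    dist = np.maximum(dist, 1e-12)  # guard against a point sitting on a center
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Hypothetical data: two cluster centers and three objects on a line.
centers = [[0.0, 0.0], [10.0, 0.0]]
points = [[0.0, 0.0], [5.0, 0.0], [9.0, 0.0]]
u = fuzzy_memberships(points, centers)
```

An object sitting on a center belongs almost entirely to that cluster, while the object halfway between the two centers belongs equally to both, which is the situation a crisp (one-cluster-only) assignment cannot express.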
FUZZY SETS
TOWN RATING CLASSIFICATION
When is one Town near another for Auto Insurance Rating?
- Geographic Proximity (Traditional)
- Overall Cost Index (Massachusetts)
Geographically close Towns do not have the same Expected Losses.
Clusters by Cost Produce Border Problems: Towns between Territories. Fuzzy Clusters acknowledge the Borders.
Are Overall Clusters correct for each Insurance Coverage?
Fuzzy Clustering on Five Auto Coverage Indices is better and demonstrates a weakness in Overall Crisp Clustering.
Fuzzy Clustering of Fraud Study Claims by Assessment Data
Membership Value Cut at 0.2
[Figure: final clusters (vertical axis, 0 to 6) plotted against claim feature vector ID (horizontal axis, 0 to 130). Cluster labels: Valid; Build-up (Inj. Sus. Level < 5); Build-up (Inj. Sus. Level = to or > 5); Opportunistic Fraud; Planned Fraud. Suspicion centers in the order (A, C, Is, Ij, T, LN): (0,0,0,0,0,0), (0,1,0,1,3,0), (1,4,0,4,6,0), (1,7,0,7,7,0), (7,8,7,8,8,0). Markers distinguish full members from partial members at alpha = .2.]
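A membership cut such as the alpha = 0.2 used in the figure can be sketched as follows. The reading that a claim is a "full" member when it clears the cut in exactly one cluster and a "partial" member when it clears it in several is an assumption about the figure legend, and the membership values are hypothetical.

```python
def alpha_cut(memberships, alpha=0.2):
    """For each claim, list the cluster indices whose membership is >= alpha."""
    return [[k for k, u in enumerate(row) if u >= alpha] for row in memberships]

def member_type(clusters):
    """'full' if the claim clears the cut in one cluster, 'partial' if in several."""
    if not clusters:
        return "none"
    return "full" if len(clusters) == 1 else "partial"

# Hypothetical membership rows for three claims over three clusters.
memberships = [
    [0.95, 0.05, 0.00],
    [0.55, 0.40, 0.05],
    [0.15, 0.60, 0.25],
]
cuts = alpha_cut(memberships, alpha=0.2)
types = [member_type(c) for c in cuts]
```

Raising alpha turns more claims into full members of a single cluster; lowering it exposes more of the overlap between neighboring clusters.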
FRAUD
The Major Questions
What Is Fraud?
How Much Fraud is There?
What Companies Do about Fraud?
How Can We Identify a Fraudulent Claim?
FRAUD DEFINITION
Principles
  Clear and willful act
  Proscribed by law
  Obtaining money or value
  Under false pretenses
Abuse: Fails one or more Principles
Fraud Definition
COSTS
Fraud (Criminal, Hard): Small; Mass. Auto & WC < 1%
Abuse (Not Criminal, Soft Fraud): BIG Bucks, Depends on Line
“Abuse” is (legally) a gray area, unethical behavior
“Abuse” Containment is a Matter for Company/Industry/Regulator
FRAUD TYPES
Insurer Fraud
  Fraudulent Company
  Fraudulent Management
Agent Fraud
  No Policy
  False Premium
Company Fraud
  Embezzlement
  Inside/Outside Arrangements
Claim Fraud
  Claimant/Insured
  Providers/Rings
CLAIM FRAUD INDICATORS
VALIDATION PROCEDURES
Canadian Coalition Against Insurance Fraud (1997)
305 Fraud Indicators (45 vehicle theft)
“No one indicator by itself is necessarily suspicious”.
Problem: How to validate the systematic use of Fraud Indicators?
AIB FRAUD INDICATORS
1989 Examples
Accident Characteristics (19)
  No report by police officer at scene
  No witnesses to accident
Claimant Characteristics (11)
  Retained an attorney very quickly
  Had a history of previous claims
Insured Driver Characteristics (8)
  Had a history of previous claims
  Gave address as hotel or P.O. Box
AIB FRAUD INDICATORS
1989 Examples
Injury Characteristics (12)
  Injury consisted of strain/sprain only
  No objective evidence of injury
Treatment Characteristics (9)
  Large number of visits to a chiropractor
  DC provided 3 or more modalities on most visits
Lost Wages Characteristics (6)
  Claimant worked for self or family member
  Employer wage differs from claimed wage loss
REAL PROBLEM
Classify all claims
Identify valid classes
  Pay the claim
  No hassle
  Visa Example
Identify (possible) fraud
  Investigation needed
Identify “gray” classes
  Minimize with “learning” algorithms
Experience and Judgment
Artificial Intelligence Systems
  Regression Models
  Fuzzy Clusters
  Neural Networks
  Expert Systems
  Genetic Algorithms
  All of the Above
FRAUDULENT CLAIM IDENTIFICATION
DM
Databases
Scoring Functions
Graded Output
Non-Suspicious Claims: Routine Claims
Suspicious Claims: Complicated Claims
POTENTIAL VALUE OF AN ARTIFICIAL INTELLIGENCE SCORING SYSTEM
Screening to Detect Fraud Early
Auditing of Closed Claims to Measure Fraud
Sorting to Select Efficiently among Special Investigative Unit Referrals
Providing Evidence to Support a Denial
Protecting against Bad-Faith
Using Kohonen’s Self-Organizing Feature Map to Uncover Automobile Bodily Injury Claims Fraud
PATRICK L. BROCKETT
Gus S. Wortham Chaired Prof. of Risk Management
University of Texas at Austin
XIAOHUA XIA
University of Texas at Austin
RICHARD A. DERRIG
Senior Vice President
Automobile Insurers Bureau of Massachusetts
Vice President of Research
Insurance Fraud Bureau of Massachusetts
JOURNAL OF RISK AND INSURANCE, 65:2, 245-274, 1998
NEURAL NETWORKS
Self-Organizing Feature Maps
T. Kohonen 1982-1990 (Cybernetics)
Reference vectors map to OUTPUT format in topologically faithful way.
Example: Map onto 40x40 2-dimensional square.
Iterative Process Adjusts All Reference Vectors in a “Neighborhood” of the Nearest One. Neighborhood Size Shrinks over Iterations.
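The iterative process described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the implementation from the cited paper: the Gaussian neighborhood, the linear decay of learning rate and neighborhood radius, the 5x5 map, and the synthetic data are all assumptions.

```python
import numpy as np

def train_som(data, rows=5, cols=5, n_iter=500, lr0=0.5, seed=0):
    """Train a small self-organizing map; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    weights = rng.random((rows, cols, data.shape[1]))  # random reference vectors
    # Grid coordinates of every map unit, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                   # learning rate decays
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighborhood shrinks
        x = data[rng.integers(len(data))]         # pick one input at random
        bmu = best_unit(weights, x)               # nearest reference vector
        # Gaussian neighborhood measured on the map grid, centered on the nearest unit.
        g2 = np.sum((grid - np.array(bmu)) ** 2, axis=2)
        h = np.exp(-g2 / (2.0 * sigma ** 2))
        # Move every reference vector toward x, scaled by its neighborhood weight.
        weights += lr * h[:, :, None] * (x - weights)
    return weights

def best_unit(weights, x):
    """Grid position (row, col) of the reference vector nearest to x."""
    d = np.linalg.norm(weights - np.asarray(x, dtype=float), axis=2)
    return np.unravel_index(np.argmin(d), d.shape)

# Hypothetical feature vectors: two well-separated groups in three dimensions.
rng = np.random.default_rng(1)
data = np.vstack([
    rng.normal(0.0, 0.05, size=(50, 3)),
    rng.normal(1.0, 0.05, size=(50, 3)),
])
som = train_som(data)
unit_a = best_unit(som, [0.0, 0.0, 0.0])
unit_b = best_unit(som, [1.0, 1.0, 1.0])
```

After training, similar inputs land on nearby map units, so a new claim can be located on the map and compared with the claims already placed around it.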
FEATURE MAP
SUSPICION LEVELS
[Figure: surface chart over the feature map grid; legend bands 0-1, 1-2, 2-3, 3-4, 4-5.]
FEATURE MAP
SIMILARITY OF A CLAIM
[Figure: surface chart over the feature map grid; legend bands 0-1, 1-2, 2-3, 3-4, 4-5.]
DATA MODELING EXAMPLE: CLUSTERING
Data on 16,000 Medicaid providers analyzed by unsupervised neural net
Neural network clustered Medicaid providers based on 100+ features
Investigators validated a small set of known fraudulent providers
Visualization tool displays clustering, showing known fraud and abuse
Subset of 100 providers with similar patterns investigated: Hit rate > 70%
Cube size proportional to annual Medicaid revenues
© 1999 Intelligent Technologies Corporation
Modeling Hidden Exposures in Claim Severity via the EM Algorithm
Grzegorz A. Rempala
Department of Mathematics
University of Louisville
and
Richard A. Derrig
OPAL Consulting LLC & Wharton School, University of Pennsylvania
Hidden Exposures - Overview
Modeling hidden risk exposures as additional dimension(s) of the loss severity distribution
Considering mixtures of probability distributions as the model for losses affected by hidden exposures, with some parameters of the mixtures considered missing (i.e., unobservable in practice)
Approach is feasible due to advancements in the computer-driven methodologies dealing with partially hidden or incomplete data models
Empirical data imputation has become more sophisticated, and the availability of ever faster computing power has made it increasingly possible to solve these problems via iterative algorithms
Figure 1: Overall distribution of the 348 BI medical bill amounts from Appendix B compared with that submitted by provider A.
Left panel: frequency histograms (provider A’s histogram in filled bars).
Right panel: density estimators (provider A’s density in dashed line)
Source: Modeling Hidden Exposures in Claim Severity via the EM Algorithm, Grzegorz A. Rempala, Richard A. Derrig, pg. 9, 11/18/02
Figure 2: EM Fit
Left panel: mixture of normal distributions fitted via the EM algorithm to BI data
Right panel: Three normal components of the mixture.
Source: Modeling Hidden Exposures in Claim Severity via the EM Algorithm, Grzegorz A. Rempala, Richard A. Derrig, pg. 13, 11/18/02
Figure 3: Latent risk in BI data modeled by the EM Algorithm with m = 3.
Left panel: set of responsibilities δ_{j3}.
Right panel: the third component of the normal mixture compared with the distribution of provider A’s claims (“A” claims density estimator is a solid curve)
Source: Modeling Hidden Exposures in Claim Severity via the EM Algorithm, Grzegorz A. Rempala, Richard A. Derrig, pg. 14, 11/18/02
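The mixture fit and the responsibilities shown in Figures 2 and 3 can be illustrated with a generic EM loop for a one-dimensional mixture of normals. This is a minimal sketch under stated assumptions, not the authors' code: the quantile-based starting values, the fixed iteration count, the variance floor, and the synthetic two-component data are all choices made here for illustration.

```python
import numpy as np

def em_normal_mixture(x, m=3, n_iter=200):
    """Fit an m-component normal mixture by EM.

    Returns mixing weights, means, variances, and the (n, m) matrix of
    responsibilities (posterior probability of each component per observation).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Assumed starting values: means at evenly spaced quantiles, pooled variance.
    mu = np.quantile(x, (np.arange(m) + 0.5) / m)
    var = np.full(m, x.var())
    pi = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        # E-step: responsibility of component k for observation j.
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from responsibility-weighted data.
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-6)  # floor to keep a component from collapsing
    return pi, mu, var, resp

# Synthetic data (not the BI medical bills): two hidden groups of equal size.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(10.0, 1.0, 200)])
pi, mu, var, resp = em_normal_mixture(x, m=2)
```

Each row of `resp` plays the role of the responsibilities in Figure 3: observations with a high value in one column are the ones the model attributes to that hidden component.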
Fraud Classification Using Principal Component Analysis of RIDITs
PATRICK L. BROCKETT
Gus S. Wortham Chaired Prof. of Risk Management
University of Texas at Austin
RICHARD A. DERRIG
Senior Vice President
Automobile Insurers Bureau of Massachusetts
Vice President of Research
Insurance Fraud Bureau of Massachusetts
LINDA L. GOLDEN
Marlene & Morton Meyerson Centennial Professor in Business
University of Texas
Austin, Texas
ARNOLD LEVINE
Professor Emeritus
Department of Mathematics
Tulane University
New Orleans LA
MARK ALPERT
Professor of Marketing
University of Texas
Austin, Texas
JOURNAL OF RISK AND INSURANCE, 69:3, SEPT. 2002
THE PROBLEM
Data: Features have no natural metric-scale
Model: Stochastic process has no parametric form
Classification: Inverse image of one dimensional scoring function and decision rule
Feature Value: Identify which features are “important”
PRIDIT METHOD OVERVIEW
1. DATA: N Claims, T Features, K_t Responses, Monotone In “Fraud”
2. RIDIT score each possible response: proportion below minus proportion above, score centered at zero.
3. RESPONSE WEIGHTS: Principal Component of Claims x Features with RIDIT in Cells.
4. SCORE: Sum weights x claim ridit score.
5. PARTITION: above and below zero.
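The five steps above can be sketched for the simplest case of yes/no indicators. This is a hedged illustration rather than the published procedure: the function names, the use of the leading eigenvector of F'F as the weight vector, the sign convention for that eigenvector, and the toy response matrix are assumptions. The class coding (1 for scores below zero, 2 otherwise) follows the convention visible in Table 3.

```python
import numpy as np

def ridit_binary(p_yes):
    """RIDIT scores for a yes/no indicator where "Yes" is ranked first.

    Score = proportion of responses below minus proportion above.
    """
    return -(1.0 - p_yes), p_yes

def pridit(responses):
    """responses: (N, T) array of 0/1 with 1 = "Yes". Returns F, weights, scores, classes."""
    r = np.asarray(responses)
    p = r.mean(axis=0)                       # proportion of "Yes" per feature
    f = np.where(r == 1, -(1.0 - p), p)      # claims x features matrix of RIDITs
    # Weights: leading eigenvector of F'F (first principal component).
    vals, vecs = np.linalg.eigh(f.T @ f)
    w = vecs[:, -1]
    if w.sum() < 0:                          # assumed sign convention
        w = -w
    scores = f @ w                           # sum of weight x claim RIDIT score
    classes = np.where(scores < 0, 1, 2)     # partition at zero
    return f, w, scores, classes

# Hypothetical responses for six claims on three indicators.
responses = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
])
f, w, scores, classes = pridit(responses)
b_yes, b_no = ridit_binary(0.44)
```

With 44% "Yes" responses, `ridit_binary` gives -0.56 for "Yes" and 0.44 for "No", and every column of the RIDIT matrix averages to zero, which is why the partition in step 5 is taken at zero.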
TABLE 1
Computation of PRIDIT Scores

Variable                                                     Label  Proportion of “Yes”  Bt1 (“Yes”)  Bt2 (“No”)
Large # of visits to chiropractor                            TRT1   44%                  -.56         .44
Chiropractor provided 3 or more modalities on most visits    TRT2   12%                  -.88         .12
Large # of visits to a physical therapist                    TRT3    8%                  -.92         .08
MRI or CT scan but no inpatient hospital charges             TRT4   20%                  -.80         .20
Use of “high volume” medical provider                        TRT5   31%                  -.69         .31
Significant gaps in course of treatment                      TRT6    9%                  -.91         .09
Treatment was unusually prolonged (> 6 months)               TRT7   24%                  -.76         .24
Indep. medical examiner questioned extent of treatment       TRT8   11%                  -.89         .11
Medical audit raised questions about charges                 TRT9    4%                  -.96         .04
TABLE 2
Weights for Treatment Variables

Variable  PRIDIT Weights W(∞)  Regression Weights
TRT1       .30                  .32***
TRT2       .19                  .19***
TRT3       .53                  .22***
TRT4       .38                  .07
TRT5       .02                  .08*
TRT6       .70                 -.01
TRT7       .82                  .03
TRT8       .37                  .18***
TRT9      -.13                  .24**

Regression significance shown at 1% (***), 5% (**) or 10% (*) levels.
TABLE 3
PRIDIT Transformed Indicators, Scores and Classes

Claim  TRT1   TRT2   TRT3   TRT4   TRT5   TRT6   TRT7   TRT8   TRT9   Score  Class
1       0.44   0.12   0.08   0.20   0.31   0.09   0.24   0.11   0.04   .07   2
2       0.44   0.12   0.08   0.20  -0.69   0.09   0.24   0.11   0.04   .07   2
3       0.44  -0.88  -0.92   0.20   0.31  -0.91  -0.76   0.11   0.04  -.25   1
4      -0.56   0.12   0.08   0.20   0.31   0.09   0.24   0.11   0.04   .04   2
5      -0.56  -0.88   0.08   0.20   0.31   0.09   0.24   0.11   0.04   .02   2
6       0.44   0.12   0.08   0.20   0.31   0.09   0.24   0.11   0.04   .07   2
7      -0.56   0.12   0.08   0.20   0.31   0.09  -0.76  -0.89   0.04  -.10   1
8      -0.44   0.12   0.08   0.20  -0.69   0.09   0.24   0.11   0.04   .02   2
9      -0.56  -0.88   0.08  -0.80   0.31   0.09   0.24   0.11  -0.96   .05   2
10     -0.56   0.12   0.08   0.20   0.31   0.09   0.24   0.11   0.04   .04   2
TABLE 7
AIB Fraud and Suspicion Score Data
Top 10 Fraud Indicators by Weight

PRIDIT   Adj. Reg. Score   Inv. Reg. Score
ACC3     ACC1              ACC11
ACC4     ACC9              CLT4
ACC15    ACC10             CLT7
CLT11    ACC19             CLT11
INJ1     CLT11             INJ1
INJ2     INS6              INJ3
INJ5     INJ2              INJ8
INJ6     INJ9              INJ11
INS8     TRT1              TRT1
TRT1     LW6               TRT9
TABLE 10
AIB Fraud Indicator and Suspicious Score Classes
Fraud/Non-fraud Classifications

Adjuster Regression Score   PRIDIT Fraud   PRIDIT Non-fraud   All
Fraud                       30             5                  35
Non-Fraud                   32             60                 92
All                         62             65                 127

(Odds ratio = 11.3), 95% confidence interval [4.0, 31.8]
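The four cell counts in Table 10 are enough to compute an odds ratio and a Wald-type 95% interval on the log scale. The sketch below does that arithmetic on the assumption that this is the statistic reported beneath the table; the function name and the choice of the Wald interval are not taken from the source.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Odds ratio (a*d)/(b*c) for a 2x2 table with a Wald interval on the log scale."""
    odds = (a * d) / (b * c)
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lo = math.exp(math.log(odds) - z * se)
    hi = math.exp(math.log(odds) + z * se)
    return odds, lo, hi

# Cells from Table 10: rows are the adjuster regression score class,
# columns are the PRIDIT class (Fraud, Non-fraud).
odds, lo, hi = odds_ratio_ci(30, 5, 32, 60)
```

The result is an odds ratio of 11.25 with an interval of roughly 4.0 to 31.8, which agrees with the figures printed under the table after rounding.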
REFERENCES
Brockett, Patrick L., Derrig, Richard A., Golden, Linda L., Levine, Arnold and Alpert, Mark, (2002), Fraud Classification Using Principal Component Analysis of RIDITs, Journal of Risk and Insurance, 69:3, 341-373.
Brockett, Patrick L., Xia, Xiaohua and Derrig, Richard A., (1998), Using Kohonen’s Self-Organizing Feature Map to Uncover Automobile Bodily Injury Claims Fraud, Journal of Risk and Insurance, 65:2, 245-274.
Derrig, R.A. and H.I. Weisberg, (2004), Determinants of Total Compensation for Auto Bodily Injury Liability Under No-Fault: Investigation, Negotiation and the Suspicion of Fraud, Insurance and Risk Management, Volume 71, (4), pp. 633-662.
Derrig, R.A., H.I. Weisberg and Xiu Chen, (1994), Behavioral Factors and Lotteries Under No-Fault with a Monetary Threshold: A Study of Massachusetts Automobile Claims, Journal of Risk and Insurance, 61:2, 245-275.
Rempala, G. and R.A. Derrig, (2005), Modeling Hidden Exposures in Claim Severity via the EM Algorithm, NAAJ, v9, n2, 108-128.
Viaene, Stijn, Derrig, Richard A., Dedene, Guido, (2004), A Case Study of Applying Boosting Naïve Bayes to Claim Fraud Diagnosis, IEEE Transactions on Knowledge and Data Engineering, v16, n5, May.
Viaene, Stijn, Derrig, Richard A., Baesens, Bart, and Dedene, Guido, (2002), A Comparison of State-of-the-Art Classification Techniques for Expert Automobile Insurance Fraud Detection, Journal of Risk and Insurance, 69:3, 373-423.
Fuzzy References
Insurance Related Material
Brockett, P.L., Cooper, W.W., Golden, L.L. and Pitaktong, V. (1994), A neural network method for obtaining an early warning of insurer solvency, Journal of Risk and Insurance 61, pp. 402-424.
Cummins, J.D. and Derrig, R.A. (1997), Fuzzy financial pricing of property-liability insurance, North American Actuarial Journal, 1:4, pp. 21-44.
Cummins, J.D. and Derrig, R.A. (1993), Fuzzy trends in property-liability insurance claim costs, Journal of Risk and Insurance, September, 60, pp. 429-465.
Derrig, R.A. and Ostaszewski, K.M. (1995), Fuzzy techniques of pattern recognition in risk and claim classification, Journal of Risk and Insurance, 62:3, pp. 447-482.
Derrig, R.A. and Ostaszewski, K.M. (1997), Managing the tax liability of a property-liability insurance company, Journal of Risk and Insurance, 64:4, pp. 695-711.
DeWit, G.W. (1982), Underwriting and uncertainty, Insurance: Mathematics and Economics 1, pp. 277-285.
Jablonowski, M. (1993), An expert system for retention selection, CPCU Journal, pp. 214-221.
Lemaire, Jean (1990), Fuzzy insurance, Astin Bulletin 20(1), pp. 33-55.
Ostaszewski, K.M. (1993), An Investigation into Possible Applications of Fuzzy Sets Methods in Actuarial Science, Society of Actuaries, Schaumburg, Illinois.
Young, V. (1994), Application of fuzzy sets to group health underwriting, Transactions of the Society of Actuaries 45, pp. 551-590.
Young, V. (1996), Rate changing: A fuzzy logic approach, Journal of Risk and Insurance, 63:3, pp. 461-484.